#fleet

benbass

03/10/2021, 3:24 PM
I am running into an issue and am not sure where to start looking or which log files to check. I have a query that runs fine on individual machines and ran fine across almost 1500 machines last night, but now it gets only a few hosts in (12, 21, and 5) and then hangs with no further results. We have let it run for 15+ minutes, whereas last night it finished in about 7 minutes total, with results dribbling in over the last 3 or 4. My gut says something is hanging, and I am not sure whether it is on the Fleet server or one of the clients causing issues.

zwass

03/10/2021, 3:32 PM
Are you able to identify those hosts and run osquery with --verbose --tls_dump to check whether they receive the query from Fleet? Will they respond to other queries?
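
For anyone following along, here is a minimal sketch of what that foreground debug run could look like, written as a small Python helper that assembles the command line. The `--flagfile`, `--verbose`, and `--tls_dump` flags are real osquery flags; the helper name `build_debug_command` and the flagfile path are assumptions for illustration, and the actual path depends on how osquery was packaged for your hosts. The already-running osqueryd service will probably need to be stopped first so the two processes do not contend for the same local database.

```python
import shlex


def build_debug_command(flagfile, binary="osqueryd"):
    """Return the argv list for a foreground osqueryd run with TLS debugging.

    `flagfile` is the host's existing flagfile (hypothetical path below), so the
    debug run reuses the same enrollment and TLS settings as the normal service.
    """
    return [
        binary,
        f"--flagfile={flagfile}",  # reuse the host's existing configuration
        "--verbose",               # enable verbose informational messages
        "--tls_dump",              # print remote requests and responses
    ]


if __name__ == "__main__":
    # Assumed flagfile location; adjust to match your deployment.
    cmd = build_debug_command("/var/osquery/osquery.flags")
    print("sudo " + shlex.join(cmd))
```

With the process running in a terminal, starting the live query again should show in the dumped traffic whether this host ever receives it from Fleet.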

benbass

03/10/2021, 3:33 PM
Thing is I am not sure which hosts are being targeted when it hangs. I am targeting the macOS label and have about 1545 hosts online atm.
3:34 PM
I tried this from both Fleet servers behind our load balancer, and from a dedicated admin console that is not getting host check-ins but is part of the same instance (same db/redis setup).

zwass

03/10/2021, 3:36 PM
How does a select 1 live query work against that same set of hosts?
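
As a rough sketch, the same trivial live query could also be started from the command line rather than the UI, assuming `fleetctl` is installed and already logged in to the Fleet server. The helper name `build_live_query_command` is hypothetical; `fleetctl query` with `--labels` and `--query` is the part taken from the Fleet CLI.

```python
import shlex


def build_live_query_command(label, sql="select 1"):
    """Return the argv list for a fleetctl live query targeted at one label."""
    return ["fleetctl", "query", "--labels", label, "--query", sql]


if __name__ == "__main__":
    # "macOS" is the built-in label discussed in this thread.
    print(shlex.join(build_live_query_command("macOS")))
```

Because select 1 touches no osquery tables, a stall here would suggest the problem is in query distribution rather than in the query itself.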

benbass

03/10/2021, 3:40 PM
seems to be the same. I got 7 responses in the first few seconds and then nothing.

zwass

03/10/2021, 3:43 PM
Targeted towards the 'All Hosts' label? How many hosts are online?

benbass

03/10/2021, 3:43 PM
To the built-in macOS label.
3:44 PM
1584 hosts.

zwass

03/10/2021, 3:46 PM
If you've only got about 7 responding, can you pick one of the nonresponders and get a shell on that box where you'll be able to run --verbose --tls_dump for osquery?

benbass

03/10/2021, 3:47 PM
Thing is, if I scope it to one of the machines that hasn’t responded, it works just fine. My Mac is not responding to the label-wide query, but if I run it directly against mine it works.
3:48 PM
It feels like something is blocking the progress.
3:56 PM
Moved to DMs.