#kolide

Ryan

07/14/2020, 3:48 PM
Hi everyone - I’m trying to debug why a query pack isn’t being scheduled on some hosts in Fleet. We have 1021 hosts online, I’ve scheduled a query pack with one query in it to run on all of them once per day. I’m only seeing 823 hosts returning results in `/tmp/osquery_result`. I tried running the same query on-demand against some of the missing hosts and they worked fine, but if I run `SELECT * FROM osquery_schedule` they return successfully, but with no records. Does anyone have any suggestions? Thanks 🙂
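One way to pin down exactly which hosts are missing is to diff the expected host list against the host identifiers that show up in the result log. A minimal sketch, assuming the log is newline-delimited JSON with a `hostIdentifier` field on each line and that the expected host list is exported separately; the function name `missing_hosts` is hypothetical:

```python
import json

def missing_hosts(expected_hosts, result_log_lines):
    """Return the expected hosts that never appear in the result log lines."""
    seen = set()
    for line in result_log_lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or non-JSON lines
        if not isinstance(record, dict):
            continue
        host = record.get("hostIdentifier")  # assumed field name
        if host:
            seen.add(host)
    return sorted(set(expected_hosts) - seen)
```

Passing an open file handle for the result log as `result_log_lines` works, since the function only iterates over lines.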
4:02 PM
I’m going to try the tip here to add the `SELECT * FROM time` query to a single pack and see if that works. https://github.com/kolide/fleet/blob/master/docs/infrastructure/faq.md#troubleshooting
1:52 PM
So, result of my test there, getting the same number of hosts returning data here, 823, and if I run an adhoc query of `SELECT * FROM osquery_schedule` I get the same result. What could cause a query pack to fail to be scheduled by osquery on certain nodes?
1:54 PM
CC @zwass and @Macear (hope you don’t mind, but I saw you added a ‘thumbs up’ to my original message 😄)

zwass

07/15/2020, 4:00 PM
How are you targeting the pack? Does the query have any platform set? Shard?

Ryan

07/16/2020, 9:34 AM
I’ve set it to All Hosts, no platform or shard is set.
9:36 AM
I did the targeting in the Pack itself though, not the Query as that seems to be only for running it on demand.
9:36 AM
Query is this:
9:37 AM
and this is the Pack:

Macear

07/16/2020, 9:40 AM
@Ryan I have the same behavior. Here is my issue: https://github.com/kolide/fleet/issues/2260 I then tried updating the pack, and logs from some servers started coming in, but on some servers there are still no scheduled queries. I created a new pack targeted only at the problematic servers, will see.
10:54 AM
In my case restarting osqueryd helps, but it’s very inconvenient. I tested on CentOS 6 (osquery installed).

Ryan

07/16/2020, 11:05 AM
you mean restarting the daemon makes the scheduled packs work? does it need restarting periodically, or is this only needed after you schedule a new query?

Macear

07/16/2020, 11:29 AM
For me it works after only one restart of the daemon. Strange

Ryan

07/16/2020, 4:07 PM
Yeah that is weird!
4:08 PM
So you would create the query pack, then restart the daemon, and from then on it will work fine with no need for further restarts?

zwass

07/16/2020, 4:24 PM
It would be good to run osquery with `--verbose --tls_dump` on an affected host and see if the host is checking in for configs and Fleet is sending the correct packs to that host. What you describe with the restart of osqueryd has me suspecting perhaps an osquery bug.

Ryan

07/16/2020, 4:25 PM
Sure thing, what’s the best way to get osquery to log something locally to try this?
4:25 PM
without breaking the logging back to fleet I mean

zwass

07/16/2020, 4:25 PM
`--verbose --tls_dump` in the osquery flags
4:26 PM
then it will log requests and responses to stderr

Ryan

07/16/2020, 4:26 PM
aha
4:26 PM
ok great
4:26 PM
I will give this a try tomorrow 👍
1:47 PM
Ok so an update from my experimentation on this issue - with `--verbose` and `--tls_dump` in place I don’t see any particular errors, but nor do I see anything that mentions scheduled queries. I can see the distributed queries used for our labels coming in on the affected host, but it doesn’t appear to be receiving any packs to run on a schedule. I tried upgrading it from `4.3.0` to `4.4.0` and fully restarting it, but no joy sadly. I’m running Fleet version `2.6.0`. Any other suggestions will be greatly appreciated 😃

Macear

07/22/2020, 1:56 PM
@Ryan check the following option: --pack_refresh_interval=3600. Also check the “shard” parameter of the queries in your pack, and drop its value if it’s set. Maybe this will help.

Ryan

07/22/2020, 2:15 PM
The pack refresh option isn’t set at all in my flags file.
2:15 PM
I can add it explicitly if that’ll help?
2:16 PM
Just verified, the shard parameter is not set.

Macear

07/22/2020, 3:29 PM
Honestly, I have no idea what would fix this issue. You can add this flag, or you can check your configuration with this command: osqueryd --flagfile <path_to_flags> --config_dump
3:30 PM
Ideally, your queries should be included in the output. Otherwise I’d advise you to create a new issue.

Ryan

07/22/2020, 3:30 PM
Ok, thanks, I’ll give this a try too 🙂
3:31 PM
I’ll let it run overnight with the additional flag above and come back tomorrow. Thanks for the help so far!

zwass

07/22/2020, 4:22 PM
Do you see the config requests and responses? That interval is configured by `config_refresh`. You'd want to look for whether the returned config includes the expected packs.
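To make that check concrete, a config body captured from the dumped responses can be scanned for pack and query names. A rough sketch, assuming the response is an osquery-style JSON config with top-level `schedule` and `packs` keys; the helper name `scheduled_query_names` is hypothetical, and how the body is copied out of the log is left to you:

```python
import json

def scheduled_query_names(config_json):
    """List query names found in an osquery-style config, prefixed by pack name."""
    config = json.loads(config_json)
    names = []
    # Top-level schedule: {"query_name": {"query": ..., "interval": ...}}
    for query_name in (config.get("schedule") or {}):
        names.append(query_name)
    # Packs: {"pack_name": {"queries": {"query_name": {...}}}}
    for pack_name, pack in (config.get("packs") or {}).items():
        # A pack value that is not an object (e.g. a file path) has no inline queries.
        queries = pack.get("queries") or {} if isinstance(pack, dict) else {}
        for query_name in queries:
            names.append(f"{pack_name}/{query_name}")
    return sorted(names)
```

An empty list for an affected host would suggest the server is not sending the pack to that host, rather than osquery receiving it and failing to schedule it.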

Ryan

07/22/2020, 4:40 PM
I’ll check that tomorrow as I’m wrapping up now, thanks for the tip!

Macear

07/28/2020, 12:57 PM

Ryan

07/28/2020, 1:02 PM
Cool thanks @Macear