#fleet
nyanshak

03/08/2021, 11:51 PM
Label / Additional host query question <thread>
11:53 PM
Currently (as I understand it), `osquery_label_update_interval` and `osquery_detail_update_interval` both default to `1h`. An idea to improve this:
• Support per-query intervals (e.g., in the list of label queries, allow queryA to run once per hour and queryB to run once every 24h).
I know that intervals here are highly correlated to load, so this might not actually be feasible.
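A rough sketch of what per-query intervals could look like, for illustration only. The names `PER_QUERY_INTERVALS`, `DEFAULT_INTERVAL`, and `due_queries` are hypothetical and are not part of Fleet or osquery; the only assumption taken from the message above is that a single shared interval (default `1h`) applies today.

```python
from datetime import datetime, timedelta

# Hypothetical per-query overrides. Today a single shared interval
# applies to every label query, so this mapping is an assumption.
DEFAULT_INTERVAL = timedelta(hours=1)
PER_QUERY_INTERVALS = {
    "queryA": timedelta(hours=1),
    "queryB": timedelta(hours=24),
}


def due_queries(last_run, now, intervals=PER_QUERY_INTERVALS, default=DEFAULT_INTERVAL):
    """Return the sorted names of queries whose own interval has elapsed."""
    due = []
    for name, ran_at in last_run.items():
        interval = intervals.get(name, default)
        # A query that has never run (ran_at is None) is always due.
        if ran_at is None or now - ran_at >= interval:
            due.append(name)
    return sorted(due)


now = datetime(2021, 3, 9, 12, 0)
last_run = {
    "queryA": now - timedelta(hours=2),  # hourly query, 2h old: due
    "queryB": now - timedelta(hours=2),  # daily query, 2h old: not due
    "queryC": None,                      # never run: due
}
print(due_queries(last_run, now))  # ['queryA', 'queryC']
```

With a lookup like this, a rarely changing attribute is only re-queried when its own interval has passed, which is where the load saving would come from.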
11:58 PM
Probably separate ideas:
• For label / detail queries, it would be nice to support `run-once` semantics (to only collect the data one time). It would also be good to be able to have different intervals for labels for different Teams. I expect at least the interval part of this will already be supported when Teams comes out 🤔
• For scheduled queries / if Fleet adds an auto-updating client (like launcher), it would be good if there were a way for the launcher to support `run-immediately` (optionally?), where scheduled queries are run once right away, as well as at every `interval` afterward. This is a perpetual annoyance of mine with osquery, and I have some ideas around why it's not supported right now 🤷‍♀️
Noah Talerman

03/16/2021, 3:36 PM
Hey @nyanshak why would it be helpful to support per-query update intervals? Are both the details and label use cases related to updating the groups of hosts that are targeted?
Is the motivation for `run-once` semantics for labels / details similar to the motivation for manual labels? From my understanding, this motivation is tied to the result set never updating.
Lastly, why would it be helpful to have the `run-immediately` option? What annoyance would this help you get rid of?
nyanshak

03/16/2021, 3:47 PM
why would it be helpful to have the `run-immediately` option?
There are some queries that we want to run regularly, but infrequently. Because osquery waits until the first `interval` has passed to execute the query the first time, ephemeral hosts will never execute some queries. Even though we don't want these queries to run super-frequently (every 10 minutes, every hour, whatever), we do want them to run.
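To make the timing problem concrete, here is a minimal sketch under simplifying assumptions: time is counted in seconds from agent start, and the scheduler fires exactly on interval boundaries. `run_times` and its `run_immediately` parameter are hypothetical and only model the proposed option; they are not an existing osquery flag or API.

```python
def run_times(interval, lifetime, run_immediately=False):
    """Seconds after agent start at which a scheduled query fires
    within a host's lifetime.

    Without run_immediately, the first run happens only after one full
    interval has passed. run_immediately models the proposed option
    (an assumption, not an existing flag).
    """
    start = 0 if run_immediately else interval
    return list(range(start, lifetime + 1, interval))


HOUR = 60 * 60
DAY = 24 * HOUR

# A host that lives 6 hours never runs a once-a-day query...
print(run_times(DAY, 6 * HOUR))                        # []
# ...unless the query also fires once at startup.
print(run_times(DAY, 6 * HOUR, run_immediately=True))  # [0]
```

Any host whose lifetime is shorter than the query's interval produces an empty list in the first case, which is the gap being described.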
3:49 PM
run-once
^ There are certain attributes in some environments that we never change. To be more specific, in this env, hosts are created from an AMI and are immutable. If there is ever any change, a new AMI would be built, the original instances destroyed, and new ones spun up to replace the original (old versioned) instances.
3:49 PM
And we could reduce the overhead of running osquery by only collecting the information once, as that's the only time it would be necessary.
3:52 PM
why would it be helpful to support per-query update intervals?
Mostly that there are some attributes that never change or very infrequently change, so it's not useful to run at the same interval as other queries.
Are both the details and label use cases related to the updating the groups of hosts that are targeted?
Sort of related, not exactly the same. For details: collecting metadata about a host, some of which changes only infrequently (and so needs different intervals). For labels: this is definitely for targeting hosts, but some attributes very rarely change, and we want to reduce overhead where we can.
Noah Talerman

03/16/2021, 5:09 PM
Even though we don’t want these queries to run super-frequently (every 10 minutes, every hour, whatever), we do want them to run.
This makes sense. Zach is planning on bringing up this use case during today’s osquery office hours (I believe office hours just started). The thought is that it makes sense for the solution you proposed, or a similar solution, to make its way into osquery
nyanshak

03/16/2021, 5:10 PM
👍 can't attend but may watch the recording after
Noah Talerman

03/16/2021, 5:11 PM
Sweet!
5:18 PM
I also now better understand the pain point for unnecessarily updating attributes. Thank you for the explanation. Is solving the “reducing osquery overhead” problem closer to a nice-to-have? A somewhat related question: Does providing proof of reducing osquery overhead allow for increased confidence in osquery’s performance and thus an easier time convincing other individuals of installing the agent? Or is there a different ultimate goal for the reducing of overhead?
nyanshak

03/16/2021, 5:21 PM
Does providing proof of reducing osquery overhead allow for increased confidence in osquery’s performance and thus an easier time convincing other individuals of installing the agent?
Yes
is there a different ultimate goal for the reducing of overhead?
Ultimately, CPU cycles and memory used by osquery are costs incurred by thousands of machines, and may be the difference between (for example) running a t2.micro and a t2.small instance (or whatever the equivalent one-step upgrade in instance class would be). It also costs more in network & storage, etc. While on an individual query basis, the cost may be small, but in aggregate with many queries across many hosts, the cost becomes larger and more meaningful.
Noah Talerman

03/16/2021, 7:09 PM
While on an individual query basis, the cost may be small, but in aggregate with many queries across many hosts, the cost becomes larger and more meaningful.
Got it. This is all new to me so I have some reading to do on costs incurred by machines. This is an awesome breadcrumb to start that research