# general
d
Can someone explain the expected behavior of the events_expiry flag in the osquery config when running in daemon mode? We have set the value to 1800 seconds and expected the events stored in tables like file_events and hardware_events to expire within 30 minutes after they have been read. But this is not happening on our system. Which events does this flag cover?
l
According to the docs, the
events_expiry
flag controls the lifetime of buffered events. By default, it's set to 1 day. Have you invoked any further optimizations? https://osquery.readthedocs.io/en/latest/development/pubsub-framework/#query-and-table-usage
s
The documentation is currently slightly imprecise regarding which value is used. The
events_expiry
value is used (for scheduled queries) only when it is higher than an internal value that is calculated per evented table. For each evented table, osquery takes the maximum interval among all the scheduled queries on that table, multiplies it by 3, and rounds the result up to the next multiple of a minute. If that calculated value is higher than
events_expiry
, then it will use that calculated value, otherwise
events_expiry
will be used. Practical example: You have 3 queries, 2 on
hardware_events
and 1 on
process_events
, with query intervals of 30, 45 and 60 seconds respectively. Between 30 and 45, the larger value, 45, is chosen. Then 45 * 3 = 135, and the next multiple of 60 is 180, so the minimum is set to 180. For the
hardware_events
table, any value of
events_expiry
lower than 180 won't be used. For the
process_events
table, the value ends up being the same because 60 * 3 = 180. Finally, you can see in the logs which one is used, because you should get a message like this when the
events_expiry
value has not been used:
I0404 23:53:50.431721 865360 eventfactory.cpp:352] The minimum events expiration timeout for hardware_events has been adjusted: 180
Now, given the high expiration value you set, I assume you do not have a query interval for which that internal calculation ends up higher. Keep in mind, though, that there is no timer for expiration: it happens when a query runs against that table, not just because the events have been read once. So you need to keep querying the table, and at each query osquery will check whether it can expire events. That being said, there have also been a couple of bugs with event expiration in the past. @Divya, which osquery version are you using?
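The per-table calculation described above can be sketched roughly as follows. This is only an illustration of the arithmetic from this thread, not osquery's actual code; the function name `minimum_expiry_per_table` and the `(table, interval)` input shape are made up for the example.

```python
from collections import defaultdict


def minimum_expiry_per_table(scheduled, events_expiry):
    """Return the expiry (in seconds) that would apply to each evented table.

    scheduled: iterable of (table_name, interval_seconds) pairs, one per
               scheduled query (hypothetical input shape).
    events_expiry: the configured events_expiry value in seconds.
    """
    # Longest scheduled interval seen for each table.
    longest = defaultdict(int)
    for table, interval in scheduled:
        longest[table] = max(longest[table], interval)

    effective = {}
    for table, interval in longest.items():
        tripled = interval * 3
        # Round up to the next multiple of 60 (exact multiples stay as-is).
        rounded = -(-tripled // 60) * 60
        # The calculated value wins only when it exceeds events_expiry.
        effective[table] = max(events_expiry, rounded)
    return effective


schedule = [
    ("hardware_events", 30),
    ("hardware_events", 45),
    ("process_events", 60),
]
print(minimum_expiry_per_table(schedule, events_expiry=120))
print(minimum_expiry_per_table(schedule, events_expiry=1800))
```

With the three example queries, any `events_expiry` below 180 is raised to 180 for both tables, while 1800 is kept as configured.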
l
@Stefano Bonicatti, thank you for the wealth of information. Given there are more details about
events_expiry
, can I raise a PR to update the docs?
d
@Stefano Bonicatti thanks for explaining it in detail. 1. Package version: we are using osquery 4.8.0-1 (Linux). As per the docs, the bugs have been fixed in this version. Correct me if my understanding is wrong! 2. We query the tables at a 15-minute interval, but our queries are not scheduled using the schedule section in the .conf file. We have a Docker container running a Go program that uses the "github.com/kolide/osquery-go" package. We run individual queries using the query function and use go-cron to schedule them. Will this cause
events_expiry
to not be respected? Another general question: I am also confused about when to use
events_max
. From a few posts I have read, my understanding is that both flags should be used together for the expiry to work. Is this true?
s
@Linda Zhou Updates to the docs are always appreciated, please go ahead! There are other things that are not described in detail though. I should again stress that what I described is only true for scheduled queries, meaning that that calculation and the fact that expiration happens on
SELECT
is only for scheduled queries. An additional detail, though, is that expiration happens after all the queries against a table have been executed. Scheduled queries run in an infinite loop, so every time all the queries against an evented table have run, expiration of old events happens. But there are also two other places where events are expired: one is at osquery startup, the other is when new events are being added. Every 256 event batches (more on batches later), expiration of old events is triggered. Then there's
events_optimize
, which is described as
apply optimizations when SELECTing from events-based tables, enabled by default.
This is true, but it misses details, like the fact that, again, it only works with scheduled queries. The way it works is that for each scheduled query, osquery keeps track of the event time of the most recently queried event, so that if that same scheduled query runs again, only newer events are returned. This might be confused with event expiration, but the events are still in the database, so if a different query on the same table runs, it will return the "old" events again, once. For
events_max
, the default is actually 50k, and the unit is not events as in rows, but event batches. Depending on how an event publisher has been written and on how many events it was collecting at a certain point in time, multiple events could be written, as an optimization, as a single batch. So what that value exactly means is a bit variable today; it doesn't always map to 50k events. Expiration of event batches that have gone beyond the limit happens at startup or when new events come in (every 256 batches).
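To make the difference between the `events_optimize` behavior and real expiration concrete, here is a toy model. It is not how osquery stores events; `ToyEventTable`, its methods, and the query names are invented for illustration. It only mimics the rule described above: each query remembers the newest event time it has returned, and nothing is ever deleted.

```python
class ToyEventTable:
    """Toy stand-in for an evented table with per-query optimization."""

    def __init__(self):
        self.events = []     # (event_time, payload); never removed in this model
        self.last_seen = {}  # query name -> newest event time already returned

    def add(self, event_time, payload):
        self.events.append((event_time, payload))

    def select(self, query_name):
        # Return only events newer than what this query has already seen.
        cutoff = self.last_seen.get(query_name)
        rows = [e for e in self.events if cutoff is None or e[0] > cutoff]
        if rows:
            self.last_seen[query_name] = max(t for t, _ in rows)
        return rows


table = ToyEventTable()
table.add(100, "usb attached")
table.add(200, "usb detached")

first = table.select("query_a")    # both events
second = table.select("query_a")   # nothing new for the same query
other = table.select("query_b")    # a different query sees the "old" events once
print(len(first), len(second), len(other), len(table.events))
```

In this model the stored events stay at two throughout, which is the point: not seeing an event again from the same query does not mean it has been expired.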
So @Divya, I believe this should answer your question about
events_max
which is not a must for event expiration to work. But if you're using a distributed query mechanism then events expiration (of old events, or events that have gone beyond the max threshold), will happen only if events keep coming, or if you restart osquery. So something like
hardware_events
might keep its events for a very long time, beyond what was configured in
events_expiry
Sorry, now I should correct myself (I was retesting a couple of things, because there are some unexpected interactions that make the shell behave differently from the daemon). The expiration on
SELECT
does work for non-scheduled queries, but if there's a scheduled query on the same table too, then expiration will happen only if that scheduled query runs. This means that if you're in the shell, running queries against a table that also has a scheduled query, events will never expire, because the scheduler does not run in the shell.
So for queries coming through an extension, those should trigger expiration of old events, as long as either there aren't scheduled queries for the same table, or the scheduled query has already run. @Divya With the explanation above,
events_max
is not needed for event expiration; it's just another way to control the amount of events in the database. As for the osquery version, there was an additional fix in 4.9.0 for a bug that prevented expiration when new events came in.
@Divya going back to your problem, as far as I understand you don't have scheduled queries on the same table at all? Also, how are you detecting that events are not being expired; are you receiving old events through the query well after those 30mins have expired?
d
Yes to both of your questions. We query the timestamp of when each event happened, and we also saw events from previous days.
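A check like the one described here could look roughly like this. It is a sketch under assumptions: the rows are taken to be dictionaries of strings, as a query client might return them, and the `time` column is assumed to hold a Unix timestamp in seconds. The helper name `stale_rows` and the sample rows are made up.

```python
import time


def stale_rows(rows, events_expiry, now=None):
    """Return the rows whose event time is older than now - events_expiry.

    rows: list of dicts with a "time" key (assumed Unix seconds, as a string).
    events_expiry: configured expiry in seconds.
    now: current Unix time; defaults to the system clock.
    """
    now = int(time.time()) if now is None else now
    cutoff = now - events_expiry
    return [row for row in rows if int(row["time"]) < cutoff]


# Hypothetical query results, for illustration only.
sample_rows = [
    {"time": "1000", "action": "CREATED"},
    {"time": "5000", "action": "UPDATED"},
]
print(stale_rows(sample_rows, events_expiry=1800, now=5600))
```

Since expiration only runs at the trigger points discussed in this thread, rows slightly past the cutoff are not necessarily a problem; rows from previous days with a 30-minute expiry are the kind of result worth including in a bug report.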
s
I see, this is indeed unexpected. Maybe I missed something in the logic, but I would open an issue with all the information you can provide. I also double-checked the fixes on event expiration, and to mitigate the issue there's actually an additional fix in 5.1.0, so you should at least be covered by event expiration when events are gathered. In general we suggest not using outdated versions, because we also update libraries which can have CVEs. Be sure to read through the release notes, though, since 5.0.1 has some breaking changes.
In any case though, expiration on
SELECT
not happening, especially if there's no scheduled query, sounds like a bug.
d
Thanks, will go through the 5.1.0 docs
Should I open an issue in github?
s
Yes please!
d
l
"because we also update libraries which can have CVEs" What is a CVE?
s
“Common Vulnerabilities and Exposures”, it’s a method to let people know that some versions of a software/library have security issues. Normally each security issue gets a CVE id. For instance, see some of the CVEs coming from libexpat (the ones from 2021 onward) that osquery will fix in 5.3.0: https://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=expat
d
@Deepak
d
Hi @Stefano Bonicatti, is 4.x maintained as a separate release line from 5.x, or will all bug fixes land only in the latter? While we wait on issue https://github.com/osquery/osquery/issues/7546, we wanted to know whether migrating to 5.x is the only option, especially given that there are breaking changes.
s
Hello @Deepak, no, the releases are always incremental,
so upgrading to a newer version is required.