# general
d
I’m wanting to build a query where I add the uptime as a column, regardless of how many rows come back. My use case is the file_events table, so that in the SIEM I can ignore events where uptime < n minutes. I’m kinda not great at building this sort of query, could someone give me a hand?
select * from file_events …. join something from uptime…
?
t
So, given uptime is only known at the point of query time, you'd need to do a bit of math on the join to work out the uptime offset for each file event, pulling the current timestamp from osquery's time table. This assumes everything is in seconds:
-- uptime at the moment the event fired =
--   current uptime minus seconds elapsed since the event
SELECT fe.*,
       (u.total_seconds - (t.unix_time - fe.time)) AS event_uptime_offset
FROM file_events fe
CROSS JOIN uptime u
CROSS JOIN time t;
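From there, the n-minute cut can happen in the SIEM or right in the query. As a sketch, dropping anything that fired inside the first 10 minutes of uptime (600 here is just a stand-in for your n) would look like:
-- keep only events that occurred after the box had been up for >= 10 minutes
SELECT fe.*,
       (u.total_seconds - (t.unix_time - fe.time)) AS event_uptime_offset
FROM file_events fe
CROSS JOIN uptime u
CROSS JOIN time t
WHERE (u.total_seconds - (t.unix_time - fe.time)) >= 600;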
p
I did this just recently, which ended up looking like the following:
SELECT file_events.*, uptime.total_seconds AS uptime, users.username
    FROM file_events
    CROSS JOIN uptime
    LEFT JOIN users
      ON file_events.uid = users.uid;
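For us the uptime >= n filtering happens downstream in the SIEM, but if you'd rather drop the early-boot noise before it leaves the host, the same cut works inline (600 is just an example threshold):
SELECT file_events.*, uptime.total_seconds AS uptime, users.username
    FROM file_events
    CROSS JOIN uptime
    LEFT JOIN users
      ON file_events.uid = users.uid
    WHERE uptime.total_seconds >= 600;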
t
@Peter your approach correctly adds the current system uptime as a column on each file event, but it doesn't account for what the uptime was when each event actually occurred: every row gets the same value, the uptime at the moment the query executes. Is that what you intended? Any idea how to align it with @Daniel Cross's use case?
p
Yeah, in this case our FIM events are emitted as frequently as we can with osquery and flushed to our log pipeline. It does mean there’s the potential for a delta between the uptime annotated on the event and when the event actually occurred, but this is only used for coarse filtering of events which occurred while the resource was still within N minutes of power-on, in order to eliminate the boot-time churn we expect to see for certain workloads. As you mention, if you want exact uptime entries for each event, you’ll need to do the math at query time so that the uptime on each event is accurate for that individual event 🙂
Great call out though! 😄
t
I used to run queries like this every 30s or less, so the uptime mismatch would never be more than that. Not sure if that's doable for either of you
👍 1
s
This feels like the kind of thing you might want to use decorators for
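Roughly: a decorator is just a SQL snippet in the osquery config whose result columns get stamped onto every scheduled query result. A minimal sketch (untested), which would live under the config's "decorators" -> "always" key:
-- Runs alongside every scheduled query; its columns are appended to each
-- result row as decorations:
SELECT total_seconds AS uptime FROM uptime;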
👍 1
p
Fair, though it'll also depend on event load, as each decorated field adds overhead. As a concrete example, we avoid decorators unless they're absolutely required for this reason. We currently generate somewhere between 650 and 900 million osquery events a day, so any new decorator is very carefully considered, as it adds overhead to every one of those events. Our rubric is that if a field doesn't yield use outside of a single set of detections, or similar use cases, it gets added as a column in the relevant queries instead.
I mainly mention this as it's easy to lose track of how much decorators are costing to store and process. We made that mistake in the past, so it's good to be cognisant of, so you don't end up like we did 😀
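To put rough, purely illustrative numbers on it: a single decorated field averaging ~20 bytes on 900 million events a day is 20 B × 900M ≈ 18 GB of extra data per day, before any indexing or replication.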
s
It’s a good point — though a lot depends on the load, and what processing those events entails.
💯 1
d
Thanks for the query @Peter, this is perfect. Great to hear some insight on the use of decorators, too.