# general
j
Or just literally the process_event types… ie, not any of the notify types (like login/authorization?)
s
For all the event publishers the types are the ones that have a relative table. So right now we have process events and FIM, therefore those are the only events.
j
Ah, I seem to have misunderstood and thought osquery was a direct way to log any/all ES events
Oh, sorry the added bit is we slurp up a local output of queries to our SIEM
Well, I guess a follow-up is: is osquery interested in being a generic “query any and all ES events” thing, or is it more “osquery has a clear set of goals, and we are only using the parts of ES that further that” (which is fine)?
s
The quick answer: osquery would definitely like to gather additional events, but the current architecture, and the current ideas about how data has to be presented, don't allow osquery to do that in a generic way, i.e. to have osquery be just a direct event collector with no data processing. There have been discussions about changing some of that, but they are still ongoing, and ultimately it's also a matter of someone having the time to make the changes, if a different path is decided.
The long answer is a bit involved; there are many aspects at play here. First of all, osquery has so far held that data should be gathered and presented via tables, with SQL used to query those tables. Additionally, those tables should not be something very generic where you can dump everything (you could obviously think of having, for instance, `type` and `data` columns which can contain every type of event); a schema has to be a bit more specialized to be immediately useful. Also, depending on the event type, events ideally have to be augmented, where possible, with other information that may come from previous events. For instance, if you have a `fork` event and you always want to see the command line arguments of the process that's forking, you need to keep a cache of that data, which you can for instance collect from previous `exec`s, keyed by `pid`, so that you can recall it. Finally, each event type might need slightly different logic to be collected, so all in all we need code to do that, and the event types that are currently present are the ones that have been contributed. I think osquery would definitely like to have increased visibility.
Back on the topic of being a more generic/vanilla collector, and how event data is presented: there have been some discussions, mostly due to the performance of the current implementation, which often doesn't permit osquery to collect events on high-traffic machines. Events have to be stored in an intermediate place so that they can be queried, and they get processed multiple times, which adds to the overhead. There's no conclusion yet, and while personally I (and others in the community) think that osquery definitely needs a performance improvement in that area, I'm not officially representing the roadmap. Normally these discussions happen during office hours.
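To make the cache idea above concrete, here is a minimal sketch in Python. It is not osquery's actual implementation (which is C++); the `Event` and `CmdlineCache` names and their fields are hypothetical, chosen only for illustration. It assumes `exec` events carry a command line while `fork` and `exit` events arrive without one.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Event:
    """Hypothetical, simplified process event (not osquery's real schema)."""
    kind: str                        # "exec", "fork" or "exit"
    pid: int                         # process that generated the event
    cmdline: Optional[str] = None    # only populated on "exec" in this sketch
    child_pid: Optional[int] = None  # only populated on "fork" in this sketch


class CmdlineCache:
    """Remembers the last command line seen for each pid."""

    def __init__(self) -> None:
        self._by_pid: Dict[int, str] = {}

    def process(self, ev: Event) -> Event:
        if ev.kind == "exec":
            # exec carries the command line: remember it for later events.
            if ev.cmdline is not None:
                self._by_pid[ev.pid] = ev.cmdline
        elif ev.kind == "fork":
            # fork has no command line of its own: recall the forking
            # process's one, and let the child inherit it until it execs.
            ev.cmdline = self._by_pid.get(ev.pid)
            if ev.child_pid is not None and ev.cmdline is not None:
                self._by_pid[ev.child_pid] = ev.cmdline
        elif ev.kind == "exit":
            # exit: augment the event, then drop the entry, since the pid
            # may later be reused by an unrelated process.
            ev.cmdline = self._by_pid.pop(ev.pid, None)
        return ev


cache = CmdlineCache()
cache.process(Event("exec", pid=100, cmdline="/bin/sh -c ls"))
forked = cache.process(Event("fork", pid=100, child_pid=101))
exited = cache.process(Event("exit", pid=101))
```

Evicting on `exit` is one way to keep the cache bounded and to limit stale hits from pid reuse. A real collector would also have to handle events for processes that started before collection began, which this sketch simply leaves with an empty command line.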
@sharvil has worked the most on the macOS side, and was also the one that introduced the ES tables/events. He recently augmented the ES FIM table so that you can specify specific files/paths to be monitored. He may have some ideas about this too (shameless ping).
j
Yeah, we are talking a little about this over on the macadmins slack
I appreciate the complexities here
The ES events really, really want to be self-joined to be able to grab info about parent processes, etc
Using the pid as sort of an ephemeral primary key
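A rough sketch of that kind of self-join, using Python's built-in `sqlite3` against an in-memory table. The `proc_events` table and its columns are made up for the example and are not the real `es_process_events` schema; the rows are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical, simplified event table (not the real osquery schema).
conn.execute(
    "CREATE TABLE proc_events ("
    " time INTEGER, event_type TEXT, pid INTEGER,"
    " parent_pid INTEGER, cmdline TEXT)"
)
conn.executemany(
    "INSERT INTO proc_events VALUES (?, ?, ?, ?, ?)",
    [
        (1, "exec", 100, 1, "/bin/zsh"),
        (2, "exec", 200, 100, "/usr/bin/curl example.com"),
        (3, "exec", 300, 100, "/bin/ls -l"),
    ],
)

# Join each exec event to an earlier exec event of its parent, using
# the pid as the (ephemeral) key. LEFT JOIN keeps events whose parent
# was never observed.
query = """
SELECT c.pid, c.cmdline, p.cmdline AS parent_cmdline
FROM proc_events AS c
LEFT JOIN proc_events AS p
  ON p.pid = c.parent_pid
 AND p.event_type = 'exec'
 AND p.time <= c.time
WHERE c.event_type = 'exec'
ORDER BY c.time
"""
joined = conn.execute(query).fetchall()
for row in joined:
    print(row)
```

Because pids get reused, a join like this can match the wrong parent, or several rows for the same pid, over a long enough window. That is the "ephemeral" part, and a real query would need to narrow the match further, for example by time range.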
s
I should mention, since we are discussing this, that I'm working on adding parent cmdline, cmdline count and path to the `es_process_events` table. Also, thanks to that cache, I'm augmenting the `fork` and `exit` events to get the cmdline and count for the process, since they are currently empty. That comes for free thanks to the cache.
j
I appreciate that very much