I am facing an issue with process_events table when a pid is reused | osquery #general

vaar

06/19/2020, 9:03 PM

I am facing an issue with the process_events table: when a pid is reused, joins with other tables can return wrong results. Did you ever think about implementing an additional random pid (larger than uint16)? Most EDRs implement this to avoid pid reuse.

zwass

06/19/2020, 9:13 PM

This is an interesting one... I'd be curious to know more about how other tools deal with it. Can you open an issue on GitHub with some description of that?

seph

06/21/2020, 12:56 PM

I'm curious how that works in practice, and how it helps.

Mike Myers

06/22/2020, 9:56 PM

Maybe osquery could tag the process events with an additional randomly generated ID for uniqueness in the logs, but I don't think it would avoid the issue you're seeing, which I bet is a race condition between two point-in-time queries that constitute the JOIN. This is a design limitation, I think. @alessandrogario might know differently

seph

06/22/2020, 9:57 PM

I’m hesitant to suggest adding a ULID without understanding the problem it would solve. I don’t think it would help correlate between osquery tables. Best case, it provides a unique identifier for external systems, but said external systems should be able to make their own unique identifiers.

alessandrogario

06/22/2020, 10:01 PM

How would an additional ID solve this issue? We need a way to map pid -> internal-pid-that-is-never-reused -> back to pid

alessandrogario

06/22/2020, 10:02 PM

we can come up with a way to generate it, but when joining we always end up using a standard pid once again

seph

06/22/2020, 10:02 PM

Yes, exactly. And it’s extra state to track

alessandrogario

06/22/2020, 10:02 PM

i.e.: we decide to use SPECIAL_UUID, convert it to pid and then access /proc/<pid>

alessandrogario

06/22/2020, 10:02 PM

but that pid may have been reused anyway

alessandrogario

06/22/2020, 10:03 PM

there should be a setting named pid_max that can be tweaked

alessandrogario

06/22/2020, 10:03 PM

(in linux)
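For reference, the Linux setting is exposed as `/proc/sys/kernel/pid_max`. The snippet below is a minimal sketch of reading it; the helper name `read_pid_max` is made up for illustration and is not part of osquery.

```python
from pathlib import Path


def read_pid_max(path: str = "/proc/sys/kernel/pid_max") -> int:
    """Return the highest pid value the kernel will hand out before wrapping.

    The path parameter exists only so the function can be pointed at a
    different file (for example in a test); on Linux the default applies.
    """
    return int(Path(path).read_text().strip())


if __name__ == "__main__":
    # Typically prints 32768 on a default configuration, but this varies.
    print(read_pid_max())
```

Raising the value (for example with `sysctl -w kernel.pid_max=4194304` on a 64-bit kernel) only makes reuse rarer; it does not prevent it.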

alessandrogario

06/22/2020, 10:03 PM

best option would be to stop reusing them, but I'm not sure if the kernel can be configured to avoid that

alessandrogario

06/22/2020, 10:06 PM

maybe there's something we can do, but i have to test some stuff first

seph

06/22/2020, 10:07 PM

AFAIK the underlying os apis use pid for correlation. So I don’t see what there is to do

Mike Myers

06/22/2020, 10:13 PM

I somewhat recall PID-reuse behavior being OS-specific too

vaar

06/24/2020, 1:41 PM

yeah, it is not easy to solve with osquery. Some EDRs build an internal mapping from process start time and pid, to have a unique identifier for a process in case of pid reuse. In osquery it would be easy to have that in the processes table, but not in the other tables that have a pid field.
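The (pid, start time) mapping described here can be sketched as a join key. This is only an illustration: the row layout and the `start_time` field on event rows are assumptions, not the actual osquery schema, and `ProcessKey` and `join_on_process_key` are hypothetical names.

```python
from typing import Dict, List, NamedTuple, Tuple


class ProcessKey(NamedTuple):
    """Identifies one process instance: a pid plus the time it started."""
    pid: int
    start_time: int


def join_on_process_key(
    events: List[Dict], processes: List[Dict]
) -> List[Tuple[Dict, Dict]]:
    """Pair event rows with process rows only when pid AND start time agree.

    A plain pid join would also match a newer process that happens to have
    been given the same pid; the extra start_time component filters those out.
    """
    index = {ProcessKey(p["pid"], p["start_time"]): p for p in processes}
    joined = []
    for event in events:
        match = index.get(ProcessKey(event["pid"], event["start_time"]))
        if match is not None:
            joined.append((event, match))
    return joined


if __name__ == "__main__":
    events = [{"pid": 4242, "start_time": 100, "path": "/usr/bin/curl"}]
    # pid 4242 was reused by a later process with a different start time.
    processes = [{"pid": 4242, "start_time": 900, "name": "sshd"}]
    print(join_on_process_key(events, processes))
```

As noted in the message, the hard part is that only some tables could supply the start time, so the key is not available everywhere a pid appears.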

alessandrogario

06/24/2020, 1:52 PM

Yeah I actually had an idea for this the other day:
1. We add support for a secondary process id (internally we can use pid.timestamp so that we don't have state to track)
2. Add support for using that ID in SQL
3. Update our utilities that scan /proc so that they opendir the pid folder under /proc in order to lock it, then fstat to check the timestamp, and return ENOENT if they don't match
cc @theopolis

👍 2
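A rough sketch of step 3, under several assumptions: the secondary ID is written as `pid.starttime`, and the timestamp is taken from the `starttime` field (field 22) of `/proc/<pid>/stat` instead of an `fstat` on the directory, which is a swapped-in technique for this illustration. All function names are hypothetical and nothing here is osquery code.

```python
import errno
import os
from typing import Tuple


def parse_starttime(stat_text: str) -> int:
    """Extract field 22 (starttime, in clock ticks) from /proc/<pid>/stat text.

    The comm field may contain spaces or parentheses, so split after the
    last ')' first; the remaining fields start at field 3 (state).
    """
    rest = stat_text.rsplit(")", 1)[1].split()
    return int(rest[19])  # field 22 is index 22 - 3 in the remainder


def split_secondary_id(secondary_id: str) -> Tuple[int, int]:
    """Split a 'pid.starttime' identifier into its two integer parts."""
    pid_text, start_text = secondary_id.split(".", 1)
    return int(pid_text), int(start_text)


def open_proc_dir_checked(secondary_id: str, proc_root: str = "/proc") -> int:
    """Open the pid directory first, then verify the start time through it.

    Returns a directory fd the caller must close. Raises FileNotFoundError
    (ENOENT) when the pid now belongs to a different process.
    """
    pid, expected_start = split_secondary_id(secondary_id)
    dir_fd = os.open(os.path.join(proc_root, str(pid)),
                     os.O_RDONLY | os.O_DIRECTORY)
    try:
        # Read 'stat' relative to the already opened directory fd.
        stat_fd = os.open("stat", os.O_RDONLY, dir_fd=dir_fd)
        with os.fdopen(stat_fd) as stat_file:
            actual_start = parse_starttime(stat_file.read())
    except BaseException:
        os.close(dir_fd)
        raise
    if actual_start != expected_start:
        os.close(dir_fd)
        raise FileNotFoundError(errno.ENOENT, "pid was reused", secondary_id)
    return dir_fd
```

The intent of opening the directory before checking is that later reads go through the same handle, so they should not silently land on a newer process that received the same pid.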

alessandrogario

06/24/2020, 1:54 PM

if it's interesting, we can open a blueprint issue so that people can weigh the pros and cons of implementing something like this

seph

06/24/2020, 5:40 PM

Is the intent there so that if you join between two tables, something implied (like the pid timestamp) will prevent joins from breaking? That seems clever. Though I wonder what problem we’re solving.

seph

06/24/2020, 5:41 PM

Is it joins in a short time interval? Or is it archival data in some SIEM?

alessandrogario

06/24/2020, 5:43 PM

How short the interval is depends on pid_max, osquery event expiration, how often events are generated, and how often scheduled queries are hitting those tables
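As a back-of-the-envelope illustration of how these factors interact, the sketch below estimates how long it takes for pids to wrap around. It assumes pids are handed out roughly sequentially up to the configured maximum and ignores pids still in use; the function names and the sample numbers are illustrative only.

```python
def seconds_until_pid_wrap(pid_max: int, spawns_per_second: float) -> float:
    """Rough time for the pid counter to cycle once at a steady spawn rate."""
    if spawns_per_second <= 0:
        raise ValueError("spawns_per_second must be positive")
    return pid_max / spawns_per_second


def reuse_possible_within(pid_max: int, spawns_per_second: float,
                          window_seconds: float) -> bool:
    """True if a pid could plausibly be reused while an event is still retained.

    window_seconds stands for however long a recorded event can still be
    joined against live tables (event expiration, query interval, and so on).
    """
    return seconds_until_pid_wrap(pid_max, spawns_per_second) <= window_seconds


if __name__ == "__main__":
    # 50 process spawns per second with events kept for one hour.
    print(reuse_possible_within(32768, 50.0, 3600))
    print(reuse_possible_within(4194304, 50.0, 3600))
```

With these sample numbers the smaller limit wraps in about 11 minutes, well inside the one-hour window, while the larger limit takes roughly 23 hours.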

alessandrogario

06/24/2020, 5:45 PM

but yeah a blueprint could give us more feedback from users

alessandrogario

06/24/2020, 5:45 PM

it doesn't seem like archival is required now (unless I'm wrong)
