Hi Team!
Could anyone please explain the inner workings of the eBPF implementation inside osquery/events/linux/bpf/? As far as I can see, there is a system state tracker, which is also used to track several file operations (and connect them to the corresponding PID). Are these file operations necessary for process/socket tracing, or are they just used to enrich state/tables?
alessandrogario
03/09/2021, 11:13 AM
Hey! I can provide some information on this. Whenever a binary is executed, we may have to deal with some unknowns, depending on how it was launched:
1. execve(absolute_path, ...)
2. execve(relative_path, ...)
3. execveat(AT_FDCWD, relative_path, ...)
4. execveat(fd=somedir, relative_path, ...)
5. execveat(fd=?, absolute_path)
alessandrogario
03/09/2021, 11:14 AM
Here are the possible conditions we can find ourselves in:
1. absolute path (easy)
2. path relative to the current working directory
3. path relative to a directory specified with a file descriptor
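The three conditions above can be sketched as a single resolution function. This is only a rough illustration and not the osquery code: `resolve_exec_path`, `cwd` and `fd_table` are hypothetical names standing in for whatever the state tracker knows about the process.

```python
import posixpath

AT_FDCWD = -100  # value of AT_FDCWD on Linux


def resolve_exec_path(path, cwd, dirfd=AT_FDCWD, fd_table=None):
    """Return an absolute path for an execve()/execveat() target, or None.

    cwd and fd_table (fd -> directory path) are hypothetical inputs that
    represent the tracked state of the calling process.
    """
    # Condition 1: an absolute path ignores both the cwd and the dirfd.
    if posixpath.isabs(path):
        return posixpath.normpath(path)

    if dirfd == AT_FDCWD:
        # Condition 2: relative to the current working directory.
        base = cwd
    else:
        # Condition 3: relative to a directory opened as a file descriptor.
        base = (fd_table or {}).get(dirfd)

    if base is None:
        return None  # the tracker has no information for this cwd/fd

    # Note: normpath() collapses ".." lexically and does not follow symlinks.
    return posixpath.normpath(posixpath.join(base, path))
```

For example, `resolve_exec_path("tool", "/home/user", dirfd=5, fd_table={5: "/opt/bin"})` can only be answered because the tracker already knows which directory fd 5 points to.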
alessandrogario
03/09/2021, 11:15 AM
Condition 2 implies that we need to know when the process changes its cwd; there are three different ways to do it:
• chdir(absolute_path)
• chdir(relative_path)
• fchdir(file descriptor of a directory)
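A minimal sketch of how a tracker could follow those three calls, assuming it already holds the current cwd and an fd -> path map for the process. `ProcessState`, `on_chdir` and `on_fchdir` are made-up names for illustration, not osquery APIs.

```python
import posixpath


class ProcessState:
    """Hypothetical per-process state: current directory plus known fds."""

    def __init__(self, cwd):
        self.cwd = cwd
        self.fds = {}  # fd -> path of the opened file or directory


def on_chdir(state, path):
    # posixpath.join() discards the old cwd when `path` is absolute, so this
    # covers both chdir(absolute_path) and chdir(relative_path).
    state.cwd = posixpath.normpath(posixpath.join(state.cwd, path))


def on_fchdir(state, fd):
    # fchdir() only passes a descriptor, so the path must come from the map.
    path = state.fds.get(fd)
    if path is None:
        return False  # unknown descriptor: the cwd can no longer be trusted
    state.cwd = path
    return True
```

The `fchdir` case is the one that forces file descriptor tracking: the syscall arguments alone never contain the directory path.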
alessandrogario
03/09/2021, 11:16 AM
Condition 3 has the same problem: we need to be able to easily translate fd -> path
alessandrogario
03/09/2021, 11:17 AM
So the way to solve this was to have osquery start the event collection, then take a procfs snapshot once and keep it up to date with the incoming events
alessandrogario
03/09/2021, 11:18 AM
This minimizes the need to access procfs, which is not synchronized with the event stream we receive from BPF (i.e., we could get an event for a process that is no longer visible in procfs because it has terminated)
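As a rough sketch of the "snapshot once, then follow the events" idea: the snapshot is passed in as a plain dictionary here, and the class and method names are hypothetical. A real tracker would build the snapshot from procfs and handle many more event types.

```python
import copy


class StateTracker:
    """Hypothetical tracker seeded once from a snapshot, then event-driven."""

    def __init__(self, snapshot):
        # snapshot: {pid: {"cwd": str, "fds": {fd: path}}}
        self.procs = copy.deepcopy(snapshot)

    def on_fork(self, parent_pid, child_pid):
        parent = self.procs.get(parent_pid)
        if parent is None:
            # Parent was never seen: start empty instead of reading procfs.
            self.procs[child_pid] = {"cwd": None, "fds": {}}
        else:
            # A forked child starts with a copy of the parent's cwd and fds.
            self.procs[child_pid] = copy.deepcopy(parent)

    def on_exit(self, pid):
        self.procs.pop(pid, None)

    def cwd_of(self, pid):
        proc = self.procs.get(pid)
        return proc["cwd"] if proc else None
```

Because every lookup is answered from the tracker's own state, an event for a process that has already exited can still be resolved as long as its exit event has not been processed yet.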
alessandrogario
03/09/2021, 11:19 AM
there are, however, many ways to work with file descriptors: they can be created, destroyed, duplicated, and so on. We have to track all of it if we want to reliably translate file descriptors to file/folder paths
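To make the create/destroy/duplicate cases concrete, here is a minimal sketch of a per-process descriptor table. `FdTable` and its method names are invented for this example, and the handlers are assumed to run only for syscalls that returned successfully.

```python
class FdTable:
    """Hypothetical fd -> path map for one process."""

    def __init__(self):
        self.paths = {}

    def on_open(self, fd, path):
        # open()/openat() returned `fd` for `path` (already made absolute).
        self.paths[fd] = path

    def on_close(self, fd):
        self.paths.pop(fd, None)

    def on_dup(self, old_fd, new_fd):
        # Covers dup(), dup2() and dup3(): new_fd now refers to whatever
        # old_fd refers to, replacing anything new_fd pointed to before.
        if old_fd in self.paths:
            self.paths[new_fd] = self.paths[old_fd]
        else:
            # We never saw old_fd being opened, so drop any stale mapping.
            self.paths.pop(new_fd, None)

    def lookup(self, fd):
        return self.paths.get(fd)
```

Missing any one of these events leaves the table pointing at the wrong path, which is why the tracker has to follow all of them and not only the calls a table actually reports.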
alessandrogario
03/09/2021, 11:21 AM
For socket events the situation is similar: system calls such as connect() or listen() all accept file descriptors
alessandrogario
03/09/2021, 11:21 AM
being able to track them means we can sometimes tell whether bind() was called on a socket before listen(), so we can show more information in the table
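A small sketch of that enrichment, assuming the tracker remembers the address passed to bind() per (pid, fd). The class name and the row field names are hypothetical and are not the actual osquery table columns.

```python
class SocketTracker:
    """Hypothetical (pid, fd) -> socket state map."""

    def __init__(self):
        self.sockets = {}

    def on_socket(self, pid, fd):
        self.sockets[(pid, fd)] = {"local": None}

    def on_bind(self, pid, fd, address, port):
        entry = self.sockets.setdefault((pid, fd), {"local": None})
        entry["local"] = (address, port)

    def on_listen(self, pid, fd):
        # listen() only receives the fd and a backlog, so the local address
        # is available only if an earlier bind() on this fd was observed.
        entry = self.sockets.get((pid, fd))
        local = entry["local"] if entry else None
        return {
            "pid": pid,
            "fd": fd,
            "syscall": "listen",
            "local_address": local[0] if local else None,
            "local_port": local[1] if local else None,
        }

    def on_close(self, pid, fd):
        self.sockets.pop((pid, fd), None)
```

When the bind() was not observed, the row is still emitted, just with the address fields left empty.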
alessandrogario
03/09/2021, 11:22 AM
the general theme here is that BPF is a pure system call tracer in this case, so we can only see what is passed to (or returned by) the syscall we are monitoring
alessandrogario
03/09/2021, 11:23 AM
compare that to Audit, where each event has many decorators that provide everything we need for free (like fd -> path mapping with AUDIT_PATH records)
alessandrogario
03/09/2021, 11:23 AM
we really don't have much in the event stream, only what the applications are passing to the functions
alessandrogario
03/09/2021, 11:23 AM
Let me know if this answers your questions! I'm happy to keep chatting about BPF 🙂