# ebpf
h
Hi Team! Could anyone please explain the inner workings of the eBPF implementation inside osquery/events/linux/bpf/? As far as I can see, there is a system state tracker, which is also used to track several file operations (and connect them to the corresponding PID). Are these file operations necessary for process/socket tracing, or are they just used to enrich state/tables?
a
Hey! I can provide some information on this. Whenever a binary is executed, we may have to deal with some unknowns:
1. execve(absolute_path, ...)
2. execve(relative_path, ...)
3. execveat(AT_FDCWD, relative_path, ...)
4. execveat(fd=somedir, relative_path, ...)
5. execveat(fd=?, absolute_path)
Here are the possible conditions we will find ourselves in:
1. absolute path (easy)
2. path relative to the current working directory
3. path relative to a directory specified with a file descriptor
Condition 2 implies that we need to know when the process changes its cwd; there are three different ways to do it:
• chdir(absolute_path)
• chdir(relative_path)
• fchdir(file descriptor of a directory)
Condition 3 has the same problem: we need to be able to easily translate fd -> path
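To make the three conditions concrete, here is a rough Python sketch of the path resolution. It is not the osquery code; `resolve_exec_path`, `cwd` and `fd_table` are hypothetical names, and it assumes we already track the cwd and the fd -> path map for the process.

```python
import posixpath

AT_FDCWD = -100  # value of AT_FDCWD on Linux


def resolve_exec_path(path, cwd, fd_table, dirfd=AT_FDCWD):
    """Return the absolute path of the executed binary.

    cwd      -- tracked working directory of the process (hypothetical input)
    fd_table -- tracked {fd: path} map of the process (hypothetical input)
    Returns None when dirfd is not known to the tracker.
    """
    if posixpath.isabs(path):
        # Condition 1: absolute path, dirfd is ignored by the kernel
        return posixpath.normpath(path)
    if dirfd == AT_FDCWD:
        # Condition 2: relative to the current working directory
        base = cwd
    else:
        # Condition 3: relative to a directory given as a file descriptor
        base = fd_table.get(dirfd)
        if base is None:
            return None
    # normpath is purely lexical; symlinks are not resolved in this sketch
    return posixpath.normpath(posixpath.join(base, path))
```

Plain execve() maps to the default `dirfd=AT_FDCWD`, so both syscalls can go through the same function.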
So the way to solve this was to have osquery start the event collection, then take a procfs snapshot once and keep it up to date with the incoming events
This minimizes the need to access procfs, which is not synchronized with the event stream we receive from BPF (e.g. we could get an event for a process that is no longer visible in procfs because it has terminated)
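As an illustration of the snapshot step, this is roughly what reading one process from procfs could look like in Python. `snapshot_process` and the `proc_root` parameter are made up for the example (the latter only so it can be pointed at a test directory); the real implementation lives in the osquery C++ sources and may differ.

```python
import os


def snapshot_process(pid, proc_root="/proc"):
    """Read the cwd and open file descriptors of one process.

    Returns {"cwd": str, "fds": {fd: target}} or None if the process
    disappeared (or is not readable) while we were looking at it.
    """
    base = os.path.join(proc_root, str(pid))
    try:
        cwd = os.readlink(os.path.join(base, "cwd"))
        fds = {}
        for name in os.listdir(os.path.join(base, "fd")):
            try:
                fds[int(name)] = os.readlink(os.path.join(base, "fd", name))
            except OSError:
                # The fd was closed between listdir() and readlink()
                continue
        return {"cwd": cwd, "fds": fds}
    except OSError:
        # The process terminated, which is exactly the race described above
        return None
```

The `None` return is the interesting part: procfs can lose a process at any point, which is why the snapshot is taken once and then maintained from events.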
there are, however, many ways to work with file descriptors: they can be created, destroyed, duplicated, and so on. We have to track all of it if we want to reliably translate file descriptors to file/folder paths
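A minimal sketch of that bookkeeping, assuming one handler per traced syscall. The class and method names are hypothetical (they are not the osquery tracker API), and details such as openat() with a dirfd or close-on-exec are left out.

```python
import posixpath


class ProcessState:
    def __init__(self, cwd, fds=None):
        self.cwd = cwd
        self.fds = dict(fds or {})  # fd -> absolute path


class StateTracker:
    def __init__(self):
        self.procs = {}  # pid -> ProcessState

    def seed(self, pid, cwd, fds=None):
        # Initial state, e.g. from the one-time procfs snapshot
        self.procs[pid] = ProcessState(cwd, fds)

    def on_fork(self, parent, child):
        p = self.procs.get(parent)
        if p is not None:
            # The child starts with a copy of the parent's cwd and fd table
            self.procs[child] = ProcessState(p.cwd, p.fds)

    def on_open(self, pid, path, ret_fd):
        p = self.procs.get(pid)
        if p is not None and ret_fd >= 0:
            # join() keeps absolute paths as-is and prefixes relative ones
            p.fds[ret_fd] = posixpath.normpath(posixpath.join(p.cwd, path))

    def on_dup(self, pid, old_fd, new_fd):
        p = self.procs.get(pid)
        if p is not None and old_fd in p.fds:
            p.fds[new_fd] = p.fds[old_fd]

    def on_close(self, pid, fd):
        p = self.procs.get(pid)
        if p is not None:
            p.fds.pop(fd, None)

    def on_chdir(self, pid, path):
        p = self.procs.get(pid)
        if p is not None:
            p.cwd = posixpath.normpath(posixpath.join(p.cwd, path))

    def on_fchdir(self, pid, fd):
        p = self.procs.get(pid)
        if p is not None and fd in p.fds:
            p.cwd = p.fds[fd]

    def on_exit(self, pid):
        self.procs.pop(pid, None)
```

Events for pids the tracker has never seen are simply ignored here; a real tracker would need a policy for that case.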
For socket events the situation is similar; system calls such as connect() and listen() all take file descriptors
being able to track them means we can sometimes tell whether bind() was called on a socket before listen(), so we can show more information in the table
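Here is a small sketch of the bind()-then-listen() case. `SocketTracker` and the row field names are invented for the example and do not mirror the actual osquery table schema; the idea is only that listen() by itself carries a file descriptor and nothing else.

```python
class SocketTracker:
    def __init__(self):
        self.sockets = {}  # (pid, fd) -> tracked socket info

    def on_socket(self, pid, ret_fd, family, sock_type):
        if ret_fd >= 0:
            self.sockets[(pid, ret_fd)] = {
                "family": family,
                "type": sock_type,
                "local": None,  # filled in by a later bind()
            }

    def on_bind(self, pid, fd, address, port):
        sock = self.sockets.get((pid, fd))
        if sock is not None:
            sock["local"] = (address, port)

    def on_listen(self, pid, fd):
        # Start from what the syscall itself gives us: just pid and fd
        row = {"pid": pid, "fd": fd, "syscall": "listen",
               "local_address": None, "local_port": None}
        sock = self.sockets.get((pid, fd))
        if sock is not None and sock["local"] is not None:
            # Enrich the row with the address seen in the earlier bind()
            row["local_address"], row["local_port"] = sock["local"]
        return row

    def on_close(self, pid, fd):
        self.sockets.pop((pid, fd), None)
```

If the bind() happened before event collection started, or was never called, the address columns just stay empty, hence "sometimes".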
the general theme here is that BPF is a pure system call tracer in this case, so we can only see what happens to be passed (or returned) by the syscall we are monitoring
compared to Audit, where each event has many decorators that provide everything we need for free (like fd -> path mapping with AUDIT_PATH records)
we really don't have much in the event stream, only what the applications are passing to the functions
Let me know if this answers your questions! I'm happy to keep chatting about BPF 🙂