#ebpf

hubabuba

03/07/2021, 10:49 PM
Hi Team! Could anyone please explain the inner workings of the eBPF implementation inside osquery/events/linux/bpf/? As far as I can see, there is a system state tracker, which is also used to track several file operations (and connect them to the corresponding PID). Are these file operations necessary for process/socket tracing, or are they just used to enrich state/tables?

alessandrogario

03/09/2021, 11:13 AM
Hey! I can provide some information on this. Whenever a binary is executed, we may have to deal with some unknowns:
1. execve(absolute_path, ...)
2. execve(relative_path, ...)
3. execveat(AT_FDCWD, relative_path, ...)
4. execveat(fd=somedir, relative_path, ...)
5. execveat(fd=?, absolute_path)
11:14 AM
Here are the possible conditions we can find ourselves in:
1. absolute path (easy)
2. path relative to the current working directory
3. path relative to a directory specified with a file descriptor
11:15 AM
Condition 2 means we need to know when the process changes its cwd; there are three different ways to do it:
• chdir(absolute_path)
• chdir(relative_path)
• fchdir(file descriptor of a directory)
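As a rough illustration of those three cases (a sketch only, not osquery's actual code; the `CwdTracker` class and its method names are made up for this example), a per-process cwd table could be kept up to date like this:

```python
import posixpath


class CwdTracker:
    """Hypothetical per-process cwd table driven by chdir/fchdir events."""

    def __init__(self):
        self.cwd = {}       # pid -> absolute current working directory
        self.fd_paths = {}  # (pid, fd) -> absolute path of an open directory

    def on_chdir(self, pid, path):
        # chdir(absolute_path) replaces the cwd outright;
        # chdir(relative_path) is resolved against the previous cwd.
        if not posixpath.isabs(path):
            base = self.cwd.get(pid)
            if base is None:
                return  # previous cwd unknown, so the new one is unknown too
            path = posixpath.join(base, path)
        # normpath is purely lexical: it ignores symlinks, which a real
        # tracker would have to think about.
        self.cwd[pid] = posixpath.normpath(path)

    def on_fchdir(self, pid, fd):
        # fchdir(fd) only works if the fd -> path mapping is already known.
        target = self.fd_paths.get((pid, fd))
        if target is not None:
            self.cwd[pid] = target
```

Note that the fchdir case already depends on knowing which directory a file descriptor points to.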
11:16 AM
Condition 3 has the same problem: we need to be able to easily translate fd -> path
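To make the three conditions concrete, here is a minimal sketch of the resolution step, assuming the tracked cwd and the fd -> path table are already available. The function name and its parameters are hypothetical; only `AT_FDCWD` (which is -100 on Linux) comes from the real API. A plain execve() can be treated as the `AT_FDCWD` case.

```python
import posixpath

AT_FDCWD = -100  # Linux value of AT_FDCWD


def resolve_exec_path(cwd, fd_paths, dirfd, path):
    """Return the absolute path of the executed binary, or None if unknown.

    cwd      -- tracked working directory of the calling process (or None)
    fd_paths -- tracked {fd: absolute directory path} for that process
    dirfd    -- first argument of execveat(); use AT_FDCWD for execve()
    """
    # Condition 1: absolute path, dirfd is ignored.
    if posixpath.isabs(path):
        return posixpath.normpath(path)
    # Condition 2: relative to the current working directory.
    if dirfd == AT_FDCWD:
        base = cwd
    # Condition 3: relative to a directory given as a file descriptor.
    else:
        base = fd_paths.get(dirfd)
    if base is None:
        return None  # state is missing, we cannot tell
    return posixpath.normpath(posixpath.join(base, path))
```

For example, `resolve_exec_path("/home/u", {7: "/opt/tools"}, 7, "bin/run")` gives `/opt/tools/bin/run`, while the same call with an untracked descriptor gives `None`.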
11:17 AM
So the way to solve this was to have osquery start the event collection, then take a procfs snapshot once and keep it up to date with the incoming events
11:18 AM
This minimizes the need to access procfs, which is not synchronized with the event stream we receive from BPF (for example, we could get an event for a process that is no longer visible in procfs because it has terminated)
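A one-time procfs snapshot for a single process could look roughly like the sketch below. This is an assumption about the general approach, not the osquery implementation, and `snapshot_process` is a made-up name. It reads the `/proc/<pid>/cwd` and `/proc/<pid>/fd/*` symlinks and returns `None` when the process has already gone away, which is exactly the race described above.

```python
import os


def snapshot_process(pid, proc_root="/proc"):
    """Read cwd and open file descriptors of one process from procfs.

    Returns {"cwd": str, "fds": {fd: target}} or None if the process is
    not (or no longer) readable.
    """
    base = os.path.join(proc_root, str(pid))
    try:
        cwd = os.readlink(os.path.join(base, "cwd"))
        fds = {}
        for name in os.listdir(os.path.join(base, "fd")):
            try:
                fds[int(name)] = os.readlink(os.path.join(base, "fd", name))
            except OSError:
                continue  # descriptor was closed while we were reading
    except OSError:
        return None  # process terminated, or we lack permission
    return {"cwd": cwd, "fds": fds}
```

After the snapshot, the state would only be changed by applying events, so procfs does not have to be consulted again for processes that are already known.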
11:19 AM
There are, however, many ways to work with file descriptors: they can be created, destroyed, duplicated, and so on. We have to track it all if we want to be able to reliably translate file descriptors to file/folder paths
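A minimal sketch of that bookkeeping, assuming one handler per event kind (the `FdTable` class and its method names are hypothetical, and the real tracker has more cases to cover). The fork handler is my addition: a forked child normally starts with a copy of the parent's descriptors.

```python
class FdTable:
    """Hypothetical fd -> path table, updated from syscall events."""

    def __init__(self):
        self.tables = {}  # pid -> {fd: path}

    def on_open(self, pid, fd, path):
        # A successful open()/openat() returns a new descriptor.
        self.tables.setdefault(pid, {})[fd] = path

    def on_dup(self, pid, old_fd, new_fd):
        # dup()/dup2(): the new descriptor refers to the same file.
        fds = self.tables.get(pid, {})
        if old_fd in fds:
            fds[new_fd] = fds[old_fd]

    def on_close(self, pid, fd):
        self.tables.get(pid, {}).pop(fd, None)

    def on_fork(self, parent_pid, child_pid):
        # The child gets its own copy of the parent's descriptor table.
        self.tables[child_pid] = dict(self.tables.get(parent_pid, {}))

    def on_exit(self, pid):
        self.tables.pop(pid, None)

    def lookup(self, pid, fd):
        """Return the tracked path for (pid, fd), or None if unknown."""
        return self.tables.get(pid, {}).get(fd)
```

Missing a single open, dup, or close event leaves the table wrong for that descriptor, which is why everything has to be tracked.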
11:21 AM
For socket events the situation is similar; system calls such as connect() or listen() all accept file descriptors
11:21 AM
Being able to track them means we can sometimes tell whether a socket was bound with bind() before listen() was called, so we can show more information in the table
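A small sketch of the bind()/listen() case, under the assumption that each event carries the pid and the file descriptor. `SocketTracker`, its methods, and the row field names are invented for illustration and do not match the real table schema.

```python
class SocketTracker:
    """Hypothetical tracker that remembers what bind() told us about an fd."""

    def __init__(self):
        self.bound = {}  # (pid, fd) -> (address, port)

    def on_bind(self, pid, fd, address, port):
        self.bound[(pid, fd)] = (address, port)

    def on_close(self, pid, fd):
        self.bound.pop((pid, fd), None)

    def on_listen(self, pid, fd):
        # listen() itself only carries the fd and a backlog; the local
        # address is filled in only if an earlier bind() was observed.
        address, port = self.bound.get((pid, fd), (None, None))
        return {"pid": pid, "fd": fd,
                "local_address": address, "local_port": port}
```

If the bind() was never seen, the row can still be emitted, just with the address columns left empty.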
11:22 AM
The general theme here is that BPF is a pure system call tracer in this case, so we can only see what is passed to (or returned by) the syscall we are monitoring
11:23 AM
Compare that to Audit, where each event has many decorators that provide everything we need for free (like fd -> path mapping with AUDIT_PATH records)
11:23 AM
With BPF we really don't have much in the event stream, only what the applications are passing to the functions
11:23 AM
Let me know if this answers your questions! I'm happy to keep chatting about BPF 🙂