#ebpf

zwass

07/25/2022, 6:29 PM
What's the status of container support for the bpf evented tables?

alessandrogario

07/26/2022, 11:26 AM
I'm slowly improving it, but I currently don't have as much time as I would like to work on it
11:27 AM
but it's going to be an improvement in both CPU/memory usage and also fewer limitations on how many parameters we can get
11:27 AM
more characters in the paths (like, working dir/binary path)
11:28 AM
and then container names
11:28 AM
I have a PoC but it's not yet ready for a PR
11:32 AM
It is based on this library: https://github.com/trailofbits/btfparse
11:32 AM
it lets us import kernel types dynamically from the /sys pseudo-dir
11:33 AM
so it will always be up to date, and requires no dependencies on the system (like kernel header packages)
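A minimal sketch of what "importing kernel types from /sys" starts with: the kernel exposes its own BTF blob at /sys/kernel/btf/vmlinux, and the blob begins with a fixed 24-byte header. This is not btfparse's API; `parse_btf_header` and `BtfHeader` are hypothetical names used only for illustration, and the sketch assumes a little-endian host.

```python
import struct
from typing import NamedTuple

BTF_MAGIC = 0xEB9F

# Layout of the kernel's struct btf_header, assuming a little-endian host:
# u16 magic, u8 version, u8 flags, u32 hdr_len,
# u32 type_off, u32 type_len, u32 str_off, u32 str_len
_HEADER = struct.Struct("<HBBIIIII")


class BtfHeader(NamedTuple):
    magic: int
    version: int
    flags: int
    hdr_len: int
    type_off: int   # offset of the type section, relative to the end of the header
    type_len: int
    str_off: int    # offset of the string section, relative to the end of the header
    str_len: int


def parse_btf_header(blob: bytes) -> BtfHeader:
    """Decode the fixed-size header at the start of a BTF blob (hypothetical helper)."""
    if len(blob) < _HEADER.size:
        raise ValueError("blob is too short to hold a BTF header")
    header = BtfHeader(*_HEADER.unpack_from(blob))
    if header.magic != BTF_MAGIC:
        raise ValueError(f"unexpected BTF magic: {header.magic:#x}")
    return header
```

On a kernel built with BTF support, passing the first 24 bytes of /sys/kernel/btf/vmlinux to `parse_btf_header` should give the offsets of the type and string sections, which is where a real parser would go next.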
3:12 AM
This updates the library and the execsnoop example. I am not entirely sure it is working, but I can see it is capturing the cgroup names:
timestamp: 9119873753834 thread_id: 13561 process_id: 13561 uid: 0 gid: 0 cgroup_id: 22467 exit_code: 0 probe_error: 0 duration: 224574 cgroup_name: system.slice
  execve(filename: /usr/lib/NetworkManager/dispatcher.d/20-chrony-onoffline, argv: { /usr/lib/NetworkManager/dispatcher.d/20-chrony-onoffline, podman0, down })
3:14 AM
This is using the BTF to LLVM bridge class that I wrote as a test, which is not pretty
3:14 AM
and the btfparse library seeks around a lot while parsing and there is no cache in front of it so it's slow

zwass

08/03/2022, 4:16 PM
cc @Artemis Tosini

Artemis Tosini

08/03/2022, 4:27 PM
Thanks! I need to do some setup but I'll try to get this running today to test

alessandrogario

08/03/2022, 5:12 PM
I'm building it this way:
1. I've placed the osquery-toolchain in /opt/osquery-toolchain
2. Set the TOOLCHAIN_PATH env var to point to it
3. Configured with
cmake -S . -B build -DCMAKE_TOOLCHAIN_FILE=cmake/toolchain.cmake -G Ninja -DCMAKE_BUILD_TYPE=Debug -DEBPFPUB_BUILD_EXAMPLES=true
4. To build and test, I am using:
cmake --build build && sudo ./build/examples/execsnoop/execsnoop

Artemis Tosini

08/03/2022, 6:04 PM
I'm getting cgroup names, but for some reason it's not getting all processes, so I only get user.slice stuff
6:07 PM
Not sure why; AArch64 Linux still only has execve and execveat

alessandrogario

08/03/2022, 7:40 PM
ah I see, I guess we'll have to debug it! thanks for trying it out
8:36 PM
So it is working, but only on Ubuntu
8:40 PM
Here's a recording on my Ubuntu system
8:41 PM
This is not ideal, I will try to debug this further
9:45 PM
What distro and container tech was used to perform the test? I wonder if I can easily replicate it locally
9:46 PM
Were other breakages noticed?

Artemis Tosini

08/04/2022, 2:21 PM
I was using Ubuntu Server AArch64 and docker, I didn't see any other breakages

alessandrogario

08/04/2022, 5:03 PM
I should have more time for this after work, so I'll try it again on different distros; I'll focus on Intel first then look at AArch64
5:26 PM
Sorry for the delay! I wonder if I am reading the wrong field here
5:27 PM
going to try with
cgroups.subsys[0].cgroup.kn.parent.name
instead of
cgroups.subsys[0].cgroup.kn.name
5:28 PM
oooh it seems to have fixed it here on my Fedora VM!
timestamp: 375109372265 thread_id: 2705 process_id: 2705 uid: 0 gid: 0 cgroup_id: 8574 exit_code: 0 probe_error: 0 duration: 201273 cgroup_name: libpod-1aa46f32bbc6e946c757359f
  execve(filename: /usr/bin/date, argv: { date })
5:29 PM
I'm using something weird here, it's not docker but a replacement that is auto-suggested by Fedora
5:29 PM
Seems to be called "podman-docker"
5:30 PM
I'll commit the change and try it again on Ubuntu with docker CE
5:35 PM
I pushed the change!
5:46 PM
I'm in a meeting but as soon as I'm done, I'll try this out on Ubuntu x86

Artemis Tosini

08/05/2022, 6:46 PM
Now I'm not getting any cgroup name
timestamp: 433808650127 thread_id: 1994 process_id: 1994 uid: 0 gid: 0 cgroup_id: 4815 exit_code: 18446744073709551614 probe_error: 0 duration: 9875 cgroup_name: 
  execve(filename: /usr/local/sbin/gzip, argv: { gzip, -d })

alessandrogario

08/05/2022, 6:47 PM
it seems like it's in two different places depending on the system

Artemis Tosini

08/05/2022, 6:47 PM
Uggh

alessandrogario

08/05/2022, 6:47 PM
I've got to reinstall my Ubuntu VM because for some reason both of the ones I have just give me back 404 for most of the repositories
6:48 PM
but I was able to try the fix on another Ubuntu 21.10
6:48 PM
with the fix, Fedora 36 works but Ubuntu 21.10 breaks
6:48 PM
without the fix, it's the opposite

Artemis Tosini

08/05/2022, 6:51 PM
I guess this isn't part of the "don't break userspace" rule

alessandrogario

08/05/2022, 6:53 PM
Maybe it's my code that sucks and has something broken or a wrong assumption
6:53 PM
I'm reinstalling my VM, maybe 21.10 was not supported and they both died? how weird

Artemis Tosini

08/05/2022, 7:11 PM
journald has a way of doing this and it might be possible to do what they do, though I have another bug to work on today
7:16 PM
Okay, here's what they do: https://github.com/systemd/systemd/blob/main/src/basic/cgroup-util.c#L700 It seems like they're reading it out of /proc which at least works

alessandrogario

08/05/2022, 8:45 PM
super interesting
8:46 PM
this helped a lot!
8:46 PM
we could get the whole tree
8:47 PM
on my Fedora VM:
cat /proc/2002/cgroup
0::/machine.slice/libpod-e8b7ef8bd7f6f63ade2c862891ccf424ded27d35c20e536990285d79af7309a4.scope/container
8:47 PM
on Ubuntu:
cat 10401/cgroup 
0::/system.slice/docker-6dfb8c85100710cf6ac7ac15f311cd37fea4e97b7e05251d120b855aa39ae84b.scope
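The two /proc outputs above follow the `hierarchy-ID:controller-list:path` layout described in cgroups(7), and the container scope sits at a different depth in each path. A minimal sketch of parsing such a line and finding the container component wherever it appears; `parse_cgroup_line`, `find_container`, and the `docker-`/`libpod-` pattern are assumptions based only on the two samples in this thread, not code from ebpfpub or systemd.

```python
import re
from typing import NamedTuple, Optional, Tuple


class CgroupEntry(NamedTuple):
    hierarchy_id: int
    controllers: Tuple[str, ...]
    path: str


def parse_cgroup_line(line: str) -> CgroupEntry:
    """Split one /proc/<pid>/cgroup line into its three colon-separated fields."""
    # maxsplit=2 keeps any ':' inside the path itself intact.
    hierarchy_id, controllers, path = line.rstrip("\n").split(":", 2)
    names = tuple(c for c in controllers.split(",") if c)
    return CgroupEntry(int(hierarchy_id), names, path)


# Assumption: container scopes are named "<runtime>-<hex id>.scope",
# as in the Fedora (libpod) and Ubuntu (docker) samples above.
_SCOPE = re.compile(r"^(docker|libpod)-([0-9a-f]+)\.scope$")


def find_container(path: str) -> Optional[Tuple[str, str]]:
    """Return (runtime, container id) from the first matching path component."""
    for component in path.split("/"):
        match = _SCOPE.match(component)
        if match:
            return match.group(1), match.group(2)
    return None
```

Scanning every component, rather than only the last one or its parent, is what makes the same code work for both `/machine.slice/libpod-….scope/container` and `/system.slice/docker-….scope`.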

Artemis Tosini

08/05/2022, 8:47 PM
Yeah, that also worked on my Ubuntu ARM VM

alessandrogario

08/05/2022, 8:47 PM
so it probably has to do with the backend used
8:47 PM
I could just take the whole tree instead of just one entry

Artemis Tosini

08/05/2022, 8:49 PM
I'm guessing Fedora gave you podman

alessandrogario

08/05/2022, 9:01 PM
Yes, it seems like it's a docker CLI emulator
9:32 PM
I (force) pushed a new commit
9:32 PM
It is now capturing a slice of both the current and parent kernfs nodes
9:32 PM
So we get something like this:
timestamp: 2834861341276 thread_id: 3305 process_id: 3305 uid: 0 gid: 0 cgroup_id: 8997 exit_code: 0 probe_error: 0 duration: 167751
cgroup_name: libpod-46e9fc7dd73d128e58f34561, container

  execve(filename: /usr/bin/date, argv: { date })
9:35 PM
On Ubuntu + Docker, it will look like this:
timestamp: 56308112557 thread_id: 2617 process_id: 2617 uid: 0 gid: 0 cgroup_id: 8271 exit_code: 0 probe_error: 0 duration: 182578
cgroup_name: system.slice, docker-456ab67280a837f4f36514c8

  execve(filename: /usr/bin/date, argv: { date })
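A rough user-space model of the "parent, current" pairs in the two outputs above. It assumes each name is captured into a 32-byte slot (31 characters plus a terminator); that size is only inferred from the 31-character names in the output and the 64 bytes mentioned in this thread, and `capture_names` and `pick_container_name` are hypothetical helpers, not part of ebpfpub.

```python
from typing import Iterable, Optional, Tuple

# Assumption: one 32-byte buffer per kernfs node name, including the terminator.
SLOT_SIZE = 32


def capture_names(cgroup_path: str, slot_size: int = SLOT_SIZE) -> Tuple[str, str]:
    """Return (parent name, current name), each truncated to fit one slot."""
    components = [c for c in cgroup_path.split("/") if c]
    current = components[-1] if components else ""
    parent = components[-2] if len(components) > 1 else ""
    limit = slot_size - 1  # leave room for the terminator
    return parent[:limit], current[:limit]


def pick_container_name(names: Iterable[str]) -> Optional[str]:
    """Pick whichever captured name looks like a container scope, if any."""
    for name in names:
        # Assumption: only the docker- and libpod- prefixes seen in this thread.
        if name.startswith(("docker-", "libpod-")):
            return name
    return None
```

With podman the container scope lands in the parent slot and with Docker in the current slot, so a consumer would need to check both, as `pick_container_name` does.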
9:37 PM
We could switch one of the probes (maybe the execve/execveat ones are good?) to enable cgroup names and update how the system state tracker works to propagate it
9:37 PM
so that we don't add the additional 64 bytes to all possible events