# linux
a
I'm looking at adding some containerd tables. Unfortunately the API is via grpc, so we'd require grpc as a dependency and either pregenerate the C++ with protoc or run protoc at build time. Does anyone have opinions?
z
I think it's reasonable to use the grpc API (it's connecting only over a unix domain socket, not a true remote API). Between pregenerate and build time, I don't have much preference... Whatever is cleaner/easier sgtm.
cc @alessandrogario @Stefano Bonicatti
a
I seem to recall what the hiccup was: I think the grpc library is rather heavy and it comes with several additional libraries
They would all need to be converted to CMake
Going by memory: at least absl, cares, protobuf and probably a couple of additional ones
They have a 22k lines CMakeLists.txt file
We have an impl of that in our https://github.com/trailofbits/osquery fork, though it's from 2020 and hasn't been updated since
I think a PR was never made because the amount of code and dependencies it added was quite big
z
Are you opposed to adding it? Or just warning that it's nontrivial?
a
I think it's going to generate some maintenance work if we want to do it properly
compared to C++, golang could implement this more easily
I think the major fault here is the grpc C++ SDK which is quite messy and even lacks features (async was only half-implemented)
s
I suspect that the go side still pulls in a lot of libs, but the build system hides that mess from you.
a
Yeah, that's almost certainly true. It's probably still pulling in protoc
z
We could definitely build these tables as a Go extension, but I suspect folks might be disappointed it doesn't go into core osquery?
Or, how crazy would it be to try to build a library in Go and then static link and call that from osquery?
a
I'm not sure how much Go would like that, it needs to run its gc
z
That would open the floodgates to tons of other tables being built in Go, which would certainly make building osquery core tables much more accessible.
a
I think it depends on how much cgo is present in there
z
Cursory searching indicates Go can be called from C++ but I have no idea how that works with GC.
a
if there's cgo involved, then we would have to convert everything anyway for portable builds
s
That would be new ground. It feels like an uncomfortable compromise. We’d do it in core, but the dependencies are a hassle. It could be an extension, but there’s value in core.
a
meaning: cgo would have to go through the osquery-toolchain, otherwise the build won't work on all systems
If it is pure golang, then I think it should be quite a lot easier
s
I’m not sure that’s a good idea (static linking Go libraries to osquery, not even sure it’s possible). If anything I recall that Go doesn’t like to have binaries be stripped
a
ah interesting, is that for the reflection?
z
Everyone else in this thread is more qualified than me to comment on what the best way to handle the build system issues is, but I do think getting this into core is important for osquery's narrative around containers.
a
What would be your opinion on shipping core extensions?
we could even think about having dedicated startup logic that does not require using --extension or extensions autoload
z
As long as it works out of the box in osquery I think it works. But extensions IME have been really flaky in a thousand different ways.
s
This might not be the right thread, but how have you found them flakey?
z
registration/deregistration issues (often "duplicate blah blah blah" errors), permissions issues, processes not starting/exiting together. I've seen all sorts of issues across Launcher, Orbit, and other custom extensions.
s
In the last year, I’ve probably hammered out a bunch of the launcher ones. It’s not perfect, but I found an old, deep race condition.
If we start from the premise that we want containerd tables in osquery, and the only API they provide is grpc, then it feels like an unpleasant choice between:
1. pulling in a lot of dependencies. And TBH this is hard with our cmake setup
2. Cross language stuff in core.
3. extensions
I honestly have no idea what I think. (2) and (3) both unlock some really interesting possibilities. But I think we’d need to think about how to make them feel right.
a
I can fairly easily parse the JSON state files that runc produces but it's a somewhat disjoint set of information from what I can find from the containerd api
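A rough sketch of that state-file parsing, in Go for brevity. The JSON keys (id, init_process_pid, created) are assumptions about runc's state format and should be checked against a real state file for the runc version in use. The sample document is made up for illustration, not taken from a real host.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// runcState holds a few fields from a runc state file.
// The JSON keys are assumptions; verify them against a real file.
type runcState struct {
	ID      string    `json:"id"`
	InitPID int       `json:"init_process_pid"`
	Created time.Time `json:"created"`
}

// parseRuncState decodes the fields above and ignores everything else,
// so unknown or version-specific keys do not cause errors.
func parseRuncState(data []byte) (runcState, error) {
	var s runcState
	err := json.Unmarshal(data, &s)
	return s, err
}

// sampleState is made-up data for illustration only.
const sampleState = `{"id":"demo","init_process_pid":4242,"created":"2024-01-02T03:04:05Z","config":{}}`

func main() {
	s, err := parseRuncState([]byte(sampleState))
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(s.ID, s.InitPID, s.Created.Year())
}
```

This only covers what runc writes to disk, which is why it gives a different set of columns from what the containerd API exposes.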
a
It has always seemed like apache thrift is the major culprit, but I am not sure
s
Thrift always feels janky. But launcher had internal race conditions
a
And I always wondered how much effort it would be to essentially reimplement what we have in the extensions folder, and replace Thrift with something else
s
A large ex-employer had a lot of issues with thrift at scale.
we could use grpc. It’s both awful, and would solve the dependency issue! 😆
z
grpc is by far the standard these days
a
I don't think it would have to be super fancy, sending JSON through a pipe would not be too hard
and we would no longer have issues with dealing with libraries which only support one language or the other
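To make the JSON-through-a-pipe idea concrete, here is a minimal sketch of a request/response loop with one JSON value per line. The message shapes (a table name in, rows or an error out) and all the names are invented for illustration; this is not the existing osquery extension protocol. In a real setup the reader and writer would be the two ends of a pipe to the extension process, for example its stdin and stdout.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"strings"
)

// request and response are hypothetical message shapes.
type request struct {
	Table string `json:"table"`
}

type response struct {
	Rows  []map[string]string `json:"rows"`
	Error string              `json:"error,omitempty"`
}

type generator func() []map[string]string

// serve reads JSON requests from r until EOF and writes one JSON
// response per line to w.
func serve(r io.Reader, w io.Writer, tables map[string]generator) error {
	dec := json.NewDecoder(r)
	enc := json.NewEncoder(w) // Encode appends a newline after each value
	for {
		var req request
		if err := dec.Decode(&req); err == io.EOF {
			return nil
		} else if err != nil {
			return err
		}
		var resp response
		if gen, ok := tables[req.Table]; ok {
			resp.Rows = gen()
		} else {
			resp.Error = "unknown table: " + req.Table
		}
		if err := enc.Encode(resp); err != nil {
			return err
		}
	}
}

// roundTrip runs serve over in-memory buffers that stand in for a pipe.
func roundTrip(input string) (string, error) {
	tables := map[string]generator{
		"demo_containers": func() []map[string]string {
			return []map[string]string{{"id": "abc"}}
		},
	}
	var out bytes.Buffer
	err := serve(strings.NewReader(input), &out, tables)
	return out.String(), err
}

func main() {
	out, err := roundTrip(`{"table":"demo_containers"}` + "\n")
	fmt.Print(out)
	if err != nil {
		fmt.Println("error:", err)
	}
}
```

Either side could be written in any language with a JSON library, which is the appeal. What the sketch leaves out is schema and versioning, which would have to be handled by convention.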
z
Yeah, we don't necessarily need any of the features of grpc or thrift
s
Actually, I’m not sure. grpc was new and exciting a bit ago. But I feel like I’m seeing people backing off from it.
a
we had projects in C++ using it internally, and we are moving away from it due to how buggy the C++ SDK is
and how bad the build system they use is
But on what seph mentioned: I agree that being able to ship core features in other languages would be nice
s
I think the benefit that things like grpc, thrift, protobufs, etc provide is a structured wire protocol. Clear fields and versioning. That’s hard to do with json
a
That is true, though I don't remember when the protocol was last changed
z
Back to the topic at hand... Are folks opposed to bringing containerd tables into core with grpc? We can easily build osquery extensions and bundle them with orbit, but I do feel that it will be a miss for the osquery brand.
a
it's at least 5 new big dependencies that we have to rewrite in CMake; I generally tend to avoid adding new dependencies but I can also see how containerd could be interesting
s
I feel like (1) is okay, but work for people who aren’t me. If y’all are okay with the dependency footprint, I’m okay with it. I think (2) is really interesting and powerful. But I’m not sure how to approach it or what the tradeoffs are. It certainly opens doors.
a
I've dealt with the build system before and can probably figure out (1), though it's a bit more work than I originally bargained for
I've ported @Stefano Bonicatti’s patch forward to osquery HEAD. It's definitely not in a mergeable state, but I should be able to clean it up and submit it if all the authors are fine with that (I'd want confirmation so that we don't have copyright uncertainties): https://github.com/artemist-work/osquery/tree/containerd-events
a
sorry for the late reply; i don't think there's a copyright/license issue (cc @Mike Myers). co-authored-by can be added to include all the original contributors
a
I included that, though there is still the issue of the proto files. I'll work on pulling those from a submodule since I think Apache-2 is incompatible with Apache-2 + GPL