# core
hey team, something that I've run into personally is the desire to export
• osqueryd details (version, uptime, etc.)
• osquery results (from packs and whatnot)
to a prometheus endpoint. I see an outstanding feature request for the former here: https://github.com/osquery/osquery/issues/5541 I also see this as a possible alternative to osquery's remote enrollment + logging API. Was wondering, if I had a desire to do this:
1. would this be useful for others,
2. what kind of data should be exported,
3. and is there a starting-off point for how I can code this support into osquery? (i.e. adding another library/dependency, making that a flag, adding tests, language preference, etc. I'm going over CONTRIBUTING.md in the meantime)
Hi! 👋 This seems like it could be a pretty interesting thing to explore. But, there are a couple of pretty big hurdles to figure out along the way. I’m excited to see what you come up with… I’ve seen a handful of people interested in some kind of osquery/prometheus link. So you’re not alone…
Adding libraries is certainly possible, but there are a lot of moving parts. https://github.com/osquery/osquery/pull/7160 is a recent PR. I don’t remember if we have any docs.
How it works seems like a big open question. As I understand it, prometheus operates on a pull model: endpoints track metrics, generally simple numbers, and prometheus scrapes them. In contrast, osquery operates on a push model, and what it pushes is pretty arbitrary JSON. This highlights two mismatches.

First, push vs pull. Osquery generally follows a push model, vs prometheus’s pull model. We are generally going to be very hesitant to turn osquery into a listening daemon; that creates a host of security problems.

Second, the data typing. I think of prometheus as something like a straight metrics system, not a log processor. But osquery generates something closer to JSON logs. While I can imagine writing queries that produce output that more closely matches prometheus, I think it’s a fairly small niche.

Both of these issues may be better solved by using an intermediary to handle the log ingest, and then export to prometheus. Not totally sure though; if there’s a common export endpoint, it might be pretty small work to create a log plugin that can push data to it. There’s clearly interesting research work here…
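To make the data-typing mismatch concrete, here’s a rough sketch (mine, not anything in osquery) of what flattening one osquery result row into prometheus’s text exposition format could look like. The helper names are made up; the point is that only numeric columns survive the trip, and string columns only work as low-cardinality labels:

```python
def _is_number(v):
    """True if the value parses as a float (osquery emits column values as strings)."""
    try:
        float(v)
        return True
    except (TypeError, ValueError):
        return False


def row_to_metrics(query_name, row):
    """Convert one osquery result row (a dict of column -> string) into
    prometheus exposition-format lines. Hypothetical helper, for illustration:
    string columns become labels, numeric columns become metric samples."""
    labels = {k: v for k, v in row.items() if not _is_number(v)}
    label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
    suffix = f"{{{label_str}}}" if label_str else ""
    lines = []
    for col, val in row.items():
        if _is_number(val):
            lines.append(f"osquery_{query_name}_{col}{suffix} {float(val)}")
    return lines
```

So a row like `{"name": "osqueryd", "resident_size": "61440"}` from a hypothetical `processes` query becomes `osquery_processes_resident_size{name="osqueryd"} 61440.0` — and anything non-numeric is silently demoted to a label, which is exactly the “metrics system, not log processor” gap.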
Thanks for the context! I've definitely got a lot to learn and read up on for sure. If one were to pursue a proof of concept to get the hang of working with the osquery codebase, what would be the "most minimal but representative" extent of work? re:
using an intermediary to handle the log ingest, and then export to prometheus
I guess something of that sort exists in https://github.com/zwopir/osquery_exporter -- but unfortunately it is very de-coupled from osquery and osqueryd itself. Essentially it just runs queries and pushes the results to a port which prometheus is configured to scrape. We've used that at work for a time by just publishing values and generating alerts off thresholds relating to them. While useful, the package does seem a bit defunct, and having osquery accomplish this without the need for another package may be preferred.

I also do understand where the osquery team is coming from regarding the push (osquery) vs pull (prometheus) model, hence the preference for an intermediary. Curious if osquery can publish results in a prometheus-friendly manner (file? pipe? stream?) and then an intermediary can serve as the middle layer presenting that info on a port
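For the intermediary idea, a minimal sketch of what such a middle layer might do with osquery's filesystem logger output (JSON, one result per line): fold the stream into the latest numeric value per (query, column), which an exporter would then serve on each scrape. This only handles snapshot-style results, and the function itself is my assumption of a design, not an existing tool:

```python
import json


def latest_snapshot_values(log_lines):
    """Fold osquery result-log lines (JSON per line, as the filesystem
    logger writes them) into the most recent numeric value per
    (query name, column). Sketch only: handles "snapshot" results and
    drops differential results and non-numeric columns."""
    latest = {}
    for line in log_lines:
        rec = json.loads(line)
        name = rec.get("name", "unknown")
        for row in rec.get("snapshot", []):
            for col, val in row.items():
                try:
                    latest[(name, col)] = float(val)
                except (TypeError, ValueError):
                    pass  # strings don't fit the metrics model; skip
    return latest
```

On each prometheus scrape the exporter would just render whatever is in `latest`, so osquery stays push-only (it writes a log file) while prometheus stays pull-only.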
Glancing at that really quickly… It’s invoking
as to run a single query, parse the answer, and stash it in an exporter.
Which is, IMO, a reasonable beginning, but would be better done as either a log destination, a TLS server, or an osquery extension.
Curious if osquery can publish results in a prometheus-friendly manner
I think that’s a viable approach, but it is a bit of a research project. What intermediary? What format? Etc… I could imagine that as a logger plugin. I could also imagine that as a TLS server.
I might suggest a good way to start is just to get a feel for osquery, and the kind of data people usually collect.
I’m not sure what the right way to get into the code is. Probably depends a bit on what you’re familiar with. I write a lot of extension code in go. And I do some table work in the core c++
also, re: the intermediary publisher, it's worth asking: if osquery itself is sending prometheus-friendly output somewhere just for it to be published by another program, what is the virtue of (or resistance to) not just sending that to a network socket directly?
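For what it's worth, the "serve it on a socket" half really is tiny with just the stdlib — something like this sketch, with placeholder metric values and no wiring to osquery at all:

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Placeholder values; a real exporter would populate this from osquery results.
METRICS = {"osquery_uptime_seconds": 12345.0}


class MetricsHandler(BaseHTTPRequestHandler):
    """Minimal /metrics endpoint speaking prometheus's text exposition format."""

    def do_GET(self):
        if self.path != "/metrics":
            self.send_response(404)
            self.end_headers()
            return
        body = "".join(f"{k} {v}\n" for k, v in sorted(METRICS.items())).encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the sketch quiet


def serve(port=0):
    """Start the endpoint on a background thread; port=0 picks a free port."""
    srv = HTTPServer(("127.0.0.1", port), MetricsHandler)
    threading.Thread(target=srv.serve_forever, daemon=True).start()
    return srv
```

Which is also exactly why the maintainers above are hesitant: the easy part is the socket, the hard part is that osquery would then be a listening daemon on every host.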
There was a bit of a conversation about this at osquery office hours a bit ago. The minutes from that are https://hackmd.io/OpICAxxJTkGNAH7rzX5Wjw and you could review the youtube recording
this is excellent, thanks! will take a moment to review
osquery has a variety of logging plugins that send to a network socket — kinesis for example
Writing another one would be reasonable enough.
I think a big piece is just working through the theoretical differences outlined earlier