# core
j
We can restrict it on a service level in the OS, but haven’t found a way to limit it properly in osquery.
s
I think I know where the issue is: osquery uses the normal (clean) shutdown workflow to kill a misbehaving worker, which has longer timings. You can lower the reaction time to 6 seconds if you set --alarm_timeout=10. Unfortunately you can’t get lower than that; I’ll make a fix which should land in 5.2.2.
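For anyone skimming the thread, a minimal sketch of where that flag would go; the flagfile path is an assumption and will vary by deployment:

```sh
# /etc/osquery/osquery.flags -- path is an assumption; adjust to your install.
# Lower the shutdown deadline from the default 15s to the 10s minimum,
# which shortens the wait before SIGKILL from ~11s to ~6s (per the thread above).
--alarm_timeout=10
```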
j
Thanks! That’s great to hear.
s
@seph, @J I’ve opened a PR: https://github.com/osquery/osquery/pull/7474
🙌 1
@J just to clarify one thing though: I see that in the logs you’ve given it is logging the attempt to forcefully kill the process via a kill signal (although pretty late), so the machine/VM isn’t dead yet. Was that taken on a machine that was surviving due to swap, or something else? I mean, I could maybe still see that the kill signal itself may be handled too late to interrupt the high memory usage, but otherwise I’m not aware of issues beyond that where the watcher is not killing the worker when it finally sends a SIGKILL instead of a SIGTERM.
I’m asking to understand if there might be a second problem or not
j
@Hans Care to elaborate here? You know this better than I do!
But as far as I know, this test machine did not suffer the same fate as our production machines, in that core services were not killed due to memory exhaustion. It is indeed the issue of the kill signal being handled too late when memory is already exhausted to a point where it will destabilize other services running on the server.
By the time it sends SIGKILL the damage is already done and the production software (not osquery) has already gone down due to memory exhaustion.
s
I see, totally makes sense!
j
We’ve only been able to reproduce it with massive queries like the one in the example, where so much is going on that memory builds too quickly for SIGKILL to save the situation.
Great!
h
Hi, our crashes occurred because we ran out of memory (and have no swap) when running a heavy query (the log4j one presented by @J earlier in this thread). The log entries above are from another server where we are doing the debugging, so we can reconstruct the entire event sequence without messing with production. The watchdog triggers as expected, but with the log4j query we ended up with a race condition where it opened several hundred JAR files and ran out of memory. After 10 seconds of attempting to take down the process nicely, a kill takes place and that works as expected, but those 10 seconds are too much under this condition (and probably others). We’ve added some configuration to prevent this from happening and give the watchdog enough time to react between the “nice stop” and killing the process (see below), but it makes any query using Yara take forever.
# Yara config
--yara_malloc_trim=true
--yara_delay=10000
--alarm_delay=6
But we still end up with a 10-second delay between the nice and the forced kill.
It seems that Yara starts working through the JAR files it finds (several hundred) and does not release the memory when done with the files. With the default 50 msec yara_delay, we ran out of RAM extremely quickly and the servers crashed. We’re replicating this on our staging node, and even with yara_malloc_trim enabled and yara_delay set this high, the memory usage just keeps building up.
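For reference, a minimal flags-file sketch combining the settings mentioned above with the --alarm_timeout suggestion from earlier in the thread; the file path and the exact combination are assumptions for illustration, not the actual config in use:

```sh
# /etc/osquery/osquery.flags -- illustrative sketch only; path and values are assumptions.
--yara_malloc_trim=true   # try to release scan memory after each Yara scan
--yara_delay=10000        # sleep 10s between scanned files to slow memory build-up (makes Yara queries very slow)
--alarm_timeout=10        # lower the worker-kill deadline to the current 10s minimum
```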
s
@Hans: I see; I’m not sure where alarm_delay comes from, but alarm_timeout can be brought down from 15 seconds to 10 seconds via --alarm_timeout=10. This means that, due to some internal calculations, osquery will wait 6 seconds instead of 11 to do a SIGKILL. alarm_timeout is really meant to control the clean shutdown logic and provide a deadline for that; unfortunately it is currently tied to the resource-limit kill as well. But with the PR above I changed it so that osquery kills the process as soon as the watchdog notices the limit being hit.
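To make the timing discussion concrete, here is a rough shell sketch of the general graceful-stop-then-force-kill escalation being described; this illustrates the pattern only, not osquery’s actual watchdog code, and the PID argument and 10-second grace period are placeholders:

```sh
#!/bin/sh
# Illustrative SIGTERM -> SIGKILL escalation; NOT osquery's implementation.
WORKER_PID="$1"                   # placeholder: PID of the misbehaving worker

kill -TERM "$WORKER_PID"          # ask the worker to shut down cleanly
sleep 10                          # grace period; under fast memory growth this window is the problem
if kill -0 "$WORKER_PID" 2>/dev/null; then
  kill -KILL "$WORKER_PID"        # still alive after the deadline: force-kill
fi
```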
h
Will test the alarm_timeout as well.
There might be two things here that are not working as expected, though. The watchdog is too slow between the nice stop and the kill: in 10 seconds we’ve loaded up several gigs of RAM. This is hopefully fixed in v5.2.2 if you get your changes in, fingers crossed. But the specific log4j query and the insane memory usage that occurs confuse me a bit. Does it try to load all JARs into RAM at once, or could this be a separate thing?
I added the --yara_malloc_trim config option to see if that made any difference, but no change. So, without particular insight into the code, I am guessing we’re just not releasing memory between each JAR load for this particular query? The PR you made, combined with --yara_delay, would protect us from killed servers in the future, which is what we want to avoid, so we are very grateful for that. But it won’t let us run this (or similar) queries due to excessive memory usage. *ponders*
s
My expectation would be that it scans them serially, but I’m not too sure what is causing it; I would need to dig into it. Might also be a leak. Could you open an issue for this?
h
I would expect the same actually, which confused me a bit. Happy to open an issue.
s
thanks!
h
Thank you for the quick PR for the watchdog @Stefano Bonicatti! 🙂
Just following up, issue created: https://github.com/osquery/osquery/issues/7475