# core
It seems like the watchdog logic continues to cause unexpected behavior from a high-level “how does this all work” perspective. Thoughts on disabling it by default and, at the same time, emitting a warning that it is off, with a link to how to turn it on (a flag) and the implications?
I want to understand some of the implications of this. So I think the change is:
current default: --disable_watchdog=false

proposed default: --disable_watchdog=true
1. What are scenarios where you wouldn't want to use the watchdog at all, rather than tuning it?
2. Is it possible to change this setting from TLS config? Like, if I start with the new default (watchdog disabled), then connect to get TLS config that tells me to enable the watchdog, does that actually work? I've found that a handful of settings don't work properly when you try to configure them remotely.
Correct, that is my proposal. Other options we can compare / contrast include (a) keeping the watchdog enabled but making the default limits more relaxed; (b) adding more notification and awareness around what the watchdog is and when it "steps in"; (c) adding a better API between the watchdog and the 'worker' process to achieve more meaningful logs.
There are varying amounts of work for each, of course.
To answer (2), it's not really possible since one goal of the watchdog is to do as much work as possible in the 'worker'.
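A toy sketch of the ordering problem behind the answer to (2), assuming the watchdog is a parent process that has to exist before the worker starts, and that remote (TLS) config is only fetched by the worker. This is not osquery code: `CliFlags`, `start`, and the step strings are hypothetical names made up for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class CliFlags:
    # Hypothetical stand-in for command-line flags; True mirrors the proposed default.
    disable_watchdog: bool = True


def start(cli: CliFlags, fetch_remote_config: Callable[[], Dict[str, bool]]) -> List[str]:
    """Toy startup sequence; returns the ordered list of steps taken."""
    steps: List[str] = []

    # Step 1: the supervision decision uses CLI flags only,
    # because no worker exists yet to fetch remote config.
    supervised = not cli.disable_watchdog
    if supervised:
        steps.append("watchdog started")
    else:
        steps.append("warning: watchdog is disabled")

    # Step 2: the worker starts (under the watchdog if supervised).
    steps.append("worker started, supervised=%s" % supervised)

    # Step 3: only now does the worker fetch remote config.
    remote = fetch_remote_config()
    if remote.get("disable_watchdog") is False and not supervised:
        # Too late: there is no parent process to do the supervising.
        steps.append("remote disable_watchdog=false ignored")

    return steps
```

Calling `start(CliFlags(), lambda: {"disable_watchdog": False})` walks through the case from question (2): the process starts unsupervised, and the remote request to enable the watchdog arrives after the point where it could have taken effect.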
Re: (2) - then I'd be pretty worried about making this change in default behavior. I'd be more in favor of changes that overall improve the watchdog interface / behavior / logging / documentation. And possibly relaxing some of the defaults 🤷 But it would be worrying to run osquery by default without any checks if any queries interact poorly with a given host.
If the issue is that the watchdog behavior is confusing, and possibly needs improvement, then I suspect that toggling the default doesn’t help. It will still be buggy and confusing when people enable it.