When coding the new tls_logger backoff feature, I ...
# core
v
When coding the new tls_logger backoff feature, I realized that the following configs are not being updated dynamically via
config_tls_endpoint
. They are only set once at startup and cannot be changed afterwards.
Copy code
--logger_tls_endpoint
--logger_tls_period
--logger_tls_max_lines
Is this a bug? Or is the fact that these settings are set once and cannot be updated until osquery restart documented somewhere? I'd like to see the ability to remotely restart osquery if a setting cannot be applied -- either as a separate config
--osquery_restart=true
or automatically if an updated setting via config_tls_endpoint cannot be immediately applied.
s
I don't think that can be considered a bug per se. It's just a choice of the moment when writing the code. The loggers are constructed with those parameters and so they do not react when changed. Also, osquery has the concept of CLI_FLAG, which are flags that can be given only through the process arguments or the flagfile, and some are marked as that because they can only be changed once at startup, others because one wants to enforce that they can't be changed remotely easily. Right now is all quite inconsistent, because there definitely should be more CLI_FLAG, even only to mark which flags will not react to change post startup.
As for remote restart, I haven't thought deeply about this, but wouldn't that be the job of the service handler?
or launcher?
Especially because the communication from a fleet manager will be received by the worker, not the watchdog, which is the one with the ability of restarting the worker.
To be honest though I think that people might want to have the logger_tls_endpoint be fixed and not changeable remotely, to have some security; the other two I see them more amenable to remote change; meaning that the code can be changed so that it always reads the period and max lines from the flags.
v
Maybe there should be a concept of DYNAMIC_FLAG, which can be overwritten remotely at any time?
s
There is already, FLAG
FLAGs are those flags that should react when the config is changed. If you want the bug is that they are not marked as CLI_FLAG. I was just mentioning though that unfortunately the current state of the flags is a bit inconsistent. In the past I changed some flags from FLAG->CLI_FLAG, for consistency and also actually enforced that CLI_FLAG could not be changed via config file (this was always documented but not enforced). But we have to be careful because this breaks people configs ^^'
v
As I user, if I find a flag in osquery documentation and try to set it remotely, how do I know if it will actually be applied or not?
s
This is not something that's communicated in the readthedocs doc about each flag, but can be seen in the
osquery --help
which always has been the ultimate truth. Again FLAGs will react to remote config file changes if the code has been written to do so, and that flag is not erroneously marked as FLAG CLI_FLAGs will not react to remote config file changes
in the
--help
the "osquery command line flags" are the CLI_FLAGS, the "osquery configuration options (set by config or CLI flags):" are the FLAGs
Marking something as FLAG is not enough to induce the behavior of "the code always reacts to the flag changing", marking something as CLI_FLAG will always induce the behavior of "cannot be set remotely via config file"
v
Doesn't seem like a good user experience. If I find a flag in the documentation that I want to set dynamically, I need to go look in the code to see if the code has been written to pick up the flag change.
☝️ 1
s
I agree, but fixing it is done via marking each flag correctly.
And I definitely would like to have each flag marked correctly, just had not had time to go through them all and also did not want to cause a massive breakage of configs. Something like this would likely require some osquery versions to just warn about the change.
last time we did the change we only did one version, but that version also had other issues, so likely no one tried it.
v
The behavior is a bit different here. These flags are set remotely initially, but then afterwards they cannot be updated. So they should not be labeled as CLI_FLAGS. That's why I was proposing a new label for them.
s
Well, but now they do not get picked up and modified, before they are read. That been said if the "modify once but remotely" behavior is what you want, then yes we would need a new variation of a flag. Although at this point I wonder if it makes sense to create another category or it's confusing. As I mentioned earlier, period and max_lines could be kept as FLAG and moved in the code where they are read each time, so when the config file changes them, then they are picked up. Is there are reason to why you want to change them only once?
v
I'm just thinking of this new label for the future (for other flags). These flags should be modified dynamically. I made this fix as part of my tls_logger backoff PR: https://github.com/osquery/osquery/pull/8230
s
And talking about the documentation being out of sync, that's mostly because it's a manual procedure. The macros could be grepped and a script be written to list them all, presenting them with the info they have. Couple of issues with that is that until now for the flags that are documented on readthedocs, they have a longer explanation too which is manually written in the docs, and cannot be therefore extracted from the flags themselves. I think if we want to ensure that everything stays in sync, code should be the truth and FLAGS macros should support a short and long description, so that then that can be parsed and used for the documentation too (I'm mainly referring to generate a page like this: https://osquery.readthedocs.io/en/latest/installation/cli-flags/, instead of manually updating it everytime after adding a new flag. I also just realized the misnomer of the page itself, which clashes with the concept of CLI_FLAGs, since there you have both types). We also have other pages that have longer, in context explanation of some flags, but those obviously will stay manually written.
If anyone wants to go ahead and make things more consistent automatically, I think that's always welcomed.
s
I’m coming in a little late here, and I’m not sure what the state of this discussion is… As I understand it, today things are setup around two flags styles. CLI_FLAG and FLAG. The former must be specified on the command line, or flags file. The latter may also show up in the config, This can primarily be seen in the
--help
output. This difference is partly around security/privacy, and partly around what makes sense implementation-wise As Stefano said — there’s a lot of manual updating around this, so there may well be bug in how the docs are written. And even in which flags show up where. There’s almost certainly place to PR fixes… It sounds like there is a separate question around some flags which can be set once via a remote config but then cannot be changed. I cannot decide if this is a bug, or if it’s a category of flag we should embrace. I’m curious where people are leaning.
s
For the new category, some of those flags are like that due to construction and, if you want, limitations of the current code. Like some of the flags connected to event publishers, because we currently cannot disable or enable multiple times a publisher at runtime, without restarting.
Some of those flags are already CLI_FLAG, because it also somewhat makes sense, but other are like that "just" to reflect the facts that you have to give them "at startup"
s
I’m kinda nosing around whether we should make a new flag type, or just add a note to the flag descriptions. If we consider it a bug, a note feels more appropriate. If we think of it as intentional, than a new flag catagory
s
I think the bug right now is the incorrect classification on being a FLAG. Until now the intention of CLI_FLAG was both to say "something you have to give at startup, via flagfile or process args" and "something you can't change later". Therefore where people remembered and wrote code where the logic did not re-read the value, they would mark it as CLI_FLAG.
I think it's just a result of not having enforced consistency given the "rule", which was documented, but also not actually enforced in code (for those flags who could change and affect the code, but where we didn't want to), until recently.
1
s
I thought we added a warning if a CLI_FLAG shows up in the config?
s
Yeah, sorry, I'm talking about two kinds of enforcing here. 1. Although the guideline of marking FLAG flags that we want to change from a remote config, and that can affect the code, because they are re-read, and the other guideline for CLI_FLAG I mentioned above existed since a while, enforcing the correct labeling of the flags did not happen often, result: some flags are mislabeled. 2. To further "prove" the inconsistency, I was bringing the CLI_FLAG issue where previously we would not enforce that CLI_FLAGs could not be changed via remote config (and we would not even warn about it). We later changed that, and now is enforced, and it also prints warnings.
So to me, we should first maybe make a list of all FLAGs and, even if we are not fixing it in code, just document which ones do not cause change in behavior, even if you can change their value in memory/via remote config. Those would be all the candidates for being CLI_FLAG, for me. Or the code has to be changed so that they can actually behave as FLAGs. The additional discussion is around, "but what if I want something similar to CLI_FLAG, in that I can set it once, but remotely?" I think for those kind of flags there are couple of questions: 1. What's the underlying reason for modifying them just once? And via remote config specifically? 2. What "once" really means? Meaning, where we would store the fact that they have been set? In memory? In RocksDB (which it unfortunately easily corrupts?)
s
Yeah, I agree.
But I also have a bias to inaction
s
I'm less sure about the security separation between CLI_FLAG and FLAGs. Meaning, I think it's good to have some flags that cannot be changed easily via a remote config, without too much noise; this was also part of the reason I wanted to actually enforce that CLI_FLAGs could not be changed remotely (and a customer asked too ^^'). But this also maybe hinged a bit on the backend not also having other functionality that can simply push files to machines and restart osquery when they want. Basically their question was "what if backend X gets owned?". In any case I guess having to make "more noise" to apply some settings is better.
And sorry, entered a slight tangent, all of this to say, I'm unsure where on the spectrum of security those "set once but remotely" flags would land.
v
I'm on board with just having CLI_FLAG and FLAG, and that there are bugs where some FLAG should actually be CLI_FLAG. I fixed the above-mentioned logger_tls configs to have FLAG functionality in my PR.
s
set once but remotely
Feels like it would only make sense with a lot of additional monitoring. And it’s feels like it’s getting a bit far outside our usual use case