Denis
11/17/2023, 10:18 AM
Denis
11/17/2023, 10:27 AM
Stefano Bonicatti
11/17/2023, 11:08 AM
--logger_tls_period
Stefano Bonicatti
11/17/2023, 11:10 AM
Stefano Bonicatti
11/17/2023, 11:18 AM
Denis
11/17/2023, 11:24 AM
Stefano Bonicatti
11/17/2023, 11:25 AM
Denis
11/17/2023, 11:29 AM
"When everything is working, what's the bandwidth used?"
About 10-20 Mbit/sec
Stefano Bonicatti
11/17/2023, 11:29 AM
--logger_tls_max_lines, which defaults to 1024, can help with the amount of data sent.
Stefano Bonicatti
11/17/2023, 11:32 AM
"About 10-20 Mbit/sec"
I see, so I suspect you might be in the situation I mentioned, if you haven't already reduced that other parameter. My guess, again, is that you would normally send about 1/100th of that, because that's the rate at which osquery normally generates logs. When things don't work, though, the logs accumulate in the DB, so each batch gets bigger because the maximum batch size has not been tuned.
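To put rough numbers on that effect, here is a minimal sketch (not osquery code). It assumes each flush sends whatever is pending, capped at the max lines setting. The 1000-byte average line, the 10 new lines per flush and the 4-second period are illustrative assumptions, and the function names are made up for this example.

```python
def lines_per_batch(backlog_lines: int, new_lines: int, max_lines: int) -> int:
    """Lines sent in one flush: everything pending, capped at max_lines."""
    return min(backlog_lines + new_lines, max_lines)


def bandwidth_mbit(lines: int, avg_line_bytes: int, period_s: float) -> float:
    """Average upload rate (Mbit/s) when `lines` lines go out every period_s seconds."""
    return lines * avg_line_bytes * 8 / period_s / 1_000_000


# Assumed values, for illustration only.
AVG_LINE_BYTES = 1000   # hypothetical average size of one log line
PERIOD_S = 4            # assumed flush period in seconds
MAX_LINES = 1024        # default batch cap mentioned in the chat

# Healthy client: no backlog, 10 new lines since the last flush.
steady = lines_per_batch(0, 10, MAX_LINES)
# Client that could not send for a while: 50,000 lines waiting in the DB.
backlogged = lines_per_batch(50_000, 10, MAX_LINES)

print(f"steady: {steady} lines, {bandwidth_mbit(steady, AVG_LINE_BYTES, PERIOD_S)} Mbit/s")
print(f"backlogged: {backlogged} lines, {bandwidth_mbit(backlogged, AVG_LINE_BYTES, PERIOD_S)} Mbit/s")
```

With these made-up inputs the backlogged client sends full batches and uses roughly 100 times the bandwidth of the healthy one, which is the kind of gap described above. Real ratios depend on the actual log rate and line sizes.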
Denis
11/17/2023, 12:25 PM
Denis
11/17/2023, 12:41 PM
Stefano Bonicatti
11/17/2023, 12:41 PM
logger_tls_max_linesize defaults to 1 MiB, but tuning it might make less sense: while it's true that each line could theoretically reach that size, 1 MiB is big (and unlikely, I believe).
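For a sense of scale, a small sketch of the theoretical ceiling of a single batch, using the two defaults quoted in this thread (1024 lines, 1 MiB per line). The helper name is hypothetical, and real batches should be far below this bound because lines are rarely that large.

```python
MIB = 1024 * 1024


def worst_case_batch_bytes(max_lines: int = 1024, max_linesize: int = MIB) -> int:
    """Upper bound on one batch: every line at the maximum allowed size."""
    return max_lines * max_linesize


# Defaults: 1024 lines of 1 MiB each.
default_ceiling = worst_case_batch_bytes()
# Lowering the line count shrinks the ceiling proportionally.
reduced_ceiling = worst_case_batch_bytes(max_lines=128)

print(default_ceiling // MIB, "MiB with the defaults")
print(reduced_ceiling // MIB, "MiB with max_lines=128")
```

The bound scales linearly with either setting, so reducing the line count is usually the more practical lever of the two.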
Finally, be careful not to backlog your clients, especially if you increase the period and/or reduce the number of lines sent too much, because then the client's RocksDB database will slowly grow and become slower.
There's a limit even there, buffered_log_max: if it's hit, the DB will start dropping old logs. It's quite high currently (1M entries).
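A rough way to check whether a given period and line count would backlog a client, and how long it would take to reach the 1M-entry limit. This is a simplified model, not osquery's actual logic: it assumes a constant generation rate, that every flush succeeds and that each flush sends at most the configured number of lines. The function name and the example numbers are made up.

```python
from typing import Optional


def seconds_until_buffer_full(
    gen_lines_per_s: float,
    max_lines: int,
    period_s: float,
    buffered_log_max: int = 1_000_000,
    backlog_lines: int = 0,
) -> Optional[float]:
    """Seconds until the backlog reaches buffered_log_max, or None if it never grows."""
    drain_lines_per_s = max_lines / period_s   # best-case send rate
    growth = gen_lines_per_s - drain_lines_per_s
    if growth <= 0:
        return None  # the client keeps up; the backlog does not grow
    return (buffered_log_max - backlog_lines) / growth


# Hypothetical tuning: 100 lines every 10 seconds, client generating 50 lines/s.
eta = seconds_until_buffer_full(gen_lines_per_s=50, max_lines=100, period_s=10)
print("hours until old logs start being dropped:", eta / 3600)

# The same tuning with a quieter client (5 lines/s) never backlogs.
print(seconds_until_buffer_full(gen_lines_per_s=5, max_lines=100, period_s=10))
```

If the result is a finite number for the log rates you actually see, the period and line count are too conservative for that client.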
Keep in mind, though, that dropping logs might also cause further slowdowns, because the DB has to do work to drop them, so this mechanism is only OK if the threshold is hit only temporarily.
Stefano Bonicatti
11/17/2023, 1:28 PM
added or removed line (so one line is one row that's different, be it added or removed; this is normal for events). For snapshot queries, instead, a line is the whole query result.
Given such a huge range of sizes, I would tune the number of lines with the events/differential queries in mind (those are the ones that can grow on that axis).
The line size, on the other hand, can indeed get big, but only for snapshot queries.
Stefano Bonicatti
11/17/2023, 1:31 PM
processes table can become bigger because there are more processes at that time, or maybe because the paths to the binaries are longer (but there would need to be a lot of them).
That still requires a bit of knowledge from, and collaboration with, whoever is writing the queries.
Denis
11/17/2023, 1:39 PM
Stefano Bonicatti
11/17/2023, 1:42 PM
logger_tls_max_lines is correct for events, but can be problematic for snapshot queries.
In hindsight, osquery should have had a single flag with a bandwidth target to reach, with osquery automatically working out how many "lines" to fit in each batch.
Stefano Bonicatti
11/17/2023, 1:58 PM
Stefano Bonicatti
11/17/2023, 1:59 PM