# fleet
Clément
Hello fleet community, I did a Fleet upgrade 3 weeks ago (from version 4.6.1 to 4.20.1) and have been experiencing a lot of OOMs since. I will attach some stderr logs in the thread. I tried adding memory and bumping to 4.22.0, and am still facing the issue. Is anyone seeing the same behavior? A few things about the config:
• 30 Fleet instances with 8GB RAM each, running behind an LB (haproxy)
• DB is MariaDB (I know it is not officially supported)
• Managing ~40k servers (mostly Linux with a few Windows) running osquery versions from 4.6 up to the latest
• Software inventory is disabled
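For readers following along, the setup above maps onto Fleet's standard server configuration; since MariaDB speaks the MySQL wire protocol, it is pointed at through the regular mysql settings. A minimal sketch, with placeholder hostnames and credentials (nothing here is taken from the thread):

```yaml
# Minimal sketch of a Fleet server config for the setup described above.
# Hostnames, ports, and credentials are placeholders, not values from the thread.
mysql:
  address: mariadb.internal:3306      # MariaDB is configured via the regular MySQL settings
  database: fleet
  username: fleet
  password: "${FLEET_MYSQL_PASSWORD}"
server:
  address: 0.0.0.0:8080               # each of the 30 instances sits behind the haproxy LB
```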
Stderr logs
{"component":"http","err":"write tcp ip:port->ip:port: i/o timeout","level":"info","path":"/api/v1/osquery/config","ts":"2022-10-25T09:31:14.587630969Z"}
2022/10/25 09:31:14 http: superfluous response.WriteHeader call from github.com/prometheus/client_golang/prometheus/promhttp.(*responseWriterDelegator).WriteHeader (delegator.go:65)
Erratum: the initial version was 4.2.3 (no issue on that one), not 4.6.1.
Roberto
Hey there! Thanks for all the details. We did coincidentally load test 4.22.0 with ~40k hosts and we didn't notice this; could you share a couple more details with us?
1. We have this guide on debugging; could you provide as many of the details listed there as you are able to?
2. Are there any other logs around the one you posted, or is it mainly /api/v1/osquery/config that's the problem?
3. You mentioned software inventory is disabled, however it might be enabled per team. Do you have teams set up?
Clément
Hello Roberto, thank you for your reply.
1. I will gather and provide you as much data as I can.
2. Mostly the provided logs, sometimes with an "error in query ingestion" one as well. I did some searching on the prometheus client log, and it is an issue fixed in a later version of the prometheus Go client. Any thoughts about bumping it?
3. It is explicitly disabled in the config (pushed via fleetctl apply with a YAML file) and we don't use teams. If that helps, the UI mentions that it is disabled on the Software tab.
Additional info: we are also running Fleet in a preprod environment with a few hosts, and it is running without issue.
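As a reference for how that toggle is typically pushed, here is a sketch of the kind of org-settings spec applied with fleetctl; the exact key location is an assumption (older 4.x releases used `host_settings`, later ones `features` for the same setting):

```yaml
# Sketch of YAML applied with `fleetctl apply -f config.yml`
# to disable software inventory globally. Key placement varies by release:
# older 4.x versions used `host_settings:` instead of `features:`.
apiVersion: v1
kind: config
spec:
  features:
    enable_software_inventory: false
```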
Roberto
Thank you!
1. Sounds good, thanks again 🙂
2. Interesting, I will create an issue to bump prometheus to avoid filling up the logs with that, but as far as I can tell that shouldn't be the source of your problems. Are you able to share more logs? It'd be interesting to see what else is happening around the errors you see. Even a couple of hours of logging should be enough.
3. Understood, thanks!
@Clément Bouchard Another question: are you using packs or additional queries? Do you happen to have a lot of packs, or packs with a lot of queries?
Clément
4-5 packs and about 20 custom queries. I tried removing all of them and installing on a fresh new DB; same issue.
Well, I wanted to ensure that the database wasn't the issue and was about to migrate to one I manage. Prior to that, I removed the DB replica configuration, and it seems to have solved the issue. See the memory graph below.
The weird thing is that I had enabled the DB replica a week ago in order to improve performance and solve the OOM issue. Meanwhile, I also upgraded from 4.20.1 to 4.22.0.
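If the replica was wired in through Fleet's read-replica settings, the block that was removed would have looked roughly like the sketch below; the section and key names are given as best recalled from the Fleet docs, and the hostnames are placeholders:

```yaml
# Rough sketch of a read-replica block in the Fleet server config.
# Fleet writes to the primary (`mysql`) and routes most reads to
# `mysql_read_replica` when it is configured; removing this block
# sends all traffic back to the primary.
mysql_read_replica:
  address: mariadb-replica.internal:3306
  database: fleet
  username: fleet
  password: "${FLEET_MYSQL_PASSWORD}"
```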
Roberto
Wow, interesting. Do you mind if I create an issue with this information (possibly including the screenshot)? This is definitely something I'd like to investigate using MySQL instead of MariaDB.
Clément
Sure, let me know if I can help you reproduce the issue.