# fleet
d
I've seen this error. Does it indicate that Fleet used all the RAM available on the machine?
Jun 01 14:23:00 XXX fleet[3998530]: runtime: out of memory: cannot allocate 16777216-byte block (15477866496 in use)
Jun 01 14:23:00 XXX fleet[3998530]: fatal error: out of memory
k
It looks that way, @Daniel Bretón Suárez. How much RAM is available to Fleet?
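For reference, here is a quick sketch of the arithmetic behind that reading of the log. The two byte counts are copied from the error above; the helper name `to_gib` is only illustrative.

```python
# Byte counts copied from the Go runtime error in the log above.
IN_USE_BYTES = 15477866496   # "15477866496 in use"
REQUESTED_BYTES = 16777216   # "cannot allocate 16777216-byte block"

GIB = 1024 ** 3
MIB = 1024 ** 2


def to_gib(num_bytes: int) -> float:
    """Convert a byte count to GiB (hypothetical helper name)."""
    return num_bytes / GIB


in_use_gib = to_gib(IN_USE_BYTES)
requested_mib = REQUESTED_BYTES / MIB

print(f"in use:    {in_use_gib:.2f} GiB")
print(f"requested: {requested_mib:.0f} MiB")
```

That works out to roughly 14.4 GiB in use at the moment a 16 MiB allocation failed, which is consistent with a 16 GB machine being nearly exhausted.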
d
Thank you Kathy. The whole system has 16 GB and all 16 GB are available to Fleet, so I'm assuming ~15 GB is usable once other processes are taken into consideration.
j
If you cannot add more RAM to your machine, you might want to try ZRAM-Swap. It comes with a slight (really just slight) performance penalty, but it's orders of magnitude faster than swapping to disk, and it's much better than having your Fleet server reaped by the OOM killer.
d
If I could add more RAM to my machine, is there a software limit on the memory the Fleet service can use? If so, how can I change it?
j
None that I'm aware of or have seen in the documentation, but I'm just a pretty newb user. ¯\_(ツ)_/¯
k
There isn't anything on the Fleet side, but vulnerability processing requires 4GB of available memory. Is this machine dedicated to Fleet? It sounds like there may be other things eating up that memory.
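One way to check whether the host has that much headroom is to look at `MemAvailable` in `/proc/meminfo`. The snippet below is a minimal sketch, assuming a Linux host; the 4 GB threshold is the figure mentioned above (treated here as 4 GiB), and the function names and sample values are hypothetical.

```python
import re

# 4 GB figure from the thread, interpreted as 4 GiB for this sketch.
REQUIRED_BYTES = 4 * 1024 ** 3


def mem_available_bytes(meminfo_text: str) -> int:
    """Return the MemAvailable value from /proc/meminfo-style text, in bytes."""
    match = re.search(r"^MemAvailable:\s+(\d+)\s+kB$", meminfo_text, re.MULTILINE)
    if match is None:
        raise ValueError("MemAvailable not found in input")
    return int(match.group(1)) * 1024


def has_headroom(meminfo_text: str, required: int = REQUIRED_BYTES) -> bool:
    """True if the reported available memory meets the required amount."""
    return mem_available_bytes(meminfo_text) >= required


# Hypothetical sample resembling a nearly exhausted 16 GB host.
SAMPLE = """MemTotal:       16384000 kB
MemFree:          204800 kB
MemAvailable:     512000 kB
"""

print(has_headroom(SAMPLE))
```

On a real host you could pass in the contents of `/proc/meminfo` instead of `SAMPLE`, for example `has_headroom(open("/proc/meminfo").read())`.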
b
15GB would be a lot of memory for fleet. How many hosts are enrolled?
Also do you notice it OOM’ing during a specific operation? Like a live query against the entire fleet?
m
@Jörg Sachse TIL. Cool tip, thanks!
+1 to the question above about whether it OOMs during a specific operation, like a live query against the entire fleet. @Daniel Bretón Suárez, just double-checking that you got this resolved.
d
We've added 16 GB more to the machine and the problem is more or less resolved. There are 11K agents and 3 Fleet instances with 32 GB RAM and 8 cores. RAM usage is high, around 40% of that 32 GB. The CPUs are saturated at >700% across the 8 cores. The scheduled queries are disabled, so I think the only remaining possibility is a very resource-intensive live query we executed for testing. Is there a way to cancel live queries on all the enrolled agents?
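A rough sketch of what those figures work out to, assuming the 32 GB and 8 cores describe a single machine and that agents are spread evenly across the 3 instances (both are assumptions, not something measured here):

```python
# Figures quoted in the message above.
TOTAL_RAM_GB = 32
RAM_USED_FRACTION = 0.40
CPU_PERCENT = 700   # aggregate CPU usage as reported across all cores
CORES = 8
AGENTS = 11_000
INSTANCES = 3

ram_used_gb = TOTAL_RAM_GB * RAM_USED_FRACTION
per_core_percent = CPU_PERCENT / CORES
# Assumes an even spread of agents across instances.
agents_per_instance = AGENTS / INSTANCES

print(f"RAM in use:          {ram_used_gb:.1f} GB")
print(f"CPU per core:        {per_core_percent:.1f}%")
print(f"Agents per instance: {agents_per_instance:.0f}")
```

So memory now has plenty of headroom, while CPU at about 87.5% per core is the tighter constraint.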
k
You can disable live queries entirely for a time to see what your usage looks like at that point. Are those queries currently being executed by a script? What version of Fleet are you running?
b
@Daniel Bretón Suárez where is TLS termination happening?
ping ^