# fleet
t
Hi, it's me again from https://osquery.slack.com/archives/C01DXJL16D8/p1665682341784749. We rolled back our DB to make sure it was not causing the issue - it was not. I upgraded our PR Fleet server to 4.21.0. No errors when running
systemctl status fleet.service
after upgrading from 4.17.0 to 4.21.0. The osquery web UI was even working briefly after the upgrade! However, the Fleet service continues to use up all the memory on the server until it kills itself. I don't know what to do next! This issue is not happening in our NP environment, and the two environments have almost identical hosts and queries.
So I suppose my question is: is there any way to curb Fleet from using up all the memory? Is there any way I can set a max usage allowance? Right now it's set to 8GB. Our NP server is using less than 2% of that, whereas our PR server is using almost 90%, and nothing is running.
k
There isn't a way to cap that usage on the Fleet side, but we can definitely keep looking into what might be causing the freeze-up! Are you still seeing
context cancelled
errors in the Fleet logs, or have things changed there?
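(Not a Fleet-side setting, but if an OS-level ceiling would help while we keep digging, a systemd override on the unit can cap it. A rough sketch, assuming fleet.service is the unit name from the systemctl output above, a systemd new enough for MemoryMax= (231+), and a 6G limit picked arbitrarily - note this only makes the kernel throttle/kill Fleet at the limit, it doesn't fix whatever is eating the memory:)

```
# Cap fleet.service memory at the OS level (older systemd/cgroup v1 uses MemoryLimit= instead)
sudo systemctl edit fleet.service
# in the editor that opens, add:
#   [Service]
#   MemoryMax=6G
sudo systemctl restart fleet.service
systemctl show fleet.service -p MemoryMax   # confirm the cap was applied
```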
t
Hi again - yes, but I think the "context cancelled" errors are unrelated. The NP server has those same errors too, and NP is functional. Here's our theory: the query that guy ran that broke PR is still trying to run every time the Fleet service starts up. I don't know much about Redis, but when I run "monitor", the terminal blows up as soon as I start the Fleet service. I ran "flushdb" and "flushall", but restarting the Fleet service still causes it to blow up. Not sure if there's a way to view the current queue or restart Redis?
Starting fleet causes this to happen in redis (I have no idea what I'm looking at):
But in NP, starting fleet while running monitor just does this:
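(For the "view the current queue" part - I'm not assuming anything about Fleet's Redis key layout, these are just generic redis-cli inspection commands, and the Redis service name varies by distro/package:)

```
redis-cli DBSIZE                  # how many keys are in the current db
redis-cli --scan --pattern '*'    # list keys without blocking the server the way KEYS does
redis-cli INFO memory             # how much memory Redis itself is holding
sudo systemctl restart redis      # or redis-server, depending on the distro
```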
z
Can you shut down the Fleet servers, run the Redis
flushdb
command, then run
monitor
and start the Fleet server back up? It would be helpful to see what the first few Redis commands are when the server starts so that we can try to understand where that query is coming from.
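(Roughly this sequence, with the fleet.service / Redis names assumed from earlier in the thread and the capture window picked arbitrarily:)

```
sudo systemctl stop fleet.service
redis-cli FLUSHDB                               # clear the current Redis db
redis-cli MONITOR > /tmp/redis-startup.log &    # capture every command Redis receives
sudo systemctl start fleet.service
sleep 10                                        # give Fleet a few seconds of startup traffic
kill %1                                         # stop the MONITOR capture
head -n 50 /tmp/redis-startup.log               # the first commands sent to Redis on startup
```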
t
Done - this is what it looks like, but then a few seconds afterwards it spams that gibberish you saw in the first screenshot.