#fleet
crimsonknave

04/09/2021, 5:53 PM
I'm seeing the following errors in my logs: "err":"retrieve live queries: receive sql: redigo: nil returned". They appear to be caused either by running a live query via fleetctl or by ending one of those queries early. However, the errors keep popping up in the logs long after any queries have been run. They seem to spike at regular 5-minute intervals, which matches my logger_tls_period. Is there something I can do to fix these?
zwass

04/09/2021, 5:59 PM
Is 5 minutes also your distributed_interval?
crimsonknave

04/09/2021, 6:06 PM
No, that's 30 seconds
6:09 PM
But, I haven't run a live query in over an hour and I'm still seeing these messages come in
6:14 PM
Well, they're less regular in 5 minute increments now, but that could be due to more than one query stuck like this
6:15 PM
From what I can tell, Redis is empty (but Redis is a bit of a mystery to me)
zwass

04/09/2021, 6:16 PM
keys '*' returns nothing?
crimsonknave

04/09/2021, 6:17 PM
I'm using redis commander and there aren't any entries in the GUI. I got an error when I tried to run that in the commandline bit. Let me dig a bit more.
6:20 PM
The tree view should show keys, but it's empty. Which is what I recall when I had to go in and clean up the live queries before the cleanup logic was implemented.
6:20 PM
Also, we just updated to 3.10.1 from 3.1 (I think, it was old).
6:21 PM
I added a test key and can see it in the tree, so I'm pretty sure it's empty aside from that key.
zwass

04/09/2021, 6:25 PM
Okay thanks for all that info. I'm going to see if we can do some better cleanup for this in the next release.
crimsonknave

04/09/2021, 6:40 PM
Thanks! Two quick questions: are these sorts of errors something I should be worried about? Are they expected if I run or cancel a query?
zwass

04/09/2021, 6:41 PM
I don't think you need to worry about them. We're being overly noisy about cleaning up older queries.
6:41 PM
But it is strange that you still see them even though Redis is empty... You shouldn't be able to hit that code path if Redis is empty.
crimsonknave

04/09/2021, 6:47 PM
Anything else I can do to debug right now?
zwass

04/09/2021, 6:49 PM
Is it still happening even after verifying that Redis is empty?
crimsonknave

04/09/2021, 6:49 PM
Yup, I also redeployed our Fleet (in Kubernetes)
zwass

04/09/2021, 6:49 PM
Are you seeing those errors in the Fleet server logs or in the logs that osquery clients are writing to Fleet?
6:49 PM
Can you paste a full log line?
crimsonknave

04/09/2021, 6:50 PM
That was from Fleet, one sec. I may have misspoken.
6:51 PM
I am still seeing them. 9k in the last 15 minutes.
6:52 PM
{
  "component": "service",
  "err": "retrieve live queries: receive sql: redigo: nil returned",
  "ip_addr": "10.125.6.0:19173",
  "level": "info",
  "method": "GetDistributedQueries",
  "took": "12.15416ms",
  "ts": "2021-04-09T18:50:47.401563414Z",
  "x_for_ip_addr": "10.127.50.32"
}
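To quantify how often this error shows up and whether it clusters on 5-minute boundaries, log lines like the one above can be bucketed by timestamp. This is a minimal sketch, assuming the Fleet server logs are available as one JSON object per line (the entry above is pretty-printed for readability); the function names and the 300-second bucket size are illustrative and not part of Fleet.

```python
import json
from collections import Counter
from datetime import datetime, timezone

# Error text copied from the log entry in this thread.
ERR = "retrieve live queries: receive sql: redigo: nil returned"


def parse_ts(ts: str) -> datetime:
    """Parse a timestamp such as 2021-04-09T18:50:47.401563414Z as UTC."""
    # The logged timestamps carry nanoseconds, but datetime stores at most
    # microseconds, so the fractional part is padded or trimmed to 6 digits.
    head, _, frac = ts.rstrip("Z").partition(".")
    frac = (frac + "000000")[:6]
    parsed = datetime.strptime(f"{head}.{frac}", "%Y-%m-%dT%H:%M:%S.%f")
    return parsed.replace(tzinfo=timezone.utc)


def bucket_errors(lines, bucket_seconds=300):
    """Count matching error entries per time bucket (default: 5 minutes)."""
    counts = Counter()
    for line in lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON log entry
        if not isinstance(entry, dict) or entry.get("err") != ERR:
            continue
        epoch = int(parse_ts(entry["ts"]).timestamp())
        start = epoch - epoch % bucket_seconds
        counts[datetime.fromtimestamp(start, tz=timezone.utc)] += 1
    return counts
```

Passing an open log file (or any iterable of lines) to `bucket_errors` and printing the sorted items would show whether the counts really spike every 5 minutes or arrive steadily.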
zwass

04/09/2021, 6:54 PM
Is it possible your Redis UI is connected to a different DB than Fleet? Can you verify by running a live query and seeing that some keys appear?
crimsonknave

04/09/2021, 6:56 PM
Running one now. I see livequery and sql
6:56 PM
Canceled it and those went away
zwass

04/09/2021, 7:06 PM
And does live query actually work? I'm just looking at the code and it seems like it should not be possible to hit that line if there are no keys at all.
crimsonknave

04/09/2021, 7:08 PM
Yup, last time I let it run for a while I got
⠓ 59% responded (100% online) | 7169/12159 targeted hosts (7169/7205 online)
before it felt like it wasn't going to get any more.
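The numbers in that progress line are consistent with each other if the first percentage is read as responded hosts over targeted hosts and the second as responded hosts over online hosts. The sketch below only checks that arithmetic; the `progress` helper is hypothetical and is not fleetctl's own code.

```python
def progress(responded: int, targeted: int, online: int) -> str:
    """Rebuild a progress line from raw host counts (illustrative helper)."""
    pct_targeted = 100 * responded / targeted  # share of all targeted hosts
    pct_online = 100 * responded / online      # share of hosts currently online
    return (
        f"{pct_targeted:.0f}% responded ({pct_online:.0f}% online) | "
        f"{responded}/{targeted} targeted hosts ({responded}/{online} online)"
    )


print(progress(7169, 12159, 7205))
# 59% responded (100% online) | 7169/12159 targeted hosts (7169/7205 online)
```

Under that reading, nearly every online host (7169 of 7205, about 99.5%) had already answered, so the remaining 41% of targeted hosts were simply offline.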
7:08 PM
And a bunch of data flowed in
7:09 PM
There are keys when the query runs, when I cancel it they get cleaned up.