#kolide

daniel319b

01/07/2019, 7:09 PM
@nyanshak Regarding the problem with querying lots of hosts, I've found my problem: it was Redis. I noticed in the logs that Redis was dropping the client's subscription a few minutes after I launched a live query, and that's what caused it to get stuck. I had to increase the "output buffer limit" in the Redis configuration, and that fixed the problem.
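For reference, the "output buffer limit" mentioned here most likely corresponds to the Redis `client-output-buffer-limit` directive for the `pubsub` client class. Redis disconnects a subscriber once its output buffer reaches the hard limit, or stays above the soft limit for the given number of seconds. A minimal sketch of rendering that `redis.conf` line is below. The helper name is hypothetical; the 512mb hard limit is the figure given later in this thread, while the soft limit and seconds are illustrative assumptions, not values from the thread.

```python
def pubsub_buffer_limit_directive(hard: str, soft: str, soft_seconds: int) -> str:
    """Render a redis.conf line for the pub/sub client output buffer limit.

    hard: size at which Redis disconnects the client immediately.
    soft / soft_seconds: Redis also disconnects the client if the buffer
    stays above `soft` for `soft_seconds` seconds continuously.
    """
    return f"client-output-buffer-limit pubsub {hard} {soft} {soft_seconds}"


# Default that ships with Redis for pub/sub clients.
DEFAULT = pubsub_buffer_limit_directive("32mb", "8mb", 60)

# Raised limit: 512mb hard limit as mentioned in this thread;
# the 128mb soft limit and 60 seconds are assumptions for illustration.
RAISED = pubsub_buffer_limit_directive("512mb", "128mb", 60)

print(RAISED)
```

The printed line could be placed in `redis.conf` on a self-managed Redis server. Managed services such as ElastiCache typically expose this setting through their own configuration mechanism instead.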

zwass

01/07/2019, 9:12 PM
Glad to hear you worked that out! How high is the throughput of results you are working with?

nyanshak

01/07/2019, 9:49 PM
I upgraded the ElastiCache node type to try to get a larger buffer size for Redis, but it didn't help at all. In my case, it doesn't even really look like Redis is doing anything, as the CPU / network traffic is so low.
9:49 PM
Didn't fix my problem with querying
9:49 PM
@daniel319b what redis version / other config do you have?
10:00 PM
@zwass

daniel319b

01/08/2019, 9:53 PM
@zwass hmm do you mean the throughput in redis?
9:56 PM
@nyanshak Yeah, I noticed that the CPU is very low on my Redis as well. I'm using 3 Fleet servers behind an NLB, 1 MySQL server (latest version), and 1 Redis server (latest version). I played around with SQL max connections; I think it's around 2000 now. The Redis output buffer size limit is 512MB.
9:57 PM
Have you checked your Redis logs?
9:57 PM
It's still not working as fluidly or fast as I'd like but it gets the job done

nyanshak

01/08/2019, 10:00 PM
I'm using ElastiCache and haven't checked the Redis logs. I did bump the node size to where the output buffer size was much larger (the max size supported by Amazon), but I didn't see any improvement.
10:00 PM
@daniel319b how many hosts do you have connected?
10:01 PM
and what are your settings for config_refresh and distributed_interval

daniel319b

01/08/2019, 10:04 PM
I have about 9K hosts, and I use the default values. When you query, do you select a label or "All Hosts"? I've noticed that when I run a query on a label, it's slower.

nyanshak

01/08/2019, 10:06 PM
all hosts
10:06 PM
the query i'm running should be straightforward as well: select uuid from osquery_info;

daniel319b

01/08/2019, 10:10 PM
What's your SQL max conns? And how much RAM/CPU do your servers have? Maybe you need to upgrade?
10:12 PM
Oh never mind, I saw the GitHub issue

nyanshak

01/08/2019, 10:22 PM
Yeah... It's a db.m5.12xlarge and for fleet instances, c5.4xlarge
10:23 PM
I've tried max conns at 50, 256, 512, 2048, ...
10:24 PM
didn't seem to make much of a difference