# kolide

daniel319b

01/07/2019, 7:09 PM
@nyanshak Regarding the problem with querying lots of hosts, I've found the cause: it was Redis. I noticed in the logs that Redis was dropping the client's subscription a few minutes after I launched a live query, and that's what caused it to get stuck. I had to increase the "output buffer limit" in the Redis configuration, and that fixed the problem.
🤞 1
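For reference, the "output buffer limit" described above most likely corresponds to Redis's `client-output-buffer-limit` directive for the `pubsub` client class; that mapping is an assumption, since the message does not name the directive. The sketch below shows how a new value could be computed from an existing one. The starting string mirrors the stock redis.conf defaults, the 512 MB hard limit is the figure mentioned later in this thread, and the 128 MB soft limit and 60 second window are purely illustrative. The helper names `parse_limits` and `with_pubsub_limit` are hypothetical.

```python
# Sketch: build a `client-output-buffer-limit` value with a larger pubsub limit.
# Assumption: the setting is a flat string of 4-token groups,
# "<class> <hard bytes> <soft bytes> <soft seconds>", as reported by CONFIG GET.

def parse_limits(raw: str) -> dict:
    """Split the flat string into {class: (hard_bytes, soft_bytes, soft_seconds)}."""
    parts = raw.split()
    if len(parts) % 4 != 0:
        raise ValueError(f"unexpected format: {raw!r}")
    limits = {}
    for i in range(0, len(parts), 4):
        name, hard, soft, secs = parts[i:i + 4]
        limits[name] = (int(hard), int(soft), int(secs))
    return limits


def with_pubsub_limit(raw: str, hard_bytes: int, soft_bytes: int, soft_seconds: int) -> str:
    """Return the same setting string with only the pubsub class replaced."""
    limits = parse_limits(raw)
    limits["pubsub"] = (hard_bytes, soft_bytes, soft_seconds)
    return " ".join(f"{name} {h} {s} {t}" for name, (h, s, t) in limits.items())


MB = 1024 * 1024
# Stock defaults: normal clients unlimited, pubsub 32 MB hard / 8 MB soft over 60 s.
DEFAULT = "normal 0 0 0 slave 268435456 67108864 60 pubsub 33554432 8388608 60"

# 512 MB hard limit (from this thread); 128 MB soft limit is an illustrative guess.
new_value = with_pubsub_limit(DEFAULT, 512 * MB, 128 * MB, 60)
print(new_value)
```

On a self-managed server, the resulting string could be applied with `CONFIG SET client-output-buffer-limit "<value>"` or written to redis.conf. Managed Redis services may restrict `CONFIG` and expose the setting some other way, so check the provider's documentation first.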

zwass

01/07/2019, 9:12 PM
Glad to hear you worked that out! How high is the throughput of results you are working with?

nyanshak

01/07/2019, 9:49 PM
I upgraded the ElastiCache node type to try to get a larger buffer size for Redis, but it didn't help at all. In my case, it doesn't even really look like Redis is doing anything, as the CPU and network traffic are so low.
Didn't fix my problem with querying.
@daniel319b what Redis version / other config do you have?
@zwass

daniel319b

01/08/2019, 9:53 PM
@zwass hmm, do you mean the throughput in Redis?
@nyanshak Yeah, I noticed that the CPU is very low on my Redis as well. I'm using 3 Fleet servers behind an NLB, 1 MySQL server (latest version), and 1 Redis server (latest version). I played around with SQL max connections; I think it's around 2000 now. The Redis output buffer size limit is 512MB.
Have you checked your Redis logs?
It's still not working as smoothly or as fast as I'd like, but it gets the job done.

nyanshak

01/08/2019, 10:00 PM
I'm using ElastiCache and haven't checked the Redis logs. I did bump the node size to one where the output buffer size was much larger (the max size supported by Amazon), but didn't see any improvement.
@daniel319b how many hosts do you have connected?
And what are your settings for config_refresh and distributed_interval?

daniel319b

01/08/2019, 10:04 PM
I have about 9K hosts, and I use the default values. When you query, do you select a label or "All Hosts"? I've noticed that when I run a query on a label, it's slower.

nyanshak

01/08/2019, 10:06 PM
all hosts
The query I'm running should be straightforward as well: select uuid from osquery_info;

daniel319b

01/08/2019, 10:10 PM
What's your SQL max conns? And how much RAM/CPU do your servers have? Maybe you need to upgrade?
Oh, never mind, I saw the GitHub issue.

nyanshak

01/08/2019, 10:22 PM
Yeah... The database is a db.m5.12xlarge, and the Fleet instances are c5.4xlarge.
I've tried max conns at 50, 256, 512, 2048, ...
It didn't seem to make much of a difference.