# fleet
j
is there any way to get all labels to show up under pack targets? the search times out, and I can't select my custom label as a pack target
s
Hi Jocelyn, are you still experiencing this issue? If so, could you share what you are seeing in the browser network tab, or any additional info that might help with debugging?
j
the search still times out, but I figured out a workaround: adding all the labels as targets reveals labels further down the list, and then I go back and remove the labels I don't want to target. It would be nice if it displayed more than 6 labels at a time
I'm assuming the timeout is from trying to regex match with 160k hosts
there aren't any customizable timeout settings for database searches, which would also be nice to have for large deployments like ours
also a way to shard across multiple database hosts
our read replicas get no traffic, but our primary writer gets very hot
t
hi Jocelyn, I believe we addressed the problem of not all labels appearing in this PR: https://github.com/fleetdm/fleet/pull/1857 and also in a rework we are doing of that UX. I recommend you try 4.3.0 (which will be released this week) and let us know if the issue persists
the search still times out
do you happen to have the request that times out with more details?
our read replicas get no traffic, but our primary writer gets very hot
we added read replica support very recently (in fact, it's not released yet), can you tell me a bit more about your setup?
j
we're using an AWS RDS aurora global cluster, using db.r5.8xlarge instances
CPU for that is at 31%, and that's with all our queries currently off
but with 160k hosts enrolled in FleetDM
SelectThroughput is at ~30000 all the time
doing something like adding a label takes that CPU up to 90+ and tears up the DB for 5-10 minutes
t
gotcha. You'll probably benefit a lot from the read replicas addition in 4.3.0. Unless you use something like proxysql and automatically redirect queries to one db host or another, fleet doesn't currently (4.2.4) support read replicas. What osquery.detail_update_interval do you have set up in fleet serve?
j
detail_update_interval: 1440m
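For a rough sense of scale, the numbers in this thread can be turned into an average rate of host detail refreshes. This is only a back-of-the-envelope sketch: it assumes the 160k hosts refresh evenly across the 1440m interval, which may not match real check-in behavior, and the function name is made up for illustration.

```python
# Back-of-the-envelope estimate. Assumption: detail refreshes are spread
# evenly over the interval, which real deployments may not do.
HOSTS = 160_000                     # hosts enrolled, as stated in the thread
DETAIL_UPDATE_INTERVAL_MIN = 1440   # detail_update_interval: 1440m


def detail_updates_per_second(hosts: int, interval_minutes: int) -> float:
    """Average number of host detail refreshes per second (hypothetical helper)."""
    return hosts / (interval_minutes * 60)


rate = detail_updates_per_second(HOSTS, DETAIL_UPDATE_INTERVAL_MIN)
print(f"{rate:.2f} detail updates per second on average")  # prints 1.85
```

At roughly two refreshes per second on average, the detail update interval alone seems unlikely to explain the constant load described above, though bursts could look different.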
t
sounds good. Could you tell me a bit more about the rest of the infrastructure? i.e. how many instances, size, etc
j
we run on c5.xlarge in two regions, 40 in one and 70 in the other, and we use kinesis firehose for logging, sending to an on-prem splunk.
t
that's very helpful, thank you. How did you end up with those numbers? Are you autoscaling based on CPU usage or something like that? or were those empirical numbers you got to as you scaled the number of hosts?
j
we've been running into an issue where periodically Fleet will run into some kind of problem, and eat 100% of the instance's memory. it's hard to troubleshoot, because once it happens, you can't even log on to the host. we added the AWS cloudwatch agent and started alarming on mem util, and found it stopped happening if we scaled out enough. I was able to capture a stack trace from a host I happened to already be on, but it's 30k lines long.
Aug  3 18:34:34 ip-10-12-24-164 fleet: fatal error: runtime: out of memory
normal mem util is about 16%
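Since the instance becomes unreachable once this happens, one option is to scan collected logs for the crash line afterwards. The following is a minimal sketch that assumes the syslog-style format of the line quoted above; the pattern, the function name, and the second sample line are made up for illustration, and other log formats would need a different pattern.

```python
import re

# Matches the syslog-style line quoted in the thread. Assumption: every
# crash is logged in this same format by a process tagged "fleet".
OOM_PATTERN = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) fleet: fatal error: runtime: out of memory"
)


def find_oom_events(lines):
    """Return (timestamp, host) for each line reporting a Go runtime OOM crash."""
    events = []
    for line in lines:
        match = OOM_PATTERN.match(line)
        if match:
            events.append((match.group("timestamp"), match.group("host")))
    return events


sample = [
    "Aug  3 18:34:34 ip-10-12-24-164 fleet: fatal error: runtime: out of memory",
    "Aug  3 18:34:35 ip-10-12-24-164 fleet: some other message",  # invented line
]
print(find_oom_events(sample))
```

Counting these events per host and per hour could help correlate the crashes with specific activity, such as label changes.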