is there any way to get all labels to show up under pack targets? (osquery #fleet)

Jocelyn Bothe

09/01/2021, 7:46 PM

is there any way to get all labels to show up under pack targets? the search times out, and I can't select my custom label as a pack target

Sarah Gillespie

09/02/2021, 10:56 PM

Hi Jocelyn, are you still experiencing this issue? If so, could you share what you are seeing in the browser's network tab, or any additional info that might help with debugging?

Jocelyn Bothe

09/07/2021, 3:56 PM

the search still times out, but I figured out a workaround: adding all the labels as targets reveals labels further down the list, and then I go back and remove the labels I don't want to target. It would be nice if it displayed more than 6 labels at a time

Jocelyn Bothe

09/07/2021, 3:56 PM

I'm assuming the timeout is from trying to regex match with 160k hosts

Jocelyn Bothe

09/07/2021, 3:57 PM

there aren't any customizable timeout settings for database searches, which would also be nice to have for large deployments like ours

Jocelyn Bothe

09/07/2021, 3:57 PM

also a way to shard across multiple database hosts

Jocelyn Bothe

09/07/2021, 3:58 PM

our read replicas get no traffic, but our primary writer gets very hot

Tomas Touceda

09/07/2021, 5:59 PM

hi Jocelyn, I believe we addressed the problem of not all labels appearing in this PR: https://github.com/fleetdm/fleet/pull/1857 and in a rework of that UX that we are doing. I recommend you try 4.3.0 (which will be released this week) and let us know if the issue persists

👍 1

Tomas Touceda

09/07/2021, 6:02 PM

> the search still times out

do you happen to have the request that times out, with more details?

> our read replicas get no traffic, but our primary writer gets very hot

we added read replica support very recently (in fact, it's not released yet). Can you tell me a bit more about your setup?

Jocelyn Bothe

09/07/2021, 7:05 PM

we're using an AWS RDS aurora global cluster, using db.r5.8xlarge instances

Jocelyn Bothe

09/07/2021, 7:05 PM

CPU for that is at 31%, and that's with all our queries currently off

Jocelyn Bothe

09/07/2021, 7:05 PM

but with 160k hosts enrolled in FleetDM

Jocelyn Bothe

09/07/2021, 7:06 PM

SelectThroughput is at ~30000 all the time

Jocelyn Bothe

09/07/2021, 7:07 PM

doing something like adding a label takes that CPU up to 90+ and tears up the DB for 5-10 minutes

Tomas Touceda

09/07/2021, 7:17 PM

gotcha. You'll probably benefit a lot from the read replicas addition in 4.3.0. Unless you use something like proxysql and automatically redirect queries to one db host or another, fleet doesn't currently (4.2.4) support read replicas. What osquery.detail_update_interval do you have set up in fleet serve?

Jocelyn Bothe

09/07/2021, 7:19 PM

detail_update_interval: 1440m
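
For a sense of scale, here is a rough back-of-the-envelope sketch of what a 1440m detail interval means at 160k hosts. It assumes host check-ins are spread evenly across the interval, which is only an approximation, and the function name is made up for illustration.

```python
def detail_updates_per_second(hosts: int, interval_minutes: int) -> float:
    """Average detail refreshes per second, assuming hosts are spread evenly
    across the interval (an assumption; real traffic may be burstier)."""
    return hosts / (interval_minutes * 60)


# 160k hosts and a 1440m interval, both taken from the thread above.
rate = detail_updates_per_second(160_000, 1440)
print(f"{rate:.2f} detail updates/s")
```

Under that assumption the detail refreshes alone average only a couple of updates per second.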

Tomas Touceda

09/07/2021, 8:28 PM

sounds good. Could you tell me a bit more about the rest of the infrastructure? i.e. how many instances, size, etc

Jocelyn Bothe

09/08/2021, 4:58 PM

we run on c5.xlarge in two regions, 40 in one and 70 in the other, and we use kinesis firehose for logging, sending to an on-prem splunk.
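
As a quick sanity check on those numbers, the sketch below works out the average host count per Fleet instance. It assumes load is balanced evenly across both regions, which the thread does not confirm, and the region names are placeholders.

```python
# Instance counts from the message above; region names are placeholders.
instances = {"region_a": 40, "region_b": 70}
hosts = 160_000  # enrolled hosts, as stated earlier in the thread

total_instances = sum(instances.values())
hosts_per_instance = hosts / total_instances  # assumes perfectly even balancing

print(total_instances, round(hosts_per_instance))
```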

Tomas Touceda

09/08/2021, 9:10 PM

that's very helpful, thank you. How did you end up with those numbers? Are you autoscaling based on CPU usage or something like that? Or were those empirical numbers you arrived at as you scaled up the number of hosts?

Jocelyn Bothe

09/09/2021, 2:21 PM

we've been running into an issue where periodically Fleet will run into some kind of problem, and eat 100% of the instance's memory. it's hard to troubleshoot, because once it happens, you can't even log on to the host. we added the AWS cloudwatch agent and started alarming on mem util, and found it stopped happening if we scaled out enough. I was able to capture a stack trace from a host I happened to already be on, but it's 30k lines long.

Aug  3 18:34:34 ip-10-12-24-164 fleet: fatal error: runtime: out of memory

Jocelyn Bothe

09/09/2021, 2:21 PM

stacktrace

Jocelyn Bothe

09/09/2021, 2:26 PM

normal mem util is about 16%
