# fleet
b
is there a parameter I can pass to configs to address this issue:
```
save enroll failed: host identified by 1234123-1234-1234-1234-C3C04F373533 enrolling too often
```
Also seeing:
```
authentication error: missing node key
```
and
```
enroll failed: no matching secret found
```
and finally
```
failed to mark host seen: marking host seen: Error 1205: Lock wait timeout exceeded; try restarting transaction
```
These errors make up less than 0.8% of total traffic from osquery to our ELK stack.
z
Which version of Fleet are you on?
This usually means you have multiple hosts with the same UUIDs. The issue can be addressed by setting `--host_identifier=instance` in your osquery flagfile, or in Fleet 3.9.0 you can configure it within Fleet itself: https://github.com/fleetdm/fleet/blob/master/docs/2-Deployment/2-Configuration.md#osquery_host_identifier
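A minimal sketch of both options follows; the flagfile path and the exact Fleet config key layout are assumptions based on the linked docs, so verify them against your deployment. On the osquery side:
```
# /etc/osquery/osquery.flags — path is an assumption for a default install
--host_identifier=instance
```
And on the Fleet server side (3.9.0+):
```
# fleet.yml sketch, assuming the osquery.host_identifier key described in
# the linked configuration docs
osquery:
  host_identifier: instance
```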
b
3.9.0 is the version
s
As an aside to this, I noticed that if you already have hosts showing up with duplicate IDs, changing to `host_identifier=instance` doesn’t help, because those hosts already have the duplicated ID stored in their osquery backing store and won’t regenerate a new one. Only new hosts that pick up that config change will have newly generated IDs.
@zwass this
b
Would redeploying to the hosts fix it?
z
It sounds like using the setting in Fleet would probably be your easiest option.
@Scott Lampert are you talking about setting `host_identifier=instance` from the osquery options within Fleet?
s
@zwass Both. Once osquery boots up with any sort of config that stores its UUID in the osquery backing store, it won’t change unless you either remove the backing store and restart with `instance` enabled in the flags, or use `ephemeral` in the flags. The issue on the Fleet side is that if you have a bunch of nodes trying to enroll with the same ID already, you really need to use the cooldown or the database will get thousands of lock contentions and fall over (we have 120,000+ nodes checking into Fleet). If a large portion of those nodes are stuck with a non-unique ID, they never get to enroll, since the rate of nodes trying to enroll will always trigger the cooldown. This means you can’t really count on any osquery config change made in Fleet being picked up where the UUID is concerned. This might not be an issue until a certain scale.
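The cooldown referenced here is the server-side rate limit behind the "enrolling too often" error in the first message. A sketch of enabling it, assuming an `enroll_cooldown` key under Fleet's `osquery` config section (the key name and placement are assumptions; verify against your version's configuration docs):
```
# fleet.yml sketch — enroll_cooldown throttles repeated enrollments
# from the same host identifier; key name is an assumption
osquery:
  enroll_cooldown: 1m
```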
z
@Scott Lampert is it possible that what you are seeing is that an already-enrolled osquery database was copied over to multiple hosts? Otherwise that sounds like a bug in osquery, as `instance_identifier` should be generated separately for any installation, regardless of the existence/value of the UUID.
s
The symptom we saw is that osquery was misconfigured locally to not have any `host_identifier` setting on a few thousand hosts exhibiting the above behavior. We found that even after ssh’ing into a host and re-running with `--host_identifier=instance`, Fleet would still see the original duplicate hardware UUID regardless of that setting. If we set it to `ephemeral` it would work correctly. If we deleted the backing store and restarted with `instance`, it would also show up correctly. Just changing it to `instance` would not generate a new UUID in the osquery info once it already had one.
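For anyone hitting the same thing, a sketch of that remediation on a single host; the database path and service name are assumptions for a default Linux install, so adjust for your environment:
```
# stop osqueryd, wipe the RocksDB backing store, and restart with
# --host_identifier=instance in the flagfile so a fresh instance ID
# is generated on startup (path and service name are assumptions)
sudo systemctl stop osqueryd
sudo rm -rf /var/osquery/osquery.db
sudo systemctl start osqueryd
```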
z
`instance_id` is the column Fleet would use if you configure https://github.com/fleetdm/fleet/blob/master/docs/2-Deployment/2-Configuration.md#osquery_host_identifier. That should be unique per osquery database, and if it's not, that's a bug (please file an issue).
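You can check both identifiers on a host directly, since osquery exposes them in the standard `osquery_info` table:
```
-- run in osqueryi or as a scheduled query; instance_id is generated per
-- osquery database, while uuid comes from the hardware
SELECT instance_id, uuid FROM osquery_info;
```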
s
It is, but only if you initially used `instance`. `instance` stores the ID in the backing store once it’s generated. Otherwise you would want `ephemeral`. This is by design in osquery: "instance uses an instance-unique UUID generated at process start, persisted in the backing store." So once it has an ID in the backing store, changing it to `instance` will not generate a new UUID. You would have to use `instance` before osquery creates its DB for the first time.