#fleet

Ahmed

06/21/2021, 6:02 PM
I rolled out osquery to some immutable hosts in GCP, but I'm only getting 1 machine in Fleet. I ran osqueryd with these options:
/usr/bin/osqueryd --flagfile /etc/osquery/osquery.flags --verbose --tls_dump
I noticed that they have the same node_key and UUID. What would be the solution for this?
6:41 PM
This is my flags file:
--enroll_secret_path=/etc/osquery/osquery_enroll_secret
--tls_server_certs=/etc/osquery/osquery_cert.pem
--tls_hostname=fleet.example.com:443
--host_identifier=hostname
--enroll_tls_endpoint=/api/v1/osquery/enroll
--config_plugin=tls
--config_tls_endpoint=/api/v1/osquery/config
--config_tls_refresh=360
--config_tls_max_attempts=360
--disable_distributed=false
--disable_logging=false
--distributed_plugin=tls
--distributed_interval=60
--distributed_tls_max_attempts=3
--distributed_tls_read_endpoint=/api/v1/kolide/distributed/read
--distributed_tls_write_endpoint=/api/v1/kolide/distributed/write
--logger_plugin=filesystem
--logger_path=/var/log/osquery/logs
--database_path=/var/log/osquery/db/osquery.db
--schedule_splay_percent=10
--pack_refresh_interval=360
--watchdog_level=0
--config_refresh=360
--utc
--force=true
I noticed there are flags like --tls_client_cert and --tls_client_key that could be used (link), but I haven't used those before. Hopefully you have some suggestions. Also, would a TLS client cert be useful here, and how would I generate a cert/key that would be accepted by Fleet/osquery?

Avi Norowitz

06/21/2021, 6:56 PM
The same node key and UUID indicate to me that the RocksDB database in /var/osquery/osquery.db has been duplicated between the GCP instances. Is osquery part of your GCP instance image? On already-running instances, you could do a one-time erase of the RocksDB with:
sudo service osqueryd stop; sudo osqueryctl clean; sudo service osqueryd start
Then, when recreating your GCP image, erase the RocksDB before creating the image:
sudo service osqueryd stop; sudo osqueryctl clean

Ahmed

06/21/2021, 7:18 PM
After running that command, I see I'm getting a new UUID, but I'm still getting the same node_key from Fleet for both machines.
How can we get a new node_key?
And for image creation, would it be sufficient to delete the DB file, or to make sure it doesn't exist before starting? In my current setup, I had a check in Puppet to stop the service during the build process:
$service_ensure = $::built_by_packer ? {
    true    => 'stopped',
    default => $osquery::agent::service,
  }
I guess that didn't prevent the DB from being created.

Avi Norowitz

06/21/2021, 7:31 PM
I don't know why this would lead to a duplicate node key. You could try deleting the host from the Fleet UI and see if that helps. If osqueryd starts even once, it will probably generate a RocksDB. So I'd suggest doing a service stop and removal of the RocksDB before creating your image. You could do this in Puppet.
7:32 PM
Also, I see your RocksDB path is not standard:
--database_path=/var/log/osquery/db/osquery.db
So I'm not sure if that osqueryctl clean command would work.
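Since osqueryctl clean only knows about the default path, a minimal sketch of a manual equivalent for a custom database_path might look like this (the default below is the custom path from the flags file earlier in this thread, so treat it as an assumption for this environment):

```shell
#!/bin/sh
# Manual equivalent of "osqueryctl clean" for a non-default database_path.
# The default is the custom path from the flags file above (an assumption
# for this environment); pass a different path as $1 to override.
clean_osquery_db() {
  db_path="${1:-/var/log/osquery/db/osquery.db}"
  rm -rf "$db_path"
}
```

Run it with the service stopped (sudo service osqueryd stop, clean, then start again) so osqueryd re-enrolls and receives a fresh node_key on restart.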

Ahmed

06/21/2021, 7:32 PM
I found that deleting the entire DB fixed the issue, and now it's appearing in Fleet.
7:33 PM
I would just need to figure out now how to avoid creating that DB during the build process.
7:34 PM
The clean works: it gets you a new UUID, but not a new node_key, which comes in the response from Fleet (I saw that when I did a TLS dump). But after deleting the DB completely and restarting osqueryd, it does get a new node_key.
zwass

06/22/2021, 4:49 PM
Awesome, thank you for helping out @Avi Norowitz!

Avi Norowitz

06/22/2021, 4:49 PM
🙂
zwass

06/22/2021, 4:50 PM
Best practice is to delete the osquery DB before cloning images. Otherwise multiple instances will share the same node key and Fleet sees them all as the "same" instance.

Ahmed

06/22/2021, 6:45 PM
Why doesn't osqueryctl clean remove the node key as well? I have been trying to delete the DB during the build process when Packer runs, but I'm still getting the same result. If you have tried that before and could point me to something similar, that would be great.

Avi Norowitz

06/22/2021, 8:05 PM
osqueryctl clean removes /var/osquery/osquery.db. But in your config, you have:
--database_path=/var/log/osquery/db/osquery.db
Is there any particular reason you need to use the non-default path for this?

Ahmed

06/23/2021, 11:53 AM
It was a convention in the environment, but I don't think there is a reason not to change it. But as you mentioned, that only works for live instances; I'm trying to figure out something that prevents this issue from happening in the first place. I'm trying to get Puppet to delete this file during the build process, but it seems there are times when the image is actually running before it's used to create an instance.
11:58 AM
Shouldn't this be considered a bug, since osquery is not deleting the non-standard location? I tested it and found it deletes the location you mentioned, but not the location I had in my configs.

Avi Norowitz

06/23/2021, 1:33 PM
Maybe it would be considered a bug in osquery. You could ask in #general, or file a bug report here: https://github.com/osquery/osquery/issues. (Just a disclaimer: I'm not affiliated with osquery or Fleet; I'm just a user of both.)

Ahmed

06/23/2021, 1:38 PM
I know. Thanks a lot for the help, Avi!
zwass

06/23/2021, 3:23 PM
@Ahmed I'd recommend shutting down osquery and deleting the database (rm -rf on that directory should work fine) before creating the image. Make sure that osquery is configured to start on boot.
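As a sketch of that recommendation, the final image-build (e.g. Packer provisioner) step could be a script along these lines. The service name, the use of systemd, and the DB path are assumptions drawn from this thread, not a verified recipe:

```shell
#!/bin/sh
# Final provisioning step before snapshotting the image.
set -e

# Custom DB path from the flags file in this thread; override via env.
OSQUERY_DB="${OSQUERY_DB:-/var/log/osquery/db/osquery.db}"

# Stop osqueryd if a service manager is available (ignored in minimal
# build environments where systemctl is absent or the unit isn't loaded).
if command -v systemctl >/dev/null 2>&1; then
  systemctl stop osqueryd 2>/dev/null || true
fi

# Remove the RocksDB so every instance cloned from this image enrolls
# with its own fresh node_key.
rm -rf "$OSQUERY_DB"

# Make sure osqueryd starts on boot in instances built from this image.
if command -v systemctl >/dev/null 2>&1; then
  systemctl enable osqueryd 2>/dev/null || true
fi
```

The key design point is ordering: osqueryd must never run again between the rm -rf and the snapshot, or it will recreate the RocksDB and re-enroll, putting a node_key back into the image.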

Ahmed

06/29/2021, 2:24 PM
Thanks Zack. I overcame this by stopping the service inside the build process, which meant the DB files were never created. As a second step, I ran the clean command on the default DB directory, which cleans the DB if it exists. That fixed the problem. Thanks a lot.