#fleet

Daniel Weeber

12/29/2021, 8:06 PM
I noticed that I am unable to open a couple of hosts in Fleet. The UI shows “Unable to load host. Please try again”, even though the hosts show as online. Live queries are not working for them either. I had a look at the reverse proxy logs and found a bazillion HTTP 500 responses for POST requests to /api/v1/osquery/distributed/write. Some other hosts are working fine, live queries included, and they are on the same network as the ones that are not working, so I don't suspect a networking, firewall, or proxy issue.
zwass

12/29/2021, 8:13 PM
Hmm, is there any more information about the 500 errors in the Fleet server logs? Which version of Fleet are you running? Can you run osqueryd manually on one of the affected hosts with
--verbose --tls_dump
and see if there's any information about what the actual error is?

Daniel Weeber

12/29/2021, 8:16 PM
Anything in particular I should look for?
8:17 PM
Only thing I notice so far is
"error": "internal error: load host additional: load pack stats: timestamp: 2021-12-29T21:16:45+01:00: sql: Scan error on column index 10, name \"last_executed\": unsupported Scan, storing driver.Value type \u003cnil\u003e into type *time.Time"
zwass

12/29/2021, 8:23 PM
Which Fleet version is this? IIRC we fixed this issue or something similar recently.

Daniel Weeber

12/29/2021, 8:28 PM
4.6.2. Will upgrade to 4.7.0 now.
zwass

12/29/2021, 8:30 PM
Please lmk if that resolves it. I think we may have fixed this issue in 4.7.0.

Daniel Weeber

12/29/2021, 8:34 PM
Working! The host opens again in the Fleet GUI and also executes live queries… But it's painfully slow. “get openssl versions” takes about 1-2 minutes.
8:34 PM
Exact query is
SELECT name AS name, version AS version, 'deb_packages' AS source FROM deb_packages WHERE name LIKE 'openssl%' UNION SELECT name AS name, version AS version, 'apt_sources' AS source FROM apt_sources WHERE name LIKE 'openssl%' UNION SELECT name AS name, version AS version, 'rpm_packages' AS source FROM rpm_packages WHERE name LIKE 'openssl%';
8:39 PM
Just tried it via osqueryi on my local machine. The result is instant.
8:44 PM
Even with distributed_interval=1 it's slow af
zwass

12/29/2021, 8:45 PM
Also instant with
sudo osqueryi
on local machine?
8:45 PM
Are other hosts also slow? Or just the previously affected ones?

Daniel Weeber

12/29/2021, 8:45 PM
Yes. The same query on the target machine is instant via osqueryi.
8:46 PM
Nope, presumably all of them.
zwass

12/29/2021, 8:47 PM
How's the CPU doing on your Fleet servers/Redis/MySQL?

Daniel Weeber

12/29/2021, 8:48 PM
Seems bored
8:49 PM
The Fleet web interface is very fast
8:51 PM
Querying Linux, Windows, and macOS hosts (my MacBook) is painfully slow; I just tested all three. The Mac seems a bit faster, but not by much: about 75 seconds for
SELECT * FROM system_info;
Tomas Touceda

01/03/2022, 11:38 AM
Hi there! Is that CPU graph for the Fleet instance? Could you tell me a bit more about your setup? E.g., which Redis and MySQL are you using, and how are CPU and memory doing there?

Daniel Weeber

01/07/2022, 5:54 PM
Sorry for the late reply. It's a VM on a big virtualization cluster with an all-flash SAN, and so on. Redis and MySQL are local. Everything is blazing fast except executing queries on hosts, even on hosts in the same subnet as the Fleet server. I just checked again (I did not use Fleet over Christmas/New Year's, after my last message): same problem, but a bit different. If I execute a query (“get openssl versions”) for the first time, it takes about 10 seconds. Slow, but could be okay? If I then click “run again” right afterwards, it takes over 100 seconds.
5:56 PM
And yes, this was the CPU graph of the Fleet VM.
zwass

01/07/2022, 5:57 PM
Did we already look at the configured
distributed_interval
for the hosts? You can see it on the host detail page.

Daniel Weeber

01/07/2022, 5:58 PM
Distributed interval: 1 min
zwass

01/07/2022, 5:59 PM
With that distributed interval, up to 60s would be totally normal (depending on where in the interval the host was when the live query started)

Daniel Weeber

01/07/2022, 5:59 PM
But I have “--distributed_interval=10” in my osquery.flags
zwass

01/07/2022, 5:59 PM
Maybe you have it set differently in your "agent options" in Fleet?

Daniel Weeber

01/07/2022, 6:01 PM
You’re right. Wow… Do I just remove that line and apply? Or do I have to restart the remote osquery agents?
6:06 PM
Just removed the global agent config. Still 1 min.
6:13 PM
Okay, removing the line, restarting the remote osqueryd, and refetching did it. Now it's showing 10 sec.
6:13 PM
Thank you!
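The behavior observed in this thread can be summarized in a small sketch: when `distributed_interval` is set in Fleet's agent options, that value wins over the one in osquery.flags, and the flagfile value only applies once the agent option is removed and osqueryd is restarted. The function `effectiveInterval` is hypothetical and only models what was seen here; it is not osquery's or Fleet's implementation.

```go
package main

import "fmt"

// effectiveInterval models which distributed_interval (in seconds) a host ends up using.
// flagfileValue is the value from osquery.flags; agentOption is the value pushed
// via Fleet's agent options, or nil when the option is not set there.
func effectiveInterval(flagfileValue int, agentOption *int) int {
	if agentOption != nil {
		return *agentOption
	}
	return flagfileValue
}

func main() {
	fleetValue := 60

	// Agent options set to 60 while the flagfile says 10: the host reports 1 min.
	fmt.Println(effectiveInterval(10, &fleetValue))

	// Agent option removed: the flagfile value of 10 applies.
	fmt.Println(effectiveInterval(10, nil))
}
```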
zwass

01/07/2022, 11:32 PM
Sounds like things are working as expected now? Or are you still seeing unexpectedly long delays?