Daniel Weeber

12/29/2021, 8:06 PM
I noticed that I am unable to open a couple of hosts within fleetdm. They show “Unable to load host. Please try again”, yet they show as online! Live queries are also not working on them. I had a look at the reverse proxy logs and found a bazillion HTTP 500s for POST to /api/v1/osquery/distributed/write. But some other hosts are working fine, live queries included, and they are on the same network as the ones that are not working, so I don't suspect a networking/firewall/proxy issue.

zwass

12/29/2021, 8:13 PM
Hmm, is there any more information about the 500 errors in the Fleet server logs? Which version of Fleet are you running? Can you run osqueryd manually on one of the affected hosts with
--verbose --tls_dump
and see if there's any information about what the actual error is?

Daniel Weeber

12/29/2021, 8:16 PM
Anything in particular I should look for?
The only thing I notice so far is
"error": "internal error: load host additional: load pack stats: timestamp: 2021-12-29T21:16:45+01:00: sql: Scan error on column index 10, name \"last_executed\": unsupported Scan, storing driver.Value type \u003cnil\u003e into type *time.Time"
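Note on the error above: it says a NULL `last_executed` value was read into a type that cannot hold "no value". The same pattern can be sketched in Python as an analogy only. This is not Fleet's code (Fleet is written in Go), and the `pack_stats` table layout and function names here are hypothetical; the point is just that a nullable timestamp column needs an explicit "never executed" case.

```python
import sqlite3
from datetime import datetime
from typing import List, Optional, Tuple


def parse_last_executed(value: Optional[str]) -> Optional[datetime]:
    """Map SQL NULL to None instead of trying to parse it as a timestamp."""
    if value is None:
        return None
    return datetime.fromisoformat(value)


def load_pack_stats(conn: sqlite3.Connection) -> List[Tuple[str, Optional[datetime]]]:
    # Hypothetical table; last_executed is NULL for queries that never ran.
    rows = conn.execute(
        "SELECT name, last_executed FROM pack_stats ORDER BY name"
    ).fetchall()
    return [(name, parse_last_executed(ts)) for name, ts in rows]


def demo() -> List[Tuple[str, Optional[datetime]]]:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE pack_stats (name TEXT, last_executed TEXT)")
    conn.executemany(
        "INSERT INTO pack_stats VALUES (?, ?)",
        [
            ("never_ran", None),  # stored as NULL
            ("ran_once", "2021-12-29T21:16:45+01:00"),
        ],
    )
    return load_pack_stats(conn)
```

Without the `None` check, `datetime.fromisoformat(None)` raises a `TypeError`, which is roughly the Python counterpart of the Scan error in the log.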

zwass

12/29/2021, 8:23 PM
Which Fleet version is this? IIRC we fixed this issue or something similar recently.
d

Daniel Weeber

12/29/2021, 8:28 PM
4.6.2. Will upgrade to 4.7.0 now.

zwass

12/29/2021, 8:30 PM
Please lmk if that resolves it. I think we may have fixed this issue in 4.7.0.

Daniel Weeber

12/29/2021, 8:34 PM
Working! The host is opening up again in the fleetdm GUI and is also executing live queries… But it's painfully slow. “get openssl versions” is taking about 1-2 minutes.
The exact query is
SELECT name AS name, version AS version, 'deb_packages' AS source FROM deb_packages WHERE name LIKE 'openssl%' UNION SELECT name AS name, version AS version, 'apt_sources' AS source FROM apt_sources WHERE name LIKE 'openssl%' UNION SELECT name AS name, version AS version, 'rpm_packages' AS source FROM rpm_packages WHERE name LIKE 'openssl%';
Just tried it via osqueryi on my local machine: the result is instant.
Even with distributed_interval=1 it's slow af

zwass

12/29/2021, 8:45 PM
Also instant with
sudo osqueryi
on local machine?
Are other hosts also slow? Or just the previously affected ones?

Daniel Weeber

12/29/2021, 8:45 PM
Yes, the same query on the target machine is instant via osqueryi.
Nope, presumably all of them.

zwass

12/29/2021, 8:47 PM
How's the CPU doing on your Fleet servers/Redis/MySQL?

Daniel Weeber

12/29/2021, 8:48 PM
Seems bored
The fleetdm web interface is very fast
Queries on Linux, Windows, and macOS (my MacBook) are painfully slow; I just tested all three. The Mac seems a bit faster, but not by much: about 75 seconds for
SELECT * FROM system_info;

Tomas Touceda

01/03/2022, 11:38 AM
Hi there! Is that CPU graph for the Fleet instance? Could you tell me a bit more about your setup? E.g., which Redis and MySQL are you using, and how are CPU and memory doing there?

Daniel Weeber

01/07/2022, 5:54 PM
Sorry for the late reply. It's a VM on a big virtualization cluster, flash-only SAN, and so on. Redis and MySQL are local. Everything is blazing fast except executing queries on hosts, even on hosts in the same subnet as the Fleet server. I just checked again after my last message (I did not use Fleet over Christmas/New Year's): same problem, but a bit different. If I execute a query (“get openssl versions”) for the first time, it takes about 10 seconds. Slow, but could be okay? If I then click “run again” right afterwards, it takes over 100 seconds.
And yes, this was the CPU graph of the Fleet VM.

zwass

01/07/2022, 5:57 PM
Did we already look at the configured
distributed_interval
for the hosts? You can see it on the host detail page.

Daniel Weeber

01/07/2022, 5:58 PM
Distributed interval: 1 min

zwass

01/07/2022, 5:59 PM
With that distributed interval, up to 60s would be totally normal (depending on where in the interval the host was when the live query started)
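To make the arithmetic behind that concrete, here is a minimal sketch. It assumes the host polls for live queries exactly once per `distributed_interval` and ignores query execution and network time; the function names are made up for illustration and are not part of osquery or Fleet.

```python
def checkin_wait(interval_s: float, elapsed_s: float) -> float:
    """Seconds until the host's next check-in.

    interval_s: the configured distributed interval, in seconds.
    elapsed_s:  time since the host last checked in when the live query starts.
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    return interval_s - (elapsed_s % interval_s)


def average_wait(interval_s: float) -> float:
    """Mean wait if the live query starts at a uniformly random point in the interval."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    return interval_s / 2
```

With a 60 s interval, a query started right after a check-in waits the full 60 s, one started 59 s after a check-in waits 1 s, and the average is 30 s. With a 10 s interval the worst case drops to 10 s.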

Daniel Weeber

01/07/2022, 5:59 PM
But I have “--distributed_interval=10” in my osquery.flags

zwass

01/07/2022, 5:59 PM
Maybe you have it set differently in your "agent options" in Fleet?

Daniel Weeber

01/07/2022, 6:01 PM
You’re right. Wow… Do I just remove that line and apply? Or do I have to restart the remote osquery agents?
Just removed it from the global agent config. Still 1 min.
Okay, after removing the line, restarting the remote osqueryd, and refetching, it's now showing 10 sec.
Thank you!
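What happened here can be sketched as a simple precedence rule: an option delivered through the Fleet agent options took priority over the same option in the local flag file. The snippet below is only an illustration of that observed behavior, not osquery's actual config loader; the function names are hypothetical, and the value 60 is an assumption based on the "1 min" shown on the host page.

```python
from typing import Dict, Mapping


def parse_flagfile(text: str) -> Dict[str, str]:
    """Parse lines such as '--distributed_interval=10' into a dict of strings."""
    flags: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and anything that is not a --flag line.
        if not line or line.startswith("#") or not line.startswith("--"):
            continue
        key, _, value = line[2:].partition("=")
        flags[key] = value if value else "true"  # bare flags act as booleans
    return flags


def effective_options(
    flagfile_flags: Mapping[str, str], agent_options: Mapping[str, object]
) -> Dict[str, str]:
    """Server-delivered agent options win over flag file values on conflict."""
    merged = dict(flagfile_flags)
    merged.update({key: str(value) for key, value in agent_options.items()})
    return merged
```

With `--distributed_interval=10` in the flag file and `distributed_interval: 60` in the agent options, the effective value is 60. Once the option is removed from the agent options, the flag file's 10 applies again, which in this thread only showed up after osqueryd was restarted.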

zwass

01/07/2022, 11:32 PM
Sounds like things are working as expected now? Or are you still seeing unexpectedly long delays?