I noticed that I am unable to open couple of hosts...
# fleet
d
I noticed that I am unable to open a couple of hosts within fleetdm. It shows “Unable to load host. Please try again”, even though the hosts show as online. Live queries are also not working on them. I had a look at the reverse proxy logs and found a bazillion HTTP 500 responses for POST to /api/v1/osquery/distributed/write. But some other hosts are working fine, live queries included, and they are located on the same network as the ones which are not working, so I don't suspect any networking/firewalling/proxy issue.
z
Hmm, is there any more information about the 500 errors in the Fleet server logs? Which version of Fleet are you running? Can you run osqueryd manually on one of the affected hosts with
--verbose --tls_dump
and see if there's any information about what the actual error is?
d
Anything in particular I should look for?
The only thing I notice so far is
"error": "internal error: load host additional: load pack stats: timestamp: 2021-12-29T21:16:45+01:00: sql: Scan error on column index 10, name \"last_executed\": unsupported Scan, storing driver.Value type \u003cnil\u003e into type *time.Time"
z
Which Fleet version is this? IIRC we fixed this issue or something similar recently.
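The scan error quoted above says the database returned NULL for `last_executed` while the destination was a non-nullable Go `time.Time`. As a rough analogy only (this is not Fleet's code; the `pack_stats` table layout and the `parse_last_executed` helper are made up for illustration), the same class of problem and a nullable-aware fix can be sketched in Python:

```python
import sqlite3
from datetime import datetime
from typing import Optional


def parse_last_executed(raw: Optional[str]) -> Optional[datetime]:
    """Hypothetical helper: treat NULL as 'never executed' instead of failing."""
    return None if raw is None else datetime.fromisoformat(raw)


# In-memory table standing in for the real schema (assumed layout).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pack_stats (name TEXT, last_executed TEXT)")
conn.executemany(
    "INSERT INTO pack_stats VALUES (?, ?)",
    [("ran_once", "2021-12-29T21:16:45+01:00"), ("never_ran", None)],
)

rows = conn.execute(
    "SELECT name, last_executed FROM pack_stats ORDER BY name"
).fetchall()

# A parser that assumes a value is always present fails on the NULL row.
try:
    datetime.fromisoformat(rows[0][1])  # rows[0] is "never_ran" (NULL)
    naive_failed = False
except TypeError:
    naive_failed = True

# The nullable-aware version handles both rows.
parsed = {name: parse_last_executed(raw) for name, raw in rows}
```

A host with a query that has never run hits the NULL case, which would fit the observation that only some hosts failed to load.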
d
4.6.2 Will upgrade to 4.7.0 now
z
Please lmk if that resolves it. I think we may have fixed this issue in 4.7.0.
d
Working! The host is opening up again in the fleetdm GUI and also executing live queries… But it's painfully slow. “get openssl versions” is taking like 1-2 minutes.
The exact query is
SELECT name AS name, version AS version, 'deb_packages' AS source FROM deb_packages WHERE name LIKE 'openssl%' UNION SELECT name AS name, version AS version, 'apt_sources' AS source FROM apt_sources WHERE name LIKE 'openssl%' UNION SELECT name AS name, version AS version, 'rpm_packages' AS source FROM rpm_packages WHERE name LIKE 'openssl%';
Just tried via osqueryi on the local machine.. the result is instant
Even with distributed_interval=1 it's slow af
z
Also instant with
sudo osqueryi
on local machine?
Are other hosts also slow? Or just the previously affected ones?
d
yes. same query on target machine is instant via osqueryi
Nope, presumably all of them.
z
How's the CPU doing on your Fleet servers/Redis/MySQL?
d
Seems bored
The fleetdm web interface is really fast
Querying Linux, Windows and Mac hosts (my MacBook) is painfully slow, I just tested all 3. The Mac seems a bit faster, but not by much: about 75 seconds for
SELECT * FROM system_info;
t
hi there! is that CPU graph for the fleet instance? could you tell me a bit more about your setup? eg: what redis and mysql are you using, how is CPU/memory in there
d
Sorry for the late reply. It's a VM on a big virtualization cluster, flash-only SAN, and so on. Redis and MySQL are local. Everything is blazing fast except executing queries on hosts, even on hosts in the same subnet as the Fleet server. I just checked again after my last message (did not use Fleet over Christmas/New Year's): same problem, but a bit different. If I execute a query (“get openssl versions”) for the first time, it takes about 10 seconds. Slow… but could be okay? If I then click on “run again” right afterwards, it takes over 100 seconds
and yes, this was the CPU graph of the fleet vm
z
Did we already look at the configured
distributed_interval
for the hosts? You can see it on the host detail page.
d
Distributed interval: 1 min
z
With that distributed interval, up to 60s would be totally normal (depending on where in the interval the host was when the live query started)
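To put rough numbers on that: a host only picks up a live query at its next distributed check-in, so the wait depends on where in the interval the host is when the query starts. A minimal sketch, assuming check-ins happen exactly every `interval_s` seconds and ignoring network and query execution time (the function names are made up for illustration):

```python
def wait_until_next_checkin(interval_s: float, since_last_checkin_s: float) -> float:
    """Seconds until the host next asks the server for distributed queries."""
    if not 0 <= since_last_checkin_s < interval_s:
        raise ValueError("since_last_checkin_s must be in [0, interval_s)")
    return interval_s - since_last_checkin_s


def average_wait(interval_s: float) -> float:
    """Mean wait if the live query starts at a uniformly random point in the interval."""
    return interval_s / 2


# Query issued right after a check-in: close to the full interval.
worst_60 = wait_until_next_checkin(60, 0)
# Query issued 55 s after the last check-in: only 5 s to wait.
lucky_60 = wait_until_next_checkin(60, 55)

avg_60 = average_wait(60)  # 1 min interval
avg_10 = average_wait(10)  # 10 s interval
```

Under these assumptions a 60 s interval means anything from almost no wait to about a minute, before the query itself has even run.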
d
But I have “--distributed_interval=10” in my osquery.flags
z
Maybe you have it set differently in your "agent options" in Fleet?
d
You’re right. wow… Just remove that line and apply? Or do I have to restart the remote osquery agents?
Just removed the global agent config.. still 1 min
Okay, removing the line, restarting the remote osqueryd and refetching.. now it's showing 10 sec
Thank you!
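What happened here can be modelled as a simple override: the host used the `distributed_interval` from the Fleet agent options for as long as one was set there, and fell back to the value in osquery.flags once that line was removed and osqueryd restarted. A minimal sketch of that precedence, assuming server-delivered options win over flagfile values as observed in this thread (the `effective_options` helper is hypothetical, not an osquery or Fleet API):

```python
from typing import Optional


def effective_options(flagfile: dict, agent_options: Optional[dict]) -> dict:
    """Merge options; values from agent options override the local flagfile."""
    merged = dict(flagfile)
    merged.update(agent_options or {})
    return merged


# Local osquery.flags from the thread: --distributed_interval=10
flagfile = {"distributed_interval": 10}

# Agent options still carrying a 60 s interval: the host reports 1 min.
before = effective_options(flagfile, {"distributed_interval": 60})

# Line removed from agent options: the flagfile value applies again.
after = effective_options(flagfile, {})
```

Note that in the thread the new value only showed up after osqueryd was restarted and the host was refetched, so the change is not necessarily picked up immediately.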
z
Sounds like things are working as expected now? Or are you still seeing unexpectedly long delays?