Hi guys, Could someone please help me to connect ...
# fleet
a
Hi guys, could someone please help me connect osquery agents on Windows Server 2019 to Fleet? Here are some points:
• I have the same configs and deploy scripts on both the problem servers and basic Windows 10 VMs (where everything works fine);
• The osquery agents enroll successfully: I see it in the logs and I see the agents in the UI, but they are in the "Never fetched" state and mostly "Offline";
• I checked the debug output of the osquery daemon and see only one possible issue:
W0207 18:02:24.934172 4352 watcher.cpp:391] osqueryd worker (560) stopping: Maximum sustainable CPU utilization limit exceeded: 18
right after executing
fleet_detail_query_software_windows
• Adding
--disable_watchdog=false --watchdog_delay=120 --watchdog_level=0 --watchdog_memory_limit=400 --watchdog_utilization_limit=21
did not help. And now I am out of ideas.
z
Are all of the failing hosts Windows Server?
a
Right, Microsoft Windows Server 2019 Standard 10.0 and Microsoft Windows Server 2019 Datacenter 10.0
t
hi there, one thing you can try, to check whether the software inventory query is the cause, is disabling software inventory:
---
apiVersion: v1
kind: config
spec:
  host_settings:
    enable_software_inventory: false
if you apply that yaml, fleet will stop sending the software inventory queries to the hosts
could you tell me a bit more about your fleet setup: how many labels, policies, and packs/queries you've got?
a
Hi Tomas, honestly, I tried to minimise the query packs for these hosts. As they are Domain Controllers, I expect too many ntfs or socket events. I attached the debug output of the osquery daemon. You can see all applied queries. I get the same result with the "process_all" query pack disabled. My fleet installation is about 1k hosts, with only 4 labels and no policies at all.
t
we have pending work to distribute the queries better for cases like this. I notice you're running 5.1.0; would you be able to try 4.9.0? I have seen issues with 5.x in the past (on Linux, though), and it would be good to rule out an osquery issue
a
I don't have debug output for 4.9.0, but I had the same problem with it. I hoped the upgrade would fix it 😞
t
ok, well that clears that doubt
have you tried disabling software inventory and seeing if that is the query causing the issue or if it's just a symptom of something else?
a
Oh, I disabled software inventory as you recommended and it resolved the issue! I also tried setting
enable_host_users = false
since that query returns about 1800 domain users, but it looks like the issue is only with
enable_software_inventory
t
that's great! so either the software inventory query pushed things over the edge, or it's taking too long. Would you be able to run this on one of the Windows hosts:
WITH cached_users AS (SELECT * FROM users)
SELECT
  name AS name,
  version AS version,
  'Program (Windows)' AS type,
  'programs' AS source
FROM programs
UNION
SELECT
  name AS name,
  version AS version,
  'Package (Python)' AS type,
  'python_packages' AS source
FROM python_packages
UNION
SELECT
  name AS name,
  version AS version,
  'Browser plugin (IE)' AS type,
  'ie_extensions' AS source
FROM ie_extensions
UNION
SELECT
  name AS name,
  version AS version,
  'Browser plugin (Chrome)' AS type,
  'chrome_extensions' AS source
FROM cached_users CROSS JOIN chrome_extensions USING (uid)
UNION
SELECT
  name AS name,
  version AS version,
  'Browser plugin (Firefox)' AS type,
  'firefox_addons' AS source
FROM cached_users CROSS JOIN firefox_addons USING (uid)
UNION
SELECT
  name AS name,
  version AS version,
  'Package (Chocolatey)' AS type,
  'chocolatey_packages' AS source
FROM chocolatey_packages
UNION
SELECT
  name AS name,
  version AS version,
  'Package (Atom)' AS type,
  'atom_packages' AS source
FROM cached_users CROSS JOIN atom_packages USING (uid)
UNION
SELECT
  name AS name,
  version AS version,
  'Package (Python)' AS type,
  'python_packages' AS source
FROM python_packages;
to see if that's taking too long on its own?
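To narrow things down further, each branch of the query above could also be run on its own in osqueryi. This is only a sketch: it assumes the same tables as the query above are available on the host, and it just counts rows so that a slow or unusually large branch stands out.

```sql
-- Run each statement separately on an affected host.
-- A branch that is slow, or returns far more rows than expected, is worth a closer look.
SELECT COUNT(*) AS n FROM users;
SELECT COUNT(*) AS n FROM programs;
SELECT COUNT(*) AS n FROM python_packages;
SELECT COUNT(*) AS n FROM ie_extensions;
SELECT COUNT(*) AS n FROM chocolatey_packages;
-- Per-user tables are joined against users, mirroring the query above.
SELECT COUNT(*) AS n FROM users CROSS JOIN chrome_extensions USING (uid);
SELECT COUNT(*) AS n FROM users CROSS JOIN firefox_addons USING (uid);
SELECT COUNT(*) AS n FROM users CROSS JOIN atom_packages USING (uid);
```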
a
So, the users query returns 1800 results (all domain users, excluding computers). The software inventory query uses
SELECT * FROM users
which returns ~3500 results (domain users + domain computers), and that increases CPU usage. I think this SELECT should be the same as the one in the users query.
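A quick way to see where the extra rows come from, as a rough sketch: it assumes the `type` column of the osquery `users` table is populated on these hosts, and the actual counts will differ per domain.

```sql
-- Break the users table down by account type to compare against the ~3500 total.
SELECT type, COUNT(*) AS n
FROM users
GROUP BY type
ORDER BY n DESC;
```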
t
ah, interesting find!
we are discussing internally to see how we can better approach this, btw, thank you for your patience!
z
If you do
select * from users
on that DC, do all of the users have a value for the
directory
column?
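Something like this would give the numbers at a glance. It is a sketch only: it treats both empty and NULL `directory` values as missing, which is an assumption about how the table reports them.

```sql
-- Count how many rows in users have a non-empty directory value.
SELECT
  COUNT(*) AS total_users,
  SUM(CASE WHEN directory <> '' THEN 1 ELSE 0 END) AS with_directory
FROM users;
```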
a
No, it looks like the
directory
field is populated only for accounts that have logged in locally. I attached redacted results with a filled
directory
; all domain users look like the ones in the last two lines.
Hi, did you come to any conclusion?
t
hi there! this issue is still ongoing
g
Also encountered this on both Windows 2016 and 2019 running osqueryd 5.2.2, but only on domain controllers. No more issues after setting
enable_software_inventory: false
d
Same here on multiple Windows DCs. Disabled enable_software_inventory and enable_host_users, and things are stable