#general
Esteban

10/20/2020, 4:09 PM
Is there any performance issue in 4.5.1? I'm running queries on Kolide on 4.5.1 hosts and they are not returning anything until I restart the service on each host. Even an osquery_info query is taking forever and not returning anything.
seph

10/20/2020, 4:17 PM
I don’t think I’ve seen or heard about any issues with 4.5.1 or 4.5.0. Do you have more information? How did 4.5.1 get installed? (If there’s any of Kolide’s update mechanism here, this may belong on #kolide)
Esteban

10/20/2020, 4:20 PM
Via kolide launcher with autoupdate enabled
4:32 PM
It seems that after running a query on the windows_eventlog table (without WHERE channel), subsequent queries start to fail or take too long.
alessandrogario

10/20/2020, 4:55 PM
Hello Esteban! Could you please share with us the query you are using? (cc @Akshay Kumar)
seph

10/20/2020, 5:15 PM
Are you using Kolide’s update servers, or your own? I only pushed 4.5.1 to stable 3 hours ago….
5:15 PM
And to clarify a bit here: the nodes update, are running 4.5.1, and then a query to the eventlog table renders them unresponsive?
5:15 PM
That table is new to 4.5.0, isn’t it?
alessandrogario

10/20/2020, 5:17 PM
I suspect that the table doesn't play well without a WHERE clause. It will pull in too much data, causing the watcher to kill osquery
5:18 PM
Regardless, I think the table should reject queries without a WHERE channel = clause and also set a configurable limit of events that can be read from the event source.
seph

10/20/2020, 6:17 PM
Yeah, that’s a good guess.
6:18 PM
Pretty sure I called out something like that in the PR. It’s one of those boil-the-ocean sorts of things
Esteban

10/21/2020, 1:13 PM
I'm only running select * from windows_eventlog. After that I run any other query, like select * from osquery_info, and everything stops working.
1:15 PM
I'm using Kolide servers. I've set up an installer with Kolide launcher and package builder, with the update channel on stable. The thing I noticed is that I've configured the osquery version to 4.5.0 and the update channel to stable, and when the package finishes installing it autoupdates to 4.5.1.
seph

10/21/2020, 1:19 PM
I think the Kolide part isn’t a factor here. But yes, if you point launcher’s autoupdate at our servers, you will get what we release as stable. Doesn’t matter what you build the package with. It will downgrade or upgrade as needed.
1:21 PM
I suspect select * from windows_eventlog is the issue. That attempts to read most events into RAM, and likely causes issues. Though I thought it had a required channel and xpath (though it’s a bit weird and complicated).
1:22 PM
I’m not sure if it got documented well. You could take a look at the discussion and code in https://github.com/osquery/osquery/pull/6563 where it merged
1:22 PM
There may well be a bug in that table
Esteban

10/21/2020, 1:24 PM
I've just tested it with a WHERE clause and it "crashes". I usually do it that way and it works fine, but I've never tested it with xpath.
1:24 PM
Maybe querying by channel and event ID?
1:33 PM
Ok, filtering by channel and eventid also kills osquery on the host, apparently. It's a Windows host, so I don't know a proper way to debug it.
1:47 PM
Limiting to 3 or 5 rows also works.
seph

10/21/2020, 2:14 PM
Hrm. You said you don’t have access to a machine to debug this locally?
2:18 PM
I’m breaking out some test cases…
Esteban

10/21/2020, 2:19 PM
No, I do have access; I just don't know how to debug the service on the Windows machine.
seph

10/21/2020, 2:20 PM
What happens if you run osquery on the command line?
2:20 PM
Testing in my environment, I cannot replicate this. If I run select * from windows_eventlog, I get back an error. My device is not hung.
2:23 PM
Though if I query for select * from windows_eventlog where channel = "Security" it’s taking a long time to return these results.
2:24 PM
I suspect this is not an osquery bug. There may or may not be a bug somewhere in the Kolide stack. Or this might be expected behavior around a table with this many rows. Not sure yet.
2:33 PM
Might be a launcher bug in how this stuff is marshalled and sent over the wire. Feel free to open a bug in the launcher repo, though I don’t know if I can prioritize it.
Esteban

10/21/2020, 2:34 PM
Yes, the same thing happened to me yesterday. I will try it on the host's CLI. Is there any way to open a CLI for hosts installed with package builder? Or are you talking about the fleetctl CLI?
seph

10/21/2020, 2:35 PM
Neither. Running osquery from powershell is sometimes a good way to debug
Esteban

10/21/2020, 2:35 PM
Understood
2:36 PM
Is the query without a WHERE clause returning something for you?
Akshay Kumar

10/21/2020, 2:40 PM
The windows_eventlog query without a WHERE clause should return an error log. It requires a channel or xpath to query the events. Also, if you are querying the events without other constraints, it may take longer depending on the number of events in the Security or Application channel.
Esteban

10/21/2020, 2:43 PM
Querying with a WHERE clause also hangs subsequent queries, and the query itself takes too long to return anything.
seph

10/21/2020, 2:46 PM
I think I replicated what you’re seeing. My guess is there’s something weird in launcher about marshalling that much data. As said, feel free to open an issue in the launcher repo, though I don’t know if I can prioritize it
2:46 PM
Actually, I’ll open one
Akshay Kumar

10/21/2020, 2:50 PM
The query with a WHERE clause may take time because of the large number of event logs in a specific channel. I am planning to add a max_windows_eventlog_events flag to the query, as suggested by @alessandrogario. This will reduce the query time.
seph

10/21/2020, 2:51 PM
I don’t think it’s query time. I think it’s data returned
Akshay Kumar

10/21/2020, 2:55 PM
Is it the size of the data, or is some specific event data causing the problem?
Esteban

10/21/2020, 2:55 PM
For example, querying with limit 3 or 5 works fine.
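
The query shapes discussed in this thread can be sketched as follows. This is a minimal illustration, not a confirmed fix: the Security channel and the small LIMIT values come from the messages above, while event ID 4624 (a Windows logon event) and the selected columns are assumptions chosen for the example.

```sql
-- Unbounded: per the thread, this is expected to return an error because
-- no channel or xpath constraint is given.
SELECT * FROM windows_eventlog;

-- Constrained to one channel, but can still return a very large result set.
SELECT * FROM windows_eventlog WHERE channel = 'Security';

-- Constrained by channel and event ID, with a small LIMIT as in the thread.
-- Event ID 4624 and the column list are illustrative assumptions.
SELECT datetime, eventid, provider_name, data
FROM windows_eventlog
WHERE channel = 'Security'
  AND eventid = 4624
LIMIT 5;
```

That only the LIMIT variants returned cleanly is consistent with the guess that the volume of data returned, rather than query time alone, is the problem.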