#kolide
Erich Stoekl

04/28/2020, 6:59 PM
Hi Folks. I am running into the strangest issue with Fleet. My distributed queries sometimes execute and sometimes do not. I am monitoring logs both on the Fleet server and the osquery node. I see in the Fleet server that a NewDistributedQueryCampaign is scheduled successfully. On the osquery node, I am monitoring with --verbose and --tls_dump to see all data. I see that the osquery node is polling the read endpoint (api/v1/osquery/distributed/read) with the correct node key. It usually just gets back an empty queries: response. Sometimes, however, it gets the proper query and runs it! It seems to run it about 10% of the time. Also notable is that creating packs/scheduling queries works 100% of the time. My Fleet server is deployed behind an HAProxy LB. The LB uses its own certificates (signed, wildcard), and my Fleet server uses self-signed certs. The osquery node is using the public key pem file for the LB cert, and it enrolls properly. Anyone have any ideas?
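One way to reproduce the poll described above outside of osqueryd is a small script that sends the same kind of request and reports whether any queries came back. The following is a minimal sketch, assuming the endpoint accepts a JSON body carrying node_key and answers with a JSON object that has a queries map (as the --tls_dump output above suggests). The function names, the example base URL, and the ca_file parameter are hypothetical, and polling with a real node key may affect what the real agent receives.

```python
import json
import ssl
import urllib.request


def build_read_request(base_url: str, node_key: str) -> urllib.request.Request:
    """Build a POST to the distributed read endpoint (body shape is an assumption)."""
    body = json.dumps({"node_key": node_key}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/v1/osquery/distributed/read",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def pending_queries(response_body: bytes) -> dict:
    """Return the queries map from a response, or an empty dict if none were sent."""
    payload = json.loads(response_body)
    return payload.get("queries") or {}


def poll_once(base_url: str, node_key: str, ca_file: str) -> dict:
    """Send one poll, trusting only the CA or certificate in ca_file."""
    context = ssl.create_default_context(cafile=ca_file)
    request = build_read_request(base_url, node_key)
    with urllib.request.urlopen(request, context=context, timeout=10) as resp:
        return pending_queries(resp.read())
```

Calling poll_once once against the LB hostname and once against the Fleet server directly, right after starting a live query, would show whether the two paths return different results for the same node.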
sundsta

04/28/2020, 7:27 PM
Is HAProxy caching anything?
7:28 PM
I think it’s a GET request, so the response may be cached with an empty queries object from when osqueryd checked in and there were no distributed queries to run
7:32 PM
Edit: looks like it’s a POST request, so less likely to be cached, but still something to check
Erich Stoekl

04/28/2020, 8:46 PM
interesting... i'll look into that
9:00 PM
Woah, I just found something crazy. When I'm logged in to the Fleet Web UI without going through the load balancer (connecting to the UI directly at the server), my distributed queries work 100% of the time! The osquery nodes can still be connected through the LB!
10:24 PM
It seems that when scheduling queries through the LB endpoint, the queries don't actually get scheduled
zwass

04/29/2020, 4:04 PM
Are you referring to scheduled queries (in packs) or live queries?
Erich Stoekl

04/29/2020, 5:01 PM
live queries
zwass

04/29/2020, 5:03 PM
Does your LB support websockets? We try to support live queries either way, but that could potentially be an issue.
Erich Stoekl

04/29/2020, 5:36 PM
I'm not sure, I will check. Does it normally use websockets? Can I explicitly disable websockets?
zwass

04/29/2020, 9:08 PM
Normally it pushes results over a websocket. I don't think you can explicitly disable them.
Erich Stoekl

04/29/2020, 9:48 PM
The osquery agent or the fleet server pushes results via websocket? I see some stuff in the LiveQuery method in server/service/client_live_query.go but I'm having trouble figuring out what the websocket is doing
9:52 PM
ohh it's so the Web UI can interact with the fleet server backend, right?
zwass

04/29/2020, 9:54 PM
yes
Erich Stoekl

04/29/2020, 11:25 PM
Interesting, when not using the LB, my queries work but I see the following error in the Chrome Console:
WebSocket connection to 'wss://my-kolide-svc.my-company.com/api/v1/kolide/results/207/efwfuaj3/websocket' failed: WebSocket is closed before the connection is established.
However I see in the network tab that the websocket performed the content download and it took 2.26 ms. When executing the live query through the LB, I see in the network tab that the websocket is initiated, but it just hangs there.
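To compare the two paths without a browser, a raw WebSocket opening handshake can be sent to each host to see whether the upgrade completes. The following is a minimal sketch, assuming the endpoint speaks the standard RFC 6455 handshake over TLS. The function names and the probe parameters are hypothetical, and the server may still reject the request for unrelated reasons (authentication, Origin checks, session paths), so a failure here is only a hint.

```python
import base64
import hashlib
import os
import socket
import ssl

# Fixed GUID defined by RFC 6455 for computing Sec-WebSocket-Accept.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def expected_accept(key: str) -> str:
    """Compute the Sec-WebSocket-Accept value a server must return for key."""
    digest = hashlib.sha1((key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def build_upgrade_request(host: str, path: str, key: str) -> bytes:
    """Build the HTTP/1.1 request that asks the server to switch to WebSocket."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Upgrade: websocket",
        "Connection: Upgrade",
        f"Sec-WebSocket-Key: {key}",
        "Sec-WebSocket-Version: 13",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def handshake_succeeded(response_head: bytes, key: str) -> bool:
    """True if the response is a 101 with the matching Sec-WebSocket-Accept."""
    text = response_head.decode("iso-8859-1")
    status_line, _, header_block = text.partition("\r\n")
    parts = status_line.split(" ", 2)
    if len(parts) < 2 or parts[1] != "101":
        return False
    headers = {}
    for line in header_block.split("\r\n"):
        name, sep, value = line.partition(":")
        if sep:
            headers[name.strip().lower()] = value.strip()
    return headers.get("sec-websocket-accept") == expected_accept(key)


def probe(host: str, path: str, port: int = 443, ca_file=None, timeout: float = 10.0) -> bool:
    """Open a TLS connection, send the upgrade request, and check the reply."""
    key = base64.b64encode(os.urandom(16)).decode("ascii")
    context = ssl.create_default_context(cafile=ca_file)
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with context.wrap_socket(raw, server_hostname=host) as tls:
            tls.sendall(build_upgrade_request(host, path, key))
            head = b""
            while b"\r\n\r\n" not in head:
                chunk = tls.recv(4096)
                if not chunk:
                    break
                head += chunk
    return handshake_succeeded(head, key)
```

If the LB never forwards the upgrade, probe would be expected to raise a timeout rather than return, which would match the hanging websocket seen in the network tab.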