
Dawei Zhang

03/31/2022, 6:32 PM
Hi, we have an issue regarding Fleet running behind a load balancer.

zwass

03/31/2022, 6:33 PM
Ah, this looks like an issue with your load balancer not supporting websockets.

Dawei Zhang

03/31/2022, 6:33 PM
When we run a query in the Fleet UI, it occasionally runs forever. This happens about once in 7 or 8 tries.
We enabled wss, and wss works.
For example, this time wss works.
The issue only happens once in 10 times or so.

Kathy Satterlee

03/31/2022, 8:01 PM
Hi @Dawei Zhang! I'd like to get a little more info about your environment so that we can hopefully dig into the root of the issue. Let's start with: 1. How is your Fleet server deployed? 2. What version of Fleet are you running? 3. Is there anything the hanging queries have in common, or does it seem random? 4. Are you seeing any errors in the Fleet logs?

Tomas Touceda

03/31/2022, 8:17 PM
Another important piece of the puzzle here is Redis; we've found some issues depending on its configuration. See here for instance: https://fleetdm.com/docs/deploying/faq#im-only-getting-partial-results-from-live-queries

Dawei Zhang

04/01/2022, 4:58 PM
@Kathy Satterlee Thanks for looking into it. 1. We deploy two Fleet instances behind an Nginx load balancer. 2. Fleet version: fleet_v4.12.0_linux 3. Random queries. 4. I do see some errors in the logs:
{
  "component": "http",
  "err": "timestamp: 2022-03-31T18:30:26Z: error in query ingestion",
  "ingestion-err": "campaign waiting for listener (please retry)",
  "ip_addr": "10.124.121.115",
  "level": "error",
  "method": "POST",
  "took": "6.469503ms",
  "ts": "2022-03-31T18:30:26.444353614Z",
  "uri": "/api/v1/osquery/distributed/write",
  "x_for_ip_addr": "10.124.121.115"
}
Let me know if you need more info.
@Tomas Touceda Thank you for the info, let me try to update the Redis config.

Artem

04/28/2022, 2:36 PM
@Dawei Zhang hi! Have you solved your problem? We see the same error now and are trying to fix it.

zwass

04/28/2022, 3:49 PM
If you are seeing the xhr_send request, this likely means your load balancer (or something in the network) is blocking websockets.

Dawei Zhang

04/28/2022, 4:31 PM
It's not fixed yet. We only have one instance running now.

Artem

04/28/2022, 7:05 PM
We set all the right settings in the load balancer (nginx), so we don't see any problems with websockets. BTW, I have some problems with software inventory (and as a result with the vulnerability management module), with similar errors:
Apr 28 18:55:58 fleet-01.test.tech fleet[3040986]: {"component":"http","err":"timestamp: 2022-04-28T18:55:58Z: error in query ingestion","ingestion-err":"ingesting query software_linux: update host software: insert software: timestamp: 2022-04-28T18:55:58Z: Error 1213: Deadlock found when trying to get lock; try restarting transaction","ip_addr":"172.12.13.14","level":"error","method":"POST","took":"6.156863664s","ts":"2022-04-28T18:55:58.53477351Z","uri":"/api/v1/osquery/distributed/write","x_for_ip_addr":"172.12.13.14"}

Apr 28 18:55:58 fleet-01.test.tech fleet[3040986]: {"component":"http","err":"timestamp: 2022-04-28T18:55:54Z: error in query ingestion || create transaction: timestamp: 2022-04-28T18:55:58Z: context canceled || save host with id 27: timestamp: 2022-04-28T18:55:58Z: context canceled","ingestion-err":"ingesting query software_linux: update host software: insert software: timestamp: 2022-04-28T18:55:54Z: context canceled","ip_addr":"172.12.13.15","level":"error","method":"POST","took":"19.774983596s","ts":"2022-04-28T18:55:58.898478856Z","uri":"/api/v1/osquery/distributed/write","x_for_ip_addr":"172.12.13.15"}
Ad-hoc and scheduled queries work fine. We also know that this is not a load balancer problem (a direct connection to Fleet from osquery shows the same problem). So now we are trying to locate the cause somewhere between Redis and MySQL.
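
Editor's sketch: the two log lines above fail in different ways (a MySQL deadlock versus a canceled context), and the earlier report in this thread shows a third ("campaign waiting for listener"). A minimal way to tally such lines, assuming the syslog-prefixed JSON layout shown in this thread, might look like the following. The function names and category labels are hypothetical and are not part of Fleet.

```python
import json


def parse_fleet_log_line(line):
    """Return the JSON payload of a (possibly syslog-prefixed) log line, or None."""
    start = line.find("{")  # the JSON object starts at the first brace
    if start == -1:
        return None
    try:
        return json.loads(line[start:])
    except json.JSONDecodeError:
        return None


def classify_ingestion_error(entry):
    """Map a parsed entry to a coarse, hypothetical category label."""
    text = " ".join([str(entry.get("err", "")), str(entry.get("ingestion-err", ""))])
    if "Error 1213" in text or "Deadlock found" in text:
        return "mysql_deadlock"
    if "context canceled" in text:
        return "context_canceled"
    if "campaign waiting for listener" in text:
        return "live_query_no_listener"
    return "other"


def count_categories(lines):
    """Count categories across many log lines, skipping lines without JSON."""
    counts = {}
    for line in lines:
        entry = parse_fleet_log_line(line)
        if entry is None:
            continue
        label = classify_ingestion_error(entry)
        counts[label] = counts.get(label, 0) + 1
    return counts
```

Feeding the output of the service journal through `count_categories` gives a quick view of which failure dominates before digging into the Redis or MySQL side.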

Tomas Touceda

04/28/2022, 7:08 PM
hi @Artem what version of fleet are you running?

Artem

04/28/2022, 7:09 PM
Some of our clients have been connected for several days, but we still don't see software data. P.S. If I run the queries from https://github.com/fleetdm/fleet/blob/main/server/service/osquery_utils/queries.go#L391 in interactive mode, they work fine.
Hi @Tomas Touceda! Fleet 4.13.0 • Go go1.17.8
I think we need to dive into our Redis and MySQL configs (because they were set up by different teams). But it would be great if you could give any advice about the right places to check. I checked the Redis logs and don't see any errors.

Tomas Touceda

04/28/2022, 7:14 PM
that deadlock is in mysql, could you check the host details for host id 27 and see if it shows software there?

Artem

04/28/2022, 7:15 PM
No, there is no software in /hosts/27
Currently I don't have direct access to the MySQL server, so I cannot see its logs, but I will try to get them with the help of our DBA asap 🙂
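
Editor's sketch: the MySQL message quoted earlier in the thread (Error 1213, "try restarting transaction") suggests the usual client-side response to a deadlock, which is to rerun the whole transaction after a short pause. The snippet below illustrates that general pattern only; it is not how Fleet handles the error. `DatabaseError` and `run_with_deadlock_retry` are hypothetical names standing in for whatever a real driver provides.

```python
import random
import time

MYSQL_ER_LOCK_DEADLOCK = 1213  # MySQL error code reported in the log lines above


class DatabaseError(Exception):
    """Hypothetical stand-in for a driver exception that carries an error code."""

    def __init__(self, code, message):
        super().__init__(f"Error {code}: {message}")
        self.code = code


def run_with_deadlock_retry(transaction, max_attempts=3, base_delay=0.05,
                            sleep=time.sleep):
    """Call transaction(); rerun it if it fails with a deadlock error.

    Any other error, or a deadlock on the final attempt, is re-raised.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return transaction()
        except DatabaseError as exc:
            if exc.code != MYSQL_ER_LOCK_DEADLOCK or attempt == max_attempts:
                raise
            # Exponential backoff with jitter so competing writers spread out.
            delay = base_delay * (2 ** (attempt - 1))
            sleep(delay + random.uniform(0, base_delay))
```

The callable passed in should contain the complete transaction (begin, statements, commit), because MySQL rolls back the deadlock victim and a partial rerun would lose the earlier statements.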