# fleet
k
Hi, Most of the time the queries in the UI are just not reliable. The results never show up no matter how small the query is (
SELECT * FROM osquery_info
). Has anyone else faced this issue? Am I doing something wrong?
z
Which Fleet version?
k
3.0.0
z
That is quite old. The first debugging step is usually to upgrade to the latest version.
Live query works reliably for tens of thousands of hosts on recent versions of Fleet.
k
Last I checked, I couldn't find the latest image on Docker Hub, so I thought I'd just wait for it.
I'll check again after upgrading
a
@koba I'm using 3.9.0 in production. No issues. Latest is 3.10.0 https://github.com/fleetdm/fleet/releases
docker pull fleetdm/fleet:3.9.0
k
hmm, so I am trying the latest image
3.10
, but my webserver pods aren't coming up after
kubectl apply
❯ kubectl get pods
NAME                                    READY   STATUS             RESTARTS   AGE
fleet-cache-redis-master-0              1/1     Running            0          3d11h
fleet-cache-redis-slave-0               1/1     Running            0          3d10h
fleet-cache-redis-slave-1               1/1     Running            9          74d
fleet-database-mysql-75ff9c4fff-ntt7p   1/1     Running            0          3d11h
fleet-prepare-db-rh78r                  0/1     Completed          0          5m13s
fleet-webserver-5548999cc6-2rcjk        0/1     CrashLoopBackOff   5          4m43s
fleet-webserver-5548999cc6-57zks        0/1     Pending            0          4m43s
fleet-webserver-5548999cc6-7hnw7        0/1     CrashLoopBackOff   5          4m43s
fleet-webserver-5548999cc6-nq26m        0/1     CrashLoopBackOff   5          4m43s
fleet-webserver-5548999cc6-wsdcv        0/1     Pending            0          4m43s
z
You probably need to run the database migrations. Make sure you check out https://github.com/fleetdm/fleet/blob/master/docs/1-Using-Fleet/7-Updating-Fleet.md.
k
I did that. Here's my
fleet-migrations.yml
❯ cat  fleet-migrations.yml
apiVersion: batch/v1
kind: Job
metadata:
  name: fleet-prepare-db
spec:
  template:
    metadata:
      name: fleet-prepare-db
    spec:
      containers:
      - name: fleet
        image: fleetdm/fleet:latest
        command: ["fleet",  "prepare", "db"]
        env:
          - name: FLEET_MYSQL_ADDRESS
            value: fleet-database-mysql:3306
          - name: FLEET_MYSQL_PASSWORD
            valueFrom:
              secretKeyRef:
                name: fleet-database-mysql
                key: mysql-password
      restartPolicy: Never
  backoffLimit: 4
So I was able to upgrade to 3.10. I had to add a few more env variables to my
fleet-deployment.yml
3.10 looks very noice!!
So i ran a test query
select * from osquery_info
and this is what i get
So I tried many queries after the upgrade. Only one (which I ran on a single machine) showed results; all the rest kept "spinning". Also, as you can see in the screenshot, the progress status text says
0 out of 0 online hosts responding
...which is definitely not true because there were some 500+ hosts online at that point.
z
Do hosts show up in the All Hosts label when you are on the home page? Does it work consistently when targeting a single host? If you open the network inspector are you able to see the websocket connection? Like in screenshot:
k
Do hosts show up in the All Hosts label when you are on the home page? 
Yes.
Does it work consistently when targeting a single host?
Yes. It fails to establish a websocket connection or show any results for the same host across multiple attempts.
z
Oh. This is likely a load balancer issue. You'll want to dig into why the websocket connection fails. It does try to fall back to XHR instead of websockets but you really want websockets working for reliability.
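A rough way to check this from outside the browser, as a sketch only: send a bare WebSocket upgrade request through the LB and look at the status code. A 101 means the upgrade went through; anything else suggests something in the path is rejecting or stripping the upgrade. The function names are made up for illustration, and the host and path are placeholders; copy the real websocket URL from the browser's network inspector rather than guessing it. Fleet may also expect cookies or auth on that request, so treat a non-101 as a hint, not proof.

```python
import base64
import http.client
import os


def upgrade_headers():
    """Headers for a minimal WebSocket opening handshake (RFC 6455)."""
    return {
        "Connection": "Upgrade",
        "Upgrade": "websocket",
        "Sec-WebSocket-Version": "13",
        # The key is 16 random bytes, base64-encoded.
        "Sec-WebSocket-Key": base64.b64encode(os.urandom(16)).decode("ascii"),
    }


def websocket_upgrade_status(host, path, port=443, use_tls=True, timeout=10):
    """Send one upgrade request and return the HTTP status code.

    101 (Switching Protocols) means the upgrade was accepted end to end.
    """
    conn_cls = http.client.HTTPSConnection if use_tls else http.client.HTTPConnection
    conn = conn_cls(host, port, timeout=timeout)
    try:
        conn.request("GET", path, headers=upgrade_headers())
        return conn.getresponse().status
    finally:
        conn.close()


# Example (placeholders, not real values):
# websocket_upgrade_status("fleet.example.com", "/path/from/network/inspector")
```

Running it once against the LB and once directly against a pod (for example through a port-forward, with `use_tls` set to match) should show whether the LB is the hop that breaks the upgrade.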
k
I see. It makes some sense now. I tried running a query on a host 3-4 times in the morning and the websocket failed every single time; however, it gave results later during the day.
I'll investigate further from the LB side
It does try to fall back to XHR instead of websockets
But does that help though? Because I still didn't see any results...
z
It used to work mostly. I don't think many folks have LBs these days that don't support websockets. I think we may want to disable that functionality and just show you an error.
k
I realised I was running AWS' Classic Load Balancer, which doesn't support websockets, so I migrated to an Application LB. But I still face the same issue (websocket error in the dev console, and the XHR fallback fails with 404 Page not found). The XHR fallback, however, works as soon as I set
replicas: 1
in my
fleet-deployment.yml
. Anything more than 1, and the XHR fallback fails with 404.
z
Ahh, if you want XHR to work with multiple servers I think you'll need to make sure the LB has sticky sessions so that each request goes to the same server. Best move is just to get websockets working though.
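To make the sticky-session point concrete, here is a toy model, not Fleet's actual code. It assumes the fallback session state lives only in the memory of the server that accepted the first request, which would explain the 404s with more than one replica. `Replica`, `round_robin`, `sticky`, and `run_query` are hypothetical names, and stickiness is keyed by session id for simplicity (a real LB would typically pin by cookie).

```python
import itertools


class Replica:
    """Stand-in for one webserver pod; sessions live only in its memory."""

    def __init__(self, name):
        self.name = name
        self.sessions = set()

    def open_session(self, session_id):
        self.sessions.add(session_id)
        return 200

    def poll(self, session_id):
        # A replica that never saw the session has nothing to return.
        return 200 if session_id in self.sessions else 404


def round_robin(replicas):
    """LB without stickiness: each request goes to the next replica."""
    cycle = itertools.cycle(replicas)
    return lambda session_id: next(cycle)


def sticky(replicas):
    """LB with stickiness: a session stays on its first replica."""
    cycle = itertools.cycle(replicas)
    pinned = {}

    def pick(session_id):
        if session_id not in pinned:
            pinned[session_id] = next(cycle)
        return pinned[session_id]

    return pick


def run_query(pick_replica, session_id="s1", polls=4):
    """Open a session, then poll it; returns the status code of each request."""
    statuses = [pick_replica(session_id).open_session(session_id)]
    statuses += [pick_replica(session_id).poll(session_id) for _ in range(polls)]
    return statuses


print(run_query(round_robin([Replica(n) for n in "abc"])))  # mixed 200/404
print(run_query(round_robin([Replica("a")])))               # all 200
print(run_query(sticky([Replica(n) for n in "abc"])))       # all 200
```

Under that assumption, one replica always works, several replicas behind a non-sticky LB mostly return 404, and stickiness restores the single-replica behaviour.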