#fleet
wennan.he

09/16/2022, 3:49 AM
Does anyone know what this error means when I start Fleet?
Sep 16 03:47:19 n107-019-021 fleet[1560407]: {"component":"http","err":"authentication error: invalid node key: JlBVRLpv/doDpN1CvShCIpZpnfCERea0","level":"info","path":"/api/v1/osquery/distributed/read","ts":"2022-09-16T03:47:19.825997124Z"}
Michal Nicpon

09/16/2022, 3:24 PM
This just means that one of the hosts that was enrolled with Fleet has an invalid node key. In most cases, the osqueryd running on the host should successfully re-enroll if it has a valid enroll secret. Do you see this message repeated?
wennan.he

09/16/2022, 5:19 PM
Yes, I did. May I know how to find out which host failed to enroll?
Kathy Satterlee

09/16/2022, 5:37 PM
You can grab that from the Rest API at
api/vu/fleet/hosts/identifier/<key from the error>
https://fleetdm.com/docs/using-fleet/rest-api#get-host-by-identifier Hope that helps!
5:40 PM
Though I just realized that this may not return if you're getting an invalid node key. 🤦
5:42 PM
Please give it a go and let me know what happens.
wennan.he

09/16/2022, 8:19 PM
Well, I just see a lot of the same error requests in our log. Is there any way I can look up that data in the Fleet DB?
Kathy Satterlee

09/16/2022, 8:35 PM
You could query the database directly, yes. But if nothing came back from the API call, I don't believe you'll get anything back from there either. Can you share what the response was from the Rest API? I realize there was a typo in the endpoint the first time I gave it:
<your fleet address>/api/v1/fleet/hosts/identifier/<node key>
Or, using MySQL to query the Fleet db:
SELECT id, hostname FROM hosts WHERE node_key = '<node key>';
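One detail worth noting: the node key in the error above contains a "/", so pasting it straight into the URL path would split it into two path segments. A minimal Python sketch of building the request URL with the key percent-encoded follows. The function name and the example address are hypothetical, and whether the server accepts an encoded "/" in this position is an assumption to verify against your Fleet version; the request itself still needs whatever authentication your Fleet API requires.

```python
from urllib.parse import quote


def host_by_identifier_url(fleet_address: str, identifier: str) -> str:
    """Build the 'get host by identifier' URL for a Fleet server.

    fleet_address is a placeholder such as "https://fleet.example.com".
    """
    base = fleet_address.rstrip("/")
    # safe="" makes quote() escape "/" too, so the key stays one path segment.
    return f"{base}/api/v1/fleet/hosts/identifier/{quote(identifier, safe='')}"
```

For example, `host_by_identifier_url("https://fleet.example.com", "JlBVRLpv/doDpN1CvShCIpZpnfCERea0")` yields a path ending in `JlBVRLpv%2FdoDpN1CvShCIpZpnfCERea0`.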
wennan.he

09/16/2022, 8:39 PM
Let me try another case.
8:41 PM
Actually, I can't see that error anymore, but instead I see:
Sep 16 19:41:51 n107-019-021 fleet[2438691]: {"component":"http","err":"retrieve label queries: selecting label queries for host: context canceled","ip_addr":"10.114.61.96","level":"error","method":"POST","took":"15.995500738s","ts":"2022-09-16T19:41:51.981100237Z","uri":
Sep 16 19:41:51 n107-019-021 fleet[2438691]: {"component":"http","err":"authentication error: find host: context canceled","level":"info","path":"/api/v1/osquery/config","ts":"2022-09-16T19:41:51.981438035Z"}
Sep 16 19:41:51 n107-019-021 fleet[2438691]: {"component":"http","err":"authentication error: find host: context canceled","level":"info","path":"/api/v1/osquery/distributed/write","ts":"2022-09-16T19:41:51.981738413Z"}
Sep 16 19:41:51 n107-019-021 fleet[2438691]: {"component":"http","err":"authentication error: find host: context canceled","level":"info","path":"/api/v1/osquery/distributed/read","ts":"2022-09-16T19:41:51.982167134Z"}
Sep 16 19:41:51 n107-019-021 fleet[2438691]: {"component":"http","err":"authentication error: find host: context canceled","level":"info","path":"/api/v1/osquery/distributed/write","ts":"2022-09-16T19:41:51.982510319Z"}
Sep 16 19:41:51 n107-019-021 fleet[2438691]: {"component":"http","err":"authentication error: find host: context canceled","level":"info","path":"/api/v1/osquery/distributed/write","ts":"2022-09-16T19:41:51.982865912Z"}
Sep 16 19:41:51 n107-019-021 fleet[2438691]: {"component":"http","err":"authentication error: find host: context canceled","level":"info","path":"/api/v1/osquery/distributed/read","ts":"2022-09-16T19:41:51.983389773Z"}
Sep 16 19:41:51 n107-019-021 fleet[2438691]: {"component":"http","err":"authentication error: find host: context canceled","level":"info","path":"/api/v1/osquery/distributed/read","ts":"2022-09-16T19:41:51.983705104Z"}
Sep 16 19:41:51 n107-019-021 fleet[2438691]: {"component":"http","err":"authentication error: find host: context canceled","level":"info","path":"/api/v1/osquery/distributed/write","ts":"2022-09-16T19:41:51.983845062Z"}
Sep 16 19:41:51 n107-019-021 fleet[2438691]: {"component":"http","err":"authentication error: find host: context canceled","level":"info","path":"/api/v1/osquery/distributed/read","ts":"2022-09-16T19:41:51.984115883Z"}
Sep 16 19:41:51 n107-019-021 fleet[2438691]: 2022/09/16 19:41:51 http: Accept error: accept tcp [::]:8080: accept4: too many open files; retrying in 5ms
Sep 16 19:41:51 n107-019-021 fleet[2438691]: {"component":"http","err":"authentication error: find host: context canceled","level":"info","path":"/api/v1/osquery/distributed/write","ts":"2022-09-16T19:41:51.987641322Z"}
Sep 16 19:41:51 n107-019-021 fleet[2438691]: {"component":"http","err":"authentication error: find host: context canceled","level":"info","path":"/api/v1/osquery/distributed/write","ts":"2022-09-16T19:41:51.988006235Z"}
Sep 16 19:41:51 n107-019-021 fleet[2438691]: {"component":"http","err":"database error: listing hosts in pack: context canceled","ip_addr":"10.121.81.47","level":"error","method":"POST","took":"15.985706633s","ts":"2022-09-16T19:41:51.989177743Z","uri":"/api/v1/osquery/co
Sep 16 19:41:51 n107-019-021 fleet[2438691]: {"component":"http","err":"authentication error: find host: context canceled","level":"info","path":"/api/v1/osquery/config","ts":"2022-09-16T19:41:51.989793985Z"}
8:42 PM
Could you help explain these?
Michal Nicpon

09/19/2022, 5:20 PM
We commonly see “context canceled” errors when queries to the database are taking too long and timing out. Can you run the following on your database?
show engine innodb status;
show processlist;
wennan.he

09/19/2022, 8:41 PM
I ran the commands and got:
+---------+------------+---------------------+-------+---------+------+-------+------------------+
| Id      | User       | Host                | db    | Command | Time | State | Info             |
+---------+------------+---------------------+-------+---------+------+-------+------------------+
| 1974584 | fleet_user | 10.121.11.134:34298 | fleet | Sleep   | 3163 |       | NULL             |
| 1974585 | fleet_user | 10.121.11.134:34300 | fleet | Sleep   | 3163 |       | NULL             |
| 1974586 | fleet_user | 10.121.11.134:34302 | fleet | Sleep   | 3162 |       | NULL             |
| 1974587 | fleet_user | 10.121.11.134:34304 | fleet | Sleep   | 3166 |       | NULL             |
| 1974588 | fleet_user | 10.121.11.134:34306 | fleet | Sleep   | 3165 |       | NULL             |
| 1974589 | fleet_user | 10.121.11.134:34308 | fleet | Sleep   | 3163 |       | NULL             |
| 1974590 | fleet_user | 10.121.11.134:34310 | fleet | Sleep   | 3166 |       | NULL             |
| 1974591 | fleet_user | 10.121.11.134:34312 | fleet | Sleep   | 3163 |       | NULL             |
| 1974592 | fleet_user | 10.121.11.134:34314 | fleet | Sleep   | 3165 |       | NULL             |
| 1974593 | fleet_user | 10.121.11.134:34316 | fleet | Sleep   | 3163 |       | NULL             |
| 1974594 | fleet_user | 10.121.11.134:34318 | fleet | Sleep   | 3162 |       | NULL             |
| 1974595 | fleet_user | 10.121.11.134:34320 | fleet | Sleep   | 3163 |       | NULL             |
| 1974596 | fleet_user | 10.121.11.134:34322 | fleet | Sleep   | 3163 |       | NULL             |
| 1974597 | fleet_user | 10.121.11.134:34324 | fleet | Sleep   | 3162 |       | NULL             |
| 1974598 | fleet_user | 10.121.11.134:34326 | fleet | Sleep   | 3163 |       | NULL             |
| 1974599 | fleet_user | 10.121.11.134:34328 | fleet | Sleep   | 3162 |       | NULL             |
| 1974600 | fleet_user | 10.121.11.134:34330 | fleet | Sleep   | 3163 |       | NULL             |
| 1974601 | fleet_user | 10.121.11.134:34332 | fleet | Sleep   | 3163 |       | NULL             |
| 1974602 | fleet_user | 10.121.11.134:34334 | fleet | Sleep   | 3162 |       | NULL             |
| 1974603 | fleet_user | 10.121.11.134:34336 | fleet | Sleep   | 3162 |       | NULL             |
| 1974604 | fleet_user | 10.121.11.134:34338 | fleet | Sleep   | 3163 |       | NULL             |
| 1974605 | fleet_user | 10.121.11.134:34340 | fleet | Sleep   | 3162 |       | NULL             |
| 1974606 | fleet_user | 10.121.11.134:34342 | fleet | Sleep   | 3166 |       | NULL             |
| 1974607 | fleet_user | 10.121.11.134:34344 | fleet | Sleep   | 3163 |       | NULL             |
| 1974608 | fleet_user | 10.121.11.134:34346 | fleet | Sleep   | 3167 |       | NULL             |
| 1974609 | fleet_user | 10.121.11.134:34348 | fleet | Sleep   | 3162 |       | NULL             |
| 1974610 | fleet_user | 10.121.11.134:34350 | fleet | Sleep   | 3163 |       | NULL             |
| 1974611 | fleet_user | 10.121.11.134:34352 | fleet | Sleep   | 3163 |       | NULL             |
| 1974612 | fleet_user | 10.121.11.134:34354 | fleet | Sleep   | 3166 |       | NULL             |
| 1974613 | fleet_user | 10.121.11.134:34356 | fleet | Sleep   | 3163 |       | NULL             |
| 1974614 | fleet_user | 10.121.11.134:34358 | fleet | Sleep   | 3163 |       | NULL             |
| 1974615 | fleet_user | 10.121.11.134:34360 | fleet | Sleep   | 3163 |       | NULL             |
| 1974616 | fleet_user | 10.121.11.134:34362 | fleet | Sleep   | 3163 |       | NULL             |
| 1974617 | fleet_user | 10.121.11.134:34364 | fleet | Sleep   | 3163 |       | NULL             |
| 1974618 | fleet_user | 10.121.11.134:34366 | fleet | Sleep   | 3167 |       | NULL             |
| 1974619 | fleet_user | 10.121.11.134:34368 | fleet | Sleep   | 3167 |       | NULL             |
| 1974620 | fleet_user | 10.121.11.134:34370 | fleet | Sleep   | 3163 |       | NULL             |
| 1974621 | fleet_user | 10.121.11.134:34372 | fleet | Sleep   | 3162 |       | NULL             |
| 1974622 | fleet_user | 10.121.11.134:34374 | fleet | Sleep   | 3162 |       | NULL             |
| 1974623 | fleet_user | 10.121.11.134:34376 | fleet | Sleep   | 3163 |       | NULL             |
| 1974624 | fleet_user | 10.121.11.134:34378 | fleet | Sleep   | 3163 |       | NULL             |
| 1974625 | fleet_user | 10.121.11.134:34380 | fleet | Sleep   | 3163 |       | NULL             |
| 1974626 | fleet_user | 10.121.11.134:34382 | fleet | Sleep   | 3163 |       | NULL             |
| 1974627 | fleet_user | 10.121.11.134:34384 | fleet | Sleep   | 3163 |       | NULL             |
| 1974628 | fleet_user | 10.121.11.134:34386 | fleet | Sleep   | 3162 |       | NULL             |
| 1974629 | fleet_user | 10.121.11.134:34388 | fleet | Sleep   | 3163 |       | NULL             |
| 1974630 | fleet_user | 10.121.11.134:34390 | fleet | Sleep   | 3162 |       | NULL             |
| 1974631 | fleet_user | 10.121.11.134:34392 | fleet | Sleep   | 3162 |       | NULL             |
| 1974632 | fleet_user | 10.121.11.134:34394 | fleet | Sleep   | 3162 |       | NULL             |
| 2386345 | fleet_user | 10.121.8.225:46432  | fleet | Sleep   |  295 |       | NULL             |
| 2387079 | fleet_user | 10.121.8.225:47132  | fleet | Query   |    0 | init  | show processlist |
+---------+------------+---------------------+-------+---------+------+-------+------------------+
It looks like we don't have many threads blocking the DB connections. But meanwhile, I still see the error in the Fleet log, and I cannot access the Fleet portal; I get a 502 error.
8:42 PM
Sep 19 20:41:53 n121-008-225 fleet[3648337]: {"component":"http","err":"authentication error: find host: timestamp: 2022-09-19T20:36:14Z: context canceled","level":"info","path":"/api/v1/osquery/config","ts":"2022-09-19T20:36:14.727078865Z"}
8:42 PM
This is the error I see in the log.
2:15 AM
@Michal Nicpon I'm hitting the same issue again, and I got this when I ran show engine innodb status;
2:15 AM

| InnoDB | |

2022-09-21 02:11:19 0x7f5daedab700 INNODB MONITOR OUTPUT

Per second averages calculated from the last 22 seconds

BACKGROUND THREAD

srv_master_thread loops: 72529 srv_active, 0 srv_shutdown, 163964 srv_idle
srv_master_thread log flush and writes: 236491
----------

SEMAPHORES

OS WAIT ARRAY INFO: reservation count 2341481
OS WAIT ARRAY INFO: signal count 24338048
RW-shared spins 0, rounds 25940605, OS waits 424060
RW-excl spins 0, rounds 300796801, OS waits 596040
RW-sx spins 17062977, rounds 180087813, OS waits 922983
Spin rounds per wait: 25940605.00 RW-shared, 300796801.00 RW-excl, 10.55 RW-sx
------------

TRANSACTIONS

Trx id counter 12624344
Purge done for trx's n:o < 12624343 undo n:o < 0 state: running but idle
History list length 35
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 421516198238992, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421516198222432, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421516198236232, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421516198214152, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421516198208632, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421516198233472, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421516198195752, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421516198210472, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421516198221512, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421516198235312, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421516198213232, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421516198211392, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421516198203112, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421516198201272, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421516198225192, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421516198197592, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421516198200352, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421516198192992, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421516198207712, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421516198220592, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421516198194832, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421516198196672, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421516198234392, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421516198218752, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421516198204032, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421516198216912, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421516198198512, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421516198227952, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421516198231632, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421516198223352, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421516198215072, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421516198205872, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421516198202192, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421516198228872, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421516198209552, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421516198229792, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421516198238072, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421516198215992, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421516198226112, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421516198227032, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421516198224272, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421516198217832, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421516198212312, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421516198193912, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421516198219672, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421516198199432, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421516198230712, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421516198237152, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421516198206792, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421516198232552, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
---TRANSACTION 421516198204952, not started
0 lock struct(s), heap size 1136, 0 row lock(s)
--------

FILE I/O

I/O thread 0 state: waiting for completed aio requests (insert buffer thread)
I/O thread 1 state: waiting for completed aio requests (log thread)
I/O thread 2 state: waiting for completed aio requests (read thread)
I/O thread 3 state: waiting for completed aio requests (read thread)
I/O thread 4 state: waiting for completed aio requests (read thread)
I/O thread 5 state: waiting for completed aio requests (read thread)
I/O thread 6 state: waiting for completed aio requests (write thread)
I/O thread 7 state: waiting for completed aio requests (write thread)
I/O thread 8 state: waiting for completed aio requests (write thread)
I/O thread 9 state: waiting for completed aio requests (write thread)
Pending normal aio reads: [0, 0, 0, 0] , aio writes: [0, 0, 0, 0] , ibuf aio reads:, log i/o's:, sync i/o's:
Pending flushes (fsync) log: 0; buffer pool: 0
4927568 OS file reads, 5851378 OS file writes, 3296833 OS fsyncs
32.73 reads/s, 16384 avg bytes/read, 36.77 writes/s, 14.54 fsyncs/s
-------------------------------------

INSERT BUFFER AND ADAPTIVE HASH INDEX

Ibuf: size 1, free list len 1763, seg size 1765, 12622 merges
merged operations: insert 17289, delete mark 56533, delete 7033
discarded operations: insert 0, delete mark 0, delete 0
Hash table size 34673, node heap has 33 buffer(s)
Hash table size 34673, node heap has 58 buffer(s)
Hash table size 34673, node heap has 36 buffer(s)
Hash table size 34673, node heap has 7 buffer(s)
Hash table size 34673, node heap has 8 buffer(s)
Hash table size 34673, node heap has 64 buffer(s)
Hash table size 34673, node heap has 63 buffer(s)
Hash table size 34673, node heap has 1 buffer(s)
9217.13 hash searches/s, 1054.68 non-hash searches/s
---

LOG

Log sequence number 35694526585
Log flushed up to 35694526585
Pages flushed up to 35694526585
Last checkpoint at 35694526412
0 pending log flushes, 0 pending chkp writes
2968395 log i/o's done, 11.73 log i/o's/second
----------------------

BUFFER POOL AND MEMORY

Total large memory allocated 137428992
Dictionary memory allocated 678571
Buffer pool size 8191
Free buffers 1024
Database pages 6897
Old database pages 2525
Modified db pages 0
Pending reads 0
Pending writes: LRU 0, flush list 0, single page 0
Pages made young 199355, not young 204938329
2.27 youngs/s, 770.10 non-youngs/s
Pages read 4927491, created 109958, written 2772145
32.73 reads/s, 0.00 creates/s, 24.09 writes/s
Buffer pool hit rate 999 / 1000, young-making rate 0 / 1000 not 33 / 1000
Pages read ahead 0.00/s, evicted without access 0.00/s, Random read ahead 0.00/s
LRU len: 6897, unzip_LRU len: 0
I/O sum[3613]:cur[0], unzip sum[0]:cur[0]
--------------

ROW OPERATIONS

0 queries inside InnoDB, 0 queries in queue
0 read views open inside InnoDB
Process ID=3965556, Main thread ID=140040921364224, state: sleeping
Number of rows inserted 1255910, updated 161249667, deleted 7670, read 3065965772
1.59 inserts/s, 182.17 updates/s, 2.68 deletes/s, 18251.76 reads/s
----------------------------

END OF INNODB MONITOR OUTPUT

2:15 AM
Could you help explain what the issue with Fleet is?
Michal Nicpon

09/21/2022, 4:27 PM
Hmm, do you notice any particular patterns for when you start seeing these errors? There is an interesting error I saw
Sep 16 19:41:51 n107-019-021 fleet[2438691]: 2022/09/16 19:41:51 http: Accept error: accept tcp [::]:8080: accept4: too many open files; retrying in 5ms
Which suggests that maybe your fleet instance is trying to handle too many requests. Can you give me some information about your architecture?
• How many fleet instances are you running? How much memory and cpu do they have?
• How many hosts are enrolled with fleet?
wennan.he

09/21/2022, 4:33 PM
• How many fleet instances are you running? How much memory and cpu do they have?
1 instance. Memory: no limit. CPU: I need to check.
• How many hosts are enrolled with fleet?
20k
Michal Nicpon

09/21/2022, 4:35 PM
Hmm, do you notice any particular patterns for when you start seeing these errors?
For example, do they happen every hour or do you see these errors consistently?
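One way to answer the pattern question is to bucket the errors in the Fleet log per minute and see whether they arrive in bursts or continuously. A minimal Python sketch, assuming journald-style lines like the ones pasted in this thread (a `fleet[<pid>]: ` prefix followed by a JSON object with `err` and `ts` fields); the function name is hypothetical, and truncated or non-JSON lines are simply skipped.

```python
import json
import re
from collections import Counter

# Matches the JSON payload that follows "fleet[<pid>]: " in a log line.
_PAYLOAD = re.compile(r"fleet\[\d+\]: (\{.*\})\s*$")


def errors_per_minute(lines):
    """Count (minute, err) pairs across an iterable of Fleet log lines."""
    counts = Counter()
    for line in lines:
        match = _PAYLOAD.search(line)
        if not match:
            continue  # plain-text lines such as "http: Accept error: ..."
        try:
            entry = json.loads(match.group(1))
        except json.JSONDecodeError:
            continue  # payload was cut off by the terminal or pager
        err, ts = entry.get("err"), entry.get("ts")
        if err and ts:
            # "2022-09-16T19:41:51.98Z"[:16] -> "2022-09-16T19:41"
            counts[(ts[:16], err)] += 1
    return counts
```

Feeding it the saved log (for example `errors_per_minute(open("fleet.log"))`, where `fleet.log` is a placeholder file name) and printing `most_common(20)` shows which error dominates in which minute.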
wennan.he

09/21/2022, 4:35 PM
CPU info:
root@n107-019-021:/# lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 8
On-line CPU(s) list: 0-7
Thread(s) per core: 2
Core(s) per socket: 4
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 85
Model name: Intel(R) Xeon(R) Platinum 8275CL CPU @ 3.00GHz
Stepping: 7
CPU MHz: 3599.998
BogoMIPS: 5999.99
Hypervisor vendor: KVM
Virtualization type: full
L1d cache: 32K
L1i cache: 32K
L2 cache: 1024K
L3 cache: 36608K
NUMA node0 CPU(s): 0-7
4:36 PM
I see a lot of errors with a pattern like:
4:37 PM
Sep 21 16:32:04 n107-019-021 fleet[2572833]: {"component":"http","err":"authentication error: find host: context canceled","level":"info","path":"/api/v1/osquery/distributed/read","ts":"2022-09-21T16:32:04.594540367Z"}
Sep 21 16:32:04 n107-019-021 fleet[2572833]: {"component":"http","err":"authentication error: find host: context canceled","level":"info","path":"/api/v1/osquery/distributed/read","ts":"2022-09-21T16:32:04.596261708Z"}
Sep 21 16:32:04 n107-019-021 fleet[2572833]: {"component":"http","err":"authentication error: find host: context canceled","level":"info","path":"/api/v1/osquery/distributed/read","ts":"2022-09-21T16:32:04.598058435Z"}
Sep 21 16:32:04 n107-019-021 fleet[2572833]: {"component":"http","err":"retrieve policy queries: selecting policies for host: context canceled","ip_addr":"10.121.86.190","level":"error","method":"POST","took":"15.234862161s","ts":"2022-09-21T16:32:04.600067148Z","uri":"/api/v1/osquery/distributed/read","x_for_ip_addr":"10.121.86.190"}
Sep 21 16:32:04 n107-019-021 fleet[2572833]: {"component":"http","err":"database error: listing hosts in pack: context canceled","ip_addr":"10.121.43.106","level":"error","method":"POST","took":"15.236998083s","ts":"2022-09-21T16:32:04.602220416Z","uri":"/api/v1/osquery/config","x_for_ip_addr":"10.121.43.106"}
Sep 21 16:32:04 n107-019-021 fleet[2572833]: {"component":"http","err":"retrieve label queries: selecting label queries for host: context canceled","ip_addr":"10.121.34.10","level":"error","method":"POST","took":"15.193069269s","ts":"2022-09-21T16:32:04.603227619Z","uri":"/api/v1/osquery/distributed/read","x_for_ip_addr":"10.121.34.10"}
Sep 21 16:32:04 n107-019-021 fleet[2572833]: {"component":"http","err":"authentication error: find host: context canceled","level":"info","path":"/api/v1/osquery/distributed/read","ts":"2022-09-21T16:32:04.60417264Z"}
Sep 21 16:32:04 n107-019-021 fleet[2572833]: {"component":"http","err":"retrieve policy queries: selecting policies for host: context canceled","ip_addr":"10.121.79.160","level":"error","method":"POST","took":"15.228115695s","ts":"2022-09-21T16:32:04.605320777Z","uri":"/api/v1/osquery/distributed/read","x_for_ip_addr":"10.121.79.160"}
Sep 21 16:32:04 n107-019-021 fleet[2572833]: {"component":"http","err":"retrieve policy queries: selecting policies for host: context canceled","ip_addr":"10.121.97.143","level":"error","method":"POST","took":"15.229330501s","ts":"2022-09-21T16:32:04.606596027Z","uri":"/api/v1/osquery/distributed/read","x_for_ip_addr":"10.121.97.143"}
Sep 21 16:32:04 n107-019-021 fleet[2572833]: {"component":"http","err":"error in query ingestion || error in query ingestion || error in query ingestion || error in query ingestion || error in query ingestion || error in query ingestion || error in query ingestion || error in query ingestion || error in query ingestion || error in query ingestion || getting app config: selecting app config: context canceled","ingestion-err":"ingest detail query: selecting app config: context canceled","ip_addr":"10.121.41.122","level":"error","method":"POST","took":"23.479465627s","ts":"2022-09-21T16:32:04.606809732Z","uri":"/api/v1/osquery/distributed/write","x_for_ip_addr":"10.121.41.122"}
Sep 21 16:32:04 n107-019-021 fleet[2572833]: {"component":"http","err":"retrieve label queries: selecting label queries for host: context canceled","ip_addr":"10.121.92.165","level":"error","method":"POST","took":"15.219550645s","ts":"2022-09-21T16:32:04.607488476Z","uri":"/api/v1/osquery/distributed/read","x_for_ip_addr":"10.121.92.165"}
Sep 21 16:32:04 n107-019-021 fleet[2572833]: {"component":"http","err":"error in query ingestion || error in query ingestion || error in query ingestion || error in query ingestion || error in query ingestion || getting app config: selecting app config: context canceled","ingestion-err":"ingest detail query: selecting app config: context canceled","ip_addr":"10.121.12.235","level":"error","method":"POST","took":"20.638157224s","ts":"2022-09-21T16:32:04.609116523Z","uri":"/api/v1/osquery/distributed/write","x_for_ip_addr":"10.121.12.235"}
Sep 21 16:32:04 n107-019-021 fleet[2572833]: {"component":"http","err":"authentication error: find host: context canceled","level":"info","path":"/api/v1/osquery/distributed/read","ts":"2022-09-21T16:32:04.610842198Z"}
Sep 21 16:32:04 n107-019-021 fleet[2572833]: {"component":"http","err":"error in query ingestion || error in query ingestion || error in query ingestion || error in query ingestion || error in query ingestion || error in query ingestion || error in query ingestion || error in query ingestion || error in query ingestion || error in query ingestion || error in query ingestion || error in query ingestion || error in query ingestion || getting app config: selecting app config: context canceled","ingestion-err":"ingest detail query: selecting app config: context canceled","ip_addr":"10.121.20.68","level":"error","method":"POST","took":"20.667294618s","ts":"2022-09-21T16:32:04.61105094Z","uri":"/api/v1/osquery/distributed/write","x_for_ip_addr":"10.121.20.68"}
Sep 21 16:32:04 n107-019-021 fleet[2572833]: {"component":"http","err":"authentication error: find host: context canceled","level":"info","path":"/api/v1/osquery/distributed/read","ts":"2022-09-21T16:32:04.611227611Z"}
Sep 21 16:32:04 n107-019-021 fleet[2572833]: {"component":"http","err":"database error: listing hosts in pack: context canceled","ip_addr":"10.121.109.240","level":"error","method":"POST","took":"15.234558552s","ts":"2022-09-21T16:32:04.611759838Z","uri":"/api/v1/osquery/config","x_for_ip_addr":"10.121.109.240"}
Sep 21 16:32:04 n107-019-021 fleet[2572833]: {"component":"http","err":"authentication error: find host: context canceled","level":"info","path":"/api/v1/osquery/distributed/read","ts":"2022-09-21T16:32:04.613172042Z"}
Sep 21 16:32:04 n107-019-021 fleet[2572833]: {"component":"http","err":"database error: listing hosts in pack: context canceled","ip_addr":"10.121.23.99","level":"error","method":"POST","took":"15.237111648s","ts":"2022-09-21T16:32:04.614290359Z","uri":"/api/v1/osquery/config","x_for_ip_addr":"10.121.23.99"}
Sep 21 16:32:04 n107-019-021 fleet[2572833]: {"component":"http","err":"database error: listing hosts in pack: context canceled","ip_addr":"10.121.16.29","level":"error","method":"POST","took":"15.23287781s","ts":"2022-09-21T16:32:04.615687784Z","uri":"/api/v1/osquery/config","x_for_ip_addr":"10.121.16.29"}
5:08 PM
I restarted Fleet, and now I see a lot of errors like:
Sep 21 17:07:30 n107-019-021 fleet[3065443]: {"component":"http","err":"authentication error: find host: dial tcp 127.0.0.1:3306: socket: too many open files","level":"info","path":"/api/v1/osquery/distributed/read","ts":"2022-09-21T17:07:30.063073026Z"}
Sep 21 17:07:30 n107-019-021 fleet[3065443]: {"component":"http","err":"authentication error: find host: dial tcp 127.0.0.1:3306: socket: too many open files","level":"info","path":"/api/v1/osquery/distributed/read","ts":"2022-09-21T17:07:30.063076727Z"}
Sep 21 17:07:30 n107-019-021 fleet[3065443]: {"component":"http","err":"authentication error: find host: dial tcp 127.0.0.1:3306: socket: too many open files","level":"info","path":"/api/v1/osquery/distributed/read","ts":"2022-09-21T17:07:30.063099665Z"}
6:31 PM
@Michal Nicpon is there any update?
Michal Nicpon

09/21/2022, 6:38 PM
too many open files
This can be caused by the ulimit for the user running fleet being set too low. See https://fleetdm.com/docs/deploying/faq#what-do-i-do-about-too-many-open-files-errors
6:41 PM
If you are running fleet as a service using systemd, you would need to increase the limit in the service file, e.g.:
LimitNOFILE=8192
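To confirm which limit the running fleet process actually has (for example before and after changing LimitNOFILE and restarting the service), the kernel exposes it in /proc/<pid>/limits on Linux. A minimal Python sketch that reads the "Max open files" row; the function names are hypothetical, and the PID would be the one shown in the log prefix, such as fleet[3065443].

```python
from pathlib import Path


def max_open_files(limits_text: str):
    """Return (soft, hard) from the 'Max open files' row of /proc/<pid>/limits.

    Values are returned as strings because a limit may be "unlimited".
    Returns None if the row is not present.
    """
    label = "Max open files"
    for line in limits_text.splitlines():
        if line.startswith(label):
            soft, hard = line[len(label):].split()[:2]
            return soft, hard
    return None


def limits_for_pid(pid: int):
    """Read the open-file limits of a running process (Linux only)."""
    return max_open_files(Path(f"/proc/{pid}/limits").read_text())
```

If the soft limit reported for the fleet PID is still the old value after the change, the new setting probably has not been picked up by the service yet.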
Kathy Satterlee

09/21/2022, 6:42 PM
@Michal Nicpon Just for some context from a separate thread, we did find an issue with the vulnerabilities setup. The database path has been added now and we're still seeing some context cancelled errors. I've suggested giving it a little time for that initial load to level out then checking back in to see what things look like.
6:44 PM
@wennan.he, Let's continue the conversation over there to make sure that all of the data is in one spot: https://osquery.slack.com/archives/C01DXJL16D8/p1663449636947269
wennan.he

09/21/2022, 6:46 PM
OK, sure.