hey all - I am working on a Grafana dashboard for Fleet infrastructure and plan to share it once it's complete. What key metrics do you think should be tracked from a health & performance perspective?
clong
05/01/2019, 3:41 AM
Hoo boy, I have a bunch:
1. # of osqueryd daemon restarts
2. # of “system resource exceeded” events
3. # of “system memory exceeded” events
4. # of “event table overflow” events
5. Scheduled Query Failed events
6. RocksDB warnings
7. Extension errors/warnings
➕ 1
probably more, but that's all I have off the top of my head
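A rough sketch of how several of the log-based items above could be turned into counters for a dashboard. Everything here is an assumption for illustration: the `PATTERNS` names and regexes are hypothetical and simply reuse the phrases from the list, not verified osqueryd log text, so they would need adjusting against real logs. Daemon restarts are not covered, since those would more likely come from the process supervisor than from log matching.

```python
import re

# Hypothetical patterns: phrases are taken from the list above,
# not from verified osqueryd log output. Adjust against real logs.
PATTERNS = {
    "resource_exceeded": re.compile(r"system resource exceeded", re.I),
    "memory_exceeded": re.compile(r"system memory exceeded", re.I),
    "event_table_overflow": re.compile(r"event table overflow", re.I),
    "scheduled_query_failed": re.compile(r"scheduled query failed", re.I),
    "rocksdb_warning": re.compile(r"rocksdb", re.I),
    "extension_issue": re.compile(r"extension.*(error|warning)", re.I),
}


def count_events(lines):
    """Count how many log lines match each pattern."""
    counts = {name: 0 for name in PATTERNS}
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


if __name__ == "__main__":
    # Made-up sample lines, only to show the shape of the output.
    sample = [
        "osqueryd: system memory exceeded",
        "RocksDB: background error",
        "all fine here",
    ]
    print(count_events(sample))
```

The resulting counts could then be exported per host to whatever datasource backs the Grafana dashboard.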