# fleet
m
so it looks like my scheduled/automated queries stopped working sometime after upgrading to v4.41.1, then suddenly i started seeing my results going to the mysql binlogs instead of the fleet server's result.log. not sure if anyone else has seen anything like this. The binlogs ended up filling my server's disk. I had to expand the disk just to authenticate and look at what happened config-wise. since then I've gone into mysql and disabled binlogs to keep the disk from being consumed again, but i'm not sure where to go from here to get my scheduled queries sending results back to the result.log on the fleet server
k
Hi @mason kemmerer! Are you seeing any errors related to logging or query ingestion in your Fleet server logs? Do you have any queries that are scheduled outside of Fleet (set up manually in osquery config)?
m
not anymore, once i went to 4.41.1 i disabled the legacy packs at the URL: myfleetsvr.com/packs/manage
i can also see my queries are running on my hosts when i run a live query:
SELECT * from osquery_schedule
and can see the results going to the query_results table in mysql with current timestamps... just ever since this upgrade the result.log has not been updated, when i reviewed the mysql binlogs it looked like result data that was base64 encoded
happy to run any additional queries with osqueryd or on my fleet server to troubleshoot, in case there are errors somewhere that i am missing
fwiw i am monitoring Darshal Shah's thread as well (since it seemed we have a similar issue) and am also returning no results when running the same query:
SELECT * FROM osquery_schedule WHERE query LIKE 'select * from rpm_packages'
i also upgraded to v4.43.0 if that helps with the troubleshooting
k
Are there any errors in the Fleet server logs?
Can you confirm that the query has a frequency set and that automations are enabled for that query?
m
Yes, I can confirm using the rpm_packages query as an example. It is set to run daily and automations are on. what would be the best way to get the fleet server logs you are looking for?
k
That depends on how you have Fleet deployed. The server logs are sent to stderr and stdout, so wherever those go in your environment.
m
im using docker based on this project: https://github.com/CptOfEvilMinions/FleetDM-Automation
ah, i think i figured it out: when i tail the docker logs for the fleet-webgui container, they are riddled with these:
{
  "component": "http",
  "err": "error writing result logs (if the logging destination is down, you can reduce frequency/size of osquery logs by increasing logger_tls_period and decreasing logger_tls_max_lines): writing log: can't rename log file: rename /var/log/osquery/result.log /var/log/osquery/result-2024-01-13T00-45-30.101.log: permission denied",
  "ip_addr": "172.18.38.205",
  "level": "error",
  "method": "POST",
  "took": "231.858647ms",
  "ts": "2024-01-13T00:45:30.101966418Z",
  "uri": "/api/v1/osquery/log",
  "uuid": "c8c7e695-684d-414c-8973-ece403f823c5",
  "x_for_ip_addr": "172.18.38.205"
}
i have that path specified in my fleetdm.yml file which i point to when preparing the database
filesystem:
  status_log_file: /var/log/osquery/status.log
  result_log_file: /var/log/osquery/result.log
  enable_log_rotation: true
did something change wrt how logs are written in one of the latest updates to fleet or osquery?
the Dockerfile-Fleetdm has result.log and status.log specified, so not sure where the result-YYYY-MM-DDTHH-MM-SS.mmm.log name came from
### Setup logging directory ###
RUN mkdir /var/log/osquery && \
	chown root:root /var/log/osquery && \
	touch /var/log/osquery/status.log && \
	touch /var/log/osquery/result.log && \
	chown fleet:fleet /var/log/osquery/result.log && \
	chown fleet:fleet /var/log/osquery/status.log
verified nothing has changed perms wise in the fleet-webgui container:
/var/log/osquery $ ls -l
total 536648
-rw-r--r--    1 root     root      25176131 Nov  7 19:40 masonk@10.0.0.5
-rw-r--r--    1 fleet    fleet    524287805 Jan  2 17:30 result.log
-rw-r--r--    1 fleet    fleet        52335 Jan  5 16:47 status.log
is it perhaps that the result.log file finally reached its size limit, so fleet tried to rotate it (since i have log rotation enabled above) by renaming the file with a timestamp, and the rename failed? perhaps this even stems from when my disk had reached capacity?
as a test.. i went into the fleet-webgui container and ran:
cp result.log result.log.bk
then
touch result.log
and
chown fleet:fleet result.log
within moments the "new" result.log filled up, so I believe I am on the right track:
/var/log/osquery # ls -l
total 1048652
-rw-r--r--    1 root     root      25176131 Nov  7 19:40 masonk@10.0.0.5
-rw-r--r--    1 fleet    fleet    524287744 Jan 13 01:55 result.log
-rw-r--r--    1 root     root     524287805 Jan 13 01:50 result.log.bk
-rw-r--r--    1 fleet    fleet        52951 Jan 13 01:53 status.log
i swapped the owner of the osquery folder from root to fleet and now it is able to write the timestamped result.logs...
/var/log/osquery # ls -l
total 1153456
-rw-r--r--  1 root   root   25176131 Nov 7 19:40 mkemmerer@10.64.121.211
-rw-r--r--  1 fleet  fleet  524287744 Jan 13 01:55 result-2024-01-13T02-06-40.494.log
-rw-r--r--  1 fleet  fleet  107312777 Jan 13 02:07 result.log
-rw-r--r--  1 root   root   524287805 Jan 13 01:50 result.log.bk
-rw-r--r--  1 fleet  fleet    52951 Jan 13 01:53 status.log
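A note on why the ownership change helps: rotation renames result.log, and a rename needs write permission on the directory that holds the file, not only on the file itself. A minimal sketch of a pre-flight check follows. `check_rotation_perms` is a hypothetical helper name, not part of Fleet or osquery, and it only tests the user that runs it, so it would need to be run as the fleet user inside the container to be meaningful.

```shell
#!/bin/sh
# Sketch (hypothetical helper, not a Fleet command): report whether the
# current user could rename files inside a log directory.
# Renaming result.log requires write permission on the directory itself.
check_rotation_perms() {
    dir="$1"
    if [ ! -d "$dir" ]; then
        echo "missing: $dir"
        return 2
    fi
    if [ -w "$dir" ]; then
        echo "ok: $dir is writable by uid $(id -u)"
        return 0
    fi
    echo "blocked: $dir is not writable by uid $(id -u)"
    return 1
}

# Example (path taken from the thread; run as the fleet user):
#   check_rotation_perms /var/log/osquery
```

In the Dockerfile quoted earlier in the thread, the directory is created with `chown root:root`, which matches the "permission denied" on rename even though result.log itself belongs to fleet.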
@Kathy Satterlee thank you for the direction (it was the eureka moment i needed). i guess my final question is: does fleet have any configuration to remove / delete rotated result logs of a certain age, or is that something administrators/customers are expected to address?
i presume the default is 28 days, since I do not have
filesystem:
   max_age: 0
specified in my fleetdm.yml config file?
or would the default value of 3 for max_backups cover me here, so these rotated logs don't just consume the entire disk?
filesystem:
   max_backups: 0
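Independent of whatever Fleet's own max_age / max_backups settings end up doing, rotated files could also be pruned from outside, for example from a cron job on the host. The sketch below is one way to do that under stated assumptions: `prune_rotated_logs` is a hypothetical helper, the `result-*.log` pattern is inferred from the rotated file name shown in the listing above, and the 28-day figure in the example is only the value presumed in the question, not a confirmed default. It relies on `find` supporting `-maxdepth` and `-delete` (GNU and BSD find do).

```shell
#!/bin/sh
# Sketch (hypothetical helper): delete rotated result logs older than N days.
# Only files named result-*.log are matched, so the live result.log and
# manual copies such as result.log.bk are left alone.
prune_rotated_logs() {
    dir="$1"
    days="$2"
    # -mtime +N matches files last modified more than N days ago.
    find "$dir" -maxdepth 1 -type f -name 'result-*.log' \
        -mtime +"$days" -print -delete
}

# Example (path from the thread, retention value is an assumption):
#   prune_rotated_logs /var/log/osquery 28
```

The function prints each file it removes, so the output can be appended to a log for auditing.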