# fleet
I have not dug too deeply into this, but the Fleet docs suggest MySQL 5.7. I have noticed performance issues around labels and policies on 5.7, but they're less prominent on an 8.0 test instance I upgraded to. Is the Fleet team developing against 8 or 5.7? I have yet to look through the raw queries, etc.
we are developing against 5.7, support for 8 is not technically official yet, but it should work. Could you share a bit more about your setup, and the performance differences you are noticing?
we are working on improving performance there in different ways, the more information we have the better
So for 1000 hosts we're seeing an additional 10-20% CPU utilisation per device policy added, to the point where it no longer returns after 5 policies are added. On a mirrored 8 setup we're at least seeing a response on like hardware.
This is timeouts against
what version of fleet are you running?
4.3.1? could you tell me the output of the following sql query:
select count(*) from policy_membership_history
Right now I can’t but I will once the DB begins to respond.
also, could you tell me what type of resources do you have for the db?
GCP, 16 GB RAM, 4 vCPU.
Fleet prior to v3 used to run with the same host count on a smaller instance with 1 vCPU and 4 GB RAM; however, it was increased to handle the 3.3 migration issue.
Note that I do expect additional DB load due to increased queries with new features, so that is not a complaint.
in the meantime, if I'm understanding you correctly, what fails is when you navigate to the list of policies, correct?
Timeout on that endpoint with an error page to raise a GH issue.
yeah, new features are tricky that way
I have an idea of what might be failing, once I understand the scale of it based on your count, I can start thinking about strategies
I'm sorry you're facing this issue
These spikes out of context are trying to get results from compliance checks
And don't be sorry, things happen; we YOLO latest to discover bugs and give feedback.
we might be running and storing policies too much: https://github.com/fleetdm/fleet/issues/2240
you might need to disable policies while we address this one, we have an idea of what it might be
Yeah, at this time I think I need to manually edit the DB to remove the policies, as it's not a runtime flag.
delete from policies
should do the trick, and you can also clear the history with:
delete from policy_membership_history
One item I am going to look at is whether the number of running pods impacts performance.
we'll publish a new version with a fix for this in the next day or two
select count(*) from policy_membership_history
may return an exceptionally high count, as it currently hangs until timeout.
SELECT table_rows
FROM information_schema.tables
WHERE table_name='policy_membership_history';
+------------+
| table_rows |
+------------+
|   36844359 |
+------------+
1 row in set (0.04 sec)
For 1000 hosts & 1 configured policy over 24 hours I don’t think this is the norm.
yup, I'm working on a fix right now, if you run those deletes, it'll remove the policies and the history until we have the patch out
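For a sense of scale, the row count reported in the thread can be turned into a per-host write rate. This is only a rough sketch: it assumes the rows are spread evenly across the 1000 hosts and the single policy, and that they all accumulated within the 24 hours mentioned. Note also that `table_rows` in `information_schema.tables` is an estimate for InnoDB tables, so the real count may differ.

```python
# Figures quoted in the thread. table_rows is an InnoDB estimate, not an exact count.
table_rows = 36_844_359
hosts = 1_000
policies = 1
hours = 24

# Assumption: rows are distributed evenly across hosts and policies.
rows_per_host = table_rows / (hosts * policies)
rows_per_host_per_hour = rows_per_host / hours
seconds_between_rows = 3600 / rows_per_host_per_hour

print(f"rows per host: {rows_per_host:,.0f}")
print(f"rows per host per hour: {rows_per_host_per_hour:,.0f}")
print(f"seconds between rows per host: {seconds_between_rows:.1f}")
```

Under those assumptions, each host would be adding roughly 1,500 history rows per hour for a single policy, or about one row every two to three seconds, which is consistent with the suspicion that policies are being run and stored too often.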