I have not dug too deeply into this Fleet docs suggest mysql osquery #fleet


Gavin

09/27/2021, 4:59 PM

I have not dug too deeply into this. Fleet docs suggest MySQL 5.7. I have noticed performance issues around labels and policies on 5.7, but they’re less prominent on an 8.0 test instance I upgraded to. Is the Fleet team developing against 8 or 5.7? I have yet to look through the raw queries etc.

Tomas Touceda

09/27/2021, 5:01 PM

we are developing against 5.7, support for 8 is not technically official yet, but it should work. Could you share a bit more about your setup, and the performance differences you are noticing?

Tomas Touceda

09/27/2021, 5:01 PM

we are working on improving performance there in different ways, the more information we have the better

Gavin

09/27/2021, 5:03 PM

So for 1000 hosts we’re seeing an additional 10-20% CPU utilisation per Device Policy added, to the point where it no longer returns once 5 policies are added. On a mirrored 8 setup on like hardware we’re at least seeing a response.

Gavin

09/27/2021, 5:03 PM

These are timeouts against policies/manage

Tomas Touceda

09/27/2021, 5:03 PM

what version of fleet are you running?

Gavin

09/27/2021, 5:03 PM

latest.

Tomas Touceda

09/27/2021, 5:04 PM

4.3.1? could you tell me the output of the following sql query:


select count(*) from policy_membership_history

Gavin

09/27/2021, 5:04 PM

Right now I can’t but I will once the DB begins to respond.

Tomas Touceda

09/27/2021, 5:05 PM

also, could you tell me what type of resources do you have for the db?

Gavin

09/27/2021, 5:05 PM

GCP, 16 GB RAM, 4 vCPU

Gavin

09/27/2021, 5:06 PM

Fleet prior to v3 used to run with the same host count on a smaller instance of 1 vCPU, 4 GB RAM; however, it was increased to handle the 3.3 migration issue.

Gavin

09/27/2021, 5:07 PM

Note I do expect additional DB load due to increased queries with new features, so that is not a complaint.

Tomas Touceda

09/27/2021, 5:07 PM

in the meantime, if I'm understanding you correctly, what fails is when you navigate to the list of policies, correct?

Gavin

09/27/2021, 5:07 PM

Yes

Gavin

09/27/2021, 5:07 PM

Timeout on that endpoint with an error page to raise a GH issue.

Tomas Touceda

09/27/2021, 5:07 PM

yeah, new features are tricky that way

Tomas Touceda

09/27/2021, 5:07 PM

I have an idea of what might be failing, once I understand the scale of it based on your count, I can start thinking about strategies

Tomas Touceda

09/27/2021, 5:08 PM

I'm sorry you're facing this issue

Gavin

09/27/2021, 5:08 PM

These spikes out of context are trying to get results from compliance checks

Gavin

09/27/2021, 5:08 PM

And don’t be sorry, things happen. We yolo latest to discover bugs and give feedback.

🙏🏽 1

Tomas Touceda

09/27/2021, 5:09 PM

we might be running and storing policies too much: https://github.com/fleetdm/fleet/issues/2240

Tomas Touceda

09/27/2021, 5:10 PM

you might need to disable policies while we address this one, we have an idea of what it might be

👀 1

Gavin

09/27/2021, 5:11 PM

Yeah, at this time I think I need to manually edit the DB to remove the policies, as it’s not a runtime flag.

Tomas Touceda

09/27/2021, 5:13 PM


delete from policies

should do the trick, and you can also clear the history with:


delete from policy_membership_history

Gavin

09/27/2021, 5:13 PM

One item I am going to look at is whether the number of running pods impacts performance.

Tomas Touceda

09/27/2021, 5:27 PM

we'll publish a new version with a fix for this in the next day or two

Gavin

09/27/2021, 5:52 PM

select count(*) from policy_membership_history

may have an exceptionally high number of results in it as it’s currently hung until timeout.

Gavin

09/27/2021, 5:54 PM


SELECT table_rows
FROM information_schema.tables
WHERE table_name='policy_membership_history';
+------------+
| table_rows |
+------------+
|   36844359 |
+------------+
1 row in set (0.04 sec)

Gavin

09/27/2021, 5:55 PM

For 1000 hosts & 1 configured policy over 24 hours I don’t think this is the norm.
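
To put that figure in perspective, here is a rough sketch of the arithmetic, using only the numbers reported in this thread (36,844,359 rows, 1,000 hosts, 24 hours). Two caveats: `table_rows` in `information_schema.tables` is only an estimate for InnoDB tables, and scoping the lookup with `table_schema = DATABASE()` assumes the Fleet database is the currently selected one.

```sql
-- table_rows is an approximation for InnoDB; scope by schema so a
-- same-named table in another database is not picked up.
-- Assumes the Fleet database is the currently selected database.
SELECT table_rows
FROM information_schema.tables
WHERE table_schema = DATABASE()
  AND table_name = 'policy_membership_history';

-- Write rate implied by the figures reported in the thread:
-- 36,844,359 rows across 1,000 hosts over 24 hours.
SELECT 36844359 / 1000 / 24            AS rows_per_host_per_hour,
       (1000 * 24 * 3600) / 36844359   AS seconds_between_rows_per_host;
```

That works out to roughly 1,535 rows per host per hour, or about one history row every 2.3 seconds per host for a single policy.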

Tomas Touceda

09/27/2021, 5:57 PM

yup, I'm working on a fix right now, if you run those deletes, it'll remove the policies and the history until we have the patch out
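
If the history table has grown very large, a single unbounded `delete` can run as one long transaction. A common alternative, sketched here as an assumption rather than something Fleet prescribes, is to delete in bounded chunks and repeat until nothing is left. The batch size of 10000 is an arbitrary illustrative value.

```sql
-- Delete history in bounded chunks so each statement is a short transaction.
-- The LIMIT value is illustrative; tune it to what the instance tolerates.
DELETE FROM policy_membership_history LIMIT 10000;

-- ROW_COUNT() reports how many rows the previous statement removed.
-- Repeat the DELETE above until this returns 0.
SELECT ROW_COUNT();
```

Once the history is empty, the `delete from policies` statement mentioned earlier in the thread removes the policies themselves.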

Tomas Touceda

09/27/2021, 6:18 PM

https://github.com/fleetdm/fleet/pull/2246
