Was just trying to update fleedm to 4.67.0 from 64...
# fleet
j
Was just trying to update fleedm to 4.67.0 from 64.2 and got a sql error
Copy code
2025/04/24 14:40:25 FAIL 20250217093329_MigratePendingUpcomingActivities.go (failed to insert pending vpp app installs: Error 1267 (HY000): Illegal mix of collations (utf8mb4_unicode_ci,IMPLICIT) and (utf8mb4_0900_ai_ci,IMPLICIT) for operation '='), quitting migration.
MYSQL_8_0_36.R20250304.00_01
u
Hi @Jason Roberts (osquery)! Reaching out to the team to see how those collations might be mismatched. Do you have the ability to query the fleet database?
j
i do
u
Awesome. Can you run this query for me:select table_name, column_name, character_set_name, collation_name, column_type from columns where table_schema = 'fleet' and collation_name != 'utf8mb4_unicode_ci' and collation_name != 'utf8mb4_bin';
j
Some of your SQL statements failed to execute (Learn more at https://cloud.google.com/sql/docs/mysql/manage-data-using-studio). Details: Error 1146: Table 'fleetdm.columns' doesn't exist
u
Can you try information_schema.columns ?
j
image.png
u
Was your Fleet server completely offline durng the migration?
j
no
u
Okay. That's likely the root of our issue. There were some migrations that were specifically intended to fix missmatched collation. Some data must have been inserted while migrations were running.
j
ah
let me try that again
u
In that query, I'm assuming that the database name is 'fleet'. I noticed in the error that we had 'fleetdm'. Can you confirm that we actually ran that against the fleet database?
select table_name, column_name, character_set_name, collation_name, column_type from columns where table_schema = 'fleet' collation_name = 'utf8mb4_bin';
We should get results for that one.
j
Some of your SQL statements failed to execute (Learn more at https://cloud.google.com/sql/docs/mysql/manage-data-using-studio). Details: Error 1064: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'collation_name = 'utf8mb4_bin'' at line 1
u
I think we missed an AND there.
select table_name, column_name, character_set_name, collation_name, column_type from columns where table_schema = 'fleet' AND collation_name = 'utf8mb4_bin';
Sorry about that, I saw it when I copied and failed to fix it.
j
image.png
u
Thanks for also adding the information_schema. Trying to move too fast apparently.
j
i could just rollback the whole db to yesterday
u
That's definitely the fast and easy option. I've got engineering taking a look at things as well, just in case there is an issue in one of the migrations.
Lat thing I want to do is have you roll back and just hit the error again.
j
rolling back now
with no additional instances
u
Before you start migrations, Can you truncate the tables in question?
j
which tables
u
TRUNCATE upcoming_activitiesTRUNCATE software_install_upcoming_activities
Standing by.
j
Cannot truncate a table referenced in a foreign key constraint (
fleetdm
.
script_upcoming_activities
, CONSTRAINT
fk_script_upcoming_activities_upcoming_activity_id
)
was able to drop the second
u
DELETE FROM upcoming_activities
Should do the trick
j
yup
u
Brilliant.
Last one hopefully (may need tweaking )SELECT DEFAULT_CHARACTER_SET_NAME, DEFAULT_COLLATION_NAME FROM INFORMATION_SCHEMA.SCHEMATA WHERE SCHEMA_NAME = 'fleetdm';
j
image.png
still having the upgrade issue
even after restore
image.png
no other instances are running
err maybe they were running
gcp takes a really long time to deleted instances in a managed autoscale instance group
k
I'm sorry, that's incredibly frustrating.
j
yea it's still stuck
with no other instances
k
Looking.
Was that with another rollback in between?
j
yea
rollback # 3
k
I fugured you were on top of it, but wanted to be sure.
j
did a rollback with no other instances i did not truncate then ran migrations
k
Is rolling back the update and scheduling some time tomorrow to give it another look an option for you? I'd like to get more of the engineering team involved to take a look at things. If not, we can keep digging together.
j
that works
k
Mind if I DM my calendly link?
j
sure
k
We can stick to Slack rather than hopping to Zoom, I just want to make sure I've got dedicated time on my calendar.
j
sent
k
Brilliant. Sorry this wasn't a quick fix! I looked at those migrations and I'm a little stymied about where that miss-match could possibly be coming from.
j
thanks for all the help
im not sure we even use those tables?
we are not using the mdm parts.
k
Interesting. If that's the case, we might just skip importing the data into that unified queue table. I'm working on a hunch at the moment.
Ready whenever you are @Jason Roberts
j
ready
k
Alright. What I'd like to do if it works for you is: 1. Create a new DB backup 2. Take all Fleet containers down 3. Try running database migrations on a single container I'm not necessarily expecting it to pass, just hoping to give it the best chance to do so.
j
i did that yesterday, will try again though
k
Once we get to the failure, we're going to hop in to the db and test some queries individually/figure out if we can force or skip the migration.
j
got it
DEFAULT 2025-04-25T203922.039611274Z [resource.labels.instanceId: fleetdm-corp-g9bc] # Missing migrations: tables=[20250217093329 20250219090511 20250219100000 20250219142401 20250224184002 20250225085436 20250226000000 20250226153445 20250304162702 20250306144233 20250313163430 20250317130944 20250318165922 20250320132525 20250320200000 20250326161930 20250326161931 20250331042354 20250331154206 20250401155831 20250408133233 20250410104321]. DEFAULT 2025-04-25T203922.039739474Z [resource.labels.instanceId: fleetdm-corp-g9bc] #C01DXJL16D8
"2025/04/25 133921 FAIL 20250217093329_MigratePendingUpcomingActivities.go (failed to insert pending vpp app installs: Error 1267 (HY000): Illegal mix of collations (utf8mb4_unicode_ci,IMPLICIT) and (utf8mb4_0900_ai_ci,IMPLICIT) for operation '='), quitting migration.
k
I figured. Here's the first thing I want to try in the database in the current state:
Copy code
INSERT INTO vpp_app_upcoming_activities
	(
		upcoming_activity_id,
		adam_id,
		platform,
		policy_id,
		created_at
	)
SELECT
	ua.id,
	hvi.adam_id,
	hvi.platform,
	hvi.policy_id,
	hvi.created_at
FROM
	upcoming_activities ua
	INNER JOIN host_vpp_software_installs hvi
		ON hvi.command_uuid = ua.execution_id
	INNER JOIN
		nano_view_queue nvq ON nvq.command_uuid = hvi.command_uuid
	LEFT OUTER JOIN vpp_app_upcoming_activities vaua
		ON vaua.upcoming_activity_id = ua.id
WHERE
	ua.activity_type = 'vpp_app_install' AND
	hvi.removed = 0 AND
	nvq.status IS NULL AND
	vaua.upcoming_activity_id IS NULL
j
Some of your SQL statements failed to execute (Learn more at https://cloud.google.com/sql/docs/mysql/manage-data-using-studio). Details: Error 1267: Illegal mix of collations (utf8mb4_unicode_ci,IMPLICIT) and (utf8mb4_0900_ai_ci,IMPLICIT) for operation '='
k
Ty. Now just the select:
Copy code
SELECT
	ua.id,
	hvi.adam_id,
	hvi.platform,
	hvi.policy_id,
	hvi.created_at
FROM
	upcoming_activities ua
	INNER JOIN host_vpp_software_installs hvi
		ON hvi.command_uuid = ua.execution_id
	INNER JOIN
		nano_view_queue nvq ON nvq.command_uuid = hvi.command_uuid
	LEFT OUTER JOIN vpp_app_upcoming_activities vaua
		ON vaua.upcoming_activity_id = ua.id
WHERE
	ua.activity_type = 'vpp_app_install' AND
	hvi.removed = 0 AND
	nvq.status IS NULL AND
	vaua.upcoming_activity_id IS NULL
j
Copy code
Details: Error 1267: Illegal mix of collations (utf8mb4_unicode_ci,IMPLICIT) and (utf8mb4_0900_ai_ci,IMPLICIT) for operation '='
k
TY. Now we can try to narrow down which one is throwing the error (I could not, for the life of me, replicate this):
Copy code
SELECT
	ua.id,
	hvi.adam_id,
	hvi.platform,
	hvi.policy_id,
	hvi.created_at
FROM
	upcoming_activities ua
	INNER JOIN host_vpp_software_installs hvi
		ON hvi.command_uuid = ua.execution_id
	INNER JOIN
		nano_view_queue nvq ON nvq.command_uuid = hvi.command_uuid
WHERE
	ua.activity_type = 'vpp_app_install' AND
	hvi.removed = 0 AND
	nvq.status IS NULL AND
hold. forgot to edit.
j
This fleetdm has been running since before it was kolide.
k
Edited!
j
Copy code
Details: Error 1064: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '' at line 16
Copy code
SELECT
	ua.id,
	hvi.adam_id,
	hvi.platform,
	hvi.policy_id,
	hvi.created_at
FROM
	upcoming_activities ua
	INNER JOIN host_vpp_software_installs hvi
		ON hvi.command_uuid = ua.execution_id
	INNER JOIN
		nano_view_queue nvq ON nvq.command_uuid = hvi.command_uuid
WHERE
	ua.activity_type = 'vpp_app_install' AND
	hvi.removed = 0 AND
	nvq.status IS NULL
Copy code
Details: Error 1267: Illegal mix of collations (utf8mb4_unicode_ci,IMPLICIT) and (utf8mb4_0900_ai_ci,IMPLICIT) for operation '='
k
Copy code
SELECT
	ua.id,
	hvi.adam_id,
	hvi.platform,
	hvi.policy_id,
	hvi.created_at
FROM
	upcoming_activities ua
	INNER JOIN host_vpp_software_installs hvi
		ON hvi.command_uuid = ua.execution_id
	LEFT OUTER JOIN vpp_app_upcoming_activities vaua
		ON vaua.upcoming_activity_id = ua.id
WHERE
	ua.activity_type = 'vpp_app_install' AND
	hvi.removed = 0 AND
	vaua.upcoming_activity_id IS NULL
j
No rows to display
k
I thought that would be the culprit. You said you are not using MDM at all, correct?
j
correct
k
Copy code
CREATE OR REPLACE VIEW nano_view_queue AS
SELECT
    q.id,
    q.created_at,
    q.active,
    q.priority,
    c.command_uuid,
    c.request_type,
    c.command,
    r.updated_at AS result_updated_at,
    r.status,
    r.result
FROM
    nano_enrollment_queue AS q

        INNER JOIN nano_commands AS c
        ON q.command_uuid = c.command_uuid

        LEFT JOIN nano_command_results r
        ON r.command_uuid = q.command_uuid AND r.id = q.id
ORDER BY
    q.priority DESC,
    q.created_at;
j
Copy code
Statement executed successfully
k
Copy code
SELECT
	ua.id,
	hvi.adam_id,
	hvi.platform,
	hvi.policy_id,
	hvi.created_at
FROM
	upcoming_activities ua
	INNER JOIN host_vpp_software_installs hvi
		ON hvi.command_uuid = ua.execution_id
	INNER JOIN
		nano_view_queue nvq ON nvq.command_uuid = hvi.command_uuid
	LEFT OUTER JOIN vpp_app_upcoming_activities vaua
		ON vaua.upcoming_activity_id = ua.id
WHERE
	ua.activity_type = 'vpp_app_install' AND
	hvi.removed = 0 AND
	nvq.status IS NULL AND
	vaua.upcoming_activity_id IS NULL
j
now rows to display
k
Trying the select again to see if we can just move forward with the migration
Perfect. Let's try it again.
The migration.
j
worked
well it's past the part it was stuck on
done
Migrations completed..
k
What a relief.
j
woo ty
is this because we are not using mdm?
k
No, I suspect that there's some weirdness with what the default collation was when the view was originally created. I found a couple of bugs in other versions of MySQL that had the same symptoms we were seeing.
j
ah
k
Everything indicates that the collation is correct (hence our not seeing them when we explicitly checked and corrected collation in the migration that created the unified queue tables in the first place
While this is the only time I've seen this crop up, I am going to submit a bug ticket to perhaps add a migration before this that drops and recreates the view.
j
ah makes sense
k
This one is much older, but perhaps the fix for it was not retroactive: https://bugs.mysql.com/bug.php?id=18227
Thanks again for hanging in there with me, and for coming back to this with fresh eyes and a plan of attack.
j
thanks for helping me with this
k
Of course! Migration issues are always a pretty high priority.