Sal
09/30/2020, 3:44 PMDBA team response:
Essentially this is not db server tuning issue, but an app tuning issue. I understand this needs to be related to the vendor, however as you can see from the Humio links inserts & updates are clashing on the 'hosts' table. The correct approach is to minimize these deadlock situations arising in the first place. The engine will detect and roll back 1 transaction - eg in detail:
------------------------
LATEST DETECTED DEADLOCK
------------------------
2020-09-30 09:46:45 0x7de11cce9700
*** (1) TRANSACTION:
TRANSACTION 35762332832, ACTIVE 0 sec starting index read
mysql tables in use 1, locked 1
LOCK WAIT 3 lock struct(s), heap size 1136, 2 row lock(s)
MySQL thread id 1157846, OS thread handle 139400941860608, query id 14939859382 <http://prod-10-32-7-83.pw1.bcc.somehost.com|prod-10-32-7-83.pw1.bcc.somehost.com> 10.32.7.83 tlsproddb_user updating
UPDATE hosts SET
seen_time = '2020-09-30 13:46:45.647691'
WHERE node_key='6u8qphrXq6FhlWXYZfYWHT2xwBHX+SsN'
*** (1) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 771 page no 7 n bits 88 index PRIMARY of table `tlsproddb`.`hosts` trx id 35762332832 lock_mode X locks rec but not gap waiting
*** (2) TRANSACTION:
TRANSACTION 35762332830, ACTIVE 0 sec updating or deleting
mysql tables in use 1, locked 1
4 lock struct(s), heap size 1136, 3 row lock(s), undo log entries 1
MySQL thread id 1157730, OS thread handle 138405804414720, query id 14939859380 <http://prod-10-32-1-12.pw1.bcc.somehost.com|prod-10-32-1-12.pw1.bcc.somehost.com> 10.32.1.12 tlsproddb_user update
INSERT INTO hosts (
detail_update_time,
label_update_time,
osquery_host_id,
seen_time,
node_key,
enroll_secret_name
) VALUES ('1970-01-02 00:00:00', '1970-01-02 00:00:00', 'DB938000-FD74-11E7-8000-000000000000', '2020-09-30 13:46:45.647665', 'WqAC/7dPeDzkh90b5yb4TIIvvJrgd4ut', 'default')
ON DUPLICATE KEY UPDATE
node_key = VALUES(node_key),
deleted = FALSE
*** (2) HOLDS THE LOCK(S):
RECORD LOCKS space id 771 page no 7 n bits 88 index PRIMARY of table `tlsproddb`.`hosts` trx id 35762332830 lock_mode X locks rec but not gap
*** (2) WAITING FOR THIS LOCK TO BE GRANTED:
RECORD LOCKS space id 771 page no 3489 n bits 248 index idx_host_unique_nodekey of table `tlsproddb`.`hosts` trx id 35762332830 lock_mode X locks rec but not gap waiting
*** WE ROLL BACK TRANSACTION (1)
Any help on this matter would be greatly appreciated.zwass
Adam
10/02/2020, 7:17 AMzwass
select @@global.transaction_isolation, @@global.tx_isolation, @@global.autocommit;
in your db?ts=2020-10-02T19:48:16.182951Z component=http err="authentication error: invalid node key: g95Bpj/mvBjjmYoJqEbCwXK3rt9UJiX4"
near to the method=EnrollAgent
lines in the Fleet stderr logs?Sal
10/02/2020, 7:50 PMzwass
Sal
10/02/2020, 7:54 PM+--------------------------------+-----------------------+---------------------+
| @@global.transaction_isolation | @@global.tx_isolation | @@global.autocommit |
+--------------------------------+-----------------------+---------------------+
| READ-COMMITTED | READ-COMMITTED | 1 |
+--------------------------------+-----------------------+---------------------+
1 row in set, 1 warning (0.00 sec)zwass
Sal
10/02/2020, 8:16 PMzwass
mysql_max_open_conns
, mysql_max_idle_conns
, and mysql_conn_max_lifetime
?Sal
10/08/2020, 1:09 PMkolide_mysql_max_open_conns: "120"
kolide_mysql_max_idle_conns: "120"
mysql> SHOW VARIABLES LIKE 'wait_timeout';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| wait_timeout | 3600 |
+---------------+-------+
1 row in set (0.01 sec)
Max Lifetime is default and or not available in Fleet 3.0zwass
mysql_conn_max_lifetime
goes very far back in Fleet. Based on the commentary in https://github.blog/2020-05-20-three-bugs-in-the-go-mysql-driver/ it may no longer be necessary to set with the changes with the update to the MySQL driver in https://github.com/kolide/fleet/pull/2321. That will go out in the next release which I will be cutting shortly.Sal
10/08/2020, 5:01 PMzwass