When I select for update from join multiple tables, matching rows in all joined tables will be locked. But I was wondering what is the locking order, which rows are locked first? Is it determined by where table names are in the join statement?
For example, I have two tables. user and department.
| user | CREATE TABLE `user`
( `id` int(11) NOT NULL AUTO_INCREMENT, `name` varchar(255) DEFAULT NULL,
`gender` tinyint(4) DEFAULT NULL,
`department_id` int(11) DEFAULT NULL,
`age` int(11) DEFAULT NULL,
`status` varchar(63) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `department_id` (`department_id`)
) ENGINE=MyISAM AUTO_INCREMENT=11 DEFAULT CHARSET=latin1
department | CREATE TABLE `department` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(63) DEFAULT NULL,
`status` varchar(31) DEFAULT NULL,
PRIMARY KEY (`id`)
)
I have two queries.
explain
select user.name as name, user.age as age, department.name as dept_name
from user
INNER JOIN department ON user.department_id = department.id
where user.status='on'
AND department.status='operating'
order by user.age
limit 1 for update \G;
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: user
type: ALL
possible_keys: department_id
key: NULL
key_len: NULL
ref: NULL
rows: 10
Extra: Using where; Using temporary; Using filesort
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: department
type: ALL
possible_keys: PRIMARY
key: NULL
key_len: NULL
ref: NULL
rows: 5
Extra: Using where; Using join buffer
2 rows in set (0.00 sec)
explain
select user.name as name, user.age as age, department.name as dept_name
from user
INNER JOIN department ON user.department_id = department.id
where user.status='off'
AND department.status='operating'
order by user.age
limit 1 for update \G;
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: user
type: ALL
possible_keys: department_id
key: NULL
key_len: NULL
ref: NULL
rows: 10
Extra: Using where; Using temporary; Using filesort
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: department
type: ALL
possible_keys: PRIMARY
key: NULL
key_len: NULL
ref: NULL
rows: 5
Extra: Using where; Using join buffer
SELECT user.id as user_id, order.id as order_id, department.id as dept_id
FROM user
INNER JOIN order ON user.id = order.user_id
INNER JOIN department ON department.id = user.department_id
WHERE user.age > 30 AND department.name in ('sales')
LIMIT 10 FOR UPDATE
Not clear about the result of explain. It does not show which rows are locked.
Are rows in table user locked first, and then order and department? Or the matching row locked first?
1 Answer 1
It is quite natural for the Optimizer might pick different query plans for those two queries. Even if the EXPLAINs
look similar today, they could be different tomorrow.
Regardless, you should plan on deadlocks; it is often futile to try to eliminate all of them. When you get a deadlock, replay the entire transaction; it is very likely to succeed the second time.
Yes, there can be edge cases where two "independent" queries can collide. This is probably because of edge cases where gap locking collided - or something like that.
There might be cryptic clues in SHOW ENGINE INNODB STATUS;
This index on user
might decrease the frequency of deadlocks:
INDEX(status, age)
And it is likely to speed up the queries when you have a lot of rows.
EXPLAIN SELECT ...
which will show you how the optimizer has reordered the tables (if it has)... but since this is (arguably) an implementation detail that SQL, theoretically, shields the user from and makes irrelevant... why do you ask?EXPLAINs
for both queries.SHOW CREATE TABLE
for all three tables.ORDER BY
, you are getting a random 10 rows; this may be part of the problem.