3

When I select for update from join multiple tables, matching rows in all joined tables will be locked. But I was wondering what is the locking order, which rows are locked first? Is it determined by where table names are in the join statement?

For example, I have two tables. user and department.

 | user | CREATE TABLE `user` 
( `id` int(11) NOT NULL AUTO_INCREMENT, `name` varchar(255) DEFAULT NULL,
 `gender` tinyint(4) DEFAULT NULL,
 `department_id` int(11) DEFAULT NULL,
 `age` int(11) DEFAULT NULL,
 `status` varchar(63) DEFAULT NULL,
 PRIMARY KEY (`id`),
 KEY `department_id` (`department_id`)
) ENGINE=MyISAM AUTO_INCREMENT=11 DEFAULT CHARSET=latin1 
 department | CREATE TABLE `department` (
 `id` int(11) NOT NULL AUTO_INCREMENT,
 `name` varchar(63) DEFAULT NULL,
 `status` varchar(31) DEFAULT NULL,
 PRIMARY KEY (`id`)
)

I have two queries.

 explain 
select user.name as name, user.age as age, department.name as dept_name
 from user
 INNER JOIN department ON user.department_id = department.id
 where user.status='on'
 AND department.status='operating'
 order by user.age
 limit 1 for update \G;
*************************** 1. row ***************************
 id: 1
 select_type: SIMPLE
 table: user
 type: ALL
possible_keys: department_id
 key: NULL
 key_len: NULL
 ref: NULL
 rows: 10
 Extra: Using where; Using temporary; Using filesort
*************************** 2. row ***************************
 id: 1
 select_type: SIMPLE
 table: department
 type: ALL
possible_keys: PRIMARY
 key: NULL
 key_len: NULL
 ref: NULL
 rows: 5
 Extra: Using where; Using join buffer
2 rows in set (0.00 sec)
 explain 
select user.name as name, user.age as age, department.name as dept_name
 from user
 INNER JOIN department ON user.department_id = department.id
 where user.status='off'
 AND department.status='operating'
 order by user.age
 limit 1 for update \G;
*************************** 1. row ***************************
 id: 1
 select_type: SIMPLE
 table: user
 type: ALL
possible_keys: department_id
 key: NULL
 key_len: NULL
 ref: NULL
 rows: 10
 Extra: Using where; Using temporary; Using filesort
*************************** 2. row ***************************
 id: 1
 select_type: SIMPLE
 table: department
 type: ALL
possible_keys: PRIMARY
 key: NULL
 key_len: NULL
 ref: NULL
 rows: 5
 Extra: Using where; Using join buffer
SELECT user.id as user_id, order.id as order_id, department.id as dept_id 
 FROM user 
 INNER JOIN order ON user.id = order.user_id 
 INNER JOIN department ON department.id = user.department_id 
 WHERE user.age > 30 AND department.name in ('sales') 
 LIMIT 10 FOR UPDATE

Not clear about the result of explain. It does not show which rows are locked.

Are rows in table user locked first, and then order and department? Or the matching row locked first?

Rick James
80.7k5 gold badges52 silver badges119 bronze badges
asked Mar 30, 2017 at 23:42
8
  • 1
    At least the matching rows will be locked. You can learn something about the order of operations using EXPLAIN SELECT ... which will show you how the optimizer has reordered the tables (if it has)... but since this is (arguably) an implementation detail that SQL, theoretically, shields the user from and makes irrelevant... why do you ask? Commented Mar 31, 2017 at 2:52
  • 2
    @Michael-sqlbot, the reason I ask is that I recently got a race condition with deadlock. Specifically, there are two transactions of very similar queries, which select and join the same tables, but the where clause is different (e.g., the different department.name in above example). The deadlock dump shows that one transaction holds user lock and waits for department lock while another transaction holds department lock and waits for user lock. So, I guess the locking order is not by tables, but by the matching records. Maybe DB first locks the matching rows, and then apply gap lock on others. Commented Mar 31, 2017 at 20:24
  • Please provide EXPLAINs for both queries. Commented Apr 2, 2017 at 1:05
  • And provide SHOW CREATE TABLE for all three tables. Commented Apr 2, 2017 at 1:05
  • Without an ORDER BY, you are getting a random 10 rows; this may be part of the problem. Commented Apr 2, 2017 at 1:06

1 Answer 1

1

It is quite natural for the Optimizer might pick different query plans for those two queries. Even if the EXPLAINs look similar today, they could be different tomorrow.

Regardless, you should plan on deadlocks; it is often futile to try to eliminate all of them. When you get a deadlock, replay the entire transaction; it is very likely to succeed the second time.

Yes, there can be edge cases where two "independent" queries can collide. This is probably because of edge cases where gap locking collided - or something like that.

There might be cryptic clues in SHOW ENGINE INNODB STATUS;

This index on user might decrease the frequency of deadlocks:

INDEX(status, age)

And it is likely to speed up the queries when you have a lot of rows.

answered Apr 18, 2017 at 21:24

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.