I have two tables, levels and users_favorites:

levels:

+--------------------+--------------+------+-----+---------+-------+
| Field              | Type         | Null | Key | Default | Extra |
+--------------------+--------------+------+-----+---------+-------+
| id                 | int(9)       | NO   | PRI | NULL    |       |
| user_id            | int(10)      | NO   | MUL | NULL    |       |
| level_name         | varchar(20)  | NO   |     | NULL    |       |
| user_name          | varchar(45)  | NO   |     | NULL    |       |
| rating             | decimal(3,2) | NO   |     | 2.50    |       |
| votes              | int(5)       | NO   |     | 0       |       |
| plays              | int(5)       | NO   |     | 0       |       |
| date_published     | date         | NO   | MUL | NULL    |       |
| user_comment       | varchar(255) | YES  |     | NULL    |       |
| playable_character | int(2)       | NO   | MUL | 1       |       |
| is_featured        | tinyint(1)   | NO   | MUL | 0       |       |
+--------------------+--------------+------+-----+---------+-------+

users_favorites:

+----------+--------+------+-----+---------+-------+
| Field    | Type   | Null | Key | Default | Extra |
+----------+--------+------+-----+---------+-------+
| user_id  | int(8) | NO   | PRI | NULL    |       |
| level_id | int(8) | NO   | PRI | NULL    |       |
+----------+--------+------+-----+---------+-------+

I have my local dev environment and the production servers. This query:

SELECT id, level_name, date_published, rating
FROM levels
WHERE id IN (SELECT level_id FROM users_favorites WHERE user_id = 2);

runs very fast locally (around 0.0x seconds) and very slowly on production (~15 seconds). The EXPLAIN outputs are different. On local:

id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE users_favorites ref uniq_user_level,idx_user idx_user 4 const 21 "Using index"
1 SIMPLE levels eq_ref PRIMARY PRIMARY 4 users_favorites.level_id 1 "Using where"

And on production:

id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY levels ALL NULL NULL NULL NULL 3368988 "Using where"
2 "DEPENDENT SUBQUERY" users_favorites eq_ref uniq_user_level,idx_user uniq_user_level 8 const,func 1 "Using index"

I know the data is the same because it was imported and exported from the same schema. I've run OPTIMIZE, made sure the indexes are the same, and tried forcing the indexes. Nothing worked.

The only difference I can spot is the version of MySQL: locally it's 5.6.10, on production it's 5.5.34-log. If that's the reason, I'll upgrade, but I'm wondering if there could be some other reason. Or is there a way to phrase the query so it always reduces by the subquery first, as it does locally: 21 rows instead of 3368988?

TIA

asked Nov 27, 2013 at 23:22
  • Before I post an answer, one question: Is there a reason this is written as a subquery when it seems like it would be better written as a join, and presumably less ambiguous to the optimizer in 5.5? SELECT l.id, l.level_name, l.date_published, l.rating FROM levels l JOIN users_favorites uf ON uf.level_id = l.id WHERE uf.user_id = 2 ... if that is as logically equivalent as it seems like it would be, how does that query look with EXPLAIN on 5.5? (Okay, that might be two questions). Commented Nov 28, 2013 at 2:09
  • @sqlbot, "Is there a reason this is written as a subquery when it seems like it would be better written as a join." Yes, the answer being that apparently I'm not that great at SQL. I tried the query as you suggested and it gave the same EXPLAIN on 5.5 and 5.6 and was identical to my subquery's EXPLAIN on 5.5. So, in short your JOIN gave the result I needed. Commented Dec 2, 2013 at 15:19

1 Answer

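Just make a simple join. Subqueries often do not give the best plan in MySQL: as the production EXPLAIN shows, 5.5 runs this IN (...) as a dependent subquery for every row of levels.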

EXPLAIN SELECT l.id, l.level_name, l.date_published, l.rating
FROM levels AS l
INNER JOIN users_favorites AS uf
ON uf.level_id = l.id
WHERE uf.user_id = 2;
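
If the optimizer still starts from the wrong table, one option is MySQL's STRAIGHT_JOIN, which joins the tables in the order they are listed. This is only a minimal sketch using the table and column names from the question; whether it helps on a given server is an assumption to confirm with EXPLAIN.

```sql
-- STRAIGHT_JOIN makes MySQL read the left table (users_favorites) first,
-- so the favorites of user 2 are looked up before levels is touched.
EXPLAIN SELECT l.id, l.level_name, l.date_published, l.rating
FROM users_favorites AS uf
STRAIGHT_JOIN levels AS l
    ON l.id = uf.level_id
WHERE uf.user_id = 2;
```

Forcing the order removes the optimizer's freedom to pick a better plan later, so a plain join is usually the first thing to try.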
jcolebrand
answered Nov 28, 2013 at 15:57
  • Thanks. What would be the motivation of the INNER join? Without it, I get very slightly faster results (fetch time is always zero without INNER, whereas with it, I get a small fetch time). Commented Dec 2, 2013 at 15:33
  • To compare the speed of the actual queries, you need to bypass the query cache. To profile, at least use SELECT SQL_NO_CACHE * FROM xxx.... Please read up on "performance of sub-query in MySQL" and you will find the answer. Commented Dec 2, 2013 at 16:15
  • I used the SELECT SQL_NO_CACHE. The query without the INNER on the JOIN is still consistently a little faster. Commented Dec 2, 2013 at 22:54
  • Next step :) Run the query EXPLAIN EXTENDED SELECT * .... and then SHOW WARNINGS. That will show you the actual query running under the hood. Check how your subquery and join queries are restructured. Commented Dec 3, 2013 at 2:48
  • Thanks, an interesting study. In the end I actually found no difference if INNER was used. Under the hood they were identical. Commented Dec 3, 2013 at 15:55
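
The profiling steps from the comments above can be sketched as follows. This assumes MySQL 5.5 or 5.6, where both the query cache and EXPLAIN EXTENDED are available; the join uses the table and column names from the question.

```sql
-- 1. Time the join while bypassing the query cache.
SELECT SQL_NO_CACHE l.id, l.level_name, l.date_published, l.rating
FROM levels AS l
JOIN users_favorites AS uf ON uf.level_id = l.id
WHERE uf.user_id = 2;

-- 2. Ask the optimizer for its plan for the subquery form ...
EXPLAIN EXTENDED
SELECT id, level_name, date_published, rating
FROM levels
WHERE id IN (SELECT level_id FROM users_favorites WHERE user_id = 2);

-- 3. ... then print the statement as the optimizer rewrote it.
SHOW WARNINGS;
```

Repeating steps 2 and 3 for the join form lets you compare the two rewritten statements side by side.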
