Can anyone help me optimise this query? I have the following table:
cdu_user_progress:
--------------------------------------------------------------
|id |uid |lesson_id |game_id |date |score |
--------------------------------------------------------------
For each user, I'm trying to obtain the difference between the best and first scores for a particular game_id for a particular lesson_id, and order the results by that difference ('progress' in my query):
SELECT ms.uid AS id, ms.max_score - fs.first_score AS progress
FROM (
    SELECT up.uid, MAX(CASE WHEN game_id = 3 THEN score ELSE NULL END) AS max_score
    FROM cdu_user_progress up
    WHERE (up.uid IN ('1671', '1672', '1673', '1674', '1675', '1676', '1679', '1716', '1725', '1726', '1937', '1964', '1996', '2062', '2065', '2066', '2085', '2086')) AND (up.lesson_id = '65') AND (up.score > '-1')
    GROUP BY up.uid
) ms
LEFT JOIN (
    SELECT up.uid, up.score AS first_score
    FROM cdu_user_progress up
    INNER JOIN (
        SELECT up.uid, MIN(CASE WHEN game_id = 3 THEN date ELSE NULL END) AS first_date
        FROM cdu_user_progress up
        WHERE (up.uid IN ('1671', '1672', '1673', '1674', '1675', '1676', '1679', '1716', '1725', '1726', '1937', '1964', '1996', '2062', '2065', '2066', '2085', '2086')) AND (up.lesson_id = '65') AND (up.score > '-1')
        GROUP BY up.uid
    ) fd ON fd.uid = up.uid AND fd.first_date = up.date
) fs ON fs.uid = ms.uid
ORDER BY progress DESC
Any help would be most appreciated!
1 Answer
SELECT ms.uid AS id
No effect on performance, but this seems weird. You have a column in that table named id, but you are actually returning uid as id. Why not just keep it as uid?
LEFT JOIN (
You know that the user has a score, because you found it in the first subquery. Therefore, you don't need a LEFT JOIN. A regular INNER JOIN is fine and may be faster.
CASE WHEN game_id = 3 THEN date ELSE NULL END
It's not clear to me what this buys you. Why not just put up.game_id = 3 in the WHERE clause? Wrapped in a CASE, the game_id test is hidden from the optimizer, so it can't use game_id to narrow the rows it reads. Same thing for the max_score query.
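One thing to check before moving the filter: the two forms are not equivalent for a user who has rows for the lesson but none for game 3. A minimal sketch, using Python's sqlite3 module and a few made-up rows (the data and the in-memory database are assumptions for illustration, not the real table contents):

```python
import sqlite3

# In-memory stand-in for the real table; the rows below are invented.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cdu_user_progress ("
    "id INTEGER PRIMARY KEY, uid INTEGER, lesson_id INTEGER, "
    "game_id INTEGER, date TEXT, score INTEGER)"
)
rows = [
    (1671, 65, 3, "2015-01-01", 10),
    (1671, 65, 3, "2015-01-02", 40),
    (1672, 65, 5, "2015-01-01", 70),  # played lesson 65, but never game 3
]
conn.executemany(
    "INSERT INTO cdu_user_progress (uid, lesson_id, game_id, date, score) "
    "VALUES (?, ?, ?, ?, ?)",
    rows,
)

# CASE form: game_id is tested inside the aggregate, after grouping.
case_form = conn.execute(
    "SELECT uid, MAX(CASE WHEN game_id = 3 THEN score ELSE NULL END) "
    "FROM cdu_user_progress WHERE lesson_id = 65 AND score > -1 "
    "GROUP BY uid ORDER BY uid"
).fetchall()

# WHERE form: rows for other games are discarded before grouping.
where_form = conn.execute(
    "SELECT uid, MAX(score) FROM cdu_user_progress "
    "WHERE lesson_id = 65 AND score > -1 AND game_id = 3 "
    "GROUP BY uid ORDER BY uid"
).fetchall()

print(case_form)   # [(1671, 40), (1672, None)]
print(where_form)  # [(1671, 40)]
```

So the WHERE form drops user 1672 entirely, while the CASE form keeps that user with a NULL score. If those NULL rows matter to you, that is the trade-off to weigh.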
Given this query, do you have an index on lesson_id, game_id, uid, score, date? With that index, the query could be answered from the index alone, without reading the table rows. It's possible that date should be before score. I'd try making it both ways and then running EXPLAIN to see which one it uses. Then you can delete the other. You should put whichever of lesson_id and game_id has more potential values first. It's possible that uid should go before either, but I'd have to test to see.
You put score and date last in the index because score has an inequality and date is only aggregated. A range condition on a column keeps the index from being used to narrow any later columns, so those columns go last. As for the uid IN portion, I believe the index can still use it after the equality matches on game_id and lesson_id, so I put uid after them.
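Here is a rough sketch of trying out such an index and looking at the plan. It uses Python's sqlite3 module, so the statement is SQLite's EXPLAIN QUERY PLAN rather than MySQL's EXPLAIN, and the planner may choose differently from yours; the index name idx_progress and the shortened uid list are made up for the example.

```python
import sqlite3

# Empty in-memory copy of the table, just to inspect the plan.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cdu_user_progress ("
    "id INTEGER PRIMARY KEY, uid INTEGER, lesson_id INTEGER, "
    "game_id INTEGER, date TEXT, score INTEGER)"
)

# Hypothetical index name; equality columns first, then uid, then score and date.
conn.execute(
    "CREATE INDEX idx_progress ON cdu_user_progress "
    "(lesson_id, game_id, uid, score, date)"
)

plan = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT uid, MAX(score) FROM cdu_user_progress "
    "WHERE lesson_id = 65 AND game_id = 3 "
    "AND uid IN (1671, 1672) AND score > -1 "
    "GROUP BY uid"
).fetchall()

# The last column of each plan row is the human-readable detail.
plan_text = " | ".join(row[-1] for row in plan)
print(plan_text)
```

To compare the other column order, create a second index with date before score, run the plan again, and drop whichever index the planner ignores.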
SELECT ms.uid AS id, ms.max_score - fs.first_score AS progress
FROM (
    SELECT up.uid, MAX(score) AS max_score
    FROM cdu_user_progress up
    WHERE up.uid IN ('1671', '1672', '1673', '1674', '1675', '1676', '1679', '1716', '1725', '1726', '1937', '1964', '1996', '2062', '2065', '2066', '2085', '2086') AND up.lesson_id = '65' AND up.score > '-1' AND up.game_id = 3
    GROUP BY up.uid
) ms
INNER JOIN (
    SELECT up.uid, up.score AS first_score
    FROM cdu_user_progress up
    INNER JOIN (
        SELECT up.uid, MIN(date) AS first_date
        FROM cdu_user_progress up
        WHERE up.uid IN ('1671', '1672', '1673', '1674', '1675', '1676', '1679', '1716', '1725', '1726', '1937', '1964', '1996', '2062', '2065', '2066', '2085', '2086') AND up.lesson_id = '65' AND up.score > '-1' AND up.game_id = 3
        GROUP BY up.uid
    ) fd ON fd.uid = up.uid AND fd.first_date = up.date
    WHERE up.game_id = 3 AND up.lesson_id = '65'
) fs ON fs.uid = ms.uid
ORDER BY progress DESC
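Note that I added WHERE up.game_id = 3 AND up.lesson_id = '65' to the fs subquery because I think it will help the optimizer. It also keeps the join on uid and date from matching rows for other games or lessons that happen to share the same date. You should check the EXPLAIN plan to be sure.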
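As a sanity check on the logic (not on MySQL performance), the rewritten query can be run against a handful of invented rows. This is only a sketch using Python's sqlite3 module; the sample data, the two-user uid list, and the unquoted numeric literals are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cdu_user_progress ("
    "id INTEGER PRIMARY KEY, uid INTEGER, lesson_id INTEGER, "
    "game_id INTEGER, date TEXT, score INTEGER)"
)
rows = [
    (1671, 65, 3, "2015-01-01", 10),  # first attempt
    (1671, 65, 3, "2015-01-02", 40),  # best attempt
    (1671, 65, 3, "2015-01-03", 25),
    (1672, 65, 3, "2015-01-01", 50),  # first attempt
    (1672, 65, 3, "2015-01-05", 55),  # best attempt
    (1672, 65, 5, "2015-01-01", 99),  # other game, same date as a first attempt
]
conn.executemany(
    "INSERT INTO cdu_user_progress (uid, lesson_id, game_id, date, score) "
    "VALUES (?, ?, ?, ?, ?)",
    rows,
)

query = """
SELECT ms.uid AS id, ms.max_score - fs.first_score AS progress
FROM (
    SELECT up.uid, MAX(up.score) AS max_score
    FROM cdu_user_progress up
    WHERE up.uid IN (1671, 1672) AND up.lesson_id = 65
      AND up.score > -1 AND up.game_id = 3
    GROUP BY up.uid
) ms
INNER JOIN (
    SELECT up.uid, up.score AS first_score
    FROM cdu_user_progress up
    INNER JOIN (
        SELECT up.uid, MIN(up.date) AS first_date
        FROM cdu_user_progress up
        WHERE up.uid IN (1671, 1672) AND up.lesson_id = 65
          AND up.score > -1 AND up.game_id = 3
        GROUP BY up.uid
    ) fd ON fd.uid = up.uid AND fd.first_date = up.date
    WHERE up.game_id = 3 AND up.lesson_id = 65
) fs ON fs.uid = ms.uid
ORDER BY progress DESC
"""
result = conn.execute(query).fetchall()
print(result)  # [(1671, 30), (1672, 5)]
```

The game 5 row for user 1672 shares a date with that user's first game 3 attempt. Without the extra WHERE in the fs subquery it would also join, and the user would show up a second time with a wrong first score.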
Hi, the CASE part is to ensure I receive results for ALL up.uids, even if they are NULL... – user62423, Jan 6, 2015 at 7:48