In the query below, from
and tid
are indexes of the replies
table.
SELECT * FROM `replies`
WHERE `from`="<userId>"
OR `tid` IN (SELECT `tid` FROM `posts` WHERE `from`="<userId>")
By using "OR", it seems that it does a full table scan (~3 million rows).
The EXPLAIN
says that a possible key would be from
, but then it doesn't use any.
However, in the query below, frid_lt
and frid_gt
are indexed. The two columns are in a complex index (frid_lt, frid_gt), but frid_gt
has also its own index.
SELECT `mid` FROM `messages`
WHERE `frid_lt`="<userId>" OR `frid_gt`="<userId>"
And this query DOES use two indexes. The EXPLAIN
says "index_merge" and "Using sort_union(frid_lt,frid_gt); Using where".
Why does the first query not use an index merge?
Is there any improvement I can make to make the engine use an index merge as well?
1 Answer 1
OR
does not optimize well. A common workaround is to use UNION
:
( SELECT * FROM replies WHERE `from` = "..." )
UNION ALL -- or UNION DISTINCT if you know there are no dups
( SELECT r.* FROM replies AS r
JOIN posts AS p ON p.tid = r.tid
WHERE p.from = "..." )
Notice that I also avoided the usually-inefficient IN ( SELECT ... )
For further performance, have these indexes:
replies: INDEX(`from`)
posts: INDEX(`from`, tid) -- in this order
replies: INDEX(tid)
(And note that the PRIMARY KEY
is an index, so don't add a redundant index.)
In your second example, the "index merge" that you experienced may or may not be faster than a UNION
.
Oh, it's an UPDATE
To optimize UPDATE
, do two separate UPDATEs
(no UNION
, no OR
). One straightforwardly checks from
. The other is a "multi-table UPDATE" (see the manual) similar to the second select above.
from
onposts
indexed as well?CREATE TABLE
statements and exact output ofEXPLAIN
to the question as well.from
is indexed (I said in the post). @AaronBertrand I do understand that. However, by using the "last name" index, at least, I saved time searching for Smith. I guess the engine could be improved to do two lookups on the table using each index, rather than a FULL table scan using no index. I would be happy to tell that to engine withFORCE INDEXES
or similar, to help it decide.