Using "OR" against two indexed columns doesn't use indexes

Question

In the query below, from and tid are each indexed on the replies table.

SELECT * FROM `replies`
WHERE `from`="<userId>"
OR `tid` IN (SELECT `tid` FROM `posts` WHERE `from`="<userId>")

With the "OR", the engine appears to do a full table scan (~3 million rows). EXPLAIN lists from as a possible key, but no key is actually used.

However, in the query below, frid_lt and frid_gt are indexed. The two columns share a composite index (frid_lt, frid_gt), and frid_gt also has its own index.

SELECT `mid` FROM `messages`
WHERE `frid_lt`="<userId>" OR `frid_gt`="<userId>"

And this query DOES use two indexes. The EXPLAIN says "index_merge" and "Using sort_union(frid_lt,frid_gt); Using where".

Why does the first query not use an index merge?
Is there anything I can change so that the engine uses an index merge there as well?

Comment 1

Is from on posts indexed as well?

Comment 2

You may want to add CREATE TABLE statements and exact output of EXPLAIN to the question as well.
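
A minimal sketch of how that information could be gathered, assuming MySQL (which the backticks and the "index_merge" output suggest). The "<userId>" placeholder is taken from the question and has to be replaced with a real value before running.

```sql
-- Print the full table definitions, including every index.
SHOW CREATE TABLE replies;
SHOW CREATE TABLE posts;

-- Show the plan the optimizer picks for the slow query.
EXPLAIN
SELECT * FROM `replies`
WHERE `from` = '<userId>'
   OR `tid` IN (SELECT `tid` FROM `posts` WHERE `from` = '<userId>');
```

The type, possible_keys, key and Extra columns of the EXPLAIN output are the ones that show whether an index or an index merge was chosen.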

Comment 3

Imagine you were allowed to make just one forward-only pass through a phone book, and someone asked you to find all the people with the last name Smith OR the first name Yvette. Sure, you could easily seek to the Smiths, but that doesn't help you, because you can't go backward and start finding all the Yvettes whose last names start with A, B, etc. Sometimes an index (seek) isn't the most efficient way to solve a query that has multiple filter criteria (or returns too many rows, or too many columns that aren't in the index, or ...).

Comment 4

@wolfgangwalther Yes, from is indexed (as I said in the post). @AaronBertrand I do understand that. However, using the "last name" index would at least save the time spent searching for Smith. I guess the engine could be improved to do two lookups on the table, one per index, rather than a FULL table scan using no index. I would be happy to tell the engine that with FORCE INDEX or similar, to help it decide.
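
For reference, MySQL's index hint syntax looks roughly like the sketch below, shown on the messages query from the question. The index names frid_lt and frid_gt are the ones reported in that query's EXPLAIN output; everything else is unchanged. A hint only restricts which indexes the optimizer considers. It does not by itself guarantee an index merge, and it is unlikely to help the replies query, because the IN ( SELECT ... ) branch is not a simple comparison against a constant.

```sql
-- Ask the optimizer to use only these two indexes on messages.
SELECT `mid`
FROM `messages` FORCE INDEX (frid_lt, frid_gt)
WHERE `frid_lt` = '<userId>' OR `frid_gt` = '<userId>';
```

Comparing EXPLAIN with and without the hint shows whether it changed the plan at all.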

Comment 5

(phone book, continued) While scanning for Yvette, it could trivially check for Smith, thereby doing it all in a single table scan. Note: Fetching a row costs a lot more than checking whether to keep the row.

Answer

OR does not optimize well. A common workaround is to use UNION:

( SELECT * FROM replies WHERE `from` = "..." )
UNION ALL -- only if you know there are no dups; otherwise use UNION DISTINCT
( SELECT r.* FROM replies AS r
 JOIN posts AS p ON p.tid = r.tid
 WHERE p.from = "..." )

Notice that I also avoided the usually-inefficient IN ( SELECT ... ) by rewriting it as a JOIN.

For further performance, have these indexes:

replies: INDEX(`from`)
posts: INDEX(`from`, tid) -- in this order
replies: INDEX(tid)

(And note that the PRIMARY KEY is an index, so don't add a redundant index.)
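
As a sketch, the indexes above could be created as follows. The index names are hypothetical, and any index that already exists (the question says from and tid on replies are already indexed) should be skipped rather than duplicated.

```sql
-- Index names (idx_...) are made up for illustration.
ALTER TABLE replies ADD INDEX idx_replies_from (`from`);

-- Column order matters: filter on `from` first, then read tid from the index.
ALTER TABLE posts ADD INDEX idx_posts_from_tid (`from`, tid);

ALTER TABLE replies ADD INDEX idx_replies_tid (tid);
```

SHOW INDEX FROM replies; and SHOW INDEX FROM posts; list what is already in place.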

In your second example, the "index merge" that you experienced may or may not be faster than a UNION.

Oh, it's an UPDATE

To optimize UPDATE, do two separate UPDATEs (no UNION, no OR). One straightforwardly checks from. The other is a "multi-table UPDATE" (see the manual) similar to the second select above.
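
A rough sketch of the two statements, assuming MySQL's multi-table UPDATE syntax. The column flag and the value 1 are hypothetical stand-ins, since the actual UPDATE is not shown in the question.

```sql
-- 1) Rows written by the user: a plain single-table UPDATE on `from`.
UPDATE replies
SET flag = 1                -- hypothetical column and value
WHERE `from` = '<userId>';

-- 2) Rows in threads the user posted: a multi-table UPDATE via JOIN.
UPDATE replies AS r
JOIN posts AS p ON p.tid = r.tid
SET r.flag = 1              -- same hypothetical assignment
WHERE p.`from` = '<userId>';
```

A reply that matches both conditions is visited by both statements, which is harmless as long as the assignment gives the same result when applied twice. If the pair must succeed or fail together, wrapping both in one transaction is an option.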

Rick James · Accepted Answer · 2016-10-24 02:19:28Z
