Essentially, I want the intersection of two tables based on matching IDs, with the option of a wildcard '*' that matches all IDs:

Select * from A a inner join B b on b.id = a.id or b.id = '*'

Table A is fairly large (10 million rows), and I have already set up an index on id.

Because of the OR, the index on id is not used and the query does a full scan (it takes 20 seconds).

If I don't allow the wildcard:

Select * from A a inner join B b on b.id = a.id

It uses the index on id and takes only 300 ms.

Edit:

Separating the query into two SELECTs combined with UNION helped. Details here: https://stackoverflow.com/questions/5901791/is-having-an-or-in-an-inner-join-condition-a-bad-idea

asked Mar 21, 2016 at 15:18
  • Can't you split your query into two rather than using OR? Commented Mar 21, 2016 at 15:29
  • What is the estimated selectivity of the index on b.id with and without or b.id = '*'? It should be able to use the index with the OR, but if there are a lot of rows with b.id = '*', it might opt for the full scan because of a low estimated selectivity. Commented Mar 21, 2016 at 17:11

2 Answers

A common trick for optimizing OR is to turn it into UNION:

( SELECT * FROM A JOIN B USING(id) WHERE condition_1 )
UNION DISTINCT
( SELECT * FROM A JOIN B USING(id) WHERE condition_2 )

(I'm with Jack in being confused about what you really want, so I avoided spelling out the details of the conditions.)

Be sure to include index(es) that let the queries start with the conditions.
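As a rough illustration of the rewrite applied to the query in the question, here is a minimal sketch that checks the OR join and the UNION form return the same rows. It uses Python's built-in sqlite3 module purely so it can be run anywhere; the question is presumably about a different engine, so execution plans and timings will not carry over. The `payload` and `note` columns and the sample rows are made up for the example.

```python
import sqlite3

# In-memory SQLite database standing in for the real tables (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (id TEXT PRIMARY KEY, payload TEXT);  -- payload is hypothetical
    CREATE TABLE B (id TEXT, note TEXT);                 -- note is hypothetical
    INSERT INTO A VALUES ('a1', 'p1'), ('a2', 'p2'), ('a3', 'p3');
    INSERT INTO B VALUES ('a1', 'exact'), ('*', 'wildcard');
""")

# Original form: a single join with OR in the join condition.
or_join = """
    SELECT A.id, A.payload, B.id, B.note
    FROM A JOIN B ON B.id = A.id OR B.id = '*'
"""

# Rewritten form: one branch per condition, combined with UNION
# (SQLite's plain UNION removes duplicates, like UNION DISTINCT).
union_join = """
    SELECT A.id, A.payload, B.id, B.note FROM A JOIN B ON B.id = A.id
    UNION
    SELECT A.id, A.payload, B.id, B.note FROM A JOIN B ON B.id = '*'
"""

or_rows = sorted(conn.execute(or_join).fetchall())
union_rows = sorted(conn.execute(union_join).fetchall())
print(or_rows == union_rows)
```

Note that UNION removes duplicate rows, so the two forms are only guaranteed to agree when the joined rows are distinct to begin with, as they are in this sample data. If B can contain identical rows, UNION ALL with a second branch that excludes rows already matched by the first may be closer to the original semantics.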

answered Mar 21, 2016 at 18:45

Reformatting your query slightly, we have:

select * from A a 
inner join B b on 
 b.id = a.id -- condition 1
 or
 b.id = '*' -- condition 2

Consider how this will match rows from B with rows from A:

  1. If there is a row in B with an id that matches the id in A, include that row along with the row from A.
  2. If there is any row in B with an id that matches '*', include that row along with the row from A.

Condition #2 means that if there are any rows in B with an id of '*', then all of the rows in A must be included, because they all match those rows in B.

This also means that every row in B that has an id of '*' matches every row in A.

Is that really what you want? If so then there is no way to avoid a full scan because every row in A is included.
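To make the effect of condition 2 concrete, here is a small sketch that counts the rows the OR join returns when B holds nothing but '*' rows. It uses Python's built-in sqlite3 module only as a convenient stand-in for the real database, and the helper name `or_join_count` and the generated ids are invented for the example.

```python
import sqlite3

def or_join_count(n_a_rows, n_star_rows):
    """Count rows returned by the OR join when B contains only '*' rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE A (id TEXT PRIMARY KEY)")
    conn.execute("CREATE TABLE B (id TEXT)")
    # Hypothetical ids 'id0', 'id1', ... for A; none of them equals '*'.
    conn.executemany("INSERT INTO A VALUES (?)",
                     [(f"id{i}",) for i in range(n_a_rows)])
    conn.executemany("INSERT INTO B VALUES (?)", [("*",)] * n_star_rows)
    (count,) = conn.execute(
        "SELECT COUNT(*) FROM A JOIN B ON B.id = A.id OR B.id = '*'"
    ).fetchone()
    conn.close()
    return count

print(or_join_count(5, 0))  # no wildcard rows in B
print(or_join_count(5, 1))  # one wildcard row pairs with every row of A
print(or_join_count(5, 2))  # each extra wildcard row adds another copy of A
```

With 10 million rows in A, a single '*' row in B therefore puts at least 10 million rows in the result, whatever access path the engine picks.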

answered Mar 21, 2016 at 18:04
