I have a table with emails:

email
 - id : numeric, primary key
 - id_in_target : text, the ID as stored in Google/MS, indexed
 - in_reply_to : nullable, text, references id_in_target in case of a reply, indexed
 - ts : timestamp, email's timestamp
 ... some other columns

Given a list of email IDs, I'm trying to fetch every source/reply pair in which either the source email or the reply is in that list, so the email table is joined with itself. The query has the following form:

select reply.id, extract(epoch from (source.ts - reply.ts))
from email source
join email reply on source.id_in_target = reply.in_reply_to
where source.id in (ids) or reply.id in (ids)

The problem is the OR condition on the primary key. If I filter on only the source or only the reply, the planner uses the primary key. With the OR condition, however, it chooses to scan the entire table. I know I can "duplicate" the query with UNION, but I don't understand why it chooses the suboptimal plan when there's clearly a primary key condition.
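
One way to compare the plans is to run each form under EXPLAIN. This is only a sketch: the list (1, 2, 3) is a hypothetical stand-in for (ids), and the plan actually chosen depends on table size and statistics.

```sql
-- Hypothetical ID list (1, 2, 3) stands in for (ids) from the query above.

-- Form with OR across both aliases (the one reported as scanning the whole table):
explain (analyze, buffers)
select reply.id, extract(epoch from (source.ts - reply.ts))
from email source
join email reply on source.id_in_target = reply.in_reply_to
where source.id in (1, 2, 3) or reply.id in (1, 2, 3);

-- Form filtering on one alias only (the one reported as using the primary key):
explain (analyze, buffers)
select reply.id, extract(epoch from (source.ts - reply.ts))
from email source
join email reply on source.id_in_target = reply.in_reply_to
where source.id in (1, 2, 3);
```

Note that ANALYZE actually executes the statement, so on a large table the first query will take as long as the slow query itself.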

asked Aug 30, 2022 at 13:26

1 Answer

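That is because of the OR. The condition refers to both sides of the join, so it cannot be used as an index condition on either table alone; it can only be applied as a filter on the join result. PostgreSQL cannot automatically rewrite the query to a UNION of two queries, because it cannot prove that the result would be the same: the query with the OR could return two identical result rows, whereas UNION removes duplicates and would return only one of them.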
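
If you do the rewrite by hand, a minimal sketch could look like the following. The list (1, 2, 3) is a hypothetical stand-in for (ids), and the aliases source_id, reply_id, diff and matched are names introduced here for illustration. Both primary keys are carried through the UNION, so it only removes join rows that satisfy both branches and never merges two distinct join rows that happen to have the same output values.

```sql
-- Hypothetical ID list (1, 2, 3) stands in for (ids) from the question.
select reply_id, diff
from (
    -- Branch 1: pairs whose source email is in the list.
    select source.id as source_id,
           reply.id  as reply_id,
           extract(epoch from (source.ts - reply.ts)) as diff
    from email source
    join email reply on source.id_in_target = reply.in_reply_to
    where source.id in (1, 2, 3)

    union  -- drops pairs matched by both branches; (source_id, reply_id) identifies a join row

    -- Branch 2: pairs whose reply is in the list.
    select source.id,
           reply.id,
           extract(epoch from (source.ts - reply.ts))
    from email source
    join email reply on source.id_in_target = reply.in_reply_to
    where reply.id in (1, 2, 3)
) as matched;
```

Each branch now has a condition on a single alias, which is the form you reported as using the primary key. Whether the planner does so in each branch still depends on your data, so it is worth checking with EXPLAIN.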

answered Aug 30, 2022 at 16:53
