2

I have the following table:

id message_read notification_sent send_date text version recipient sender 
-- ------------ ----------------- ------------------- --------------------------------------------------- ------- --------- ------ 
1 true false 2015年02月27日 00:00:00 Bonjour. Je suis disponible pour la garde partagée. 0 2 3 
3 false false 2015年03月01日 00:00:00 Je suis disponible pour vous rencontrer dès demain. 0 2 3 
12 false false 2015年02月27日 00:00:00 Hello dude 0 2 1 

The query below is the type of queries that typically get run on this table:

select * from public.message
where (recipient = 2 and sender = 3) or (sender = 2 and recipient = 3);

I am trying to create efficient indices on this table and bearing in mind the kind of queries run on it.

Here are my indices:

CREATE INDEX recipient_idx ON message(recipient);
CREATE INDEX sender_idx ON message(sender);

and here is the output of a explain analyze on the query above:

QUERY PLAN 
-------------------------------------------------------------------------------------------------------------------------- 
Bitmap Heap Scan on message (cost=8.57..13.40 rows=1 width=49) (actual time=0.072..0.075 rows=3 loops=1) 
 Recheck Cond: ((sender = 3) OR (recipient = 3)) 
 Filter: (((recipient = 2) AND (sender = 3)) OR ((sender = 2) AND (recipient = 3))) 
 Heap Blocks: exact=1 
 -> BitmapOr (cost=8.57..8.57 rows=2 width=0) (actual time=0.050..0.050 rows=0 loops=1) 
 -> Bitmap Index Scan on sender_idx (cost=0.00..4.28 rows=1 width=0) (actual time=0.037..0.037 rows=2 loops=1) 
 Index Cond: (sender = 3) 
 -> Bitmap Index Scan on recipient_idx (cost=0.00..4.28 rows=1 width=0) (actual time=0.009..0.009 rows=1 loops=1) 
 Index Cond: (recipient = 3) 
Planning time: 0.468 ms 
Execution time: 0.188 ms 

Can anyone please tell me what I am getting wrong? Why isn't my query using a btree index and a bitmap one instead? How can I improve my indices?

edit:

Here is what I've tried:

CREATE INDEX recipient_sender_idx ON message(recipient, sender);

Running the explain analyze now yields:

QUERY PLAN 
-------------------------------------------------------------------------------------------------------------------------- 
Bitmap Heap Scan on message (cost=8.57..13.40 rows=1 width=49) (actual time=0.035..0.038 rows=3 loops=1) 
 Recheck Cond: ((sender = 3) OR (recipient = 3)) 
 Filter: (((recipient = 2) AND (sender = 3)) OR ((sender = 2) AND (recipient = 3))) 
 Heap Blocks: exact=1 
 -> BitmapOr (cost=8.57..8.57 rows=2 width=0) (actual time=0.023..0.023 rows=0 loops=1) 
 -> Bitmap Index Scan on sender_idx (cost=0.00..4.28 rows=1 width=0) (actual time=0.015..0.015 rows=2 loops=1) 
 Index Cond: (sender = 3) 
 -> Bitmap Index Scan on recipient_idx (cost=0.00..4.28 rows=1 width=0) (actual time=0.007..0.007 rows=1 loops=1) 
 Index Cond: (recipient = 3) 
Planning time: 0.183 ms 
Execution time: 0.070 ms 

edit 2:

explain analyze
select * from public.message
where recipient = 2 and sender = 3 
UNION ALL
select * from public.message
where sender = 2 and recipient = 3

yields:

QUERY PLAN 
--------------------------------------------------------------------------------------------------------------------------------------------- 
Append (cost=0.29..16.63 rows=2 width=49) (actual time=0.031..0.043 rows=3 loops=1) 
 -> Index Scan using recipient_sender_idx on message (cost=0.29..8.31 rows=1 width=49) (actual time=0.030..0.031 rows=2 loops=1) 
 Index Cond: ((recipient = 2) AND (sender = 3)) 
 -> Index Scan using recipient_sender_idx on message message_1 (cost=0.29..8.31 rows=1 width=49) (actual time=0.009..0.010 rows=1 loops=1) 
 Index Cond: ((recipient = 3) AND (sender = 2)) 
Planning time: 0.389 ms 
Execution time: 0.075 ms 

but it is not feasible to modify my SQL query.

asked Sep 14, 2015 at 11:26
8
  • 1
    It looks completely reasonable to me, given the existing indexes - the information needed for finding the requested rows is split into two indexes. Try to create a new one on (recipient, sender) and check if you get a result you expect. Commented Sep 14, 2015 at 11:31
  • Thanks dezso. Just one question though, do I need two indices as follows: one(recipient, sender) and two(sender, recipient) - in different order or just the one you advised? Commented Sep 14, 2015 at 11:33
  • I have tried as advised, I still don't get any mention of the recipient_sender_idx being used. Do I need to remove the two first indices? Commented Sep 14, 2015 at 11:39
  • Have you ANALYZEd the table? How many rows do you have in the table and how many match this condition? Commented Sep 14, 2015 at 11:45
  • I have 1K rows in the message table and 3 rows meet the condition. I am not sure about the ANALYZE. Can you point me in the right direction as to how to use ANALYZE? Commented Sep 14, 2015 at 11:50

1 Answer 1

1

I think the reason it's picking the single indexes and ORing them is that your data is toy-sized; witness:

 -> BitmapOr (rows=0 loops=1) 
 -> Bitmap Index Scan on sender_idx (rows=2 loops=1) 
 Index Cond: (sender = 3) 
 -> Bitmap Index Scan on recipient_idx (rows=1 loops=1) 
 Index Cond: (recipient = 3) 

and

 scanned 101 of 101 pages, containing 9642 live rows

The table is pretty tiny and the indexes are highly selective, so it's probably faster to scan the smaller indexes then filter the results. There's no point using a composite index when the singular indexes find just a couple of rows anyway.


If the data was bigger and/or there were more rows for a given sender/recipient, it should ideally use two independent index scans on a composite index, then merging them. I don't think the planner is smart enough to realise that this can be executed as two queries and UNIONed,though, so the best you're going to get is probably two bitmap index scans on the composite index, then a BitmapOr.

Try:

select * from public.message
where recipient = 2 and sender = 3 
UNION ALL
select * from public.message
sender = 2 and recipient = 3;

to see if the plan picked is closer to what you want. Even if it produces the plan you want, with two IndexScan nodes on the composite index then an Append node for the union, it might actually be slower.

answered Sep 14, 2015 at 12:19
4
  • Can you advise how many rows is not "tiny"? 10K? 100K? Commented Sep 14, 2015 at 12:29
  • @balteo I think that proves the point. The performance is nearly the same; there's no real benefit, and the current plan is fine. It's not just about table size, it's also about selectivity: how many rows have the same value. If 10% of rows had recipient=1, this plan would be quite bad. If 0.05% of rows have recipient=1, this plan is quite good. PostgreSQL adjusts its plan choices based on statistics about the table for this reason. So it's not simply that 10k rows isn't much for a narrow table on a modern system, it's also that the indexes look highly selective. Commented Sep 14, 2015 at 12:29
  • OK. Thanks a lot Graig. I am now going to carry on my exploration of Postgresql indices and explain plans... Commented Sep 14, 2015 at 12:31
  • 1
    @balteo You'll find explain.depesz.com good for more complex plans. Read up on the statistics based planner, statistics targets, index selectivity, etc. Commented Sep 14, 2015 at 12:32

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.