I have read that PostgreSQL can use multiple indexes in general, but in the specific case of a query that ranges across two indexes, will it use both? If so, will they be loaded sequentially or together?
For example, if this query spans across two partial indexes by column_1
, how will partial indexes be used if at all, and how is the index data loaded and discarded:
SELECT 1 FROM sample_table WHERE column_1 > 50 AND column_2 < 50000
1 Answer 1
Very short version: Yes, sometimes.
PostgreSQL can use bitmap index scans to combine multiple indexes.
A predicate like
WHERE a > 50 AND a < 50000
is a specialisation of the more general form:
wHERE a > 50 and b < 50000
for a = b.
PostgreSQL can use two indexes here, one for each part of the predicate, and then bitmap AND
them. It doesn't matter if they happen to be on different ranges of the same column.
This is much less efficient than a single index, and may not be useful for some queries, but it's possible.
The bigger problem is that PostgreSQL's partial index support is not very bright. Irrespective of whether there's one or two indexes it might just not figure out that it can use the index at all.
Demonstration setup:
CREATE TABLE partial (x integer, y integer);
CREATE INDEX xs_above_50 ON partial(x) WHERE (x > 50);
CREATE INDEX xs_below_50000 ON partial(x) WHERE (x < 5000);
INSERT INTO partial(x,y) SELECT a, a FROM generate_series(1,100000) a;
OK, what will Pg prefer for given queries?
regress=> EXPLAIN SELECT y FROM partial WHERE x > 50 AND x < 50000;
QUERY PLAN
----------------------------------------------------------------------------------
Index Scan using xs_above_50 on partial (cost=0.29..1788.47 rows=50309 width=4)
Index Cond: ((x > 50) AND (x < 50000))
(2 rows)
regress=> EXPLAIN SELECT y FROM partial WHERE x > 20 AND x < 50000;
QUERY PLAN
--------------------------------------------------------------
Seq Scan on partial (cost=0.00..1943.00 rows=50339 width=4)
Filter: ((x > 20) AND (x < 50000))
(2 rows)
regress=> EXPLAIN SELECT y FROM partial WHERE x > 100 AND x < 50000;
QUERY PLAN
----------------------------------------------------------------------------------
Index Scan using xs_above_50 on partial (cost=0.29..1787.45 rows=50258 width=4)
Index Cond: ((x > 100) AND (x < 50000))
(2 rows)
regress=> EXPLAIN SELECT y FROM partial WHERE x > 100 AND x < 20000;
QUERY PLAN
---------------------------------------------------------------------------------
Index Scan using xs_above_50 on partial (cost=0.29..710.71 rows=19921 width=4)
Index Cond: ((x > 100) AND (x < 20000))
(2 rows)
What if we try to force a bitmap index scan just to find out if Pg can use one, even if it's not worth doing for this particular simple case and small sample?
Try:
regress=> SET enable_seqscan = off;
SET
regress=> SET enable_indexscan = off;
SET
regress=> SET enable_indexonlyscan = off;
SET
regress=> EXPLAIN SELECT y FROM partial WHERE x > 100 AND x < 20000;
QUERY PLAN
--------------------------------------------------------------------------------
Bitmap Heap Scan on partial (cost=424.48..1166.30 rows=19921 width=4)
Recheck Cond: ((x > 100) AND (x < 20000))
-> Bitmap Index Scan on xs_above_50 (cost=0.00..419.50 rows=19921 width=0)
Index Cond: ((x > 100) AND (x < 20000))
(4 rows)
Hm. Nope. Not combining the indexes there. It might be able to but simply not think it's worth scanning a second index, though.
What about a query that ORs two predicates instead?
regress=> EXPLAIN SELECT y FROM partial WHERE x > 100 OR x < 200;
QUERY PLAN
---------------------------------------------------------------------------------------
Bitmap Heap Scan on partial (cost=1905.29..3848.29 rows=99908 width=4)
Recheck Cond: ((x > 100) OR (x < 200))
-> BitmapOr (cost=1905.29..1905.29 rows=100000 width=0)
-> Bitmap Index Scan on xs_above_50 (cost=0.00..1849.60 rows=99908 width=0)
Index Cond: (x > 100)
-> Bitmap Index Scan on xs_below_50000 (cost=0.00..5.73 rows=193 width=0)
Index Cond: (x < 200)
(7 rows)
Here PostgreSQL has ORed both indexes to find a match, then done a heap scan and recheck.
So yes, PostgreSQL can combine multiple partial indexes, at least for some queries, where it is useful to do so.
But if I RESET
the planner overrides...
regress=> RESET enable_seqscan;
RESET
regress=> RESET enable_indexscan ;
RESET
regress=> RESET enable_indexonlyscan ;
RESET
regress=> EXPLAIN SELECT y FROM partial WHERE x > 100 OR x < 200;
QUERY PLAN
--------------------------------------------------------------
Seq Scan on partial (cost=0.00..1943.00 rows=99908 width=4)
Filter: ((x > 100) OR (x < 200))
(2 rows)
... Pg will realise it's faster to just sequentially scan the table.
-
Really an amazing answer. Thank you very much.Jim Bob– Jim Bob2015年02月06日 19:57:56 +00:00Commented Feb 6, 2015 at 19:57
-
One more limitation I've seen is that it can't use two partial indexes together in a case where it absolutely needs both. For example, if I have one index cover IDs < 500 and the other cover IDs ≥ 500, then I select without an ID constraint, it never uses the indexes. I was trying to do that so I could kinda parallelize index creation...sudo– sudo2017年03月14日 23:39:19 +00:00Commented Mar 14, 2017 at 23:39
-
2There is a pretty impactful typo in this answer: CREATE INDEX xs_below_50000 ON partial(x) WHERE (x < 5000); That
x < 5000
should bex < 50000
. With that corrected, both the second and third queries pick up the partial indexes. E.g., ``` Index Scan using xs_below_50000 on partial (cost=0.29..1659.47 rows=50219 width=4) Index Cond: (x > 20)tantrix– tantrix2018年07月20日 00:38:30 +00:00Commented Jul 20, 2018 at 0:38
Explore related questions
See similar questions with these tags.
explain analyze
on your query with your data. My experience is that Postgres prefers to use two indexes only whenor
conditions are involved, I have hardly ever seen this withand
conditions.and
conditions?and
condition (at least I can't recall it)