In my Rails 5 with PostgreSQL 9.5 app I have 2 frequent queries:
SELECT * FROM table WHERE lower(varchar_col) = 'some_string' AND int1_col = some_int AND int2_col = some_other_int
-- and
SELECT * FROM table WHERE lower(varchar_col) LIKE 'some_str%' AND int1_col = some_int
table
is about 1.5m rows and quickly growing.
So, I've created 2 indices for both of these queries:
CREATE INDEX index_1 ON table USING btree (lower((varchar_col)::text), int1_col, int2_col)
-- and
CREATE INDEX index_2 ON table USING btree (lower((varchar_col)::text) varchar_pattern_ops, int1_col)
I have 2 questions, maybe they are a little noob, I'm not good in DB performance:
1.Can I use only one index for both of these queries? Something like:
CREATE INDEX index ON table USING btree (lower((varchar_col)::text) varchar_pattern_ops, int1_col, int2_col)
Does varchar_pattern_ops
shows the same performance for LIKE
and =
queries? Will PostgreSQL planner use multicolumn index for 3 columns in query for 2 columns?
2.I noticed that Rails generate index with expression lower((varchar_col)::text
, but my varchar_col
has varchar
type, not text
. Does this matter and Rails made some mistake and I need to write CREATE INDEX
query to specifically include column type in query?
Thanks for any help!
P.S. Of course, table's, columns' and indices' name are changed for better reading.
1 Answer 1
Using an index which is somewhere between the one suggested by @ypercubeTM and your index_2
... which would be:
CREATE INDEX index_1 ON t
USING btree
(int1_col,
lower((varchar_col)::text) varchar_pattern_ops,
int2_col) ;
... seems to be a good compromise.
You put first the two columns that you use always, and add the third one as the last. You add the varchar_pattern_ops
so that queries with LIKE use the index.
From the two always used columns, put the one which is always tested for equality first, the one which is tested with other operators (>=, <=, like) second. If both were tested for equality, you normally would want to have the one which is more discriminant [which means, more or less, that has a bigger number of different values] first. If several columns are equally discriminant, you put first the ones with the datatypes easier/faster to compare (for instance: fixed-size values such as integers would go before varying length values such as texts).
These are the exeuction plans of the two queries you want to perform:
EXPLAIN ANALYZE
SELECT
*
FROM
t
WHERE
lower(varchar_col) = 'some_string'
AND int1_col = 5678
AND int2_col = 1234 ;
| QUERY PLAN | | :------------------------------------------------------------------------------------------------------------- | | Index Scan using index_1 on t (cost=0.42..8.44 rows=1 width=40) (actual time=0.007..0.008 rows=1 loops=1) | | Index Cond: ((int1_col = 5678) AND (lower((varchar_col)::text) = 'some_string'::text) AND (int2_col = 1234)) | | Planning time: 0.274 ms | | Execution time: 0.023 ms |
EXPLAIN ANALYZE
SELECT * FROM t WHERE lower(varchar_col) LIKE 'some_str%' AND int1_col = 5678 ;
| QUERY PLAN | | :------------------------------------------------------------------------------------------------------------------------------------------- | | Bitmap Heap Scan on t (cost=4.44..12.16 rows=1 width=40) (actual time=0.013..0.015 rows=3 loops=1) | | Recheck Cond: (int1_col = 5678) | | Filter: (lower((varchar_col)::text) ~~ 'some_str%'::text) | | Heap Blocks: exact=1 | | -> Bitmap Index Scan on index_1 (cost=0.00..4.44 rows=2 width=0) (actual time=0.005..0.005 rows=3 loops=1) | | Index Cond: ((int1_col = 5678) AND (lower((varchar_col)::text) ~>=~ 'some'::text) AND (lower((varchar_col)::text) ~<~ 'somf'::text)) | | Planning time: 0.177 ms | | Execution time: 0.027 ms |
Both query plans use your index and are fast (although, my simulated data might not be realistic at all).
dbfiddle here
Explore related questions
See similar questions with these tags.