I have an expensive, periodic PostgreSQL 9.3 query of the form:
SELECT * from mytable where name NOT LIKE '%foo%';
foo
is actually a constant here that never changes. The query is expensive because it requires a table scan. I expect that only a few rows will actually match the query, so I want to use a partial index to speed up the query.
CREATE INDEX foo_idx ON mytable ((name NOT LIKE '%foo%')) WHERE name NOT LIKE '%foo%';
However, when I do an EXPLAIN
on my query, it still is using a sequential scan instead of relying on the new index. What am I doing wrong?
I have tried several different expressions inside the parenthesized expression clause of the CREATE INDEX
statement, but nothing helps:
CREATE INDEX foo_idx ON mytable (name) WHERE name NOT LIKE '%foo%';
CREATE INDEX foo_idx ON mytable ((name LIKE '%foo%')) WHERE name NOT LIKE '%foo%';
I have also tried removing the WHERE
clause for the index, and that doesn't help either.
4 Answers 4
The index should work as is. As pointed out by @jjanes, you just needed to run ANALYZE
. However, I suggest you modify the index. The expression in your definition is not useful.
(削除) CREATE INDEX foo_idx ON intentions ((data LIKE '%foo%')) WHERE data NOT LIKE '%foo%'; (削除ここまで)
The expression is always FALSE
and useless noise. Either simplify to the slightly cheaper equivalent index:
CREATE INDEX foo_idx ON intentions ((FALSE)) WHERE data NOT LIKE '%foo%';
Or, better yet, put something useful there. Since each index entry is going to allocate MAXALIGN
(typically 8 bytes) for data anyway, and the overhead per index row is another 12 bytes (and the default fillfactor
for btree indexes is 90%) you might as well put the index to good use and help other queries. If you have any other queries that might use the same partial index?
If you put something worth up to 8 bytes (like one or two int4
columns), the index retains exactly the same size and performance, but offers more possible use cases. If (one of) the column(s) is involved in updates, you have some additional maintenance cost. Still, if you have anything useful, use it. Any additional column that's used in a JOIN
/ WHERE
or ORDER BY
clause. Or column(s) that enable index-only scans:
CREATE INDEX foo_idx ON intentions (<useful_column(s)>)
WHERE data NOT LIKE '%foo%';
You could try a slightly different approach.
CREATE INDEX foo_idx ON mytable (strpos(name,'foo'));
SELECT * from mytable where strpos(name,'foo') = 0;
The NOT LIKE
will be problematic. Imagine this query without the NOT
. The index could be walked to find the interval of values (indexes are sorted lists) which match. The NOT
means you want everything else, both before and after the matching interval. It's a pretty special optimiser which will do that.
You may be able to invert the indexes to identify all the rows which do contain "foo", then use that result in a NOT EXISTS
(or equivalent) to find your desired output. I think you'll still be seeing a table scan, however.
Alternatively add the column IsFoo
to the table. Populate this on table writes (perhaps use a persisted calculated column?). Include IsFoo
in your indexes and filters.
I needed to run ANALYZE
for the proper query plan to take effect. Once this was done, everything worked like a charm. My final index looks like this:
CREATE INDEX foo_idx ON intentions ((data LIKE '%foo%')) WHERE data NOT LIKE '%foo%';
Explore related questions
See similar questions with these tags.
NOT LIKE '%foo%'
in a simple SQL function that you use in both the index and query, does that then use the index?like '%foo%'
, not sure aboutnot like
though: postgresonline.com/journal/archives/…