
I have a medium-sized table (~4M rows) of "functionCalls", which consists of two columns, input and function (both ids referencing other tables):

  Column  |  Type   | Collation | Nullable | Default
----------+---------+-----------+----------+---------
 input    | integer |           | not null |
 function | integer |           | not null |
Indexes:
 "functionCall_pkey" PRIMARY KEY, btree (input, function) CLUSTER
 "functionCallSearch" btree (function, input)
Foreign-key constraints:
 "fkey1" FOREIGN KEY (function) REFERENCES function(id) ON UPDATE CASCADE ON DELETE CASCADE
 "fkey2" FOREIGN KEY (input) REFERENCES input(id)

I want to find all rows that match a certain function, which is why I added the functionCallSearch index. Here is my query:

SELECT c.input FROM "functionCall" c
INNER JOIN "function" ON (function.id = c.function)
WHERE function.text LIKE 'getmyinode'
ORDER BY c.input DESC
LIMIT 25 OFFSET 0;

This takes forever (currently ~20 s) because pg refuses to use that index and instead does an Index Only Scan Backward on the primary key:

 Limit  (cost=0.71..2178.97 rows=25 width=4) (actual time=12903.294..19142.568 rows=8 loops=1)
   Output: c.input
   Buffers: shared hit=59914 read=26193 written=54
   ->  Nested Loop  (cost=0.71..135662.48 rows=1557 width=4) (actual time=12903.292..19142.561 rows=8 loops=1)
         Output: c.input
         Inner Unique: true
         Join Filter: (c.function = function.id)
         Rows Removed by Join Filter: 3649900
         Buffers: shared hit=59914 read=26193 written=54
         ->  Index Only Scan Backward using "functionCall_pkey" on public."functionCall" c  (cost=0.43..80906.80 rows=3650225 width=8) (actual time=0.040..17083.489 rows=3649908 loops=1)
               Output: c.input, c.function
               Heap Fetches: 3649909
               Buffers: shared hit=59911 read=26193 written=54
         ->  Materialize  (cost=0.28..2.30 rows=1 width=4) (actual time=0.000..0.000 rows=1 loops=3649908)
               Output: function.id
               Buffers: shared hit=3
               ->  Index Scan using function_text on public.function  (cost=0.28..2.30 rows=1 width=4) (actual time=0.023..0.026 rows=1 loops=1)
                     Output: function.id
                     Index Cond: ((function.text)::text = 'getmyinode'::text)
                     Buffers: shared hit=3
 Planning Time: 0.392 ms
 Execution Time: 19143.967 ms

When I remove the LIMIT, this query is blazingly fast:

 Sort  (cost=5247.53..5251.42 rows=1557 width=4) (actual time=3.762..3.763 rows=8 loops=1)
   Output: c.input
   Sort Key: c.input DESC
   Sort Method: quicksort  Memory: 25kB
   Buffers: shared hit=6 read=4
   ->  Nested Loop  (cost=0.71..5164.97 rows=1557 width=4) (actual time=0.099..3.739 rows=8 loops=1)
         Output: c.input
         Buffers: shared hit=6 read=4
         ->  Index Scan using function_text on public.function  (cost=0.28..2.30 rows=1 width=4) (actual time=0.054..0.056 rows=1 loops=1)
               Output: function.id
               Index Cond: ((function.text)::text = 'getmyinode'::text)
               Buffers: shared hit=2 read=1
         ->  Index Only Scan using "functionCallSearch" on public."functionCall" c  (cost=0.43..5103.71 rows=5897 width=8) (actual time=0.039..3.670 rows=8 loops=1)
               Output: c.function, c.input
               Index Cond: (c.function = function.id)
               Heap Fetches: 8
               Buffers: shared hit=4 read=3
 Planning Time: 0.514 ms
 Execution Time: 3.819 ms

Things I've tried so far:

  • After reading Postgres sometimes uses inferior index for WHERE a IN (...) ORDER BY b LIMIT N I have checked n_distinct - but it is not that far off: pg_stats says 623, while SELECT COUNT(*) FROM (SELECT DISTINCT function FROM "functionCall") returns 1065.

  • I've increased the column's statistics target (SET STATISTICS) to 10000 and ran ANALYZE. That cuts the time in half (~9 s), but PostgreSQL still won't use the index I created.
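For reference, the statistics bump described above presumably corresponds to something like this (column name taken from the schema at the top):

```sql
-- Raise the per-column statistics target, then re-gather statistics
ALTER TABLE "functionCall" ALTER COLUMN function SET STATISTICS 10000;
ANALYZE "functionCall";
```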

Why is this? And how can I fix this?

asked Dec 9, 2019 at 14:29

1 Answer


PostgreSQL estimates that 1557 rows will satisfy the condition, so it thinks it will be faster to avoid an explicit sort: it scans the rows in ORDER BY order via the primary key and performs nested loop joins until it has found enough rows to satisfy the LIMIT.

Unfortunately, that doesn't work out at all: since there are only 8 matching rows in total, PostgreSQL ends up scanning practically the whole table that way.

The problem seems to be that the estimates are quite off: the planner thinks that the index scan on public."functionCall" for a single function will produce 5897 rows, when really there are only 8. (That estimate is presumably derived from the column statistics, roughly reltuples / n_distinct ≈ 3650225 / 623 ≈ 5859.)

As a first measure, try to calculate new distribution statistics on the table:

ANALYZE public."functionCall";

If that alone does not improve the estimate, increase the granularity:

ALTER TABLE public."functionCall" ALTER function SET STATISTICS 1000;
ANALYZE public."functionCall";

That should improve the estimate. You can try with higher values than 1000 too.
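To see whether the estimate has improved, re-run the query under EXPLAIN and compare the planner's rows estimate on the "functionCallSearch" scan with the actual row count (a sketch, using the query from the question):

```sql
EXPLAIN (ANALYZE, VERBOSE, BUFFERS)
SELECT c.input
FROM "functionCall" c
INNER JOIN "function" ON function.id = c.function
WHERE function.text LIKE 'getmyinode'
ORDER BY c.input DESC
LIMIT 25 OFFSET 0;
```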

If all else fails, tell PostgreSQL explicitly not to use that strategy by using

ORDER BY c.input + 0 DESC
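Applied to the query from the question, that looks as follows; the + 0 makes the ORDER BY expression no longer match the indexed column, so the planner cannot use the backward index-only scan on the primary key to produce pre-sorted rows:

```sql
SELECT c.input
FROM "functionCall" c
INNER JOIN "function" ON function.id = c.function
WHERE function.text LIKE 'getmyinode'
ORDER BY c.input + 0 DESC   -- defeats the index-order scan, forces an explicit sort
LIMIT 25 OFFSET 0;
```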
answered Dec 9, 2019 at 15:24
  • FYI - neither the ANALYZE nor the SET STATISTICS (to 10k) helped. The ORDER BY + 0 does "work". Commented Dec 9, 2019 at 15:41
  • @Sjon It has to make a generic estimate of the number of rows matching c.function = function.id without yet knowing what the specific value of function.id will be. If different values of "id" have greatly different counts, this will always be a risky thing to estimate. So it is not surprising that ANALYZE and SET STATISTICS did not help much, as the problem is in a cross-table correlation, not the raw statistics. Commented Dec 9, 2019 at 19:22
  • @laurenz-albe We had to use the "if all fails" workaround as well. Is this safe to use in production or is there a more reliable solution? Commented Feb 23, 2021 at 17:47
  • @Rocky I cannot imagine what could be unsafe about that. Commented Feb 23, 2021 at 17:49
