
Changing only the LIMIT from 40 to 50 in the following query triggers a different execution plan, and unfortunately the one I need is much slower. So the question is: why is this happening, and how can I force PostgreSQL to use the faster plan? I'm using PostgreSQL 14.5.

SELECT "Id"
 FROM "Podcasts" AS P 
 INNER JOIN "PodcastCategories" AS PC ON P."Id"=PC."PodcastId"
 WHERE "LastPublishDate" IS NOT NULL AND "Dead" = false AND "Hidden" = false AND PC."CategoryId" = ANY (ARRAY[1]) 
 AND P."LastPublishDate"<'2023-01-14 23:00:00+00'
 ORDER BY "LastPublishDate" DESC
 LIMIT 50

This is the plan for LIMIT 40; it is the expected one and it is fast:

Limit (cost=1000.87..53095.17 rows=40 width=12) (actual time=46.797..606.536 rows=40 loops=1)
  ->  Gather Merge (cost=1000.87..60909.31 rows=46 width=12) (actual time=46.796..606.518 rows=40 loops=1)
        Workers Planned: 2
        Workers Launched: 2
        ->  Nested Loop (cost=0.84..59903.98 rows=19 width=12) (actual time=23.367..448.066 rows=15 loops=3)
              ->  Parallel Index Only Scan using "IX_Podcasts_LastPublishDate" on "Podcasts" p (cost=0.42..55488.81 rows=2442 width=12) (actual time=0.791..63.207 rows=259 loops=3)
                    Index Cond: ("LastPublishDate" < '2023-01-14 23:00:00+00'::timestamp with time zone)
                    Heap Fetches: 776
              ->  Index Only Scan using "PK_PodcastCategories" on "PodcastCategories" pc (cost=0.42..1.80 rows=1 width=4) (actual time=1.487..1.487 rows=0 loops=776)
                    Index Cond: (("PodcastId" = p."Id") AND ("CategoryId" = ANY ('{1}'::integer[])))
                    Heap Fetches: 21
Planning Time: 2.468 ms
Execution Time: 606.588 ms

This is the plan for LIMIT 50, and it runs much slower:

Limit (cost=59885.72..59888.83 rows=27 width=12) (actual time=34419.067..34436.304 rows=50 loops=1)
  ->  Gather Merge (cost=59885.72..59888.83 rows=27 width=12) (actual time=34419.065..34436.298 rows=50 loops=1)
        Workers Planned: 1
        Workers Launched: 1
        ->  Sort (cost=58885.71..58885.78 rows=27 width=12) (actual time=34415.504..34415.510 rows=40 loops=2)
              Sort Key: p."LastPublishDate" DESC
              Sort Method: top-N heapsort  Memory: 28kB
              Worker 0:  Sort Method: top-N heapsort  Memory: 29kB
              ->  Parallel Hash Join (cost=55858.19..58885.07 rows=27 width=12) (actual time=34386.500..34412.404 rows=10528 loops=2)
                    Hash Cond: (pc."PodcastId" = p."Id")
                    ->  Parallel Bitmap Heap Scan on "PodcastCategories" pc (cost=336.90..3313.48 rows=19163 width=4) (actual time=94.378..2852.945 rows=16934 loops=2)
                          Recheck Cond: ("CategoryId" = ANY ('{1}'::integer[]))
                          Heap Blocks: exact=1292
                          ->  Bitmap Index Scan on "IX_PodcastCategories_CategoryId" (cost=0.00..328.75 rows=32577 width=0) (actual time=91.542..91.543 rows=33879 loops=1)
                                Index Cond: ("CategoryId" = ANY ('{1}'::integer[]))
                    ->  Parallel Hash (cost=55490.76..55490.76 rows=2442 width=12) (actual time=31518.266..31518.267 rows=130037 loops=2)
                          Buckets: 131072 (originally 8192)  Batches: 4 (originally 1)  Memory Usage: 4128kB
                          ->  Parallel Index Only Scan using "IX_Podcasts_LastPublishDate" on "Podcasts" p (cost=0.42..55490.76 rows=2442 width=12) (actual time=0.029..30960.929 rows=130037 loops=2)
                                Index Cond: ("LastPublishDate" < '2023-01-14 23:00:00+00'::timestamp with time zone)
                                Heap Fetches: 260290
Planning Time: 0.348 ms
Execution Time: 34436.367 ms
asked Jan 15, 2023 at 16:44

1 Answer


At some point it thinks reading all the qualifying rows and sorting them will be faster than walking an index in already-sorted order and filtering out the ones that don't qualify until it reaches the LIMIT. And it is true: at some point that would be faster. But it misestimates where that change-over happens, probably because it grossly misestimates the number of rows meeting the LastPublishDate criterion (2442 estimated vs 130037 actual).
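
One way to see that misestimate in isolation is to explain the date predicate on its own. A minimal sketch, only reusing the table, column, and cutoff timestamp from the question:

 EXPLAIN (ANALYZE, TIMING OFF)
 SELECT "Id"
 FROM "Podcasts"
 WHERE "LastPublishDate" < '2023-01-14 23:00:00+00';

If the estimated row count in that plan is far below the actual one (as in the 2442 vs 130037 above), the statistics on "LastPublishDate" are stale.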

I think there is no good reason for such a horrible misestimate. Your table seems to be severely under-analyzed. And probably also under-vacuumed, based on the large number of heap fetches you are seeing.
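
A minimal sketch of how to bring the statistics and the visibility map back up to date, using only standard maintenance commands and the table names from the question:

 -- Refresh planner statistics and clean up dead tuples.
 -- VACUUM also updates the visibility map, which should shrink the
 -- "Heap Fetches" counts on the index-only scans.
 VACUUM (ANALYZE) "Podcasts";
 VACUUM (ANALYZE) "PodcastCategories";

 -- Check when autovacuum/autoanalyze last touched these tables.
 SELECT relname, last_vacuum, last_autovacuum, last_analyze, last_autoanalyze
 FROM pg_stat_user_tables
 WHERE relname IN ('Podcasts', 'PodcastCategories');

If autovacuum never keeps up on these tables, lowering their autovacuum_vacuum_scale_factor / autovacuum_analyze_scale_factor storage parameters (via ALTER TABLE ... SET (...)) is a common follow-up.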

answered Jan 16, 2023 at 0:08
  • And, on top, I would check if random_page_cost is set correctly (see the sketch after these comments). Commented Jan 16, 2023 at 7:03
  • Creating appropriate statistics would probably help. Docs: postgresql.org/docs/current/sql-createstatistics.html Commented Jan 16, 2023 at 8:49
  • @RabbanKeyak I don't see which extended statistic would be likely to help here. The default statistics should be good enough; they just need to be kept more up to date. Commented Jan 16, 2023 at 19:37
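
On the random_page_cost point from the first comment, a minimal sketch of how to check and adjust it. The 1.1 value is only a common starting point for SSD-backed storage, not something derived from this particular setup, and ALTER SYSTEM needs superuser rights:

 SHOW random_page_cost;                     -- default is 4
 ALTER SYSTEM SET random_page_cost = 1.1;   -- assumed SSD-style value
 SELECT pg_reload_conf();                   -- apply without a restart

Lowering it makes index access look cheaper to the planner, which nudges it toward the nested-loop plan; verify the effect with EXPLAIN (ANALYZE) before and after changing it.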
