
I'm developing an application that dynamically builds up SQL queries and executes them against Postgres 13.

Some of the queries are very slow because they use nested loops and the query planner underestimates the number of loop iterations. Running the same query manually with set enable_nestloop to off; improves query speed from 28s to 500ms (I'll post the plans for this example query at the end). I think the cause is the large number of filters involved (and users can add more filters on demand).

Things I've tried:

  • Globally disabling nested loops: helps a lot for some queries, but makes many other queries quite a bit slower, so it's not a good solution

  • Increasing the statistics target to max: I have set the statistics target for all related tables to the maximum (10000) and analyzed all tables. No effect.

  • Extended statistics: I have set up multiple extended statistics objects. I've tried a lot of different combinations, and while it helped for a few queries, for most queries (including the example query) it had no effect. I had hoped it would have an impact, since filters can often depend on each other.

  • Manipulating random_page_cost: I have set the cost for random access very high (e.g. 15x the cost for sequential access). This discouraged the planner from using nested loops for some queries, but it made performance bad for many other queries and feels like an even worse hack than disabling nested loops globally.
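
For concreteness, the last three attempts correspond to statements roughly like the following (the statistics object name and column choices are illustrative, not what I necessarily used):

```sql
-- Raise the per-column statistics target to the maximum, then re-analyze:
alter table employee alter column versionid set statistics 10000;
alter table data alter column versionid set statistics 10000;
analyze employee;
analyze data;

-- One possible extended-statistics object, capturing a dependency
-- between two filter columns:
create statistics stats_employee_ns_version (dependencies)
    on namespace, versionid from employee;
analyze employee;

-- Make random access look expensive
-- (about 15x the default seq_page_cost of 1.0):
set random_page_cost = 15;
```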

So far I've had no luck and I've run out of options. Ideas very welcome!

Here is an example of a query and the plans (with and without nested loops):

explain analyze select
 count(distinct EMPLOYEE.employeeId) as "measure_0",
 avg(DATA.score) as "measure_1",
 EMPLOYEE.managerId as "dimension_0",
 DATA.versionId as "dimension_1"
from
 data
join EMPLOYEE on
 (DATA.EMPLOYEEID = EMPLOYEE.EMPLOYEEID
 and DATA.versionid = EMPLOYEE.versionid
 and EMPLOYEE.namespace = 'company1')
where
 (DATA.namespace = 'company1'
 and EMPLOYEE.managerId in ('id1', 'id2', ...[100 more])
 and EMPLOYEE.versionId in ('version1', 'version2', ...[10 more])
 and DATA.versionId in ('version1', 'version2', ...[10 more, same as above]))
group by
 "dimension_0",
 "dimension_1";

Plan with nested loops enabled:

GroupAggregate  (cost=62420.73..62420.75 rows=1 width=34) (actual time=28117.300..28142.646 rows=172 loops=1)
  Group Key: employee.managerid, data.versionid
  ->  Sort  (cost=62420.73..62420.73 rows=1 width=35) (actual time=28116.925..28123.021 rows=23132 loops=1)
        Sort Key: employee.managerid, data.versionid
        Sort Method: quicksort  Memory: 4021kB
        ->  Gather  (cost=1082.22..62420.72 rows=1 width=35) (actual time=85.357..28059.440 rows=23132 loops=1)
              Workers Planned: 2
              Workers Launched: 2
              ->  Nested Loop  (cost=82.22..61420.62 rows=1 width=35) (actual time=89.586..27974.865 rows=7711 loops=3)
                    ->  Parallel Bitmap Heap Scan on employee  (cost=81.79..7385.09 rows=16 width=27) (actual time=9.869..118.660 rows=541 loops=3)
                          Recheck Cond: (((namespace)::text = 'company1'::text) AND ((versionid)::text = ANY ('{version1,version2, ...[10 more]}'::text[])))
                          Filter: ((managerid)::text = ANY ('{id1, id2, ...[100 more]}'::text[]))
                          Rows Removed by Filter: 5739
                          Heap Blocks: exact=2848
                          ->  Bitmap Index Scan on index_employee_namespace_versionid  (cost=0.00..81.78 rows=6338 width=0) (actual time=5.864..5.865 rows=20535 loops=1)
                                Index Cond: (((namespace)::text = 'company1'::text) AND ((versionid)::text = ANY ('{version1,version2, ...[10 more]}'::text[])))
                    ->  Index Scan using index_data_namespace_versionid_employeeid on data  (cost=0.43..3377.21 rows=1 width=30) (actual time=29.467..51.361 rows=14 loops=1624)
                          Index Cond: (((namespace)::text = 'company1'::text) AND ((versionid)::text = (employee.versionid)::text) AND ((versionid)::text = ANY ('{version1,version2, ...[10 more]}'::text[])))
                          Filter: ((employee.employeeid)::text = (employeeid)::text)
                          Rows Removed by Filter: 19192
Planning Time: 19.748 ms
Execution Time: 28142.911 ms

With nested loops disabled:

GroupAggregate  (cost=150534.11..150534.14 rows=1 width=34) (actual time=486.321..511.282 rows=172 loops=1)
  Group Key: employee.managerid, data.versionid
  ->  Sort  (cost=150534.11..150534.12 rows=1 width=35) (actual time=486.009..494.591 rows=23132 loops=1)
        Sort Key: employee.managerid, data.versionid
        Sort Method: quicksort  Memory: 4021kB
        ->  Gather  (cost=143162.61..150534.10 rows=1 width=35) (actual time=315.171..427.200 rows=23132 loops=1)
              Workers Planned: 2
              Workers Launched: 2
              ->  Parallel Hash Join  (cost=142162.61..149534.00 rows=1 width=35) (actual time=295.646..398.855 rows=7711 loops=3)
                    Hash Cond: (((employee.employeeid)::text = (data.employeeid)::text) AND ((employee.versionid)::text = (data.versionid)::text))
                    ->  Parallel Bitmap Heap Scan on employee  (cost=81.79..7385.09 rows=16 width=27) (actual time=18.563..107.860 rows=541 loops=3)
                          Recheck Cond: (((namespace)::text = 'company1'::text) AND ((versionid)::text = ANY ('{version1,version2, ...[10 more]}'::text[])))
                          Filter: ((managerid)::text = ANY ('{id1, id2, ...[100 more]}'::text[]))
                          Rows Removed by Filter: 5739
                          Heap Blocks: exact=4523
                          ->  Bitmap Index Scan on index_employee_namespace_versionid  (cost=0.00..81.78 rows=6338 width=0) (actual time=5.950..5.952 rows=20535 loops=1)
                                Index Cond: (((namespace)::text = 'company1'::text) AND ((versionid)::text = ANY ('{version1,version2, ...[10 more]}'::text[])))
                    ->  Parallel Hash  (cost=140631.49..140631.49 rows=96622 width=30) (actual time=275.676..275.679 rows=76621 loops=3)
                          Buckets: 262144  Batches: 1  Memory Usage: 18304kB
                          ->  Parallel Bitmap Heap Scan on data  (cost=2894.61..140631.49 rows=96622 width=30) (actual time=14.263..159.147 rows=76621 loops=3)
                                Recheck Cond: (((namespace)::text = 'company1'::text) AND ((versionid)::text = ANY ('{version1,version2, ...[10 more]}'::text[])))
                                Heap Blocks: exact=4099
                                ->  Bitmap Index Scan on index_data_namespace_versionid_employeeid  (cost=0.00..2836.64 rows=231892 width=0) (actual time=29.397..29.397 rows=229862 loops=1)
                                      Index Cond: (((namespace)::text = 'company1'::text) AND ((versionid)::text = ANY ('{version1,version2, ...[10 more]}'::text[])))
Planning Time: 19.753 ms
Execution Time: 511.575 ms

UPDATED QUERY PLANS

As suggested in the comments, here are the plans again with set max_parallel_workers_per_gather TO 0; and with buffers.
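
That is, the updated plans below were collected along these lines (note that changing track_io_timing generally requires superuser privileges):

```sql
set max_parallel_workers_per_gather to 0;  -- no parallel workers
set track_io_timing = on;                  -- adds I/O Timings to the output
explain (analyze, buffers) select ... ;    -- the example query from above
```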

enable_nestloop=on

GroupAggregate  (cost=149659.22..149659.24 rows=1 width=34) (actual time=29435.636..29452.065 rows=172 loops=1)
  Group Key: employee.managerid, data.versionid
  Buffers: shared hit=29795930 read=12823
  I/O Timings: read=4732.666
  ->  Sort  (cost=149659.22..149659.22 rows=1 width=34) (actual time=29435.143..29437.483 rows=23132 loops=1)
        Sort Key: employee.managerid, data.versionid
        Sort Method: quicksort  Memory: 4021kB
        Buffers: shared hit=29795930 read=12823
        I/O Timings: read=4732.666
        ->  Nested Loop  (cost=0.86..149659.21 rows=1 width=34) (actual time=440.432..29371.818 rows=23132 loops=1)
              Buffers: shared hit=29795927 read=12823
              I/O Timings: read=4732.666
              ->  Index Scan using index_employee_namespace_versionid on employee  (cost=0.43..7451.89 rows=43 width=26) (actual time=2.811..2057.831 rows=1624 loops=1)
                    Index Cond: (((namespace)::text = 'company1'::text) AND ((versionid)::text = ANY ('{version1,version2, ...[10 more]}'::text[])))
                    Filter: ((managerid)::text = ANY ('{id1, id2, ...[100 more]}'::text[]))
                    Rows Removed by Filter: 17216
                    Buffers: shared hit=18938 read=727
                    I/O Timings: read=484.313
              ->  Index Scan using index_data_namespace_versionid_employeeid on data  (cost=0.43..3307.14 rows=1 width=30) (actual time=10.658..16.779 rows=14 loops=1624)
                    Index Cond: (((namespace)::text = 'company1'::text) AND ((versionid)::text = (employee.versionid)::text) AND ((versionid)::text = ANY ('{version1,version2, ...[10 more]}'::text[])))
                    Filter: ((employee.employeeid)::text = (employeeid)::text)
                    Rows Removed by Filter: 19192
                    Buffers: shared hit=29776989 read=12096
                    I/O Timings: read=4248.353
Planning:
  Buffers: shared hit=698
Planning Time: 40.529 ms
Execution Time: 29452.638 ms

enable_nestloop=off

GroupAggregate  (cost=158256.87..158256.90 rows=1 width=34) (actual time=444.300..462.896 rows=172 loops=1)
  Group Key: employee.managerid, data.versionid
  Buffers: shared hit=32581
  ->  Sort  (cost=158256.87..158256.88 rows=1 width=34) (actual time=444.046..447.080 rows=23132 loops=1)
        Sort Key: employee.managerid, data.versionid
        Sort Method: quicksort  Memory: 4021kB
        Buffers: shared hit=32581
        ->  Hash Join  (cost=150369.91..158256.86 rows=1 width=34) (actual time=305.940..410.826 rows=23132 loops=1)
              Hash Cond: (((employee.employeeid)::text = (data.employeeid)::text) AND ((employee.versionid)::text = (data.versionid)::text))
              Buffers: shared hit=32581
              ->  Index Scan using index_employee_namespace_versionid on employee  (cost=0.43..7451.89 rows=43 width=26) (actual time=0.128..89.940 rows=1624 loops=1)
                    Index Cond: (((namespace)::text = 'company1'::text) AND ((versionid)::text = ANY ('{version1,version2, ...[10 more]}'::text[])))
                    Filter: ((managerid)::text = ANY ('{id1, id2, ...[100 more]}'::text[]))
                    Rows Removed by Filter: 17216
                    Buffers: shared hit=19662
              ->  Hash  (cost=146839.27..146839.27 rows=235347 width=30) (actual time=304.247..304.250 rows=229862 loops=1)
                    Buckets: 262144  Batches: 1  Memory Usage: 17088kB
                    Buffers: shared hit=12919
                    ->  Bitmap Heap Scan on data  (cost=2980.64..146839.27 rows=235347 width=30) (actual time=19.781..165.733 rows=229862 loops=1)
                          Recheck Cond: (((namespace)::text = 'company1'::text) AND ((versionid)::text = ANY ('{version1,version2, ...[10 more]}'::text[])))
                          Heap Blocks: exact=12285
                          Buffers: shared hit=12919
                          ->  Bitmap Index Scan on index_data_namespace_versionid_employeeid  (cost=0.00..2921.80 rows=235347 width=0) (actual time=17.701..17.701 rows=229862 loops=1)
                                Index Cond: (((namespace)::text = 'company1'::text) AND ((versionid)::text = ANY ('{version1,version2, ...[10 more]}'::text[])))
                                Buffers: shared hit=634
Planning Time: 21.766 ms
Execution Time: 463.536 ms
asked Mar 21, 2022 at 16:08
  • Postgres 14 has some improvements for queries with large IN lists. Do you have the possibility to test your query on that version? Commented Mar 21, 2022 at 16:45
  • If you set max_parallel_workers_per_gather TO 0; in your session before collecting the plans, they will be much easier to interpret. Commented Mar 21, 2022 at 21:22
  • You should turn on track_io_timing and then collect your plan with EXPLAIN (ANALYZE, BUFFERS). Commented Mar 21, 2022 at 21:25
  • You describe some dependence between filter conditions. Please illustrate them with some examples. Similarly, you said you tried some extended statistics. But without knowing which ones you tried, what could we possibly recommend? Commented Mar 21, 2022 at 21:30
  • Why does your query plan contain so many casts to text? What are your column types to start with? Commented Mar 21, 2022 at 21:58

2 Answers


You are currently testing EMPLOYEE.versionId and DATA.versionId against the same IN-list. The planner multiplies those two selectivities together when making its estimates. But of course that is wrong: since the two versionId columns are already constrained to equal each other, either both will pass the IN list or both will fail. They are completely dependent, not at all independent as the planner assumes. So the easiest thing you can try is to delete one of those two IN-list tests. That is not the only estimation problem, of course, but it is the easiest one to do something about.
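
A sketch of that change, based on the example query from the question (the IN lists are elided here just as they are in the original):

```sql
explain analyze select
  count(distinct employee.employeeid) as "measure_0",
  avg(data.score) as "measure_1",
  employee.managerid as "dimension_0",
  data.versionid as "dimension_1"
from data
join employee on
  (data.employeeid = employee.employeeid
   and data.versionid = employee.versionid
   and employee.namespace = 'company1')
where
  (data.namespace = 'company1'
   and employee.managerid in ('id1', 'id2', ...)        -- [100 more]
   and employee.versionid in ('version1', 'version2', ...))  -- [10 more]
   -- the IN-list on data.versionid is gone: the join condition
   -- data.versionid = employee.versionid already enforces it
group by "dimension_0", "dimension_1";
```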

It is hard to predict what this will do, for two reasons, so you will just have to try it and see. One reason is that we don't know how selective the IN-list is in isolation, so we don't know how much difference multiplying by it once rather than twice will make. The other is that estimated row counts are clamped to be not less than one, and are also reported as integers. So is the 1 estimate for the hash join really 1.49, or really 0.0000001? Multiplying the former by (say) 100 will make a big difference to the end result, while multiplying the latter by 100 will not.

Experimentally, I do see that having columns of type varchar(255) causes the plans to show a lot of casting to text. Changing those types to text would be unlikely to make a performance difference, but it would make the plans easier to read. Cross-type comparisons can disable efficient index usage and also lead to explicit casts showing up in the plan, which is why I asked about it. But since varchar columns always show up with casts yet don't disable indexes, that probably isn't actually an issue here.

But that does make me wonder why the usage of the index "index_data_namespace_versionid_employeeid" is so bad. The name implies the index is over (namespace, versionid, employeeid). But if so, why is employeeid present in the Filter rather than in the Index Condition? I can't reproduce this myself: I get all 3 columns showing up in the Index Condition as simple equality checks, while the versionId IN-list check is what gets relegated to the Filter. If you could get use of all 3 columns of the index, I think the "bad" plan would become much better--maybe better than the hash plan, or maybe at least good enough not to worry about further. Dropping the extraneous IN-list check should be able to get this to happen (or it might prompt a change to the hash join and so get rid of the nested loop altogether--either of those would be a victory). But if that doesn't work, you could try reversing the last two columns in the index, so it would be (namespace, employeeid, versionid). Since I can't reproduce the problem, I can't test if these things would solve it.
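
If you want to try reversing the columns, a sketch (the new index name is made up; CONCURRENTLY avoids blocking writes while the index builds):

```sql
create index concurrently index_data_namespace_employeeid_versionid
    on data (namespace, employeeid, versionid);
-- once satisfied that the new index is being used:
drop index concurrently index_data_namespace_versionid_employeeid;
```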

answered Mar 30, 2022 at 18:52
  • First of all, thank you for this awesome response. "If you could get use of all 3 columns of the index, I think the 'bad' plan would become much better"--yes! When I get the planner to do this, the query is super fast. Unfortunately, removing the extraneous IN check makes it more likely to happen, but on its own (without other undesired changes like random_page_cost) does not make it happen. I will try your suggestion of reversing the columns in the index though; maybe that helps. Commented Mar 31, 2022 at 11:05

If you have that many elements in the IN list, write them into a temporary table instead, ANALYZE that table, and join against it.
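
A sketch of that approach for the long managerId list (the table name is made up, and the lists and query body are elided as in the question):

```sql
create temporary table manager_filter (managerid text primary key);
insert into manager_filter values ('id1'), ('id2'), ...;  -- [100 more]
analyze manager_filter;  -- gives the planner real statistics on the list

-- then replace the big IN list with a join:
select ...
from data
join employee on ...
join manager_filter mf on mf.managerid = employee.managerid
where ...;
```

A join against a VALUES list (as mentioned in the comments) is a lighter-weight variant of the same idea, without the ANALYZE step.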

answered Mar 21, 2022 at 16:15
  • I have also seen joins against a values clause to be much faster than using large IN lists. Commented Mar 21, 2022 at 16:38
  • Do you mean both ids or only the 100+ manager-ids? Commented Mar 21, 2022 at 19:52
  • I am talking about the long IN lists. Not sure if it will help, but it is worth a try. Commented Mar 22, 2022 at 2:33
