I am working on a query that uses a recursive CTE in PostgreSQL 13. The query is intended to recurse from the root nodes down to the deepest leaf node with certain properties (rows labeled 'begat' in the resource_dependency table, with a matching "sample type" property from another table).
https://explain.dalibo.com/plan/g2d2f5fg9c563b0d#plan https://explain.depesz.com/s/eaRu
Both the construction of the CTE work table and, even more so, the scan of the CTE to apply the constraints appear to be the most time-consuming parts of the query. Most of the work is done in memory with the given settings, yet we still read about 780 GB from cache (cache hits), which looks like duplicated work.
The actual number of nodes in any given tree may be 1000-5000. In this operation we are starting from 890 root nodes. There are both 1:many and many:1 relationships in the tree (splitting and recombining).
Doing some experiments in a blank environment, I notice that:
- To get all parent->child resource_ids in a single column, we need a left join so that every leaf node gets a null child row with its calculated depth (a rough sketch of this shape follows below).
- We track path to ensure we don't get into an infinite cycle. While path is important for avoiding cycles, it is not relevant for the final calculation, and some paths duplicate information for the same root -> descendant. This makes the CTE work table much larger.
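For illustration, a minimal sketch of the shape this recursive CTE takes; the names roots, prior_id and resource_id are assumptions based on the description above (the real query is in the linked plans), and the join that checks the "sample type" property is omitted:

```sql
WITH RECURSIVE tree AS (
    -- Anchor: the root resources (however the 890 roots are selected)
    SELECT r.resource_id,
           0                    AS depth,
           ARRAY[r.resource_id] AS path,
           false                AS is_cycle
    FROM   roots r
    UNION ALL
    -- Recursive step: LEFT JOIN so leaves also emit a row with a NULL child
    SELECT d.resource_id,
           t.depth + 1,
           t.path || d.resource_id,
           COALESCE(d.resource_id = ANY(t.path), false)
    FROM   tree t
    LEFT JOIN resource_dependency d
           ON  d.prior_id = t.resource_id
           AND d.label    = 'begat'
    WHERE  t.resource_id IS NOT NULL   -- stop after the NULL leaf row
      AND  NOT t.is_cycle              -- stop if a cycle was detected
)
SELECT * FROM tree;
```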
Are there any techniques that can be used here to make this recursion less painful? We need to match each starting root to only its deepest leaf node reached through the 'begat' relationship with the matching 'sample type' property.
1 Answer
I tried several optimizations for this query. It seems that having a tree with one or more fan-ins (for example, many samples -> 1 sample) while calculating the path and cycle columns results in mostly duplicated information (only path differs).
For a hierarchy of records where the integer IDs (the main join condition) fan out and back in like so: 1 sample -> 1000 samples -> 1 sample -> 10 samples -> 1 sample
The resulting CTE work table is 32,000 records long, where only path differs. Dropping path (and the associated cycle column) reduces the CTE work table to 2021 rows (the number of resource_dependency edges plus 1 leaf row):
- 1000 connections between 1 Gen 0 and 1000 Gen 1
- 1000 connections between 1000 Gen 1 and 1 Gen 2
- 10 connections between 1 Gen 2 and 10 Gen 3
- 10 connections between 10 Gen 3 and 1 Gen 4
- 1 final row for the leaf node (child_id is null) <<< target row information
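Roughly, the 32,000 figure comes from counting one row per distinct root-to-node path rather than per edge: 1000 rows for the Gen 0->1 edges, 1000 for Gen 1->2 (one per path through Gen 1), 10,000 for Gen 2->3 (1000 paths x 10 edges), 10,000 for Gen 3->4, and 10,000 null-child leaf rows (one per path reaching Gen 4), which sums to about 32,000. Once path is dropped, UNION can deduplicate, leaving one row per distinct edge: 2020 edges plus the single leaf row gives 2021.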
When tracking path and cycle, fan-ins to the same child_id mean we have to repeatedly join that same child_id integer N times, where N is the number of samples fanned in from parent to child.
UNION would deduplicate rows in the work table, but the unique paths used for cycle detection prevent that.
Some solutions that seem to work:
- Disabling nested loops does seem to help in some cases, as the row count coming out of the CTE working table often seems to be underestimated. For this tree with the original query, performance improved from 103 seconds to 37 seconds.
- Removing cycle detection, by dropping the path and cycle calculated columns from the query, gets it down to 600 ms (!)
- Similar techniques that work (a rough sketch follows this list):
  - Using a fixed upper limit on generation (depth) lets us mitigate the cycle-tracking cost without tracking path, and completes in less than 1 second.
  - Using a 2nd CTE (non-materialized seems to work) lets us deduplicate rows from the original CTE by dropping the path and cycle information in a 2nd step (7.5 seconds). The 1st CTE does the same work, but the primary query has far fewer rows to scan.
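A rough sketch of combining the planner setting from the first bullet with the 2nd-CTE deduplication, using the same assumed names as the sketch in the question (the final SELECT is only a stand-in for the real deepest-leaf query):

```sql
BEGIN;
-- Keep the change local to this transaction; counters the underestimated
-- row count coming out of the recursive CTE.
SET LOCAL enable_nestloop = off;

WITH RECURSIVE tree AS (
    SELECT r.resource_id, 0 AS depth,
           ARRAY[r.resource_id] AS path, false AS is_cycle
    FROM   roots r
    UNION ALL
    SELECT d.resource_id, t.depth + 1, t.path || d.resource_id,
           COALESCE(d.resource_id = ANY(t.path), false)
    FROM   tree t
    LEFT JOIN resource_dependency d
           ON d.prior_id = t.resource_id AND d.label = 'begat'
    WHERE  t.resource_id IS NOT NULL AND NOT t.is_cycle
),
-- 2nd CTE: drop path/is_cycle and deduplicate so the primary query scans
-- far fewer rows; NOT MATERIALIZED lets it fold into that query.
deduped AS NOT MATERIALIZED (
    SELECT DISTINCT resource_id, depth
    FROM   tree
)
SELECT resource_id, depth
FROM   deduped;

COMMIT;
```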
In summary: with recursive CTEs (which are always materialized, FWIW), UNION performance can degrade when path and cycle are tracked (at least with the technique used by the query shown here), because the unique paths defeat deduplication and UNION ends up performing about the same as UNION ALL. Relationships that point to the same join value for the next recursive step (in my case an integer ID) will inflate the CTE working table.
Some other techniques which might mitigate cycle issues without hurting query performance:
- Prevent cycles from being inserted
- Remove path and cycle information before the primary query with a 2nd CTE
- Use an upper limit on generation/depth, or use a query timeout (a rough sketch follows this list)
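A rough sketch of the depth-cap variant with no path/cycle tracking at all (same assumed names as above; the 50-generation limit and 30-second timeout are arbitrary placeholders):

```sql
-- Query timeout as a safety net in place of per-row cycle detection.
SET statement_timeout = '30s';

WITH RECURSIVE tree AS (
    SELECT r.resource_id, 0 AS depth
    FROM   roots r
    UNION                               -- plain UNION can deduplicate now that
                                        -- no path column keeps every row unique
    SELECT d.resource_id, t.depth + 1
    FROM   tree t
    LEFT JOIN resource_dependency d
           ON d.prior_id = t.resource_id AND d.label = 'begat'
    WHERE  t.resource_id IS NOT NULL    -- stop after the NULL leaf row
      AND  t.depth < 50                 -- fixed upper bound on generations
)
SELECT * FROM tree;
```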
Explains for the optimizations: https://explain.depesz.com/s/4ZJB
Additional optimizations: 1. Create a compound index on resource_dependency(prior_id, label), which reduces work within the recursion. 2. Rewrite the query to minimize the number of joins in the recursive CTE. This greatly reduces the size of the Work Table and increases efficiency. explain.dalibo.com/plan/6d8bb7a7ccb948h3 – Justin Lowen, Oct 23, 2024 at 18:48
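For illustration, the compound index mentioned in that comment might look like this (the index name is illustrative):

```sql
-- Compound index so the recursive join's lookup by prior_id can also apply
-- the label = 'begat' filter inside the index.
CREATE INDEX ix_resource_dependency_prior_id_label
    ON resource_dependency (prior_id, label);
```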
Comments (excerpts):
- (… null child nodes.) Without removing unique paths, the CTE is 32,000 rows.
- Set enable_nestloop to off for that one query.
- ix_resource_dependency_prior_id could benefit from adding the column label, since much filtering has been done on it. barcode also needs to be in an index.