0

Imagine I have a table "departments" like this:

name TEXT
parent_department TEXT (nullable)

I also have a "budgets" table like this:

department_name TEXT
budget INTEGER

I need to "traverse" this departments table from "leaf to root", following the parent department column. The end result is that I need to sum up the budgets of each department, but the twist is that the "budgets" table that I'll be using at any given time is very sparse (10s to 100s of rows), while departments is very dense (100m rows).

Normally, I would approach this by creating a "with recursive" CTE that "flattens" the department relationship into a temporary table of just "root department", "leaf department". Then I would join that table on leaf department to budgets, and group by root. The problem is that the "department" table is FAR too big now (the analogy to departments breaks down here a little bit I admit, no company has>100m departments).

Concretely, if I had departments like this (child -> parent):

C -> B
B -> A
A -> NULL
E -> D
D -> NULL
F -> G
G -> NULL

And budgets like this:

A 1
C 3
E 5

I would want to get as output:

A 4
D 5

And ideally, F and G, for example, would never be touched, because there are hundreds of millions of rows like F and G that shouldn't come up. But in my current query, they do. How can I fix this so that I only "traverse" the departments that I actually need to?

asked May 24, 2020 at 19:17

1 Answer 1

2

It took a while, but I figured it out:

WITH RECURSIVE
initial AS (
 SELECT
 department_name,
 department_name,
 FALSE
 FROM
 budgets
 GROUP BY department_name
),
tmp (child, parent, root) AS (
 SELECT
 *
 FROM
 initial
 UNION ALL
 SELECT
 tmp.child,
 CASE WHEN
 departments.parent_department IS NULL THEN tmp.parent
 ELSE departments.parent_department
 END,
 departments.parent_department IS NULL
 FROM
 tmp
 INNER JOIN departments
 ON tmp.parent = departments.name
 WHERE
 NOT tmp.root
)
SELECT
 tmp.parent,
 SUM(budget)
FROM
 tmp
 INNER JOIN budgets
 ON tmp.child = budgets.department_name
WHERE
 tmp.root
GROUP BY tmp.parent
answered May 24, 2020 at 20:46

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.