Help with Recursive CTE query with INSERT, ON CONFLICT and RETURNING

Question 1

I'm trying to write a query to insert filenames to the following table table path

CREATE SEQUENCE path_id_seq;
CREATE TABLE path (
 id INTEGER PRIMARY KEY DEFAULT NEXTVAL('path_id_seq'),
 parent_id INTEGER,
 name TEXT NOT NULL);
CREATE UNIQUE INDEX path_parent_id_name_ix
 ON path (COALESCE(parent_id, 0), name);
ALTER TABLE path
 ADD CONSTRAINT ios_path_parent_id_fk
 FOREIGN KEY (parent_id) REFERENCES path (id)
 ON DELETE CASCADE
 DEFERRABLE INITIALLY DEFERRED;

For example, given filenames Assets/images/a.png and Assets/images/b.png I'd like to insert the following rows.

id | parent_id | name 
----+-----------+-------------------
 1 | ¤ | Assets
 2 | 1 | images
 3 | 2 | a.png
 4 | 2 | b.png

Inserting the parent is easy;

INSERT INTO path (name)
VALUES ('Assets')
ON CONFLICT (COALESCE(parent_id, 0), name)
DO UPDATE SET name = EXCLUDED.name -- this strange update is so that we get rows back
RETURNING id, parent_id, name;

which yields;

 id | parent_id | name 
----+-----------+--------
 1 | ¤ | Assets

But I can't seem to put together a recursive CTE to use the parent's id as the parent_id in the next INSERT statement.

It will probably look somethings like this invalid query below.

WITH RECURSIVE names AS (
 SELECT name, lag(name, 1) OVER () as previous_name
 FROM (SELECT unnest(string_to_array('Assets/images/232.png', '/')) as name) names),
paths(id, name) AS (
 INSERT INTO path (name)
 SELECT name
 FROM names
 WHERE previous_name IS NULL
 ON CONFLICT (COALESCE(parent_id, 0), name)
 DO UPDATE SET name = EXCLUDED.name
 RETURNING id, name
 UNION
 INSERT INTO path (parent_id, name)
 SELECT paths.id, names.name
 FROM paths
 JOIN names ON names.previous_name = paths.name
 ON CONFLICT (COALESCE(parent_id, 0), name)
 DO UPDATE SET name = EXCLUDED.name
 RETURNING id, name)
SELECT *
FROM paths;

But I just can't seem to get it to work. If anyone sees a simpler way to do this it would be much appreciated.

Question 2

You don't need a recursive CTE, you can use nextval() in the result from the unnest() directly and then use lag() to get the value from the previous row. To deal with existing rows, I would do this using a join between the result of the unnest() and the path table, detecting new and existing rows.

For the new rows, a new ID will be generated and only those will be inserted:

insert into path (id, name, parent_id)
select id, name, parent_id
from (
 select *,
 case 
 when is_new
 then lag(id) over (order by level) 
 else existing_parent
 end as parent_id
 from (
 SELECT case 
 when p.id is null then nextval('path_id_seq') 
 else p.id
 end as id, 
 x.name, x.level, p.parent_id as existing_parent, 
 p.id is null as is_new 
 FROM unnest(string_to_array('Assets/images/232.png', '/')) with ordinality as x(name, level)
 LEFT JOIN path p on p.name = x.name
 order by level
 ) t1
) t2
where is_new;

Live example: http://rextester.com/YGMT19565

The only "drawback" if you will is that the sequence values are not necessarily assigned in the "order" you'd expect. But as the only purpose of a generated primary key is to be unique (and nothing else) this shouldn't be a problem.

Question 3

The lazy but effective way to solve this is with

CREATE OR REPLACE FUNCTION my_func(filename TEXT) RETURNS SETOF INTEGER AS
 $BODY$
 DECLARE
 pid INTEGER := NULL;
 word TEXT;
 BEGIN
 FOR word IN SELECT unnest(string_to_array(filename, '/')) LOOP
 WITH parent AS (
 INSERT INTO path (parent_id, name)
 VALUES (pid, word)
 ON CONFLICT (COALESCE(parent_id, 0), name)
 DO UPDATE SET name = EXCLUDED.name
 RETURNING id)
 SELECT parent.id INTO pid FROM parent;
 RETURN NEXT pid;
 END LOOP;
 RETURN;
 END;
 $BODY$ LANGUAGE plpgsql;
SELECT my_func('the/boots/are/on/the/ground');

user1822user1822 · Accepted Answer · 2016-11-03 21:22:29Z

You don't need a recursive CTE, you can use nextval() in the result from the unnest() directly and then use lag() to get the value from the previous row. To deal with existing rows, I would do this using a join between the result of the unnest() and the path table, detecting new and existing rows.

For the new rows, a new ID will be generated and only those will be inserted:

insert into path (id, name, parent_id)
select id, name, parent_id
from (
 select *,
 case 
 when is_new
 then lag(id) over (order by level) 
 else existing_parent
 end as parent_id
 from (
 SELECT case 
 when p.id is null then nextval('path_id_seq') 
 else p.id
 end as id, 
 x.name, x.level, p.parent_id as existing_parent, 
 p.id is null as is_new 
 FROM unnest(string_to_array('Assets/images/232.png', '/')) with ordinality as x(name, level)
 LEFT JOIN path p on p.name = x.name
 order by level
 ) t1
) t2
where is_new;

Live example: http://rextester.com/YGMT19565

The only "drawback" if you will is that the sequence values are not necessarily assigned in the "order" you'd expect. But as the only purpose of a generated primary key is to be unique (and nothing else) this shouldn't be a problem.

Stack Exchange Network

Help with Recursive CTE query with INSERT, ON CONFLICT and RETURNING

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Help with Recursive CTE query with INSERT, ON CONFLICT and RETURNING

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions