I am trying to write a Postgres procedure in pgAdmin. The procedure I've written fails with different error messages depending on where I put the cursor. It even sometimes seems to succeed, but with no procedure created.
My questions:
- What is wrong with my code?
- Is there a better/simpler way to achieve the same thing?
The below code tries to iterate through a table of values (outer loop), and for each row in that table, to iterate through another table that has some missing values (inner loop). Whenever a missing value is found in the inner loop table, if a value that can be used exists in the outer table, I update the inner table with that value.
CREATE OR REPLACE PROCEDURE impute()
LANGUAGE plpgsql
AS $$
DECLARE
cntry_dec_means RECORD;
table_row RECORD;
BEGIN
<<outer_loop>>
FOR cntry_dec_means IN
SELECT * FROM country_decade_means
LOOP
-- if all imputed values are null, don't do anything and move to the next row
IF ISNULL(cntry_dec_means.gdppc)
AND ISNULL(cntry_dec_means.gdp)
AND ISNULL(cntry_dec_means.ed_gdp)
AND ISNULL(cntry_dec_means.population) THEN
CONTINUE;
-- get matching rows from the actual table
<<inner_loop>>
FOR table_row IN
SELECT * FROM inls625_impute
WHERE country = cntry_dec_means.country
AND decade = cntry_dec_means.decade
LOOP
IF (table_row.gdppc IS NULL) AND (cntry_dec_means.gdppc IS NOT NULL) THEN
UPDATE inls625_impute SET gdppc = cntry_dec_means.gdppc
WHERE inls625_impute.country = cntry_dec_means.country
AND inls625_impute.decade = cntry_dec_means.decade;
END LOOP; --<<inner_loop>>
END LOOP; --<<outer_loop>>
END;
$$;
Update
Thanks too @prasad for clarifying how to use loop labels. Unfortunately, it does not seem to work for me.
Here are the errors (in psql):
CREATE OR REPLACE PROCEDURE impute()
LANGUAGE plpgsql
AS $$"
LINE 1: $$;
^
INLS-623-Labs=# table_row RECORD;
ERROR: syntax error at or near "table_row"
LINE 1: table_row RECORD;
^
INLS-623-Labs=# BEGIN
INLS-623-Labs-# <<outer_loop>>
INLS-623-Labs-# FOR cntry_dec_means IN
INLS-623-Labs-# SELECT * FROM country_decade_means
INLS-623-Labs-# LOOP
INLS-623-Labs-# -- if all imputed values are null, don't do anything and move to the next row
INLS-623-Labs-# IF ISNULL(cntry_dec_means.gdppc)
INLS-623-Labs-# AND ISNULL(cntry_dec_means.gdp)
INLS-623-Labs-# AND ISNULL(cntry_dec_means.ed_gdp)
INLS-623-Labs-# AND ISNULL(cntry_dec_means.population) THEN
INLS-623-Labs-# CONTINUE outer_loop;
ERROR: syntax error at or near "<<"
LINE 2: <<outer_loop>>
^
INLS-623-Labs=#
INLS-623-Labs=# -- get matching rows from the actual table
INLS-623-Labs=# <<inner_loop>>
INLS-623-Labs-# FOR table_row IN
INLS-623-Labs-# SELECT * FROM inls625_impute
INLS-623-Labs-# WHERE country = cntry_dec_means.country
INLS-623-Labs-# AND decade = cntry_dec_means.decade
INLS-623-Labs-# LOOP
INLS-623-Labs-# IF ISNULL(table_row.gdppc) AND NOT ISNULL(cntry_dec_means.gdppc) THEN
INLS-623-Labs-# UPDATE inls625_impute SET gdppc = cntry_dec_means.gdppc
INLS-623-Labs-# WHERE inls625_impute.country = cntry_dec_means.country
INLS-623-Labs-# AND inls625_impute.decade = cntry_dec_means.decade;
ERROR: syntax error at or near "<<"
LINE 1: <<inner_loop>>
^
INLS-623-Labs=# END LOOP inner_loop;
ERROR: syntax error at or near "LOOP"
LINE 1: END LOOP inner_loop;
^
INLS-623-Labs=# END LOOP outer_loop;
ERROR: syntax error at or near "LOOP"
LINE 1: END LOOP outer_loop;
^
INLS-623-Labs=# END;
WARNING: there is no transaction in progress
COMMIT
INLS-623-Labs=# $$;
INLS-623-Labs$#
What's more, I reproduced the example provided by @prasad and get similar errors:
CREATE OR REPLACE PROCEDURE t_proc()
LANGUAGE plpgsql
AS $$"
LINE 1: $$;
^
INLS-623-Labs=# t2_rec RECORD;
ERROR: syntax error at or near "t2_rec"
LINE 1: t2_rec RECORD;
^
INLS-623-Labs=# BEGIN
INLS-623-Labs-# <<outer_loop>>
INLS-623-Labs-# FOR t_rec IN
INLS-623-Labs-# SELECT * FROM t
INLS-623-Labs-# LOOP
INLS-623-Labs-# IF t_rec.name IS NULL AND t_rec.data IS NULL THEN
INLS-623-Labs-# CONTINUE outer_loop;
ERROR: syntax error at or near "<<"
LINE 2: <<outer_loop>>
^
INLS-623-Labs=# END IF;
ERROR: syntax error at or near "IF"
LINE 1: END IF;
^
INLS-623-Labs=# RAISE NOTICE 't.id value: %', t_rec.id;
ERROR: syntax error at or near "RAISE"
LINE 1: RAISE NOTICE 't.id value: %', t_rec.id;
^
INLS-623-Labs=# <<inner_loop>>
INLS-623-Labs-# FOR t2_rec IN
INLS-623-Labs-# SELECT * FROM t2 WHERE t2.id = t_rec.id
INLS-623-Labs-# LOOP
INLS-623-Labs-# RAISE NOTICE 't2.id value: %', t2_rec.id;
ERROR: syntax error at or near "<<"
LINE 1: <<inner_loop>>
^
INLS-623-Labs=# END LOOP inner_loop;
ERROR: syntax error at or near "LOOP"
LINE 1: END LOOP inner_loop;
^
INLS-623-Labs=# END LOOP outer_loop;
ERROR: syntax error at or near "LOOP"
LINE 1: END LOOP outer_loop;
^
INLS-623-Labs=# END;
WARNING: there is no transaction in progress
COMMIT
INLS-623-Labs=# $$;
2 Answers 2
- What is wrong with my code?
Technically, only END IF;
is missing twice.
- Is there a better/simpler way to achieve the same thing?
Yes. A plain UPDATE
statement.
UPDATE inls625_impute i
SET gdppc = c.gdppc
FROM cntry_dec_means c
WHERE i.gdppc IS NULL
AND c.gdppc IS NOT NULL
AND ((c.gdp, c.ed_gdp, c.population) IS NULL) IS NOT TRUE
AND i.country = c.country
AND i.decade = c.decade;
Not only is it much simpler and faster, it is also correct. Your procedure (even with fixed syntax) would be open to race conditions. Between reading data and arriving at the UPDATE
eventually, concurrent transactions could interfere. You could prevent that with explicit locking. But just don't go there.
Note, in particular, this formulation:
((c.gdp, c.ed_gdp, c.population) IS NULL) IS NOT TRUE
You only want to skip the update if all of these columns are null. (Plus, c.gdppc
cannot be null in any case.) See:
-
1Thank you. Your solution worked. I ran the UPDATE 4 times - once for each of my gdppc, gdp, population and ed_gdp columns. I also implemented your suggestion for the procedure, and this fixed it. Although I no longer need it, it was a good learning experience.bellysavalas– bellysavalas2024年11月12日 20:46:21 +00:00Commented Nov 12, 2024 at 20:46
Here is how you can code your procedure. I am using two table's t
and t2
for showing the correct way of the syntax (LOOP, IF-END IF, etc.) you are trying in your code.
I am using PostgreSQL version 14 and the commands are run from psql
shell.
TABLE t;
id | name | data
----+-------+------
1 | first | 12
2 | | 9
3 | |
Table t
has three columns and the empty values above are null
values. Table t2
has just one column.
TABLE t2;
id
----
2
9
The procedure:
CREATE OR REPLACE PROCEDURE t_proc()
LANGUAGE plpgsql
AS $$
DECLARE
t_rec RECORD;
t2_rec RECORD;
BEGIN
<<outer_loop>>
FOR t_rec IN
SELECT * FROM t
LOOP
IF t_rec.name IS NULL AND t_rec.data IS NULL THEN
CONTINUE outer_loop;
END IF;
RAISE NOTICE 't.id value: %', t_rec.id;
<<inner_loop>>
FOR t2_rec IN
SELECT * FROM t2 WHERE t2.id = t_rec.id
LOOP
RAISE NOTICE 't2.id value: %', t2_rec.id;
END LOOP inner_loop;
END LOOP outer_loop;
END;
$$;
Run the procedure and review the output:
CALL t_proc();
NOTICE: t.id value: 1
NOTICE: t.id value: 2
NOTICE: t2.id value: 2
CALL
Further reference: https://www.postgresql.org/docs/14/plpgsql-control-structures.html
-
Why didn’t you put the IF conditions in the outer loop’s WHERE clause?RonJohn– RonJohn2024年11月12日 09:01:27 +00:00Commented Nov 12, 2024 at 9:01
-
@RonJohn I dont follow your question.prasad_– prasad_2024年11月12日 09:24:41 +00:00Commented Nov 12, 2024 at 9:24
-
Why didn't you write the outer loop as
SELECT * FROM t WHERE name IS NOT NULL AND data IS NOT NULL
? That's the whole purpose of the WHERE clause.RonJohn– RonJohn2024年11月12日 19:03:24 +00:00Commented Nov 12, 2024 at 19:03 -
@RonJohn The question post is about demonstrating correct (working and error free) usage of LOOP, END-IF , IS NULL, etc. It is not about using SQL. The code by OP has questions and errors related to some procedural constructs - as mentioned in the title and the code.prasad_– prasad_2024年11月13日 03:29:18 +00:00Commented Nov 13, 2024 at 3:29
Explore related questions
See similar questions with these tags.
END LOOP
andCONTINUE
statements. This is generally helpful when using nested loops.