I have a simple table for the sake of argument, and a function called loop_test that selects ids and loops through them. I can select an array of ids and loop through them, applying my changes in a transaction.
CREATE OR REPLACE FUNCTION loop_test() RETURNS void AS $$
DECLARE
_ids_array INTEGER[];
_id INTEGER;
BEGIN
SELECT ARRAY(SELECT id FROM loop_test) INTO _ids_array;
FOREACH _id IN ARRAY _ids_array
LOOP
UPDATE loop_test SET looped = TRUE WHERE id = _id;
END LOOP;
END;
$$ LANGUAGE plpgsql;
Table:
db=# \d loop_test;
Table "public.loop_test"
Column | Type | Modifiers
---------------+---------+-----------
id | integer |
other_id | integer |
id_copy | integer |
other_id_copy | integer |
looped | boolean |
db=# select * from loop_test;
id | other_id | id_copy | other_id_copy | looped
----+----------+---------+---------------+--------
1 | 10 | | |
6 | 15 | | |
2 | 11 | | |
7 | 16 | | |
3 | 12 | | |
4 | 13 | | |
5 | 14 | | |
(7 rows)
When I call select loop_test(), I get the following results:
db=# select * from loop_test;
id | other_id | id_copy | other_id_copy | looped
----+----------+---------+---------------+--------
1 | 10 | | | t
6 | 15 | | | t
2 | 11 | | | t
7 | 16 | | | t
3 | 12 | | | t
4 | 13 | | | t
5 | 14 | | | t
(7 rows)
I would, however, like to create a function that selects both the id and the other_id into an array. I was told about using something like agg_array, but I don't completely understand how that works.
I was imagining something like the following?
CREATE OR REPLACE FUNCTION agg_loop_test() RETURNS void AS $$
DECLARE
_ids_array INTEGER[][];
_id INTEGER;
BEGIN
SELECT AGG_ARRAY(SELECT id, other_id FROM loop_test) INTO _ids_array;
FOREACH _id IN ARRAY _ids_array
LOOP
UPDATE loop_test SET id_copy = _id[0], other_id_copy = _id[1] WHERE id = _id[0];
END LOOP;
END;
$$ LANGUAGE plpgsql;
Why are you using a loop at all? – user1822, Nov 12, 2018 at 6:46
3 Answers
A much better way yet: a single UPDATE. No loop needed.
UPDATE loop_test
SET id_copy = id
, other_id_copy = other_id
WHERE id IS NOT NULL;
The WHERE condition is only useful if id can be null and you want a perfect equivalent of what you had.
Loop
If you are just exploring loops, you can assign multiple variables in a FOR loop:
CREATE OR REPLACE FUNCTION better_loop_test()
RETURNS void
LANGUAGE plpgsql AS
$func$
DECLARE
_id int;
_other_id int;
BEGIN
-- example makes no sense, just a loop demo
FOR _id, _other_id IN
SELECT id, other_id FROM loop_test
LOOP
UPDATE loop_test
SET id_copy = _id
, other_id_copy = _other_id
WHERE id = _id;
END LOOP;
END
$func$;
As long as you only need the two columns of known type, this may be a bit cheaper than fetching whole (possibly big) rows.
@Erwin's reply is absolutely correct. Using arrays for the described example is a performance mistake (unfortunately a common one). Sometimes arrays are necessary, though, because you need to pass some values as function parameters.
There are two techniques: 1. pass an array of composite values, 2. pass a multidimensional array. Performance should be about the same; to me, an array of composite values can be more readable in some cases. I am not sure whether you can create multidimensional arrays from a query result on 9.3.
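CREATE TYPE test_type AS (id1 int, id2 int);
CREATE OR REPLACE FUNCTION fx1(ids test_type[])
RETURNS void AS $$
DECLARE r test_type;
BEGIN
-- iterate over the array elements with FOREACH (FOR ... IN ARRAY is not valid PL/pgSQL)
FOREACH r IN ARRAY ids
LOOP
UPDATE ...
END LOOP;
END;
$$ LANGUAGE plpgsql;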
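For illustration, an array of composite values can be built from a query result with array_agg (which may be what the question's agg_array was meant to be). This is only a sketch: it assumes the test_type type defined above and the loop_test table from the question.

```sql
-- Assumes: CREATE TYPE test_type AS (id1 int, id2 int); and the loop_test table.
-- Each (id, other_id) pair is cast to the composite type, then array_agg
-- collects all rows into a single test_type[] value.
SELECT array_agg((id, other_id)::test_type) AS ids
FROM loop_test;
```

The resulting value could be passed to any function that takes a test_type[] parameter. Note that array_agg returns NULL, not an empty array, when the query yields no rows.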
Still, a single UPDATE statement without a loop is probably possible, using the unnest function:
CREATE TABLE test (id1 integer, id2 integer);
UPDATE test SET id2 = u.id2
FROM unnest(array[(1,10),(3,4)]::test_type[]) u
WHERE test.id1 = u.id1;
The performance impact depends on the size of the arrays. For small arrays it will be minimal, but loops can be nested more deeply, and then it can become a performance issue.
For multidimensional arrays, the PL/pgSQL FOREACH statement has a SLICE clause:
CREATE OR REPLACE FUNCTION fx2(ids int[])
RETURNS void AS $$
DECLARE _ids int[];
BEGIN
FOREACH _ids SLICE 1 IN ARRAY ids
LOOP
-- PostgreSQL arrays are 1-based by default, so each slice has elements [1] and [2]
RAISE NOTICE 'ids[1]=% ids[2]=%', _ids[1], _ids[2];
END LOOP;
END;
$$ LANGUAGE plpgsql;
postgres=# SELECT fx2(ARRAY[[1,2],[3,4]]);
NOTICE: ids[1]=1 ids[2]=2
NOTICE: ids[1]=3 ids[2]=4
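As for building a multidimensional array from a query result: as far as I know, array_agg accepts array inputs starting with PostgreSQL 9.5, so the following sketch should work on newer versions but not on 9.3. It assumes the loop_test table from the question and the fx2 function above.

```sql
-- Assumes PostgreSQL 9.5 or later and the loop_test table from the question.
-- Each row becomes a two-element array; array_agg stacks them into a
-- two-dimensional integer array.
SELECT array_agg(ARRAY[id, other_id]) AS ids
FROM loop_test;

-- The aggregated array can be handed straight to a function such as fx2:
SELECT fx2(array_agg(ARRAY[id, other_id]))
FROM loop_test;
```

If the table is empty, array_agg yields NULL, and FOREACH raises an error on a NULL array, so a real function would likely want a guard for that case.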
I don't know about multidimensional arrays, but I found a much better way to do what I was trying to do:
CREATE OR REPLACE FUNCTION better_loop_test() RETURNS void AS $$
DECLARE
_row RECORD;
BEGIN
FOR _row IN SELECT * FROM loop_test LOOP
UPDATE loop_test SET id_copy = _row.id, other_id_copy = _row.other_id WHERE id = _row.id;
END LOOP;
END;
$$ LANGUAGE plpgsql;