Lets say I have a table called users, who looks something like the following:
users table:
|user_id | name |
1 Bob
2 Alice
3 Joe
4 Tom
And another table called books, who got a "users" columns of array of integers relates to users ids who own the books (type: ARRAY[]::integer[])
books:
name num_pages users
mybook1 130 {4,2,1,44}
mybook2 140 {1,2}
mybook3 145 {5,7,8,10}
unfortunately, some of the users has been deleted, and now I have to remove their user id reference from the table.
My question is - Which query should I run in order to remove from the books table the user_ids who are not exists anymore in "users" table? In this case I would like to remove any user id who is not 1/2/3/4 from the users column
for example on "books" table, mybook1 user id 44 does not exist anymore, or mybook3 ids 5,7,8,10 does not exist in users table
(I've created a trigger to handle it from now on when removing users but how can I remove the current "orphan users" who is not reflected in the users table anymore)
note: I have no foreign keys or something like that in the books table
Thanks a lot
2 Answers 2
This should perform nicely:
UPDATE books AS b
SET users = bu.users
FROM (
SELECT b1.name, u.users
FROM books b1
CROSS JOIN LATERAL (
SELECT ARRAY(
SELECT user_id
FROM unnest(b1.users) WITH ORDINALITY u(user_id, ord)
JOIN users u USING (user_id)
ORDER BY u.ord
)
) u(users)
) bu
WHERE b.name = bu.name
AND b.users <> bu.users;
Preserves original order of array elements. If you don't need to, drop ORDINALITY
& friends from the query.
Can result in empty arrays.
Assuming ...
books.name
is the PK. Else use some otherUNIQUE
column frombooks
instead.users.user_id
is at leastUNIQUE
.books.users
does not contain NULL values. Else those are eliminated as well.
CREATE OR REPLACE FUNCTION intersection(anyarray, anyarray) RETURNS anyarray as $$
SELECT ARRAY(
SELECT 1ドル[i]
FROM generate_series( array_lower(1,ドル 1), array_upper(1,ドル 1) ) i
WHERE ARRAY[1ドル[i]] && 2ドル
);
$$ language sql;
DO
$BODY$
DECLARE
user_ids INTEGER[];
row books%ROWTYPE;
BEGIN
user_ids := (SELECT array_agg(id) FROM users) ;
FOR row IN
SELECT * FROM books
LOOP
UPDATE books
SET users=intersection(row.users,user_ids)
WHERE id = row.id;
END LOOP;
END;
$BODY$ language plpgsql
DELETE
statement if that was properly normalized