I'm looking to delete a vast amount of data from our database. We are not sure where to put the deleted data for now, but we want to keep a separate copy of it.
I originally explored using a CTE, but there are concerns about using a CTE for such a large volume of data, so I have opted for a temp table instead. I'm having difficulty getting it working.
I'm also trying to wrap it in a transaction so we have the option of rolling back should we want to:
begin; -- typically faster and safer wrapped in a single transaction
CREATE TEMP TABLE tmpold AS ( ERROR: syntax error at or near "t" Position: 51)
SELECT t.*
--into temporary table tmpold (also tried this but ERROR: syntax error at or near "t" Position: 51
FROM play t
where t.play_created_on > NOW() - interval '1 years';
delete t.* from play t
inner join tmpold on t.play_id = tmpold.play_id;
select count (*) from tmpold
ROLLBACK;
I can't understand why it is not working. I want to select a set of rows from the play table into a temp table and then delete those rows from the play table using the values in the temp table.
I'm not sure whether a temp table is the correct approach. I want to retain the deleted data from the play table in a separate table until we can decide what to do with it.
1 Answer
There are a number of things you need to change.
A temp table is not what you want: temp tables are dropped as soon as the session that created them ends, and they cannot be accessed from any other connection. Use a regular table instead.
You have an unneeded left parenthesis that is never closed in the CREATE TABLE AS statement; just write CREATE TABLE <tblname> AS <SELECT statement>, with no parentheses around the SELECT.
Your delete syntax is wrong in two ways. First, you delete rows, not columns, which means you just write DELETE FROM (tablename) AS (alias), not DELETE (alias).* FROM (tablename) AS (alias). Second, regular JOIN syntax like this is used for SELECT statements, not DELETE*; instead, use a sub-select in the WHERE clause and the planner will set up the required join, something like:
delete from play t
WHERE t.play_id IN (select tmpold.play_id from tmpold)
*Caveat - DELETE can take a regular JOIN clause if it also has a USING clause, but it's not needed here.
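For reference, the USING form mentioned in the caveat could look roughly like the sketch below. It assumes PostgreSQL and reuses the play and tmpold names from the question; it should remove the same rows as the IN sub-select above.

```sql
-- Sketch only: PostgreSQL's DELETE ... USING form.
-- tmpold is the copy table from the question; play_id is assumed
-- to identify rows in play.
DELETE FROM play t
USING tmpold
WHERE t.play_id = tmpold.play_id;
```

Either form is fine here; the sub-select version is closer to standard SQL, while USING is a PostgreSQL extension.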
One other thing that looks odd to me but may be as intended: typically in a situation like this, the intent is to move old records out of the main table. The WHERE clause on your initial CREATE TABLE AS SELECT would instead move the most recent year's worth of records out of the main table, while leaving all the old records. If that wasn't the intent, you probably want to change t.play_created_on > NOW() - interval '1 years' to t.play_created_on <= NOW() - interval '1 years'.
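Putting the points above together, the whole operation might look something like this. It is a minimal sketch, assuming PostgreSQL, assuming play_id uniquely identifies rows in play, and using play_archive as a hypothetical name for the regular (non-temp) copy table.

```sql
BEGIN;

-- play_archive is a hypothetical name; a regular table survives the session.
CREATE TABLE play_archive AS
SELECT t.*
FROM play t
WHERE t.play_created_on <= NOW() - INTERVAL '1 year';

-- Delete only the rows that were actually copied.
DELETE FROM play t
WHERE t.play_id IN (SELECT a.play_id FROM play_archive a);

-- Check how many rows were archived before deciding.
SELECT count(*) FROM play_archive;

COMMIT;  -- or ROLLBACK; to undo both the copy and the delete
```

Because DDL is transactional in PostgreSQL, a ROLLBACK here would also remove play_archive, so nothing is left half-done.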
That worked a treat, I did the sub-select instead. Yes, you are correct about the > being wrong; I was just testing on a smaller subset of data and forgot to change it back. As for the actual query itself, I will be testing it on in excess of 500m rows. Is there a good way of batching this so it's not done in one lump? - rdbmsNoob, Oct 19, 2021 at 14:33
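One possible way to batch the move, offered as a sketch rather than a tested recipe: a data-modifying CTE can delete a limited chunk of rows and insert them into the archive in a single statement. This assumes PostgreSQL, a unique play_id, a hypothetical play_archive table, and an arbitrary batch size of 10000.

```sql
-- One-time setup: an empty table with the same columns as play
-- (play_archive is a hypothetical name).
CREATE TABLE IF NOT EXISTS play_archive (LIKE play);

-- One batch: delete up to 10000 old rows and archive them atomically.
-- Run repeatedly, each run in its own transaction, until it moves 0 rows.
WITH moved AS (
    DELETE FROM play
    WHERE play_id IN (
        SELECT play_id
        FROM play
        WHERE play_created_on <= NOW() - INTERVAL '1 year'
        LIMIT 10000          -- batch size is an assumption; tune as needed
    )
    RETURNING *
)
INSERT INTO play_archive
SELECT * FROM moved;
```

Each run is atomic, so a row is either still in play or already in play_archive. The trade-off against the single-transaction approach is that you can no longer roll back the whole move at once.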