I created table2 from another table, table1, with:
CREATE TABLE table2 AS SELECT * FROM table1;
table1 is 4.8 GB with 1.5 million rows and 20 columns of types integer (8x), varchar(1) (9x), real (2x) and geometry (1x). table2 is 3.5 GB after doing that.
I then added 3 columns (real, real, integer) to table2 and updated them with some values.
However, after doing that, table2 became about 4 times larger, at 14 GB.
What could be causing that? I expected these extra columns to take up far less space.
I performed a full vacuum but it didn't change anything.
I check the sizes with:
select table_name, pg_total_relation_size(quote_ident(table_name))
from information_schema.tables
where table_schema = 'public'
order by 2;
I create the table and use the update-set commands in a SQL script called with psql:
DO $$
BEGIN
...
All my SQL commands
...
END $$;
- Do you know if it is the DDL or the DML that increases its size? – Gerard H. Pille, Oct 11, 2021 at 16:13
- The size increases a lot after the update-set commands, so it's the DML... The table size doubles after using an "UPDATE table2 SET my_field_integer = CASE...", and then doubles again using 4 different "SET my_field_real = ROUND(CAST(field_a / (field_b * 0.27777) AS numeric), 1)"... – Marc, Oct 11, 2021 at 16:25
- What is a "full vacuum"? Is that the same thing as VACUUM FULL? Or does that just mean you didn't interrupt it halfway through? – jjanes, Oct 11, 2021 at 16:32
- Yes, I meant a VACUUM FULL, following the advice here: dba.stackexchange.com/questions/172247/… – Marc, Oct 11, 2021 at 16:33
- I can't replicate this at all. How do you even get 1.5M rows to take up 3.5 GB in the first place? It is not easy to do that even through malice. – jjanes, Oct 11, 2021 at 16:55
2 Answers
That is as expected. Since updating a row will write a new row version while leaving the old one in place, updating all rows of a table will double its size.
If you run several updates without giving autovacuum time in between to free the dead row versions, the size can increase even more, because each full-table update adds another complete copy of every row.
After your updates are done, run VACUUM (FULL) on the table to rewrite it and get rid of the extra storage space.
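As a rough sketch of the above, the script below fills both new columns in a single UPDATE, so each row gets only one new version, and then rewrites the table. The table bloat_demo, its generated data and the CASE expression are hypothetical stand-ins; only the ROUND(...) expression is taken from the comments on the question. Note that VACUUM cannot be run from inside a DO block or an explicit transaction, so it has to be issued as a separate top-level statement in the psql script.

```sql
-- Hypothetical demo table standing in for table2 (names and data are assumptions).
CREATE TABLE bloat_demo AS
SELECT g AS id,
       (g % 100 + 1)::real AS field_a,
       (g % 7 + 1)::real   AS field_b
FROM generate_series(1, 100000) AS g;

-- Adding nullable columns without a default does not rewrite the table.
ALTER TABLE bloat_demo
    ADD COLUMN my_field_real    real,
    ADD COLUMN my_field_integer integer;

SELECT pg_size_pretty(pg_total_relation_size('bloat_demo')) AS size_before_update;

-- One UPDATE for all new columns: every row is rewritten once, not once per column.
UPDATE bloat_demo
SET my_field_real    = ROUND(CAST(field_a / (field_b * 0.27777) AS numeric), 1),
    my_field_integer = CASE WHEN field_a > 50 THEN 1 ELSE 0 END;  -- placeholder logic

-- The old row versions are still on disk here, so the table is roughly twice as large.
SELECT pg_size_pretty(pg_total_relation_size('bloat_demo')) AS size_after_update;

-- Must run outside any transaction block or DO block.
VACUUM (FULL) bloat_demo;

SELECT pg_size_pretty(pg_total_relation_size('bloat_demo')) AS size_after_vacuum_full;
```

Comparing the three reported sizes should show the growth after the UPDATE and the shrink after VACUUM (FULL); the exact figures depend on the data and the PostgreSQL version.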
- Thanks for the explanation. I better understand how it works now. – Marc, Oct 14, 2021 at 9:22
Instead of adding and updating the columns after table2 has been created, add the new columns to the SELECT when you create table2. That way every row is written only once and no dead row versions are left behind.
CREATE TABLE table2 AS
SELECT t1.*,
... new_real1,
... new_real2,
... new_int
FROM table1 t1;
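A filled-in version of this pattern might look like the sketch below. The tables table1_demo and table2_demo, the generated data, and the expressions for new_real2 and new_int are hypothetical placeholders; the ROUND(...) expression is the one quoted in the comments on the question.

```sql
-- Hypothetical stand-in for table1 (names and data are assumptions).
CREATE TABLE table1_demo AS
SELECT g AS id,
       (g % 100 + 1)::real AS field_a,
       (g % 7 + 1)::real   AS field_b
FROM generate_series(1, 100000) AS g;

-- Compute the extra columns while copying, so no UPDATE is needed afterwards.
CREATE TABLE table2_demo AS
SELECT t1.*,
       CAST(ROUND(CAST(t1.field_a / (t1.field_b * 0.27777) AS numeric), 1) AS real) AS new_real1,
       CAST(t1.field_a * 2 AS real)                AS new_real2,  -- placeholder expression
       CASE WHEN t1.field_a > 50 THEN 1 ELSE 0 END AS new_int     -- placeholder expression
FROM table1_demo AS t1;

-- The new table holds exactly one version of each row.
SELECT pg_size_pretty(pg_total_relation_size('table2_demo')) AS table2_demo_size;
```

The explicit casts to real keep the column types the same as the ones the question adds with ALTER TABLE; without them, the rounded column would be created as numeric.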