I have a server with 150GB of disk space. Today I uploaded a 30GB dataset. I cancelled the import when my internet connection died, then noticed that about 29GB of space was missing on the database server (meaning the CSV data had been written, but was not cleaned up when I aborted the operation). When I uploaded the data again, the import broke again and I lost another ~25GB. Now there isn't enough free space left to upload the data.
This is hosted on AWS RDS, Postgres 10.6.
Is there a way to fix this? I read about VACUUM, but will that delete records? I'm currently hosting ~70GB of data and don't want to lose any records. What's the best way to go about this?
1 Answer
PostgreSQL leaves the dead rows from the aborted loads in the table; that space can be reused, but the files won't shrink (significantly).
The official method to reclaim the space is VACUUM (FULL), but that rewrites the whole table, which will be unavailable for any access during that time. There are extensions called pg_squeeze and pg_repack which do the same thing "behind the scenes" with less disruption.
All these methods have one thing in common: they require enough free space to create a copy of the table, so you probably won't get around increasing the storage space anyway.
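For concreteness, a minimal sketch of the full-rewrite route, assuming the bloated table is named my_table (a hypothetical name; substitute your own):

    -- Check how much space the table occupies, including indexes and TOAST.
    SELECT pg_size_pretty(pg_total_relation_size('my_table'));

    -- Rewrite the table into new, compact files. This takes an ACCESS
    -- EXCLUSIVE lock (blocking all reads and writes) and needs enough free
    -- disk space for a complete copy of the table while it runs.
    VACUUM (FULL, VERBOSE) my_table;

    -- Verify the size afterwards.
    SELECT pg_size_pretty(pg_total_relation_size('my_table'));

pg_repack, by contrast, is driven from the shell after the extension is installed, with a command along the lines of pg_repack --table=my_table mydb, and only takes short locks at the start and end of the rewrite.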
Now for the good news:
If you run a plain VACUUM on the table, which is not disruptive, the wasted space can be reused. So your next attempt to load the data won't increase the size of the table.
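A minimal sketch of that non-disruptive route, again using the hypothetical my_table:

    -- See how many dead rows the aborted imports left behind.
    SELECT n_live_tup, n_dead_tup
    FROM pg_stat_user_tables
    WHERE relname = 'my_table';

    -- Mark the dead rows' space as reusable. This takes only a lightweight
    -- lock, so reads and writes continue while it runs; it does NOT return
    -- the space to the operating system.
    VACUUM (VERBOSE) my_table;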
- Thanks @LaurenzAlbe. I'm running this at the moment :). Will this also free up the previously uploaded data sets, e.g. the missing ~55GB? – ilovejq, Oct 9, 2019 at 7:14
- Plain VACUUM will not shrink the data files, but it will free the space within. So the disk usage won't go down, but your next attempt has room enough. – Laurenz Albe, Oct 9, 2019 at 7:16
VACUUM FULL is what you need, and no, it will not delete any rows; it just frees up space.
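If you want to estimate how much space a VACUUM FULL would actually reclaim before taking the lock, the pgstattuple extension (available on RDS) reports free and dead space per table; a sketch, again with the hypothetical my_table:

    -- pgstattuple scans the whole table and reports exact space usage.
    CREATE EXTENSION IF NOT EXISTS pgstattuple;

    SELECT pg_size_pretty(free_space)      AS reusable_space,
           pg_size_pretty(dead_tuple_len)  AS dead_row_data,
           round(free_percent::numeric, 1) AS free_pct
    FROM pgstattuple('my_table');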