0

We have a PostgreSQL database that has grown significantly in size recently, from about 340GB to 571GB over the past couple of months, and we are not tracking any significant change in user behavior over that time. Our primary DBA has made a couple of recommendations, with his chief recommendation being to export the entire database and then re-import it, which from his tests on a second server cloned from our primary requires about 3 hours of downtime, and gets the size down to only 300GB.

My two main areas of concern would be finding out where this significant growth is coming from (using du -h I can at least see it's in the /data directory with no significant growth in tablespace or pg_wal), and understanding just how importing and exporting the database can get us almost 300GB of space recovery without actually losing any production data.

asked Jul 5, 2022 at 12:47
2
  • I'd start with tables, and move on to other structures as needed. Did you look at the PostgreSQL wiki? Commented Jul 5, 2022 at 12:53
  • If it is not in 'pg_wal' nor in 'base', then which subdirectory of 'data' is the extra space in? That should be easy for you to determine with du, but hard for us to guess. Commented Jul 5, 2022 at 20:03

2 Answers 2

1

The first thing I'd do is change into the data directory and run

du -sk *

This will show you in which of the subdirectories a lot of disk space is used. You can drill down by descending deeper and repeating the command.

Typically, the increase in disk usage comes from one of two causes:

  1. WAL in pg_wal cannot be removed. This could be because the archiver has a problem (look at pg_stat_archiver) or you have a stale replication slot (look at pg_replication_slots).

  2. Some tables or indexes are bloated.

    If you created a copy of the database with pg_dump/restore, you are halfway to the solution. Run something like this on both databases:

    SELECT oid::regclass AS object, relkind, pg_relation_size(oid) AS size
    FROM pg_class
    ORDER BY size DESC;
    

    Compare the output on both sides an watch for tables and indexes that are considerably larger on the original database.

    Fix the bloat by examining the possible causes. Once you have done that, get rid of the bload with VACUUM (FULL) (attention, this requires down time).

answered Jul 5, 2022 at 13:08
3
  • Doesn't vacuum full only restore space from the tablespace directory? We did find one old archive schema that had significant size, and ran a VACUUM FULL on that schema only, and that did restore 30GB of data, but only from the tablespace directory, it didn't reduce any space from the data directory. Commented Jul 5, 2022 at 13:26
  • Use du to figure out where the disk space is used. Commented Jul 5, 2022 at 13:35
  • @awestover89 The tablespace directory is usually just a subdirectory of the data directory. You can do things differently, but if you have you should explain what you did and why and where the various space measurements you reported came from. Commented Jul 5, 2022 at 19:58
0

Ultimately, we used the following to determine the issue:

We did an export and import onto a test database server so we had a copy of the database at it's full size, and a copy of the database at the smaller, post import size.

We then ran the following query to identify the largest tables:

SELECT schemaname as table_schema, relname as table_name,
pg_size_pretty(pg_relation_size(relid)) as data_size
FROM pg_catalog.pg_stat_all_tables
ORDER BY pg_relation_size(relid) DESC;

This clearly showed that on the main system the table pg_catalog.pg_largeobject was just over 200GB while on the test system following the export and import it was 0 bytes.

We are now working on a plan to better manage the growth of pg_largeobject.

answered Jul 15, 2022 at 17:02

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.