What is the correct way to clean up PostgreSQL's WAL? I have a database of more than 100 GB, and pg_wal holds around 600 GB. I also have 2 logical replications set up.
Primary and replicas are running PostgreSQL 10.
On the primary and the replicas, wal_keep_segments and max_wal_size are commented out.
pg_archivecleanup did not work with the %r option, so I took the file name from the "Latest checkpoint's REDO WAL file" line of the pg_controldata output and deleted the logs, but then one replica stopped with the error
could not receive data from WAL stream: FATAL: requested WAL segment xxx has already been removed
To solve this I deleted and recreated the replica.
1 Answer
pg_wal cleans itself up. You should almost never touch pg_wal by hand. If it is not cleaning itself up, you need to figure out why and fix the underlying issue.
One possible reason is that a replication slot is holding it back. Either a replica is using a slot and is unable to keep up, or a slot has no replica attached, for example because you destroyed the replica but didn't drop the slot it used to occupy. You can see what slots you have by querying pg_replication_slots, and if necessary drop one with pg_drop_replication_slot, both run on the master. Look for the slot with the oldest non-NULL value of restart_lsn.
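As a rough illustration of what "oldest restart_lsn" means, here is a minimal Python sketch. It assumes you have already fetched slot_name, active and restart_lsn from pg_replication_slots and the current write position from pg_current_wal_lsn(). The Slot type, the helper names and the sample rows are hypothetical and exist only for this example. An LSN is printed as two hexadecimal numbers separated by a slash (the high and low 32 bits of a byte position), so it has to be converted to an integer rather than compared as text.

```python
from typing import Iterable, NamedTuple, Optional


class Slot(NamedTuple):
    """One row of pg_replication_slots (only the columns used here)."""
    slot_name: str
    active: bool
    restart_lsn: Optional[str]  # None corresponds to SQL NULL


def lsn_to_bytes(lsn: str) -> int:
    """Convert an LSN such as '9C/5000000' to an absolute byte position."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def oldest_slot(slots: Iterable[Slot]) -> Optional[Slot]:
    """Return the slot with the oldest non-NULL restart_lsn, or None."""
    candidates = [s for s in slots if s.restart_lsn is not None]
    if not candidates:
        return None
    # Compare numerically: as plain strings, 'A/0' would sort after '10/0'.
    return min(candidates, key=lambda s: lsn_to_bytes(s.restart_lsn))


def retained_bytes(current_lsn: str, slot: Slot) -> int:
    """Bytes of WAL between the slot's restart_lsn and the current LSN."""
    return lsn_to_bytes(current_lsn) - lsn_to_bytes(slot.restart_lsn)


# Hypothetical sample data, not taken from the question.
slots = [
    Slot("replica_a", True, "9C/4000000"),
    Slot("replica_b", False, "12/8000000"),
    Slot("unused", False, None),
]
current_lsn = "9C/5000000"

worst = oldest_slot(slots)
if worst is not None:
    gib = retained_bytes(current_lsn, worst) / 2**30
    print(f"{worst.slot_name} (active={worst.active}) holds back about {gib:.1f} GiB")
```

An inactive slot with a large retained amount is the usual candidate for pg_drop_replication_slot, but check that nothing still depends on it before dropping.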
Another reason could be that you have archive_mode turned on, but your archive_command is constantly failing or can't keep up. You will see warnings about this in your server log file if it is failing.
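Besides the server log, the pg_stat_archiver view records last_archived_time and last_failed_time. The following Python sketch shows one possible heuristic over those two columns, assuming you have already read them from the view. The function name and the rule itself are illustrative assumptions, not something PostgreSQL provides.

```python
from datetime import datetime
from typing import Optional


def archiver_is_failing(last_archived_time: Optional[datetime],
                        last_failed_time: Optional[datetime]) -> bool:
    """Heuristic: True when the latest recorded archive attempt failed."""
    if last_failed_time is None:
        # No failure has been recorded since the statistics were reset.
        return False
    if last_archived_time is None:
        # Failures recorded, but no segment has ever been archived.
        return True
    return last_failed_time > last_archived_time
```

This only detects outright failures. An archive_command that succeeds but is too slow would not be caught by this check.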
pg_archivecleanup is used to clean up a WAL archive. pg_wal is not the archive, it contains the live WAL files. You are lucky you didn't destroy your database by monkeying around in there.
- Thanks for answering. Unfortunately, I don't have any unused replication slot on the master, everything is in sync, and archive mode is off. – Gabriel Weich, commented Feb 9, 2019 at 20:38
- Can you turn log_checkpoints on, reload the server, then run a few checkpoints and see what it logs? Can you double-check that wal_keep_segments and archive_mode are actually off by using "SHOW" on the running server? – jjanes, commented Feb 9, 2019 at 21:20
- Over the last few days I monitored the disk usage, and it seems that the space went back to normal after I recreated the replica. – Gabriel Weich, commented Feb 12, 2019 at 16:41