I'm using an AWS RDS PostgreSQL 13 database in a master-slave configuration, using logical replication with the pglogical extension.
Recently, one of the application migrations updated the schema of the master database, which in turn broke the replication process. No problems here, this is pretty much expected. However, after the replication stopped, the database started to consume an enormous amount of disk space.
In just one day it consumed around 17 GB of free disk space and caused the entire server to fail (all disk space was used). What makes it weird is that the entire database is only ~150 MB in size and has a very low number of write requests.
What strategy should I use to prevent this from happening in the future? What is the recommended approach to limit the amount of disk space used for replication? I don't want the master server to fail under any circumstances (however, I'm willing to "sacrifice" the replica if the need arises).
1 Answer
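This is what max_slot_wal_keep_size was implemented for in PostgreSQL v13. It sacrifices the replica when the replica would require too much WAL to be retained on the master: once a replication slot holds back more WAL than the configured limit, the slot is invalidated and the old WAL can be removed. The default of -1 means no limit, which is why a stalled slot can fill the disk. I have never used it on RDS, but I do see it is available for users to set via the RDS control panel, so presumably it works.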
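If you would rather script the change than click through the console, here is a minimal sketch using boto3's `modify_db_parameter_group`. It is untested on RDS and rests on a few assumptions: your instance uses a custom parameter group (the name `my-pg13-params` is hypothetical), RDS takes the value as an integer in megabytes, and the parameter can be applied without a reboot (switch `ApplyMethod` to `pending-reboot` if RDS rejects `immediate`). The helper names are mine, not part of any library.

```python
def build_wal_cap_parameters(limit_mb: int) -> list:
    """Build the Parameters payload that caps WAL retained by replication slots.

    limit_mb is assumed to be in megabytes. Negative values are rejected
    because -1 means "unlimited", which is the behaviour we want to avoid.
    """
    if limit_mb < 0:
        raise ValueError("limit_mb must be non-negative")
    return [
        {
            "ParameterName": "max_slot_wal_keep_size",
            "ParameterValue": str(limit_mb),
            # Assumption: the parameter is dynamic on RDS.
            "ApplyMethod": "immediate",
        }
    ]


def apply_wal_cap(parameter_group_name: str, limit_mb: int) -> None:
    """Apply the cap to an existing custom DB parameter group."""
    import boto3  # imported lazily so the helper above works without AWS access

    rds = boto3.client("rds")
    rds.modify_db_parameter_group(
        DBParameterGroupName=parameter_group_name,
        Parameters=build_wal_cap_parameters(limit_mb),
    )


if __name__ == "__main__":
    # Inspect the payload only; to apply it, call e.g.
    # apply_wal_cap("my-pg13-params", 2048)  # hypothetical group name
    print(build_wal_cap_parameters(2048))
```

Pick a limit comfortably below your free storage. Once a slot exceeds it, the slot is invalidated and the replica has to be re-synchronized from scratch, which is the trade-off you said you are willing to make.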