I am planning to upgrade my database from PostgreSQL 10 to PostgreSQL 14. To reduce the downtime, I plan to use replication. The plan (in short) is to:
- Dump the data using `pg_dumpall`
- Stop the Postgres service running on the replica
- Set up a new Postgres 14 service
- Restore the data from the dump created in step 1
- Set up logical replication with the primary to get the data up to date
- Promote the replica to primary, and have applications start connecting to it
- Stop the Postgres service on the old primary, and upgrade it to 14 as well
- Set up streaming replication between the new primary (the old replica) and the new replica
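For reference, a minimal sketch of what the dump/restore and logical replication steps above might look like in practice. The host names, the database name `mydb`, and the publication/subscription names are placeholders, not anything prescribed by Postgres:

```shell
# Step 1: dump everything from the Postgres 10 primary
pg_dumpall -h old-primary -p 5432 -f full_dump.sql

# Steps 3-4: restore into the freshly installed Postgres 14 instance
psql -h new-server -p 5432 -f full_dump.sql postgres

# Step 5: native logical replication -- publication on the old primary,
# subscription on the new Postgres 14 server
psql -h old-primary -d mydb \
     -c "CREATE PUBLICATION upgrade_pub FOR ALL TABLES;"
psql -h new-server -d mydb \
     -c "CREATE SUBSCRIPTION upgrade_sub
         CONNECTION 'host=old-primary dbname=mydb'
         PUBLICATION upgrade_pub;"
```

Note that `wal_level = logical` must be set on the Postgres 10 primary before logical replication can be used.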
I have a few questions regarding this:
- Is it advisable to take a dump and restore, or to let logical replication handle the complete replication (apart from the DDL) from scratch? In terms of the time it could take, which would be the better option? We have close to 650 GB of data. I would think the dump is compressed, but restoring it (on the same machine) could still take time, I believe.
- Is there something I need to set up to make sure that replication starts from the point after the dump, or will that be taken care of automatically?
- One limitation I see with logical replication is that it does not replicate sequences. To work around this, I believe I could increment each sequence by a very large number before the switchover, to make sure that there are no conflicts?
- I have also seen an extension, `pglogical`, that I could use for the setup, although I have yet to dig deeper. From your experience, does it give any added advantages / would it make the setup easier?
- Although I would think it depends on various parameters, how long could it typically take to complete the restore/replication for a database of this size, until the primary and replica are in sync?
1 Answer
Ad 1 and 2:
The easy thing would be not to copy the data yourself, but to let logical replication handle that.
If you insist on using dump/restore, you will have to proceed like this:
- create a logical replication slot with the `CREATE_REPLICATION_SLOT` command and the `pgoutput` plugin in a replication connection, which will export a snapshot
- while the replication connection is still open, run `pg_dump` and pass the snapshot name with the `--snapshot` option
- use that replication slot for logical replication

Otherwise, replication will not start at the right point.
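Concretely, the sequence might look like the sketch below. The slot, publication, and subscription names, the host names, and the snapshot name are placeholders; the snapshot name is whatever `CREATE_REPLICATION_SLOT` actually returns in its `snapshot_name` column:

```shell
# Open a replication connection to the Postgres 10 primary and KEEP IT OPEN;
# the exported snapshot is only valid while this session lives.
psql "host=old-primary dbname=mydb replication=database"

-- inside that psql session:
CREATE_REPLICATION_SLOT upgrade_slot LOGICAL pgoutput;
-- note the snapshot_name column in the result (e.g. 00000003-000001B2-1)

# in a second terminal, while the session above is still open:
pg_dump -h old-primary -d mydb --snapshot=00000003-000001B2-1 -f data.sql

# after restoring data.sql on the new server, attach a subscription to the
# existing slot without copying the initial data again:
psql -h old-primary -d mydb \
     -c "CREATE PUBLICATION upgrade_pub FOR ALL TABLES;"
psql -h new-server -d mydb \
     -c "CREATE SUBSCRIPTION upgrade_sub
         CONNECTION 'host=old-primary dbname=mydb'
         PUBLICATION upgrade_pub
         WITH (create_slot = false, slot_name = 'upgrade_slot',
               copy_data = false);"
```

The `create_slot = false` and `copy_data = false` options are what tie the subscription to the pre-existing slot and prevent a second initial data copy, so replication resumes exactly at the dump's snapshot point.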
Ad 3:
Setting the sequences to a high value is fine.
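For example, the following could be run on the new server just before the switchover; the increment of 100000 is an arbitrary safety margin, not a value with any special meaning:

```sql
-- Advance every sequence in the database by a generous safety margin,
-- so values handed out after the switchover cannot collide with values
-- already used on the old primary.
DO $$
DECLARE
    seq regclass;
BEGIN
    FOR seq IN SELECT c.oid::regclass
               FROM pg_class c
               WHERE c.relkind = 'S'  -- 'S' = sequence
    LOOP
        EXECUTE format('SELECT setval(%L, nextval(%L) + 100000)', seq, seq);
    END LOOP;
END;
$$;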
Ad 4:
I have no opinion there.
Ad 5:
Nobody can predict that. You will have to do a test run.
- I wanted to dump and restore the initial data, and then let logical replication replicate the rest of the data, including the changes made on the primary before replication was started. So I am just trying to speed up the replication by not having it replicate everything, but only what is required after the point of the dump. – user1583803, Oct 29, 2021 at 12:29
- I see. I have added instructions on how this must be done to make it work properly. – Laurenz Albe, Oct 29, 2021 at 12:41
- `pg_upgrade` using the `--link` option? That will be quite fast, independently of the size of the instance.
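A sketch of what such an in-place upgrade with `pg_upgrade --link` might look like; the data and binary directory paths below are Debian/Ubuntu-style placeholders and will differ on other systems:

```shell
# Stop the old server first, then run pg_upgrade from the NEW version's binaries.
/usr/lib/postgresql/14/bin/pg_upgrade \
    --old-datadir /var/lib/postgresql/10/main \
    --new-datadir /var/lib/postgresql/14/main \
    --old-bindir  /usr/lib/postgresql/10/bin \
    --new-bindir  /usr/lib/postgresql/14/bin \
    --link
```

With `--link`, data files are hard-linked rather than copied, which is why the downtime is largely independent of the data size. The trade-off is that the old cluster must not be started again once the upgraded cluster has been used.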