we face a problem when trying to pg_upgrade
pg12 > pg14.
our source instance has en_US.UTF-8
which it should not have had to begin with, and the new pg14 cluster is being initiated with de_DE.UTF-8
.
pg_upgrade
throws an error on
lc_collate values for database "postgres" do not match: old "en_US.UTF-8", new "de_DE.UTF-8"
not being experienced in such at all, we came up with this solution and I'd like to hear some feedback whether this is sane, or not
pg14/initdb -D new_data_dir
pg14/pg_ctl -D new_data_dir start
pg_dumpall_12 | sed -i "/en_US.UTF-8/de_DE.UTF-8/" | psql_14 -p 6543 |& tee some.out
this leaves Ctype still with en_US.UTF-8
so we add the below per database
psql -c "UPDATE pg_database SET datctype = 'de_DE.UTF-8' WHERE datname = <db_name> ;"
2 Answers 2
This is indeed the correct way to go ahead. Both are UTF-8 based so I would not expect any issues - you will have to make sure all client calls expect the new encoding. See also change character encoding
You can use dump/restore or logical replication for an upgrade that should change the collation.
I wouldn't use pg_dumpall
and edit the locale on the fly, because your sed
command might have undesirable side effects (what if a string in a table contains en_US.UTF-8
?). Use pg_dumpall -g
to migrate the global objects, then create the database with the appropriate locale and use pg_dump
to copy the database contents.
-
I have 200+ databases in some of the 14 instances we need to migrate. would you still recommend to avoid
pg_dumpall
or is the main thing to not do thesed
operation on the fly, so there is more control?vrms– vrms2024年12月23日 08:02:06 +00:00Commented Dec 23, 2024 at 8:02 -
Right; I am worried that the
sed
could change more than you want. Thepg_dump
of the 200 databases could be scripted.Laurenz Albe– Laurenz Albe2024年12月23日 08:19:35 +00:00Commented Dec 23, 2024 at 8:19