I'm using pgloader to perform a one-time migration from MySQL to Postgres.
For that purpose, I want to temporarily configure Postgres specifically for that workload.
- reduce WAL as much as possible:
wal_level = minimal
archive_mode = off
- Should I
autovacuum = off
?
What else can I do to improve things without severely putting my data at risk?
Ee.g. I don't want to fsync = off
. Although I can re-do this migration, this additional time would still be unwanted.
Additional setting after Feedback
synchronous_commit=off
(thanks @a_horse_with_no_name )
1 Answer 1
I found that the following changes to postgres.conf yielded the best results so far:
archive_mode = off # old: on
autovacuum = off # old: on
commit_delay = 100000 # old: 0
maintenance_work_mem = 1GB # old: 512MB
max_wal_size = 10GB # old: 1GB
synchronous_commit = off # old: on
wal_level = minimal # old: archive
work_mem = 1GB # old: 8MB
Also I had quite trouble getting the most out of a single pgloader invocation to migrate all the table. This would have been the most desired way, because pgloader that way also re-creates all the FK between the imported tables.
But the problems I was facing (pgloader hanging, not achieving the performance I was aiming for in terms of total time) I settled with this approach:
- run a separate pgloader instance for every individual table, in parallel, inside
tmux
- using
tmux wait-for
the longest import migration - run a separate
*.sql
script for (almost each) table for post-processing (add FKs, add triggers), again in parallel inside a tmux - run
vacuumdb --jobs=24 --analyze
To fit this purpose, the machine in question had been pimped:
- 14 cores (instead of 8)
- 64GB RAM (instead of 32GB; but they will eventually stay)
Also, for each individual pgloader invocation I had a seperate loader config with individual batch size
configuration. The default batch size of 25000
wouldn't work for the bigger tables.
In the end, for my purpose and workload, this cut down the time
- from 4h38m (1h11m importing, 1h41m adding FKs, 1h22 vacuuming)
- to 2h8m6s (74m30s importing, 25m50s adding FKs, 27m44s vacuuming)
(Actually, the latest numbers are slightly below 2h even, but I don't have more detailed stats for them)
synchronous_commit=off
and if you are on an older Postgres version most probably also need to increasecheckpoint_segments