I'm trying to set up a replica using repmgr:
repmgr -D /var/lib/postgresql/9.3/main -p 5432 -U repmgr -R postgres \
--verbose standby clone psql.master.example.com
repmgr --verbose standby register
I've managed to sync the databases, but the standby replica won't start:
postgres@psql01a:~$ /usr/lib/postgresql/9.3/bin/postgres --single -D /var/lib/postgresql/9.3/main -P -d 1
2016-04-18 14:02:05 UTC [30048]: [1-1] user=,db=,client= LOG: database system was shut down in recovery at 2016-04-18 14:00:51 UTC
2016-04-18 14:02:05 UTC [30048]: [2-1] user=,db=,client= LOG: entering standby mode
2016-04-18 14:02:05 UTC [30048]: [3-1] user=,db=,client= DEBUG: checkpoint record is at 27B5/BA68B550
2016-04-18 14:02:05 UTC [30048]: [4-1] user=,db=,client= DEBUG: redo record is at 27B5/B3626B20; shutdown FALSE
2016-04-18 14:02:05 UTC [30048]: [5-1] user=,db=,client= DEBUG: next transaction ID: 0/2281005353; next OID: 230242292
2016-04-18 14:02:05 UTC [30048]: [6-1] user=,db=,client= DEBUG: next MultiXactId: 879585; next MultiXactOffset: 1823275
2016-04-18 14:02:05 UTC [30048]: [7-1] user=,db=,client= DEBUG: oldest unfrozen transaction ID: 2094018845, in database 134461654
2016-04-18 14:02:05 UTC [30048]: [8-1] user=,db=,client= DEBUG: oldest MultiXactId: 1, in database 16546
2016-04-18 14:02:05 UTC [30048]: [9-1] user=,db=,client= DEBUG: transaction ID wrap limit is 4241502492, limited by database with OID 134461654
2016-04-18 14:02:05 UTC [30048]: [10-1] user=,db=,client= DEBUG: MultiXactId wrap limit is 2147483648, limited by database with OID 16546
2016-04-18 14:02:05 UTC [30048]: [11-1] user=,db=,client= DEBUG: resetting unlogged relations: cleanup 1 init 0
2016-04-18 14:02:05 UTC [30048]: [12-1] user=,db=,client= DEBUG: initializing for hot standby
2016-04-18 14:02:05 UTC [30048]: [13-1] user=,db=,client= LOG: redo starts at 27B5/B3626B20
2016-04-18 14:02:05 UTC [30048]: [14-1] user=,db=,client= DEBUG: recovery snapshots are now enabled
2016-04-18 14:02:05 UTC [30048]: [15-1] user=,db=,client= CONTEXT: xlog redo running xacts: nextXid 2281009749 latestCompletedXid 2281009746 oldestRunningXid 2281009747; 2 xacts: 2281009748 2281009747
2016-04-18 14:02:05 UTC [30048]: [16-1] user=,db=,client= PANIC: btree_xlog_delete_get_latestRemovedXid: cannot operate with inconsistent data
2016-04-18 14:02:05 UTC [30048]: [17-1] user=,db=,client= CONTEXT: xlog redo delete: index 1663/16546/215742765; iblk 363218, heap 1663/16546/215740352;
Aborted
Any idea how to start the replica?
1 Answer
The main problem was a differing configuration in postgresql.conf. It went away after modifying shared_buffers and several other settings (configuration based on pgtune for the given hardware):
maintenance_work_mem = 1GB
effective_cache_size = 22GB
work_mem = 15MB
wal_buffers = 8MB
shared_buffers = 7680MB
max_connections = 1024
After that I hit another error:
2016-04-18 15:36:39 UTC [5150-1] FATAL: could not create semaphores: No space left on device
2016-04-18 15:36:39 UTC [5150-2] DETAIL: Failed system call was semget(5432064, 17, 03600).
2016-04-18 15:36:39 UTC [5150-3] HINT: This error does *not* mean that you have run out of disk space.
It occurs when either the system limit for the maximum number of semaphore sets
(SEMMNI), or the system wide maximum number of semaphores (SEMMNS), would be
exceeded. You need to raise the respective kernel parameter. Alternatively,
reduce PostgreSQL's consumption of semaphores by reducing its max_connections
parameter. The PostgreSQL documentation contains more information about
configuring your system for PostgreSQL.
This can be fixed by raising the kernel semaphore limits:
echo 250 32000 256 256 > /proc/sys/kernel/sem
(This shouldn't be needed for PostgreSQL > 9.3.)
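For context, the four values written to /proc/sys/kernel/sem are, in order, SEMMSL, SEMMNS, SEMOPM and SEMMNI, so the command above raises SEMMNI (the number of semaphore sets) to 256. The sketch below is a rough estimate only of how many sets a given max_connections may need. It assumes one set of 17 semaphores per 16 backends, which matches the `semget(..., 17, ...)` call in the error. The function name `pg_sem_estimate` and the default of 10 extra background processes are hypothetical.

```shell
# Rough estimate of SysV semaphore usage for a given max_connections.
# Assumption: one set of 17 semaphores per 16 backends.
# $1 = max_connections
# $2 = extra background processes (hypothetical default: 10)
pg_sem_estimate() {
    backends=$(( $1 + ${2:-10} ))
    sets=$(( (backends + 15) / 16 ))   # integer ceiling of backends / 16
    echo "sets=$sets semaphores=$(( sets * 17 ))"
}

# To make the kernel setting survive a reboot, the same four values
# can be put in /etc/sysctl.conf and loaded with `sysctl -p`:
#   kernel.sem = 250 32000 256 256
```

With the settings from this answer, `pg_sem_estimate 1024` prints `sets=65 semaphores=1105`. Other processes on the host that use semaphores count against the same kernel limits.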
The issue was probably the max_connections setting. It has to be identical in master and replica (maybe wal_buffers as well, not sure). The other settings can be different. – ypercubeᵀᴹ, Apr 19, 2016 at 8:00
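One way to check this before starting the standby is to compare the setting in both configuration files. The following is a minimal sketch, assuming both files have been copied to one machine and use the plain `name = value` form. It ignores include files, quoted values and the `name value` form without `=`. The function names and any paths you pass in are hypothetical.

```shell
# Print the last uncommented value of a parameter in a
# postgresql.conf-style file ($1 = file, $2 = parameter name).
conf_value() {
    sed -n "s/^[[:space:]]*$2[[:space:]]*=[[:space:]]*\([^#[:space:]]*\).*/\1/p" "$1" | tail -n 1
}

# Compare one parameter between two config files
# ($1 = master conf, $2 = replica conf, $3 = parameter name).
compare_setting() {
    m=$(conf_value "$1" "$3")
    r=$(conf_value "$2" "$3")
    if [ "$m" = "$r" ]; then
        echo "$3: match ($m)"
    else
        echo "$3: MISMATCH master=$m replica=$r"
    fi
}
```

For example, `compare_setting master.conf replica.conf max_connections` reports whether the two files agree. On running servers, `psql -Atc "SHOW max_connections"` against each host gives the value actually in effect.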