7

Settings in master server:

max_wal_senders = 1
wal_level = 'archive'
archive_mode = on
archive_command = 'cd .'
wal_keep_segments = 10000

Settings in slave server: in recovery.conf file:

Standby_mode = 'on'
primary_conninfo = 'host=ipaddress of master user=repuser'
trigger_file = '/tmp/postgresql.trigger.5432'

log_connections=on is set up on both master and slave server

entry is made for replication user in pg_hba.conf file in master server

host replication repuser ipaddress/32 trust

when trying to replicate i get the following error

2014年07月14日 19:28:22 IST LOG: database system was shut down in recovery at 2014- 07-14 19:28:21 IST
2014年07月14日 19:28:22 IST LOG: entering standby mode
2014年07月14日 19:28:22 IST WARNING: WAL was generated with wal_level=minimal, data may be missing
2014年07月14日 19:28:22 IST HINT: This happens if you temporarily set wal_level=min imal without taking a new base backup.
2014年07月14日 19:28:22 IST LOG: consistent recovery state reached at 0/19FFE28
2014年07月14日 19:28:22 IST LOG: record with zero length at 0/19FFE28
2014年07月14日 19:28:22 IST FATAL: database system identifier differs between the p rimary and standby
2014年07月14日 19:28:22 IST DETAIL: The primary's identifier is 6022019027749040119 , the standby's identifier is 6033562405193904122.
2014年07月14日 19:28:23 IST LOG: connection received: host=[local]
2014年07月14日 19:28:23 IST FATAL: the database system is starting up
2014年07月14日 19:28:24 IST LOG: connection received: host=[local]
2014年07月14日 19:28:24 IST FATAL: the database system is starting up
2014年07月14日 19:28:25 IST LOG: connection received: host=[local]
2014年07月14日 19:28:25 IST FATAL: the database system is starting up
2014年07月14日 19:28:26 IST LOG: connection received: host=[local]
2014年07月14日 19:28:26 IST FATAL: the database system is starting up
2014年07月14日 19:28:27 IST LOG: connection received: host=[local]
2014年07月14日 19:28:27 IST FATAL: the database system is starting up
"postgresql-2014年07月14日_192822.log" 6630L, 756429C
2014年07月14日 19:28:22 IST LOG: database system was shut down in recovery at 2014年07月14日 19:28:21 IST
2014年07月14日 19:28:22 IST LOG: entering standby mode
2014年07月14日 19:28:22 IST WARNING: WAL was generated with wal_level=minimal, data may be missing
2014年07月14日 19:28:22 IST HINT: This happens if you temporarily set wal_level=minimal without taking a new base backup.
2014年07月14日 19:28:22 IST LOG: consistent recovery state reached at 0/19FFE28
2014年07月14日 19:28:22 IST LOG: record with zero length at 0/19FFE28
2014年07月14日 19:28:22 IST FATAL: database system identifier differs between the primary and standby
2014年07月14日 19:28:22 IST DETAIL: The primary's identifier is 6022019027749040119, the standby's identifier is 6033562405193904122.
2014年07月14日 19:28:23 IST LOG: connection received: host=[local]
2014年07月14日 19:28:23 IST FATAL: the database system is starting up
2014年07月14日 19:28:24 IST LOG: connection received: host=[local]
2014年07月14日 19:28:24 IST FATAL: the database system is starting up
2014年07月14日 19:28:25 IST LOG: connection received: host=[local]
2014年07月14日 19:28:25 IST FATAL: the database system is starting up
2014年07月14日 19:28:26 IST LOG: connection received: host=[local]
2014年07月14日 19:28:26 IST FATAL: the database system is starting up
2014年07月14日 19:28:27 IST LOG: connection received: host=[local]
2014年07月14日 19:28:27 IST FATAL: the database system is starting up
2014年07月14日 19:28:28 IST LOG: connection received: host=[local]
2014年07月14日 19:28:28 IST FATAL: the database system is starting up
2014年07月14日 19:28:29 IST LOG: connection received: host=[local]
2014年07月14日 19:28:29 IST FATAL: the database system is starting up
2014年07月14日 19:28:30 IST LOG: connection received: host=[local]
2014年07月14日 19:28:30 IST FATAL: the database system is starting up
2014年07月14日 19:28:31 IST LOG: connection received: host=[local]
2014年07月14日 19:28:31 IST FATAL: the database system is starting up
2014年07月14日 19:28:32 IST LOG: connection received: host=[local]
2014年07月14日 19:28:32 IST FATAL: the database system is starting up
2014年07月14日 19:28:33 IST LOG: connection received: host=[local]
2014年07月14日 19:28:33 IST FATAL: the database system is starting up
2014年07月14日 19:28:34 IST LOG: connection received: host=[local]
2014年07月14日 19:28:34 IST FATAL: the database system is starting up
2014年07月14日 19:28:35 IST LOG: connection received: host=[local]
2014年07月14日 19:28:35 IST FATAL: the database system is starting up
2014年07月14日 19:28:36 IST LOG: connection received: host=[local]
2014年07月14日 19:28:36 IST FATAL: the database system is starting up
2014年07月14日 19:28:37 IST LOG: connection received: host=[local]
2014年07月14日 19:28:37 IST FATAL: the database system is starting up
2014年07月14日 19:28:37 IST FATAL: database system identifier differs between the primary and standby
2014年07月14日 19:28:37 IST DETAIL: The primary's identifier is 6022019027749040119, the standby's identifier is 6033562405193904122.
2014年07月14日 19:28:38 IST LOG: connection received: host=[local]
2014年07月14日 19:28:38 IST FATAL: the database system is starting up
2014年07月14日 19:28:39 IST LOG: connection received: host=[local]
IST FATAL: database system identifier differs between the primary and standby
2014年07月14日 19:28:37 IST DETAIL: The primary's identifier is 6022019027749040119, the standby's identifier is 6033562405193904122.

What do these lines mean?

when I run this command

service postgresql-9.3 start

I get the following error in startup.log file

2014年07月15日 11:25:59 IST FATAL: lock file "postmaster.pid" already exists
2014年07月15日 11:25:59 IST HINT: Is another postmaster (PID 25961) running in data directory "/opt/postgres/PostgreSQL/9.3/data"?

Extend the question

We are new to Postgresql open source. We couldn’t understand the answer to this question. Please if possible explain briefly. We have followed the instructions from the book 'Postgresql 9 Administration Cook book'. We followed these steps from this book.

Carry out the following steps:

  1. Identify your Master and Standby nodes, and ensure that they have been configured according to the best practice recipe.

  2. Configure replication security. Create or confirm the existence of the replication user on Master node

    CREATE USER repuser 
    SUPERUSER 
    LOGIN 
    CONNECTION LIMIT 1 
    ENCRYPTED PASSWORD 'changeme'; 
    
  3. Allow the replication user to authenticate. The following example allows access from any ip address using encrypted password authentication; you may wish to consider more restrictive options. Add the following line:

    host replication repuser 127.0.0.1/0 md5 
    
  4. Set logging options in postgresql.conf on both Master and Standby, so that you get increased information regarding replication connection attempts and associated failures.

    log_connections = on
    
  5. Set max_wal_senders on Master in postgresql.conf, or increment if the value is already non-zero.

    max_wal_senders = 1 
    wal_mode = 'archive' 
    archive_mode = on 
    archive_command = 'cd .' 
    
  6. Adjust wal_keep_segments on Master in postgresql.conf. Set this to a value no higher than the amount of freespace on the drive on which the pg_xlog directory is mounted, divided by 16MB. If pg_xlog isn't mounted on a separate drive, then don't assume all of the current freespace is available for transaction log files.

    wal_keep_segments = 10000 # e.g. 160 GB 
    
  7. Adjust hot Standby parameters if required (see later recipe)

  8. Take a base backup, very similar to the process for taking a physical backup as described in the backup chapter.

    a. Start the backup

     psql -c "select pg_start_backup('base backup for streaming rep')" 
    

    b. Copy the data files (excluding the pg_xlog directory)

     rsync -cva --inplace --exclude=*pg_xlog* \ 
     ${PGDATA}/ $STANDBYNODE:$PGDATA 
    

    c. Stop the backup

     psql -c "select pg_stop_backup(), current_timestamp" 
    
  9. Set the recovery.conf parameters on the Standby. Note that the primary_ conninfo must not specify a database name, though can contain any other PostgreSQL connection option. Note also that all options in recovery.conf are enclosed in quotes, whereas postgresql.conf parameters need not be.

    Standby_mode = 'on' 
    primary_conninfo = 'host=192.168.0.1 user=repuser' 
    trigger_file = '/tmp/postgresql.trigger.5432' 
    
  10. Start Standby server

  11. Carefully monitor replication delay until the catchup period is over. During the initial catchup period, the replication delay will be much higher than we would normally expect it to be. You are advised to set hot_Standby = off for the initial period only.

Craig Ringer
57.9k6 gold badges162 silver badges193 bronze badges
asked Jul 15, 2014 at 6:02
1
  • Answer updated. I think I was right about what happened, it just wasn't clear why with the information available. I'm still guessing since I can't know what exactly you really did, but hopefully this'll help. Commented Jul 17, 2014 at 8:40

1 Answer 1

17

You seem to be trying to replicate from one server to another that wasn't set up using a copy of the original server. That's why:

database system identifier differs between the primary and standby. The primary's identifier is 6022019027749040119, the standby's identifier is 6033562405193904122.

Because each newly initdb'd PostgreSQL gets a new random system identifier. When you copy an existing PostgreSQL install, it keeps the same system identifier. That's how PostgreSQL can keep track of whether one server can replay WAL from another.

You can only use physical replication if the replica is a copy (file-system level backup e.g. pg_basebackup) of the master. See the manual's detailed coverage on replication for more information.


Update:

The instructions shown above should be fine, but they're not as clear as they could be.

The standby server's data directory is supposed to be replaced by the base backup you create at step 8, if it exists in the first place.

You can't make an existing PostgreSQL instance into a standby for another without replacing its data directory. You need a copy of the master's data directory to run a standby. A common way to set that up is to take an existing standby, delete its data directory, replace it with a copy of the master's data directory, and then configure it as a replication slave. That's what I think step 8 is supposed to be doing.

Instead of doing that I think you probably used an existing data directory for the slave and tried to start it up as a replica of the master. That will not work, and will result in the errors you showed.

The main PostgreSQL documentation on replication is the recommended and primary resource for information. I suggest going there first.

You might also want to check out repmgr, which helps automate replication and failover tasks.

answered Jul 15, 2014 at 6:39

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.