I have some PostgreSQL 10 instances running on Windows Server that are in continuous recovery mode. Once in a while they stop recovering without reporting any errors, as in this example log file (in CSV format; I've removed some of the fields for clarity):

2022-08-23 19:42:02.391,"restored log file ""000000010000029F0000001A"" from archive"
2022-08-23 19:42:07.638,"restored log file ""000000010000029F0000001B"" from archive"
2022-08-23 19:42:13.276,"restored log file ""000000010000029F0000001C"" from archive"
2022-08-23 19:42:18.464,"restored log file ""000000010000029F0000001D"" from archive"
2022-08-23 19:42:18.699,"redo done at 29F/1CFFF7F8"
2022-08-23 19:42:18.708,"last completed transaction was at log time 2022-07-20 12:49:38.247406-03"
2022-08-23 19:42:24.304,"restored log file ""000000010000029F0000001C"" from archive"
2022-08-23 19:42:48.625,"selected new timeline ID: 2"
2022-08-23 19:43:13.718,"archive recovery complete"
2022-08-23 19:43:27.746,"database system is ready to accept connections"

This happens even though the next WAL files to be restored in the sequence (000000010000029F0000001D, 000000010000029F0000001E) are present in the archive directory. The restore command I'm using is something like this:

restore_command = '"C:/program files/postgresql/10/bin/pg_standby.exe" -s 2 D:/archive/127 %f %p %r 2>>D:/archive/127/pg_standby.log'

My question is: is there any way I can find out what caused the instance to stop recovering?

asked Aug 24, 2022 at 12:34
  • What is your setup? That looks like you didn't set standby_mode = on. Commented Aug 24, 2022 at 13:04
  • Did you look in your pg_standby.log? BTW, pg_standby is very obsolete; built-in standby mode has been around for a long time. Commented Aug 24, 2022 at 13:20
  • Hi @LaurenzAlbe. My setup is Windows Server 2019 Standard and PostgreSQL 10.18. It looks like adding standby_mode='on' to recovery.conf solved the problem. I think what was causing the instance to stop recovering is this error: "could not rename file ""pg_wal/00000001000002AA00000042"" to ""pg_wal/00000001000002AA00000085""". Errors of this type are now appearing once in a while in between "restored log file" entries. Commented Aug 25, 2022 at 13:44
  • Hi @jjanes. We use the built-in replication for on-site disaster recovery, but for off-site disaster recovery we use log shipping. Commented Aug 25, 2022 at 13:46
  • That error is a different question. There was a bug like that a while ago; you could ask a new question with the complete error message in it. Commented Aug 25, 2022 at 13:52

1 Answer

If recovery stops and the server promotes without you explicitly telling it to do so, you are probably in archive recovery mode rather than in standby mode.

Since PostgreSQL v12, you activate standby mode by creating a file standby.signal rather than recovery.signal in the PostgreSQL data directory.

Before PostgreSQL v12, you have to set standby_mode = on in recovery.conf to achieve the same thing.
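
For the PostgreSQL 10 setup in the question, a minimal sketch of recovery.conf could look like the fragment below. The restore_command is copied unchanged from the question; keeping pg_standby there is an assumption based on the asker's own setup, not a recommendation, and the comments are illustrative only.

```conf
# recovery.conf in the data directory (PostgreSQL 11 and earlier only;
# from v12 on, create an empty standby.signal file instead)

# Stay in standby mode: keep replaying WAL from the archive instead of
# finishing archive recovery and promoting.
standby_mode = 'on'

# Restore command as given in the question (assumed unchanged).
restore_command = '"C:/program files/postgresql/10/bin/pg_standby.exe" -s 2 D:/archive/127 %f %p %r 2>>D:/archive/127/pg_standby.log'
```

Changes to recovery.conf are only read at server start, so the standby has to be restarted for the setting to take effect.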

answered Aug 25, 2022 at 13:51