I turned the archive_mode on on my postgresql configuration for testing a backup server . And since the wal files took lots of disk space, after my test I turned it off and also deleted the wal files. When I tried to restart postgresql I got the following error.
root@hooshang:/etc/postgresql/9.1/main# /etc/init.d/postgresql restart
* Restarting PostgreSQL 9.1 database server
* The PostgreSQL server failed to start. Please check the log output:
2014年10月16日 13:15:28 IRST LOG: database system was shut down at 2014年10月15日 15:51:53 IRST
2014年10月16日 13:15:28 IRST LOG: could not open file "pg_xlog/00000001000007DC00000037" (log file 2012, segment 55): No such file or directory
2014年10月16日 13:15:28 IRST LOG: invalid primary checkpoint record
2014年10月16日 13:15:28 IRST LOG: could not open file "pg_xlog/00000001000007DC00000029" (log file 2012, segment 41): No such file or directory
2014年10月16日 13:15:28 IRST LOG: invalid secondary checkpoint record
2014年10月16日 13:15:28 IRST PANIC: could not locate a valid checkpoint record
2014年10月16日 13:15:28 IRST LOG: startup process (PID 17467) was terminated by signal 6: Aborted
2014年10月16日 13:15:28 IRST LOG: aborting startup due to startup process failure
How can I fix this issue?
1 Answer 1
You have corrupted your database by manually deleting files from within the data directory. Never delete files from within the data directory manually.
Safely removing WAL
If you want to remove WAL, either let the server do it at CHECKPOINT
time, or use pg_archivecleanup
. Note that the server will remove WAL is no longer needs automatically, unless:
archive_mode
is on, butarchive_command
is failing, so the server keeps on retrying the archive attempts until they succeed or the admin intervenes;- It's still preserved by
wal_keep_segments
- (in 9.4) It's still needed by a replication slot.
If none of those apply, a CHECKPOINT
(either automatic, or manually issued via SQL) will remove all WAL that's not currently required. So you don't have to delete it by hand.
In unusual circumstances you might need to use pg_archivecleanup
, like if you've run out of disk space due to WAL accumulation after archiving has failed for a long time. You might decide to accept that you'll have to re-create your replicas as a result of throwing away WAL they still need and use pg_archivecleanup
to free space to get the master running.
But you should never delete WAL segments by hand.
Recovering with archived WAL
If you kept the archived WAL somewhere else, you might just be able to copy the files back into pg_xlog
, or create a recovery.conf
with a restore_command
to do so. See the manual on PITR and log shipping for details.
Recovering without archived WAL
If you don't have a copy of those WAL files somewhere else, you should restore from a backup if you have a recent backup, because you have corrupted your database.
If you do not have a backup, follow the instructions in the corruption wiki page and only once you have made a complete copy of the current state of the database, as a last resort only, use pg_resetxlog
to throw away the transaction logs and force the DB to start with incomplete transactions.
You must then pg_dump
the database, stop it, initdb
a new one, and restore to it. Do not keep using the database you damaged. Never keep using a database you used pg_resetxlog
on, and never use it except as a last resort.
-
Highly important is the last point about the very dangerous pg_resetxlog, which I have already seen in backup tools by 3rd party applications (!!).KookieMonster– KookieMonster2014年10月17日 11:51:51 +00:00Commented Oct 17, 2014 at 11:51
-
@KookieMonster Please tell me their names so I can contact the vendors. That's bad.Craig Ringer– Craig Ringer2014年10月17日 14:06:29 +00:00Commented Oct 17, 2014 at 14:06
-
They were told for sure :). As far as I know, they've changed it some time after. Surprising tough for a rather big company supporting PostgreSQL restores...KookieMonster– KookieMonster2014年10月17日 14:36:41 +00:00Commented Oct 17, 2014 at 14:36
-
@CraigRinger You are a savior. This answer helped me a lot after I removed pg_xlog directory.Nikunj Sardhara– Nikunj Sardhara2017年03月17日 09:36:07 +00:00Commented Mar 17, 2017 at 9:36
-
@NikunjSardhara Good news, it's been renamed to
pg_wal
in PostgreSQL 10, in the hopes that people won't delete it thinking its "just logs" anymore. Think that would've helped for you?Craig Ringer– Craig Ringer2017年03月17日 12:58:44 +00:00Commented Mar 17, 2017 at 12:58
Explore related questions
See similar questions with these tags.