I've inherited a large PostgreSQL 8.4.7 database that is currently not being backed up at all. I believe the database itself is around 100GB. I've been reading about the different methods to back up this type of DB here.
My questions are:
1) Typically how fast or slow is the pg_dump dump process? I know it's a subjective question and depends on hardware and the database, but for roughly 100GB can I expect this to take 1 hour, 12 hours, 24+? I'd like to set an expectation with management and users before I begin.
2) The restore instructions indicate the following:
Before restoring a SQL dump, all the users who own objects or were granted permissions on objects in the dumped database must already exist. If they do not, then the restore will fail to recreate the objects with the original ownership and/or permissions.
Is it possible to gather this information as part of the backup? Or is there a separate backup process for users and permissions? Because I'm inheriting this database, I'm not sure about all the users and permissions yet and would like to just back up everything for offsite storage.
1 Answer
2) pg_dumpall is what you're looking for if you want all objects (globals, roles, etc.).
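As a rough sketch of how that could look in practice, the function below dumps the cluster-wide globals (roles and their attributes) with pg_dumpall and then one database with pg_dump in custom format. The function name backup_cluster, the database name mydb, and the output directory are hypothetical placeholders, and connection options such as -U or -h are left out on the assumption that the script runs as a user who can already connect.

```shell
# Minimal sketch, not a complete backup strategy.
# backup_cluster, db_name and out_dir are illustrative names, not part of PostgreSQL.
backup_cluster() {
    db_name="$1"
    out_dir="$2"

    mkdir -p "$out_dir" || return 1

    # Roles and other cluster-wide objects, as plain SQL.
    pg_dumpall --globals-only > "$out_dir/globals.sql" || return 1

    # One database in custom format (-Fc), which pg_restore can read.
    pg_dump -Fc -f "$out_dir/$db_name.dump" "$db_name" || return 1
}

# Example call (hypothetical database name and path):
# backup_cluster mydb /var/backups/pg
```

On restore, the globals file would be loaded first with psql so that the roles exist before pg_restore recreates the objects with their original ownership and permissions.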
1) You noted that it is subjective. On my dev laptop (Win7 x64 .. i7 4 @ 2.2 ... 12 GB RAM ... normal, spinny disk @ 5400) I back up a 4GB pg cluster in about 15 seconds. I don't know if it scales linearly.
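For a first number to give management, the figure above can be extrapolated, with the loud caveat that linear scaling is an assumption and not something measured here. The helper name estimate_dump_seconds is made up for illustration.

```shell
# Naive linear extrapolation: assumes dump time grows in proportion to size.
# That is an assumption; real dumps depend on hardware, indexes, and data types.
estimate_dump_seconds() {
    sample_gb="$1"
    sample_seconds="$2"
    target_gb="$3"
    # Integer arithmetic: target size times the observed seconds per sample size.
    echo $(( target_gb * sample_seconds / sample_gb ))
}

# 4GB in 15 seconds, scaled to 100GB:
estimate_dump_seconds 4 15 100   # prints 375
```

That works out to a little over six minutes, which is best treated as an optimistic starting point. Timing a dump of one representative table or a smaller database on the actual server would give a more trustworthy estimate.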
Finally, depending on your OS, I'd verify that it's not actually being backed up. I've seen pg on Linux being backed up at the file system level using rsync. Without getting into a debate about the relative merits of these differing methods, I'd verify that this isn't already implemented.
Thanks for the info on pg_dumpall. This server definitely isn't backed up off-site currently. They have an active-passive setup with a second server so the data is replicated; the only problem is both servers are in the same rack in the same server room! I want to take something offsite in case there is a fire or aliens abduct the servers one night... – ProfessionalAmateur, Jul 23, 2012 at 16:15
@ProfessionalAmateur Yep, particularly pg_dumpall --globals-only – Craig Ringer, Jul 24, 2012 at 13:38
... pg_dump like you expect to have to stop the server. You don't. You should be able to dump a live server just fine. There's some increase in load - but you can address that to a degree with an ionice and/or renice on the postgres backend that's feeding pg_dump. The main impact is that the long transaction can hamper autovacuum's work somewhat.
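A minimal sketch of that throttling on Linux, assuming you have already found the PID of the postgres backend serving pg_dump (on 8.4 the pg_stat_activity view exposes it as procpid). The function name throttle_backend and the chosen priority values are illustrative, not a recommendation from the comment above.

```shell
# Lower the CPU and I/O priority of one process, given its PID.
# throttle_backend is a hypothetical helper; the niceness values are examples.
throttle_backend() {
    pid="$1"

    # Refuse anything that is not a plain number.
    case "$pid" in
        ''|*[!0-9]*)
            echo "usage: throttle_backend PID" >&2
            return 1
            ;;
    esac

    # CPU: set niceness to 10 (lower scheduling priority).
    renice -n 10 -p "$pid" || return 1

    # I/O: best-effort class (2) at its lowest priority level (7).
    ionice -c 2 -n 7 -p "$pid" || return 1
}

# Example call with a made-up PID:
# throttle_backend 12345
```

Both commands usually need to be run as root or as the user that owns the postgres process, and whether ionice has any effect depends on the I/O scheduler in use.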