A part of my database (PostgreSQL 9.3) relies on extra tables (e.g. County, City, Town, ...). I don't manage these tables; they are updated regularly by a third party. Each time, I get a new full dump, but I have a hard time pushing the changes back into my DB.
I've played with pg_dump / pg_restore and run into constraint issues such as `duplicate key value violates unique constraint` or `cannot drop constraint ... because other objects depend on it`, even with the `--disable-triggers` or `--clean` options.
Is there an option I've missed? I've found there are ways to turn constraints on and off, but I have no idea whether that is the right way to solve this or just a dirty hack (I'm not a DBA expert). To be honest, I'm quite surprised there isn't an easy way to achieve this; maybe I've missed it! I naively thought I could run pg_restore as one big transaction and have the constraints checked at the end of the script. Is that possible?
2 Answers
A `UNIQUE` constraint is not a trigger. It is implemented by way of a unique index, so it cannot be turned off with `--disable-triggers`.
"Other objects" that depend on a the unique constraint are typically foreign key constraints. Those cannot exist without a unique (or primary key) constraint on the referenced column(s). To enable the restore, you could remove all such fk constraints together with the unique constraint.
Of course, to restore referential integrity, you would then have to eliminate violating duplicates and re-create all removed constraints. If you cannot afford to have an inconsistent state, even temporarily, do it all in a single (automatically blocking) transaction.
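A minimal sketch of that single-transaction approach, assuming a county table whose code column is referenced by a city table; all table, column, and constraint names here are placeholders for your actual schema:

```sql
BEGIN;

-- Drop the fk constraint that depends on the unique constraint,
-- then the unique constraint itself.
ALTER TABLE city   DROP CONSTRAINT city_county_code_fkey;
ALTER TABLE county DROP CONSTRAINT county_code_key;

-- ... restore / reload the county data here ...

-- Eliminate duplicate codes, keeping the row with the smallest id per code.
DELETE FROM county
WHERE  id IN (
   SELECT id
   FROM  (SELECT id, row_number() OVER (PARTITION BY code ORDER BY id) AS rn
          FROM   county) sub
   WHERE  sub.rn > 1
   );

-- Re-create the constraints; the whole block succeeds or fails as one unit.
ALTER TABLE county ADD CONSTRAINT county_code_key UNIQUE (code);
ALTER TABLE city   ADD CONSTRAINT city_county_code_fkey
       FOREIGN KEY (county_code) REFERENCES county (code);

COMMIT;
```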
If you cannot afford exclusive locks on the involved tables, your only remaining option is to fix your data first. This is probably the best course of action either way. You could import your data into temporary tables with `COPY`, remove the duplicates there, and then `INSERT` into the target tables (a sketch follows the related links below).
- How to update selected rows with values from a CSV file in Postgres?
- Optimizing bulk update performance in PostgreSQL
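A rough sketch of that staging approach, again with placeholder table and column names:

```sql
BEGIN;

-- Staging table with the same structure as the target, dropped automatically at COMMIT.
CREATE TEMP TABLE county_staging (LIKE county INCLUDING DEFAULTS) ON COMMIT DROP;

-- Server-side COPY needs superuser; use \copy from psql otherwise.
COPY county_staging FROM '/path/to/county_dump.csv' WITH (FORMAT csv, HEADER true);

-- Insert only rows whose code is not already present, picking one row per duplicated code.
INSERT INTO county (id, name, code)
SELECT DISTINCT ON (s.code) s.id, s.name, s.code
FROM   county_staging s
WHERE  NOT EXISTS (SELECT 1 FROM county c WHERE c.code = s.code);

COMMIT;
```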
If, on the other hand, you run into missing values for foreign keys, you can improvise with `NOT VALID` fk constraints:
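A small sketch, with the same placeholder names as above:

```sql
-- Add the fk without checking existing rows; only new and updated rows are checked.
ALTER TABLE city
   ADD CONSTRAINT city_county_code_fkey
   FOREIGN KEY (county_code) REFERENCES county (code) NOT VALID;

-- Later, once the missing referenced rows have been supplied:
ALTER TABLE city VALIDATE CONSTRAINT city_county_code_fkey;
```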
Your problem should never occur to begin with. If you actually have a UNIQUE constraint in place, you cannot have duplicate values in your source database - unless it's seriously broken. If that's the case, fix your source db first ...
How about trying to solve this problem with Point-In-Time Recovery? A dump has the problem that you get an old set of data anyway. I highly suggest using a backup method that gives you a much more recent version of the data.
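Getting there requires continuous WAL archiving plus a base backup. A minimal sketch of the settings involved, assuming PostgreSQL 9.3 and a placeholder archive directory (see the PostgreSQL documentation on continuous archiving for the full procedure):

```
# postgresql.conf -- minimal continuous-archiving setup for PITR
wal_level = archive
archive_mode = on
# copy each completed WAL segment to an archive directory (path is a placeholder)
archive_command = 'test ! -f /mnt/wal_archive/%f && cp %p /mnt/wal_archive/%f'
```

A base backup (e.g. taken with pg_basebackup) combined with the archived WAL then lets you recover the cluster to an arbitrary point in time.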
You can also write yourself a simple change-log trigger that records everything that has changed in a simple table. Here is how it works: http://www.cybertec.at/tracking-changes-in-postgresql/ The code on that page should give you a rough prototype and help you fix things.
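A rough sketch of such a trigger, with placeholder names (the linked article shows a more complete version):

```sql
-- Log table: one row per change, with the affected row serialized as text.
CREATE TABLE county_log (
    id        bigserial   PRIMARY KEY,
    logged_at timestamptz NOT NULL DEFAULT now(),
    operation text        NOT NULL,
    old_row   text,
    new_row   text
);

CREATE OR REPLACE FUNCTION log_county_changes() RETURNS trigger AS
$$
BEGIN
    IF TG_OP = 'DELETE' THEN
        INSERT INTO county_log (operation, old_row) VALUES (TG_OP, OLD::text);
        RETURN OLD;
    ELSIF TG_OP = 'UPDATE' THEN
        INSERT INTO county_log (operation, old_row, new_row) VALUES (TG_OP, OLD::text, NEW::text);
        RETURN NEW;
    ELSE  -- INSERT
        INSERT INTO county_log (operation, new_row) VALUES (TG_OP, NEW::text);
        RETURN NEW;
    END IF;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER county_changes
AFTER INSERT OR UPDATE OR DELETE ON county
FOR EACH ROW EXECUTE PROCEDURE log_county_changes();
```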
- Interesting, thanks for the info. Unfortunately I don't have any control over the way the partial dump is generated. I'll edit my question to make this point clearer. – FastRipeFastRotten, Oct 27, 2014 at 12:26