We have a table such as
create table pages (
id uuid primary key not null,
parent_id uuid not null references pages(id),
shortname text not null,
shorturl text not null,
constraint globally_unique_shorturl unique (shorturl)
constraint locally_unique_shortname unique (parent_id,shortname)
)
where the logical intent is to keep a collection of web pages where the full URL for each page is constructed from shortname
fields defined by parent_id
references. Root of the system points to itself with parent_id
. As such, any given child must have unique shortname per parent (constraint locally_unique_shortname). In addition, we have globally unique shorturl (think bit.ly
) that obviously must be globally unique for the whole table. The target is to keep both shortname and shorturl as short as possible and as such, we generate short random identifiers for shorturl. In addition, the shortname is selectable by user to allow system to create semantic URLs for every page.
When I'm trying to add a new page to the system (or modify existing page), we do it in a transaction and we create new SAVEPOINT
immediately before trying to add a new page. If we get error
ERROR: duplicate key value violates unique constraint 'XYZ'
we know that we need to rollback to savepoint and then figure out if shorturl
or shortname
should be modified before retrying. Note that if shorturl
fails, we can generate a new one automatically, but if shortname
fails, we need to prompt user for a new value, so we really need to know which constraint failed. (In addition, this query fails pretty seldom so I'd prefer not having to try SELECT
with the new value to figure out if it's already taken because this mostly succeeds without any extra queries. Also note that such SELECT
would still race against other concurrent transactions and I'd still end up with the same issue every now and then.) We're using serializable transactions, in case it makes a difference.
However, this causes PostgreSQL to always store this error in the server log even when the code obviously handles this correctly (rollback is emitted immediately the query fails). These errors spam the log to make other possible real problems harder to notice. Obviously, I can just fall back to filtering the log before trying to monitor it, but that would feel pretty poor solution.
To workaround this, Disable log ERROR: duplicate key value violates unique constraint suggests that I should use syntax INSERT ... ON CONFLICT DO NOTHING
. This would be otherwise fine (I could try to do insert and if no rows were affected I can assume that I have a conflict and nothing will be added to server log) but I'm missing info about which constraint failed when there are multiple possible constraints.
Is there any nice way to solve this without having to modify source code of PostgreSQL or load some custom extension in PostgreSQL?
1 Answer 1
You have two unique constraints, and you say that the one on shortname
causes an error only rarely.
So use
INSERT ... ON CONFLICT ON (shorturl) DO NOTHING;
Then check if you got an error and how many rows were inserted.
if you got an error, prompt the user for a new
shortname
if no row was inserted, generate a new
shorturl
Then you will only get error messages about duplicate shortname
s in your log, which should reduce the log spam to a level that is not bothersome.
-
Pretty nice solution for this case! This wouldn't scale up to 3 or more constraints so I will not mark it as accepted. Even though the example in question only gave 2 constraints, I'm hoping to come up with generic solution.Mikko Rantalainen– Mikko Rantalainen2020年09月01日 16:33:56 +00:00Commented Sep 1, 2020 at 16:33
-
I think that solution would be to generate names so that collisions are unlikely, for example using a hash.Laurenz Albe– Laurenz Albe2020年09月02日 05:03:52 +00:00Commented Sep 2, 2020 at 5:03
-
Due birthday paradox, you need to use much longer hashes than strictly necessary if unique value is all you need. When you target short identifiers, you'll get collisions every now and then unless you have single serialized source for the identifiers.Mikko Rantalainen– Mikko Rantalainen2020年09月02日 09:18:25 +00:00Commented Sep 2, 2020 at 9:18
-
What's wrong with a database sequence?Laurenz Albe– Laurenz Albe2020年09月02日 12:10:19 +00:00Commented Sep 2, 2020 at 12:10
-
1You misunderstand. Logical replication does not replicate the statement, but the resulting string. Try it! Use longer strings to reduce the odds of collision.Laurenz Albe– Laurenz Albe2020年09月03日 08:00:10 +00:00Commented Sep 3, 2020 at 8:00
Explore related questions
See similar questions with these tags.