While recovering from a cloud failure, I found that some tables in a PostgreSQL database are behaving strangely. These tables are indexed with a primary key, yet a pg_dump produced duplicate key values, which caused pg_restore to fail on a backup server.
I have tried to REINDEX:
REINDEX INDEX rank_details_pkey;
ERROR: could not create unique index "rank_details_pkey"
DETAIL: Table contains duplicated values.
The index is defined as:
<table info here>
Indexes:
"rank_details_pkey" PRIMARY KEY, btree (user_id)
And, oddly,
SELECT user_id, COUNT(*) FROM <table name> GROUP BY 1 HAVING COUNT(*) > 1;
user_id | count
---------+-------
(0 rows)
To conclude: I have duplicate values in my table that cannot be found or cleared. Any ideas on how to fix this? This is a production server, so any fix should be done without affecting service.
Comment: You might want to look at the plan of the grouping query to check that it doesn't use the index. – Peter Eisentraut, Aug 18, 2011
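A sketch of how to check that (the table name rank_details is taken from the answer below):
EXPLAIN
SELECT user_id, count(*)
FROM rank_details
GROUP BY user_id
HAVING count(*) > 1;
If the plan shows a scan of rank_details_pkey feeding the aggregate, the zero-row result is coming from the suspect index rather than from the table itself.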
1 Answer
There are various ways this can happen in Oracle; I'm not sure about Postgres, but I think I would call this an "integrity violation" rather than "corruption".
Perhaps you can do one of the things suggested here, i.e. set enable_indexscan = off.
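A minimal sketch of that approach, using the rank_details table name from the query below (turning off bitmap scans as well is my own addition, in case the planner falls back to one):
SET enable_indexscan = off;
SET enable_bitmapscan = off;  -- assumption: also rule out bitmap index scans
SELECT user_id, count(*)
FROM rank_details
GROUP BY user_id
HAVING count(*) > 1;
RESET enable_indexscan;
RESET enable_bitmapscan;
Both settings affect only the current session, so other connections are untouched.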
or
begin;
drop index rank_details_pkey;
select user_id, count(*) from rank_details group by user_id having count(*) > 1;
rollback;
But "there are likely some locking issues with this, so be careful with it in production"
The idea is to force the query to scan the table rather than just the index (which does not have the duplicates). You may also be able to achieve the same thing more simply with:
select user_id, f(<some other column>), count(*)
from rank_details
group by user_id, f(<some other column>)
having count(*) > 1
where f() returns a constant, which may trick the planner into a table scan.
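As a concrete sketch, with a hypothetical column created_at standing in for any column that is not part of the primary key index:
-- substr(created_at::text, 1, 0) always evaluates to '', so the
-- grouping is effectively still by user_id alone, but the query now
-- references a column the index does not cover.
SELECT user_id, substr(created_at::text, 1, 0) AS dummy, count(*)
FROM rank_details
GROUP BY user_id, substr(created_at::text, 1, 0)
HAVING count(*) > 1;
Whether this actually avoids the index is up to the planner, so it is worth confirming with EXPLAIN before trusting the result.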