I have these two tables:
messages(id primary key, message_date, created_at, ...)
user_messages(user_id, message_id references messages(id))
I have some duplicate rows in messages:
select user_id, message_date, count(*)
from messages inner join user_messages
on messages.id = user_messages.message_id
group by user_id, message_date;
user_id | message_date | count(*)
1 | 2019-01-01 | 2
1 | 2019-02-01 | 3
1 | 2019-03-01 | 2
How can I remove such duplicates, retaining only one of them, for example the one whose created_at (not message_date) is the minimum?
1 Answer
A lesser-known feature of Oracle databases: every row has a hidden pseudocolumn called ROWID.
These otherwise meaningless character values can be used to isolate duplicates like these and get rid of them.
This query should get you the candidate rows to be deleted:
select user_id, message_date, max(um.rowid)
from messages m
inner join user_messages um
on m.id = um.message_id
group by user_id, message_date
having count( * ) > 1
order by 1, 2 ;
user_id | message_date | rowid
1 | 2019-01-01 | AB12CD34EF...56
1 | 2019-02-01 | AB12CD34EF...78
1 | 2019-03-01 | AB12CD34EF...89
You can then delete rows using the ROWID directly.
delete from user_messages
where rowid = 'AB12CD34EF...56' ;
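To avoid pasting ROWID values by hand, the candidate query can be folded into the delete as a subquery. This is only a sketch, assuming Oracle and assuming that the rows to drop are the user_messages link rows, as in the delete above. Because max(rowid) returns a single row per group, a group with three duplicates loses only one row per run, so the statement would need repeating until it deletes zero rows.

```sql
-- Sketch only (Oracle assumed): delete one duplicate link row per group.
-- Re-run until 0 rows are deleted; inspect the result before committing.
delete from user_messages
where rowid in (
  select max(um.rowid)
  from messages m
  inner join user_messages um
    on m.id = um.message_id
  group by um.user_id, m.message_date
  having count(*) > 1
);
```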
All that said, do not be tempted to use ROWIDs for anything else!
ROWIDs are potentially volatile and so can change on you over time.
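Picking rows by max(rowid) does not guarantee that the survivor is the one with the smallest created_at, which is what the question asks for. One possible way to get that is an analytic row_number() ordered by created_at. The following is a sketch under stated assumptions, not a tested solution: it assumes Oracle, breaks created_at ties by messages.id (my choice, not the asker's), and in the second step assumes that every message worth keeping has at least one row in user_messages.

```sql
-- Sketch only (Oracle assumed). Run inside a transaction and check the
-- row counts before committing.

-- Step 1: per (user_id, message_date), keep the link to the message with
-- the earliest created_at and delete the other link rows.
delete from user_messages
where rowid in (
  select rid
  from (
    select um.rowid as rid,
           row_number() over (
             partition by um.user_id, m.message_date
             order by m.created_at, m.id   -- tie-break on id is an assumption
           ) as rn
    from messages m
    inner join user_messages um
      on m.id = um.message_id
  )
  where rn > 1
);

-- Step 2: delete messages that no user references any more.
-- ASSUMPTION: unreferenced messages are never wanted; skip this step if
-- messages may legitimately exist without a user_messages row.
delete from messages m
where not exists (
  select 1
  from user_messages um
  where um.message_id = m.id
);
```

Deleting the link rows first keeps the foreign key from user_messages to messages satisfied, and a message shared with another user survives step 2 because it is still referenced.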
(user_id, message_id) is the primary key.