This query below will never complete, since it always looks at the first 100,000 (user_type, user_ids). How can I write this to delete 100,000 records from the table with the FK reference (Product) to the UserTable table? In other words, the limit should happen after the join, but preferably in an efficient way? UserTable has an index on (user_type, user_id) and (shard_key) and Product has an index on (user_type, user_id)
It doesn't necessarily have to be exactly 100,000 - for example, each user could own up to 100 products and that would be fine if the records deleted were (avg. # of products * 100,000)
DELETE p.*
FROM (
SELECT user_type, user_id
FROM UserTable
WHERE MOD(shard_key,20) != 7
LIMIT 100000
) t
JOIN Product p ON (
t.user_type = p.user_type AND t.user_id = p.user_id
)
EDIT: This is MySQL 5.6
1 Answer 1
This workaround is OK for my case but isn't great, since it needs to scan the entire index of the User table, so wouldn't be very portable. The user:product ratio can be up to 1:1000, and it's feasible to scan the User table first using a covering index before doing the JOIN.
DELETE
FROM Product f
JOIN (
SELECT t.* FROM (
SELECT DISTINCT user_type, user_id
FROM UserTable
WHERE MOD(shard_key,20) != 7
) p
JOIN Product t ON (
t.user_type = p.user_type
AND
t.user_id = p.user_id
) LIMIT 10
) x
ON
x.user_type = f.user_type
AND
x.user = f.user_id;
My takeaway is: If you can reasonably scan the entire primary key table quickly using indexes, and join with the foreign key table afterwards (ideally also using indexes), that is a solution that ensures you will be able to find ALL foreign key references if you want to batch DELETE in situations where many foreign keys reference each primary key.In this case, the User table was ~1GB and the Product table ~350GB
Explore related questions
See similar questions with these tags.
delete from UserTable where ... and exists(select 1 from Product) limit 100000
?