2

I have a table item having just one column name:-

name
-----
toys
shirt
mobile
Shirt
speaker
Toys
.... 
....

I am trying to delete duplicate rows except one duplicated record with the help of below query

DELETE FROM
(
SELECT `name`,`rownumber` FROM (SELECT GREATEST(0,@num := IF(`name` = @NAME, 
@num + 1, 1),LEAST(1, LENGTH(@NAME := `name`))) AS rownumber,`name` FROM 
item ORDER BY `name`) AS result1
) AS result 
WHERE rownumber >1

The Inner query (Starts from SELECT name and ends with result1) returns individual auto incremented number for each repeating group but when i run entire query then it does not work.

I would like to know what changes i need to make in the inner query (contained in round braces) to delete duplicate rows.

asked Jul 23, 2012 at 10:35
2
  • 2
    If no PK, then you need to tell us what is a duplicate row? is it duplicate if only 1 column in row 1 matches the same column in row 2? Or is it duplicate if all columns match? Do you care about case? etc. Commented Jul 23, 2012 at 10:50
  • This is duplicate as item name "toys" and "shirt" is being repeated in a table with Camel case (First letter capital). So i want to delete those records in Camel case. Commented Jul 23, 2012 at 10:52

2 Answers 2

1

I assume you have no Foreign key constarints on that table so the following procedure will work:

Make another identical table:

CREATE TABLE stage
( name VARCHAR(255) NOT NULL
) ;

Copy there the non-duplicate names:

INSERT INTO stage 
 (name)
SELECT name
FROM item
GROUP BY name ;

Delete everything from the original table:

TRUNCATE TABLE item ;

And make the name column unique (or the primary key):

ALTER TABLE item
 ADD CONSTRAINT item_pk
 PRIMARY KEY (name) ;

Now write the non-duplicate data back:

INSERT INTO item
 (name)
SELECT name
FROM stage ;

And drop the (temporary) table:

DROP TABLE stage ;
RolandoMySQLDBA
185k34 gold badges327 silver badges541 bronze badges
answered Jul 23, 2012 at 12:25
5
  • 1
    The OP is regarding 'toys' and 'Toys' as duplicates, so step 2 (the copy) should be INSERT INTO stage (name) SELECT lower(name) FROM item GROUP BY lower(name); Commented Jul 23, 2012 at 19:05
  • @ypercube: I also knew this way of using temporary table to delete duplicates. But i was expecting someone to help me on the query written in my question. I don't want to use any other way because using third/temporary table is not a good practice for this. Commented Jul 24, 2012 at 6:43
  • @RolandoMySQLDBA: Can you please take a look at my query written in the question and suggest? Commented Jul 24, 2012 at 6:46
  • I upvoted this one yesterday because it is the most efficient. It also makes the most sense due to the fact that the item table does not have other columns to work with in terms of iteration. I recommended something like this back in March 2011 : dba.stackexchange.com/a/1502/877 and Sep 2011 : dba.stackexchange.com/a/6269/877. Good job, @ypercube. Commented Jul 24, 2012 at 13:52
  • @RolandoMySQLDBA: thnx. With no Primary key or unique index, I thought this is the most sensible way to go. Commented Jul 24, 2012 at 14:21
1

In the spirit of @yercube's answer, I have an answer that has an added twist.

CREATE TABLE stage
(
 id int not null auto_increment,
 name varchar(20),
 primary key (id)
);
CREATE TABLE stage2 LIKE stage;
INSERT INTO stage (name) SELECT name FROM item;
INSERT INTO stage2 (id) SELECT min_id FROM
(SELECT MIN(id) min_id,name FROM stage GROUP BY name) A;
UPDATE stage2 A INNER JOIN stage B USING (id) SET A.name=B.name;
TRUNCATE TABLE item;
INSERT INTO item (name) SELECT name FROM stage2;
DROP TABLE stage;
DROP TABLE stage2;

This will load stage2 with the first occurrence of each name from item, zap the item table, and load the unique occurrences back.

If you look back in @yercube's answer and compare it to my answer, his is much more simplistic because

  • @yercube uses one temp table, while I use two
  • I had to create a column for iteration control, @yercube did not need to
  • @yercube has fewer steps
  • both answers achieve the same thing

I do not expect my answer to be accepted. The sole purpose of my answer was demonstrate that other answers lose the concise clarity needed to solve your problem. Again, hats off to @yercube.

answered Jul 24, 2012 at 14:49
1
  • Thanks for your answer. But i still don't want to use any third/temp table. I am just looking for an answer that actually help me to use the same query i have written in my question. Commented Jul 25, 2012 at 6:55

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.