0

I'm having trouble writing a query for the table below.

CREATE TABLE `c_example` (
 `id` int(11) NOT NULL AUTO_INCREMENT,
 `c_id` int(11) DEFAULT NULL,
 `guid` varchar(32) DEFAULT NULL,
 `provider_id` int(11) DEFAULT NULL,
 `test_timestamp` timestamp NULL DEFAULT CURRENT_TIMESTAMP,
 PRIMARY KEY (`id`),
) ENGINE=MyISAM AUTO_INCREMENT=5731 DEFAULT CHARSET=utf8;

In this table the data will look like

1,1,313f516ad35e477482aa435ae53c6b3a,1,2016年05月09日 22:53:14
2,1,313f516ad35e477482aa435ae53c6b3a,2,2016年05月09日 22:53:14
3,1,313f516ad35e477482aa435ae53c6b3a,3,2016年05月09日 22:53:14
4,1,f76d2c14a86a42378badd04e4051e569,1,2016年05月10日 22:53:14
5,1,f76d2c14a86a42378badd04e4051e569,2,2016年05月10日 22:53:14
6,1,f76d2c14a86a42378badd04e4051e569,3,2016年05月10日 22:53:14
7,2,c040cecfbff648b2a3ecbd9d0688efec,1,2016年05月11日 22:53:14
8,2,c040cecfbff648b2a3ecbd9d0688efec,2,2016年05月11日 22:53:14
9,2,c040cecfbff648b2a3ecbd9d0688efec,3,2016年05月11日 22:53:14
10,2,79c87ba31df14ec89f0da39e98b6a088,1,2016年05月12日 22:53:14
11,2,79c87ba31df14ec89f0da39e98b6a088,2,2016年05月12日 22:53:14
12,2,79c87ba31df14ec89f0da39e98b6a088,3,2016年05月12日 22:53:14

Now, I want to delete all the rows where the unique combination of c_id, guid that has a count> 1 and delete the ones with the oldest timestamp, so I'm left with.

4,1,f76d2c14a86a42378badd04e4051e569,1,2016年05月10日 22:53:14
5,1,f76d2c14a86a42378badd04e4051e569,2,2016年05月10日 22:53:14
6,1,f76d2c14a86a42378badd04e4051e569,3,2016年05月10日 22:53:14
10,2,79c87ba31df14ec89f0da39e98b6a088,1,2016年05月12日 22:53:14
11,2,79c87ba31df14ec89f0da39e98b6a088,2,2016年05月12日 22:53:14
12,2,79c87ba31df14ec89f0da39e98b6a088,3,2016年05月12日 22:53:14

I'm having trouble coming up with the delete statement.

asked Jul 1, 2016 at 21:55
3
  • Any chance of doing a quick mysqldump and providing us with INSERT statements? Commented Jul 1, 2016 at 22:08
  • You mean unique combination of (c_id, provider_id), not unique combination of (c_id, guid). Commented Jul 3, 2016 at 9:57
  • No unique combination of (c_id, guid) - I managed to do it in 3 separate statements, looping through the results of the previous statements. In keep thinking I should be able to do it in one statement. Commented Jul 3, 2016 at 17:27

1 Answer 1

1

To solve this problem I did the following.

Created the table as per the question above. I then populated it with this data (a bit extra added for testing purposes).

INSERT INTO c_example 
VALUES 
(1, 1, '313f516ad35e477482aa435ae53c6b3a', 1, '2016-05-09 21:53:14'),
(2, 1, '313f516ad35e477482aa435ae53c6b3a', 2, '2016-05-09 21:53:14'),
(3, 1, '313f516ad35e477482aa435ae53c6b3a', 3, '2016-05-09 21:53:14'),
(4, 1, 'f76d2c14a86a42378badd04e4051e569', 1, '2016-05-10 21:53:14'),
(5, 1, 'f76d2c14a86a42378badd04e4051e569', 2, '2016-05-10 21:53:14'),
(6, 1, 'f76d2c14a86a42378badd04e4051e569', 3, '2016-05-10 21:53:14'),
(7, 2, 'c040cecfbff648b2a3ecbd9d0688efec', 1, '2016-05-11 21:53:14'),
(8, 2, 'c040cecfbff648b2a3ecbd9d0688efec', 2, '2016-05-11 21:53:14'),
(9, 2, 'c040cecfbff648b2a3ecbd9d0688efec', 3, '2016-05-11 21:53:14'),
(10, 2, '79c87ba31df14ec89f0da39e98b6a088', 1, '2016-05-12 21:53:14'),
(11, 2, '79c87ba31df14ec89f0da39e98b6a088', 2, '2016-05-12 21:53:14'),
(12, 2, '79c87ba31df14ec89f0da39e98b6a088', 3, '2016-05-12 21:53:14'),
(13, 3, '79c87ba31df14ec89f0da39e98b6a088', 3, '2016-05-12 21:53:14'),
(14, 4, '79c87ba31df14ec89f0da39e98b6a088', 3, '2016-05-12 21:53:14');

First, we need to obtain the c_ids and test_timestamps of the records we are going to keep:

SELECT c_id, MAX(test_timestamp) AS mydate1 
FROM c_example t1 
GROUP BY t1.c_id
ORDER BY c_id, MAX(test_timestamp)

Result:

+------+---------------------+
| c_id | mydate1 |
+------+---------------------+
| 1 | 2016年05月10日 21:53:14 |
| 2 | 2016年05月12日 21:53:14 |
| 3 | 2016年05月12日 21:53:14 |
| 4 | 2016年05月12日 21:53:14 |
+------+---------------------+

Now, we use these to find the ids corresponding to records with with these c_ids and test_timestamps.

SELECT t1.id, t1.c_id, t1.guid, t1.test_timestamp FROM c_example t1
JOIN
(
 SELECT c_id, MAX(test_timestamp) AS mydate1 
 FROM c_example t1 
 GROUP BY t1.c_id
 ORDER BY c_id, MAX(test_timestamp)
) t2 ON t1.c_id = t2.c_id AND t1.test_timestamp = t2.mydate1

Result:

+----+------+----------------------------------+---------------------+
| id | c_id | guid | test_timestamp |
+----+------+----------------------------------+---------------------+
| 4 | 1 | f76d2c14a86a42378badd04e4051e569 | 2016年05月10日 21:53:14 |
| 5 | 1 | f76d2c14a86a42378badd04e4051e569 | 2016年05月10日 21:53:14 |
| 6 | 1 | f76d2c14a86a42378badd04e4051e569 | 2016年05月10日 21:53:14 |
| 10 | 2 | 79c87ba31df14ec89f0da39e98b6a088 | 2016年05月12日 21:53:14 |
| 11 | 2 | 79c87ba31df14ec89f0da39e98b6a088 | 2016年05月12日 21:53:14 |
| 12 | 2 | 79c87ba31df14ec89f0da39e98b6a088 | 2016年05月12日 21:53:14 |
| 13 | 3 | 79c87ba31df14ec89f0da39e98b6a088 | 2016年05月12日 21:53:14 |
| 14 | 4 | 79c87ba31df14ec89f0da39e98b6a088 | 2016年05月12日 21:53:14 |
+----+------+----------------------------------+---------------------+

These are the records we want to keep. You could just SELECT t1.id for only the id - I just selected the extra fields for clarity.

So, you might think that this query would work (it does in PostgreSQL)

DELETE FROM c_example
WHERE c_example.id NOT IN
(
 SELECT t1.id FROM c_example t1
 JOIN
 (
 SELECT c_id, MAX(test_timestamp) AS mydate1 
 FROM c_example t1 
 GROUP BY t1.c_id
 ORDER BY c_id, MAX(test_timestamp)
 ) t2 ON t1.c_id = t2.c_id AND t1.test_timestamp = t2.mydate1
);

But, due to the delightful (ahem...) quirks of MySQL's brand of SQL, instead of working, one is presented with this error message:

ERROR 1093 (HY000): You can't specify target table 'c_example' for update in FROM clause

As it was put so aptly here "Sometimes I wonder what drugs the MySQL devs are on..." (found while searching for a solution - also to be found in that thread).

So, you have to add a further level of nesting to the query and finally, a query which works:

DELETE FROM c_example 
WHERE c_example.id NOT IN
(
 SELECT t3.id FROM
 (
 SELECT t1.id, t1.c_id, t1.guid, t1.test_timestamp FROM c_example t1
 JOIN
 (
 SELECT c_id, MAX(test_timestamp) AS mydate1 
 FROM c_example t1 
 GROUP BY t1.c_id
 ORDER BY c_id, MAX(test_timestamp)
 ) t2 ON t1.c_id = t2.c_id AND t1.test_timestamp = t2.mydate1
 )AS t3
);

--

Result:

Query OK, 6 rows affected (0.66 sec)

And run the SELECT to verify.

mysql> SELECT * FROM c_example;
+----+------+----------------------------------+-------------+---------------------+
| id | c_id | guid | provider_id | test_timestamp |
+----+------+----------------------------------+-------------+---------------------+
| 4 | 1 | f76d2c14a86a42378badd04e4051e569 | 1 | 2016年05月10日 21:53:14 |
| 5 | 1 | f76d2c14a86a42378badd04e4051e569 | 2 | 2016年05月10日 21:53:14 |
| 6 | 1 | f76d2c14a86a42378badd04e4051e569 | 3 | 2016年05月10日 21:53:14 |
| 10 | 2 | 79c87ba31df14ec89f0da39e98b6a088 | 1 | 2016年05月12日 21:53:14 |
| 11 | 2 | 79c87ba31df14ec89f0da39e98b6a088 | 2 | 2016年05月12日 21:53:14 |
| 12 | 2 | 79c87ba31df14ec89f0da39e98b6a088 | 3 | 2016年05月12日 21:53:14 |
| 13 | 3 | 79c87ba31df14ec89f0da39e98b6a088 | 3 | 2016年05月12日 21:53:14 |
| 14 | 4 | 79c87ba31df14ec89f0da39e98b6a088 | 3 | 2016年05月12日 21:53:14 |
+----+------+----------------------------------+-------------+---------------------+
8 rows in set (0.00 sec)

Which should be the right answer - if this doesn't solve your issue, please clarify the question.

answered Jul 1, 2016 at 22:48

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.