0

We have the following query which returns the most recent fares from our database :

SELECT f1.id 
 FROM ( SELECT * FROM fares ) AS f1
INNER JOIN ( SELECT origin_id
 , destination_id
 , market_id
 , cabin
 , tripType
 , max(created_at) AS maxDate 
 FROM fares
 GROUP BY origin_id
 , destination_id
 , market_id
 , cabin
 , tripType
 ) AS f2 ON f2.origin_id = f1.origin_id 
 AND f2.destination_id = f1.destination_id 
 AND f2.market_id = f1.market_id 
 AND f2.cabin = f1.cabin 
 AND f2.tripType = f1.tripType
 WHERE f1.created_at = f2.maxDate
 GROUP BY f1.origin_id
 , f1.destination_id
 , f1.market_id
 , f1.cabin
 , f1.tripType

In our test dataset (SQL Fiddle) this returns two rows. When we change the query to work as a delete query it deletes everything but three rows, while it should delete everything but the two rows:

DELETE FROM fares 
 WHERE id NOT IN 
 ( SELECT f1.id 
 FROM ( SELECT * FROM fares ) AS f1
 INNER JOIN ( SELECT origin_id
 , destination_id
 , market_id
 , cabin
 , tripType
 , max(created_at) AS maxDate 
 FROM fares 
 GROUP BY origin_id
 , destination_id
 , market_id
 , cabin
 , tripType
 ) AS f2 ON f2.origin_id = f1.origin_id 
 AND f2.destination_id = f1.destination_id 
 AND f2.market_id = f1.market_id 
 AND f2.cabin = f1.cabin 
 AND f2.tripType = f1.tripType
 WHERE f1.created_at = f2.maxDate
 GROUP BY f1.origin_id
 , f1.destination_id 
 , f1.market_id
 , f1.cabin
 , f1.tripType
 )

We noticed that the three rows that remain are also the product of the select query when the GROUP BY part is ommited.

Now the question is why does the SELECT query in the DELETE statement behave different than the SELECT statement by itself?

Reproduction:

  • Take the schema from this SQLFiddle: http://sqlfiddle.com/#!9/e943ee/2
  • Run the query supplied in the SQLFiddle, only two rows are returned
  • Add the delete query to the schema & rebuild the schema
  • Execute "SELECT * FROM fares" and three rows are returned (where there should have been only two)
Kondybas
4,81020 silver badges16 bronze badges
asked Jun 22, 2020 at 8:22

1 Answer 1

1

If you want to purge an outdated data then your query is a bit complicated:

DELETE FROM fares AS f1 
 LEFT JOIN ( SELECT origin_id
 , destination_id
 , market_id
 , cabin
 , tripType
 , max(created_at) AS maxDate 
 FROM fares 
 GROUP BY origin_id
 , destination_id
 , market_id
 , cabin
 , tripType
 ) AS f2 ON f2.origin_id = f1.origin_id 
 AND f2.destination_id = f1.destination_id 
 AND f2.market_id = f1.market_id 
 AND f2.cabin = f1.cabin 
 AND f2.tripType = f1.tripType
 AND f2.maxDate = f1.created_at /* was omitted */
 WHERE f2.maxDate IS NULL
)
answered Jun 22, 2020 at 8:55

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.