5
\$\begingroup\$

The purpose of this SQL query would be to extract the record 10192 only, because the end_time is both in the past and its foreign counterpart in the Node table still has the status 1. I don't want to select the record 10111 because it hasn't expired.

I've just finished writing this SQL query, it works! Although I'm sure it can be written better and I'm interested to here what you have to say.

SELECT MAX(l.id) AS highestID, l.nid, n.status
 FROM mbl AS l
 INNER JOIN node AS n
 ON l.nid = n.nid
 WHERE n.status = 1
 AND ( SELECT end_time 
 FROM mbl
 WHERE id = ( SELECT MAX(id) 
 FROM mbl
 WHERE nid = l.nid
 GROUP BY nid
 )
 ) <= NOW()
 GROUP BY l.nid

Mlb table data:

id | nid | start_time | end_time
105 | 5 | 20/07/2014 01:00:00 | 24/07/2014 01:00:00
589 | 52 | 20/07/2014 01:00:00 | 31/07/2015 01:32:00
10108 | 874 | 20/07/2014 01:00:00 | 24/07/2014 01:00:00
10111 | 874 | 20/07/2014 01:00:00 | 24/07/2015 01:00:00 
10156 | 967 | 20/07/2012 01:00:00 | 24/07/2013 01:00:00
10161 | 967 | 20/07/2013 01:00:00 | 20/07/2014 01:00:00
10192 | 967 | 20/07/2014 01:00:00 | 24/07/2014 01:00:00

Node table data:

nid | status
5 | 0
52 | 1
874 | 1
967 | 1
Jamal
35.2k13 gold badges134 silver badges238 bronze badges
asked Jul 31, 2014 at 10:43
\$\endgroup\$
3
  • \$\begingroup\$ Welcome to Code Review! \$\endgroup\$ Commented Jul 31, 2014 at 11:53
  • \$\begingroup\$ And the results you want are? What does this actually tell you? What is id referencing? Remember, auto-generated ids are essentially a shortcut/convenience; you're almost always going to have some collection of data that functions as a unique value anyways (...which might be temporal or otherwise updateable, so they're really handy, but in most cases are essentially derived connectors). Are you trying to get the most recently inserted row (id tricks are hacky for this)? The most recently completed row (max end_time in the past)? The most recently started row? Etc... \$\endgroup\$ Commented Aug 1, 2014 at 2:45
  • \$\begingroup\$ I want the most recent row where an end_time is in the past and the status is 1 in the Node table. \$\endgroup\$ Commented Aug 1, 2014 at 13:48

3 Answers 3

6
\$\begingroup\$

First, let's go over some problems in your original query:

SELECT MAX(l.id) AS highestID, l.nid, n.status
..............
GROUP BY l.nid

... if there's more than one n.status for a given l.nid, you get a "random" row, decided by the optimizer (and not controllable by you). I'm presuming that you only have one row here (we'll exploit this in a bit).

(SELECT end_time 
 FROM mbl
 WHERE id = (SELECT MAX(id) 
 FROM mbl
 WHERE nid = l.nid
 GROUP BY nid)) <= NOW()

I'm presuming that id is still some sort of autogenerated value. The problem is, outside of some very specific contexts, the only thing a db-generated id should be used for is its uniqueness. Especially in situations where rows may be updated, max/min/ordering by ids rapidly becomes irrelevant. You seem to be attempting to make sure the most-recently-added row has an end_time value in the past - if you have some sort of inserted_at or similar timestamp, that would be a better column to use. It would also be helpful to know why you're looking for the greatest id in this case. (However, as I don't know the other columns in you table, I'm stuck with it for the moment).

It turns out that your statement can be reduced to this:

SELECT Most_Recent.highestId, Most_Recent.nid, Node.status
FROM (SELECT nid, MAX(id) AS highestId
 FROM Mbl
 GROUP BY nid) Most_Recent
JOIN Node
 ON Node.nid = Most_Recent.nid
JOIN Mbl
 ON Mbl.nid = Most_Recent.nid
 AND Mbl.id = Most_Recent.highestId
 AND Mbl.end_time <= NOW()

.... which not only eliminates a table reference, but your original version had issues with potential RBAR (row-by-agonizing-row)/Cartesian product plans. Note that we're still required to join to Mbl twice, with the second time being solely to make sure that the most-recent row also has an end_time value in the past (this can't be done in the subquery, because that would collect the most-recent-id of rows with end_time in the past)

answered Jul 31, 2014 at 12:50
\$\endgroup\$
3
  • \$\begingroup\$ The Mbl table has the columns id, nid, start_time and end_time. The Mbl table may have multiple rows with the same nid but they may have expired (the end_time is in the past). I'm joining the Node table to access the status column because I'm only interested in the rows with a status of 1. I don't have an inserted_at column in any of my tables. \$\endgroup\$ Commented Jul 31, 2014 at 13:51
  • 2
    \$\begingroup\$ @thebubbles - then perhaps we should get some sample starting data and desired results? I have a feeling your query isn't matching your expectations, or what you really want/need. For instance, you could add AND Node.status = 1 as a condition to the join... \$\endgroup\$ Commented Jul 31, 2014 at 14:00
  • \$\begingroup\$ I've added more detail to my question. \$\endgroup\$ Commented Jul 31, 2014 at 15:19
4
\$\begingroup\$

First thing's first. The indentation is all jacked up. There's not really a standard for formatting SQL, but as a rule of thumb, Select, From, and Where for any (sub)query should be at the same indentation level.

SELECT MAX(l.id) AS highestID, l.nid, n.status
FROM mbl AS l
INNER JOIN node AS n
 ON l.nid = n.nid
WHERE n.status = 1
 AND (
 SELECT end_time 
 FROM mbl
 WHERE id = (
 SELECT MAX(id) 
 FROM mbl
 WHERE nid = l.nid
 GROUP BY nid
 )
 ) <= NOW()
GROUP BY l.nid
answered Jul 31, 2014 at 11:52
\$\endgroup\$
3
  • 1
    \$\begingroup\$ Thanks, I just copied and pasted it from my code. It's cleaned up as per your answer! \$\endgroup\$ Commented Jul 31, 2014 at 15:21
  • \$\begingroup\$ And that's exactly the right thing to do here @thebubbles. We want to see your real code. We make bad code better and good code great. =) \$\endgroup\$ Commented Jul 31, 2014 at 15:28
  • \$\begingroup\$ FROM, WHERE, and GROUP BY should definitely be at the same indentation level. Personally, I would choose to indent INNER JOIN relative to that, and outdent SELECT. \$\endgroup\$ Commented Jul 31, 2014 at 19:16
2
\$\begingroup\$

Aliases

This may seem like nitpicking, but since you already got two good answers, I figured I would point out this only.

FROM mbl AS l
INNER JOIN node AS n

l and n mean nothing of themselves. Think of Mr. Maintainer looking back at your script, scratching his head. It may not seem like a big deal in this small of a query, but I have seen this sort of notation used in large scripts and it becomes very confusing. With these table names being only 3-4 letters long, I don't think an alias is even relevant.

answered Jul 31, 2014 at 15:48
\$\endgroup\$
1
  • 1
    \$\begingroup\$ I completely agree with you. I've altered a few things in the SQL to preserve privacy of the project it's being used in. The original naming conventions may allow someone to match it to the project it's being used on. \$\endgroup\$ Commented Jul 31, 2014 at 15:55

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.