6
\$\begingroup\$

Could someone review an accepted answered I gave on Stack Overflow?

The use-case is as follows:

Given a messaging system where a user can receive a message from a single user and send messages to one or more users, return the most recent communication (sent or received) between a passed userId and the individual(s) that user communicated with.

For the example, I have three tables:

Users

id user_name
1 Walker
2 John
3 Kate

Messages

id senderid body time
1 1 ignored 1 2010年04月01日 00:00:00.000
2 1 ignored 2 2010年04月02日 00:00:00.000
3 3 ignored 3 2010年04月03日 00:00:00.000
4 1 msg A to john and kate 2010年04月10日 00:00:00.000
5 3 msg b from kate to walker and john 2010年04月11日 00:00:00.000

messages_recipients

id messageid userid
1 1 2
2 1 3
3 2 2
4 3 1
5 4 2
6 4 3
7 5 1
8 5 2

The data is tailored in such a way that I want a list of communications between user Walker and the people Walker has spoken with.

You can see a list of these messages by running the following SQL statement:

SELECT 
 u2.user_name AS Sender, 
 u1.user_name AS Receiver, 
 m.body, 
 m.time
FROM 
 messages m
JOIN 
 messages_recipients mr ON m.id = mr.messageid
JOIN 
 users u1 ON mr.userid = u1.id
JOIN 
 users u2 ON m.senderid = u2.id
ORDER BY 
 time DESC

Now that we have the test scenario, the part I want reviewed: returning the most recently communicated message between Walker, John, and Kate.

BEGIN
 DECLARE @UserId INT = 1
 --A. Main Query
 SELECT
 CASE 
 WHEN mtemp.senderid = 1 --@UserId 
 THEN 
 CONCAT('Message To: ', receivers.user_name)
 ELSE 
 CONCAT('Message From: ' , senders.user_name)
 END AS MessageType, 
 mtemp.body, 
 mtemp.time 
 FROM 
 messages mtemp 
 INNER JOIN users senders ON 
 mtemp.senderid = senders.id 
 INNER JOIN 
 (
 --B. Inner Query determining most recent message (based on time) 
 -- between @UserID and the person @UserID 
 -- Communicated with (either as sender or receiver)
 select userid,max(maxtime) as maxmaxtime from
 (
 --C.1. First part of Union Query Aggregating sent/received messages on passed @UserId
 SELECT 
 m2.body,
 kk.* 
 FROM 
 `messages` m2 INNER JOIN
 (
 SELECT DISTINCT
 userid,
 MAX(m.time) AS MaxTime
 FROM
 messages m INNER JOIN
 messages_recipients mr ON m.id = mr.messageid AND
 m.senderid = 1 --@UserId
 GROUP BY
 mr.userid
 ) kk on m2.time = kk.MaxTime and m2.senderid = 1 --@UserId
 UNION
 --C.2. Second part of Union Query Aggregating sent/received messages on passed @UserId
 SELECT 
 m1.body,
 jj.* 
 FROM 
 `messages` m1 INNER JOIN
 ----C.2a. Inner most query of users users who sent message to userid
 (SELECT DISTINCT
 senderid as userid,
 MAX(m.time) AS MaxTime
 FROM
 messages m INNER JOIN
 messages_recipients mr ON m.id = mr.messageid AND
 mr.userid = 1 --@UserId
 GROUP BY
 m.senderid) jj on m1.time = jj.MaxTime and m1.senderid = jj.userid 
 ) MaximumUserTime
 group by 
 MaximumUserTime.userid 
 ) AggregatedData on mtemp.time = AggregatedData.maxmaxtime 
 INNER JOIN users receivers on AggregatedData.userid = receivers.id 
 ORDER BY `time` DESC
END

To test in phpMyAdmin, you'll have to remove the comments and the BEGIN/END DECLARE statements as well. I just wanted to post this as if it would look in a procedure.

When I run this query I get the following results:

MessageType body time
Message From: Kate msg b from kate to walker and john 2010年04月11日 00:00:00.000
Message To: John msg A to john and kate 2010年04月10日 00:00:00.000

That's the most recent communications concerning Walker among all those users who have communicated with Walker.

Is there a better way to run this query?

asked Apr 19, 2011 at 14:24
\$\endgroup\$
0

1 Answer 1

6
\$\begingroup\$

My solution has a similar complexity to yours (14 steps in EXPLAIN), assuming MySQL's query optimizer is smart enough. However, in my opinion, this formulation will be much easier to understand.

SELECT IF(recipientid,
 CONCAT('Message To: ', recipient.user_name),
 CONCAT('Message From: ', sender.user_name)) AS MessageType,
 body,
 time
 FROM
 ( -- Join messages with recipients, relabeling userids in terms of interlocutor and self
 SELECT messageid, time, body, NULL AS senderid, userid AS recipientid, userid AS interlocutor, senderid AS self
 FROM messages_recipients
 INNER JOIN messages
 ON messages.id = messageid
 UNION
 SELECT messages.id, time, body, senderid, NULL, senderid, userid
 FROM messages_recipients
 INNER JOIN messages
 ON messages.id = messageid
 ) AS thread_latest
 LEFT OUTER JOIN users AS recipient
 ON recipient.id = recipientid
 LEFT OUTER JOIN users AS sender
 ON sender.id = senderid
 WHERE
 -- Discard all but the latest message in each thread
 NOT EXISTS (
 SELECT messageid
 FROM
 (
 SELECT messageid, time, userid AS interlocutor, senderid AS self
 FROM messages_recipients
 INNER JOIN messages
 ON messages.id = messageid
 UNION
 SELECT messages.id, time, senderid, userid
 FROM messages_recipients
 INNER JOIN messages
 ON messages.id = messageid
 ) AS thread_later
 WHERE
 thread_later.self = thread_latest.self AND
 thread_later.interlocutor = thread_latest.interlocutor AND
 thread_later.time > thread_latest.time
 ) AND
 self = 1 --@UserId
 ORDER BY time DESC;

The main insight is that once you relabel senders and recipients in terms of interlocutor and self, it's just a simple matter of filtering out the results. Retain only those messages where self is the user in question. Then, every row that has the same interlocutor conceptually constitutes a thread.

Notice that there is a subquery that appears twice. We can make it clearer by creating a view.

CREATE VIEW threads AS
 -- Messages I sent
 SELECT messageid, time, body, NULL AS senderid, userid AS recipientid, userid AS interlocutor, senderid AS self
 FROM messages_recipients
 INNER JOIN messages
 ON messages.id = messageid
 UNION
 -- Messages I received
 SELECT messages.id, time, body, senderid, NULL, senderid, userid
 FROM messages_recipients
 INNER JOIN messages
 ON messages.id = messageid;
SELECT IF(recipientid,
 CONCAT('Message To: ', recipient.user_name),
 CONCAT('Message From: ', sender.user_name)) AS MessageType,
 body,
 time
 FROM
 threads AS thread_latest
 LEFT OUTER JOIN users AS recipient
 ON recipient.id = recipientid
 LEFT OUTER JOIN users AS sender
 ON sender.id = senderid
 WHERE
 NOT EXISTS (
 SELECT messageid
 FROM threads AS thread_later
 WHERE
 thread_later.self = thread_latest.self AND
 thread_later.interlocutor = thread_latest.interlocutor AND
 thread_later.time > thread_latest.time
 ) AND
 self = 1 --@UserId
 ORDER BY time DESC;

I'll take this opportunity to point out that this query is where PostgreSQL really shines. Two features in PostgreSQL (since version 8.4) make it easy. The WITH clause lets you define a helper view in the query itself. More importantly, window functions let you partition the threads by interlocutor, which is precisely the tricky part about this problem.

WITH threads(messageid, time, body, senderid, recipientid, interlocutor, self) AS (
 -- Messages I sent
 SELECT messageid, time, body, NULL, userid, userid, senderid
 FROM messages_recipients
 INNER JOIN messages
 ON messages.id = messageid
 UNION
 -- Messages I received
 SELECT messages.id, time, body, senderid, NULL, senderid, userid
 FROM messages_recipients
 INNER JOIN messages
 ON messages.id = messageid
)
SELECT CASE WHEN recipientid IS NOT NULL
 THEN 'Message To: ' || recipient.user_name
 ELSE 'Message From: ' || sender.user_name
 END AS MessageType,
 body,
 time
 FROM (
 SELECT *,
 RANK() OVER (PARTITION BY interlocutor ORDER BY time DESC) AS thread_pos
 FROM threads
 WHERE self = 1 --@UserId
 ) AS my_threads
 LEFT OUTER JOIN users AS recipient
 ON recipient.id = recipientid
 LEFT OUTER JOIN users AS sender
 ON sender.id = senderid
 WHERE thread_pos = 1 -- Only the latest message per thread
 ORDER BY time DESC;
answered Aug 16, 2013 at 10:12
\$\endgroup\$
1
  • \$\begingroup\$ Thanks for taking the time to give a detailed answer to this 2+ year old question. \$\endgroup\$ Commented Aug 16, 2013 at 15:32

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.