1

Problem summary

Today part of my app crashed, but nothing else stopped with it. I need to find the first and last tracked message ids for each channel during the downtime. I log unix timestamps alongside each message record, so I will be using those to determine which messages were not tracked during this period.

Let's say the downtime started at timestamp x and ended at timestamp y. I want to return the last tracked message in channel z before timestamp x, and the first tracked message after timestamp y. I hope that makes sense.

Here are my tables

mysql> select * from messages limit 5;
+---------+---------+-----------+
| message | channel | timestamp |
+---------+---------+-----------+
| 123 | 456 | 165789921 |
| 124 | 457 | 165789921 |
| 125 | 458 | 165789922 |
| 126 | 459 | 165789922 |
| 127 | 460 | 165789924 |
+---------+---------+-----------+
mysql> explain select * from messages;
+----+-------------+----------+------------+------+---------------+------+---------+------+--------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+----------+------------+------+---------------+------+---------+------+--------+----------+-------+
| 1 | SIMPLE | messages | NULL | ALL | NULL | NULL | NULL | NULL | 211250 | 100.00 | NULL |
+----+-------------+----------+------------+------+---------------+------+---------+------+--------+----------+-------+
CREATE TABLE messages (
 message VARCHAR(18) NOT NULL UNIQUE,
 `channel` VARCHAR(18) NOT NULL,
 `timestamp` TINYTEXT NOT NULL,
 ...,
 PRIMARY KEY (message),
 FOREIGN KEY (`channel`) REFERENCES channels (`channel`)
);
mysql> select * from channels limit 5;
+---------+-----+
| channel | ... |
+---------+-----+
| 456 | ... |
| 457 | ... |
| 458 | ... |
| 459 | ... |
| 460 | ... |
+---------+-----+
mysql> explain select * from channels;
+----+-------------+----------+------------+-------+---------------+-------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+----------+------------+-------+---------------+-------+---------+------+------+----------+-------------+
| 1 | SIMPLE | channels | NULL | index | NULL | guild | 74 | NULL | 1434 | 100.00 | Using index |
+----+-------------+----------+------------+-------+---------------+-------+---------+------+------+----------+-------------+
CREATE TABLE channels (
 `channel` VARCHAR(18) NOT NULL UNIQUE,
 ...,
 PRIMARY KEY(`channel`),
 FOREIGN KEY (guild) REFERENCES guilds (guild)
);

What I've tried

I've tried using a LEFT JOIN with WHERE clauses, but I'm no MySQL expert to say the least. There might be a way to do it this way, I just can't with my current knowledge. The issue I was facing with this approach is, I'm not sure how to access the original channel id for matching in the SELECT query.

DOES NOT WORK - Obviously, why else would I be here...

SET @min := 1657956885428;
SET @max := 1657966145675;
SELECT * FROM (SELECT channel FROM channels) as channels
LEFT JOIN (SELECT channel, message as downtime_start_after FROM messages WHERE timestamp < @min ORDER BY timestamp ASC LIMIT 1) AS after USING (`channel`)
LEFT JOIN (SELECT channel, message as downtime_end_before FROM messages WHERE timestamp > @max ORDER BY timestamp DESC LIMIT 1) AS before USING (`channel`);
+---------+----------+-----------+
| channel | ...after | ...before |
+---------+----------+-----------+
| 456 | NULL | NULL |
| 457 | 124 | NULL |
| 458 | NULL | 130 |
| 459 | NULL | NULL |
| 460 | NULL | NULL |
+---------+----------+-----------+

Ideal result

Both parameters after and before should be the message IDs which were tracked last and first respectively. I intend to use these to determine whether a message in a channel may have been within this period.

+---------+----------+-----------+
| channel | ...after | ...before |
+---------+----------+-----------+
| 456 | 123 | 128 |
| 457 | 124 | 129 |
| 458 | 125 | 130 |
| 459 | 126 | 131 |
| 460 | 127 | 132 |
+---------+----------+-----------+

of course, the data here is just filler

Thanks in advance. Lmk if I should include any additional information!

EDIT:

I spent some more time working on a query, and managed to find something which works. I'd be more than glad to hear any advice people have for optimisations.

SET @down := 1657956885428;
SELECT * FROM (SELECT channel FROM channels) as channels
LEFT JOIN (SELECT * FROM (SELECT channel, message as `after` FROM messages WHERE `timestamp` < @down ORDER BY `timestamp` DESC ) as after_messages GROUP BY (`channel`)) as `after` USING (channel)
LEFT JOIN (SELECT * FROM (SELECT channel, message as `before` FROM messages WHERE `timestamp` > @down ORDER BY `timestamp` ASC ) as before_messages GROUP BY (`channel`)) as `before` USING (channel);
+---------+-------+--------+
| channel | after | before |
+---------+-------+--------+
| 456 | 123 | 128 |
| 457 | 124 | 129 |
| 458 | 125 | 130 |
| 459 | 126 | 131 |
| 460 | 127 | 132 |
+---------+-------+--------+
asked Jul 17, 2022 at 0:09
2
  • Please get rid of the extraneous columns and shorten the numbers. Commented Jul 17, 2022 at 1:21
  • Thanks for the advice @RickJames. I've made those changes now. Commented Jul 17, 2022 at 23:22

1 Answer 1

1

the downtime started at timestamp x and ended at timestamp y. I want to return the last tracked message in channel z before timestamp x

Does this achieve that?

SELECT ...
 FROM t
 WHERE channel = z
 AND timestamp <= x
 ORDER BY timestamp DESC
 LIMIT 1

Change three things to get the first item after z, then put them together thus:

( SELECT ... x ... )
UNION ALL
( SELECT ... y ... )
ORDER BY timestamp

Timestamp looks like a timestamp, why use TINYTEXT? Consider BIGINT`. And add these indexes:

INDEX(timestamp)
INDEX(channel, timestamp)

will give you two rows.

More

A rewrite of the query you presented; this should run a lot faster:

[code]

SELECT channel, 
 (
 SELECT message
 FROM messages
 WHERE channel = c.channel
 AND timestamp < @down
 ORDER BY timestamp DESC
 LIMIT 1 
 ) AS 'after', 
 (
 SELECT message
 FROM messages
 WHERE channel = c.channel
 AND timestamp > @down
 ORDER BY timestamp ASC
 LIMIT 1 
 ) AS 'before',
 FROM channels AS c
 ORDER BY channel -- you may want this
 ;

And, again, messages needs INDEX(channel, timestamp).

If there is a message at exactly @down, it will be missed. Consider changing > to >= or < to <= (but not both).

answered Jul 17, 2022 at 1:21
5
  • That's simple enough. However it doesn't really solve the problem. There are over a thousand records in the channels table, and I don't intend to do manually input the channel id for each query. I would like a query which can use the id in each channel record to use in the WHERE clause of a looping?-SELECT query. I found a way to resolve it - please reference the edit - though I'm pretty sure it's quite inefficient. Commented Jul 17, 2022 at 14:02
  • @jayc331 - Thanks for the updates. You started with a LIMIT 1, but that seems to be gone. Do you want ONE message on either side? Or ALL messages on each side? Commented Jul 18, 2022 at 0:32
  • Just the two extreme values, so ONE either side. My edit shows the desired result if you wanna refer to that. Commented Jul 27, 2022 at 10:06
  • @jayc331 - I added a faster way to write it. Commented Jul 28, 2022 at 4:56
  • Fantastic, works like a charm. Thanks for the fast responses. Commented Jul 28, 2022 at 15:17

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.