Problem summary
Today part of my app crashed, but nothing else stopped with it. I need to find the first and last tracked message ids for each channel during the downtime. I log unix timestamps alongside each message record, so I will be using those to determine which messages were not tracked during this period.
Let's say the downtime started at timestamp x
and ended at timestamp y
. I want to return the last tracked message in channel z
before timestamp x
, and the first tracked message after timestamp y
. I hope that makes sense.
Here are my tables
mysql> select * from messages limit 5;
+---------+---------+-----------+
| message | channel | timestamp |
+---------+---------+-----------+
| 123 | 456 | 165789921 |
| 124 | 457 | 165789921 |
| 125 | 458 | 165789922 |
| 126 | 459 | 165789922 |
| 127 | 460 | 165789924 |
+---------+---------+-----------+
mysql> explain select * from messages;
+----+-------------+----------+------------+------+---------------+------+---------+------+--------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+----------+------------+------+---------------+------+---------+------+--------+----------+-------+
| 1 | SIMPLE | messages | NULL | ALL | NULL | NULL | NULL | NULL | 211250 | 100.00 | NULL |
+----+-------------+----------+------------+------+---------------+------+---------+------+--------+----------+-------+
CREATE TABLE messages (
message VARCHAR(18) NOT NULL UNIQUE,
`channel` VARCHAR(18) NOT NULL,
`timestamp` TINYTEXT NOT NULL,
...,
PRIMARY KEY (message),
FOREIGN KEY (`channel`) REFERENCES channels (`channel`)
);
mysql> select * from channels limit 5;
+---------+-----+
| channel | ... |
+---------+-----+
| 456 | ... |
| 457 | ... |
| 458 | ... |
| 459 | ... |
| 460 | ... |
+---------+-----+
mysql> explain select * from channels;
+----+-------------+----------+------------+-------+---------------+-------+---------+------+------+----------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+----------+------------+-------+---------------+-------+---------+------+------+----------+-------------+
| 1 | SIMPLE | channels | NULL | index | NULL | guild | 74 | NULL | 1434 | 100.00 | Using index |
+----+-------------+----------+------------+-------+---------------+-------+---------+------+------+----------+-------------+
CREATE TABLE channels (
`channel` VARCHAR(18) NOT NULL UNIQUE,
...,
PRIMARY KEY(`channel`),
FOREIGN KEY (guild) REFERENCES guilds (guild)
);
What I've tried
I've tried using a LEFT JOIN with WHERE clauses, but I'm no MySQL expert to say the least. There might be a way to do it this way, I just can't with my current knowledge. The issue I was facing with this approach is, I'm not sure how to access the original channel id for matching in the SELECT query.
DOES NOT WORK - Obviously, why else would I be here...
SET @min := 1657956885428;
SET @max := 1657966145675;
SELECT * FROM (SELECT channel FROM channels) as channels
LEFT JOIN (SELECT channel, message as downtime_start_after FROM messages WHERE timestamp < @min ORDER BY timestamp ASC LIMIT 1) AS after USING (`channel`)
LEFT JOIN (SELECT channel, message as downtime_end_before FROM messages WHERE timestamp > @max ORDER BY timestamp DESC LIMIT 1) AS before USING (`channel`);
+---------+----------+-----------+
| channel | ...after | ...before |
+---------+----------+-----------+
| 456 | NULL | NULL |
| 457 | 124 | NULL |
| 458 | NULL | 130 |
| 459 | NULL | NULL |
| 460 | NULL | NULL |
+---------+----------+-----------+
Ideal result
Both parameters after
and before
should be the message IDs which were tracked last and first respectively. I intend to use these to determine whether a message in a channel may have been within this period.
+---------+----------+-----------+
| channel | ...after | ...before |
+---------+----------+-----------+
| 456 | 123 | 128 |
| 457 | 124 | 129 |
| 458 | 125 | 130 |
| 459 | 126 | 131 |
| 460 | 127 | 132 |
+---------+----------+-----------+
of course, the data here is just filler
Thanks in advance. Lmk if I should include any additional information!
EDIT:
I spent some more time working on a query, and managed to find something which works. I'd be more than glad to hear any advice people have for optimisations.
SET @down := 1657956885428;
SELECT * FROM (SELECT channel FROM channels) as channels
LEFT JOIN (SELECT * FROM (SELECT channel, message as `after` FROM messages WHERE `timestamp` < @down ORDER BY `timestamp` DESC ) as after_messages GROUP BY (`channel`)) as `after` USING (channel)
LEFT JOIN (SELECT * FROM (SELECT channel, message as `before` FROM messages WHERE `timestamp` > @down ORDER BY `timestamp` ASC ) as before_messages GROUP BY (`channel`)) as `before` USING (channel);
+---------+-------+--------+
| channel | after | before |
+---------+-------+--------+
| 456 | 123 | 128 |
| 457 | 124 | 129 |
| 458 | 125 | 130 |
| 459 | 126 | 131 |
| 460 | 127 | 132 |
+---------+-------+--------+
-
Please get rid of the extraneous columns and shorten the numbers.Rick James– Rick James2022年07月17日 01:21:39 +00:00Commented Jul 17, 2022 at 1:21
-
Thanks for the advice @RickJames. I've made those changes now.jayc331– jayc3312022年07月17日 23:22:13 +00:00Commented Jul 17, 2022 at 23:22
1 Answer 1
the downtime started at timestamp x and ended at timestamp y. I want to return the last tracked message in channel z before timestamp x
Does this achieve that?
SELECT ...
FROM t
WHERE channel = z
AND timestamp <= x
ORDER BY timestamp DESC
LIMIT 1
Change three things to get the first item after z, then put them together thus:
( SELECT ... x ... )
UNION ALL
( SELECT ... y ... )
ORDER BY timestamp
Timestamp looks like a timestamp, why use TINYTEXT? Consider
BIGINT`. And add these indexes:
INDEX(timestamp)
INDEX(channel, timestamp)
will give you two rows.
More
A rewrite of the query you presented; this should run a lot faster:
[code]
SELECT channel,
(
SELECT message
FROM messages
WHERE channel = c.channel
AND timestamp < @down
ORDER BY timestamp DESC
LIMIT 1
) AS 'after',
(
SELECT message
FROM messages
WHERE channel = c.channel
AND timestamp > @down
ORDER BY timestamp ASC
LIMIT 1
) AS 'before',
FROM channels AS c
ORDER BY channel -- you may want this
;
And, again, messages
needs INDEX(channel, timestamp)
.
If there is a message at exactly @down, it will be missed. Consider changing > to >= or < to <= (but not both).
-
That's simple enough. However it doesn't really solve the problem. There are over a thousand records in the channels table, and I don't intend to do manually input the channel id for each query. I would like a query which can use the id in each channel record to use in the WHERE clause of a looping?-SELECT query. I found a way to resolve it - please reference the edit - though I'm pretty sure it's quite inefficient.jayc331– jayc3312022年07月17日 14:02:18 +00:00Commented Jul 17, 2022 at 14:02
-
@jayc331 - Thanks for the updates. You started with a LIMIT 1, but that seems to be gone. Do you want ONE message on either side? Or ALL messages on each side?Rick James– Rick James2022年07月18日 00:32:53 +00:00Commented Jul 18, 2022 at 0:32
-
Just the two extreme values, so ONE either side. My edit shows the desired result if you wanna refer to that.jayc331– jayc3312022年07月27日 10:06:57 +00:00Commented Jul 27, 2022 at 10:06
-
@jayc331 - I added a faster way to write it.Rick James– Rick James2022年07月28日 04:56:58 +00:00Commented Jul 28, 2022 at 4:56
-
Fantastic, works like a charm. Thanks for the fast responses.jayc331– jayc3312022年07月28日 15:17:05 +00:00Commented Jul 28, 2022 at 15:17
Explore related questions
See similar questions with these tags.