Using MySQL 5.7, I am trying to create a summary of events given a list of events.
The event table looks like this:
mysql> select camtimestamp,camname from events where camtimestamp >= "2019-12-31" and camtimestamp < "2020-01-02" limit 10;
+---------------------+---------+
| camtimestamp | camname |
+---------------------+---------+
| 2019年12月31日 00:47:04 | wls1 |
| 2019年12月31日 04:57:41 | wls1 |
| 2019年12月31日 05:15:22 | wls1 |
| 2019年12月31日 05:35:00 | wls1 |
| 2019年12月31日 05:46:56 | wls1 |
| 2019年12月31日 05:51:30 | wls1 |
| 2019年12月31日 05:54:19 | wls1 |
| 2019年12月31日 05:56:14 | wls1 |
| 2019年12月31日 05:58:45 | wls1 |
| 2019年12月31日 05:58:46 | wls1 |
+---------------------+---------+
10 rows in set (0.00 sec)
I want the summary to look like this except I want the camtimestamp in the first column to increment by the hour (or what ever increments I choose) and the count to be a summary of all events between the first row and the second.
mysql> select camtimestamp,camname,count(camname) from events where camtimestamp >= "2019-12-31" and camtimestamp < "2020-01-02" group by camtimestamp,camname limit 10;
+---------------------+---------+----------------+
| camtimestamp | camname | count(camname) |
+---------------------+---------+----------------+
| 2019年12月31日 00:47:04 | wls1 | 1 |
| 2019年12月31日 04:57:41 | wls1 | 1 |
| 2019年12月31日 05:15:22 | wls1 | 1 |
| 2019年12月31日 05:35:00 | wls1 | 1 |
| 2019年12月31日 05:46:56 | wls1 | 1 |
| 2019年12月31日 05:51:30 | wls1 | 1 |
| 2019年12月31日 05:54:19 | wls1 | 1 |
| 2019年12月31日 05:56:14 | wls1 | 1 |
| 2019年12月31日 05:58:45 | wls1 | 1 |
| 2019年12月31日 05:58:46 | wls1 | 1 |
+---------------------+---------+----------------+
10 rows in set (0.00 sec)
What I am really going for is this (as a sample) result:
Date wls1 wls2
2020年01月05日 12:00:00 5 7
2020年01月05日 13:00:00 10 3
2020年01月05日 14:00:00 2 0
2020年01月05日 15:00:00 0 0
2020年01月05日 16:00:00 0 2
2020年01月05日 17:00:00 0 4
2020年01月05日 18:00:00 1 23
2020年01月05日 19:00:00 2 3
2020年01月05日 20:00:00 3 4
2020年01月05日 21:00:00 4 5
I have dabbled with creating a table that contains only timestamps from now until 2040 in hourly increments and using a join in my SELECT statement, but I can't seem to get that join between the two tables quite right.
EDIT
The answer below gets most of the way to answering my question. It certainly shows me that I have a ways to go to learn about mySQL and general db work.
After slightly modifying the SELECT statement it is as follows:
SELECT
startpit,
SUM(IF(camname='wls1', 1,0)) as wls1,
SUM(IF(camname='wls2', 1,0)) as wls2
FROM
(
SELECT
FROM_UNIXTIME(FLOOR(UNIX_TIMESTAMP(camtimestamp)/@interval) * @interval
) AS startpit,
camname,
count(*) as num
FROM events,
(
SELECT @interval := 3600) AS init
GROUP BY startpit, camname
) AS baseview
GROUP BY startpit;
It produces this output:
+---------------------+------+------+
| startpit | wls1 | wls2 |
+---------------------+------+------+
| 2019年05月01日 21:00:00 | 0 | 1 |
| 2019年05月01日 22:00:00 | 0 | 1 |
| 2019年05月02日 06:00:00 | 0 | 1 |
| 2019年05月02日 07:00:00 | 1 | 1 |
| 2019年05月02日 08:00:00 | 1 | 1 |
| 2019年05月02日 09:00:00 | 1 | 1 |
| 2019年05月02日 10:00:00 | 1 | 1 |
| 2019年05月02日 11:00:00 | 1 | 1 |
| 2019年05月02日 12:00:00 | 1 | 1 |
| 2019年05月02日 13:00:00 | 1 | 1 |
+---------------------+------+------+
I can see how to get these results and am warned about the only_full_group_by setting in MySQL 5.7.
There are still time gaps in the output where I would prefer to have zeros in both camname columns. I can see that my PHP will have to dynamically create the SQL select statements and dynamically process the results/output.
Thank you for taking the time to answer my question!
-
In my last sample query result at the end, I show that between 12:00 and 13:00, there are 5 wls1 events and 7 wls2 events. That's what I meant. Sorry for the confusion.user3055756– user30557562020年01月12日 02:47:25 +00:00Commented Jan 12, 2020 at 2:47
1 Answer 1
2 Tricks needed:
- UNIX_TIMESTAMP() allows application of integer division to datetimes
- Pivoting isn't part of SQL proper, but
group_concat
can do something similar.
Putting it together:
select
startpit,
FROM_UNIXTIME(UNIX_TIMESTAMP(startpit)+@interval) as endpit,
group_concat(concat(camname,': ',num)) as events
from (
select
from_unixtime(
floor(
unix_timestamp(camtimestamp)/@interval
)*@interval
) as startpit,
camname,
count(*) as num
from events,
(select @interval:=3600) as init
-- where
group by floor(unix_timestamp(camtimestamp)/@interval)*@interval, camname
) as baseview
group by startpit;
See this SQLfiddle (I replaced changed every second rows from 'wsl1' to 'wsl2' since your test data has only one cam.
EDIT
MySQL 8.0 defaults to a very buggy implementation of strict grouping. It will e.g. think, that FROM_UNIXTIME(FLOOR(UNIX_TIMESTAMP(camtimestamp)/@interval) * @interval
is (quote) "not functionally dependent" on FLOOR(UNIX_TIMESTAMP(camtimestamp)/@interval) * @interval
. This can easily be resolved by using the name of the calcualted column (startpit
). So
SELECT
startpit,
FROM_UNIXTIME(UNIX_TIMESTAMP(startpit)+@interval) AS endpit,
GROUP_CONCAT(CONCAT(camname,': ',num)) AS the_events
FROM
(
SELECT
FROM_UNIXTIME(FLOOR(UNIX_TIMESTAMP(camtimestamp)/@interval) * @interval
) AS startpit,
camname,
count(*) as num
FROM event,
(
SELECT @interval := 3600) AS init
-- where
GROUP BY startpit, camname
) AS baseview
GROUP BY startpit;
works with MySQL 8.0, while the functionally identical original does not. This is not a flaw in the SQL, but in how MySQL 8.0 understands it. Unexperienced SQL programmers might consider this a good thing.
EDIT 2
There is no pivoting in SQL, i.e. the number of columns can not depend on the data so the format with a variable number of columns and an integer in each is not reachable via pure SQL.
Assuming, that the SQL is generated by another layer higher in the stack you can reach it by creating SQL along the lines of
SELECT
startpit,
FROM_UNIXTIME(UNIX_TIMESTAMP(startpit)+@interval) AS endpit,
SUM(IF(camname='wls1', 1,0)) as wls1,
SUM(IF(camname='wls2', 1,0)) as wls2,
SUM(IF(camname='wls3', 1,0)) as wls3
-- and so on
FROM
(
SELECT
FROM_UNIXTIME(FLOOR(UNIX_TIMESTAMP(camtimestamp)/@interval) * @interval
) AS startpit,
camname,
count(*) as num
FROM event,
(
SELECT @interval := 3600) AS init
-- where
GROUP BY startpit, camname
) AS baseview
GROUP BY startpit;
You can even put creating this SQL as a string into a stored procedure and then use PREPARE STATEMENT
and EXECUTE
to run it. Be advised, that tis creates a stored procedure where the number of output columns depends on the data, which is unusual at least.
-
OK. I see that it is counting events between startpit and endpit. It calculates endpit based on startpit. How would I eliminate gaps in this approach? For example, the sample data has a gap of a couple of hours between 00:47 and 04:57? For the 01:00, 02:00, 03:00 startpits, I would like to have an event count of 0. I am planning to take the output of this query to create a graph with google's API.user3055756– user30557562020年01月12日 06:10:17 +00:00Commented Jan 12, 2020 at 6:10
-
Also I see other applications for the group_concat() function, but becaause I need am integer in that column for graphing purposes, the mixed data in the field won't suit my purposes.user3055756– user30557562020年01月12日 06:19:03 +00:00Commented Jan 12, 2020 at 6:19
-
The SQLfiddle website is cool!user3055756– user30557562020年01月12日 06:19:29 +00:00Commented Jan 12, 2020 at 6:19
-
@user3055756 unfortunately, this SQL is flawed. It might provide answers, but they won't always be the correct ones! If you run it against MySQL version 8, it fails with the error
... this is incompatible with sql_mode=only_full_group_by
.ONLY_FULL_GROUP_BY
is now the default (as it should be) for MySQL and not using it risks returning (ahem...) inconsistent results (i.e. the wrong results). What version of SQL are you running - very important for MySQL questions - it has introduced many important new features.Vérace– Vérace2020年01月12日 07:12:16 +00:00Commented Jan 12, 2020 at 7:12 -
p.s. I would recommend that you use dbfiddle.uk for your SQL in future - it's actively maintained, has way more servers and is the fiddle of choice here on DBA.SE!Vérace– Vérace2020年01月12日 07:16:46 +00:00Commented Jan 12, 2020 at 7:16