I have a query which returns values averaged by site and month. What I want to know is the min and max of those monthly averages by site and (the difficult part) the month when each occurred.
Here is an example:
CREATE TABLE events (
esite integer NOT NULL,
edate timestamp with time zone NOT NULL,
evalue integer NOT NULL
);
INSERT INTO events Values
(1, '2016-01-03', 11),
(2, '2016-01-05', 90),
(1, '2016-01-08', 7),
(2, '2016-01-10', 40),
(1, '2016-01-15', 12),
(1, '2016-01-18', 66),
(2, '2016-01-22', 54),
(2, '2016-02-03', 70),
(2, '2016-02-05', 56),
(1, '2016-02-08', 61),
(2, '2016-02-10', 23),
(1, '2016-02-15', 30),
(1, '2016-02-18', 15),
(1, '2016-02-22', 41);
I'm looking for a query that returns, per site, the min and max monthly average evalue and the months in which the min and max occurred. I can get the monthly averages with the query below:
SELECT esite, date_trunc('month', edate) AS emonth, round(avg(evalue), 2) AS evalue_avg
FROM events
GROUP BY esite, emonth;
This produces the following output:
esite | emonth | evalue_avg
2 | January, 01 2016 00:00:00 | 61.33
1 | January, 01 2016 00:00:00 | 24
1 | February, 01 2016 00:00:00 | 36.75
2 | February, 01 2016 00:00:00 | 49.67
Now for the part I'm having difficulty with: I need to produce the following result, which is basically the min and max values and the date (month) at which each occurred, by site.
Desired Output:
-- (one row per site)
esite | avg_min | eval_avg_min_date | avg_max | eval_avg_max_date
1 | 24.00 | January, 01 2016 00:00:00 | 36.75 | February, 01 2016 00:00:00
2 | 49.67 | February, 01 2016 00:00:00 | 61.33 | January, 01 2016 00:00:00
I've searched around and have seen some examples using windowing and lateral joins, but I haven't been successful in getting any of them to work. This might be a pivot but since I don't have a fixed number of sites, this tends to be difficult with PostgreSQL.
I'm wondering if anyone has an easy way to do this using Postgres 9.4+.
- Something like "select min(avg_value), max(avg_value), month from _grp group by month"? Note: not tested. – Vérace, May 17, 2016 at 14:31
- @Verace. I'm looking for a little more than that and I've updated the question to hopefully communicate it better. :) – warchitect, May 17, 2016 at 16:00
2 Answers
The question is old, but the CommunityBot keeps bumping it. Let's add a proper answer.
Move your avg calculation into a CTE, and use the results in two joined DISTINCT ON queries:
WITH cte AS (
SELECT esite, date_trunc('month', edate) AS emonth, round(avg(evalue), 2) AS evalue_avg
FROM events
GROUP BY esite, emonth
)
SELECT *
FROM (
SELECT DISTINCT ON (esite)
esite, evalue_avg, emonth
FROM cte
ORDER BY esite, evalue_avg, emonth
) a
JOIN (
SELECT DISTINCT ON (esite)
esite, evalue_avg, emonth
FROM cte
ORDER BY esite, evalue_avg DESC, emonth
) z USING (esite);
db<>fiddle here
(Would even work in the outdated Postgres 9.4.)
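As a minimal sketch of how the same two-DISTINCT ON approach could be shaped into the output asked for in the question, the variant below only adds explicit column aliases and a final ORDER BY. The alias names (avg_min, eval_avg_min_date, avg_max, eval_avg_max_date) are taken from the question's desired output, with the last one assumed to mean the month of the maximum.

```sql
-- Sketch: same CTE and DISTINCT ON subqueries as above, with named output columns.
WITH cte AS (
   SELECT esite, date_trunc('month', edate) AS emonth, round(avg(evalue), 2) AS evalue_avg
   FROM   events
   GROUP  BY esite, emonth
   )
SELECT esite
     , a.evalue_avg AS avg_min, a.emonth AS eval_avg_min_date
     , z.evalue_avg AS avg_max, z.emonth AS eval_avg_max_date
FROM  (
   -- lowest monthly average per site (first row in ascending order)
   SELECT DISTINCT ON (esite) esite, evalue_avg, emonth
   FROM   cte
   ORDER  BY esite, evalue_avg, emonth
   ) a
JOIN  (
   -- highest monthly average per site (first row in descending order)
   SELECT DISTINCT ON (esite) esite, evalue_avg, emonth
   FROM   cte
   ORDER  BY esite, evalue_avg DESC, emonth
   ) z USING (esite)
ORDER  BY esite;
```

With the sample data from the question this should give 24.00 (January) and 36.75 (February) for site 1, and 49.67 (February) and 61.33 (January) for site 2. Because emonth is the last ORDER BY item in both subqueries, a tie between several months with the same average resolves to the earliest of them.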
I would recommend you do something like this - and don't go down the route of having a row per month.
What happens when you have 5 (10...15... x) years' data?
This is superior IMHO. You can always use the CROSSTAB table function, but I would recommend against it.
SELECT
esite,
MIN(evalue),
MAX(evalue),
AVG(evalue)::INT,
EXTRACT(MONTH FROM edate) AS emonth,
CASE
WHEN EXTRACT(MONTH FROM edate) = 1 THEN 'January'
WHEN EXTRACT(MONTH FROM edate) = 2 THEN 'February'
-- <... fill in rest of months here ...>
END AS litMonth
FROM events
GROUP BY esite, emonth, litMonth
ORDER BY esite, emonth; -- , min, max, avg
This gives the result:
esite; min; max; avg; emonth; litmonth
----------------------------------------------------
1; 7; 66; 24; 1; January
1; 15; 61; 37; 2; February
2; 40; 90; 61; 1; January
2; 23; 70; 50; 2; February
Note that I have CAST the AVG to an INTEGER; you can of course ROUND it to whatever precision you like. You can add EXTRACT(YEAR FROM edate) AS eyear, after the EXTRACT(MONTH... for more clarity and a more elegant result.
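A rough sketch of what that suggestion might look like, assuming the events table from the question. It adds the year column and rounds the average to two decimals instead of casting it. The aliases min_value, max_value and avg_value are illustrative names, and the litMonth column is left out here for brevity.

```sql
-- Sketch: per-site statistics grouped by year and month.
SELECT
    esite,
    EXTRACT(YEAR FROM edate)  AS eyear,
    EXTRACT(MONTH FROM edate) AS emonth,
    MIN(evalue)               AS min_value,
    MAX(evalue)               AS max_value,
    ROUND(AVG(evalue), 2)     AS avg_value  -- rounded rather than cast to INT
FROM events
GROUP BY esite, eyear, emonth
ORDER BY esite, eyear, emonth;
```

Grouping by both eyear and emonth keeps, for example, January 2016 and January 2017 in separate rows once the table holds more than one year of data.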
- Thanks for your effort, but it's still not quite what I'm looking for. I'm looking for the min and max of the monthly averages (which are already calculated by the first query) and then getting the date when those min and max values occurred (still within the aggregate result). So my issue is how to get the first query into my desired output, and not necessarily with the date formatting. – warchitect, May 17, 2016 at 18:38
- Suppose two/three/..n minima/maxima are the same for a given esite for a given month? – Vérace, May 17, 2016 at 22:00