PostgreSQL query with max and min date plus associated id per row

Question 1

I have the following table:

CREATE TABLE trans (
 id SERIAL PRIMARY KEY,
 trans_date date,
 trans_time time 
);

I want to have the following view

CREATE OR REPLACE VIEW daily_trans AS
SELECT trans_date,
 max(trans_time) as first, 
 min(trans_time) as last,
 calculate_status(min(trans_time), max(trans_time)) as status 
GROUP BY trans_date

with columns that specify the ids of the max and min trans_time.

How do I do that?

Question 2

You can calculate this in one step with window functions:

CREATE OR REPLACE VIEW daily_trans AS
SELECT DISTINCT
 trans_date
 , first_value(trans_time) OVER w AS first_time
 , first_value(id) OVER w AS first_id
 , last_value(trans_time) OVER w AS last_time
 , last_value(id) OVER w AS last_id
 , calculate_status(min(trans_time) OVER w
 , max(trans_time) OVER w) AS status 
FROM trans
WINDOW w AS (PARTITION BY trans_date ORDER BY trans_time, id
 ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
ORDER BY 1;

I added id as secondary ORDER column in the window clause, to make the sort order stable in case of identical times per day.

If you are not familiar with window functions, be sure to read this chapter of the manual.

ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING is a not so commonly used clause to adjust the window frame, because we want last_value() to always return the last value of the whole partition (the date), not the last value up to the current (including peers) according to the sort order.

How you to combine DISTINCT with window functions:

PostgreSQL: running count of rows for a query 'by minute'

Question 3

Roughly, you want something like

select min_trans_time, min_trans.id as min_trans_time_id,
 max_trans_time, max_trans.id as max_trans_time_id
from (
 select trans_date,
 max(trans_time) as max_trans_time, 
 min(trans_time) as min_trans_time,
 from trans 
 group by trans_date) min_max_trans,
 trans as min_trans,
 trans as max_trans
where min_trans.trans_time = min_max_trans.min_trans_time
and max_trans.trans_time = min_max_trans.max_trans_time

Which might not be stable if several transes share the same trans_time (that is, running the query on the same set of data might yield different results. One easy way of solving this, if it's a concern, is to pick the max or min id, for instance. Of course, this might bias the results :).

You can also use Window Functions if you are using PostgreSQL 8.4 or later; they will provide a clearer query (clear if you are familiar with window functions :), and they let you do stuff which is pretty hard to do with the above strategy (for instance, getting the 2nd-higher value instead of the maximum). However, in my short experience, they performed worse than the self-join approach I propose here. Erwin's answer contains a version of the query using window functions. It would be interesting to note which query executes better and/or can be optimized better (by adding indexes, etc.).

score 13 · Accepted Answer · 2011-12-27 19:50:34Z

You can calculate this in one step with window functions:

CREATE OR REPLACE VIEW daily_trans AS
SELECT DISTINCT
 trans_date
 , first_value(trans_time) OVER w AS first_time
 , first_value(id) OVER w AS first_id
 , last_value(trans_time) OVER w AS last_time
 , last_value(id) OVER w AS last_id
 , calculate_status(min(trans_time) OVER w
 , max(trans_time) OVER w) AS status 
FROM trans
WINDOW w AS (PARTITION BY trans_date ORDER BY trans_time, id
 ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
ORDER BY 1;

I added id as secondary ORDER column in the window clause, to make the sort order stable in case of identical times per day.

If you are not familiar with window functions, be sure to read this chapter of the manual.

ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING is a not so commonly used clause to adjust the window frame, because we want last_value() to always return the last value of the whole partition (the date), not the last value up to the current (including peers) according to the sort order.

How you to combine DISTINCT with window functions:

PostgreSQL: running count of rows for a query 'by minute'

Stack Exchange Network

PostgreSQL query with max and min date plus associated id per row

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Hot Network Questions

PostgreSQL query with max and min date plus associated id per row

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Related

Hot Network Questions