9

I have the following table:

CREATE TABLE trans (
 id SERIAL PRIMARY KEY,
 trans_date date,
 trans_time time 
);

I want to have the following view

CREATE OR REPLACE VIEW daily_trans AS
SELECT trans_date,
 max(trans_time) as first, 
 min(trans_time) as last,
 calculate_status(min(trans_time), max(trans_time)) as status 
GROUP BY trans_date 

with columns that specify the ids of the max and min trans_time.

How do I do that?

Erwin Brandstetter
186k28 gold badges463 silver badges636 bronze badges
asked Dec 27, 2011 at 19:15

2 Answers 2

13

You can calculate this in one step with window functions:

CREATE OR REPLACE VIEW daily_trans AS
SELECT DISTINCT
 trans_date
 , first_value(trans_time) OVER w AS first_time
 , first_value(id) OVER w AS first_id
 , last_value(trans_time) OVER w AS last_time
 , last_value(id) OVER w AS last_id
 , calculate_status(min(trans_time) OVER w
 , max(trans_time) OVER w) AS status 
FROM trans
WINDOW w AS (PARTITION BY trans_date ORDER BY trans_time, id
 ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
ORDER BY 1;

I added id as secondary ORDER column in the window clause, to make the sort order stable in case of identical times per day.

If you are not familiar with window functions, be sure to read this chapter of the manual.

ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING is a not so commonly used clause to adjust the window frame, because we want last_value() to always return the last value of the whole partition (the date), not the last value up to the current (including peers) according to the sort order.

How you to combine DISTINCT with window functions:

answered Dec 27, 2011 at 19:50
0
5

Roughly, you want something like

select min_trans_time, min_trans.id as min_trans_time_id,
 max_trans_time, max_trans.id as max_trans_time_id
from (
 select trans_date,
 max(trans_time) as max_trans_time, 
 min(trans_time) as min_trans_time,
 from trans 
 group by trans_date) min_max_trans,
 trans as min_trans,
 trans as max_trans
where min_trans.trans_time = min_max_trans.min_trans_time
and max_trans.trans_time = min_max_trans.max_trans_time

Which might not be stable if several transes share the same trans_time (that is, running the query on the same set of data might yield different results. One easy way of solving this, if it's a concern, is to pick the max or min id, for instance. Of course, this might bias the results :).

You can also use Window Functions if you are using PostgreSQL 8.4 or later; they will provide a clearer query (clear if you are familiar with window functions :), and they let you do stuff which is pretty hard to do with the above strategy (for instance, getting the 2nd-higher value instead of the maximum). However, in my short experience, they performed worse than the self-join approach I propose here. Erwin's answer contains a version of the query using window functions. It would be interesting to note which query executes better and/or can be optimized better (by adding indexes, etc.).

answered Dec 27, 2011 at 19:35
0

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.