Querying sums of grouped consecutive rows in PostgreSQL 9

Question 1

I have data about people traveling in different countries like this:

country | begintimestamp      | distance
Germany | 2015-01-01 00:00:00 | 100
Germany | 2015-01-01 01:12:13 | 30
France  | 2015-01-01 02:13:14 | 40
France  | 2015-01-01 03:14:15 | 20
Spain   | 2015-01-01 04:15:16 | 10
France  | 2015-01-01 05:16:17 | 30
France  | 2015-01-01 05:17:18 | 5
Germany | 2015-01-01 06:18:19 | 3

What I need is a result like this: the distances of consecutive rows with the same country summed, together with the earliest begintimestamp of each run:

country | begintimestamp      | distance
Germany | 2015-01-01 00:00:00 | 130 // 100+30, the distance of the first two rows summed
France  | 2015-01-01 02:13:14 | 60  // 40+20
Spain   | 2015-01-01 04:15:16 | 10
France  | 2015-01-01 05:16:17 | 35  // 30+5
Germany | 2015-01-01 06:18:19 | 3

I've tried to play around with PG window functions but have not been able to come up with anything that would lead me closer to the result.

user1822 · Accepted Answer · 2015-02-03 21:53:46Z

select min(country) as country,  -- identical for every row in a group
       min(begintimestamp) as first_begin_ts,
       sum(distance) as distance
from (
    -- t2: the running sum of the flags numbers each consecutive run
    select t1.*,
           sum(group_flag) over (order by begintimestamp) as grp
    from (
        -- t1: flag is 1 on the first row and whenever the country changes
        select *,
               case
                   when lag(country) over (order by begintimestamp) = country then null
                   else 1
               end as group_flag
        from travel
    ) t1
) t2
group by grp
order by first_begin_ts;

The innermost query (alias t1) emits a flag of 1 each time the country changes (and on the first row, where lag() returns null), and null otherwise. The second-level query (alias t2) then computes a running sum over those flags, which gives each consecutive run of the same country its own number. The outermost query groups by that number and sums the distance. The min(country) is only there to keep the group by operator happy: all rows with the same grp have the same country anyway, so min() simply returns that country.
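To make the three steps easier to follow, here is a rough Python sketch that mirrors the query on the sample rows from the question. It is only an illustration of the logic, not something to run against the database: the variable names (rows, group_flag, grp, summary) are my own, the rows are assumed to be sorted by begintimestamp already, and the flag uses 0 where the SQL uses null (sum() ignores nulls, so the running total comes out the same).

```python
from itertools import accumulate

# Sample rows from the question, assumed to be ordered by begintimestamp.
rows = [
    ("Germany", "2015-01-01 00:00:00", 100),
    ("Germany", "2015-01-01 01:12:13", 30),
    ("France",  "2015-01-01 02:13:14", 40),
    ("France",  "2015-01-01 03:14:15", 20),
    ("Spain",   "2015-01-01 04:15:16", 10),
    ("France",  "2015-01-01 05:16:17", 30),
    ("France",  "2015-01-01 05:17:18", 5),
    ("Germany", "2015-01-01 06:18:19", 3),
]

# Step 1 (t1): compare each country with the previous row's country (lag).
# Flag is 1 when it differs (including the first row), otherwise 0.
countries = [r[0] for r in rows]
previous = [None] + countries[:-1]
group_flag = [0 if prev == cur else 1 for prev, cur in zip(previous, countries)]

# Step 2 (t2): a running sum of the flags gives every consecutive run its own number.
grp = list(accumulate(group_flag))

# Step 3 (outer query): group by grp, keep the earliest timestamp, sum the distance.
groups = {}
for (country, ts, distance), g in zip(rows, grp):
    if g not in groups:
        groups[g] = [country, ts, 0]
    entry = groups[g]
    entry[1] = min(entry[1], ts)  # ISO-style strings sort chronologically
    entry[2] += distance

summary = [tuple(entry) for _, entry in sorted(groups.items())]
for row in summary:
    print(row)
```

For the sample data group_flag comes out as 1, 0, 1, 0, 1, 1, 0, 1 and grp as 1, 1, 2, 2, 3, 4, 4, 5, so the second France run gets a different number from the first one even though the country is the same.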

SQLFiddle: http://sqlfiddle.com/#!15/fe341/1
