This question extends one I asked previously, which turned out to be oversimplified. This SQLFiddle shows a more accurate example: a working (but slow) solution, followed by my attempt to adapt the previous answer to the actual problem.

The actual problem arises because the two tables contain events for multiple timelines.

CREATE TABLE foo (ts int, id text, foo text);
INSERT INTO foo (ts, id, foo)
VALUES
 (1, 'A', 'Lorem'),
 (1, 'B', 'ipsum'),
 (4, 'B', 'dolor'),
 (5, 'A', 'sit'),
 (8, 'A', 'amet'),
 (8, 'B', 'consectetur');
CREATE TABLE bar (ts int, id text, bar text);
INSERT INTO bar (ts, id, bar)
VALUES
 (1, 'A', 'adipiscing'),
 (5, 'B', 'elit'),
 (6, 'A', 'sed'),
 (9, 'B', 'do ');

Each table has events for timelines 'A' and 'B'. The goal is to combine the results into a single result set showing the "state" of each timeline. The two timelines are orthogonal.

ts  id  foo          bar
1   A   Lorem        adipiscing
5   A   sit          adipiscing
6   A   sit          sed
8   A   amet         sed
1   B   ipsum        (null)
4   B   dolor        (null)
5   B   dolor        elit
8   B   consectetur  elit
9   B   consectetur  do
asked Jul 9, 2015 at 21:55

2 Answers

In addition to the solution for the simple case, add a PARTITION BY clause to the window functions in the inner query to get group numbers per partition (per "timeline"). Then combine the group numbers with the respective timeline (id in your example) to keep partitions separate in the second step:

SELECT id, ts
     , min(foo) OVER (PARTITION BY id, foo_grp) AS foo
     , min(bar) OVER (PARTITION BY id, bar_grp) AS bar
FROM  (
   SELECT id, ts, f.foo, b.bar
          -- running count of non-null values = group number within each timeline
        , count(f.foo) OVER (PARTITION BY id ORDER BY ts) AS foo_grp
        , count(b.bar) OVER (PARTITION BY id ORDER BY ts) AS bar_grp
   FROM   foo f
   FULL   JOIN bar b USING (id, ts)
   ) sub
ORDER  BY 1, 2;

Result as requested (except with id first).
SQL Fiddle
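To see why this works, it can help to run the inner query on its own and look at the group numbers. The following is a minimal sketch, assuming the foo and bar tables from the question are loaded and that (id, ts) is unique within each table. The values in the comments were traced by hand from the sample data.

```sql
-- Inner query only: count() ignores NULLs, so the running count stays flat
-- on rows where the value is missing and increases on each new value.
SELECT id, ts, f.foo, b.bar
     , count(f.foo) OVER (PARTITION BY id ORDER BY ts) AS foo_grp
     , count(b.bar) OVER (PARTITION BY id ORDER BY ts) AS bar_grp
FROM   foo f
FULL   JOIN bar b USING (id, ts)
ORDER  BY id, ts;

-- For timeline 'B' this gives:
--  id | ts | foo         | bar    | foo_grp | bar_grp
--  B  |  1 | ipsum       | (null) |       1 |       0
--  B  |  4 | dolor       | (null) |       2 |       0
--  B  |  5 | (null)      | elit   |       2 |       1
--  B  |  8 | consectetur | (null) |       3 |       1
--  B  |  9 | (null)      | do     |       3 |       2
```

Every (id, foo_grp) group holds exactly one non-null foo, so min(foo) over that group returns it for all rows of the group. Group 0 of bar_grp has no non-null bar at all, which is why the first two 'B' rows end up with NULL in the bar column.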

Your attempt to adapt the previous solution was very close. It didn't work because it used PARTITION BY f.id / PARTITION BY b.id instead of PARTITION BY id. You really want the combined id, which also covers rows that are missing from one of the tables. Those rows are exactly where the last non-null value has to be filled in for the missing (NULL) value.
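The difference between the combined id and the table-qualified columns can be made visible with a small query. This is only an illustrative sketch against the question's foo and bar tables; the aliases f_id and b_id are names introduced here for the example.

```sql
-- With USING (id, ts), the unqualified id is the merged join column and is
-- never NULL, while f.id / b.id are NULL on rows missing from that table.
SELECT id, ts, f.id AS f_id, b.id AS b_id
FROM   foo f
FULL   JOIN bar b USING (id, ts)
ORDER  BY id, ts;

-- For timeline 'A' this gives:
--  id | ts | f_id   | b_id
--  A  |  1 | A      | A
--  A  |  5 | A      | (null)
--  A  |  6 | (null) | A
--  A  |  8 | A      | (null)
```

Partitioning by f.id would move the row at ts = 6 into a separate NULL partition, so it could never pick up 'sit' from the row before it.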

If performance is your paramount requirement, consider a server-side function as demonstrated in the previous answer.

answered Jul 9, 2015 at 22:11
  • Unless I'm missing something, that looks almost exactly like what I have in the second half of my SQLFiddle, and it returns different results than the first half. Commented Jul 9, 2015 at 22:27
  • @ChristopherCurrie: I added some explanation. And no, same result as your first half. See: sqlfiddle.com/#!15/e6ecb/7 Commented Jul 9, 2015 at 22:28
My existing solution is as follows:

SELECT *
FROM (
   -- every foo event, with the latest bar at or before it
   SELECT ts, id, foo, bar
   FROM   foo
   LEFT   JOIN LATERAL (
      SELECT DISTINCT ON (id) bar
      FROM   bar
      WHERE  bar.id = foo.id
      AND    bar.ts <= foo.ts
      ORDER  BY id, ts DESC
      ) b ON true
   UNION
   -- every bar event, with the latest foo at or before it
   SELECT ts, id, foo, bar
   FROM   bar
   LEFT   JOIN LATERAL (
      SELECT DISTINCT ON (id) foo
      FROM   foo
      WHERE  bar.id = foo.id
      AND    foo.ts <= bar.ts
      ORDER  BY id, ts DESC
      ) f ON true
   ) sub
ORDER  BY id, ts;

This query returns the results shown in the question, but the EXPLAIN output is pretty grisly, even with only 300 rows in the 'foo' table and 12k rows in the 'bar' table.
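To compare the two approaches on real data, one option is to prefix each query with EXPLAIN (ANALYZE, BUFFERS), which executes the statement and reports actual row counts and timings. Below is a minimal sketch for the window-function variant from the other answer, assuming the same foo and bar tables. The output depends entirely on the data and the server, so none is shown here.

```sql
-- Runs the query for real and prints the executed plan with timings.
EXPLAIN (ANALYZE, BUFFERS)
SELECT id, ts
     , min(foo) OVER (PARTITION BY id, foo_grp) AS foo
     , min(bar) OVER (PARTITION BY id, bar_grp) AS bar
FROM  (
   SELECT id, ts, f.foo, b.bar
        , count(f.foo) OVER (PARTITION BY id ORDER BY ts) AS foo_grp
        , count(b.bar) OVER (PARTITION BY id ORDER BY ts) AS bar_grp
   FROM   foo f
   FULL   JOIN bar b USING (id, ts)
   ) sub
ORDER  BY 1, 2;
```

Running the same prefix on the LATERAL query above gives a like-for-like comparison of the two plans.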

answered Jul 9, 2015 at 22:09
  • Wouldn't UNION ALL result in duplicate rows, if both tables had an event at the same timestamp? Commented Jul 9, 2015 at 22:24
  • Right, I missed that you did not use FULL JOIN in this approach. Either way, the alternative should be substantially faster. Commented Jul 9, 2015 at 22:49
