Optimising a PostgreSQL window function query

I need some guidance with this query being too slow.

SELECT DISTINCT ON (id)
       id,
       views - lead(views) OVER (PARTITION BY id ORDER BY update_date DESC) AS vdiff
FROM videoupdate;

With more than 10 million rows, it takes about 30 seconds. A multicolumn index I created brought that down from the original 1 minute. I want to see the difference in views between consecutive rows, partitioned by id. Some thoughts I had:

After each table update, run CREATE TABLE AS with the query and select from the result.
Move old data to a backup and shrink the table.
Look into a data warehouse?
Change the database schema?
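
The first idea above (precomputing the result after each table update) could be sketched with a materialized view, which is a swapped-in alternative to a plain CREATE TABLE AS. This is only a sketch under assumptions: the name video_vdiff and the index name are hypothetical, the columns are assumed to be videoupdate(id, views, update_date), and an explicit ORDER BY is added so that DISTINCT ON keeps the newest row per id.

```sql
-- Hypothetical object names; assumes videoupdate(id, views, update_date).
CREATE MATERIALIZED VIEW video_vdiff AS
SELECT DISTINCT ON (id)
       id,
       views - lead(views) OVER (PARTITION BY id ORDER BY update_date DESC) AS vdiff
FROM videoupdate
ORDER BY id, update_date DESC;  -- makes DISTINCT ON keep the newest row per id

-- REFRESH ... CONCURRENTLY requires a unique index on the materialized view.
CREATE UNIQUE INDEX video_vdiff_id_idx ON video_vdiff (id);

-- Re-run after each batch of updates; concurrent SELECTs are not locked out.
REFRESH MATERIALIZED VIEW CONCURRENTLY video_vdiff;

-- Reads then hit the small precomputed relation instead of the 10M-row table.
SELECT id, vdiff FROM video_vdiff WHERE id = 42;
```

The refresh still pays the full cost of the query, so this only helps if reads are much more frequent than updates.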

postgresql

edited Sep 9, 2017 at 16:57

Evan Carroll

asked Sep 9, 2017 at 13:47

Misa

I'm not sure what that returns, because it's undefined. You're using DISTINCT ON without an ORDER BY? I'm assuming it's ordered by update_date DESC.

– Evan Carroll
Commented Sep 9, 2017 at 14:17
I figured I didn't need to add ORDER BY when I have it in the window function. It should be ordered by update_date, yes. I believe I tested this. Nonetheless, the query still takes too long. @EvanCarroll

– Misa
Commented Sep 9, 2017 at 14:33
You could try removing the DISTINCT ON and computing row_number() over the same window as the lead() function, then filtering on that row number to get one row per id.

– user1822
Commented Sep 9, 2017 at 16:22
@a_horse_with_no_name I don't think that will work, because he needs the lead(), and WHERE runs before the window function. So DISTINCT ON is still the best bet.

– Evan Carroll
Commented Sep 9, 2017 at 17:03
No I mean something like this: privatebin.net/…

– user1822
Commented Sep 9, 2017 at 22:02

1 Answer

Following @a_horse_with_no_name's suggestion again, because he's really smart, though super, super-resistant to using the Post Your Answer functionality.

SELECT DISTINCT ON (id)
       id,
       views - lead(views) OVER (PARTITION BY id ORDER BY update_date DESC) AS vdiff
FROM (
  -- Number the rows of each id from newest to oldest.
  SELECT id,
         views,
         update_date,
         row_number() OVER (PARTITION BY id ORDER BY update_date DESC) AS rn
  FROM videoupdate
) AS t
WHERE rn <= 2  -- keep only the two newest rows per id
ORDER BY id, update_date DESC;
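
As a quick sanity check, here is a minimal sketch of that query run against made-up sample data. The column types and all the values below are assumptions for illustration, not the asker's actual schema; the TEMP table simply reuses the name videoupdate for the session.

```sql
-- Assumed column types and invented sample rows, for illustration only.
CREATE TEMP TABLE videoupdate (
  id          integer,
  views       integer,
  update_date timestamp
);

INSERT INTO videoupdate (id, views, update_date) VALUES
  (1, 100, '2017-09-01'),
  (1, 150, '2017-09-02'),
  (1, 210, '2017-09-03'),
  (2,  10, '2017-09-01'),
  (2,  25, '2017-09-02'),
  (3,   7, '2017-09-01');

SELECT DISTINCT ON (id)
       id,
       views - lead(views) OVER (PARTITION BY id ORDER BY update_date DESC) AS vdiff
FROM (
  SELECT id,
         views,
         update_date,
         row_number() OVER (PARTITION BY id ORDER BY update_date DESC) AS rn
  FROM videoupdate
) AS t
WHERE rn <= 2
ORDER BY id, update_date DESC;

--  id | vdiff
-- ----+-------
--   1 |    60     (210 - 150)
--   2 |    15     (25 - 10)
--   3 | NULL      (only one row, so lead() has nothing to return)
```

Comparing EXPLAIN (ANALYZE, BUFFERS) output for this form and the original on the real table would show whether the rn <= 2 filter actually reduces the work, since row_number() still has to visit every row.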

edited Sep 9, 2017 at 17:03

answered Sep 9, 2017 at 16:46

Evan Carroll

Thank you very much for your answer! I will try it out as soon as possible. @evancarrol

– Misa
Commented Sep 9, 2017 at 17:02
@Misa not yet, it's not right.

– Evan Carroll
Commented Sep 9, 2017 at 17:02
@Misa try that. =)

– Evan Carroll
Commented Sep 9, 2017 at 17:04
