postgresql: get last/newest value for each group of rows

Question 1

I have the following notifications table in postgresql:

- notification_id, bigint
- group_key, character varying(24)
- associated, jsonb object
- timestamp

The table now looks like this

+------------------------------------------------------+
| notification_id | group_key | associated | timestamp |
+------------------------------------------------------+
| 1 | key1 | {123} | ... |
| 2 | key1 | {456} | ... |
| 3 | key2 | {789} | ... |
+------------------------------------------------------+

The following query gives the following results:

SELECT
 a.group_key,
 MAX(a.notification_id) as max,
 COUNT(a.notification_id) as total
FROM
 data.notifications a
GROUP BY
 a.group_key
ORDER BY
 MAX(a.notification_id) DESC
+-------------------------+
| group_key | max | total |
+-------------------------+
| key1 | 2 | 2 |
| key2 | 3 | 1 |
+-------------------------+

Explanation: the notifications with ID 1 and 2 belong to the same notification group (key1). I want to fetch the max notification ID for the group (in this case 2), the number of notifications in the group (in this case also 2).

What I want now is also the content of the column 'associated' of the last notification, so in this case from notification with ID 2 (so in this case '{456}').

This column contains a jsonb object with various keys. I want to do joins from other tables with those keys, but I don't know how to get the last JSONB object for each notification group.

I tried using LAST_VALUES but I constantly get errors in pgadmin, saying that the column "associated" should be in the group by clause, but when I do this, I don't get the correct resultset returned.

SELECT
 a.group_key,
 MAX(a.notification_id) as max,
 COUNT(a.notification_id) as total,
 LAST_VALUE(a.associated) OVER (PARTITION BY a.group_key ORDER BY MAX(a.notification_id) DESC) as associated
FROM
 data.notifications a
GROUP BY
 a.group_key
ORDER BY
 MAX(a.notification_id) DESC

Question 2

This is a classic case where using the ROW_NUMBER() window function can be used to get the latest row within a grouping (PARTITION) like so:

WITH CTE_Notifications_Sorted AS
(
 SELECT notification_id, group_key, associated, timestamp, 
 COUNT(notification_id) OVER (PARTITION BY group_key) AS notification_id_count,
ROW_NUMBER() OVER (PARTITION BY group_key ORDER BY notification_id DESC) AS PartitionSortId -- Generates a unique ID for each row within the grouping (partition) of group_key ordered by the notification_id descending
 FROM data.notifications
)
SELECT group_key, notification_id AS notification_id_max, notification_id_count AS total, associated, timestamp
FROM CTE_Notifications_Sorted
WHERE PartitionSortId = 1 -- Filter out everything but the latest row of each group_key partition

By using the COUNT() function in a window function as well in the above CTE, you're able to remove your GROUP BY clause in the final SELECT.

Now that you have the associated column for the correct row, if you need to parse out a specific value you can follow this StackOverflow answer or leverage one of these JSON Functions and Operators.

Question 3

Thank you for your quick answer. I'll giv it a try!

Question 4

@yesterday Sorry I got pulled away while writing it up, I'm going to add more details because obviously the answer doesn't have the aggregation functions you were doing, but those can be added in as well. Also I think you'll want to use the LAST_VALUE function on top of the JSONB column in the final select.

Question 5

@Yesterday Ok I updated my answer. I looked a little closer and saw what you were trying to do. I believe my answer should give you what you need. I used COUNT() as a window function also that way you don't need to do any grouping in your final select. (Also note, instead of the column name max (which is a reserved keyword) for the max notification_id, I called it notification_id_max.)

J.D. J.D. 41.1k12 gold badges63 silver badges145 bronze badges · Answer 1 · 2021-02-12 14:15:19Z

This is a classic case where using the ROW_NUMBER() window function can be used to get the latest row within a grouping (PARTITION) like so:

WITH CTE_Notifications_Sorted AS
(
 SELECT notification_id, group_key, associated, timestamp, 
 COUNT(notification_id) OVER (PARTITION BY group_key) AS notification_id_count,
ROW_NUMBER() OVER (PARTITION BY group_key ORDER BY notification_id DESC) AS PartitionSortId -- Generates a unique ID for each row within the grouping (partition) of group_key ordered by the notification_id descending
 FROM data.notifications
)
SELECT group_key, notification_id AS notification_id_max, notification_id_count AS total, associated, timestamp
FROM CTE_Notifications_Sorted
WHERE PartitionSortId = 1 -- Filter out everything but the latest row of each group_key partition

By using the COUNT() function in a window function as well in the above CTE, you're able to remove your GROUP BY clause in the final SELECT.

Now that you have the associated column for the correct row, if you need to parse out a specific value you can follow this StackOverflow answer or leverage one of these JSON Functions and Operators.

@yesterday Sorry I got pulled away while writing it up, I'm going to add more details because obviously the answer doesn't have the aggregation functions you were doing, but those can be added in as well. Also I think you'll want to use the LAST_VALUE function on top of the JSONB column in the final select.
@Yesterday Ok I updated my answer. I looked a little closer and saw what you were trying to do. I believe my answer should give you what you need. I used COUNT() as a window function also that way you don't need to do any grouping in your final select. (Also note, instead of the column name max (which is a reserved keyword) for the max notification_id, I called it notification_id_max.)

Stack Exchange Network

postgresql: get last/newest value for each group of rows

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

postgresql: get last/newest value for each group of rows

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions