I have two queries which are doing basically the same thing, but have different grouping, the first query (query 1) is used to populate a chart, and the second to populate a table.
Query 1:
SELECT key_id
,sum(salary)
,sum(bonus)
,created_at
FROM table
WHERE emp_id = 1
GROUP BY key_id, created_at
Query 2:
SELECT key_id
,sum(salary)
,sum(bonus)
,count(*) OVER() AS full_count
FROM table
WHERE emp_id = 1
GROUP BY key_id
I have created a function that returns those two queries in a json format, chart: [...], table: [...]
. The problem is that I need to query the same table twice, because of my grouping. Is there any way of dealing with this situations?
-
If the number of rows coming back is small enough, it's completely legitimate to do query 1 and then do a second roll up in code/memory instead of query 2.jpmc26– jpmc262016年01月26日 00:06:14 +00:00Commented Jan 26, 2016 at 0:06
3 Answers 3
The new 9.5 version has added CUBE
, GROUPING SETS
and ROLLUP
extensions to GROUP BY
which can be used for such queries. If you have to do this in previous versions, one way is using a CTE, with something like (note that it's not the same exact output as your query, since I didn't know the type of created_at
. I guess it's a timestamp and didn't want to mix it with the integer count):
WITH grp AS
( SELECT key_id,
sum(salary) AS salary,
sum(bonus) AS bonus,
created_at,
count(*) AS full_count
FROM table
WHERE emp_id = 1
GROUP BY key_id, created_at
)
SELECT *
FROM grp
UNION ALL
SELECT key_id,
sum(salary),
sum(bonus),
NULL,
sum(full_count)
FROM grp
GROUP BY key_id
ORDER BY key_id, created_at NULLS LAST;
-
1@AndriyM: I think so, too. It's the difference I tried to address with
count_distinct_created_at_per_key
vs.count_rows_per_key
in my query.Erwin Brandstetter– Erwin Brandstetter2016年01月25日 19:11:33 +00:00Commented Jan 25, 2016 at 19:11 -
@Erwin and AndriyM thanx, I missed that.ypercubeᵀᴹ– ypercubeᵀᴹ2016年01月25日 19:58:06 +00:00Commented Jan 25, 2016 at 19:58
Depending on how you want to present your data, you could do it in a single query, a single query-level, even:
SELECT key_id
, created_at
, sum(salary) AS sum_salary -- numbers per (key_id, created_at)
, sum(bonus) AS sum_bonus
, count(*) AS count_rows
, sum(sum(salary)) OVER w AS total_salary_per_key -- totals per (key_id)
, sum(sum(bonus)) OVER w AS total_bonus_per_key
, count(*) OVER w AS count_distinct_created_at_per_key
, sum(count(*)) OVER w AS count_rows_per_key
FROM tbl
WHERE emp_id = 1
GROUP BY key_id, created_at
WINDOW w AS (PARTITION BY key_id);
This lists totals per key
(repeatedly) for every aggregated row per (key_id, created_at)
.
Detailed explanation for the technique:
If you actually want to implement ROLLUP
see @ypercube's answer or this related one:
Since each row's values are aggregated twice, you could also try duplicating each row in a manner that would allow you to perform both level's aggregations in one pass. The query below uses LATERAL for that purpose:
SELECT
v.key_id,
v.created_at,
SUM(salary) AS total_salary,
SUM(bonus) AS total_bonus,
COUNT(*) AS full_count
FROM
table AS t,
LATERAL
(VALUES
(t.key_id, t.created_at, t.salary, t.bonus),
(t.key_id, NULL, t.salary, t.bonus)
) AS v
GROUP BY
v.key_id,
v.created_at
ORDER BY
v.key_id ASC,
v.created_at ASC NULLS LAST
;
As you can see, one of the copies has NULL instead of created_at
and the query groups by the copied key_id
and created_at
rather then the original ones. The effect of this is that one half of the row set effectively gets grouped by key_id
alone, and so you get aggregated results for both levels in one go.
I have to admit, though, that I have no PostgreSQL instance to test this solution and thus no idea how it will perform compared to the other suggestions posted.
Explore related questions
See similar questions with these tags.