Avoid double query on the same table

Question 1

I have two queries which are doing basically the same thing, but have different grouping, the first query (query 1) is used to populate a chart, and the second to populate a table.

Query 1:

SELECT key_id
 ,sum(salary)
 ,sum(bonus)
 ,created_at
FROM table
WHERE emp_id = 1
GROUP BY key_id, created_at

Query 2:

SELECT key_id
 ,sum(salary)
 ,sum(bonus)
 ,count(*) OVER() AS full_count
FROM table
WHERE emp_id = 1
GROUP BY key_id

I have created a function that returns those two queries in a json format, chart: [...], table: [...]. The problem is that I need to query the same table twice, because of my grouping. Is there any way of dealing with this situations?

Question 2

If the number of rows coming back is small enough, it's completely legitimate to do query 1 and then do a second roll up in code/memory instead of query 2.

Question 3

The new 9.5 version has added CUBE, GROUPING SETS and ROLLUP extensions to GROUP BY which can be used for such queries. If you have to do this in previous versions, one way is using a CTE, with something like (note that it's not the same exact output as your query, since I didn't know the type of created_at. I guess it's a timestamp and didn't want to mix it with the integer count):

WITH grp AS
 ( SELECT key_id,
 sum(salary) AS salary,
 sum(bonus) AS bonus,
 created_at,
 count(*) AS full_count
 FROM table
 WHERE emp_id = 1
 GROUP BY key_id, created_at
 )
SELECT *
FROM grp
UNION ALL
SELECT key_id, 
 sum(salary),
 sum(bonus), 
 NULL,
 sum(full_count) 
FROM grp
GROUP BY key_id 
ORDER BY key_id, created_at NULLS LAST;

Question 4

@AndriyM: I think so, too. It's the difference I tried to address with count_distinct_created_at_per_key vs. count_rows_per_key in my query.

Question 5

@Erwin and AndriyM thanx, I missed that.

Question 6

Depending on how you want to present your data, you could do it in a single query, a single query-level, even:

SELECT key_id
 , created_at
 , sum(salary) AS sum_salary -- numbers per (key_id, created_at)
 , sum(bonus) AS sum_bonus
 , count(*) AS count_rows
 , sum(sum(salary)) OVER w AS total_salary_per_key -- totals per (key_id)
 , sum(sum(bonus)) OVER w AS total_bonus_per_key
 , count(*) OVER w AS count_distinct_created_at_per_key
 , sum(count(*)) OVER w AS count_rows_per_key
FROM tbl
WHERE emp_id = 1
GROUP BY key_id, created_at
WINDOW w AS (PARTITION BY key_id);

This lists totals per key (repeatedly) for every aggregated row per (key_id, created_at).

Detailed explanation for the technique:

Best way to get result count before LIMIT was applied in PHP/PostgreSQL

If you actually want to implement ROLLUP see @ypercube's answer or this related one:

Grouping() equivalent in PostgreSQL?

Question 7

Since each row's values are aggregated twice, you could also try duplicating each row in a manner that would allow you to perform both level's aggregations in one pass. The query below uses LATERAL for that purpose:

SELECT
 v.key_id,
 v.created_at,
 SUM(salary) AS total_salary,
 SUM(bonus) AS total_bonus,
 COUNT(*) AS full_count
FROM
 table AS t,
 LATERAL
 (VALUES
 (t.key_id, t.created_at, t.salary, t.bonus),
 (t.key_id, NULL, t.salary, t.bonus)
 ) AS v
GROUP BY
 v.key_id,
 v.created_at
ORDER BY
 v.key_id ASC,
 v.created_at ASC NULLS LAST
;

As you can see, one of the copies has NULL instead of created_at and the query groups by the copied key_id and created_at rather then the original ones. The effect of this is that one half of the row set effectively gets grouped by key_id alone, and so you get aggregated results for both levels in one go.

I have to admit, though, that I have no PostgreSQL instance to test this solution and thus no idea how it will perform compared to the other suggestions posted.

ypercubeTM ypercubeTM 99.7k13 gold badges217 silver badges306 bronze badges · Accepted Answer · 2016-01-25 14:07:49Z

The new 9.5 version has added CUBE, GROUPING SETS and ROLLUP extensions to GROUP BY which can be used for such queries. If you have to do this in previous versions, one way is using a CTE, with something like (note that it's not the same exact output as your query, since I didn't know the type of created_at. I guess it's a timestamp and didn't want to mix it with the integer count):

WITH grp AS
 ( SELECT key_id,
 sum(salary) AS salary,
 sum(bonus) AS bonus,
 created_at,
 count(*) AS full_count
 FROM table
 WHERE emp_id = 1
 GROUP BY key_id, created_at
 )
SELECT *
FROM grp
UNION ALL
SELECT key_id, 
 sum(salary),
 sum(bonus), 
 NULL,
 sum(full_count) 
FROM grp
GROUP BY key_id 
ORDER BY key_id, created_at NULLS LAST;

@AndriyM: I think so, too. It's the difference I tried to address with count_distinct_created_at_per_key vs. count_rows_per_key in my query.

Stack Exchange Network

Avoid double query on the same table

3 Answers 3

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Hot Network Questions

Avoid double query on the same table

3 Answers 3

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Linked

Related

Hot Network Questions