Let's say I need a query to see sales results by Month and I want the result-set to include results for each month whether there were any sales in that month or not.
And let's say my table "sales" has the columns "sales_date" and "sales_amount" and I want a result-set to look like:
Jan, 100.00
Feb, 240.00
Mar, 0.00
Apr, 430.00
So, if I want to sum the values by month and there aren't any records for Mar, I still want to see a result for Mar that's 0. How would I do that?
-
What version of postgres?Jack Douglas– Jack Douglas2011年09月12日 16:27:07 +00:00Commented Sep 12, 2011 at 16:27
2 Answers 2
The usual way to get a series in postgres is with generate_series. This function produces a series of integers or timestamps - you can use either but assuming your 'dates' are really timestamptz
, here's how you might go about it if you are on 8.4 or above:
testbed:
create table sales(sales_date timestamptz, sales_amount numeric);
insert into sales(sales_date, sales_amount) values('2011-01-15 12:00', 100);
insert into sales(sales_date, sales_amount) values('2011-02-15 12:00', 240);
insert into sales(sales_date, sales_amount) values('2011-04-15 12:00', 400);
insert into sales(sales_date, sales_amount) values('2011-04-16 12:00', 30);
query:
with w as ( select month, sum(sales_amount) as total
from (select date_trunc('month',sales_date) as month, sales_amount from sales) z
group by month )
select to_char(month, 'fmMon') as month, coalesce(total, 0) as total
from (select generate_series(min(month), max(month), '1 month'::interval) as month from w) m
left outer join w using(month);
result:
month | total
-------+-------
Jan | 100
Feb | 240
Mar | 0
Apr | 430
--edit: a bit of extra detail on the query:
produce a summary of sales by month (but no month present if no sales):
with w as ( select month, sum(sales_amount) as total from ( select date_trunc('month',sales_date) as month, sales_amount from sales ) z group by month )
which could alternatively be written as:
with w as ( select date_trunc('month',sales_date) as month, sum(sales_amount) as total from sales group by date_trunc('month',sales_date) )
produce an unbroken series of months (without sales) from the minimum to the maximum:
select generate_series(min(month), max(month), '1 month'::interval) as month from w
outer join the unbroken series to the summary of sales by month to produce an unbroken series with sales (or
null
sales if no sales present):left outer join w using(month)
for the months with
null
sales, change thenull
to a0
:coalesce(total, 0)
-
Having a list of months is one aspect to the solution. I think you should draw more attention to the
left join
and thecoalesce
function.Stephen Denne– Stephen Denne2011年09月27日 11:26:10 +00:00Commented Sep 27, 2011 at 11:26 -
Is that better do you think?Jack Douglas– Jack Douglas2011年09月27日 12:44:14 +00:00Commented Sep 27, 2011 at 12:44
-
Yes. I think it is a great answer. Is there any reason you used a CTE over a subquery, or decided to pull out the sum/group by query instead of the generated series of months?Stephen Denne– Stephen Denne2011年09月27日 19:31:17 +00:00Commented Sep 27, 2011 at 19:31
-
The CTE is used twice so seemed better than 2 subqueries. The generated series could also be pulled out into a CTE but as it is only used once I favour the subquery - which is just a personal preference because I find choosing names for CTEs harder than it ought to be. 'w' is fine if there is only one :)Jack Douglas– Jack Douglas2011年09月27日 19:56:24 +00:00Commented Sep 27, 2011 at 19:56
You need to outer-join a query that calculates the actual sums to a query that generates a dummy row for each month, using the month as the join key.
Here is how I would do it in Oracle using CONNECT BY LEVEL:
select foo.monthdate
, bar.amount_sum
from (select add_months(to_date('01-JAN-2011'),level-1) as MONTHDATE
from dual
connect by level <= 8
) foo
, (select trunc(sale_date,'MM') as SALE_MONTH
, sum(amount) as AMOUNT_SUM
from sales
where sale_date >= '01-JAN-2011'
and sale_date < '01-SEP-2011'
group by trunc(sale_date,'MM')
) bar
where foo.monthdate = bar.sale_month(+)
order by foo.monthdate
Unfortunately there's no simple translation of CONNECT BY LEVEL to Postgres (edit: apparently there is, see Jack Douglas' answer), but a simple alternative is to construct a date dimension table...
create table date_dim
(
date_id number
, daydate date
, week_begin date
, month_begin date
, year number
, month number
, day_of_month number
, day_of_year number
--et cetera...
);
...and populate that table with every date for a hundred years in both directions. Then you can generate the dates you need very simply (converting dates to spelled-out months left as an exercise for the reader):
select daydate
from date_dim
where daydate >= '01-JAN-2011'
and daydate < '01-SEP-2011'
and day_of_month = 1