I'm trying to do this table which will give me a date, and then sum up sales from a separate table, but the sales values get multiplied 14 times for some reason. Will this need a temp table, or am I missing something obvious in my join?
-- Set the base dates
DECLARE @StartDate DATETIME = '01-Jun-2022 00:00:00'
DECLARE @EndDate DATETIME = '01-Jul-2022 00:00:00'
DECLARE @PeriodID VARCHAR(50) = '10'
DECLARE @MarketID VARCHAR(50) = 'GB'
SELECT [c].[CustomerID] 'Customer ID'
,[o].[OrderID]
, [c].[FirstName]
, [c].[LastName]
, [c].[Field3]
, FORMAT([c].[Date1], 'dd/MM/yyyy') 'Join Date'
, FORMAT(MAX([bep1].[EndDate]), 'dd/MM/yyyy') 'BEP1 Date'
, FORMAT(MAX([bep2].[EndDate]), 'dd/MM/yyyy') 'BEP2 Date'
, FORMAT(MAX([bep3].[EndDate]), 'dd/MM/yyyy') 'BEP3 Date'
, FORMAT(MAX([bep4].[EndDate]), 'dd/MM/yyyy') 'BEP4 Date'
, SUM(CASE WHEN DATEDIFF(MONTH , [c].[Date1], EOMONTH([o].[OrderDate])) = 0 THEN [od].[CommissionableVolume] ELSE 0 END ) AS 'Mth1 Sales'
, SUM(CASE WHEN DATEDIFF(MONTH , [c].[Date1], EOMONTH([o].[OrderDate])) = 1 THEN [od].[CommissionableVolume] ELSE 0 END ) AS 'Mth2 Sales'
, SUM(CASE WHEN DATEDIFF(MONTH , [c].[Date1], EOMONTH([o].[OrderDate])) = 2 THEN [od].[CommissionableVolume] ELSE 0 END ) AS 'Mth3 Sales'
, SUM(CASE WHEN DATEDIFF(MONTH , [c].[Date1], EOMONTH([o].[OrderDate])) = 3 THEN [od].[CommissionableVolume] ELSE 0 END ) AS 'Mth4 Sales'
, SUM(CASE WHEN DATEDIFF(MONTH , [c].[Date1], EOMONTH([o].[OrderDate])) = 4 THEN [od].[CommissionableVolume] ELSE 0 END ) AS 'Mth5 Sales'
, SUM(CASE WHEN DATEDIFF(MONTH , [c].[Date1], EOMONTH([o].[OrderDate])) = 5 THEN [od].[CommissionableVolume] ELSE 0 END ) AS 'Mth6 Sales'
, SUM([od].[CommissionableVolume]) AS 'Sales Volume'
FROM [dbo].[Customers] [c]
LEFT JOIN [dbo].[PeriodVolumes] [pv] ON [c].[CustomerID] = [pv].[CustomerID] AND [pv].[PeriodTypeID] = 20
LEFT JOIN [dbo].[Periods] [bep1] ON [bep1].[PeriodID] = [pv].[Volume43] AND [bep1].[PeriodTypeID] = 20 AND [pv].[Volume43] > 0
LEFT JOIN [dbo].[Periods] [bep2] ON [bep2].[PeriodID] = [pv].[Volume44] AND [bep2].[PeriodTypeID] = 20 AND [pv].[Volume44] > 0
LEFT JOIN [dbo].[Periods] [bep3] ON [bep3].[PeriodID] = [pv].[Volume45] AND [bep3].[PeriodTypeID] = 20 AND [pv].[Volume45] > 0
LEFT JOIN [dbo].[Periods] [bep4] ON [bep4].[PeriodID] = [pv].[Volume46] AND [bep4].[PeriodTypeID] = 20 AND [pv].[Volume46] > 0
INNER JOIN [dbo].[Orders] [o] ON REPLACE(REPLACE([o].[Other11],CHAR(9),''),' ','') = [c].[CustomerID]
INNER JOIN [dbo].[OrderDetails] [od] ON [od].[OrderID] = [o].[OrderID]
WHERE (1=1)
AND ([c].[Field3] = @MarketId OR @MarketId = 'All')
AND [c].[Date1] BETWEEN @StartDate AND @EndDate
AND [c].[CustomerID] = '3009737'
GROUP BY [c].[CustomerID]
,[o].[OrderID]
, [c].[FirstName]
, [c].[LastName]
, [c].[Field3]
, FORMAT([c].[Date1], 'dd/MM/yyyy')
ORDER BY FORMAT([c].[Date1], 'dd/MM/yyyy') ASC
,[c].[CustomerID] ASC
Rick James
80.7k5 gold badges52 silver badges119 bronze badges
1 Answer 1
One rule to keep in mind: JOINs
are done before aggregation. That is,...
- All
JOINing
is performed; a temporary table is created. If some of the JOINs represent 1:many relationships, the number of rows may be larger than any of the original tables. - Rows that don't match the
WHERE
are tossed. - Now the
GROUP BY
is applied. TheSUM()
is applied to the rows in that temp table. Because of any 1:many relationships, some number may show up multiple times, hence inflating the "sum".
Usually, the solution is to do aggregations (and GROUP BYs
) on the table as a separate step. Then JOIN
to the rest.
SELECT ..., sumb, ...
FROM ( SELECT a, SUM(b) sumb FROM x GROUP BY a ) AS y
JOIN ... ON y.a = ...
answered Sep 5, 2022 at 17:48
lang-sql
JOIN
s and / orGROUP BY
clause are incorrect, causing the increased number of rows. Without seeing the Table definitions, and example data for each, it's hard to tell exactly what part. Comment out theGROUP BY
, allJOIN
s and theSELECT
list. Then only add in the key field of each Table as you go, and uncomment a singleJOIN
one by one until you start to see the data duplicated unexpectedly. Then you've found your problem. If the raw data appears correct after adding back all theJOIN
s, then you have just an incorrectGROUP BY
clause and need to look closer at it.