1

I'm trying to do this table which will give me a date, and then sum up sales from a separate table, but the sales values get multiplied 14 times for some reason. Will this need a temp table, or am I missing something obvious in my join?

 -- Set the base dates
DECLARE @StartDate DATETIME = '01-Jun-2022 00:00:00'
DECLARE @EndDate DATETIME = '01-Jul-2022 00:00:00'
DECLARE @PeriodID VARCHAR(50) = '10'
DECLARE @MarketID VARCHAR(50) = 'GB'
 SELECT [c].[CustomerID] 'Customer ID'
 ,[o].[OrderID]
 , [c].[FirstName]
 , [c].[LastName]
 , [c].[Field3]
 , FORMAT([c].[Date1], 'dd/MM/yyyy') 'Join Date'
 , FORMAT(MAX([bep1].[EndDate]), 'dd/MM/yyyy') 'BEP1 Date'
 , FORMAT(MAX([bep2].[EndDate]), 'dd/MM/yyyy') 'BEP2 Date'
 , FORMAT(MAX([bep3].[EndDate]), 'dd/MM/yyyy') 'BEP3 Date'
 , FORMAT(MAX([bep4].[EndDate]), 'dd/MM/yyyy') 'BEP4 Date'
 , SUM(CASE WHEN DATEDIFF(MONTH , [c].[Date1], EOMONTH([o].[OrderDate])) = 0 THEN [od].[CommissionableVolume] ELSE 0 END ) AS 'Mth1 Sales'
 , SUM(CASE WHEN DATEDIFF(MONTH , [c].[Date1], EOMONTH([o].[OrderDate])) = 1 THEN [od].[CommissionableVolume] ELSE 0 END ) AS 'Mth2 Sales'
 , SUM(CASE WHEN DATEDIFF(MONTH , [c].[Date1], EOMONTH([o].[OrderDate])) = 2 THEN [od].[CommissionableVolume] ELSE 0 END ) AS 'Mth3 Sales'
 , SUM(CASE WHEN DATEDIFF(MONTH , [c].[Date1], EOMONTH([o].[OrderDate])) = 3 THEN [od].[CommissionableVolume] ELSE 0 END ) AS 'Mth4 Sales'
 , SUM(CASE WHEN DATEDIFF(MONTH , [c].[Date1], EOMONTH([o].[OrderDate])) = 4 THEN [od].[CommissionableVolume] ELSE 0 END ) AS 'Mth5 Sales'
 , SUM(CASE WHEN DATEDIFF(MONTH , [c].[Date1], EOMONTH([o].[OrderDate])) = 5 THEN [od].[CommissionableVolume] ELSE 0 END ) AS 'Mth6 Sales'
 , SUM([od].[CommissionableVolume]) AS 'Sales Volume'
 
 FROM [dbo].[Customers] [c]
 LEFT JOIN [dbo].[PeriodVolumes] [pv] ON [c].[CustomerID] = [pv].[CustomerID] AND [pv].[PeriodTypeID] = 20
 LEFT JOIN [dbo].[Periods] [bep1] ON [bep1].[PeriodID] = [pv].[Volume43] AND [bep1].[PeriodTypeID] = 20 AND [pv].[Volume43] > 0
 LEFT JOIN [dbo].[Periods] [bep2] ON [bep2].[PeriodID] = [pv].[Volume44] AND [bep2].[PeriodTypeID] = 20 AND [pv].[Volume44] > 0
 LEFT JOIN [dbo].[Periods] [bep3] ON [bep3].[PeriodID] = [pv].[Volume45] AND [bep3].[PeriodTypeID] = 20 AND [pv].[Volume45] > 0
 LEFT JOIN [dbo].[Periods] [bep4] ON [bep4].[PeriodID] = [pv].[Volume46] AND [bep4].[PeriodTypeID] = 20 AND [pv].[Volume46] > 0
 INNER JOIN [dbo].[Orders] [o] ON REPLACE(REPLACE([o].[Other11],CHAR(9),''),' ','') = [c].[CustomerID] 
 INNER JOIN [dbo].[OrderDetails] [od] ON [od].[OrderID] = [o].[OrderID]
 WHERE (1=1)
 
 AND ([c].[Field3] = @MarketId OR @MarketId = 'All')
 AND [c].[Date1] BETWEEN @StartDate AND @EndDate
 AND [c].[CustomerID] = '3009737'
 
 GROUP BY [c].[CustomerID]
 ,[o].[OrderID]
 , [c].[FirstName]
 , [c].[LastName]
 , [c].[Field3]
 , FORMAT([c].[Date1], 'dd/MM/yyyy')
 
 ORDER BY FORMAT([c].[Date1], 'dd/MM/yyyy') ASC
 ,[c].[CustomerID] ASC
Rick James
80.7k5 gold badges52 silver badges119 bronze badges
asked Sep 5, 2022 at 11:39
4
  • Please consider following these suggestions. Commented Sep 5, 2022 at 11:42
  • Your JOINs and / or GROUP BY clause are incorrect, causing the increased number of rows. Without seeing the Table definitions, and example data for each, it's hard to tell exactly what part. Comment out the GROUP BY, all JOINs and the SELECT list. Then only add in the key field of each Table as you go, and uncomment a single JOIN one by one until you start to see the data duplicated unexpectedly. Then you've found your problem. If the raw data appears correct after adding back all the JOINs, then you have just an incorrect GROUP BY clause and need to look closer at it. Commented Sep 5, 2022 at 12:33
  • 2
    Your code is MS SQL-specific, and it will fail in MySQL. Check your DBMS carefully. Commented Sep 5, 2022 at 16:58
  • To avoid JOIN multiplying you must aggregate single table (or minimal tableset) in CTE/subquery then join another tables. Commented Sep 5, 2022 at 16:59

1 Answer 1

4

One rule to keep in mind: JOINs are done before aggregation. That is,...

  1. All JOINing is performed; a temporary table is created. If some of the JOINs represent 1:many relationships, the number of rows may be larger than any of the original tables.
  2. Rows that don't match the WHERE are tossed.
  3. Now the GROUP BY is applied. The SUM() is applied to the rows in that temp table. Because of any 1:many relationships, some number may show up multiple times, hence inflating the "sum".

Usually, the solution is to do aggregations (and GROUP BYs) on the table as a separate step. Then JOIN to the rest.

SELECT ..., sumb, ...
 FROM ( SELECT a, SUM(b) sumb FROM x GROUP BY a ) AS y
 JOIN ... ON y.a = ...
answered Sep 5, 2022 at 17:48

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.