I have a transaction table; let's call it the Offers table. Please don't read too much into the business side, but let's say a customer wants to buy a car and searches a variety of websites. For simplicity's sake, assume they search only 4 websites: WebsiteA, WebsiteB, WebsiteC and WebsiteD.
The Offers table holds all the search results for each customer: which car the customer was searching for, what price each website offered, and so on.
In our case, a customer searches for only one car per day. Each customer is interested in only one particular car and looks for better offers on those 4 websites multiple times a month.
Let's say my customer XYZ is only interested in the Audi A7, and on Feb 1st, 2024 he searched for its price on the 4 websites WebsiteA, WebsiteB, WebsiteC and WebsiteD.
WebsiteA offered the car for 30,000, WebsiteB for 30,500, WebsiteC for 31,000 and WebsiteD for 30,532.
We can see that WebsiteA offered the cheapest price of the 4 websites for the Audi A7.
Let's say that on the same day another customer, PQR, who is also interested in the Audi A7, searches the same 4 websites, and they offer 29,990, 30,050, 30,250 and 30,330 respectively. Again WebsiteA offered the cheapest price of the 4 websites for the Audi A7 for customer PQR.
The third customer, ABC, is also interested in the Audi A7 and searched the same 4 websites on the same day; she got 29,200, 28,600, 29,000 and 30,000 respectively. In this instance WebsiteB offered the cheapest price.
Now I need to calculate, for that particular day, how many times each website offered the cheapest price for the Audi A7.
From the 3 examples above, WebsiteA offered the cheapest price twice and WebsiteB once, while WebsiteC and WebsiteD never offered the cheapest price.
Let me expand the table with a few more examples and show the desired output.
Customer | SearchDate | CarName | Website | CarPrice |
---|---|---|---|---|
XYZ | 02-01-2024 | Audi A7 | WebsiteA | 30000 |
XYZ | 02-01-2024 | Audi A7 | WebsiteB | 30500 |
XYZ | 02-01-2024 | Audi A7 | WebsiteC | 31000 |
XYZ | 02-01-2024 | Audi A7 | WebsiteD | 30532 |
PQR | 02-01-2024 | Audi A7 | WebsiteA | 29990 |
PQR | 02-01-2024 | Audi A7 | WebsiteB | 30050 |
PQR | 02-01-2024 | Audi A7 | WebsiteC | 30250 |
PQR | 02-01-2024 | Audi A7 | WebsiteD | 30330 |
ABC | 02-01-2024 | Audi A7 | WebsiteA | 29200 |
ABC | 02-01-2024 | Audi A7 | WebsiteB | 28600 |
ABC | 02-01-2024 | Audi A7 | WebsiteC | 29000 |
ABC | 02-01-2024 | Audi A7 | WebsiteD | 30000 |
XYZ | 02-02-2024 | Audi A7 | WebsiteA | 28000 |
XYZ | 02-02-2024 | Audi A7 | WebsiteB | 27500 |
XYZ | 02-02-2024 | Audi A7 | WebsiteC | 26000 |
XYZ | 02-02-2024 | Audi A7 | WebsiteD | 26500 |
PQR | 02-02-2024 | Audi A7 | WebsiteA | 29000 |
PQR | 02-02-2024 | Audi A7 | WebsiteB | 28800 |
PQR | 02-02-2024 | Audi A7 | WebsiteC | 28500 |
PQR | 02-02-2024 | Audi A7 | WebsiteD | 28300 |
ABC | 02-02-2024 | Audi A7 | WebsiteA | 26998 |
ABC | 02-02-2024 | Audi A7 | WebsiteB | 27615 |
ABC | 02-02-2024 | Audi A7 | WebsiteC | 26999 |
ABC | 02-02-2024 | Audi A7 | WebsiteD | 27599 |
In my original table I have many more fields, products and customers, but for simplicity's sake I have provided 2 days of data for 3 customers and a single product.
Based on this offers data I need to load a reporting table with the following information.
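For anyone who wants to reproduce the sample data, a minimal setup script might look like the sketch below. The table name `Offers` and all column types are assumptions made for illustration; the real table has more fields and may use different types.

```sql
-- Hypothetical schema: the column types are assumed, not taken from the real table.
CREATE TABLE Offers (
    Customer   varchar(10) NOT NULL,
    SearchDate date        NOT NULL,
    CarName    varchar(50) NOT NULL,
    Website    varchar(50) NOT NULL,
    CarPrice   int         NOT NULL
);

-- Dates use the unseparated yyyymmdd form, which SQL Server parses
-- the same way regardless of language or DATEFORMAT settings.
INSERT INTO Offers (Customer, SearchDate, CarName, Website, CarPrice)
VALUES
    ('XYZ', '20240201', 'Audi A7', 'WebsiteA', 30000),
    ('XYZ', '20240201', 'Audi A7', 'WebsiteB', 30500),
    ('XYZ', '20240201', 'Audi A7', 'WebsiteC', 31000),
    ('XYZ', '20240201', 'Audi A7', 'WebsiteD', 30532),
    ('PQR', '20240201', 'Audi A7', 'WebsiteA', 29990),
    ('PQR', '20240201', 'Audi A7', 'WebsiteB', 30050),
    ('PQR', '20240201', 'Audi A7', 'WebsiteC', 30250),
    ('PQR', '20240201', 'Audi A7', 'WebsiteD', 30330),
    ('ABC', '20240201', 'Audi A7', 'WebsiteA', 29200),
    ('ABC', '20240201', 'Audi A7', 'WebsiteB', 28600),
    ('ABC', '20240201', 'Audi A7', 'WebsiteC', 29000),
    ('ABC', '20240201', 'Audi A7', 'WebsiteD', 30000),
    ('XYZ', '20240202', 'Audi A7', 'WebsiteA', 28000),
    ('XYZ', '20240202', 'Audi A7', 'WebsiteB', 27500),
    ('XYZ', '20240202', 'Audi A7', 'WebsiteC', 26000),
    ('XYZ', '20240202', 'Audi A7', 'WebsiteD', 26500),
    ('PQR', '20240202', 'Audi A7', 'WebsiteA', 29000),
    ('PQR', '20240202', 'Audi A7', 'WebsiteB', 28800),
    ('PQR', '20240202', 'Audi A7', 'WebsiteC', 28500),
    ('PQR', '20240202', 'Audi A7', 'WebsiteD', 28300),
    ('ABC', '20240202', 'Audi A7', 'WebsiteA', 26998),
    ('ABC', '20240202', 'Audi A7', 'WebsiteB', 27615),
    ('ABC', '20240202', 'Audi A7', 'WebsiteC', 26999),
    ('ABC', '20240202', 'Audi A7', 'WebsiteD', 27599);
```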
SearchDate | CarName | Website | NumberOfWins |
---|---|---|---|
02-01-2024 | Audi A7 | WebsiteA | 2 |
02-01-2024 | Audi A7 | WebsiteB | 1 |
02-01-2024 | Audi A7 | WebsiteC | 0 |
02-01-2024 | Audi A7 | WebsiteD | 0 |
02-02-2024 | Audi A7 | WebsiteA | 1 |
02-02-2024 | Audi A7 | WebsiteB | 0 |
02-02-2024 | Audi A7 | WebsiteC | 1 |
02-02-2024 | Audi A7 | WebsiteD | 1 |
I need to incrementally load the reporting table (i.e. the output table).
I am using SQL Server 2016.
I have written the following SQL query in 2 steps.
```sql
WITH Ranked_DATA AS (
    SELECT SearchDate
         , Website
         , CarName
         , ROW_NUMBER() OVER (
               PARTITION BY Customer, SearchDate, CarName
               ORDER BY CarPrice
           ) AS RN
    FROM Offers
)
SELECT SearchDate, Website, CarName, COUNT(*)
FROM Ranked_DATA
WHERE RN = 1
GROUP BY SearchDate, Website, CarName
```
This T-SQL returns what I need; however, I wanted to see if there's a better way of writing it. Can we condense the SQL into one single step? I had to rank the rows first in the CTE and then filter the rank-1 rows and count them. Are there any better alternatives?
- Removing "all of the words" is very much not a good idea. I for one appreciate the context, and if I didn't I could just skip past it. – Reinderien, Mar 6, 2024 at 3:37
- @Cody your query doesn't produce the results you say you require, so how can it be working? – Dale K, Mar 6, 2024 at 6:05
2 Answers
The `ROW_NUMBER` in the CTE method is probably the most appropriate as you're also requiring the 0s to be shown in the results table. It has the advantage of a single read through the data. It does do some extra computation (e.g., ranking 2, 3, and 4) but this is not a large overhead for small numbers.
For the results to include the 0s, I would modify the second/outer `SELECT` to have a `CASE` expression rather than `COUNT` + filtering out via `WHERE`.
```sql
WITH Ranked_DATA AS (
    SELECT SearchDate
         , Website
         , CarName
         , ROW_NUMBER() OVER (
               PARTITION BY Customer, SearchDate, CarName
               ORDER BY CarPrice
           ) AS RN
    FROM Offers
)
SELECT SearchDate, Website, CarName,
       SUM(CASE WHEN RN = 1 THEN 1 ELSE 0 END) AS NumWins
FROM Ranked_DATA
GROUP BY SearchDate, Website, CarName
ORDER BY SearchDate, Website, CarName;
```
See this db<>fiddle for comparison.
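Since the question mentions loading the reporting table incrementally, one possible pattern is to rebuild only the dates that may have changed. This is a rough sketch, not a tested load process: the reporting table name `OffersReport` and the `@LoadFrom` cut-off are hypothetical, and it assumes new offers only ever arrive for dates on or after that cut-off. Because `SearchDate` is part of the `PARTITION BY`, filtering on whole dates does not split any partition.

```sql
-- Hypothetical names: OffersReport is the reporting table; @LoadFrom is the
-- earliest SearchDate that may have received new Offers rows since the last load.
DECLARE @LoadFrom date = '20240202';

SET XACT_ABORT ON;   -- roll the whole transaction back if any statement fails
BEGIN TRANSACTION;

-- Remove the report rows for the affected dates so they can be rebuilt.
DELETE FROM OffersReport
WHERE SearchDate >= @LoadFrom;

-- Recompute the win counts for those dates only.
WITH Ranked_DATA AS (
    SELECT SearchDate
         , Website
         , CarName
         , ROW_NUMBER() OVER (
               PARTITION BY Customer, SearchDate, CarName
               ORDER BY CarPrice
           ) AS RN
    FROM Offers
    WHERE SearchDate >= @LoadFrom
)
INSERT INTO OffersReport (SearchDate, CarName, Website, NumberOfWins)
SELECT SearchDate, CarName, Website,
       SUM(CASE WHEN RN = 1 THEN 1 ELSE 0 END)
FROM Ranked_DATA
GROUP BY SearchDate, CarName, Website;

COMMIT TRANSACTION;
```

Deleting and reinserting whole dates keeps the logic simple; if late-arriving rows for older dates are possible, the cut-off would need to reach back far enough to cover them.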
For illustration purposes I show that you don't necessarily need to use your windowing function, which generates one row for every row of the input; the intermediate table can be a `min()` aggregate with only one row per group. This approach does need a `join`, though, so it's dubious that there would be any performance benefit.
```sql
select outputs.SearchDate, outputs.Website, outputs.CarName,
    sum(case
        when MinCarPrice=CarPrice then 1 else 0
    end) as NumberOfWins
from Offers as outputs
left join (
    select Customer, SearchDate, CarName, min(CarPrice) as MinCarPrice
    from Offers
    group by Customer, SearchDate, CarName
) minima on outputs.Customer = minima.Customer
    and outputs.SearchDate = minima.SearchDate
    and outputs.CarName = minima.CarName
group by outputs.SearchDate, outputs.Website, outputs.CarName;
```
SearchDate | Website | CarName | NumberOfWins |
---|---|---|---|
2024-02-01 | WebsiteA | Audi A7 | 2 |
2024-02-01 | WebsiteB | Audi A7 | 1 |
2024-02-01 | WebsiteC | Audi A7 | 0 |
2024-02-01 | WebsiteD | Audi A7 | 0 |
2024-02-02 | WebsiteA | Audi A7 | 1 |
2024-02-02 | WebsiteB | Audi A7 | 0 |
2024-02-02 | WebsiteC | Audi A7 | 1 |
2024-02-02 | WebsiteD | Audi A7 | 1 |
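One behavioural difference is worth keeping in mind: the join on `min()` counts every website tied at the lowest price as a winner, whereas `ROW_NUMBER` picks exactly one of the tied rows, and which one is not guaranteed. The sample data has no ties, so both give the same result here. If ties should count for every tied website while keeping the window-function form, a sketch using `RANK()` (my substitution, not part of the original query) could look like this:

```sql
WITH Ranked_DATA AS (
    SELECT SearchDate
         , Website
         , CarName
         -- RANK() gives every row tied at the lowest price the value 1,
         -- unlike ROW_NUMBER(), which numbers tied rows 1, 2, ...
         , RANK() OVER (
               PARTITION BY Customer, SearchDate, CarName
               ORDER BY CarPrice
           ) AS RK
    FROM Offers
)
SELECT SearchDate, Website, CarName,
       SUM(CASE WHEN RK = 1 THEN 1 ELSE 0 END) AS NumberOfWins
FROM Ranked_DATA
GROUP BY SearchDate, Website, CarName
ORDER BY SearchDate, Website, CarName;
```

Which behaviour is right depends on how the business wants to treat a shared lowest price, so it is worth deciding that explicitly rather than leaving it to the choice of function.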