I have been handed an (ancient) database in SQLServer format containing rental records. These records can be modified by the customer while the rental is still in flight, if they want the drill for another 24 hours for instance. To indicate this, a new row is written to the database covering that additional period, and a pointer is left in the original record to say "this one modified me". The result is something like this:
pk custId prodId overrideByPk startDate endDate
----------------------------------------------------------
1 1 1 0 1-1-2000 2-1-2000
2 1 1 3 1-2-2000 2-2-2000
3 1 1 4 3-2-2000 4-2-2000
4 1 1 0 4-2-2000 5-2-2000
These records represent two rental terms. The first ran for a single day in January, and the second ran from the 1st to the 5th of February. My goal is to produce output like this...
pk custId prodId startDate endDate
--------------------------------------------
1 1 1 1-1-2000 2-1-2000
2 1 1 1-2-2000 5-2-2000
It seems, based on the data I've looked at so far, that a record with a overrideByPk=0
is the last record in a string, even in a string of one row. Note that my example has separate date periods, but that's only for clarity. The customer may indeed return the item and the pick it back up the same day, so you can't look for discontinuities in the dates.
I can do this in C# code by getting all the rows with zero, and then looking for those overrideByPk
's repeatedly, but there are thousands of these and I am concerned about performance and code complexity. I seem to recall solving this in SQL in the past, but can no longer remember the trick (if it existed).
1 Answer 1
If you are using SQL 2012 as your tag indicates, then you can do this with a recursive CTE, although it's a little different since you find the child first and then the parents. But Take a look at the sample below.
The Setup
DECLARE @Rentals TABLE
(
PK INT NOT NULL
, CustID INT NOT NULL
, ProdID INT NOT NULL
, OverrideByPK INT NOT NULL
, StartDate DATE NOT NULL
, EndDate DATE NOT NULL
);
INSERT INTO @Rentals
( PK
, CustID
, ProdID
, OverrideByPK
, StartDate
, EndDate
)
VALUES ( 1,1,1,0,'1/1/2000','2/1/2000')
, ( 2,1,1,3,'1/2/2000','2/2/2000')
, ( 3,1,1,4,'3/2/2000','4/2/2000')
, ( 4,1,1,0,'4/2/2000','5/2/2000');
The Query
WITH CTE_Rentals AS
(
SELECT R.PK AS PK
, R.OverrideByPK AS NextPK
, R.PK AS OldestPK
FROM @Rentals AS R
WHERE R.OverrideByPK = 0
UNION ALL
SELECT R.PK AS LastPK
, R.overrideByPK
, CR.OldestPK
FROM @Rentals AS R
INNER JOIN CTE_Rentals AS CR ON R.OverrideByPK = CR.PK
WHERE R.OverrideByPK <> 0
)
, CTE_Single AS
(
SELECT C.OldestPK AS LastPK
, MIN(C.PK) AS FirstPK
FROM CTE_Rentals AS C
GROUP BY C.OldestPK
)
SELECT S.FirstPK
, FirstRental.CustID
, FirstRental.ProdID
, FirstRental.StartDate
, LastRental.EndDate
FROM CTE_Single AS S
INNER JOIN @Rentals AS FirstRental ON FirstRental.PK = S.FirstPK
INNER JOIN @Rentals AS LastRental ON LastRental.PK = S.LastPK;