I have a table that looks like this:
SKU DateOut DateIn Status
123 1/2/2015 5/2/2016 1
123 10/2/2016 18/2/2016 1
123 20/2/2016 0
Status 0 means no more rows for the given SKU.
I need to find out number of days online/stocked which would be generated by taking difference of DateIN
of a row and dateout
of the last row with in past 30 days from today.
Desired results in this case:
Let's say today is 7/3/2016
and I am looking for past 30 days i.e until date 7/2/2016
so number of days online would be
(20/2/2016 - 18/2/2016) + (10/2/2016 - 7/2/2016)
2+3
-->5 days
I also need dates for which it was online.
fromdate todate
18/2/2016 20/2/2016
7/2/2016 10/2/2016
dateout means got offline(out of stock) this day and datein means stocked in back.
status 0 means never stocked again and 1 means there would be a next row.
how to do it in sql?
Also, number of days we are looking at should be in past 30 days only. I tried using case statement but none could really do all with it.
Edit: This is what I had tried from my end but of course it does not give me the desired results and misses out on some cases like when data was stocked multiple times within past 30days.
Create TEMPORARY TABLE Temp.NumberOfDaysOnline AS
SELECT so.sku as sku,
CASE
WHEN so.datein IS NULL AND so.status = 0 AND so.dateout IS NOT NULL AND so.dateout AND so.dateout < DATE_SUB(current_date(), 30) THEN 0
WHEN so.datein IS NULL AND so.status = 0 AND so.dateout IS NOT NULL AND so.dateout >= DATE_SUB(current_date(), 30) THEN DATEDIFF(so.dateout, DATE_SUB(current_date(), 30))
WHEN so.datein IS NOT NULL AND so.status = 1 AND so.dateout < DATE_SUB(current_date(), 30) AND so.datein >= DATE_SUB(current_date(), 30) THEN DATEDIFF(so.datein, DATE_SUB(current_date(), 30))
WHEN so.datein IS NOT NULL AND so.status = 1 AND so.dateout >= DATE_SUB(current_date(), 30) AND so.datein >= DATE_SUB(current_date(), 30) THEN DATEDIFF(so.datein, so.dateout)
ELSE
0
END AS DaysOnline,
CASE
WHEN so.datein IS NULL AND so.status = 0 AND so.dateout < DATE_SUB(current_date(), 30) THEN NULL
WHEN so.datein IS NULL AND so.status = 0 AND so.dateout >= DATE_SUB(current_date(), 30) THEN DATE_SUB(current_date(), 30)
WHEN so.datein IS NOT NULL AND so.status = 1 AND so.dateout < DATE_SUB(current_date(), 30) AND so.datein >= DATE_SUB(current_date(), 30) THEN DATE_SUB(current_date(), 30)
WHEN so.datein IS NOT NULL AND so.status = 1 AND so.dateout >= DATE_SUB(current_date(), 30) AND so.datein >= DATE_SUB(current_date(), 30) THEN so.dateout
ELSE
0
END AS FromDate,
CASE
WHEN so.datein IS NULL AND so.status = 0 AND so.dateout < DATE_SUB(current_date(), 30) THEN 0
WHEN so.datein IS NULL AND so.status = 0 AND so.dateout >= DATE_SUB(current_date(), 30) THEN so.dateout
WHEN so.datein IS NOT NULL AND so.status = 1 AND so.dateout < DATE_SUB(current_date(), 30) AND so.datein >= DATE_SUB(current_date(), 30) THEN so.datein
WHEN so.datein IS NOT NULL AND so.status = 1 AND so.dateout >= DATE_SUB(current_date(), 30) AND so.datein >= DATE_SUB(current_date(), 30) THEN so.datein
ELSE
0
END AS ToDate,
FROM
Erp.soldOut so
WHERE
NOT(
so.status = 1
AND
so.datein < DATE_SUB(current_date(), 30)
AND
so.dateout < DATE_SUB(current_date(), 30)
)
Edit: Another example:
SKU DateOut DateIn Status
245 15/5/2016 19/5/2016 1
245 30/5/2016 3/6/2016 1
278 2/6/2016 5/6/2016 1
245 10/6/2016 18/6/2016 1
278 10/6/2016 0
Now in this table, SKU 245 was brought into stock on 19th of May first.
It was in stock from 19/5/2016
to 30/5/2016
. and the cycle continues until 10th on which it got out of stock and never stocked in back (since the status is 0 and dateIN is null)
Now let' say we are looking at data only until 4/6/2016 considering that it it the 30th day back in time,
Number of days online for past 30 days (until 4/6/2016 considering that it it the 30th day back in time) for SKU 245 would be
- DATEDIFF(10/6/16 - 4/6/2016)
- DATEDIFF(21/6/16 - 18/6/2016)
//TAKING 4/6/2016
and not 3/6/2016
since we are looking at data until 4/6/2016 and not before that so even if it was in stock on 3/6/2016
as well which is when it got into stock, it does not matter to us.
i.e
DATEDIFF(10/6/16 - 4/6/2016)
+
DATEDIFF(21/6/16 - 18/6/2016)
= 6 + 3
= 9 days
.
SKU 245
was stocked In on 18/6/2016
and status 1
means it is in stock ever since. Never got out of stock since then and number of days online for this would be 30
(since it online for all the 30 days
we are looking at if it never got out of stock.
-
Comments are not for extended discussion; this conversation has been moved to chat.Paul White– Paul White ♦2017年08月29日 10:41:51 +00:00Commented Aug 29, 2017 at 10:41
1 Answer 1
To address this problem, I did the following (I'm using PostgreSQL for this, but SQL Server has CTE
s and RECURSIVE CTE
s and window/analytic functions. I doubt if this can be done in MySQL - maybe using variables - not sure). PostgreSQL is as powerful as the big boys - MySQL is really a toy!
Created a table:
CREATE TABLE stock (sku int, datein date, dateout date, status int);
Populated it thus:
INSERT INTO stock VALUES (123, '2015-02-01', '2015-02-05', 1);
INSERT INTO stock VALUES (123, '2015-02-10', '2015-02-10', 1);
INSERT INTO stock VALUES (123, '2015-02-20', NULL, 0);
Then, I constructed the first part of my query. The actual dates to and from will obviously vary depending on the requirements - I assumed that (from the sample data), the OP wanted to look at the month of February. I constructed the following RECURSIVE CTE
to construct a list of the days in February - PostgreSQL has the nifty generate_series function, but that's proprietary and I wanted to make my query portable (to the greatest extent possible). This is the best explanation of RCTE
s I came across.
WITH RECURSIVE dates (test_date) AS
(
SELECT '2015-02-01'::DATE
UNION ALL
SELECT test_date + 1 FROM dates
WHERE test_date < '2015-02-28'
)
Now, I thought that the organisation of the raw data was poor, so I used CTE
s to rearrange the data.
Get the outdates from the raw data:
out_dates AS
(
SELECT dateout AS dout FROM stock
)
Get the indates from the raw data:
in_dates AS
(
SELECT datein AS din FROM stock
)
Then I "transpose" the data - have to use two steps because one cannot use a Window function in a WHERE
clause
transpose AS
(
SELECT lag(dateout) over() AS datein, datein AS dateout FROM stock
-- WHERE lag(datein) over() IS NOT NULL AND dateout IS NOT NULL <<-- can't use here.
ORDER BY dateout, datein
)
And then
ranges AS
(
SELECT * FROM transpose
WHERE datein IS NOT NULL AND dateout IS NOT NULL
),
The result of this query is:
datein|dateout
2015年02月05日|2015年02月10日
2015年02月18日|2015年02月20日
Note that this corresponds to the in and out dates, but now the data is much simpler to handle because it's structure is more logical - it's chronologically coherent - thanks to the use of the LAG
window function.
Now, we bring it all together with the next query:
x AS
(
SELECT test_date, CASE WHEN has_match THEN 1 ELSE 0 END::int AS flag
FROM
(
SELECT test_date,
EXISTS
(
SELECT 1
FROM ranges
WHERE dates.test_date BETWEEN ranges.datein AND ranges.dateout
) AS has_match
FROM dates
) range_checks
)
SELECT * FROM x;
This gives a series of records for the month that are of this form:
test_date|flag
--------------
2015年02月01日|0
2015年02月02日|0
2015年02月03日|0
2015年02月04日|0
2015年02月05日|1
2015年02月06日|1
<other records snipped for brevity>
or use this as the final query
z AS
(
SELECT * FROM dates d
LEFT JOIN ranges r
ON (d.test_date BETWEEN r.datein AND r.dateout)
)
SELECT * FROM z;
for records of this form:
test_date|datein|dateout
2015年02月01日||
2015年02月02日||
2015年02月03日||
2015年02月04日||
2015年02月05日|2015年02月05日|2015年02月10日
2015年02月06日|2015年02月05日|2015年02月10日
<rest of the records snipped for brevity>