Calculate overlap of two datetime objects

Question 1

I would like to find out how much of a given event falls between 22:00 and 06:00, per day.

So if the event starts at 21:00 and ends at 23:59, the result would be 1:59. For a start at 22:00 and an end at 07:00 it would be 2:00 + 6:00 = 8:00 hours. So basically I want to calculate the night shift part per day.

I came up with the following implementation. It works, but I wonder if there is an easier way that uses less overhead or fewer lines of code.

from datetime import datetime, timedelta, time
from typing import Dict, Generator


def daterange(start_date: datetime, end_date: datetime) -> Generator[datetime, None, None]:
    for n in range(int((end_date - start_date).days)):
        yield start_date + timedelta(n)


def get_overlap(dt1: datetime, dt2: datetime) -> timedelta:
    # Check that the two datetimes are on the same day
    if dt1.date() != dt2.date():
        raise ValueError('Datetimes must be on the same day')
    # Calculate the start and end times of the two periods of interest
    early_shift_start = datetime.combine(dt1.date(), time(hour=0))
    early_shift_end = datetime.combine(dt1.date(), time(hour=6))
    late_shift_start = datetime.combine(dt1.date(), time(hour=22))
    late_shift_end = datetime.combine(dt1.date(), time(hour=23, minute=59, second=59))
    # Calculate the amount of overlap between the two periods
    overlap = max(min(dt2, early_shift_end) - max(dt1, early_shift_start), timedelta())
    overlap += max(min(dt2, late_shift_end) - max(dt1, late_shift_start), timedelta())
    return overlap


def split_on_midnight() -> Dict[str, float]:
    start_date = datetime(2023, 3, 1, 12, 30, 0)  # example start datetime
    end_date = datetime(2023, 3, 3, 16, 45, 0)  # example end datetime
    total_duration = timedelta()
    for date in daterange(start_date, end_date + timedelta(days=1)):
        midnight = datetime.combine(date, time.min)
        chunk_start = max(start_date, midnight)
        chunk_end = min(end_date, midnight + timedelta(days=1) - timedelta(seconds=1))
        if chunk_start < chunk_end:
            duration = get_overlap(chunk_start, chunk_end)
            total_duration += duration
            print(f"Chunk from {chunk_start} to {chunk_end}: {duration}")
    return {"total_duration": total_duration.total_seconds() / 3600}


print(split_on_midnight())

Question 2

Set up your code for ease of testing. This is useful both for your own debugging and when asking others for help or code review. Instead of writing an English paragraph giving us example inputs and expected outputs (or hardcoding a single example in your program), organize the entire program for ease of testing and experimentation, from the ground up. In any project of significance, I would do that with a testing tool (I like pytest, but there are other good options as well). In a simple situation like an online code review, including a main() method with a simple testing apparatus works fine, as shown below.

Speaking of testing, the example in your code has a bug. Given datetime(2023, 3, 1, 12, 30, 0) and datetime(2023, 3, 3, 16, 45, 0), your code returned 57598 seconds. But the answer should be 57600 seconds: 28800 + 28800 (two 8-hour night shifts, one for 3/1 to 3/2 and the other for 3/2 to 3/3). I did not try to track down the source of the bug, because my overall reaction to your code was the following: it looks like the code has been put together in a thoughtful way, but there is a lot of complexity in it, and my gut tells me there is a simpler way to address the problem (your question text makes me think that your gut was telling you something similar).
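One plausible source of the two missing seconds, offered as a guess rather than a full diagnosis, is the use of 23:59:59 as the end of the day: each late-evening window then comes out one second short, and the example spans two of them (March 1 and March 2). The sketch below compares that closed endpoint with a half-open interval ending at the next midnight. The helper name overlap and the variable names are illustrative and do not appear in the question.

```python
from datetime import datetime, time, timedelta


def overlap(a_start, a_end, b_start, b_end):
    """Length of the intersection of [a_start, a_end) and [b_start, b_end)."""
    return max(min(a_end, b_end) - max(a_start, b_start), timedelta())


day = datetime(2023, 3, 1)
next_midnight = day + timedelta(days=1)
late_start = datetime.combine(day, time(22))
almost_midnight = datetime.combine(day, time(23, 59, 59))

# End of day modelled as 23:59:59, as in the question: one second is lost.
closed = overlap(day, almost_midnight, late_start, almost_midnight)

# End of day modelled as the next midnight (half-open): the full two hours.
half_open = overlap(day, next_midnight, late_start, next_midnight)

print(closed, half_open)
```

With half-open intervals, consecutive chunks share an endpoint without double counting, so no one-second correction is needed anywhere.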

If something is a datetime, don't call it a date. You have a few variables that are misnamed in that fashion: start_date and end_date are the two most prominent examples. This may seem pedantic (indeed it is), but conceptual clarity is rewarded in programming in the form of fewer bugs, more readable code, and more efficient communication with others working on a project (this becomes even more true as projects grow in size and complexity). If it's a datetime, don't call it a date; if it's a dict, don't call it JSON (the latter is text); and vice versa. Often, one can achieve greater clarity simply with a different naming convention: for example, start and end are shorter names, completely clear in context, and don't imply anything incorrect about their underlying nature.
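A small illustration of the naming point, with made-up variable names: each name below says what the object actually is.

```python
import json
from datetime import date, datetime

start = datetime(2023, 3, 1, 12, 30)    # a datetime: calendar day plus time of day
start_date = start.date()               # a date: calendar day only

payload = {"start": start.isoformat()}  # a dict: an in-memory mapping
payload_json = json.dumps(payload)      # JSON: text that encodes the dict

print(type(start).__name__, type(start_date).__name__)
print(type(payload).__name__, type(payload_json).__name__)
```

A reader who sees start_date can then safely assume there is no time component, and one who sees payload_json knows it must be parsed before use.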

An alternative approach. My idea for this problem was to maintain a tiny list of datetimes. The list would initially contain the input start and end times, plus the upcoming start or end point for the next night shift. Sort the list in reverse order. Pop off the earliest datetime (the last item after the reverse sort). If the interval from the current start to the popped item lies within a night shift, add its duration to the total. As needed, add another night shift endpoint to the list and sort again. Stop when the tiny list no longer contains the input end time.

I started with some constants:

from datetime import datetime, timedelta, time
NIGHT_START_TIME = time(22, 0)
NIGHT_END_TIME = time(6, 0)

Then a function to take the input start and yield appropriate night shift endpoints in the future:

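def nightshift_endpoints_gen(start):
    start_date = start.date()
    combine = datetime.combine
    # If start falls between midnight and 06:00, the same-day night shift
    # end is still ahead of it, so yield that endpoint first.
    dt = combine(start_date, NIGHT_END_TIME)
    if dt > start:
        yield dt
    n = 0
    while True:
        dt = combine(start_date + timedelta(days = n), NIGHT_START_TIME)
        if dt > start:
            yield dt
        n += 1
        yield combine(start_date + timedelta(days = n), NIGHT_END_TIME)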

With those building blocks, computing the overall duration is not too bad:

def nightshift_duration(start, end):
    dts = [end, start]
    e = start
    gen = nightshift_endpoints_gen(start)
    tot = 0
    while end in dts:
        dts.append(next(gen))
        dts.sort(reverse = True)
        s, e = (e, dts.pop())
        start_time = s.time()
        if start_time >= NIGHT_START_TIME or start_time < NIGHT_END_TIME:
            tot += (e - s).total_seconds()
    return int(tot)

And a testing apparatus:

def main():
    TESTS = (
        # Example from your code.
        (
            datetime(2023, 3, 1, 12, 30, 0),
            datetime(2023, 3, 3, 16, 45, 0),
            16 * 3600,
        ),
        # Examples from your comments.
        (
            datetime(2020, 2, 28, 21, 0, 0),
            datetime(2020, 2, 28, 23, 59, 0),
            119 * 60,
        ),
        (
            datetime(2020, 2, 2, 22, 0, 0),
            datetime(2020, 2, 3, 7, 0, 0),
            8 * 3600,
        ),
        # A time span entirely within the night shift.
        (
            datetime(2020, 2, 3, 3, 0, 0),
            datetime(2020, 2, 3, 5, 5, 0),
            125 * 60,
        ),
    )
    for start, end, exp in TESTS:
        got = nightshift_duration(start, end)
        if got == exp:
            print(got, 'ok')
        else:
            print(got, exp)

if __name__ == '__main__':
    main()
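When the expected values themselves are in doubt, a deliberately naive reference implementation can serve as a cross-check. The following is a minimal sketch, not part of the approach above: it walks from start to end one minute at a time and counts the minutes that fall in the night shift. The name night_seconds_bruteforce and the step parameter are hypothetical, and the function assumes start and end fall on whole-minute boundaries.

```python
from datetime import datetime, time, timedelta

NIGHT_START = time(22, 0)
NIGHT_END = time(6, 0)


def night_seconds_bruteforce(start, end, step=timedelta(minutes=1)):
    """Count night-shift seconds in [start, end) by stepping through time.

    Assumes start and end are aligned to the step (whole minutes by default).
    """
    total = timedelta()
    t = start
    while t < end:
        # A step counts if it begins at or after 22:00, or before 06:00.
        if t.time() >= NIGHT_START or t.time() < NIGHT_END:
            total += step
        t += step
    return int(total.total_seconds())


print(night_seconds_bruteforce(datetime(2023, 3, 1, 12, 30), datetime(2023, 3, 3, 16, 45)))
print(night_seconds_bruteforce(datetime(2020, 2, 28, 21, 0), datetime(2020, 2, 28, 23, 59)))
```

It is far too slow for long time spans, but comparing its result with nightshift_duration(start, end) over many short, randomly chosen spans is a cheap way to look for edge cases.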

An alternative approach: solve directly. A different idea is to solve directly rather than iterating over all days from start to end. The idea here is to decompose the total duration into three chunks: (1) the partial night shift that start might be part of, (2) the partial night shift that end might be part of, and (3) the full night shift(s) that might sit in between all of that stuff.

# Five night shifts (equal signs), plus start and end (carets).
======== ======== ======== ======== ========
   ^                                    ^
# The chunks:
Chunk 1: start to the first night-shift-end.
Chunk 2: the last night-shift-start to end.
Chunk 3: three full night shifts.

To solve the problem along those lines, we could use a function that can take a datetime and return its closest "neighbor" to the right (future) or left (past), where a neighbor is a night shift startpoint or endpoint. We can use that function to compute the partial chunks (1 and 2) and to find reference points to compute the number of full night shifts in the middle chunk (we will always use night shift starts for that purpose). To handle all of that we can break down the situation into three possibilities:

# Notation
dt : datetime
NS : night shift startpoint
NE : night shift endpoint
|| : midnight
# Three possibilities:
NE    dt    NS     #1: dt is not in a night shift
NS    dt || NE     #2: dt in night shift, before midnight
NS || dt    NE     #3: dt in night shift, after midnight

Start with some constants:

from datetime import datetime, timedelta, time
NS = NIGHT_START_TIME = time(22, 0)
NE = NIGHT_END_TIME = time(6, 0)
NON_NIGHTSHIFT = 0
BEFORE_MIDNIGHT = 1
AFTER_MIDNIGHT = 2
LEFT, RIGHT = (0, 1)
SAME_DAY = timedelta(0)
NEXT_DAY = timedelta(1)
PREV_DAY = timedelta(-1)
NEIGHBOR_PARAMS = {
    # in_nightshift() : [LEFT neighbor, RIGHT neighbor]
    NON_NIGHTSHIFT  : [(SAME_DAY, NE), (SAME_DAY, NS)],
    BEFORE_MIDNIGHT : [(SAME_DAY, NS), (NEXT_DAY, NE)],
    AFTER_MIDNIGHT  : [(PREV_DAY, NS), (SAME_DAY, NE)],
}
SECONDS_PER_NIGHTSHIFT = 8 * 3600

Then two utility functions, one to determine a datetime's night shift status and another to return the "neighbor" to the right or left:

def in_nightshift(dt):
    t = dt.time()
    return (
        BEFORE_MIDNIGHT if t >= NIGHT_START_TIME else
        AFTER_MIDNIGHT if t < NIGHT_END_TIME else
        NON_NIGHTSHIFT
    )

def neighbor(dt, direction):
    params = NEIGHBOR_PARAMS[in_nightshift(dt)]
    td, time = params[direction]
    return datetime.combine(dt.date() + td, time)

And finally, our primary function and its immediate helper to find the relevant neighbor and associated chunk #1 or #2.

def nightshift_duration(start, end):
    ns1, chunk1 = get_nightshift_startpoint(start, True)
    ns2, chunk2 = get_nightshift_startpoint(end, False)
    chunk3 = (ns2 - ns1).days * SECONDS_PER_NIGHTSHIFT
    return int(chunk1 + chunk2 + chunk3)

def get_nightshift_startpoint(dt, is_start):
    if in_nightshift(dt):
        if is_start:
            dt2 = neighbor(dt, RIGHT)
            ns = neighbor(dt2, RIGHT)
        else:
            dt2 = ns = neighbor(dt, LEFT)
        chunk = abs((dt - dt2).total_seconds())
        return (ns, chunk)
    else:
        ns = neighbor(dt, RIGHT)
        return (ns, 0)
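To see how the three chunks combine when start and end sit inside the same night shift, here is the arithmetic for the 03:00 to 05:05 test case written out by hand. This is only a sketch: the neighbor datetimes are hard-coded rather than computed, and the variable names are illustrative. The point to notice is that the middle chunk comes out negative, which cancels the double counting in the two partial chunks.

```python
from datetime import datetime

SECONDS_PER_NIGHTSHIFT = 8 * 3600

start = datetime(2020, 2, 3, 3, 0)
end = datetime(2020, 2, 3, 5, 5)

# Right neighbor of start: the end of the shift that start is in.
ne_after_start = datetime(2020, 2, 3, 6, 0)
# Reference point for start: the next night shift start after that.
ns1 = datetime(2020, 2, 3, 22, 0)
# Left neighbor of end, also its reference point: the start of its shift.
ns2 = datetime(2020, 2, 2, 22, 0)

chunk1 = (ne_after_start - start).total_seconds()   # 3 h until the shift ends
chunk2 = (end - ns2).total_seconds()                # 7 h 5 min since the shift began
chunk3 = (ns2 - ns1).days * SECONDS_PER_NIGHTSHIFT  # -1 day, so minus one full shift

total = int(chunk1 + chunk2 + chunk3)
print(chunk1, chunk2, chunk3, total)
```

The two partial chunks together cover the whole 8-hour shift plus the requested span, and subtracting one full shift leaves exactly the 125 minutes between start and end.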

I suppose this approach is "better" in the sense that it can immediately compute any duration, even if start and end are millions of days apart [after typing this I looked up the max year supported by datetime, and it is shockingly low, so I guess Python has low expectations for humanity]. But it was definitely harder to think through all of the details of this approach. Whether the resulting code is easier or harder to understand than my first approach is a close call. In its first draft, the new approach was definitely harder to understand, but it got better as I added more constants, so I guess that's my final piece of advice: use constants and data structures to simplify coding logic and enhance readability.

FMc · Answer 1 · 2023-03-04 21:40:56Z

