Grouping work hours for consecutive days

Question 1

Problem Description

The problem was initially posted here: Parse and clean up text block of store hours in Python.

Given the following multi-line string:

"""Hours
Monday 9:30 AM - 9:00 PM
Tuesday 9:30 AM - 9:00 PM
Wednesday 9:30 AM - 9:00 PM
Thursday 9:30 AM - 9:00 PM
Friday 9:30 AM - 11:00 PM
Saturday 9:30 AM - 11:00 PM
Sunday 11:00 AM - 6:00 PM
Holiday Hours
Thanksgiving Day 11:00 AM - 6:00 PM"""

Group the consecutive weekdays by the work hours, producing the following output:

"""Mon-Thu 9:30AM-9:00PM 
Fri-Sat 9:30AM-11:00PM
Sun & Hol 11:00AM-6:00PM"""

Solution

My approach is based on parsing the weekday lines, splitting by the first space and then using itertools.groupby() to sort by the second item of every row, using the groupby's implementation detail - it would only group the consecutive matches together:

from itertools import groupby
from operator import itemgetter
data = """Hours
Monday 9:30 AM - 9:00 PM
Tuesday 9:30 AM - 9:00 PM
Wednesday 9:30 AM - 9:00 PM
Thursday 9:30 AM - 9:00 PM
Friday 9:30 AM - 11:00 PM
Saturday 9:30 AM - 11:00 PM
Sunday 11:00 AM - 6:00 PM
Holiday Hours
Thanksgiving Day 11:00 AM - 6:00 PM"""
# filter relevant rows with weekdays only
rows = [row.split(" ", 1) for row in data.splitlines()[1:-2]]
# group consecutive days by a time range
result = []
for time_range, group in groupby(rows, key=itemgetter(1)):
 days_in_group = [item[0] for item in group]
 first_day, last_day = days_in_group[0][:3], days_in_group[-1][:3]
 range_end = "-" + str(last_day) if first_day != last_day else ""
 result.append("{begin}{end} {time_range}".format(begin=first_day,
 end=range_end,
 time_range=time_range))
print("\n".join(result))

Is this the most optimal solution to the problem? Is it Pythonic? What would you improve code quality-wise?

Question 2

The & Hol part of the output is difficult to infer programmatically, and in fact, you haven't done it.

Question 3

Obtaining & Hol in the output is quite difficult to achieve programmatically in any reasonable way, without hard-coding special cases. I see that you have not attempted to satisfy that part of the request. So, the specification is open to interpretation. I would say that a reasonable approach is to try to parse the opening hours if the line looks like a day of the week, and pass through the output if the line contains anything else.

For the parsing, I would use regular expressions. If you use named capture groups, you can avoid mysterious expressions like .split(), itemgetter(1), and days_in_group[0][:3].

Instead of .append(), I'd write it as a generator function.

from itertools import groupby
from operator import itemgetter
import re
def group_by_hours(s):
 REGEX = re.compile(r'^(?P<day_abbrev>\w{3})\w*day (?P<hours>.* - .*)|'
 r'^(?P<misc>.*)', re.MULTILINE)
 for hours, lines in groupby(REGEX.finditer(s), itemgetter('hours')):
 lines = list(lines)
 if not hours:
 for line in lines:
 yield line['misc']
 elif len(lines) == 1:
 yield "{0} {1}".format(lines[0]['day_abbrev'], hours)
 elif len(lines) > 1:
 yield "{0}-{1} {2}".format(lines[0]['day_abbrev'],
 lines[-1]['day_abbrev'],
 hours)
print('\n'.join(group_by_hours(data)))

Question 4

I particularly like using the named groups for both the days and abbreviated days, switching to a generator and, of course, overall, handling the other parts of the data string. Thanks, learned new ways of thinking about the code.

Question 5

magic numbers

In

rows = [row.split(" ", 1) for row in data.splitlines()[1:-2]]

the 1 & -2 magic numbers are a bit of a wart. Perhaps you'd like to create a generator that knows to suppress 'Hours' and that won't yield any further lines after 'Holiday Hours'.

Also,

first_day, last_day = days_in_group[0][:3], days_in_group[-1][:3]

has some magic numbers, but arguably they make more sense. You could make your generator responsible for day parsing, with yield day[:3], time_range.

variable renaming

In your string formatting, you choose to rename first_day to begin, and range_end to end. I suppose my advice is "don't do that". Or if there's a name you prefer for those variables, use that consistently throughout.

In

days_in_group = [item[0] for item in group]

the identifier item is a bit vague. pair would be better (or perhaps day_range if you lean toward the verbose).

global scope

You polluted the top-level namespace with rows and other identifiers. Your file cannot be reused via import. Please use the

if __name__ == '__main__':

idiom to bury these lines of code under def main(): or similar.

Overall, yes, this is sensible code, put together in pythonic fashion.

200_success 200_success 146k22 gold badges190 silver badges479 bronze badges · Accepted Answer · 2017-09-09 00:06:55Z

Obtaining & Hol in the output is quite difficult to achieve programmatically in any reasonable way, without hard-coding special cases. I see that you have not attempted to satisfy that part of the request. So, the specification is open to interpretation. I would say that a reasonable approach is to try to parse the opening hours if the line looks like a day of the week, and pass through the output if the line contains anything else.

For the parsing, I would use regular expressions. If you use named capture groups, you can avoid mysterious expressions like .split(), itemgetter(1), and days_in_group[0][:3].

Instead of .append(), I'd write it as a generator function.

from itertools import groupby
from operator import itemgetter
import re
def group_by_hours(s):
 REGEX = re.compile(r'^(?P<day_abbrev>\w{3})\w*day (?P<hours>.* - .*)|'
 r'^(?P<misc>.*)', re.MULTILINE)
 for hours, lines in groupby(REGEX.finditer(s), itemgetter('hours')):
 lines = list(lines)
 if not hours:
 for line in lines:
 yield line['misc']
 elif len(lines) == 1:
 yield "{0} {1}".format(lines[0]['day_abbrev'], hours)
 elif len(lines) > 1:
 yield "{0}-{1} {2}".format(lines[0]['day_abbrev'],
 lines[-1]['day_abbrev'],
 hours)
print('\n'.join(group_by_hours(data)))

I particularly like using the named groups for both the days and abbreviated days, switching to a generator and, of course, overall, handling the other parts of the data string. Thanks, learned new ways of thinking about the code.

Stack Exchange Network

Grouping work hours for consecutive days

Problem Description

Solution

2 Answers 2

magic numbers

variable renaming

global scope

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Grouping work hours for consecutive days

Problem Description

Solution

2 Answers 2

magic numbers

variable renaming

global scope

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions