Advent of Code 2023 - Day 19: Aplenty (Part 1)

Question 1

Description:

The task involves processing a set of workflows, each comprising rules. These rules define conditions based on part ratings and dictate the destination workflow if the conditions are satisfied. The objective is to compute the total rating of the parts that are eventually accepted after traversing the workflows.

The workflows operate on parts rated in four categories: x, m, a, and s. Parts initiate in the initial workflow named "in" and progress through the workflows based on rule conditions. The first rule matching a part's condition is immediately applied, directing the part to the specified destination. If a part is accepted (sent to A) or rejected (sent to R), the processing ceases for that particular part.

The provided input includes a list of workflows, succeeded by a blank line, and subsequently, the ratings of the parts. The task is to determine the sum of the ratings for all parts that ultimately receive acceptance.

For example:

px{a<2006:qkq,m>2090:A,rfg}
pv{a>1716:R,A}
lnx{m>1548:A,A}
rfg{s<537:gd,x>2440:R,A}
qs{s>3448:A,lnx}
qkq{x<1416:A,crn}
crn{x>2662:A,R}
in{s<1351:px,qqz}
qqz{s>2770:qs,m<1801:hdj,R}
gd{a>3333:R,R}
hdj{m>838:A,pv}
{x=787,m=2655,a=1222,s=2876}
{x=1679,m=44,a=2067,s=496}
{x=2036,m=264,a=79,s=2244}
{x=2461,m=1339,a=466,s=291}
{x=2127,m=1623,a=2188,s=1013}

The workflows are listed first, followed by a blank line, then the ratings of the parts the Elves would like you to sort. All parts begin in the workflow named in. In this example, the five listed parts go through the following workflows:

{x=787,m=2655,a=1222,s=2876}: in -> qqz -> qs -> lnx -> A
{x=1679,m=44,a=2067,s=496}: in -> px -> rfg -> gd -> R
{x=2036,m=264,a=79,s=2244}: in -> qqz -> hdj -> pv -> A
{x=2461,m=1339,a=466,s=291}: in -> px -> qkq -> crn -> R
{x=2127,m=1623,a=2188,s=1013}: in -> px -> rfg -> A

Ultimately, three parts are accepted. Adding up the x, m, a, and s rating for each of the accepted parts gives 7540 for the part with x=787, 4623 for the part with x=2036, and 6951 for the part with x=2127. Adding all of the ratings for all of the accepted parts gives the sum total of 19114.

#!/usr/bin/env python3
from pathlib import Path
from typing import Callable, Iterable
import typer
def parse_condition(condition: str) -> dict[str, str | int] | None:
 for op in {"<", ">"}:
 if op in condition:
 attribute, threshold = condition.split(op)
 return {"attribute": attribute, "operator": op, "threshold": int(threshold)}
 return None
def parse_rule(rule: str) -> dict[str, dict[str, str | int] | str | None]:
 rule_parts = rule.split(":")
 condition, destination = rule_parts if len(rule_parts) == 2 else (None, rule)
 return {
 "condition": parse_condition(condition) if condition else None,
 "destination": destination,
 }
def parse_workflow(
 workflow: str,
) -> dict[str, list[dict[str, dict[str, str | int] | str | None]]]:
 name, rules = workflow.split("{")
 return {name: [parse_rule(rule) for rule in rules.strip("}").split(",")]}
def parse_rating(rating: str) -> dict[str, int]:
 ratings = rating.strip("{}").split(",")
 return {label: int(value) for rat in ratings for label, value in [rat.split("=")]}
def find_rule(
 name: str, rules: list[dict[str, dict[str, dict[str, str | int] | str]]]
) -> dict[str, dict[str, dict[str, str | int] | str]]:
 return next((rule for rule in rules if name in rule), None)
def evaluate_condition(condition: str, rating: dict[str, int]) -> bool:
 op, attribute, threshold = (
 condition["operator"],
 condition["attribute"],
 condition["threshold"],
 )
 return (op == "<" and rating[attribute] < threshold) or (
 op == ">" and rating[attribute] > threshold
 )
def parse_parts(
 workflows: list[dict[str, dict[str, dict[str, str | int] | str]]],
 ratings: list[dict[str, int]],
) -> int:
 rating_idx = -1
 curr = "in"
 total = 0
 while True:
 # If the rule was accepted or rejected, curr would be reset to "in",
 # and we'd move on to the next rating.
 rating_idx += 1 if curr == "in" else 0
 if rating_idx == len(ratings):
 break
 rating = ratings[rating_idx]
 rules = find_rule(curr, workflows)
 for rule in rules[curr]:
 if rule["condition"]:
 if evaluate_condition(rule["condition"], rating):
 final = rule["destination"]
 total += sum(rating.values()) if final == "A" else 0
 curr = "in" if final in {"A", "R"} else final
 break
 else:
 final = rule["destination"]
 total += sum(rating.values()) if final == "A" else 0
 curr = "in" if final in {"A", "R"} else final
 break
 return total
def read_items(lines: Iterable[str], parse_func):
 items = []
 line = next(lines).strip()
 while line:
 items.append(parse_func(line))
 try:
 line = next(lines).strip()
 except StopIteration:
 break # No lines to process.
 return items
def total_ratings(lines: Iterable[str]) -> int:
 return parse_parts(
 read_items(lines, parse_workflow), read_items(lines, parse_rating)
 )
def main(avalanche_document: Path) -> None:
 with open(avalanche_document) as f:
 print(total_ratings(f))
if __name__ == "__main__":
 typer.run(main)

Review Request:

General coding comments, style, etc.

How can I simplify parse_parts()?

Was there a better way to store the input?

Did I go overboard with the type hints?

Question 2

The current code has many positive qualities. It is well organized. It decomposes the problem into smaller parts. The variables and functions have clear names. And so forth.

The paragraph parsing is too complex. You break the input into its two sections in read_items(), which uses a while loop, relies on a try-except structure, and has to build up a list of items until we hit a blank line. It's not a bad approach, but more direct tactics are available. A classic parsing idea is to break input text into "paragraphs" consisting of contiguous blocks of blank or non-blank lines. So we can start by stripping all lines. Then we just need to group the lines, which itertools.groupby() will do for us. We care about only the non-blank lines (those with True key).

from itertools import groupby
def paragraphs(lines):
 lines = [line.strip() for line in lines]
 return [
 list(g)
 for k, g in groupby(lines, key = bool)
 if k
 ]

The solution code is too complex. You asked about simplifying parse_parts(), and those instincts are correct. You are doing too much in this function: managing multiple state variables in a while loop, and looping over rules within that structure. The code is not too bad, but is it neither simple nor expressive: the reader has to study it with some care to keep track of what is going on and to feel reasonably confident that it's correct.

Write from the top down with code you wish you had. When facing a first draft of code with issues like those in parse_parts(), one strategy is to start writing an entirely new solution from the top down. Imagine we had a Plan consisting of a list of workflows. Our top-level solution code might look like the following sketch. It has almost no algorithm or logic. It just converts one type of data directly into the next type we need. And it operates with entities and concepts directly mappable to the question statement.

def total_ratings(lines):
 # Lines to non-blank paragraphs.
 ps = paragraphs(lines)
 # Create a Plan to hold the workflows.
 plan = Plan.from_lines(ps[0])
 # Parse rating lines into dicts.
 ratings = [parse_rating(line) for line in ps[1]]
 # For ratings accepted by the plan, return sum of the rating values.
 return sum(
 sum(r.values())
 for r in ratings
 if plan.is_accepted(r)
 )

What Plan needs. It needs to be able to take lines and return a Plan instance. And it needs to be able to take a rating and return a bool. As we did above, we keep writing code we wish we had: in this case, we wish a Workflow instance could take a rating dict and tell us the resulting destination.

from dataclasses import dataclass
@dataclass
class Plan:
 workflows: list['Workflow']
 def __post_init__(self):
 # Build a dict mapping names to workflows.
 self.lookup = {...}
 @classmethod
 def from_lines(cls, lines):
 return cls(Workflow.from_line(line) for line in lines)
 def is_accepted(self, rating):
 curr = 'in'
 while True:
 wf = self.lookup[curr]
 dest = wf.get_destination(rating)
 if dest == 'A':
 return True
 elif dest == 'R':
 return False
 else:
 curr = dest

Keep going: the Workflow. A workflow has a name and some Rule instances. As above, we implement a little bit of parsing logic here and a little bit of solution logic -- but not too much, because we freely delegate grubby details farther down the food chain.

@dataclass
class Workflow:
 name: str
 rules: list['Rule']
 @classmethod
 def from_line(cls, line):
 ...
 return cls(...)
 def get_destination(self, rating):
 for r in self.rules:
 if (dest := r.destination_if_satisfied(rating)):
 return dest

More levels: the Rule and maybe the Condition. In the wished-for code above, we want a Rule instance to be able to take a rating and return the rule's destination if the rating satisfies the rule. And we could even go farther by envisioning a Condition class that would also contain a tiny bit of parsing code and a little bit of solution logic. Perhaps that final level is not helpful enough to justify the overhead. Either way, we have confined the needs to a very small domain, so we know the code in Rule and below will be quite simple.

@dataclass
class Rule:
 condition: 'Condition'
 destination: str
 @classmethod
 def from_text(cls, text):
 ...
 return cls(...)
 def destination_if_satisfied(self, rating):
 c = self.condition
 if c and not c.is_satisfied(rating):
 return None
 else:
 return self.destination

Notice what is missing: type annotation complexity. Because we broke the information down into meaningful units (plan, workflow, rule, etc), none of the methods sketched above would require heroic efforts to add type annotations.

Question 3

Let me add to the FMc's answer and address your specific questions:

How Can I Simplify Parse Parts?

Your function read_items could be improved. You are establishing try/except block for every iteration line = next(lines).strip() and the expression next(lines).strip() appears unnecessarily twice. Instead we can the try outside the loop and use the walrus operator as follows:

def read_items(lines: Iterable[str], parse_func):
 items = []
 try:
 while (line := next(lines).strip()):
 items.append(parse_func(line))
 except StopIteration:
 pass # No lines to process.
 return items

But I, too, believe that for greater clarify you should have distinct functions for parsing workflows and ratings. For workflows we iterate until we find an empty line. We should never get a StopException and so we do not need to use a try/except block for parsing workflows. But ultimately you could achieve simpler code if you chose a different representation for your workflows. It appears that you currently have a list of dictionaries. Each dictionary has a single key whose value is the name of the workflow and its values is a list of rules that apply for that workflow. Then given the name of a workflow you have to do a linear search of this list to find a dictionary that has a key matching the workflow name. I would instead create a single dictionary, for example named 'workflows', with multiple keys, one for each named workflow. So workflows['px'] is a list of the rules that applies to the workflow named 'px'.

But how should each rule be represented? There are many possibilities. Let's look at a sample workflow specification:

rfg{s<537:gd,x>2440:R,A}

We have 3 rules for workflow 'rfg' and the final rule is always a default rule that always fires. When a rule fires the result is either the name of another rule or 'R' (reject the current part) or 'A' (accept the current part). I have chosen to represent a rule as a tuple of 5 elements (or you can use a collections.namedtuple instance if you want to reference each element by a name rather than by an index). For a default rule that always fires the first 4 elements of the tuple will always be None and the 5th element will be the default "action". For example, (None, None, None, None, 'A') represents the final, default rule above. Conversely, for a non-default rule where a comparison is required, the 5th element will always be None. For example, ('s', '<', 537, 'gd', None) represents the first rule above. Then by testing the 5th element for None we know whether we are dealing with a default rule or not.

You can use whatever technique you want for creating your dictionary of workflows where the key is a workflow name and its value is a list of tuples (or named tuples that makes it clearer to the reader what each element of a rule represents, as I have used my my code below) representing the rules. In the next section below I will be using regular expressions to parse the workflows, but if you are not interested in this approach just go down to the section labeled The Code.

Using Regular Expressions to Create the `workflows` Dictionary

I will use a regular expression to first break the workflow into its name and rules:

([a-z]+){([^}]+)}

([a-z]+) - Capture group 1 will be one or more lower-case letters.
{ - Followed by '{'.
([^}]+) - Followed by capture group 2 which are one or more characters that do not match '}'. That is, to match all the rules we scan until we find a '}' character.
} - Matches a '}' character.

When we match the above regular expression against input 'rfg{s<537:gd,x>2440:R,A}' capture group 1 (the workflow's name) will be 'rfg' and capture group 2 will be 's<537:gd,x>2440:R,A', a comma-separated list of rules for this workflow.

Then once we have the string 's<537:gd,x>2440:R,A', i.e. the rules, we want to match successive rules with the following regular expression:

([xmas])([<>])([0-9]+):([a-z]+|R|A)|([a-z]+|R|A)

([xmas]) - Matches either 'x', 'm', 'a', 's' as capture group 1. This is the rule's rating category.
([<>]) - Matches either '<' or '>' as capture group 2.
([0-9]+) - Matches one or more digits as capture group 3.
: - Matches ':'.
([a-z]+|R|A) - Capture group 4 will be the action taken if we have a rule math. This can be either a string of lower-case letters specifying the name of the next workflow to be used or 'R' signifying that we are to reject this part or A signifying that we are to accept this part.
|([a-z]+|R|A)) - Capture group 5 signifies that we have an alternate possibility for what matches a rule, namely just a lower-case string (the next workflow name) or 'R' (reject this part) or 'A' (accept this part). This represents the final "default" rule that always fires if none of the previous rules match.

When we iteratively find successive matches using the above regular expression against 's<537:gd,x>2440:R,A', we will match 3 rules. For each rule we get a tuple of 5 elements, one for each capture group:

('s', '<', '537', 'gd', None)
('x', '>', '2440', 'R', None)
(None, None, None, None, 'A')

As previously stated, when we are matching the final rule representing the default rule, all elements of the tuple are None except for the last (index 4). Conversely, the last element will be None for all non -default rules). So it becomes quite easy to test whether the current rule under consideration is a default rule.

If you do not want to have to remember what each element of a rule's tuple represents we can use a collections.namedtuple as follows:

from collections import namedtuple
...
WorkflowRule = namedtuple('WorkflowRule', [
 'rating_name',
 'rating_operator',
 'rating_value',
 'new_workflow_name',
 'default_workflow_name'
 ]
)

Then all workflows can be read an parsed as follows:

import re
...
workflows = {}
workflow_rex1 = re.compile(r'([a-z]+){([^}]+)}')
workflow_rex2 = re.compile(r'([xmas])([<>])([0-9]+):([a-z]+|R|A)|([a-z]+|R|A)')
def read_workflows(lines: Iterable[str]) -> None:
 while (line := next(lines)) != '':
 m = workflow_rex1.search(line)
 workflow_name = m[1] # First capture group
 workflow_rules = []
 for m in workflow_rex2.finditer(m[2]): # Iterate the rules (second capture group)
 groups = list(m.groups()) # Convert the rules tuple to a list
 if groups[2] is not None: # We have a string representing an integer
 groups[2] = int(groups[2]) # Convert to an integer
 # Create a named tuple from the rules and append it to our list of rules:
 workflow_rules.append(WorkflowRule(*groups))
 workflows[workflow_name] = workflow_rules

ratings can remain a list of dictionaries just as you currently have it. Then part 1 is solved as follows:

The Code

I have chosen to use collections.namedtuple instances to represent a workflow rule. If you skipped the previous section, here is the named tuple again:

from collections import namedtuple
...
WorkflowRule = namedtuple('WorkflowRule', [
 'rating_name',
 'rating_operator',
 'rating_value',
 'new_workflow_name',
 'default_workflow_name'
 ]

Once we have parsed the workflows and ratings, then Part 1 is solved with:

def is_accepted(rating: dict[str, int]) -> bool:
 """Do we accept this rating?"""
 workflow_name = 'in' # Initial workflow name to process
 while True:
 workflow_rules = workflows[workflow_name] # List of rules for this workflow
 for workflow_rule in workflow_rules: # Process each rule
 if workflow_rule.default_workflow_name: # Default rule if not None
 workflow_name = workflow_rule.default_workflow_name # The next action
 result = True # default rules always fire
 else:
 # A non-default rule requiring a comparison:
 workflow_name = workflow_rule.new_workflow_name # The next action if this rule fires
 value1 = rating[workflow_rule.rating_name]
 value2 = workflow_rule.rating_value
 result = value1 > value2 if workflow_rule.rating_operator == '>' else value1 < value2
 if result: # If the rule fires
 if workflow_name == 'R':
 return False
 if workflow_name == 'A':
 return True
 break # Else process workflow name
 # Else process new rule
def part1() -> int:
 """Solves Part 2 of Day 19 Advent of Code.
 See: https://adventofcode.com/2023/day/19"""
 # We assume the `ratings` and `workflows` variables have been
 # created and the code for that is omitted here.
 total = 0
 for rating in ratings:
 if is_accepted(rating):
 total += sum(rating.values())
 return total

Was There a Better Way to Store the Input?

Answered in the previous section.

Did I Go Overboard with the Type Hints?

No. But with the above suggestions they should be less complicated.

Question 4

The walrus operator is new to me, I was missing C's assignments inside the loop conditions. I can see why they'd introduce a new operator specifically for this purpose. Thank you.

Question 5

If you are not already familiar with regular expressions, they are a worthwhile addition to your "toolbox". And if you are interested, here is the complete code for my implementation (the input is coming from stdin).

Question 6

If you want to be more object oriented, you can create an abstract base class WorkFlowRule with two subclasses, DefaultWorkFlowRule and ConditionalWorkflowRule. See this.

Question 7

You could name your capture groups to avoid having to access them with an index that has no real meaning. The first regex would look like (?P<workflow_name>[a-z]+){(?P<rules>[^}]+)}, and the workflow_name group cloud be accessed with m.group("workflow_name"). Also, read_workflows expects an Iterator, not an Iterable. Great answer otherwise!

Question 8

@Harith Actually \w also captures numbers and \d can capture more than just the digits 0 .. 9 unless ASCII mode is set as an option. (with re.ASCII) If you assume the input is "correct", then you could use (\w)=(\d+). But if you want to verify the correctness of the input then use ([xmas])=([0-9]+). By the way, [a-z]{1} is equivalent to just [a-z].

FMc FMc 13.1k2 gold badges29 silver badges40 bronze badges · Answer 1 · 2024-01-04 11:11:48Z

The current code has many positive qualities. It is well organized. It decomposes the problem into smaller parts. The variables and functions have clear names. And so forth.

The paragraph parsing is too complex. You break the input into its two sections in read_items(), which uses a while loop, relies on a try-except structure, and has to build up a list of items until we hit a blank line. It's not a bad approach, but more direct tactics are available. A classic parsing idea is to break input text into "paragraphs" consisting of contiguous blocks of blank or non-blank lines. So we can start by stripping all lines. Then we just need to group the lines, which itertools.groupby() will do for us. We care about only the non-blank lines (those with True key).

from itertools import groupby
def paragraphs(lines):
 lines = [line.strip() for line in lines]
 return [
 list(g)
 for k, g in groupby(lines, key = bool)
 if k
 ]

The solution code is too complex. You asked about simplifying parse_parts(), and those instincts are correct. You are doing too much in this function: managing multiple state variables in a while loop, and looping over rules within that structure. The code is not too bad, but is it neither simple nor expressive: the reader has to study it with some care to keep track of what is going on and to feel reasonably confident that it's correct.

Write from the top down with code you wish you had. When facing a first draft of code with issues like those in parse_parts(), one strategy is to start writing an entirely new solution from the top down. Imagine we had a Plan consisting of a list of workflows. Our top-level solution code might look like the following sketch. It has almost no algorithm or logic. It just converts one type of data directly into the next type we need. And it operates with entities and concepts directly mappable to the question statement.

def total_ratings(lines):
 # Lines to non-blank paragraphs.
 ps = paragraphs(lines)
 # Create a Plan to hold the workflows.
 plan = Plan.from_lines(ps[0])
 # Parse rating lines into dicts.
 ratings = [parse_rating(line) for line in ps[1]]
 # For ratings accepted by the plan, return sum of the rating values.
 return sum(
 sum(r.values())
 for r in ratings
 if plan.is_accepted(r)
 )

What Plan needs. It needs to be able to take lines and return a Plan instance. And it needs to be able to take a rating and return a bool. As we did above, we keep writing code we wish we had: in this case, we wish a Workflow instance could take a rating dict and tell us the resulting destination.

from dataclasses import dataclass
@dataclass
class Plan:
 workflows: list['Workflow']
 def __post_init__(self):
 # Build a dict mapping names to workflows.
 self.lookup = {...}
 @classmethod
 def from_lines(cls, lines):
 return cls(Workflow.from_line(line) for line in lines)
 def is_accepted(self, rating):
 curr = 'in'
 while True:
 wf = self.lookup[curr]
 dest = wf.get_destination(rating)
 if dest == 'A':
 return True
 elif dest == 'R':
 return False
 else:
 curr = dest

Keep going: the Workflow. A workflow has a name and some Rule instances. As above, we implement a little bit of parsing logic here and a little bit of solution logic -- but not too much, because we freely delegate grubby details farther down the food chain.

@dataclass
class Workflow:
 name: str
 rules: list['Rule']
 @classmethod
 def from_line(cls, line):
 ...
 return cls(...)
 def get_destination(self, rating):
 for r in self.rules:
 if (dest := r.destination_if_satisfied(rating)):
 return dest

More levels: the Rule and maybe the Condition. In the wished-for code above, we want a Rule instance to be able to take a rating and return the rule's destination if the rating satisfies the rule. And we could even go farther by envisioning a Condition class that would also contain a tiny bit of parsing code and a little bit of solution logic. Perhaps that final level is not helpful enough to justify the overhead. Either way, we have confined the needs to a very small domain, so we know the code in Rule and below will be quite simple.

@dataclass
class Rule:
 condition: 'Condition'
 destination: str
 @classmethod
 def from_text(cls, text):
 ...
 return cls(...)
 def destination_if_satisfied(self, rating):
 c = self.condition
 if c and not c.is_satisfied(rating):
 return None
 else:
 return self.destination

Notice what is missing: type annotation complexity. Because we broke the information down into meaningful units (plan, workflow, rule, etc), none of the methods sketched above would require heroic efforts to add type annotations.

Booboo Booboo 3,3164 silver badges14 bronze badges · Answer 2 · 2024-01-06 14:28:59Z

Let me add to the FMc's answer and address your specific questions:

How Can I Simplify Parse Parts?

Your function read_items could be improved. You are establishing try/except block for every iteration line = next(lines).strip() and the expression next(lines).strip() appears unnecessarily twice. Instead we can the try outside the loop and use the walrus operator as follows:

def read_items(lines: Iterable[str], parse_func):
 items = []
 try:
 while (line := next(lines).strip()):
 items.append(parse_func(line))
 except StopIteration:
 pass # No lines to process.
 return items

But I, too, believe that for greater clarify you should have distinct functions for parsing workflows and ratings. For workflows we iterate until we find an empty line. We should never get a StopException and so we do not need to use a try/except block for parsing workflows. But ultimately you could achieve simpler code if you chose a different representation for your workflows. It appears that you currently have a list of dictionaries. Each dictionary has a single key whose value is the name of the workflow and its values is a list of rules that apply for that workflow. Then given the name of a workflow you have to do a linear search of this list to find a dictionary that has a key matching the workflow name. I would instead create a single dictionary, for example named 'workflows', with multiple keys, one for each named workflow. So workflows['px'] is a list of the rules that applies to the workflow named 'px'.

But how should each rule be represented? There are many possibilities. Let's look at a sample workflow specification:

rfg{s<537:gd,x>2440:R,A}

We have 3 rules for workflow 'rfg' and the final rule is always a default rule that always fires. When a rule fires the result is either the name of another rule or 'R' (reject the current part) or 'A' (accept the current part). I have chosen to represent a rule as a tuple of 5 elements (or you can use a collections.namedtuple instance if you want to reference each element by a name rather than by an index). For a default rule that always fires the first 4 elements of the tuple will always be None and the 5th element will be the default "action". For example, (None, None, None, None, 'A') represents the final, default rule above. Conversely, for a non-default rule where a comparison is required, the 5th element will always be None. For example, ('s', '<', 537, 'gd', None) represents the first rule above. Then by testing the 5th element for None we know whether we are dealing with a default rule or not.

You can use whatever technique you want for creating your dictionary of workflows where the key is a workflow name and its value is a list of tuples (or named tuples that makes it clearer to the reader what each element of a rule represents, as I have used my my code below) representing the rules. In the next section below I will be using regular expressions to parse the workflows, but if you are not interested in this approach just go down to the section labeled The Code.

Using Regular Expressions to Create the `workflows` Dictionary

I will use a regular expression to first break the workflow into its name and rules:

([a-z]+){([^}]+)}

([a-z]+) - Capture group 1 will be one or more lower-case letters.
{ - Followed by '{'.
([^}]+) - Followed by capture group 2 which are one or more characters that do not match '}'. That is, to match all the rules we scan until we find a '}' character.
} - Matches a '}' character.

When we match the above regular expression against input 'rfg{s<537:gd,x>2440:R,A}' capture group 1 (the workflow's name) will be 'rfg' and capture group 2 will be 's<537:gd,x>2440:R,A', a comma-separated list of rules for this workflow.

Then once we have the string 's<537:gd,x>2440:R,A', i.e. the rules, we want to match successive rules with the following regular expression:

([xmas])([<>])([0-9]+):([a-z]+|R|A)|([a-z]+|R|A)

([xmas]) - Matches either 'x', 'm', 'a', 's' as capture group 1. This is the rule's rating category.
([<>]) - Matches either '<' or '>' as capture group 2.
([0-9]+) - Matches one or more digits as capture group 3.
: - Matches ':'.
([a-z]+|R|A) - Capture group 4 will be the action taken if we have a rule math. This can be either a string of lower-case letters specifying the name of the next workflow to be used or 'R' signifying that we are to reject this part or A signifying that we are to accept this part.
|([a-z]+|R|A)) - Capture group 5 signifies that we have an alternate possibility for what matches a rule, namely just a lower-case string (the next workflow name) or 'R' (reject this part) or 'A' (accept this part). This represents the final "default" rule that always fires if none of the previous rules match.

When we iteratively find successive matches using the above regular expression against 's<537:gd,x>2440:R,A', we will match 3 rules. For each rule we get a tuple of 5 elements, one for each capture group:

('s', '<', '537', 'gd', None)
('x', '>', '2440', 'R', None)
(None, None, None, None, 'A')

As previously stated, when we are matching the final rule representing the default rule, all elements of the tuple are None except for the last (index 4). Conversely, the last element will be None for all non -default rules). So it becomes quite easy to test whether the current rule under consideration is a default rule.

If you do not want to have to remember what each element of a rule's tuple represents we can use a collections.namedtuple as follows:

from collections import namedtuple
...
WorkflowRule = namedtuple('WorkflowRule', [
 'rating_name',
 'rating_operator',
 'rating_value',
 'new_workflow_name',
 'default_workflow_name'
 ]
)

Then all workflows can be read an parsed as follows:

import re
...
workflows = {}
workflow_rex1 = re.compile(r'([a-z]+){([^}]+)}')
workflow_rex2 = re.compile(r'([xmas])([<>])([0-9]+):([a-z]+|R|A)|([a-z]+|R|A)')
def read_workflows(lines: Iterable[str]) -> None:
 while (line := next(lines)) != '':
 m = workflow_rex1.search(line)
 workflow_name = m[1] # First capture group
 workflow_rules = []
 for m in workflow_rex2.finditer(m[2]): # Iterate the rules (second capture group)
 groups = list(m.groups()) # Convert the rules tuple to a list
 if groups[2] is not None: # We have a string representing an integer
 groups[2] = int(groups[2]) # Convert to an integer
 # Create a named tuple from the rules and append it to our list of rules:
 workflow_rules.append(WorkflowRule(*groups))
 workflows[workflow_name] = workflow_rules

ratings can remain a list of dictionaries just as you currently have it. Then part 1 is solved as follows:

The Code

I have chosen to use collections.namedtuple instances to represent a workflow rule. If you skipped the previous section, here is the named tuple again:

from collections import namedtuple
...
WorkflowRule = namedtuple('WorkflowRule', [
 'rating_name',
 'rating_operator',
 'rating_value',
 'new_workflow_name',
 'default_workflow_name'
 ]

Once we have parsed the workflows and ratings, then Part 1 is solved with:

def is_accepted(rating: dict[str, int]) -> bool:
 """Do we accept this rating?"""
 workflow_name = 'in' # Initial workflow name to process
 while True:
 workflow_rules = workflows[workflow_name] # List of rules for this workflow
 for workflow_rule in workflow_rules: # Process each rule
 if workflow_rule.default_workflow_name: # Default rule if not None
 workflow_name = workflow_rule.default_workflow_name # The next action
 result = True # default rules always fire
 else:
 # A non-default rule requiring a comparison:
 workflow_name = workflow_rule.new_workflow_name # The next action if this rule fires
 value1 = rating[workflow_rule.rating_name]
 value2 = workflow_rule.rating_value
 result = value1 > value2 if workflow_rule.rating_operator == '>' else value1 < value2
 if result: # If the rule fires
 if workflow_name == 'R':
 return False
 if workflow_name == 'A':
 return True
 break # Else process workflow name
 # Else process new rule
def part1() -> int:
 """Solves Part 2 of Day 19 Advent of Code.
 See: https://adventofcode.com/2023/day/19"""
 # We assume the `ratings` and `workflows` variables have been
 # created and the code for that is omitted here.
 total = 0
 for rating in ratings:
 if is_accepted(rating):
 total += sum(rating.values())
 return total

Was There a Better Way to Store the Input?

Answered in the previous section.

Did I Go Overboard with the Type Hints?

No. But with the above suggestions they should be less complicated.

The walrus operator is new to me, I was missing C's assignments inside the loop conditions. I can see why they'd introduce a new operator specifically for this purpose. Thank you.
If you are not already familiar with regular expressions, they are a worthwhile addition to your "toolbox". And if you are interested, here is the complete code for my implementation (the input is coming from stdin).
If you want to be more object oriented, you can create an abstract base class WorkFlowRule with two subclasses, DefaultWorkFlowRule and ConditionalWorkflowRule. See this.
You could name your capture groups to avoid having to access them with an index that has no real meaning. The first regex would look like (?P<workflow_name>[a-z]+){(?P<rules>[^}]+)}, and the workflow_name group cloud be accessed with m.group("workflow_name"). Also, read_workflows expects an Iterator, not an Iterable. Great answer otherwise!
@Harith Actually \w also captures numbers and \d can capture more than just the digits 0 .. 9 unless ASCII mode is set as an option. (with re.ASCII) If you assume the input is "correct", then you could use (\w)=(\d+). But if you want to verify the correctness of the input then use ([xmas])=([0-9]+). By the way, [a-z]{1} is equivalent to just [a-z].

Stack Exchange Network

Advent of Code 2023 - Day 19: Aplenty (Part 1)

Description:

Review Request:

2 Answers 2

How Can I Simplify Parse Parts?

Using Regular Expressions to Create the `workflows` Dictionary

The Code

Was There a Better Way to Store the Input?

Did I Go Overboard with the Type Hints?

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Advent of Code 2023 - Day 19: Aplenty (Part 1)

Description:

Review Request:

2 Answers 2

How Can I Simplify Parse Parts?

Using Regular Expressions to Create the workflows Dictionary

The Code

Was There a Better Way to Store the Input?

Did I Go Overboard with the Type Hints?

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions

Using Regular Expressions to Create the `workflows` Dictionary