The following code solves Advent Of Code 2021 Day 10.
The goal for this time around is evaluating whether a line of brackets is complete, has a syntax error or is missing some closing brackets.
Examples of valid pairs of brackets
()
[]
([])
{()()()}
<([{}])>
[<>({}){}[([])<>]]
(((((((((())))))))))
Example of incorrect or corrupted lines
{([(<{}[<>[]}>{[]{[(<()> # Expected ], but found } instead.
[[<[([]))<([[{}[[()]]] # Expected ], but found ) instead.
[{[{({}]{}}([{[{{{}}([] # Expected ), but found ] instead.
[<(<(<(<{}))><([]([]() # Expected >, but found ) instead.
<{([([[(<>()){}]>(<<{{ # Expected ], but found > instead.
Examples of incomplete lines
[({(<(())[]>[[{[]{<()<>> # Complete by adding }}]])})].
[(()[<>])]({[<{<<[]>>( # Complete by adding )}>]}).
(((({<>}<{<{<>}{[]{[]{} # Complete by adding }}>}>)))).
{<[[]]>}<{[{[{[]{()[[[] # Complete by adding ]]}}]}]}>.
<{([{{}}[<[[[<>{}]]]>[]] # Complete by adding ])}>.
Problem:
Advent of code 10 was divided into two problems:
- Find the sum of all lines that has a syntax error (
corrupted
), where every bracket that causes a syntax error is given a particular value. - For each line that has a syntax error compute the string of closing characters that when appended to the line will make the line syntactically correct (i.e. "complete"). We then compute for this closing string an autocomplete score as follows: We set the score initially to 0. Then for each character of the closing string, multiply the score by 5 and then add to it a point value associated with the closing character. Finally find the median of all the autocomplete scores where the number of such scores is guaranteed to be odd.
Code
from typing import Final
# Points for each illegal character
points: Final[dict[str, int]] = {
')': 3,
']': 57,
'}': 1197,
'>': 25137
}
# Bracket pairs
pairs: Final[dict[str, int]] = {
'(': ')',
'[': ']',
'{': '}',
'<': '>'
}
# Values for scoring completion characters
values: Final[dict[str, int]] = {
')': 1,
']': 2,
'}': 3,
'>': 4
}
def syntax_error_score_and_autocomplete_score(lines):
scores = []
total_score = 0
for line in lines:
stack = []
corrupted = False
corrupted_char = None
for char in line.strip():
if char in pairs:
# Opening bracket
stack.append(char)
else:
# Closing bracket
if not stack:
# Extra closing bracket, corrupted
corrupted = True
corrupted_char = char
break
top = stack.pop()
if pairs[top] != char:
# Mismatched closing bracket, corrupted
corrupted = True
corrupted_char = char
break
# Ignore corrupted lines
# If the line is corrupted, add the corresponding points
if corrupted:
total_score += points[corrupted_char]
continue
# If stack is not empty, we need to complete the line
if stack:
completion_string = ""
while stack:
opening = stack.pop()
completion_string += pairs[opening]
# Calculate score for this completion string
score = 0
for ch in completion_string:
score = score * 5 + values[ch]
scores.append(score)
print("part1: (syntax_error_score)", total_score)
scores.sort()
# Get the middle score
print("part2: (autocomplete_score)", scores[len(scores)//2])
if __name__ == "__main__":
with open("D:\\ExplorerDownload\\input2021_day10.txt", "r") as f:
lines = f.readlines()
syntax_error_score_and_autocomplete_score(lines)
5 Answers 5
Looks good. It is easy to read, with clear identifiers.
extra boolean
This is some nice, idiomatic code.
corrupted_char = None
Given the type of str | None
,
we don't need a bool corrupted
flag alongside it.
Testing if the char is None
does the same job as seeing if the flag is False
.
Fewer state variables means fewer things to keep juggling in your head.
On which topic...
extract helper
We strive to write short functions which can be viewed in a single screenful, without vertical scrolling.
That for char ...
loop performs a nice localized computation
and is a good candidate for packaging up in a helper function.
Pass in a line and a stack, side effect the stack, and get
back a corrupted_char
.
odd annotation
We don't tend to often see Final
in annotated production code.
pairs: Final[dict[str, int]] = {
Quoting from its documentation:
A final name cannot be re-assigned or overridden in a subclass. ...
There is no runtime checking of these properties.
Consider instead using from types import MappingProxyType
and assign one of those.
Type checkers know all about it, plus its cousins such as frozenset
.
Runtime mutation attempts will produce
TypeError: 'mappingproxy' object does not support item assignment
In the simpler scalar case, rather than marking it Final
we typically will upcase such constants, e.g. SECONDS_PER_DAY = 86400
.
And yes, linters will call you out if they see you
trying to increment such a quantity.
And do run pyright
or mypy
over your code after
you've annotated it.
That dict[str, int]
copy-pasta mapping makes no sense.
You wanted dict[str, str]
.
Bad annotations are noticed by type checkers,
but have no effect at runtime.
extra comments
This source code is perhaps a little on the chatty side, but not overly so, not so much that it worries me. Do consider deleting the occasional comment if you feel it says nothing more than what the code already has eloquently stated. Your descriptive identifiers are a big help in that regard.
This was the only comment item that gave me pause:
if char in pairs:
# Opening bracket
stack.append(char)
It makes me slightly sad, because ideally an identifier
like pairs
should be doing the work of explaining that.
Yet I couldn't think of a better name than pairs
, sorry.
Maybe def is_opening_bracket(char):
is the best way out of this?
median
... we were then asked to find the median of the autocomplete scores.
print("part2: (autocomplete_score)", scores[len(scores)//2])
I didn't notice anything in your problem statement which promises there shall be an odd number of scores. In the even case I would expect a median computation to find the mean of the two central figures. Notice that if there's a bunch of repeated scores this code might "get lucky", as the mean of two equal numbers is identical to the value of either one of them. But with enough test inputs your luck will eventually run out.
I just have one suggestion to add:
Simplification
For Part 2 you have:
# If stack is not empty, we need to complete the line
if stack:
completion_string = ""
while stack:
opening = stack.pop()
completion_string += pairs[opening]
# Calculate score for this completion string
score = 0
for ch in completion_string:
score = score * 5 + values[ch]
scores.append(score)
This needlessly builds up a string which is then iterated. Simpler would be:
# If stack is not empty, we need to complete the line
if stack:
score = 0
while stack:
opening_char = stack.pop()
score = score * 5 + values[pairs[opening_char]]
scores.append(score)
Other answers say the code is easy to read, and I'm in no position to contradict that, as readability is always subjective. But to me, this code is not easy to read. It's just one big function with very deeply nested loops and conditionals, even with break
s and continue
s, and I have to read and understand all of it to get a sense of what's going on. I can only understand it because I happen to know the requirements (the AoC question).
So in my opinion, the first improvement to make is to divide this code into smaller bits that make sense on their own, and extract them into their own functions or even files.
Speaking of dividing, there are two very independent concerns in this solution. There is parsing a line, and there is computing some score based on the results. They're conceptually separate issues, but in this code they're all mixed up.
Another hint that the code is not readable is that it needs so many comments. Right now the comments make sense of it, but they wouldn't be necessary if the code were self-explanatory. For example, these comments:
# Points for each illegal character
points: Final[dict[str, int]]
...
# Values for scoring completion characters
values: Final[dict[str, int]]
Are not necessary if you just give the variables clearer names:
points_for_illegal_chars: Final[dict[str, int]]
points_for_completion_chars: Final[dict[str, int]]
Then this comment
# Calculate score for this completion string
Is not necessary if you have a function about calculating something
And this comment
# Get the middle score
print("part2: (autocomplete_score)", scores[len(scores)//2])
Is not necessary if you define a median
function
print("part2: (autocomplete_score)", median(scores))
...
def median(values):
return sorted(values)[len(values) // 2
Here's the code I ended with after refactoring yours. The main difference is that I've tried to keep the parsing logic separate from the scoring logic. To me, that's non-negotiable as otherwise there's no way you could write tests for the parsing bit. There are some other decisions that are way more subjective, such as the use of exceptions for control flow and parsing each line twice, and whether the parsing logic shouldn't be all just kept in just one function.
# parsing.py
from typing import Final
_bracket_pairs: Final[dict[str, int]] = {
'(': ')',
'[': ']',
'{': '}',
'<': '>'
}
def parse_line(line):
_ParseAlgorithm().run(line)
class _ParseAlgorithm:
def run(self, line):
self._initialize()
for char in line.strip():
if self._is_valid_opener(char):
self._open_group(char)
elif self._is_valid_closer(char):
self._close_group()
else:
raise UnexpectedCharException(char)
if self._expected_closers:
raise IncompleteLineException(self._expected_closers)
def _initialize(self):
self._expected_closers = []
def _is_valid_opener(self, char):
return char in _bracket_pairs.keys()
def _is_valid_closer(self, char):
return self._expected_closers and self._expected_closers[0] == char
def _open_group(self, char):
self._expected_closers.insert(0, _bracket_pairs[char])
def _close_group(self):
self._expected_closers.pop(0)
class UnexpectedCharException(Exception):
def __init__(self, unexpected_char):
self.unexpected_char = unexpected_char
class IncompleteLineException(Exception):
def __init__(self, missing_string):
self.missing_string = missing_string
# scoring.py
from typing import Final
import parsing
_corruption_points: Final[dict[str, int]] = {
')': 3,
']': 57,
'}': 1197,
'>': 25137
}
_autocomplete_points: Final[dict[str, int]] = {
')': 1,
']': 2,
'}': 3,
'>': 4
}
def compute_corruption_score(lines):
result = 0
for line in lines:
try:
parsing.parse_line(line)
except parsing.UnexpectedCharException as e:
result += _corruption_points[e.unexpected_char]
except parsing.IncompleteLineException:
pass
return result
def compute_autocomplete_score(lines):
autocomplete_scores = []
for line in lines:
try:
parsing.parse_line(line)
except parsing.UnexpectedCharException:
pass
except parsing.IncompleteLineException as e:
autocomplete_scores.append(_autocomplete_score_single_line(e.missing_string))
return _median(autocomplete_scores)
def _autocomplete_score_single_line(missing_string):
result = 0
for ch in missing_string:
result *= 5
result += _autocomplete_points[ch]
return result
def _median(values):
return sorted(values)[len(values) // 2]
# main.py
import scoring
if __name__ == "__main__":
with open("./input2021_day10.txt", "r") as f:
lines = f.readlines()
print("part1: (syntax_error_score)", scoring.compute_corruption_score(lines))
print("part2: (autocomplete_score)", scoring.compute_autocomplete_score(lines))
In addition to the points raised in the previous answer, here are some other items to consider.
Naming
User @J_H makes an astute point that you only need to retain one of these variables:
corrupted
corrupted_char
However,if you do decide to keep both, I recommend renaming corrupted
as is_corrupted
. Adding the is_
prefix is a common practice for
booleans, and it would more clearly distinguish the 2 variables by name.
Comments
My first impression while reading through the code was that the function had well-placed comments that helped me understand what was going on.
Magic number
It is not obvious to me what the significance of the 5
is in
the following line:
score = score * 5 + values[ch]
Perhaps a comment would help to clarify.
Which leads us to ...
Documentation
I understand that the focus for programming challenges is functionality, and the last thing your are thinking about is documentation, but adding some simple docstrings would help to understand the purpose of the code.
-
1\$\begingroup\$ I agree that
5
is inscrutable. I take it to be "one more thanmax(values.values())
". @138Aspen, would you please edit the question to at least include a Part Two link? And ideally add one or more sentences so the CodeReview question makes sense on its own, even if the AoC problem description page were to go 404. \$\endgroup\$J_H– J_H2024年12月16日 17:22:22 +00:00Commented Dec 16, 2024 at 17:22 -
\$\begingroup\$ @J_H please don't ask to copy&paste AOC problem texts here, it's forbidden by AOC rules. And there's no part 2 link - part 2 opens for signed in users who solved part 1 of that day, and there's a reason for such treatment. So OP can only rephrase the question in their own words \$\endgroup\$STerliakov– STerliakov2024年12月16日 21:43:48 +00:00Commented Dec 16, 2024 at 21:43
-
\$\begingroup\$ @STerliakov, I was only trying to minimize the OP's effort. @138Aspen, would you please take another stab at rephrasing the Part Two question in your own words? The CodeReview question as written does not offer sufficient detail for a reviewer to adequately understand the problem assignment to properly review the code. In particular the
5
is inscrutable and the AoC site is perhaps not using the word "median" in the way that most high school math teachers would use it. Maybe medoid was intended? \$\endgroup\$J_H– J_H2024年12月16日 21:56:47 +00:00Commented Dec 16, 2024 at 21:56 -
1\$\begingroup\$ I know the problem stmt, median calculation here is correct - the problem promised there will always be an odd number of malformed lines. (But yes, the question should be edited to clarify this and scoring rules) \$\endgroup\$STerliakov– STerliakov2024年12月16日 22:01:11 +00:00Commented Dec 16, 2024 at 22:01
DRY
The repeated statements to deal with either an empty stack or a 'corrupt character' suggest an alternative approach could be pursued.
Imagine pushing a "non-bracket" character, onto the stack before addressing any of the characters of line
. Imagine that character is, for example, '$'
, sitting there indicating "this is the end of the road."
The else
branch means 1 of 4 "closing bracket" characters from line
is being examined.
Both the "corrupt" case and the "empty stack" case can be determined in one conditional because the popped '$'
won't match any of the 4 possible close brackets.
Note: '$'
must be added as a 5th record to pairs[]
so that it can be found when needed.
pairs: Final[dict[str, str]] = {
'(': ')',
'[': ']',
'{': '}',
'<': '>',
'$': '$'
}
Result: 4 lines + 'pop()' + 4 lines ==> 'pop()' + 4 lines
When the characters of line
are exhausted, and presuming there is no 'corruption', the stack will not be empty. Fine! Go ahead! Drain what remains on the stack and tabulate some score. (Need to account for '$'
in this bizarre calculation because of the compounding x = x * 5 + y
.)
This eliminates the line if stack:
toward the bottom of the loop.
Adopting the already VG suggestions made by @J_H and by @Booboo...
stack = []
stack.append('$') # NEW!
corrupted_char = None
for char in line.strip():
if char in pairs:
# Opening bracket
stack.append(char)
else:
# Closing bracket
top = stack.pop()
if pairs[top] != char:
# Mismatched closing bracket, corrupted
corrupted_char = char
break # end of the road, corrupt or "empty" stack
if corrupted_char:
total_score += points[corrupted_char]
else: # instead of 'continue'. Avoid explicit flow control if possible!
score = 0
while stack:
top = stack.pop()
if top != '$':
score = score * 5 + values[pairs[top]]
if score: # Only record non-zero results...
scores.append(score)
There's nothing in the rules preventing the coder from manipulating the obvious data requirements (adding a 5th element) as long as the program's behaviour achieves the desired goal.
Really DRY
The observant reader will notice two translation lookup instances into pair[]
in that code snippet.
Instead of pushing "opens" onto the stack, this would be a time to push its required "close" mate. The popping operations are, already, tricky to think about:
stack = [ '$' ] # Best guess in my ignorance of Python
corrupted_ch = None
for ch in line.strip():
if ch in pairs: # Opening bracket
stack.append( pairs[ch] ) # Push its mate
else: # Closing bracket
if stack.pop() != ch: # Mismatched! corrupted
corrupted_ch = ch
break # end of the road, corrupt or '$' depleted stack
if corrupted_ch:
total_score += points[corrupted_ch]
else: # instead of 'continue'. Avoid explicit flow control if possible!
score = 0
while stack:
top = stack.pop()
if top != '$':
score = score * 5 + values[top]
if score: # Only record non-zero results...
scores.append(score)
(I believe guidelines probably discourage too much density (dangling comments on the same line as code statements. This is presented, as-is, to illustrate replacing two lookups with only one lookup.)
The calculation of "points" no longer needs to translate "opens" to "closes" when looking-up the value of each character.
Maybe?:
When calculating "points" perhaps more could be done simply using the index of each closing bracket character in a simpler string (".)]}>"
) to give 1
, 2
, 3
, or 4
.
-
\$\begingroup\$ After a good night's sleep the term sentinel rises out of the fog (with the morning's coffee.) This variation uses a "sentinel character" that removes the need to explicitly detect and deal with an empty stack (to improve focus on pairing-up brackets.) \$\endgroup\$Fe2O3– Fe2O32025年01月14日 21:30:59 +00:00Commented Jan 14 at 21:30
Explore related questions
See similar questions with these tags.