Objective:
Given a string
s
containing '(', ')', '{', '}', '[', ']' (and other characters), determine if the input string is valid.An input string is valid if:
- Open brackets are closed by the same type of brackets.
- Open brackets are closed in the correct order.
Code:
#!/usr/bin/env python3
from pathlib import Path
import typer
def validate_brackets(s: str, brackets={"[": "]", "{": "}", "(": ")"}) -> bool:
stack = []
opening = brackets.keys()
closing = brackets.values()
for c in s:
if c in opening:
stack.append(c)
if c in closing:
if not stack or c != brackets[stack[-1]]:
return False
stack.pop()
return not stack
def main(file: Path) -> None:
with open(file) as f:
print(validate_brackets(f.read()))
if __name__ == "__main__":
typer.run(main)
Review Request:
General coding comments, style, bad practices, et cetera.
5 Answers 5
Permit me to add the following to the review posted by toolic:
Use Docstrings
I would add a docstring to function validate_brackets
describing its functionality.
Create a set
from closing
Update
I have done a little research into the implementation of the dict_keys
class an instance of which is created with opening = brackets.keys()
and checking c in opening
should be more or less equivalent to checking c in brackets
, which is to say efficient. So there appears to be no need to create a set from the keys. However, if we want to efficiently check to see whether brackets.values()
contains duplicates, then it seems to me we should create a set
from these values and verify that its length is the same as the number of keys. This should also speed up the check c in closing
. We can check to see if the set of keys and the set of values are disjoint by using opening.isdisjoint(closing)
(see the next section concerning validation).
Sample Validation of the brackets Argument
Things you might test for:
- The brackets argument is a non-empty
dict
instance. brackets.values()
contains no duplicate characters. Although this condition is not strictly necessary, it is the norm and we will enforce it.- The opening and closing characters should be disjoint.
The above validations are more easily performed when the closing characters are a set. This will also speed up the check c in closing
. We will place the validation logic in a separate validate_brackets_dict
function for greater readability. Since this function might be called repeatedly for the same brackets
dictionary instance, we can cache previously-validated dictionaries in a dictionary whose key is a frozenset
of the dictionary's items (since dictionaries are not hashable) and whose value is a tuple consisting of the opening and closing characters to be used.
I would also make the default value for the brackets argument to be None
for which a default dictionary that requires no validation will be provided.
Putting It All Together
With the above suggested changes, the final source could be:
#!/usr/bin/env python3
from pathlib import Path
import typer
from typing import Union, Dict # For older Python versions
default_brackets = {"[": "]", "{": "}", "(": ")"}
default_opening = default_brackets.keys()
default_closing = set(default_brackets.values())
validated_brackets_dict = {}
def validate_brackets_dict(brackets: Dict) -> tuple:
"""Validate the brackets dictionary and if valid
return a tuple of the opening and closing characters to be
used. Otherwise, an exception is raised."""
if not (isinstance(brackets, dict) and brackets):
raise ValueError('brackets must be a non-empty dict instance')
cache_key = frozenset(brackets.items()) # We cannot cache a dictionary
opening_closing = validated_brackets_dict.get(cache_key, None)
if opening_closing is None:
# Validating a brackets dictionary that we haven't seen before
opening, closing = brackets.keys(), set(brackets.values())
# Although ensuring non-duplicate closing brackets is not strictly necessary,
# it is certainly the norm. So we can consider the following
# check optional but desirable:
if len(closing) != len(brackets):
raise ValueError('Duplicate closing characters')
# Check for closing characters disjoint from opening
# characters:
if not opening.isdisjoint(closing):
raise ValueError('Opening and closing characters are not disjoint')
# Cache for possible next time:
opening_closing = (opening, closing)
validated_brackets_dict[cache_key] = opening_closing
return opening_closing
def validate_brackets(text: str, brackets: Union[Dict, None]) -> bool:
"""Determine if the input text argument contains balanced
parentheses. The optional brackets arguments is a dictionary
of "parentheses" key/value pairs to be used.
If brackets is None then a default dictionary is used."""
if brackets is None:
brackets = default_brackets
opening, closing = default_opening, default_closing
else:
opening, closing = validate_brackets_dict(brackets)
stack = []
for c in text:
if c in opening:
stack.append(c)
elif c in closing:
if not stack or c != brackets[stack[-1]]:
return False
stack.pop()
return not stack
def main(file: Path) -> None:
with open(file) as f:
text = f.read()
print('text:', repr(text))
print()
# Some test case for validation:
for brackets in (
[],
{},
{'[': ']','(': ']'},
{'[': ']','(': '['},
{'[': 'x'},
None
):
print('brackets:', repr(brackets), end='')
try:
result = validate_brackets(text, brackets)
except Exception as e:
result = e
print(', result:', repr(result))
if __name__ == "__main__":
typer.run(main)
Prints:
text: 'abc[xx]'
brackets: [], result: ValueError('brackets must be a non-empty dict instance')
brackets: {}, result: ValueError('brackets must be a non-empty dict instance')
brackets: {'[': ']', '(': ']'}, result: ValueError('Duplicate closing characters')
brackets: {'[': ']', '(': '['}, result: ValueError('Opening and closing characters are not disjoint')
brackets: {'[': 'x'}, result: False
brackets: None, result: True
-
1\$\begingroup\$ @bob The idea was to not waste cycles performing validation on a value of
brackets
known to be valid. \$\endgroup\$Booboo– Booboo2024年02月20日 03:12:00 +00:00Commented Feb 20, 2024 at 3:12 -
\$\begingroup\$ Ah yes you’re right, I’m not sure what I was thinking. Lol. 🤦♂️ \$\endgroup\$bob– bob2024年02月20日 03:15:04 +00:00Commented Feb 20, 2024 at 3:15
-
1\$\begingroup\$ I have updated the post with a separate method
validate_brackets_dict
that does the validation of the passedbrackets
dictionary and caches the computedopening
andclosing
characters so if a successive call tovalidate_brackets
is made with the samebrackets
dictionary (highly likely) nothing has to be recomputed. \$\endgroup\$Booboo– Booboo2024年02月20日 12:26:20 +00:00Commented Feb 20, 2024 at 12:26 -
1\$\begingroup\$ @KellyBundy True, but a user will have defined a
brackets
dictionary and then will use the same dictionary repeatedly rather than passing a different dictionary with the same key/value pairs but in a different insertion order. If you are truly concerned, then you can usesorted(tuple(brackets.items()))
. Me, I am not concerned. \$\endgroup\$Booboo– Booboo2024年02月20日 13:01:06 +00:00Commented Feb 20, 2024 at 13:01 -
1\$\begingroup\$ @KellyBundy I did a benchmark of using a
frozenset
vs. what I was doing and thefrozenset
not only is key-insertion-order independent, it is faster and so I made the change. \$\endgroup\$Booboo– Booboo2024年02月21日 12:24:14 +00:00Commented Feb 21, 2024 at 12:24
The code layout looks good, and you gave the functions and variables meaningful names.
That being said, you could consider using longer variable names for the single-letter ones. Perhaps, text
instead of s
, for example.
You should consider checking the inputs to the validate_brackets
function. For example, I assume that you do not want to support the case where the opening and closing bracket is the same character, like [
and [
. This code returns True
:
print(validate_brackets('[aaa]', {"[": "["}))
If you do add checking for this case, then you can use elif
for the closing
condition since they are mutually exclusive:
if c in opening:
stack.append(c)
elif c in closing:
I also recommend adding header comments describing the overall functionality and if there are restrictions on the inputs.
If you have not done so, adding some basic unit tests would be beneficial.
-
1\$\begingroup\$ I had actually started with
text
, but changed it tos
as per the specification later. :) \$\endgroup\$Madagascar– Madagascar2024年02月19日 13:52:59 +00:00Commented Feb 19, 2024 at 13:52 -
\$\begingroup\$ +1 about names. It would make the code a lot easier to grasp at a glance. Instead of "stack" what about an explanation of what c represents, e.g. "unterminated_brackets" or something similar? \$\endgroup\$bob– bob2024年02月20日 03:12:28 +00:00Commented Feb 20, 2024 at 3:12
-
2\$\begingroup\$ @bob Such a long name would make the code harder to read. And "stack" is a pretty good name. If I see this task and then see the word "stack", I immediately know what they're going to do (unless they do something weird instead of the usual). \$\endgroup\$Kelly Bundy– Kelly Bundy2024年02月20日 05:23:41 +00:00Commented Feb 20, 2024 at 5:23
-
\$\begingroup\$ It is a long name and there’s probably a way to shorten it, but naming it for what it holds rather than for what it is will greatly enhance readability. If it’s called stack, the reader has to mentally reverse-engineer the algorithm to understand it as stacks can hold all sorts of things. If it has an indicative name that cognitive burden is reduced. \$\endgroup\$bob– bob2024年02月20日 12:37:26 +00:00Commented Feb 20, 2024 at 12:37
-
3\$\begingroup\$ @bob "If it’s called stack, the reader has to mentally reverse-engineer the algorithm to understand it" - I wouldn't call it reverse-engineer. When I saw that name, not only did I know that it's a stack, but I also (correctly) predicted what it will hold and how it will be handled, as that's the straightforward way to solve this and similar tasks and I've seen or used it many times. And then I only had to read the code and confirm the prediction. Note I'm not saying the name
stack
is always this useful and sufficient. But here I think it is. \$\endgroup\$Kelly Bundy– Kelly Bundy2024年02月20日 12:52:35 +00:00Commented Feb 20, 2024 at 12:52
Minor improvement maybe: Replacing
if not stack or c != brackets[stack[-1]]:
return False
stack.pop()
with:
if not stack or c != brackets[stack.pop()]:
return False
-
3\$\begingroup\$ I’m not sure eliminating a line is beneficial here as it could make it easier to miss an important state-modifying operation: stack.pop(). Personally I’d leave that on its own line, and possibly even comment it as long as the comment wasn’t "pop the next item off the stack." \$\endgroup\$bob– bob2024年02月20日 03:06:08 +00:00Commented Feb 20, 2024 at 3:06
-
\$\begingroup\$ @bob It's not about eliminating a line (although I do see that as a benefit as always, as it allows another line to fit on the screen instead). It's about eliminating a wasteful access. And using what
pop()
returns, instead of throwing it away. Comment the pop()? For something this trivial? That would just be distracting. \$\endgroup\$Kelly Bundy– Kelly Bundy2024年02月20日 05:13:11 +00:00Commented Feb 20, 2024 at 5:13 -
1\$\begingroup\$
if not stack: return False
;match = stack.pop()
;return c == brackets[match]
might be better. That avoids side-effects buried in theor
. \$\endgroup\$Toby Speight– Toby Speight2024年02月20日 09:03:08 +00:00Commented Feb 20, 2024 at 9:03 -
\$\begingroup\$ @TobySpeight That doesn't work, the second
return
must be conditional as well. I don't think it's hidden/buried the way I put it, but ok. How aboutif not stack or stack.pop() != c:
then? After pushing not the opening parentheses but their counterparts needed to close them. \$\endgroup\$Kelly Bundy– Kelly Bundy2024年02月20日 11:34:59 +00:00Commented Feb 20, 2024 at 11:34 -
\$\begingroup\$ Yes. Still, as a general rule, I try to avoid side-effects inside conditionally-executed parts of expressions (
and
,or
,x if c else y
, etc). "Try to avoid" doesn't mean at all costs, though! I agree with pushing the expected close symbol - I think that's how I would write this function to start with, but I'm not sure why. \$\endgroup\$Toby Speight– Toby Speight2024年02月20日 13:22:51 +00:00Commented Feb 20, 2024 at 13:22
Separate out the brackets
There's a bit of preprocessing necessary for the brackets:
- Validation
- Separation of keys and values
You can just preprocess the brackets again and again and again each time the function is called, but it's quite a waste of time when the user will, most likely, pass the same specification every, single, time.
The solution in such a case is to create a Brackets
class, and perform all the preprocessing in its constructor.
The user is then free to pass the same instance over and over, or not to bother and just create it from scratch every time, or something in-between.
class Brackets:
def __init__(self, brackets: dict[str, str]):
self.__brackets = brackets
self.__open = frozenset(brackets.keys())
self.__close = frozenset(brackets.values())
# You may or may not wish to impose that closing brackets are
# distinct from one another. It is not strictly necessary though.
if not self.__open.isdisjoin(self.__close)
raise ValueError(f'Opening and closing characters are not disjoint in {brackets}')
def is_open(self, c: str) -> bool:
"""Returns whether `c` is an opening bracket"""
return c in self.__open
def is_close(self, c: str) -> bool:
"""Returns whether `c` is a closing bracket"""
return c in self.__close
def is_matching(self, open: str, close: str) -> bool:
"""Returns whether `close` is the matching closing bracket of `open`"""
return self.__brackets[open] == close
DEFAULT_BRACKETS = Brackets({"[": "]", "{": "}", "(": ")"})
Then, the function becomes:
def validate_brackets(s: str, brackets: Brackets=DEFAULT_BRACKETS) -> bool:
"""Returns whether `s` has balanced brackets, or not.
A string has `s` has balanced brackets if ...
"""
stack = []
for c in s:
if brackets.is_open(c):
stack.append(c)
continue
if not brackets.is_close(c):
continue
if not stack or not brackets.is_matching(stack[-1], c):
return False
stack.pop()
return not stack
-
\$\begingroup\$ -1 because of useless over-engineering \$\endgroup\$stefan– stefan2024年02月23日 12:39:35 +00:00Commented Feb 23, 2024 at 12:39
The code is lean and maintainable. There are minor issues:
predicate naming
Predicates shall start with e. g. is_...
or has_...
so that
True
and False
are natural answers.
useless locals
There is no need for
opening = brackets.keys()
closing = brackets.values()
as you can test for membership directly at no extra cost. Also the names do not help, brackets
is a well introduced variable and is comparable in reading length. Also the locals miss the plural. Reading code is simpler if there are less names.
dict.keys()
Instead of testing for
some_var in brackets.keys()
you test for
some_var in brackets
Parameter brackets
I dislike the parameter as it is calling for errors. The function fails if the user does a typo. Also there might be a genius using the function to test for closed strings by passing
is_valid(s, brackets={'"': '"'})
Either
make brackets a constant
or limit the function parameters to explicitly named bracket types
is_valid(s, brackets=True, braces=True, parentheses=True)
So you can test your function for all parameters.
Documentation
Add a one liner docstring explaining what is the meaning of "valid".
-
\$\begingroup\$ What's your replacement for
c in closing
? \$\endgroup\$Kelly Bundy– Kelly Bundy2024年02月23日 15:41:24 +00:00Commented Feb 23, 2024 at 15:41 -
\$\begingroup\$ @KellyBundy edited to clarify - it is a generic advice. \$\endgroup\$stefan– stefan2024年02月23日 16:22:07 +00:00Commented Feb 23, 2024 at 16:22
-
\$\begingroup\$ So it's
c in brackets.values()
? I find that less nice, but ok. Though "at no extra cost" is not true. Made the whole solution ~20% slower in my testing. \$\endgroup\$Kelly Bundy– Kelly Bundy2024年02月23日 17:49:39 +00:00Commented Feb 23, 2024 at 17:49
typer
for the CLI. \$\endgroup\$