Any comments / improvements welcome on this simple function.
tests.py
@pytest.mark.parametrize([
# path
'/c/target-uri',
'/c/uk/indiana/target-uri',
'/c/uk/indiana/target-uri/persons',
'/c/uk/florida/target-uri/persons',
])
def test_extract_uri(self, path):
assert extract_uri(path) == 'target-uri'
main.py
from states import state_names
def extract_uri(path: str):
exclude_path_snippets = ['persons']
def filter_fn(element: str):
return all([
element.lower() not in state_names,
element.lower() not in exclude_path_snippets,
len(element) > 2, # exclude 'c' or 'uk'
])
path_snippets: List[str] = path.split('/')
filtered = list(filter(filter_fn, path_snippets))
if len(filtered) != 1:
raise Exception(f"Failed to extract uri from {path}")
return filtered[0]
3 Answers 3
filter_fn
should probably be moved to a global because it does not need closure.
Combine state_names
and exclude_path_snippets
to one set via |
, and then reduce to a single not in
.
all
is not called-for; use a single boolean expression. Once you have only one set comparison, you will only need one and
.
Don't raise a bare Exception
. Raise a ValueError
, or an application-specific exception.
Consider a tuple-unpack, potentially with a rethrow; and don't cast to a list
:
try:
filtered, = filter(filter_fn, path_snippets)
except ValueError as e:
raise ValueError(f'Failed to extract URI from {path}') from e
return filtered
filter_fn
deserves a better name like is_snippet_valid
.
-
\$\begingroup\$
filter_fn
doesn't need closure, but arguably doesn't need to be polluting the global namespace - is that an argument for using an inner function? \$\endgroup\$Toby Speight– Toby Speight2022年12月12日 14:07:26 +00:00Commented Dec 12, 2022 at 14:07 -
2\$\begingroup\$ @TobySpeight IMO no. Global functions are more easily unit-testable, and module exports can still be managed via
__all__
\$\endgroup\$Reinderien– Reinderien2022年12月12日 14:13:03 +00:00Commented Dec 12, 2022 at 14:13
Your tests are not very robust.
Consider the following implementation
def extract_uri(path: str):
return 'target-uri'
Is this the wrong implementation if it passes all your tests? Either both the implementation and the tests are ok, or neither is.
Simplify the filter_fn
function by using a list comprehension and an any()
call:
def test_extract_uri(self, path):
def filter_fn(element: str):
return [
element.lower() not in state_names,
element.lower() not in exclude_path_snippets,
len(element) > 2, # exclude 'c' or 'uk'
]
path_snippets: List[str] = path.split('/')
filtered = list(filter(filter_fn, path_snippets))
if len(filtered) != 1:
raise Exception(f"Failed to extract uri from {path}")
return filtered[0]
Alternatively, we can use a regular expression to extract the target URI from the path. This would make the code more readable and maintainable:
import re
def test_extract_uri(self, path):
match = re.match(r'/c(?:/[a-zA-Z]{2})?/[a-zA-Z]+/target-uri(?:/persons)?', path)
if not match:
raise Exception(f"Failed to extract uri from {path}")
return match.group(0)
It's also a good idea to import the List
type from the typing
module and add type hints to the extract_uri()
and test_extract_uri()
functions. Finally, it's a good practice to add a docstring explaining what the function does and what its input and output are.