I'm building an error handler which handles different kinds of errors arranged hierarchically. As part of that, I need a way to figure out what errors to send a recipient based on what their subscription level is. The below function gives me what I want, but it feels a bit crude to me. Please advise if there is a better, more pythonic way of doing this.
alert_levels = {
"all": {
"import": ["file", "email"],
"send": ["tableau", "delivery"]
}
}
def get_children_if_parent(tag: str, parent_dict: dict) -> list:
tag_list = []
for key in parent_dict.keys():
if type(parent_dict[key]) == dict:
if key == tag:
target = ''
else:
target = tag
tag_list.extend(get_children_if_parent(target, parent_dict[key]))
else:
if tag == key or tag == '':
tag_list.extend(parent_dict[key])
elif tag in parent_dict[key]:
tag_list.append(tag)
return tag_list
print(get_children_if_parent('send', alert_levels))
# Output: ['tableau', 'delivery']
print(get_children_if_parent('file', alert_levels))
# Output: ['file']
print(get_children_if_parent('import', alert_levels))
# Output: ['file', 'email']
print(get_children_if_parent('delivery', alert_levels))
# Output: ['delivery']
print(get_children_if_parent('all', alert_levels))
# Output: ['file', 'email', 'tableau', 'delivery']
````
4 Answers 4
Just a few minor suggestions to simplify the code, lighten its visual weight, or otherwise enhance readability.
Prefer normal language when feasible. If a regular word like tags
is
available and clear, don't complicate things by using something like
tag_list
.
If you need a dict's keys and values, iterate via items()
.
You use parent_dict[key]
multiple times. That's not necessary.
Indeed, iterating over a dict via keys()
is never useful,
because directly iterating over the dict does the same thing.
Use isinstance()
when checking a type. There are special situations when
type()
is needed, but for garden-variety checks, isinstance()
is the way to
go.
Give your special constant a name. Your are using empty string as a wildcard. Help your reader out by using a descriptive name to convey your intent.
WILDCARD = ''
def get_children_if_parent(tag, parent_dict):
tags = []
for key, val in parent_dict.items():
if isinstance(val, dict):
target = WILDCARD if key == tag else tag
tags.extend(get_children_if_parent(target, val))
else:
if tag == key or tag == WILDCARD:
tags.extend(val)
elif tag in val:
tags.append(tag)
return tags
You could flatten the conditional structure, but I would lean against it. Initially, I was going to suggest an if-elif-elif structure to reduce the indentation. But after considering it, I opted to leave that alone: your current approach emphasizes the crucial distinction between the dict case (recurse) and the rest.
Evaluate remaining awkward names. I find the function name and
parent_dict
to be a bit awkward and/or puzzling, but I don't have enough
context about your project to suggest compelling alternatives.
-
1\$\begingroup\$ Thank you for the comments and suggestions \$\endgroup\$mankand007– mankand0072023年03月10日 21:43:42 +00:00Commented Mar 10, 2023 at 21:43
Work in progress
Useless accesses to parent_dict
You iterate over the keys of parent_dict
and then retrieve parent_dict[key]
in various places. For a start, you could use a variable to do it just once but there are other (better) options.
Indeed, you could iterate over both keys and values in one go:
def get_children_if_parent(tag: str, parent_dict: dict) -> list:
tag_list = []
for key, val in parent_dict.items():
if type(val) == dict:
if key == tag:
target = ''
else:
target = tag
tag_list.extend(get_children_if_parent(target, val))
else:
if tag == key or tag == '':
tag_list.extend(val)
elif tag in val:
tag_list.append(tag)
return tag_list
Reduce indentation depth with elif
You could write the inside of the loop as:
if type(val) == dict:
if key == tag:
target = ''
else:
target = tag
tag_list.extend(get_children_if_parent(target, val))
elif tag == key or tag == '':
tag_list.extend(val)
elif tag in val:
tag_list.append(tag)
Edit:
Smaller functions
My feeling is that the logic around the empty string makes the logic harder to understand. Also, having an arbitrary value act as a flag may be misleading for the user who may genuinely want to use ""
as a argument. It may be easier to define a short function handle the case where key == tag
and val
is a dict.
We'd get something like:
def get_all_children(parent_dict: dict) -> list:
tag_list = []
for key, val in parent_dict.items():
if type(val) == dict:
tag_list.extend(get_all_children(val))
else:
assert type(val) == list
tag_list.extend(val)
return tag_list
def get_children_if_parent(tag: str, parent_dict: dict) -> list:
tag_list = []
for key, val in parent_dict.items():
if type(val) == dict:
if key == tag:
tag_list.extend(get_all_children(val))
else:
tag_list.extend(get_children_if_parent(tag, val))
else:
assert type(val) == list
if tag == key:
tag_list.extend(val)
elif tag in val:
tag_list.append(tag)
return tag_list
Second edit
Relaxing constraints on get_all_children inputs
By-reading the code above, something appeared to me: in 2 different places, we do the same thing, we get either get_all_children(val)
or just val
based on whether val
is a list.
It could be easier to just accept that get_all_children
can accept a list a as parameter and the logic becomes:
def get_all_children(collection) -> list:
if type(collection) == list:
return collection
assert type(collection) == dict
tag_list = []
for key, val in collection.items():
tag_list.extend(get_all_children(val))
return tag_list
Now, in get_children_if_parent
, we can handle key == tag
similarly for different types of input.
We'd get something like:
def get_children_if_parent(tag: str, parent_dict: dict) -> list:
tag_list = []
for key, val in parent_dict.items():
if key == tag:
tag_list.extend(get_all_children(val))
elif type(val) == dict:
tag_list.extend(get_children_if_parent(tag, val))
else:
assert type(val) == list
if tag in val:
tag_list.append(tag)
return tag_list
But then I realized that maybe the same strategy could be applied here: if get_children_if_parent
accepted both lists and dicts, the implementation may be more straight-forward:
def get_children_if_parent(tag: str, collection) -> list:
if type(collection) == list:
return [tag] if tag in collection else []
assert type(collection) == dict
tag_list = []
for key, val in collection.items():
tag_list.extend(get_all_children(val) if key == tag else get_children_if_parent(tag, val))
return tag_list
-
1\$\begingroup\$ Thank you for the alternate way. the empty string is one of the things that made my function feel crude. Replacing that with the get_all_children function adds more code, but feels more streamlined \$\endgroup\$mankand007– mankand0072023年03月10日 21:56:08 +00:00Commented Mar 10, 2023 at 21:56
-
1\$\begingroup\$ @mankand007 I'm glad you like it. I'm added more details to my solutions \$\endgroup\$SylvainD– SylvainD2023年03月10日 22:41:19 +00:00Commented Mar 10, 2023 at 22:41
Use proper naming
The name get_children_if_parent
is a bad name on two counts.
- Your function does not return a list of children but all leaf nodes in the tree that have a common ancestor with a given tag
- The name in ambigious. get_children_if_parent what? What is the condition here on the parent here.
A much better name here is get_leaf_nodes_of
.
Separate Concerns You are performing two tasks here. First you find a node that matches a tag and then starting from that node you list all the children. We can separate this and make the code more readable.
def find_value_of_node(tag, dictonary):
for key, value in dictonary.items():
if key == tag:
return value
if isinstance(value, dict):
return find_value_of_node(tag, value)
elif isinstance(value, list):
for item in value:
if tag == item:
return item
elif value == tag:
return value
def iterate_over_leaves(node):
for _, value in node.items():
if isinstance(value, dict):
yield from iterate_over_leaves(value)
else:
yield value
def list_all_leaves(node):
if isinstance(node, dict):
return [leaf for leaf in iterate_over_leaves(node)]
if isinstance(node, list):
return [item for item in node]
return [node]
def get_leaf_nodes_of(tag : str, dictonary : dict) -> list:
value = find_value_of_node(tag, dictonary)
if value is None:
return
return list_all_leaves(value)
print(get_leaf_nodes_of('send', alert_levels))
Make your data structure consistent
Currently, you have a tree where the leaf - 1
and leaf
nodes are different from others, aka the leaf - 1
nodes store children in a list, and leaf nodes
are strings. This creates a lot of noise in the code. Instead you can have a uniform structure by defining a node as given below.
@dataclass
class Alert:
name : str
children : List['Alert'] = field(default_factory=List)
alert_levels = Alert(
"all", [
Alert(
"import", [
Alert("file", []),
Alert("email", [])
]
),
Alert(
"send", [
Alert("tableau", []),
Alert("delivery", [])
]
)
]
)
def iterate_over_leaves(node):
if not node.children:
yield node
for child in node.children:
yield from iterate_over_leaves(child)
def find_node(tag, root):
if root.name == tag:
return root
for child in root.children:
node = find_node(tag, child)
if node:
return node
return None
def get_leaf_nodes_of(tag : str, dictonary : dict) -> list:
node = find_node(tag, dictonary)
if not node:
return []
return [leaf.name for leaf in iterate_over_leaves(node)]
I reworked my previous solution. It may, if your use case supports it be beneficial to flatten the data structure and include an all
field.
from typing import Any
alert_levels = {"import": ["file", "email"], "send": ["tableau", "delivery"]}
alert_levels["all"] = [
tag for tag_list in alert_levels.values() for tag in tag_list
]
def get_children_if_parent(tag: str, parent_dict: dict[str, Any]) -> list[str]:
return parent_dict.get(tag, [tag])
assert get_children_if_parent("send", alert_levels) == ["tableau", "delivery"]
assert get_children_if_parent("file", alert_levels) == ["file"]
assert get_children_if_parent("import", alert_levels) == ["file", "email"]
assert get_children_if_parent("all", alert_levels) == ["file", "email", "tableau", "delivery"]
If you don't like the idea of modifying the global object you could create your own dict to support the all
call
Note when you perform the d[key]
what you are calling under the hood, is the dictionary's __getitem__
special method.
class AlertLevel(dict[str, list[str]]):
def __getitem__(self, tag: str) -> list[str]:
return self.get(tag, [tag])
def get_all(self) -> list[str]:
return [tag for v in self.values() for tag in v]
alert_levels = AlertLevel( {"import": ["file", "email"], "send": ["tableau", "delivery"]})
assert alert_levels["send"] == ["tableau", "delivery"]
assert alert_levels["file"] == ["file"]
assert alert_levels["import"] == ["file", "email"]
assert alert_levels.get_all() == ["file", "email", "tableau", "delivery"]
-
\$\begingroup\$ This is a very different kind of solution. Thank you for this! \$\endgroup\$mankand007– mankand0072023年03月13日 22:10:21 +00:00Commented Mar 13, 2023 at 22:10
-
1\$\begingroup\$ The second example takes more of an OOP approach. In general, if I'm writing a function that deals with a specific data structure, I try to encapsulate that function as a method of the object. this data structure would be a dict[str, list[T]] \$\endgroup\$Jason Leaver– Jason Leaver2023年03月14日 00:47:45 +00:00Commented Mar 14, 2023 at 0:47
-
\$\begingroup\$ I'm building my error handler as a class, and this function is for one of the attributes of the class. I can use it as a different class and use it in my error handler class, but I'm finding two issues with this approach. 1. I can call this with any non-existent value, and it will return me a list with that value instead of returning an empty list. 2. The structure is a bit rigid, and doesn't allow for nested values. e.g. {"import": ["file", "email"], "send": [{"tableau": ["server", "data"]}, "delivery"]}. Do you have any thoughts on the above issues? \$\endgroup\$mankand007– mankand0072023年03月14日 14:40:06 +00:00Commented Mar 14, 2023 at 14:40
-
1\$\begingroup\$ Well your structure is a bit messy
dict[str, list[str | dict[str, list[str]]]]
, because the contents of your list is either adict
or astr
you will need to have some logic to resolve that. I suggest that you make the contents of the list consistent. I'm assumingimport
andsend
are the events where the error occurred. Try something likedict[Event, dict[Action, list[Targets]]]
\$\endgroup\$Jason Leaver– Jason Leaver2023年03月14日 15:28:27 +00:00Commented Mar 14, 2023 at 15:28
isinstance
instead oftype
:if isinstance(val, dict):
. You can also assigntarget
using a conditional expression:target = '' if key == tag else tag
. \$\endgroup\$