This code snippet shows 2 common patterns I see a lot in Python that I don't really like:
- Arbitrarily-named, narrowly-scoped recursive function
- Inappropriate data type (in this case, an array
ans
to store 1 float) used purely for its mutability and scope, to store side-effects from some other scope
# Definition for a binary tree node.
# class TreeNode:
# def __init__(self, x):
# self.val = x
# self.left = None
# self.right = None
class Solution:
def maximumAverageSubtree(self, root: TreeNode) -> float:
ans = [0]
def helper(node):
if not node:
return 0, 0
left_sum, left_count = helper(node.left)
right_sum, right_count = helper(node.right)
ans[0] = max(ans[0], (left_sum + right_sum + node.val) / (left_count + right_count + 1))
return left_sum + right_sum + node.val, left_count + right_count + 1
helper(root)
return ans[0]
ans
has to be an array, even though we just want to store one scalar value, because if we tried to use a float it would not be visible inside helper
. This seems like an ugly shortcut to me.
And the other pattern I don't like is defining helper
just so we can do recursion with additional return values for logic, then return helper
's side effects. Is it normal to define short, sort of "throwaway" functions like this?
2 Answers 2
Non-Local Variables
ans
is visible (for reading) inside of helper
. It just can't be changed, although its contents can be modified.
So yes, ans = [0]
is an ugly hack. You don't need it. What you need in its place is nonlocal
.
def maximumAverageSubtree(self, root: TreeNode) -> float:
ans = 0
def helper(node):
nonlocal ans # <--- Refer to ans in outer (but not global) scope
if not node:
return 0, 0
left_sum, left_count = helper(node.left)
right_sum, right_count = helper(node.right)
ans = max(ans, (left_sum + right_sum + node.val) / (left_count + right_count + 1))
return left_sum + right_sum + node.val, left_count + right_count + 1
helper(root)
return ans
Arbitrary-named, narrowly-scoped recursive function
In this case, you're going to need to get used to it.
Instead of creating a new function, which is visible outside the maximumAverageSubtree
function, this helper
is only useful to this function. It does not need to exposed to the outside, so it makes sense to hide it inside.
This is a common paradigm in Python. Decorators use this all the time. For example, a @timed
decorator has an internal wrapper()
function.
def timed(func):
def wrapper(*argv, **kwargs):
start_time = time.perf_counter()
result = func(*argv, **kwargs)
end_time = time.perf_counter()
print(func.__name__, "took", end_time - start_time, "seconds")
return result
return wrapper
The function shouldn't be arbitrarily named; its name should reflect its purpose. Here, we are wrapping a call to another function, so wrapper()
makes sense.
Above, you have a helper()
function. That probably could be named better. Maybe process_subtree(node)
. But it is "scoped" inside maximumAverageSubtree()
, so its name doesn't need to repeat that level of detail.
-
1\$\begingroup\$ I wasn't aware of
nonlocal
, I like that a lot. Likeglobal
without all the namespace pollution. Thank you. \$\endgroup\$NightDriveDrones– NightDriveDrones2019年08月18日 20:06:16 +00:00Commented Aug 18, 2019 at 20:06
No need for a hack or nonlocal variable, return the maximum average:
class Solution:
def maximumAverageSubtree(self, root: TreeNode) -> float:
def helper(node):
if not node:
return 0, 0, 0
left_max_avg, left_sum, left_count = helper(node.left)
right_max_avg, right_sum, right_count = helper(node.right)
this_sum = left_sum + right_sum + node.val
this_count = left_count + right_count + 1
this_avg = this_sum / this_count
this_max_avg = max(left_max_avg, this_max_avg, right_max_avg )
return this_max_avg, this_sum , this_count
max_avg, _, _ = helper(root)
return max_avg
Or, in this case you could use a class instance variable:
class Solution:
def maximumAverageSubtree(self, root: TreeNode) -> float:
def helper(node):
if not node:
return 0, 0
left_sum, left_count = helper(node.left)
right_sum, right_count = helper(node.right)
this_sum = left_sum + right_sum + node.val
this_count = left_count + right_count + 1
self.ans = max(self.ans, this_sum / this_count)
return this_sum , this_count
self.ans = 0
helper(root)
return self.ans
-
\$\begingroup\$ Is it acceptable to define an object property outside of
__init__
, or should they always first be declared inside__init__
? \$\endgroup\$NightDriveDrones– NightDriveDrones2019年08月18日 19:57:20 +00:00Commented Aug 18, 2019 at 19:57 -
\$\begingroup\$ Generally, it's good to put it in init. If you expect to call maximumAverageSubtree more than once, self.ans would have to be reinitialized every call. Using @AJNeufelds's answer with
nonlocal
or returning the max avg as in my first answer is a better approach. \$\endgroup\$RootTwo– RootTwo2019年08月18日 21:06:10 +00:00Commented Aug 18, 2019 at 21:06 -
\$\begingroup\$ Should probably be
if not node: return float('-inf'), 0, 0
\$\endgroup\$Peter Taylor– Peter Taylor2019年08月19日 08:55:40 +00:00Commented Aug 19, 2019 at 8:55