This function flattens a nested tuple so it can be splatted into a function's arguments, with structure checking and only one extra function call of overhead (excluding a one-time compilation cost per structure).
def flatten_tuple(data, structure, cache={}):
    if not (unpacker := cache.get(structure)):
        def get_unstruct(structure, n_vars):
            if isinstance(structure, int):
                if structure == 0:
                    return "*x%d" % n_vars, n_vars + 1
                if structure == -1:
                    return "_", n_vars
                new_n_vars = n_vars + structure
                ret = ",".join("x%d" % i for i in range(n_vars, new_n_vars))
                n_vars = new_n_vars
                return ret, new_n_vars
            assert isinstance(structure, tuple)
            unstruct_code = "("
            for x in structure:
                unstruct_code_one, n_vars = get_unstruct(x, n_vars)
                unstruct_code += unstruct_code_one + ","
            return unstruct_code + ")", n_vars

        unstruct_code, n_vars = get_unstruct(structure, 0)
        local_dict = {}
        exec(
            compile("def unpacker(data):\n %s = data\n return x%s" %
                    (unstruct_code, ",x".join(map(str, range(n_vars)))),
                    "",
                    "exec",
                    dont_inherit=True), {}, local_dict)
        unpacker = local_dict["unpacker"]
    cache[structure] = unpacker
    return unpacker(data)
Usage example:
def f(x, y, z):
    print(x, y, z)
f(*flatten_tuple((10, (20, 30)), (1, (1, 1))))
# prints: 10 20 30
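For example, a mismatch between the data and the declared structure surfaces as an error from the generated unpacking assignment (illustrative; the exception is whatever CPython's normal tuple-unpacking raises):

# three flat items, but the structure declares a nested pair in the second slot
f(*flatten_tuple((10, 20, 30), (1, (1, 1))))
# raises TypeError: the nested 20 can't be unpacked into two names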
Any thoughts on how it can be improved?
Edit:
Here's a quick performance comparison against flat as given by @Reinderien:
def flat(t):
    if isinstance(t, tuple):
        for x in t:
            yield from flat(x)
    else:
        yield t
In [2]: def f(x0,x1,x2,x3,x4,x5,x6,x7):
   ...:     return x0+x1+x2+x3+x4+x5+x6+x7
   ...:
In [3]: %timeit f(*flat(((10,20,30),(40,50),(60,70,80))))
2.06 μs ± 69.6 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
In [4]: %timeit f(*flatten_tuple(((10,20,30),(40,50),(60,70,80)), ((3,),(2,),(3,))))
383 ns ± 17.9 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
1 Answer
motivation
As reflected in the comments, the Review Context doesn't really set up why this would be helpful. If I have this or that use case, it's unclear how I would evaluate whether the OP code is a good match for that use case, or if other (not mentioned) competing approaches would be a better fit.
readability
Python offers several tools so you can assist the Gentle Reader in rapidly understanding what is going on, which you chose not to use.
Optional type annotations, e.g. ..., n_vars: int) -> tuple[str, int]:, would improve readability.
Adding a bunch of """docstrings""" would go a long way toward improving readability. Some of them might even use the >>> doctest syntax. Note that even nested functions can have docstrings.
The "handle an integer" code is maybe doing enough that it's worth breaking out a helper function, if only to spell out its contract.
maintainability
In the Review Context we get a hint that this code will be called by an inner loop where every cycle counts. But we find no automated test suite. So when a month from now a maintenance engineer tinkers with the code, it will be difficult to tell if it is "better" or not. In particular, we'll have no measurements for detecting a performance regression.
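Even a tiny pytest module would help. Here is a sketch; the first test's expected values come from the question's own example, the timing budget is a deliberately loose placeholder rather than a measured number, and the flatten module name is hypothetical:

import timeit

from flatten import flatten_tuple   # hypothetical module name

def test_round_trip():
    # expected values taken from the usage example in the question
    assert flatten_tuple((10, (20, 30)), (1, (1, 1))) == (10, 20, 30)
    assert flatten_tuple(((1, 2), 3), ((2,), 1)) == (1, 2, 3)

def test_performance_budget():
    # crude regression guard; the 1.0 s budget is a placeholder, not a measurement
    elapsed = timeit.timeit(
        lambda: flatten_tuple((10, (20, 30)), (1, (1, 1))), number=100_000)
    assert elapsed < 1.0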
introspection
In the sample usage

f(*flatten_tuple((10, (20, 30)), (1, (1, 1))))

we only get to examine the input data. If a reference to f was available, we could ask what args f wants to receive, and verify they're compatible. Phrasing your code as a decorator would be one way to approach that.
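Here is one sketch of that decorator idea, assuming we only count plain positional parameters and ignore the 0 (*x) and -1 (_) structure specs for simplicity; accepts_nested and count_leaves are made-up names:

import inspect
from functools import wraps

def count_leaves(structure):
    """How many positional values a structure spec yields (ignores 0 / -1 specs)."""
    if isinstance(structure, int):
        return max(structure, 0)
    return sum(count_leaves(s) for s in structure)

def accepts_nested(structure):
    """Wrap f so it takes one nested tuple; check f's arity at decoration time."""
    def deco(f):
        params = [p for p in inspect.signature(f).parameters.values()
                  if p.kind in (p.POSITIONAL_ONLY, p.POSITIONAL_OR_KEYWORD)]
        if len(params) != count_leaves(structure):
            raise TypeError("%s takes %d positional args, structure yields %d"
                            % (f.__name__, len(params), count_leaves(structure)))
        @wraps(f)
        def wrapper(data):
            return f(*flatten_tuple(data, structure))
        return wrapper
    return deco

@accepts_nested((1, (1, 1)))
def f(x, y, z):
    print(x, y, z)

f((10, (20, 30)))   # prints: 10 20 30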
cache idiom
We see a nice, simple caching approach which is readily understood. Nonetheless, it could be simplified slightly. Such code is easier to reason about (and to test) when there's a single unconditional code path for consuming the cached value.
In the OP code we find
if not (unpacker := cache.get(structure)):
    ... do stuff ...
    unpacker = local_dict["unpacker"]
cache[structure] = unpacker
return unpacker(data)
This requires the Gentle Reader to understand some extra relationships. Plus it always does a cache write, even on a cache hit, which is slightly surprising.
Consider avoiding the walrus operator, and proceeding in two sequential steps.
- ensure an entry is cached
- use that entry
if structure not in cache:
    ... do stuff ...
    cache[structure] = local_dict["unpacker"]
unpacker = cache[structure]
return unpacker(data)
Consider using a @cache or @lru_cache decorator.
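For example, the compilation step could move into its own @lru_cache'd factory, so the hand-rolled dict keyed on structure disappears. A rough sketch, intended to mirror the original code generation:

from functools import lru_cache

@lru_cache(maxsize=None)
def make_unpacker(structure):
    # compile once per structure; lru_cache keys on the (hashable) structure tuple
    def gen(struct, n):
        # same recursive code generation as get_unstruct in the question
        if isinstance(struct, int):
            if struct == 0:
                return "*x%d" % n, n + 1
            if struct == -1:
                return "_", n
            return ",".join("x%d" % i for i in range(n, n + struct)), n + struct
        parts = []
        for sub in struct:
            code, n = gen(sub, n)
            parts.append(code)
        return "(%s,)" % ",".join(parts), n

    code, n = gen(structure, 0)
    namespace = {}
    exec("def unpacker(data):\n %s = data\n return x%s"
         % (code, ",x".join(map(str, range(n)))), {}, namespace)
    return namespace["unpacker"]

def flatten_tuple(data, structure):
    return make_unpacker(structure)(data)

flatten_tuple((10, (20, 30)), (1, (1, 1)))   # -> (10, 20, 30)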
- nice catch with the cache write. I should have just done it inside the "if" branch.. – user1537366, Jul 30, 2024 at 8:22
- I don't see the point of avoiding the walrus operator. – user1537366, Jul 30, 2024 at 8:31