In the following code, I have run into a RecursionError: maximum recursion depth exceeded.
def unpack(given):
for i in given:
if hasattr(i, '__iter__'):
yield from unpack(i)
else:
yield i
some_list = ['a', ['b', 'c'], 'd']
unpacked = list(unpack(some_list))
This works fine if I use some_list = [1, [2, [3]]], but not when I try it with strings.
I suspect my lack of knowledge in python. Any guidance appreciated.
1 Answer 1
Strings are infinitely iterable. Even a one-character string is iterable.
Therefore, you will always get a stack overflow unless you add special handling for strings:
def flatten(x):
try:
it = iter(x)
except TypeError:
yield x
return
if isinstance(x, (str, bytes)):
yield x
return
for elem in it:
yield from flatten(elem)
Note: using hasattr(i, '__iter__') is not sufficient to check if i is iterable, because there are other ways to satisfy the iterator protocol. The only reliable way to determine whether an object is iterable is to call iter(obj).
8 Comments
hasattr(i, '__iter__') is not sufficient to check if i is iterable" - true, but isinstance(x, Iterable) fails for many of the same cases. The most reliable check is to just call iter.__len__ that returns an int (and make sure it's in range(0, 1<<32)), and a __getitem__ that doesn't raise IndexError for any value in range(0, __len__()) but does for __len__(), and you've got an iterable but not an Iterable.flatten? I almost always prefer EAFP, but somehow I always come back to LBYL for this one. For example - which exception(s) to catch, is just TypeError enough? It's hard to say.isinstance(x, Iterable) is definitely an improvement over hasattr(x, '__iter__'), at least. It'll pick up "anti-registration" through __iter__ = None, and it'll detect isinstance("asdf", Iterable) in Python 2 (though only due to a register call).
for x in 'a'yields'a'itself.some_list = []; some_list.append(some_list); unpacked = list(unpack(some_list))to see that this can happen with anything with depth>1000. So the remaining question is why every string has depth>1000, which wim's answer (and BallpointBen's comment) explains.__iter__forlistreturns itself, and naturally it's unending recursion?some_list[0] is some_list. I thought this would be less surprising than the fact that ifs = 'a',s[0] is s, so it would help illuminate the problem, but now that I think about it, how many people actually know about recursive lists in Python? The only really obvious example would be a class that explicitly iterates itself, which is too big and distracting to be worth commenting about; better to just go straight tos[0] is sfor strings as BallpointBen did.