I wrote a program sometime ago that reads csv files, sorts, reorders and groups their columns/rows' content (with itertools
) and later labels the groups using a dictionary. The groups and dictionary are then flattened and rows of length N (columns) are uploaded to a spreadsheet. This is another function, not specific to that program, to use in this situation.
def digfrom(iterable, *, depth=None, strings=True):
"""
Dig values from nested iterables, flattening
the input iterable.
>>> iterable = [['a'], 'bc', ('de', ['f'])]
>>> list(digfrom(iterable))
['a', 'b', 'c', 'd', 'e', 'f']
Depth may be limited and "exploding" strings
is optional.
>>> list(digfrom(iterable, depth=2))
['a', 'b', 'c']
>>> list(digfrom(iterable, strings=False))
['a', 'bc', 'de', 'f']
"""
exhausted = object()
iterable_attr = '__iter__'
iterator_attr = '__next__'
iterators = [iter(iterable)]
while iterators:
it = next(iterators[-1], exhausted)
if it is exhausted:
iterators.pop()
continue
if hasattr(it, iterable_attr):
string = isinstance(it, str)
if not ((string and len(it) <= 1) or
(string and not strings)):
it = iter(it)
if hasattr(it, iterator_attr):
iterators.append(it)
iterators = iterators[:depth]
else:
yield it
-
\$\begingroup\$ Downvote. So it's no good I guess. Was just looking for a nudge to redo that awful program I mentioned. Maybe I should've talked about the code, although it seems small and simple enough not to warrant a more detailed explanation. \$\endgroup\$blackleg– blackleg2018年06月18日 00:36:45 +00:00Commented Jun 18, 2018 at 0:36
-
\$\begingroup\$ Take a look at codereview.stackexchange.com/help/how-to-ask - please always include all of your code and a description of what you're doing and why you're doing it. \$\endgroup\$Raystafarian– Raystafarian2018年06月18日 02:47:57 +00:00Commented Jun 18, 2018 at 2:47
-
2\$\begingroup\$ @Raystafarian OP already included a description of the code (albeit somewhat short). There's no need for additional information here. Taking one function from a program to get it reviewed is okay as far as I'm aware. The help center only states that the code presented must work as expected, which it does. \$\endgroup\$Daniel– Daniel2018年06月18日 07:47:20 +00:00Commented Jun 18, 2018 at 7:47
-
1\$\begingroup\$ @Coal_ I was responding to the comment about the downvote - I didn't vote but thought maybe that would help. I really don't like downvotes with no comment :/ Either way the post indicates there are more functions, which aren't included. \$\endgroup\$Raystafarian– Raystafarian2018年06月18日 07:48:36 +00:00Commented Jun 18, 2018 at 7:48
1 Answer 1
Looks like decent code. I think you only got downvoted (once) because you alluded to how you wanted to use this function as part of a (gasp!) program — and then of course twice more because they saw you'd already been downvoted once. That's just how this site works. :)
The biggest problem with your code is that it uses the identifiers iterable
, iterable_attr
, iterator_attr
, iterators
, it
, and iter
, all with different meanings. That's too many it
s!
iterable_attr
and iterator_attr
could be called more like ITER_ATTR
and NEXT_ATTR
, because (A) the capitalization indicates that they're constants, and (B) the names indicate the values they hold. However, this is right up there with THREE = 3
. When you see yourself giving symbolic names to global constants, it's time to DRY up your code:
if hasattr(it, '__iter__'):
string = isinstance(it, str)
if not ((string and len(it) <= 1) or
(string and not strings)):
it = iter(it)
if hasattr(it, '__next__'):
iterators.append(it)
iterators = iterators[:depth]
Actually, that logical expression buried in the middle there looks like it could stand some De Morgan's Laws. Also, string
is a bad name for a boolean! Name your booleans after predicates: in this case, is_string
would do. But let's see if we need it at all.
is_string = isinstance(it, str)
if not ((is_string and len(it) <= 1) or (is_string and not strings)):
it = iter(it)
De Morgan 1:
is_string = isinstance(it, str)
if not(is_string and len(it) <= 1) and not(is_string and not strings):
it = iter(it)
De Morgan 2:
isnt_string = not isinstance(it, str)
if (isnt_string or len(it) > 1) and (isnt_string or strings):
it = iter(it)
Distributive law:
isnt_string = not isinstance(it, str)
if isnt_string or (strings and len(it) > 1):
it = iter(it)
Refactor to emphasize the one special case we're trying to express:
if (not hasattr(it, '__iter__')) or (len(it) <= 1):
pass # can't explode this guy
elif (not strings) and isinstance(it, str):
pass # we're not exploding strings
else:
it = iter(it) # explode it
The next two lines
iterators.append(it)
iterators = iterators[:depth]
look suspicious. We're appending to the array and then immediately truncating it? Couldn't we do that in some cleaner way?
By the way, that *,
in the signature must be a Python-3-ism; I'm not sure what it means. In Python 2 you'd just omit it and it would still do what you wanted.
Your exhausted = object()
... if it is exhausted
idiom is neat, if maybe a bit too cute. It beats catching the exception from next(iterators[-1])
, anyway.
Unit tests (especially for depth=...
) would go a long way here.
-
2\$\begingroup\$ The single star in the arguments is PEP3102. \$\endgroup\$301_Moved_Permanently– 301_Moved_Permanently2018年06月18日 16:22:10 +00:00Commented Jun 18, 2018 at 16:22