\$\begingroup\$

I need a function to iterate through a Python iterable in chunks. That is, it takes an iterable and a chunk size n and yields generators, each iterating through one chunk of n items. After some experimentation, I wrote this stupid hack, because there seems to be no easy way to check in advance whether an iterator has been exhausted. How can I improve this code?

def iterchunks(it, n):
    def next_n(it_, n_, return_first = None):
        if return_first is not None:
            n_ -= 1
            yield return_first
        for _ in range(n_):
            yield next(it_)
    # check if the iterator is exhausted by advancing the iterator,
    # if not return the value returned by advancing the iterator along with the boolean result
    def exhausted(it_):
        res = next(it_, None)
        return res is None, res
    while True:
        exhsted, rf = exhausted(it)
        if exhsted:
            return
        else:
            # if the iterator is not exhausted, yield the returned value along with the next chunk
            yield next_n(it, n, rf)
asked Apr 26, 2017 at 22:02
\$\endgroup\$
  • \$\begingroup\$ Hmm, not an exact solution, but... do you know the StopIteration exception? try: while True: yield [next(it) for _ in range(n)] except StopIteration: pass \$\endgroup\$ Commented Apr 26, 2017 at 22:20
  • \$\begingroup\$ @enedil Yes, if I just pack the generators into lists or tuples, there would be no problem since the StopIteration exception would be triggered upon calling any empty generators. That would sacrifice some flexibility and laziness though. \$\endgroup\$ Commented Apr 26, 2017 at 22:39

1 Answer

\$\begingroup\$

Your code has a significant bug. If the input has a None at the start of a chunk (0-based index c * n, i.e. the (c * n + 1)-th item), the exhausted check mistakes it for the end of the iterator, and the rest of the input is never returned:

xs = list(range(75, 90))
xs[5] = None
print([list(c) for c in iterchunks(iter(xs), 5)])
# Outputs [[75, 76, 77, 78, 79]]
# Expected [[75, 76, 77, 78, 79], [None, 81, 82, 83, 84], [85, 86, 87, 88, 89]]

To resolve this, use the standard practice of trying something and asking for forgiveness later: call next and catch StopIteration instead of checking for exhaustion up front. I would suggest building up a list for each chunk. Still, this seems like a case of reinventing the wheel; it is unfortunate Python doesn't have it built in to the itertools library. The itertools docs do define a grouper recipe, which is kinda what you want, except that it pads the last chunk with a fill value.

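def chunk(it, n):
    it = iter(it)  # Accept any iterable, not only iterators
    try:
        while True:
            xs = []  # The buffer to hold the next n items
            for _ in range(n):
                xs.append(next(it))
            yield xs
    except StopIteration:
        # Yield the final partial chunk, but skip it when it is empty
        # (input length is a multiple of n, or the input is empty)
        if xs:
            yield xs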
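
The list-building approach above gives up laziness within each chunk, which the asker wanted to keep. A minimal sketch of a lazy alternative, assuming the caller fully consumes each chunk before asking for the next one (the name `lazy_chunks` is hypothetical):

```python
from itertools import chain, islice

def lazy_chunks(iterable, n):
    """Yield lazy iterators over successive chunks of up to n items."""
    it = iter(iterable)
    missing = object()  # unique sentinel, so None stays a legal item
    while True:
        first = next(it, missing)
        if first is missing:
            return  # the underlying iterator is exhausted
        # Assumption: each chunk is consumed before the next one is requested;
        # otherwise the chunks share `it` and their boundaries shift.
        yield chain([first], islice(it, n - 1))

xs = list(range(75, 90))
xs[5] = None
print([list(c) for c in lazy_chunks(xs, 5)])
# [[75, 76, 77, 78, 79], [None, 81, 82, 83, 84], [85, 86, 87, 88, 89]]
```

Pulling one item ahead of time is the same trick as in the question, but `islice` stops quietly at the end of the input, so a short final chunk needs no special handling.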

This is the grouper recipe from the itertools docs, with one amendment: it uses yield from instead of returning the iterable it creates.

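from itertools import zip_longest

def grouper(iterable, n, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    # grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx
    # n references to the same iterator make zip_longest pull n items per tuple
    yield from zip_longest(*[iter(iterable)] * n, fillvalue=fillvalue)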
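
To make the padding behaviour concrete, here is a small usage sketch. The helper `grouper_unpadded` and the sentinel `_PAD` are hypothetical additions, not part of the itertools recipe; they show one way to drop the fill values again if a short final chunk is preferred.

```python
from itertools import zip_longest

def grouper(iterable, n, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    yield from zip_longest(*[iter(iterable)] * n, fillvalue=fillvalue)

print(list(grouper('ABCDEFG', 3, 'x')))
# [('A', 'B', 'C'), ('D', 'E', 'F'), ('G', 'x', 'x')]

_PAD = object()  # hypothetical sentinel that cannot collide with real items

def grouper_unpadded(iterable, n):
    """Like grouper, but strip the padding from the last chunk."""
    for group in grouper(iterable, n, _PAD):
        yield tuple(x for x in group if x is not _PAD)

print(list(grouper_unpadded('ABCDEFG', 3)))
# [('A', 'B', 'C'), ('D', 'E', 'F'), ('G',)]
```

Note that both versions build each chunk as a full tuple, so they are not lazy within a chunk.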
answered Apr 27, 2017 at 0:41
\$\endgroup\$
