[Python-ideas] Is this PEP-able? for X in ListY while conditionZ:

Tue Jul 2 00:44:22 CEST 2013

On 1 July 2013 21:29, David Mertz <mertz at gnosis.cx> wrote:
> However, I see the point made by a number of people that the 'while' clause
> has no straightforward translation into an unrolled loop, and is probably
> ruled out on that basis.

My thought (in keeping with the title of the thread) is that the comprehension
 data = [x for y in stuff while z]
would unroll as the loop
 for y in stuff while z:
     data.append(x)
which would also be valid syntax and have the obvious meaning. This is
similar to Nick's suggestion that 'break if' be usable in the body of
the loop so that
 data = [x for y in stuff; break if not z]
would unroll as
 for y in stuff:
     break if not z
     data.append(x)
Having a while clause on for loops is not just good because it saves a
couple of lines but because it clearly separates the flow control from
the body of the loop (another reason I dislike 'break if'). In other
words I find the flow of the loop
 for p in primes() while p < 100:
     print(p)
easier to understand (immediately) than
 for p in primes():
     if p >= 100:
         break
     print(p)
These are just trivially small examples. As the body of the loop grows
in complexity the readability benefit of moving 'if not z: break' into
the top line becomes more significant.
You can get the same separation of concerns using takewhile at the
expense of a different kind of readability
 for p in takewhile(lambda p: p < 100, primes()):
     print(p)
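For reference, the two spellings that already work in current Python can be checked against each other. This is only a minimal sketch: primes() is not defined anywhere in this message, so the trial-division generator below is an assumed stand-in.

```python
from itertools import count, takewhile


def primes():
    """Assumed stand-in for primes(): yield 2, 3, 5, ... by trial division."""
    found = []
    for n in count(2):
        # n is prime if no previously found prime divides it
        if all(n % p for p in found):
            found.append(n)
            yield n


# Flow control inside the loop body, using an explicit break.
with_break = []
for p in primes():
    if p >= 100:
        break
    with_break.append(p)

# Flow control moved into the loop header, using takewhile.
with_takewhile = list(takewhile(lambda p: p < 100, primes()))

assert with_break == with_takewhile
```

Both loops stop at the first prime that is not below 100; only the place where the stopping condition is written differs.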
However there is another problem with using takewhile in for loops
which is that it discards an item from the iterable. Imagine parsing a
file such as:
csvfile = '''# data.csv
# This file begins with an unspecified number of header lines.
# Each header line begins with '#'.
# I want to keep these lines but need to parse them separately.
# The first non-comment line contains the column headers
x y z
1 2 3
4 5 6
7 8 9'''.splitlines()
You can do
 def parse(csvfile):
     csvfile = iter(csvfile)
     headers = []
     for line in csvfile:
         if not line.startswith('#'):
             break
         headers.append(line[1:].strip())
     # 'line' is now the first non-comment line: the column headers
     fieldnames = line.split()
     for line in csvfile:
         yield {name: int(val) for name, val in zip(fieldnames, line.split())}
However if you use takewhile like
 for line in takewhile(lambda line: line.startswith('#'), csvfile):
     headers.append(line[1:].strip())
then after the loop 'line' holds the last comment line. The discarded
column header line is gone and cannot be recovered; takewhile is
normally only used when the entire remainder of the iterator is to be
discarded.
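The lost item is easy to demonstrate. The following is a small sketch using a shortened, made-up version of the file above (the exact lines are an assumption for illustration):

```python
from itertools import takewhile

# Shortened stand-in for the csvfile example.
lines = iter(['# header one', '# header two', 'x y z', '1 2 3', '4 5 6'])

headers = []
for line in takewhile(lambda line: line.startswith('#'), lines):
    headers.append(line[1:].strip())

# takewhile had to pull 'x y z' out of the iterator in order to test it,
# and then dropped it, so it is neither in 'line' nor left in 'lines'.
last_seen = line
remaining = list(lines)

assert last_seen == '# header two'
assert 'x y z' not in remaining
```

After the loop the iterator resumes at the first data row, so the column header line can only be recovered by re-reading the input.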
I would propose that
 for line in csvfile while line.startswith('#'):
     headers.append(line)
would result in 'line' referencing the item that failed the while predicate.
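The proposed behaviour can be approximated today with a small wrapper that remembers the item that failed the predicate. This is only a sketch of the semantics: WhileLoop and its failed attribute are hypothetical names invented here, not an existing API, and the input is a shortened, made-up version of the file above.

```python
class WhileLoop:
    """Hypothetical helper: iterate while predicate(item) holds and keep
    the first item that fails the predicate in self.failed."""

    def __init__(self, predicate, iterable):
        self.predicate = predicate
        self.iterator = iter(iterable)  # for an iterator this is the same object
        self.failed = None              # stays None if no item fails

    def __iter__(self):
        for item in self.iterator:
            if not self.predicate(item):
                self.failed = item      # remember instead of discarding
                return
            yield item


lines = iter(['# header one', '# header two', 'x y z', '1 2 3', '4 5 6'])

loop = WhileLoop(lambda line: line.startswith('#'), lines)
headers = [line[1:].strip() for line in loop]

# The item that ended the loop is still available.
fieldnames = loop.failed.split()

# The underlying iterator continues with the data rows.
rows = [{name: int(val) for name, val in zip(fieldnames, line.split())}
        for line in lines]
```

With the proposed syntax the wrapper object would be unnecessary, since the loop variable itself would carry the failing item.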
Oscar