In a compiler project for an LL(1/2) grammar, I have had a need for a generator iterator with a push back function. I have been surprised not to find any perfectly applicable solution (clean, simple, obvious, standard, etc.). On the other hand I was surprised of how easy it was to create exactly what I wanted after I decided to invest a little time.
Here is what I came up with:
class Back_pushable_iterator:
"""Class whose constructor takes an iterator as its only parameter, and
returns an iterator that behaves in the same way, with added push back
functionality.
The idea is to be able to push back elements that need to be retrieved once
more with the iterator semantics. This is particularly useful to implement
LL(k) parsers that need k tokens of lookahead. Lookahead or push back is
really a matter of perspective. The pushing back strategy allows a clean
parser implementation based on recursive parser functions.
The invoker of this class takes care of storing the elements that should be
pushed back. A consequence of this is that any elements can be "pushed
back", even elements that have never been retrieved from the iterator.
The elements that are pushed back are then retrieved through the iterator
interface in a LIFO-manner (as should logically be expected).
This class works for any iterator but is especially meaningful for a
generator iterator, which offers no obvious push back ability.
In the LL(k) case mentioned above, the tokenizer can be implemented by a
standard generator function (clean and simple), that is completed by this
class for the needs of the actual parser.
"""
def __init__(self, iterator):
self.iterator = iterator
self.pushed_back = []
def __iter__(self):
return self
def __next__(self):
if self.pushed_back:
return self.pushed_back.pop()
else:
return next(self.iterator)
def push_back(self, element):
self.pushed_back.append(element)
def main():
it = Back_pushable_iterator(x for x in range(10))
x = next(it) # 0
print(x)
it.push_back(x)
x = next(it) # 0
print(x)
x = next(it) # 1
print(x)
x = next(it) # 2
y = next(it) # 3
print(x)
print(y)
it.push_back(y)
it.push_back(x)
x = next(it) # 2
y = next(it) # 3
print(x)
print(y)
for x in it:
print(x) # 4-9
it.push_back(x)
y = next(it) # 9
print(x)
if __name__ == "__main__":
main()
Any feedback is appreciated. A better name? ;-)
Note: I have already posted this code as a very late answer to https://stackoverflow.com/questions/2425270/how-to-look-ahead-one-element-in-a-python-generator, but got no feeback.
-
1\$\begingroup\$ I have little time for a deep review, which wouldn't be that long since your code is nicely short. However, have you thoroughly tested the case where the underlying iterator is exhausted, and the user wants to push back? \$\endgroup\$aghast– aghast2017年10月25日 17:58:03 +00:00Commented Oct 25, 2017 at 17:58
-
\$\begingroup\$ @Austin: thanks. I don't know about thoroughly but I added a little test at the end (updated) and it behaves as I expected. \$\endgroup\$nilo– nilo2017年10月25日 18:04:41 +00:00Commented Oct 25, 2017 at 18:04
3 Answers 3
This is to expand on Austin Hastings comment.
According to Python Standard Library documentation,
The intention of the protocol is that once an iterator’s next() method raises StopIteration, it will continue to do so on subsequent calls. Implementations that do not obey this property are deemed broken.
It means that once the underlying iterator raises StopIteration
, your object shall not accept any more push backs.
I would call the list lookahead
rather than pushed_back
, but it is a matter of taste.
Otherwise, LGTM.
-
\$\begingroup\$ Thanks a lot. Now I get it. Do you think it is reasonable to raise StopIteration for push_back() subsequent calls too? I have updated the code accordingly. When it comes to lookahead, I have actually considered using that name for a method that would look ahead in the iterator (without changing the object's state from a user perspective). Therefore, I will not rename pushed_back to lookahead. \$\endgroup\$nilo– nilo2017年10月26日 17:37:17 +00:00Commented Oct 26, 2017 at 17:37
-
1\$\begingroup\$ @nilo I am not sure about
StopIteration
(it is strictly reserved for thenext
call). I would go with a custom exception. \$\endgroup\$vnp– vnp2017年10月26日 18:08:23 +00:00Commented Oct 26, 2017 at 18:08 -
\$\begingroup\$ I think, however, that if you deny push_back after the underlying iteration has stopped the class will not be useful for the intended scope. \$\endgroup\$Emanuele Paolini– Emanuele Paolini2019年07月03日 12:10:52 +00:00Commented Jul 3, 2019 at 12:10
The corrected code after review from vnp:
class StopPushingBack(Exception):
pass
class Back_pushable_iterator:
"""Class whose constructor takes an iterator as its only parameter, and
returns an iterator that behaves in the same way, with added push back
functionality.
The idea is to be able to push back elements that need to be retrieved once
more with the iterator semantics. This is particularly useful to implement
LL(k) parsers that need k tokens of lookahead. Lookahead or push back is
really a matter of perspective. The pushing back strategy allows a clean
parser implementation based on recursive parser functions.
The invoker of this class takes care of storing the elements that should be
pushed back. A consequence of this is that any elements can be "pushed
back", even elements that have never been retrieved from the iterator.
The elements that are pushed back are then retrieved through the iterator
interface in a LIFO-manner (as should be logically expected).
This class works for any iterator but is especially meaningful for a
generator iterator, which offers no obvious push back ability.
In the LL(k) case mentioned above, the tokenizer can be implemented by a
standard generator function (clean and simple), that is completed by this
class for the needs of the actual parser.
Once the iterator's next() method raises StopIteration, subsequent calls to
push_back() will raise StopPushingBack.
"""
def __init__(self, iterator):
self.iterator = iterator
self.pushed_back = []
self.ok_to_push_back = True
def __iter__(self):
return self
def __next__(self):
if self.pushed_back:
return self.pushed_back.pop()
else:
try:
return next(self.iterator)
except StopIteration:
self.ok_to_push_back = False
raise
def push_back(self, element):
if not self.ok_to_push_back:
raise StopPushingBack
else:
self.pushed_back.append(element)
def main():
it = Back_pushable_iterator(x for x in range(10))
x = next(it)
print(x) # 0
it.push_back(x)
x = next(it)
print(x) # 0
x = next(it)
print(x) # 1
x = next(it)
y = next(it)
print(x) # 2
print(y) # 3
it.push_back(y)
it.push_back(x)
x = next(it)
y = next(it)
print(x) # 2
print(y) # 3
for x in it:
print(x) # 4-9
it.push_back(x) # StopPushingBack
if __name__ == "__main__":
main()
I am not too used to defining custom exceptions. Is that fine?
For Python 2.7 the above code is slightly modified and improved:
__next__()
->next()
self.iterator = iter(iterator)
- for simpler initiation, for examplePushBackIterator(range(10))
insteadPushBackIterator(x for x in range(10))
- names of classes
class PushBackDenied(Exception):
pass
class PushBackIterator(object):
def __init__(self, iterator):
self.iterator = iter(iterator)
self.pushed_back = []
self.ok_to_push_back = True
def __iter__(self):
return self
def next(self):
if self.pushed_back:
return self.pushed_back.pop()
else:
try:
return next(self.iterator)
except StopIteration:
self.ok_to_push_back = False
raise
def push_back(self, element):
if not self.ok_to_push_back:
raise PushBackDenied
else:
self.pushed_back.append(element)
it = PushBackIterator(range(10))
for i, x in enumerate(it):
print i, x
if i % 2:
it.push_back(x)
0 0
1 1
2 1
3 2
4 2
5 3
6 3
7 4
8 4
9 5
10 5
11 6
12 6
13 7
14 7
15 8
16 8
17 9
18 9
-
\$\begingroup\$ some explanation wouldn't hurt \$\endgroup\$dfhwze– dfhwze2019年07月03日 12:04:51 +00:00Commented Jul 3, 2019 at 12:04
-
\$\begingroup\$ I just rewrite the code above for Python 2.7
def __next__()
->def next()
and addediter(iterator)
in__init__()
for simpler initiation. Example of that usage posted below. Thank you for your question, I tried to understand better what I did ;-) \$\endgroup\$Alexey Ruzin– Alexey Ruzin2019年07月03日 12:59:02 +00:00Commented Jul 3, 2019 at 12:59 -
\$\begingroup\$ added description of changes \$\endgroup\$Alexey Ruzin– Alexey Ruzin2019年07月03日 13:03:35 +00:00Commented Jul 3, 2019 at 13:03
-
\$\begingroup\$ and, sorry, I missed the site, thought of being at stackoverflow ;-) \$\endgroup\$Alexey Ruzin– Alexey Ruzin2019年07月03日 13:11:12 +00:00Commented Jul 3, 2019 at 13:11