I have a section of code that I use to extract an event log from a large text file. It works well, but my use of list(itertools.takewhile(...)) to skip over lines feels a little sketchy to me.
Is there a nicer way of doing this?
import itertools
testdata = '''
Lots of other lines...
Really quite a few.
*************
* Event Log *
*************
Col1 Col2 Col3
----- ----- -----
1 A B
2 A C
3 B D

Other non-relevant stuff...
'''
def extractEventLog(fh):
    fhlines = (x.strip() for x in fh)
    list(itertools.takewhile(lambda x: 'Event Log' not in x, fhlines))
    list(itertools.takewhile(lambda x: '-----' not in x, fhlines))
    lines = itertools.takewhile(len, fhlines)  # Event log terminated by blank line
    for line in lines:
        yield line  # In the real code, it's parseEventLogLine(line)
Expected output:
>>> list(extractEventLog(testdata.splitlines()))
['1 A B', '2 A C', '3 B D']
1 Answer
Yes, it is indeed a bit sketchy/confusing to use takewhile when you really don't want to take the lines, but discard them. I think it's better to use dropwhile and then use its return value instead of discarding it. I believe that captures the intent much more clearly:
def extractEventLog(fh):
    fhlines = (x.strip() for x in fh)
    lines = itertools.dropwhile(lambda x: 'Event Log' not in x, fhlines)
    lines = itertools.dropwhile(lambda x: '-----' not in x, lines)
    # Drop the line with the dashes. The built-in next() works on Python 2.6+
    # and 3; the None default keeps a missing dashed line from raising.
    next(lines, None)
    lines = itertools.takewhile(len, lines)  # Event log terminated by blank line
    for line in lines:
        yield line  # In the real code, it's parseEventLogLine(line)
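The extra step that drops the dashed line is needed because the two functions treat the first non-matching item differently. Here is a minimal sketch of that difference, assuming Python 3; the data list and the is_not_rule helper are made up for illustration and are not part of the original code:

```python
import itertools

# Hypothetical stand-in for the file: two junk lines, the dashed rule,
# two data rows, a blank terminator and some trailing text.
data = ['skip', 'skip', '-----', '1 A B', '2 A C', '', 'tail']


def is_not_rule(line):
    """True for every line that is not the dashed rule."""
    return '-----' not in line


# takewhile on a shared iterator: the dashed rule is read and thrown away,
# so the iterator resumes at the first data row.
shared = iter(data)
taken = list(itertools.takewhile(is_not_rule, shared))
after_takewhile = next(shared)

# dropwhile: the dashed rule is the first item it yields, so it still has
# to be skipped before the data rows.
dropped = itertools.dropwhile(is_not_rule, data)
first_from_dropwhile = next(dropped)
after_dropwhile = next(dropped)

print(taken)                 # ['skip', 'skip']
print(after_takewhile)       # 1 A B
print(first_from_dropwhile)  # -----
print(after_dropwhile)       # 1 A B
```

So the takewhile version in the question silently relies on the sentinel line being consumed, while the dropwhile version makes skipping it explicit.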
Much nicer! My eyes must have glossed over dropwhile. – MikeyB, Mar 19, 2011 at 23:03