I'm writing a function that will read a file given the number of header lines to skip and the number of footer lines to skip.
def LoadText(file, HeaderLinesToSkip, FooterLinesToSkip):
    fin = open(file)
    text = []
    for line in fin.readlines()[HeaderLinesToSkip:-FooterLinesToSkip]:
        text.append(line.strip())
    return text
My problem is that this function works properly only if FooterLinesToSkip is at least 1. If FooterLinesToSkip = 0, the function returns []. I can solve this with an if statement, but is there a simpler form?
Edit: I actually simplified my problem; the lines read from the file contain columns separated by semicolons. The real function includes .split(delimiter_character) and should store only column 1.
def LoadText(file, HeaderLinesToSkip, FooterLinesToSkip):
    fin = open(file)
    text = []
    for line in fin.readlines()[HeaderLinesToSkip:-FooterLinesToSkip]:
        text.append(line.strip().split(';')[1])
    return text
1 Answer
Set FooterLinesToSkip to None when it is 0, so that the slice runs to the end of the list:
def LoadText(file, HeaderLinesToSkip, FooterLinesToSkip):
    with open(file) as fin:
        # A stop of -0 is 0, which gives an empty slice; None means "to the end".
        FooterLinesToSkip = -FooterLinesToSkip if FooterLinesToSkip else None
        text = []
        for line in fin.readlines()[HeaderLinesToSkip:FooterLinesToSkip]:
            text.append(line.strip().split(';')[1])
    return text
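The reason the original slice fails is that -0 is simply 0, so the stop index becomes 0 and the slice is empty. Here is a minimal sketch of that behaviour; the list `lines` and its contents are made up for illustration:

```python
lines = ["header", "a", "b", "footer"]
footer_skip = 0

# -0 == 0, so the slice stops at index 0 and nothing is selected.
empty = lines[1:-footer_skip]

# None as the stop index means "up to the end of the list".
stop = -footer_skip if footer_skip else None
with_none = lines[1:stop]

# An alternative without a conditional: compute the stop from the length.
with_len = lines[1:len(lines) - footer_skip]
```

Both `with_none` and `with_len` keep everything after the header when `footer_skip` is 0, and they drop the trailing lines as expected for positive values.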
Let me offer you an improvement, which does not require you to read the whole file into memory:
from collections import deque
from itertools import islice

def skip_headers_and_footers(fh, header_skip, footer_skip):
    buffer = deque(islice(fh, header_skip, header_skip + footer_skip), footer_skip)
    for line in fh:
        yield buffer.popleft()
        buffer.append(line)
This skips header_skip lines, then reads the remaining lines one by one while keeping footer_skip lines in a buffer. By the time we have looped over all lines in the file, the last footer_skip lines remain in the buffer and are never yielded.
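One caveat: the generator pops from the buffer before appending to it, so calling it with a footer_skip of 0 tries to pop from an empty deque and raises IndexError. Since the question is specifically about the zero case, here is a sketch of a variant that tolerates it. The name `skip_headers_and_footers_safe` is hypothetical, and the approach (append first, then pop, with no maxlen on the deque) is one possible fix rather than the only one:

```python
from collections import deque
from itertools import islice

def skip_headers_and_footers_safe(iterable, header_skip, footer_skip):
    """Yield items, minus the first header_skip and the last footer_skip."""
    it = iter(iterable)
    # Consume and discard the header items.
    for _ in islice(it, header_skip):
        pass
    # Prime the buffer with the next footer_skip items (none if it is 0).
    buffer = deque(islice(it, footer_skip))
    for item in it:
        # Append first, so the buffer is never empty when we pop from it.
        buffer.append(item)
        yield buffer.popleft()
```

With footer_skip set to 0 the buffer starts empty, each item is appended and immediately popped, and every line after the header is yielded.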
This is a generator function, so it'll yield lines in a loop:
with open(filename) as open_file:
    for line in skip_headers_and_footers(open_file, 2, 2):
        # do something with this line.
        line = line.strip()
I moved the file opening out of the function so that it can be used for other iterables too, not just files.
Now you can use the csv module to handle the column splitting and stripping:
import csv

# 'rb' is the Python 2 convention; on Python 3 use open(filename, newline='').
with open(filename, 'rb') as open_file:
    reader = csv.reader(open_file, delimiter=';')
    for row in skip_headers_and_footers(reader, 2, 2):
        column = row[1]
Here the skip_headers_and_footers() generator skips the first two rows for you and never yields the last two rows either.
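To try the csv combination without a file on disk, here is a self-contained sketch for Python 3 that feeds the reader from an in-memory string. The sample data and the names `sample` and `columns` are invented for the example, and it skips one header and one footer row rather than two:

```python
import csv
import io
from collections import deque
from itertools import islice

def skip_headers_and_footers(fh, header_skip, footer_skip):
    # Same generator as in the answer: buffer footer_skip rows and lag behind.
    buffer = deque(islice(fh, header_skip, header_skip + footer_skip), footer_skip)
    for line in fh:
        yield buffer.popleft()
        buffer.append(line)

# Made-up data: one header row, two data rows, one footer row.
sample = "name;value\nx;1\ny;2\ntotal;3\n"

reader = csv.reader(io.StringIO(sample), delimiter=';')
# Keep only column 1 of each row that is neither header nor footer.
columns = [row[1] for row in skip_headers_and_footers(reader, 1, 1)]
```

Because csv.reader returns an iterator, it can be passed to the generator just like an open file.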
7 Comments
.strip() call, for some reason. The second sample is more generic but I made an editing error. Mea culpa, corrected the sample.
list, not an array; there is a big difference between the types. :-)