I'm trying to take a text file and use only the first 30 lines of it in Python. This is what I wrote:
text = open("myText.txt")
lines = myText.readlines(30)
print lines
For some reason I get more than 150 lines when I print. What am I doing wrong?
4 Answers
Use itertools.islice
import itertools
for line in itertools.islice(open("myText.txt"), 0, 30):
    print line
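As a rough sketch of the same idea wrapped in a function (Python 3 syntax is assumed here, and first_lines is just an illustrative name, not something from the answer), using io.StringIO as a stand-in for a real file so it runs on its own:

```python
import io
from itertools import islice


def first_lines(fileobj, n=30):
    """Return up to n lines from an open file-like object as a list."""
    return list(islice(fileobj, n))


# io.StringIO stands in for a real file so the sketch is self-contained.
short_file = io.StringIO("one\ntwo\nthree\n")
print(first_lines(short_file))  # only three lines exist, so three come back

long_file = io.StringIO("".join("line %d\n" % i for i in range(100)))
head = first_lines(long_file)
print(len(head))

# islice stops pulling from the file after n lines, so the next read
# picks up where it left off instead of at EOF.
next_line = long_file.readline()
print(repr(next_line))
```

With a real file you would probably write something like `with open("myText.txt") as f: head = first_lines(f)` so the file gets closed afterwards.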
3 Comments
[1.9277660846710205, 1.9260480403900146, 1.9186549186706543] for a file of about 500 lines, and [1.5532219409942627, 1.5311739444732666, 1.5274620056152344] for one of 50, but I would appreciate cross-checking my findings...
...islice and repeat the operation twice, you'll see that it continues where it left off i.e. the file was not read till the end.
If you are going to process your lines individually, an alternative could be to use a loop:
file = open('myText.txt')
for i in range(30):
    line = file.readline()
    # do stuff with line here
EDIT: some of the comments below express concern about this method assuming there are at least 30 lines in the file. If that is an issue for your application, you can check the value of line before processing. readline() will return an empty string '' once EOF has been reached:
for i in range(30):
    line = file.readline()
    if line == '':  # note that an empty line will return '\n', not ''!
        break
    # do stuff with line here
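For what it's worth, here is a minimal self-contained sketch of that EOF check (Python 3 syntax assumed; read_up_to is a made-up helper name, and io.StringIO stands in for the real file). It shows that a blank line in the middle of the file does not stop the loop, while a file shorter than the limit does:

```python
import io


def read_up_to(fileobj, limit=30):
    """Collect at most `limit` lines using readline(), stopping early at EOF."""
    collected = []
    for _ in range(limit):
        line = fileobj.readline()
        if line == "":  # EOF; a blank line in the file would be "\n"
            break
        collected.append(line)
    return collected


# The middle line is blank ("\n"), which must not be mistaken for EOF.
sample = io.StringIO("first\n\nthird\n")
lines = read_up_to(sample)
print(lines)
```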
7 Comments
readline() instead of readlines()?
readline() will return "". You're still iterating through 30 values even if there are fewer lines in the file.
The sizehint argument for readlines isn't what you think it is (bytes, not lines).
If you really want to use readlines, try text.readlines()[:30] instead.
Do note that this is inefficient for large files as it first creates a list containing the whole file before returning a slice of it.
A straightforward solution would be to use readline within a loop (as shown in mac's answer).
To handle files of various sizes (more or less than 30), Andrew's answer provides a robust solution using itertools.islice(). To achieve similar results without itertools, consider:
output = [line for _, line in zip(range(30), open("yourfile.txt", "r"))]
or as a generator expression (Python 2.4+):
output = (line for _, line in zip(range(30), open("yourfile.txt", "r")))
for line in output:
    pass  # do something with line
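One detail that may be worth spelling out is why range comes first in the zip. The sketch below (Python 3 assumed, io.StringIO standing in for the file, make_file being a made-up helper) compares both argument orders. zip asks its arguments for items left to right, so with the file first it pulls one extra line before noticing that range has run out:

```python
import io


def make_file():
    # Five numbered lines in an in-memory stand-in for a real file.
    return io.StringIO("".join("line %d\n" % i for i in range(5)))


f1 = make_file()
head1 = [line for _, line in zip(range(3), f1)]
rest1 = f1.readline()  # next unread line after the range-first zip

f2 = make_file()
head2 = [line for line, _ in zip(f2, range(3))]
rest2 = f2.readline()  # next unread line after the file-first zip

print(head1 == head2, repr(rest1), repr(rest2))
```

Both orders give the same first three lines, but only the range-first version leaves the file positioned right after them, which matters if you keep reading from the same file object.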
1 Comment
The argument for readlines is a size hint in bytes, not a number of lines. The hint can be rounded up to an internal buffer size, so asking for 30 bytes can still return 150+ lines.
Doing it with a for loop instead will give you proper results. Unfortunately, there doesn't seem to be a better built-in function for that.
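To see that the argument is not a line count, here is a small sketch under a few assumptions: it uses Python 3 and io.StringIO rather than a real Python 2 file object, so the exact numbers will differ from the question. In Python 3 the hint is followed fairly closely, so a hint of 30 returns only a handful of 10-character lines, whereas the old Python 2 file object could round the hint up to a buffer size and return far more. Either way, the result is not "30 lines":

```python
import io

# 100 lines of exactly 10 characters each ("line 0000\n" and so on).
data = "".join("line %04d\n" % i for i in range(100))

by_hint = io.StringIO(data).readlines(30)      # 30 is a size hint, not a count
by_slice = io.StringIO(data).readlines()[:30]  # read everything, keep first 30

print(len(by_hint), len(by_slice))
```

Slicing the full readlines() result does give exactly 30 lines here, at the cost of reading the whole file first.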
lines = text.readlines(30)?