Message 189488 - Python tracker


Author	Michael.Fox
Recipients	Michael.Fox, nadeem.vawda
Date	2013-05-17.22:27:23
Message-id	<1368829644.15.0.251410244875.issue18003@psf.upfronthosting.co.za>

Content

import lzma

count = 0
f = lzma.LZMAFile('bigfile.xz', 'r')
for line in f:
    count += 1
print(count)
Comparing Python 2 with pyliblzma to Python 3.3.1 with the new lzma module:
m@air:~/q/topaz/parse_datalog$ time python lzmaperf.py
102368
real 0m0.062s
user 0m0.056s
sys 0m0.004s
m@air:~/q/topaz/parse_datalog$ time python3 lzmaperf.py
102368
real 0m7.506s
user 0m7.484s
sys 0m0.012s
Profiling shows most of the time is spent here:
 102371 6.881 0.000 6.972 0.000 lzma.py:247(_read_block)
I also noticed that reading the entire file into memory with f.read() is perfectly fast.
I suspect the slowdown comes from a lack of buffering when iterating over the file line by line.
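
If missing buffering is indeed the cause, one possible workaround is to wrap the LZMAFile in an io.BufferedReader before iterating, so that lines are served from an in-memory buffer. The sketch below is only an illustration under that assumption, not a confirmed fix. The helper names count_lines_buffered and make_sample_xz and the file name sample.xz are hypothetical and not part of the original report.

```python
import io
import lzma
import os
import tempfile


def count_lines_buffered(path):
    """Count lines in an .xz file, iterating through an io.BufferedReader.

    Assumption: line iteration directly over LZMAFile is slow because it
    is unbuffered, so a BufferedReader is placed in front of it.
    """
    count = 0
    with io.BufferedReader(lzma.LZMAFile(path, "r")) as buffered:
        for _line in buffered:
            count += 1
    return count


def make_sample_xz(path, n_lines):
    """Write a small .xz test file (hypothetical helper for this sketch)."""
    with lzma.open(path, "wb") as out:
        for i in range(n_lines):
            out.write(("line %d\n" % i).encode("ascii"))


if __name__ == "__main__":
    tmp_dir = tempfile.mkdtemp()
    sample = os.path.join(tmp_dir, "sample.xz")
    make_sample_xz(sample, 1000)
    print(count_lines_buffered(sample))
```

Whether this closes the gap with pyliblzma would need to be timed against the original bigfile.xz. The sketch only shows where a buffer could be inserted.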

History
Date	User	Action	Args
2013-05-17 22:27:24	Michael.Fox	set	recipients: + Michael.Fox, nadeem.vawda
2013-05-17 22:27:24	Michael.Fox	set	messageid: <1368829644.15.0.251410244875.issue18003@psf.upfronthosting.co.za>
2013-05-17 22:27:24	Michael.Fox	link	issue18003 messages
2013-05-17 22:27:23	Michael.Fox	create
