Message189488
| Author |
Michael.Fox |
| Recipients |
Michael.Fox, nadeem.vawda |
| Date |
2013-05-17.22:27:23 |
| SpamBayes Score |
-1.0 |
| Marked as misclassified |
Yes |
| Message-id |
<1368829644.15.0.251410244875.issue18003@psf.upfronthosting.co.za> |
| In-reply-to |
| Content |
import lzma

# Count the lines in an xz-compressed file by iterating over it.
count = 0
f = lzma.LZMAFile('bigfile.xz', 'r')
for line in f:
    count += 1
print(count)
Comparing Python 2 with pyliblzma to Python 3.3.1 with the new lzma module:
m@air:~/q/topaz/parse_datalog$ time python lzmaperf.py
102368
real 0m0.062s
user 0m0.056s
sys 0m0.004s
m@air:~/q/topaz/parse_datalog$ time python3 lzmaperf.py
102368
real 0m7.506s
user 0m7.484s
sys 0m0.012s
Profiling shows most of the time is spent here:
102371 6.881 0.000 6.972 0.000 lzma.py:247(_read_block)
I also notice that reading the entire file into memory with f.read() is perfectly fast.
I think it has something to do with a lack of buffering when iterating line by line. |
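If the slowdown really is a buffering problem, as suspected above, one way to test that guess is to wrap the LZMAFile in io.BufferedReader and iterate over the wrapper instead. This is only a sketch under that assumption, not a confirmed fix; the helper name count_lines_buffered is made up for illustration.

```python
import io
import lzma


def count_lines_buffered(path):
    """Count lines in an .xz file, iterating through an explicit buffer.

    Assumption: line iteration is slow because each line triggers a
    separate unbuffered read on the LZMAFile. io.BufferedReader reads
    larger chunks from the wrapped object and splits lines itself.
    """
    count = 0
    # Closing the BufferedReader also closes the wrapped LZMAFile.
    with io.BufferedReader(lzma.LZMAFile(path, 'r')) as f:
        for line in f:
            count += 1
    return count
```

Calling count_lines_buffered('bigfile.xz') and timing it against the original loop would show whether buffering accounts for the difference.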
|