Message 228866 - Python tracker


In-reply-to
Author	benhoyt
Recipients	abacabadabacaba, akira, benhoyt, giampaolo.rodola, pitrou, socketpair, tim.golden, vstinner
Date	2014-10-09.12:35:40
Message-id	<1412858141.51.0.983204385418.issue22524@psf.upfronthosting.co.za>

Content

Thanks, Victor and Antoine. I'm somewhat surprised at the 2-3x numbers you're seeing, as I was consistently getting 4-5x in the Linux tests I did. But it does depend quite a bit on what file system you're running, what hardware, whether you're running in a VM, etc. Still, 2-3x faster is a good speedup!
The numbers are significantly better on Windows, as you can see. Even the smallest numbers I've seen with "--scandir os" are in the 12x range on Windows.
In any case, Victor's last tests are "right" -- I presume we'll have *some* C, so what we want to be comparing is "benchmark.py --scandir c" versus "benchmark.py --scandir os": the some-C version versus the all-C version in the attached CPython 3.5 patch.
BTW, Victor, "Generic" isn't really useful. I just used it as a test case that calls listdir() and os.stat() to implement the scandir/DirEntry interface. So it's going to be strictly slower than listdir + stat due to using listdir and creating all those DirEntry objects.
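To make the point about "Generic" concrete, here is a minimal sketch of what a listdir() + os.stat() implementation of the scandir/DirEntry interface might look like. It is not the code from the patch: the names GenericDirEntry and generic_scandir are hypothetical, and the method set is only an assumption about the interface being emulated.

```python
import os
import stat


class GenericDirEntry:
    """Hypothetical pure-Python stand-in for a DirEntry object."""

    def __init__(self, dirpath, name):
        self.name = name
        self.path = os.path.join(dirpath, name)
        self._stat = None    # cached os.stat() result (follows symlinks)
        self._lstat = None   # cached os.lstat() result (does not follow)

    def stat(self, follow_symlinks=True):
        # Every answer costs a real stat call the first time it is needed;
        # unlike a C implementation, nothing comes for free from the
        # directory listing itself.
        if follow_symlinks:
            if self._stat is None:
                self._stat = os.stat(self.path)
            return self._stat
        if self._lstat is None:
            self._lstat = os.lstat(self.path)
        return self._lstat

    def is_dir(self, follow_symlinks=True):
        try:
            return stat.S_ISDIR(self.stat(follow_symlinks).st_mode)
        except OSError:
            return False

    def is_file(self, follow_symlinks=True):
        try:
            return stat.S_ISREG(self.stat(follow_symlinks).st_mode)
        except OSError:
            return False

    def is_symlink(self):
        try:
            return stat.S_ISLNK(self.stat(follow_symlinks=False).st_mode)
        except OSError:
            return False


def generic_scandir(path="."):
    """Yield one GenericDirEntry per name returned by os.listdir()."""
    for name in os.listdir(path):
        yield GenericDirEntry(path, name)
```

A variant like this pays for listdir(), for one stat call per entry that is queried, and for allocating an extra Python object per entry, which is why it cannot beat plain listdir + stat.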
Anyway, where to from here? Are we agreed given the numbers that -- especially on Linux -- it makes good performance sense to use an all-C approach?
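For anyone wanting to sanity-check speedup figures like these on their own machine, the following is a rough timing sketch. It is not benchmark.py from this issue; it assumes a later Python (3.6+) where os.scandir() is available in the standard library, and the helper names make_tree, count_dirs_listdir, count_dirs_scandir and time_call are made up for illustration. Results will vary with file system, hardware and caching, as noted above.

```python
import os
import stat
import tempfile
import time


def make_tree(root, n_files=200, n_dirs=20):
    """Populate root with empty files and subdirectories (illustrative sizes)."""
    for i in range(n_files):
        open(os.path.join(root, "file%d.txt" % i), "w").close()
    for i in range(n_dirs):
        os.mkdir(os.path.join(root, "dir%d" % i))


def count_dirs_listdir(path):
    """Count subdirectories with os.listdir() plus one os.stat() per entry."""
    count = 0
    for name in os.listdir(path):
        if stat.S_ISDIR(os.stat(os.path.join(path, name)).st_mode):
            count += 1
    return count


def count_dirs_scandir(path):
    """Count subdirectories with os.scandir(), which can often avoid stat calls."""
    with os.scandir(path) as entries:
        return sum(1 for entry in entries if entry.is_dir())


def time_call(func, path, repeat=5):
    """Return the best wall-clock time of `repeat` calls to func(path)."""
    best = float("inf")
    for _ in range(repeat):
        start = time.perf_counter()
        func(path)
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        make_tree(root)
        t_listdir = time_call(count_dirs_listdir, root)
        t_scandir = time_call(count_dirs_scandir, root)
        print("listdir + stat: %.6fs" % t_listdir)
        print("scandir:        %.6fs" % t_scandir)
        print("ratio:          %.1fx" % (t_listdir / t_scandir))
```

A single small directory in a temporary location is a much weaker test than walking a large tree, so treat the ratio as indicative only.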

History
Date	User	Action	Args
2014-10-09 12:35:41	benhoyt	set	recipients: + benhoyt, pitrou, vstinner, giampaolo.rodola, tim.golden, abacabadabacaba, akira, socketpair
2014-10-09 12:35:41	benhoyt	set	messageid: <1412858141.51.0.983204385418.issue22524@psf.upfronthosting.co.za>
2014-10-09 12:35:41	benhoyt	link	issue22524 messages
2014-10-09 12:35:40	benhoyt	create