Message175093
| Author |
serhiy.storchaka |
| Recipients |
serhiy.storchaka |
| Date |
2012-11-07.12:38:46 |
| Message-id |
<1352291929.32.0.895096534658.issue16427@psf.upfronthosting.co.za> |
| Content |
In the discussion of issue14621 it was noted that much more complex hash algorithms can outperform the current one because they process more data per iteration. Here is a patch that applies this idea to the current algorithm. The patch also removes duplicated code.
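To illustrate the idea of processing more data at a time, here is a minimal Python sketch. It is not the patch itself: the function names, the 4-byte word size, the little-endian read, and the 32-bit mask are assumptions made for the example, and the real change lives in CPython's C code. The multiplier 1000003 is the one used by the existing string hash. The two functions run the same multiply-xor step, but the second consumes four bytes per iteration, so it needs roughly a quarter of the iterations and generally returns different hash values.

```python
MULT = 1000003          # multiplier used by the existing string hash
MASK = 0xFFFFFFFF       # assumption: emulate 32-bit unsigned arithmetic
WORD = 4                # assumption: bytes consumed per iteration


def hash_bytewise(data: bytes) -> int:
    """Multiply-xor hash that consumes one byte per iteration."""
    x = 0
    for b in data:
        x = ((x * MULT) ^ b) & MASK
    return x ^ len(data)


def hash_wordwise(data: bytes) -> int:
    """Same mixing step, but consuming WORD bytes per iteration."""
    x = 0
    end = len(data) - len(data) % WORD
    # Main loop: one multiply-xor per 4-byte word instead of per byte.
    for i in range(0, end, WORD):
        w = int.from_bytes(data[i:i + WORD], "little")
        x = ((x * MULT) ^ w) & MASK
    # Tail: the remaining 0..3 bytes are mixed in one at a time.
    for b in data[end:]:
        x = ((x * MULT) ^ b) & MASK
    return x ^ len(data)
```

For inputs shorter than one word, only the tail loop runs, so both functions agree. For longer inputs the word-wise variant does fewer multiplications at the cost of producing a different hash.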
Microbenchmarks:
$ ./python -m timeit -n 1 -s "t = b'a' * 10**8" "hash(t)"
$ ./python -m timeit -n 1 -s "t = 'a' * 10**8" "hash(t)"
$ ./python -m timeit -n 1 -s "t = '\u0100' * 10**8" "hash(t)"
$ ./python -m timeit -n 1 -s "t = '\U00010000' * 10**8" "hash(t)"
Results on 32-bit Linux on AMD Athlon 64 X2 4600+:
        original    patched     speedup
bytes   181 msec    45.7 msec   4x
UCS1    429 msec    45.7 msec   9.4x
UCS2    179 msec    92 msec     1.9x
UCS4    183 msec    183 msec    1x
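For anyone who wants to repeat the measurements from a script instead of the command line, a rough sketch using the standard `timeit` module follows. The helper names `time_hash` and `run`, and the smaller default size, are assumptions made for the example. Each timing hashes the object only once (`number=1`) because `str` and `bytes` objects cache their hash after the first call, so repeated calls on the same object would not measure the algorithm.

```python
import timeit

# Same four inputs as the command-line microbenchmarks.
CASES = {
    "bytes": "b'a'",
    "UCS1": "'a'",
    "UCS2": "'\\u0100'",
    "UCS4": "'\\U00010000'",
}


def time_hash(expr: str, size: int = 10**6, repeat: int = 3) -> float:
    """Best-of-`repeat` time in seconds for one hash() of a fresh object."""
    # The setup runs again for every repetition, so each timing sees an
    # object whose hash has not been cached yet.
    setup = f"t = {expr} * {size}"
    return min(timeit.repeat("hash(t)", setup=setup, number=1, repeat=repeat))


def run(size: int = 10**6) -> dict:
    """Time all four cases and return {name: seconds}."""
    return {name: time_hash(expr, size) for name, expr in CASES.items()}


if __name__ == "__main__":
    for name, seconds in run().items():
        print(f"{name:6s} {seconds * 1000:.3f} msec")
```

Passing `size=10**8` matches the sizes used above, at the cost of several hundred megabytes of memory for the UCS4 case.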
If the idea is acceptable, I will create benchmarks for short strings. |