homepage

This issue tracker has been migrated to GitHub , and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

Author mark.dickinson
Recipients ajaksu2, facundobatista, mark.dickinson
Date 2007年09月15日.04:06:39
SpamBayes Score 0.0005784205
Marked as misclassified No
Message-id <1189829202.72.0.980292283083.issue1772851@psf.upfronthosting.co.za>
In-reply-to
Content
Here's a revised patch, that only touches longobject.c and does the minimum 
necessary to alter long.__hash__ from being *almost* completely predictable to 
being completely predictable. To summarize the effects of this patch:
 - the current long.__hash__ is predictable except for a very small number of 
(unpredictable) inputs; this patch makes it predictable for all inputs, thereby 
making it possible to define an efficient Decimal.__hash__. To be precise, the 
patched long.__hash__ has the property that it's periodic of period ULONG_MAX in 
the ranges [1, infinity) and (-infinity, -1].
 - the value of hash(n) is unchanged for any n that's small enough to be 
representable as an int, and also unchanged for the vast majority of long 
integers n of reasonable size. (For integers up to 450 digits long, the new hash 
value differs from the old for around 1 in every 2500 integers on a 32-bit 
system; on a 64-bit system the figure is 1 in every 10000 billion.)
 - On my test system (Dual Xeon 2.8GHz/SuSE Linux 10.2/gcc 4.1), using
timeit.timeit('hash(n)') there's no measurable slowdown for small integers. 
Unfortunately there *is* slowdown for larger integers: around 20% slowdown for 
100 digit integers and around 70% extra time for 1000 digit integers, though in 
all cases the times are in fractions of a microsecond. It doesn't help that gcc 
doesn't optimize the inner loop perfectly: with the -O3 flag, the addition and 
subsequent correction are compiled to (with x in eax and v->ob_digit[i] in edx):
 addl %eax, %edx
 cmpl %eax, %edx
 adcl 0,ドル %edx
and the cmpl here is entirely redundant. Maybe there's some way of writing the C 
code that makes it easier for gcc to pick up on this optimization?
Files
File name Uploaded
long_hash.patch mark.dickinson, 2007年09月15日.04:06:40
History
Date User Action Args
2007年09月15日 04:06:43mark.dickinsonsetspambayes_score: 0.000578421 -> 0.0005784205
recipients: + mark.dickinson, facundobatista, ajaksu2
2007年09月15日 04:06:42mark.dickinsonsetspambayes_score: 0.000578421 -> 0.000578421
messageid: <1189829202.72.0.980292283083.issue1772851@psf.upfronthosting.co.za>
2007年09月15日 04:06:42mark.dickinsonlinkissue1772851 messages
2007年09月15日 04:06:41mark.dickinsoncreate

AltStyle によって変換されたページ (->オリジナル) /