[Python-Dev] Status of the fix for the hash collision vulnerability

Mark Dickinson dickinsm at gmail.com
Fri Jan 13 19:13:08 CET 2012


On Fri, Jan 13, 2012 at 5:43 PM, Guido van Rossum <guido at python.org> wrote:
>> How pathological do you consider the set
>>>>   {1 << n for n in range(2000)}
>>>> to be?  What about the set:
>>>>   ieee754_powers_of_two = {2.0**n for n in range(-1074, 1024)}
>>>> ?  The > 2000 elements of the latter set have only 61 distinct hash
>> values on 64-bit machine, so there will be over 2000 total collisions
>> involved in creating this set (though admittedly only around 30
>> collisions per hash value).
>> Hm... So how does the collision counting work for this case?

Ah, my bad. It looks like the ieee754_powers_of_two is safe---IIUC,
it's the number of collisions involved in a single key-set operation
that's limited. So a dictionary with keys {1<<n for n in range(2000)}
is fine, but a dictionary with keys {1<<(61*n) for n in range(2000)}
is not:
>>> {1<<(n*61):True for n in range(2000)}
Traceback (most recent call last):
 File "<stdin>", line 1, in <module>
 File "<stdin>", line 1, in <dictcomp>
KeyError: 'too many hash collisions'
[67961 refs]
I'd still not consider this particularly pathological, though.
-- 
Mark


More information about the Python-Dev mailing list

AltStyle によって変換されたページ (->オリジナル) /