Planning a Python Course for Beginners

Thu Aug 10 12:58:17 EDT 2017

On Fri, Aug 11, 2017 at 2:41 AM, Steve D'Aprano
<steve+python at pearwood.info> wrote:
> On 2017年8月11日 12:58 am, Chris Angelico wrote:
>>> On Fri, Aug 11, 2017 at 12:45 AM, Steve D'Aprano
>> <steve+python at pearwood.info> wrote:
>>>>> The C code says:
>>>>>>> /* bottom 3 or 4 bits are likely to be 0; rotate y by 4 to avoid
>>>> excessive hash collisions for dicts and sets */
>>>>>> which I think agrees with my comment: using the id() itself would put too
>>> many objects in the same bucket (i.e. too many collisions).
>>>>>>>>>> If that were the problem it wouldn't be solved by the current approach:
>>>>>>>>>>> sample = [object() for _ in range(10)]
>>>>>>> [hash(b) - hash(a) for a, b in zip(sample, sample[1:])]
>>>> [1, 1, 1, 1, 1, 1, 1, 1, 1]
>>>> A difference of 1 in a hash is usually going to mean dropping
>> something into the next bucket. A difference of 4, 8, or 16 would mean
>> that a tiny dictionary (which has 8 slots and thus uses modulo-8)
>> would have everything on the same slot.
>> Um... yes? And how does that relate to the comment given in the source code?
>> "bottom 3 or 4 bits are likely to be 0; rotate y by 4 to avoid excessive hash
> collisions for dicts and sets"
> Are we in agreement so far?

Yes, we're in agreement. It may have been unclear from my quoting
style, but the main point I was disagreeing with was this:
>>>> If that were the problem it wouldn't be solved by the current approach:
>>>>>>>>>>> sample = [object() for _ in range(10)]
>>>>>>> [hash(b) - hash(a) for a, b in zip(sample, sample[1:])]
>>>> [1, 1, 1, 1, 1, 1, 1, 1, 1]

Incrementing hashes by 1 usually will put things into successive
buckets. Incrementing by 8 or 16 will usually put things into the same
bucket.
Sorry for the confusion.
ChrisA