
Context

Say I have a transposition table whose keys are produced by hashing game positions (e.g. with Zobrist hashing).

The table has finite, recycled memory: index(key) = key % p, where p is the table size (a prime).

That is, at table[index(key)] we store both the key and the evaluation of that game position.

So, when fetching the cached evaluation of a game position P that was hashed to key:

  • If key matches the key stored at table[index(key)], we return the evaluation stored there.
  • Otherwise, we return a default value (indicating "empty") and overwrite that slot with the new key and evaluation.

(I've basically summarized how a simple transposition table can be implemented.)
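
For concreteness, here is a minimal C++ sketch of the scheme just described, an always-replace table; all names and sizes are illustrative, not taken from any particular engine:

    #include <cstdint>
    #include <optional>
    #include <vector>

    // Minimal always-replace transposition table, as described above.
    struct Entry {
        uint64_t key  = 0;     // full Zobrist key of the stored position
        int      eval = 0;     // cached evaluation of that position
        bool     used = false; // whether this slot has ever been written
    };

    class TranspositionTable {
        std::vector<Entry> table;
        size_t p; // table size, chosen prime

    public:
        explicit TranspositionTable(size_t prime_size)
            : table(prime_size), p(prime_size) {}

        // Probe: trust the entry only if the full stored key matches.
        std::optional<int> probe(uint64_t key) const {
            const Entry& e = table[key % p]; // index(key) = key % p
            if (e.used && e.key == key)
                return e.eval;
            return std::nullopt; // empty slot, or a different key lives here
        }

        // Store: always replace whatever currently occupies the slot.
        void store(uint64_t key, int eval) {
            table[key % p] = Entry{key, eval, true};
        }
    };

Note that comparing the full stored key catches index collisions (two different keys mapping to the same slot), but not true key collisions, where two different positions hash to the same 64-bit key; that is the error discussed next.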

Collision problem

The problem arises when two different game positions P1 and P2 hash to the same key (a collision).

Say we encounter position P1 first. The first time this key is read, the "empty" value is returned and evaluation1 for P1 is stored in table[index(key)]. The second time this key is read (the stored key matches), evaluation1 for P1 is returned. This will either:

  • Be fine, if we in fact encountered P1 again.

  • Be an error in the game evaluation, if we actually encountered P2.

How can this error be detected and corrected efficiently?

Apparently Stockfish uses Zobrist hashing in this way and verifies entries: "Every entry is carefully validated before use, ...", meaning that "... this (game engine) would work even if every hash table access would return random numbers." But how is this verification implemented?

asked Jun 18, 2021 at 15:29
  • You don't correct Zobrist hash collisions. In this paper, Robert Hyatt (creator of the Crafty engine) ran a test where he forced the transposition table to have collisions. He and Anthony Cozzie (creator of Zappa) concluded that the cost of a collision here and there will not significantly change the evaluation of the position, and that 64-bit keys are enough. Also, you cannot detect Zobrist hash collisions: it is impossible to go from a Zobrist hash back to a chess position, and storing the whole position in the transposition table takes too much memory. Commented Mar 14, 2023 at 20:51

1 Answer


I don't have any particular experience with transposition tables, but in general the standard way to use a hash table is to store the key (e.g., the value P1 or P2) in the entry, not the hash of the key. This way you can detect collisions.

Terminology: in standard hash table usage, the key is the value (i.e., the position, such as P1 or P2), not its hash.

There are many standard ways to resolve collisions in hash tables, e.g., separate chaining or open addressing; see https://en.wikipedia.org/wiki/Hash_table. However, I suspect those are not needed for transposition tables, as it is probably fine to return "unknown" in the event of a collision. If so, you might not need any of those methods; a sketch of the key-storing idea is below.
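
As a hedged sketch of that approach, assuming some compact, canonical encoding of the position exists; PackedPosition, its 32-byte size, and VerifiedTable are hypothetical names, not from any real engine:

    #include <array>
    #include <cstdint>
    #include <optional>
    #include <vector>

    // Hypothetical compact encoding of a full game position. 32 bytes is an
    // assumed size; a real engine would define its own canonical packing.
    using PackedPosition = std::array<uint8_t, 32>;

    struct Entry {
        PackedPosition pos{};   // the key, in hash-table terms: the position itself
        int            eval = 0;
        bool           used = false;
    };

    class VerifiedTable {
        std::vector<Entry> table;
        size_t p; // table size, prime

    public:
        explicit VerifiedTable(size_t prime_size)
            : table(prime_size), p(prime_size) {}

        // The Zobrist key is used only to pick the slot; the stored position
        // is compared exactly, so a hash collision can never return a wrong
        // evaluation: it just looks like a miss.
        std::optional<int> probe(uint64_t zobrist, const PackedPosition& pos) const {
            const Entry& e = table[zobrist % p];
            if (e.used && e.pos == pos)
                return e.eval;
            return std::nullopt;
        }

        void store(uint64_t zobrist, const PackedPosition& pos, int eval) {
            table[zobrist % p] = Entry{pos, eval, true};
        }
    };

The trade-off is the one raised in the comment under the question: storing the position itself costs extra bytes per entry, which is why engines typically store only the hash and accept a tiny error rate.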

answered Jun 18, 2021 at 19:34
$\endgroup$
3
  • I'm not concerned with that type of collision (my hash function that computes the index is a simple modulo p, with prime table size), i.e. I'm not concerned with the transposition-table hashing. Instead, note that people combine transposition tables with additional (e.g. Zobrist) hashing of keys to save space (smaller keys). Then they usually try to minimize collisions of the hashed keys in various ways, since collisions will inevitably happen. Instead of minimizing collisions, I'm wondering if there's a way to efficiently detect and correct them, maybe using properties of Zobrist or some other hash. Commented Jun 18, 2021 at 20:01
  • You'd often store the hashed value to have a very fast check that you found the right value, but for chess positions you could have a hash value that is actually a subset of the position. Commented Jul 14, 2022 at 7:13
  • For chess positions, you could create 64- or 96-bit hash values. With 96 bits, you'd have to look up about 2^48 positions on average until you get a wrong result, and even then it is most likely that one incorrect lookup doesn't affect the final result. Having a bigger hash table, saving time, and calculating further may very well outweigh the risk of very rare non-optimal results. Commented Jul 14, 2022 at 7:21
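
For reference, a rough justification of the 2^48 figure above via the birthday bound (a back-of-envelope estimate, not a claim about any specific engine): among positions hashed to b-bit values, the first colliding pair is expected after roughly

$$ n \approx \sqrt{\tfrac{\pi}{2}\, 2^{b}} \approx 2^{b/2} $$

positions, so b = 96 gives about 2^48 positions before the first collision, and b = 64 about 2^32.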
