I am reading the implementation details of Java 8 HashMap, can anyone let me know why Java HashMap initial array size is 16 specifically? What is so special about 16? And why is it the power of two always? Thanks
3 Answers 3
The reason why powers of 2 appear everywhere is because when expressing numbers in binary (as they are in circuits), certain math operations on powers of 2 are simpler and faster to perform (just think about how easy math with powers of 10 are with the decimal system we use). For example, multication is not a very efficient process in computers - circuits use a method similar to the one you use when multiplying two numbers each with multiple digits. Multiplying or dividing by a power of 2 requires the computer to just move bits to the left for multiplying or the right for dividing.
And as for why 16 for HashMap? 10 is a commonly used default for dynamically growing structures (arbitrarily chosen), and 16 is not far off - but is a power of 2.
You can do modulus very efficiently for a power of 2. n % d = n & (d-1)
when d is a power of 2, and modulus is used to determine which index an item maps to in the internal array - which means it occurs very often in a Java HashMap. Modulus requires division, which is also much less efficient than using the bitwise and
operator. You can convince yourself of this by reading a book on Digital Logic.
The reason why bitwise and
works this way for powers of two is because every power of 2 is expressed as a single bit set to 1. Let's say that bit is t. When you subtract 1 from a power of 2, you set every bit below t to 1, and every bit above t (as well as t) to 0. Bitwise and
therefore saves the values of all bits below position t from the number n (as expressed above), and sets the rest to 0.
But how does that help us? Remember that when dividing by a power of 10, you can count the number of zeroes following the 1, and take that number of digits starting from the least significant of the dividend in order to find the remainder. Example: 637989 % 1000 = 989. A similar property applies to binary numbers with only one bit set to 1, and the rest set to 0. Example: 100101 % 001000 = 000101
-
Thank you so much for the excellent explanation. Other than that, does it have to do anything with negative hash code? What happens if the hash code of my hash function returns a negative value? Does it have a significance or it just does not matter?Imran– Imran2017年03月22日 02:27:00 +00:00Commented Mar 22, 2017 at 2:27
-
1You'll still get the modulo of the two numbers. If you're just using Java's modulo operator, you can get a negative number with a negative dividend, which would cause an
ArrayOutOfBoundsIndexException
, but if you correct for this you'll get the same result as thebitwise and
approach.Anish Goyal– Anish Goyal2017年03月22日 02:58:07 +00:00Commented Mar 22, 2017 at 2:58
There's one more thing about choosing the hash & (n - 1)
versus modulo
and that is negative hashes. hashcode
is of type int, which of course can be negative. modulo on a negative number (in Java) is negative also, while &
is not.
Another reason is that you want all of the slots in the array to be equally likely to be used. Since hash() is evenly distributed over 32 bits, if the array size didn't divide into the hash space, then there would be a remainder causing lower indexes to have a slightly higher chance of being used. Ideally, not just the hash, but (hash() % array_size) is random and evenly distributed.
But this only really matters for data with a small hash range (like a byte or character).
Explore related questions
See similar questions with these tags.
16
? a very sexy number