I do not understand why val
was introduced in addition to value
– it isn't even a copy of the contents.
On the other hand h
does make sense once you consider thread safety. Assume we have two threads which calculate the hash of the string at nearly the same time – to trigger this, have both threads sleep for a moment when they enter the if
branch. At the beginning, both threads will have hash == 0
, so they will calculate the hash anew. With the original implementation, only a local variable is mutated, and then atomicaly committed to the hash
variable. With your code, both threads mutate the same hash
directly and will thus compute an invalid result. Your solution would be OK iff all accesses to the hash
were synchronized (by locking the whole string object for each such method which is silly when using immutable strings).
As this is Code Review, I will also suggest to return early in both implementations, instead of putting the whole calculation into an if
branch. It is also nicer to use a foreach loop than to use indices.
public int hashCode() {
int h = hash; // copy for thread safety
if (h != 0 || value.length == 0) return h;
for (char c : value) {
h = 31 * h + c;
}
hash = h; // commit the change
return h;
}
On the other hand, a foreach-loop might not be as optimized as looping over the indices, and an early return could also impact what code is generated.
Edit: The value.length > 0
test in the original avoids committing the old value when no change of the hash is required (for the empty string). This can be relevant in cache-sensitive applications (remember that writing to shared memory can invalidate caches for other processors as well – we want to avoid this if possible. The condition also provides the guarantee that the loop will execute at least once if that branch is taken. Code ought to be generated in a way that this prior check does not reduce performance – on the contrary.
- 12.7k
- 37
- 67