Skip to main content
Code Review

Return to Answer

added 554 characters in body
Source Link
amon
  • 12.7k
  • 37
  • 67

I do not understand why val was introduced in addition to value – it isn't even a copy of the contents.

On the other hand h does make sense once you consider thread safety. Assume we have two threads which calculate the hash of the string at nearly the same time – to trigger this, have both threads sleep for a moment when they enter the if branch. At the beginning, both threads will have hash == 0, so they will calculate the hash anew. With the original implementation, only a local variable is mutated, and then atomicaly committed to the hash variable. With your code, both threads mutate the same hash directly and will thus compute an invalid result. Your solution would be OK iff all accesses to the hash were synchronized (by locking the whole string object for each such method which is silly when using immutable strings).

As this is Code Review, I will also suggest to return early in both implementations, instead of putting the whole calculation into an if branch. It is also nicer to use a foreach loop than to use indices.

public int hashCode() {
 int h = hash; // copy for thread safety
 if (h != 0 || value.length == 0) return h;
 for (char c : value) {
 h = 31 * h + c;
 }
 hash = h; // commit the change
 return h;
}

On the other hand, a foreach-loop might not be as optimized as looping over the indices, and an early return could also impact what code is generated.


Edit: The value.length > 0 test in the original avoids committing the old value when no change of the hash is required (for the empty string). This can be relevant in cache-sensitive applications (remember that writing to shared memory can invalidate caches for other processors as well – we want to avoid this if possible. The condition also provides the guarantee that the loop will execute at least once if that branch is taken. Code ought to be generated in a way that this prior check does not reduce performance – on the contrary.

I do not understand why val was introduced in addition to value – it isn't even a copy of the contents.

On the other hand h does make sense once you consider thread safety. Assume we have two threads which calculate the hash of the string at nearly the same time – to trigger this, have both threads sleep for a moment when they enter the if branch. At the beginning, both threads will have hash == 0, so they will calculate the hash anew. With the original implementation, only a local variable is mutated, and then atomicaly committed to the hash variable. With your code, both threads mutate the same hash directly and will thus compute an invalid result. Your solution would be OK iff all accesses to the hash were synchronized (by locking the whole string object for each such method which is silly when using immutable strings).

As this is Code Review, I will also suggest to return early in both implementations, instead of putting the whole calculation into an if branch. It is also nicer to use a foreach loop than to use indices.

public int hashCode() {
 int h = hash; // copy for thread safety
 if (h != 0 || value.length == 0) return h;
 for (char c : value) {
 h = 31 * h + c;
 }
 hash = h; // commit the change
 return h;
}

On the other hand, a foreach-loop might not be as optimized as looping over the indices, and an early return could also impact what code is generated.

I do not understand why val was introduced in addition to value – it isn't even a copy of the contents.

On the other hand h does make sense once you consider thread safety. Assume we have two threads which calculate the hash of the string at nearly the same time – to trigger this, have both threads sleep for a moment when they enter the if branch. At the beginning, both threads will have hash == 0, so they will calculate the hash anew. With the original implementation, only a local variable is mutated, and then atomicaly committed to the hash variable. With your code, both threads mutate the same hash directly and will thus compute an invalid result. Your solution would be OK iff all accesses to the hash were synchronized (by locking the whole string object for each such method which is silly when using immutable strings).

As this is Code Review, I will also suggest to return early in both implementations, instead of putting the whole calculation into an if branch. It is also nicer to use a foreach loop than to use indices.

public int hashCode() {
 int h = hash; // copy for thread safety
 if (h != 0 || value.length == 0) return h;
 for (char c : value) {
 h = 31 * h + c;
 }
 hash = h; // commit the change
 return h;
}

On the other hand, a foreach-loop might not be as optimized as looping over the indices, and an early return could also impact what code is generated.


Edit: The value.length > 0 test in the original avoids committing the old value when no change of the hash is required (for the empty string). This can be relevant in cache-sensitive applications (remember that writing to shared memory can invalidate caches for other processors as well – we want to avoid this if possible. The condition also provides the guarantee that the loop will execute at least once if that branch is taken. Code ought to be generated in a way that this prior check does not reduce performance – on the contrary.

Source Link
amon
  • 12.7k
  • 37
  • 67

I do not understand why val was introduced in addition to value – it isn't even a copy of the contents.

On the other hand h does make sense once you consider thread safety. Assume we have two threads which calculate the hash of the string at nearly the same time – to trigger this, have both threads sleep for a moment when they enter the if branch. At the beginning, both threads will have hash == 0, so they will calculate the hash anew. With the original implementation, only a local variable is mutated, and then atomicaly committed to the hash variable. With your code, both threads mutate the same hash directly and will thus compute an invalid result. Your solution would be OK iff all accesses to the hash were synchronized (by locking the whole string object for each such method which is silly when using immutable strings).

As this is Code Review, I will also suggest to return early in both implementations, instead of putting the whole calculation into an if branch. It is also nicer to use a foreach loop than to use indices.

public int hashCode() {
 int h = hash; // copy for thread safety
 if (h != 0 || value.length == 0) return h;
 for (char c : value) {
 h = 31 * h + c;
 }
 hash = h; // commit the change
 return h;
}

On the other hand, a foreach-loop might not be as optimized as looping over the indices, and an early return could also impact what code is generated.

lang-java

AltStyle によって変換されたページ (->オリジナル) /