_Jv_HashCode
Per Bothner
per@bothner.com
Wed May 10 16:44:00 GMT 2000
Bryce McKinlay <bryce@albatross.co.nz> writes:
> Don't the HashTable and HashSet classes depend on a fairly good
> distribution of hashCode() values? I think performance for these classes
> is suffering when Objects that don't override hashCode() are used as
> keys.
>
> To accommodate copying collectors, we may eventually want to use the
> current sync_info field to store a hashcode and perhaps other GC
> bookkeeping information. However, doing a right shift of the object
> address seems like a good solution for now.
Two points:
(1) Doing a right shift of the object address does *not* give you a
"fairly good distribution of hashCode() values"! To do that, you need
to do some more serious shuffling of the bits. Certainly, shifting by
2 doesn't help much, given that no objects are only 4 bytes long. Assuming
a minimum Object size of 16, you would have to shift by 4 for it to
help much. But that loses 4 bits of information in the hash value.
To get them back, you could put them back in the high-order bits, for
example. One suggestion:
(jint) OBJ ^ ((jint) OBJ >> 12) // on a 16- or 32-bit machine
(jint) OBJ ^ (jint) ((jlong) OBJ >> 32) // on a 64-bit machine
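The two expressions above can be sketched as standalone functions. This is only an illustration under stated assumptions: the names hash_addr32 and hash_addr64 are hypothetical and not part of libgcj, the address is passed in as a plain integer, and unsigned types are used so that the right shift is well defined (the expressions above shift a signed jint instead). A caller would cast the result to jint.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only; not libgcj code.

// 32-bit case: XOR the address with a copy of itself shifted right by 12,
// so the higher address bits also influence the low bits of the hash.
constexpr std::uint32_t hash_addr32(std::uint32_t addr) {
    return addr ^ (addr >> 12);
}

// 64-bit case: fold the upper 32 bits of the address into the lower 32 bits.
constexpr std::uint32_t hash_addr64(std::uint64_t addr) {
    return static_cast<std::uint32_t>(addr) ^
           static_cast<std::uint32_t>(addr >> 32);
}
```

Nothing is discarded before the XOR, so the low-order bits of the address still contribute to the result, unlike with a plain right shift.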
(2) Putting random objects that don't override hashCode into a
hash table is usually not something that makes sense ...
In any case, before changing System.identityHashCode I do think we
need to decide on what gdb should print for an object reference. The
way gdb works is that it must be able to print out a reference *without*
invoking a method in the inferior process. (After all, the inferior
might be a core dump or the run-time may be frozen.) There may be (and
should be) other printout modes, including one that prints the
result of invoking toString, and another mode that prints out the
fields of the Object (ideally in a form used by a GUI inspector).
However, there needs to be one format that just prints the object's
address, and that should probably be the default format (as that is
most consistent with how gdb works in general).
Note the address-printing format needs to actually print the
address, not a hashCode based on the address. For one thing, the
output should uniquely identify the object. For another, gdb
users expect the address. Otherwise you would get much less use
from the `x' command.
If System.identityHashCode no longer returns the object's address,
then gdb's default syntax should not be `CLASSNAME@ADDRESS', since
that would confusingly suggest that ADDRESS is the identityHashCode.
We *have* to change the syntax. Probably the most logical would
be `(CLASSNAME) 0xADDRESS'. That looks like a Java cast, and
matches the C++ syntax (except for the `*' operator).
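As a rough sketch of what the proposed `(CLASSNAME) 0xADDRESS' output would look like, the helper below builds such a string from a class name and an address. It is not gdb code: format_reference is a hypothetical name, and the class name and address are assumed to have already been read from the inferior's memory without calling any method in it.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical helper for illustration only; not part of gdb.
// Builds "(CLASSNAME) 0xADDRESS" from a class name and a raw address.
std::string format_reference(const std::string& class_name,
                             std::uintptr_t address) {
    std::ostringstream out;
    out << '(' << class_name << ") 0x" << std::hex << address;
    return out.str();
}
```

For instance, format_reference("java.lang.Object", 0x8049010) yields "(java.lang.Object) 0x8049010", and the address part can be used directly with the `x' command.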
--
--Per Bothner
per@bothner.com http://www.bothner.com/~per/