To give a simple example, consider a Place
class:
public class Place {
//fields
private String name;
private String state;
private int population;
private int squareMileage;
private int elevation;
//constructors
public Place() {
}
public Place(String name, String state) {
this.name = name;
this.state = state;
}
public Place(String name, String state, int population, int squareMileage,
int elevation) {
this.name = name;
this.state = state;
this.population = population;
this.squareMileage = squareMileage;
this.elevation = elevation;
}
//getters and setters
public String getName() {
return this.name;
}
public String getState() {
return this.state;
}
//... (other getters and setters omitted)
//do stuff
}
And a Places
class (a HashMap
of Place
objects):
import java.util.*;
public class Places {
private Map searchablePlaces;
public Places() {
searchablePlaces = new HashMap();
}
public void add(Place value) {
Place key = new Place(value.getName(), value.getState());
searchablePlaces.put(key, value);
}
public Place find(Place key) {
return searchablePlaces.get(key);
}
//override hashCode, equals
//do stuff
}
Essentially, my question is:
- Would there be any efficiency gained in searching the
HashMap
forkey
, as opposed to searching forvalue
directly in, let's say, a sortedArrayList
? - What if
key
was of typeString
equal toname + state
(or of typeString[2]
)?
5 Answers 5
It's confusing, IMHO. I would define a PlaceKey class, holding just a name and a state and defining hashCode and equals. A place would hold a PlaceKey, and other attributes. And I would use a Map<PlaceKey, Place>
.
Note that, in your example, it's the Place class that needs equals and hashCode, not the Places class.
Now, to answer your 2 questions :
- a hash map is O(1), and a list is O(n). A map will thus usually be faster (except perhaps for very small n values)
- concatenating is almost always a bad idea. You could have name1 = aa, state1 = a, name2 = a and state2 = aa, which would result in a collision. An array doesn't override equals and hashCode in terms of their content, and is not an appropriate map key class.
3 Comments
Would there be any efficiency gained in searching the HashMap for key, as opposed to searching for value directly in, let's say, a sorted ArrayList?
Yes. Comparing strings is way slower than comparing integers (and additionally rarely comparing strings).
If you use custom object for keys, you should implement hashCode
carefully with the most used possible values in mind. You should also have a streamlined equals
as well.
The benefit of the custom class against the name + state
would be that you can take into account the possible values.
6 Comments
keySet
, then you've completely lost the benefits of having a hash table.hashCode
. - This can happen due to a carelessly written custom key object (for example basing the hashCode on the first few letters of the state), in which case there are going to be lots of identical codes.value
is type Place
, not int
. It's HashMap<Place, Place>
. So it would come down to the speed of comparing String
objects versus the speed of comparing Place
objects. Thanks for your time and sorry for the multiple edits (I'm new).hashCode
is to minimize the call to equals
. When you use HashMap
and search for a Place
then thae API will get the hashCode of your key, then can do a binary search between the hash codes of the elements (no comparsion of strings here), then only if there are multiple items with the same hash code will it use equals
. - So it boils down to "comparing integers (and additionally rarely comparing strings)" as I wronte in my answer.Accessing objects through an ArrayList is much faster than doing it by a key of a HashMap, but the pointers to that objects have to be stored on memory in a continuous block, that makes slower to remove and put object in the ArrayList. So deciding which type is better will depend on the operations you will do during the execution of your program.
For example, if you need to put and/or remove the places lots of times it will be better to use the HashMap as the ArrayList is less eficient. If your places are more or less fixed and you have to move them rarely then ArrayList will give you more performance.
If you definetly use the HashMap it will be faster to use an unique string as a key.
Using another (not fully filled) Place object as key is at least waste of memory. Usually you search by a unique identifier. There are several possibilities to get one:
- Use (name+SEPARATOR+state) as key, if you want to search using both of them.
- Use several maps, one for each (searchable) property - if it isn't unique you'll connect a List to it.
- Use a (cryptogrphic) hash of the object as key ...
Comments
It depends upon how you search the
HashMap
. If, for example, you usecontainsKey()
then the map will be an order of magnitude faster than searching through the list. In the typical case, checking for a key in a hash-table happens in constant time (O(1)
). Searching a sorted array for the key can be done inO(log n)
time if you use a binary search. However, if you search for the key by iterating the keyset (which is not guaranteed to be sorted in any order) then you have anO(n)
operation, which is slower than searching through a sorted list.The short version? Using
containsKey()
will give the best performance.You could do that, but then why use a
Map
? Why not aList
or aSet
? The point of aMap
is to associate a key with a value. If you aren't doing that (and the example code isn't, really), then you are using the wrong data structure for your task.
1 Comment
containsKey
for efficiency. Regarding #2, I'm a bit confused. I like @JB Nizet's suggestion of defining a PlaceKey
class as the key type, but (while maybe more inefficient), it still seems reasonable to use a String
(containing only the relevant search terms) as the key to a Map
. Would you mind clarifying? Would that not be more efficient than searching a List
or a Set
?
add
andfind
methods. You store<String, String>
inadd
but try to fetchPlace
as though it is<String, Place>
infind
.Place
object, then added thatPlace
and the one passed as an arg to theadd
method, so it should be<Place, Place>
.