I am trying to remove a key from a hash table, which is built using linear probing. Removing a key works fine but I know I need to rehash a portion of the table after the removal but my method down below is not quite efficient. It basically rehashes the whole table that is unnecessary. Is there anyway that I could improve the performance while rehashing.
bool remove(int x, int H[], int m)
{
int hashVal = hash(x, m);
for (int i = 0; H[(hashVal + i) % m] != EMPTY; i++)
{
if (H[(hashVal + i) % m] == x)
{
H[(hashVal + i) % m] = EMPTY;
//Rehash the table
//I think the problem is here. This for loop condition can be better
for (int k = 1; H[(hashVal + i + k) % m] != EMPTY ; k++)
{
int keyShiftUp = H[(hashVal + i + k) % m];
H[(hashVal + i + k) % m] = EMPTY;
insertLP(keyShiftUp, H, m);
cout << "keyShiftUp: " << keyShiftUp << ends;
}
return true;
}
}
return false;
}
-
\$\begingroup\$ Welcome to Code Review! Good job on your first question. \$\endgroup\$SirPython– SirPython2015年12月06日 02:37:39 +00:00Commented Dec 6, 2015 at 2:37
1 Answer 1
C++, not C
First things first, I note that remove
is a free function. But that's not how we would do something like this in C++. We would prefer to have a type HashTable
in which we'd write:
class HashTable {
public:
...
bool remove(int key);
...
private:
int hash(int );
std::vector<int> table; // vector is better than int*
// table is a better name than H
};
Your naming also could use some work - what is x
and what is m
? I'm guessing m
is the capacity and x
is the key that we're erasing, but where are the values? We just have int H[]
, so is this more like a hash set than a hash table? It's unclear.
Looping and separation of concerns
The hash function should just do the hashing (and thus shouldn't take in the capacity). The loop itself should loop over slots, not indices - this will localize your modulo operation so it doesn't creep in everywhere:
int next(int i) {
return (i+1) % table.size();
}
bool remove(int key) {
int i = hash(key) % table.size();
if (table[i] == EMPTY) { // break early
return false;
}
for (; table[i] != EMPTY; i = next(i)) {
if (table[i] == key) {
table[i] = EMPTY;
for (int k = next(i); table[k] != EMPTY; k = next(k)) {
...
}
return true;
}
}
return false;
}
Once you reorient looping over additive indices to looping over actual indices, the inner body becomes more straightforward too:
int old_key = table[k];
table[k] = EMPTY;
insert(old_key);
Find the slot
Your i
loop is all about finding the slot where the key
is at. The body of that loop isn't really a loop, it's a find
-type algorithm. So let's rewrite it like one:
bool remove(int key) {
int i = hash(key) % table.size();
for (; table[i] != key; i = next(i)) {
if (table[i] == EMPTY) { // break early
return false;
}
}
// now, table[i] == key
table[i] = EMPTY;
for (int k = next(i); table[k] != EMPTY; k = next(k)) {
int old_key = table[k];
table[k] = EMPTY;
insert(old_key);
}
return true;
}
EMPTY
In addition to not really being a hash table, you're seemingly limiting the possible valid keys to exclude something like 0
or -1
or whatever the value of EMPTY
is. May be worth separating the key from the empty concept.