6
\$\begingroup\$

I am trying to remove a key from a hash table, which is built using linear probing. Removing a key works fine but I know I need to rehash a portion of the table after the removal but my method down below is not quite efficient. It basically rehashes the whole table that is unnecessary. Is there anyway that I could improve the performance while rehashing.

bool remove(int x, int H[], int m)
{
 int hashVal = hash(x, m);
 for (int i = 0; H[(hashVal + i) % m] != EMPTY; i++)
 {
 if (H[(hashVal + i) % m] == x)
 {
 H[(hashVal + i) % m] = EMPTY;
 //Rehash the table
 //I think the problem is here. This for loop condition can be better
 for (int k = 1; H[(hashVal + i + k) % m] != EMPTY ; k++)
 {
 int keyShiftUp = H[(hashVal + i + k) % m];
 H[(hashVal + i + k) % m] = EMPTY;
 insertLP(keyShiftUp, H, m);
 cout << "keyShiftUp: " << keyShiftUp << ends;
 }
 return true;
 }
 }
return false;
}
200_success
146k22 gold badges190 silver badges479 bronze badges
asked Dec 6, 2015 at 2:24
\$\endgroup\$
1
  • \$\begingroup\$ Welcome to Code Review! Good job on your first question. \$\endgroup\$ Commented Dec 6, 2015 at 2:37

1 Answer 1

3
\$\begingroup\$

C++, not C

First things first, I note that remove is a free function. But that's not how we would do something like this in C++. We would prefer to have a type HashTable in which we'd write:

class HashTable {
public:
 ...
 bool remove(int key);
 ...
private:
 int hash(int );
 std::vector<int> table; // vector is better than int*
 // table is a better name than H
};

Your naming also could use some work - what is x and what is m? I'm guessing m is the capacity and x is the key that we're erasing, but where are the values? We just have int H[], so is this more like a hash set than a hash table? It's unclear.

Looping and separation of concerns

The hash function should just do the hashing (and thus shouldn't take in the capacity). The loop itself should loop over slots, not indices - this will localize your modulo operation so it doesn't creep in everywhere:

int next(int i) {
 return (i+1) % table.size();
}
bool remove(int key) {
 int i = hash(key) % table.size(); 
 if (table[i] == EMPTY) { // break early
 return false;
 }
 for (; table[i] != EMPTY; i = next(i)) {
 if (table[i] == key) {
 table[i] = EMPTY;
 for (int k = next(i); table[k] != EMPTY; k = next(k)) {
 ...
 }
 return true;
 }
 }
 return false;
}

Once you reorient looping over additive indices to looping over actual indices, the inner body becomes more straightforward too:

int old_key = table[k];
table[k] = EMPTY;
insert(old_key);

Find the slot

Your i loop is all about finding the slot where the key is at. The body of that loop isn't really a loop, it's a find-type algorithm. So let's rewrite it like one:

bool remove(int key) {
 int i = hash(key) % table.size(); 
 for (; table[i] != key; i = next(i)) {
 if (table[i] == EMPTY) { // break early
 return false;
 }
 }
 // now, table[i] == key
 table[i] = EMPTY;
 for (int k = next(i); table[k] != EMPTY; k = next(k)) {
 int old_key = table[k];
 table[k] = EMPTY;
 insert(old_key);
 }
 return true;
}

EMPTY

In addition to not really being a hash table, you're seemingly limiting the possible valid keys to exclude something like 0 or -1 or whatever the value of EMPTY is. May be worth separating the key from the empty concept.

answered Dec 6, 2015 at 16:58
\$\endgroup\$

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.