Return to Answer

replaced http://stackoverflow.com/ with https://stackoverflow.com/

Source Link

edited May 23, 2017 at 11:33

Community Bot

edited May 23, 2017 at 11:33

Community Bot

If you're aiming for portability, then the first thing to do is change the use of those compiler-specific types to the standard sized integer types of <cstdint>, as it was correctly suggested by @tkausl. Also note that in C++ those types are members of namespace std, so the correct portable usage would be, for instance: std::uint32_t. It is incidental that those types happen to be visible at the global scope due to the fact that compilers usually "recycle" the C header files for C++, so we can imagine a hypothetical C++-only compiler where those types would only be visible inside the std namespace.
Make sure to include the standard headers between < >, e.g.: #include <iostream>. By using double-quotes, you force the compiler to look first in the current working directory, which is a waste of compile-time.
Forgive my honesty, but those global variables are awful. I can understand the table having a couple stats counters, but then make those class members. The first three, initial_size, growth_threshold and growth_factor are clearly constants, so you should definitely make them static const, and also members of the class to avoid polluting the global scope.
Overall, your hash-table is implemented in a very C-ish way. Even though it is a template structure, I doubt it will function properly for non Plain Old Data - POD types, since you are memcopying and memcomping stuff. If that's really what you want, then I'd suggest adding a static assertion to make sure this class is not used accidentally on some non-POD type, yielding undefined behavior yielding undefined behavior. You're using clang, so just enable C++11 with -std=c++11:
```
 // Somewhere inside the declaration of jshash:
 static_assert(std::is_pod<value_t>::value, "This container requires a POD type!");
```
The way find is implemented is also unusual. It doesn't return the requested value, but rather, sets and internal pointer with the result. That's counter-intuitive. Why not return the pointer if found and nullptr if not found? Or keep the boolean return and add an extra output parameter to the function, if you prefer.
Now I'm no hash-table guru, but I have implemented a hash-table before, and I didn't do that shifting by two on the result of my hash function:

If you're aiming for portability, then the first thing to do is change the use of those compiler-specific types to the standard sized integer types of <cstdint>, as it was correctly suggested by @tkausl. Also note that in C++ those types are members of namespace std, so the correct portable usage would be, for instance: std::uint32_t. It is incidental that those types happen to be visible at the global scope due to the fact that compilers usually "recycle" the C header files for C++, so we can imagine a hypothetical C++-only compiler where those types would only be visible inside the std namespace.
Make sure to include the standard headers between < >, e.g.: #include <iostream>. By using double-quotes, you force the compiler to look first in the current working directory, which is a waste of compile-time.
Forgive my honesty, but those global variables are awful. I can understand the table having a couple stats counters, but then make those class members. The first three, initial_size, growth_threshold and growth_factor are clearly constants, so you should definitely make them static const, and also members of the class to avoid polluting the global scope.
Overall, your hash-table is implemented in a very C-ish way. Even though it is a template structure, I doubt it will function properly for non Plain Old Data - POD types, since you are memcopying and memcomping stuff. If that's really what you want, then I'd suggest adding a static assertion to make sure this class is not used accidentally on some non-POD type, yielding undefined behavior. You're using clang, so just enable C++11 with -std=c++11:
```
 // Somewhere inside the declaration of jshash:
 static_assert(std::is_pod<value_t>::value, "This container requires a POD type!");
```
The way find is implemented is also unusual. It doesn't return the requested value, but rather, sets and internal pointer with the result. That's counter-intuitive. Why not return the pointer if found and nullptr if not found? Or keep the boolean return and add an extra output parameter to the function, if you prefer.
Now I'm no hash-table guru, but I have implemented a hash-table before, and I didn't do that shifting by two on the result of my hash function:

If you're aiming for portability, then the first thing to do is change the use of those compiler-specific types to the standard sized integer types of <cstdint>, as it was correctly suggested by @tkausl. Also note that in C++ those types are members of namespace std, so the correct portable usage would be, for instance: std::uint32_t. It is incidental that those types happen to be visible at the global scope due to the fact that compilers usually "recycle" the C header files for C++, so we can imagine a hypothetical C++-only compiler where those types would only be visible inside the std namespace.
Make sure to include the standard headers between < >, e.g.: #include <iostream>. By using double-quotes, you force the compiler to look first in the current working directory, which is a waste of compile-time.
Forgive my honesty, but those global variables are awful. I can understand the table having a couple stats counters, but then make those class members. The first three, initial_size, growth_threshold and growth_factor are clearly constants, so you should definitely make them static const, and also members of the class to avoid polluting the global scope.
Overall, your hash-table is implemented in a very C-ish way. Even though it is a template structure, I doubt it will function properly for non Plain Old Data - POD types, since you are memcopying and memcomping stuff. If that's really what you want, then I'd suggest adding a static assertion to make sure this class is not used accidentally on some non-POD type, yielding undefined behavior. You're using clang, so just enable C++11 with -std=c++11:
```
 // Somewhere inside the declaration of jshash:
 static_assert(std::is_pod<value_t>::value, "This container requires a POD type!");
```
The way find is implemented is also unusual. It doesn't return the requested value, but rather, sets and internal pointer with the result. That's counter-intuitive. Why not return the pointer if found and nullptr if not found? Or keep the boolean return and add an extra output parameter to the function, if you prefer.
Now I'm no hash-table guru, but I have implemented a hash-table before, and I didn't do that shifting by two on the result of my hash function:

replaced http://codereview.stackexchange.com/ with https://codereview.stackexchange.com/

Source Link

edited Apr 13, 2017 at 12:40

Community Bot

edited Apr 13, 2017 at 12:40

Community Bot

If you're aiming for portability, then the first thing to do is change the use of those compiler-specific types to the standard sized integer types of <cstdint>, as it was correctly suggested by @tkausl tkausl. Also note that in C++ those types are members of namespace std, so the correct portable usage would be, for instance: std::uint32_t. It is incidental that those types happen to be visible at the global scope due to the fact that compilers usually "recycle" the C header files for C++, so we can imagine a hypothetical C++-only compiler where those types would only be visible inside the std namespace.
Make sure to include the standard headers between < >, e.g.: #include <iostream>. By using double-quotes, you force the compiler to look first in the current working directory, which is a waste of compile-time.
Forgive my honesty, but those global variables are awful. I can understand the table having a couple stats counters, but then make those class members. The first three, initial_size, growth_threshold and growth_factor are clearly constants, so you should definitely make them static const, and also members of the class to avoid polluting the global scope.
Overall, your hash-table is implemented in a very C-ish way. Even though it is a template structure, I doubt it will function properly for non Plain Old Data - POD types, since you are memcopying and memcomping stuff. If that's really what you want, then I'd suggest adding a static assertion to make sure this class is not used accidentally on some non-POD type, yielding undefined behavior. You're using clang, so just enable C++11 with -std=c++11:
```
 // Somewhere inside the declaration of jshash:
 static_assert(std::is_pod<value_t>::value, "This container requires a POD type!");
```
The way find is implemented is also unusual. It doesn't return the requested value, but rather, sets and internal pointer with the result. That's counter-intuitive. Why not return the pointer if found and nullptr if not found? Or keep the boolean return and add an extra output parameter to the function, if you prefer.
Now I'm no hash-table guru, but I have implemented a hash-table before I have implemented a hash-table before, and I didn't do that shifting by two on the result of my hash function:

If you're aiming for portability, then the first thing to do is change the use of those compiler-specific types to the standard sized integer types of <cstdint>, as it was correctly suggested by @tkausl. Also note that in C++ those types are members of namespace std, so the correct portable usage would be, for instance: std::uint32_t. It is incidental that those types happen to be visible at the global scope due to the fact that compilers usually "recycle" the C header files for C++, so we can imagine a hypothetical C++-only compiler where those types would only be visible inside the std namespace.
Make sure to include the standard headers between < >, e.g.: #include <iostream>. By using double-quotes, you force the compiler to look first in the current working directory, which is a waste of compile-time.
Forgive my honesty, but those global variables are awful. I can understand the table having a couple stats counters, but then make those class members. The first three, initial_size, growth_threshold and growth_factor are clearly constants, so you should definitely make them static const, and also members of the class to avoid polluting the global scope.
Overall, your hash-table is implemented in a very C-ish way. Even though it is a template structure, I doubt it will function properly for non Plain Old Data - POD types, since you are memcopying and memcomping stuff. If that's really what you want, then I'd suggest adding a static assertion to make sure this class is not used accidentally on some non-POD type, yielding undefined behavior. You're using clang, so just enable C++11 with -std=c++11:
```
 // Somewhere inside the declaration of jshash:
 static_assert(std::is_pod<value_t>::value, "This container requires a POD type!");
```
The way find is implemented is also unusual. It doesn't return the requested value, but rather, sets and internal pointer with the result. That's counter-intuitive. Why not return the pointer if found and nullptr if not found? Or keep the boolean return and add an extra output parameter to the function, if you prefer.
Now I'm no hash-table guru, but I have implemented a hash-table before, and I didn't do that shifting by two on the result of my hash function:

If you're aiming for portability, then the first thing to do is change the use of those compiler-specific types to the standard sized integer types of <cstdint>, as it was correctly suggested by @tkausl. Also note that in C++ those types are members of namespace std, so the correct portable usage would be, for instance: std::uint32_t. It is incidental that those types happen to be visible at the global scope due to the fact that compilers usually "recycle" the C header files for C++, so we can imagine a hypothetical C++-only compiler where those types would only be visible inside the std namespace.
Make sure to include the standard headers between < >, e.g.: #include <iostream>. By using double-quotes, you force the compiler to look first in the current working directory, which is a waste of compile-time.
Forgive my honesty, but those global variables are awful. I can understand the table having a couple stats counters, but then make those class members. The first three, initial_size, growth_threshold and growth_factor are clearly constants, so you should definitely make them static const, and also members of the class to avoid polluting the global scope.
Overall, your hash-table is implemented in a very C-ish way. Even though it is a template structure, I doubt it will function properly for non Plain Old Data - POD types, since you are memcopying and memcomping stuff. If that's really what you want, then I'd suggest adding a static assertion to make sure this class is not used accidentally on some non-POD type, yielding undefined behavior. You're using clang, so just enable C++11 with -std=c++11:
```
 // Somewhere inside the declaration of jshash:
 static_assert(std::is_pod<value_t>::value, "This container requires a POD type!");
```
The way find is implemented is also unusual. It doesn't return the requested value, but rather, sets and internal pointer with the result. That's counter-intuitive. Why not return the pointer if found and nullptr if not found? Or keep the boolean return and add an extra output parameter to the function, if you prefer.
Now I'm no hash-table guru, but I have implemented a hash-table before, and I didn't do that shifting by two on the result of my hash function:

Minor fixes on typos and grammar.

Source Link

edited Jul 19, 2015 at 23:41

glampert

edited Jul 19, 2015 at 23:41

glampert

17.3k
4
31
89

If you're aiming for portability, then the first thing to do is change the use of those compiler-specific types to the standard sized integer types of <cstdint>, as it was correctly suggested by @tkausl. Also note that in C++ those types are members of namespace std, so the correct portable usage would be, for instance: std::uint32_t. It is incidental that those types happen to be visible at the global scope due to the fact that compilers usually "recycle" the C header files for C++, so we can imagine a hypothetical C++-only compiler where those types would only be visible inside the std namespace.
Make sure to include the standard headers between < >, e.g.: #include <iostream>. By using double-quotes, you force the compiler to look first in the current working directory, which is a waste of compile-time.
Forgive my honesty, but those global variables are awful. I can understand the table having a couple stats counters, but then make those class members. The first three, initial_size, growth_threshold and growth_factor are clearly constants, so you should definitely make them static const, and also members of the class to avoid polluting the global scope.
Overall, your hash-table is implemented in a very C-ish way. Even though it is a template classstructure, I doubt it will function properly for non Plain Old Data - POD types, since you are memcopying and memcomping stuff. If that's really what you want, then I'd suggest adding a static assertion to make sure this class is not used accidentally on some non-POD type, yielding undefined behavior. You're using clang, so just enable C++11 with -std=c++11:
```
 // Somewhere inside the declaration of jshash:
 static_assert(std::is_pod<value_t>::value, "This container requires a POD type!");
```
The way find is implemented is also unusual. It doesn't return the requested value, but rather, sets and internal pointer with the result. That's counter-intuitive. Why not return the pointer if found and nullptr if not found? Or keep the boolean return and add an extra output parameter to the function, if you prefer.
Now I'm no hash-table guru, but I have implemented a hash-table before, and I didn't do that shifting by two on the result of my hash function:

What exactly is the reason behind that? The modulo operator will properly wrap the hash to the size of the table, so why bring down the value first, possiblepossibly wasting valuable hash bits in the process?

The entry->keynode indirection seems useless, only forcing you to allocate memory twice at every new insertion, probableprobably also making your CPU cache usage way less efficient than it could be, and also adding more memory overhead per block.
The way mem_count is used seems suspicious. It starts as 0, but then you increment it whenever something is deallocated. By its name, I'd think it is counting the number of allocated bytes, but rather it seems to keep track of the number of deallocated bytes. If you keep it that way, a better name would be in order, if not, then it would make more sense for it to hold the number of bytes currently allocated.
I see that you are assigning zero to pointers. This is old-style C++. C++11 has been around for some time now and it's not even the most recent standard anymore, so you can start using nullptr instead.
You're missing the rule of three. Your classstruct can be copied by the implicit copy-constructor and assignment operatorsoperator, resulting on shallow copies of the same object that share the same pointers. That would result in several destructors trying to free the same memory. Simplest possible fix is to disallow copy and assignment by either making the copy-constructor/operator private and unimplemented or by deleting them both.

If you're aiming for portability, then the first thing to do is change the use of those compiler-specific types to the standard sized integer types of <cstdint>, as it was correctly suggested by @tkausl. Also note that in C++ those types are members of namespace std, so the correct portable usage would be, for instance: std::uint32_t. It is incidental that those types happen to be visible at the global scope due to the fact that compilers usually "recycle" the C header files for C++, so we can imagine a hypothetical C++-only compiler where those types would only be visible inside the std namespace.
Make sure to include the standard headers between < >, e.g.: #include <iostream>. By using double-quotes, you force the compiler to look first in the current working directory, which is a waste of compile-time.
Forgive my honesty, but those global variables are awful. I can understand the table having a couple stats counters, but then make those class members. The first three, initial_size, growth_threshold and growth_factor are clearly constants, so you should definitely make them static const, and also members of the class to avoid polluting the global scope.
Overall, your hash-table is implemented in a very C-ish way. Even though it is a template class, I doubt it will function properly for non Plain Old Data - POD types, since you are memcopying and memcomping stuff. If that's really what you want, then I'd suggest adding a static assertion to make sure this class is not used accidentally on some non-POD type, yielding undefined behavior. You're using clang, so just enable C++11 with -std=c++11:
```
 // Somewhere inside the declaration of jshash:
 static_assert(std::is_pod<value_t>::value, "This container requires a POD type!");
```
The way find is implemented is also unusual. It doesn't return the requested value, but rather, sets and internal pointer with the result. That's counter-intuitive. Why not return the pointer if found and nullptr if not found? Or keep the boolean return and add an extra output parameter to the function, if you prefer.
Now I'm no hash-table guru, but I have implemented a hash-table before, and I didn't do that shifting by two on the result of my hash function:

What exactly is the reason behind that? The modulo operator will properly wrap the hash to the size of the table, so why bring down the value first, possible wasting valuable hash bits in the process?

The entry->keynode indirection seems useless, only forcing you to allocate memory twice at every new insertion, probable making your CPU cache usage way less efficient than it could be, and also adding more memory overhead per block.
The way mem_count is used seems suspicious. It starts as 0, but then you increment it whenever something is deallocated. By its name, I'd think it is counting the number of allocated bytes, but rather it seems to keep track of the number of deallocated bytes. If you keep it that way, a better name would be in order, if not, then it would make more sense for it to hold the number of bytes currently allocated.
I see that you are assigning zero to pointers. This is old-style C++. C++11 has been around for some time now and it's not even the most recent standard anymore, so you can start using nullptr instead.
You're missing the rule of three. Your class can be copied by the implicit copy and assignment operators, resulting on shallow copies of the same object that share the same pointers. That would result in several destructors trying to free the same memory. Simplest possible fix is to disallow copy and assignment by either making the copy-constructor/operator private and unimplemented or by deleting them.

If you're aiming for portability, then the first thing to do is change the use of those compiler-specific types to the standard sized integer types of <cstdint>, as it was correctly suggested by @tkausl. Also note that in C++ those types are members of namespace std, so the correct portable usage would be, for instance: std::uint32_t. It is incidental that those types happen to be visible at the global scope due to the fact that compilers usually "recycle" the C header files for C++, so we can imagine a hypothetical C++-only compiler where those types would only be visible inside the std namespace.
Make sure to include the standard headers between < >, e.g.: #include <iostream>. By using double-quotes, you force the compiler to look first in the current working directory, which is a waste of compile-time.
Forgive my honesty, but those global variables are awful. I can understand the table having a couple stats counters, but then make those class members. The first three, initial_size, growth_threshold and growth_factor are clearly constants, so you should definitely make them static const, and also members of the class to avoid polluting the global scope.
Overall, your hash-table is implemented in a very C-ish way. Even though it is a template structure, I doubt it will function properly for non Plain Old Data - POD types, since you are memcopying and memcomping stuff. If that's really what you want, then I'd suggest adding a static assertion to make sure this class is not used accidentally on some non-POD type, yielding undefined behavior. You're using clang, so just enable C++11 with -std=c++11:
```
 // Somewhere inside the declaration of jshash:
 static_assert(std::is_pod<value_t>::value, "This container requires a POD type!");
```
The way find is implemented is also unusual. It doesn't return the requested value, but rather, sets and internal pointer with the result. That's counter-intuitive. Why not return the pointer if found and nullptr if not found? Or keep the boolean return and add an extra output parameter to the function, if you prefer.
Now I'm no hash-table guru, but I have implemented a hash-table before, and I didn't do that shifting by two on the result of my hash function:

What exactly is the reason behind that? The modulo operator will properly wrap the hash to the size of the table, so why bring down the value first, possibly wasting valuable hash bits in the process?

The entry->keynode indirection seems useless, only forcing you to allocate memory twice at every new insertion, probably also making your CPU cache usage way less efficient than it could be, and adding more memory overhead per block.
The way mem_count is used seems suspicious. It starts as 0, but then you increment it whenever something is deallocated. By its name, I'd think it is counting the number of allocated bytes, but rather it seems to keep track of the number of deallocated bytes. If you keep it that way, a better name would be in order, if not, then it would make more sense for it to hold the number of bytes currently allocated.
I see that you are assigning zero to pointers. This is old-style C++. C++11 has been around for some time now and it's not even the most recent standard anymore, so you can start using nullptr instead.
You're missing the rule of three. Your struct can be copied by the implicit copy-constructor and assignment operator, resulting on shallow copies of the same object that share the same pointers. That would result in several destructors trying to free the same memory. Simplest possible fix is to disallow copy and assignment by either making the copy-constructor/operator private and unimplemented or by deleting them both.

Added an extra point.

Source Link

edited Jul 19, 2015 at 20:16

glampert

edited Jul 19, 2015 at 20:16

glampert

17.3k
4
31
89

Source Link

answered Jul 19, 2015 at 19:46

glampert

answered Jul 19, 2015 at 19:46

glampert

17.3k
4
31
89

lang-cpp