Does this data structure make sense? It's essentially a vector of pointers, intended to point to larger objects. In effect, it gives the benefit of a linked list, with an \$O(1)\$ lookup at the cost of maintaining an array of pointers to each element.
What edge cases am I missing, and what's the formal name of this data structure? (What's the standard data structure used in such situations?)
#include<iostream>
#include<vector>
using namespace std;
template <class T>
class pointerArray
{
vector<void*> addresses;
public:
pointerArray(int initialSize=100)
{
addresses.resize(initialSize);
}
~pointerArray()
{
for(unsigned int i=0;i<addresses.size();i++)
{
if(addresses[i]!=0)
{
delete (T*)addresses[i];
}
}
}
T& operator[](unsigned int pos)
{
if(pos >= addresses.capacity())
{
addresses.resize(pos*2);
}
if(addresses[pos] == 0)
{
T* ptr = new T;
addresses[pos] = ptr;
}
return *(T*)(addresses[pos]);
}
};
int main()
{
pointerArray<string> pa(100);
for(int i=0;i<10000;i++)
{
pa[i]="Hello World This Is a Very Long String";
}
for(int i=0;i<10000;i++)
{
cout<<pa[i]<<endl;
}
}
2 Answers 2
I don't think that this data structure has any special name(vector of pointers sounds reasonable).
Now about the code itself:
I do not see any reason to use
void*
pointers here and cast them explicitly every time you access an element. You can just usevector<T*> addresses
and get rid of all casts. You can also use smart pointers(unique_ptr
looks like a good choice here). Here is my version of this class:template<class T> class PointerArray { public: PointerArray(size_t initialSize = 100): array(initialSize) {} T& operator[](size_t pos) { if (pos >= array.size()) { size_t newSize = std::max(pos + 1, 2 * pos); array.resize(newSize); } if (!array[pos]) { array[pos] = std::unique_ptr<T>(new T()); // or std::make_unique in C++14 } return *array[pos]; } private: std::vector<std::unique_ptr<T>> array; };
There no need for a custom destructor anymore(
unique_ptr
handles it). Note that I have added this line:int newSize = std::max(pos + 1, 2 * pos);
. Why is it necessary? We need it when theinitialSize
is0
and so ispos
.In this line:
if(pos >= addresses.capacity())
, you should be comparingpos
to the vector's size, not capacity(because it is an undefined behavior to access elements of the vector beyond size).More general notes:
using namespace std;
is a bad practice. There is no need to pollute global namespace.Binary operators are usually surrounded with whitespaces. It is also conventional to have an empty line between different functions to improve readability.
And the last one: do have good reasons to use this data structure? In my opinion, as long as a simple
std::vector<T>
works fine, there is no need to invent any new data structures.
-
\$\begingroup\$ Whats the name of the syntax you used in the default constructor? This was more of a thought than anything else, such a data structure gets rid of the requirement to have a large contiguous memory block when dealing with larger objects in memory \$\endgroup\$Akash– Akash2015年04月04日 12:10:43 +00:00Commented Apr 4, 2015 at 12:10
-
\$\begingroup\$ @Akash It is called member initializer list: en.cppreference.com/w/cpp/language/initializer_list \$\endgroup\$kraskevich– kraskevich2015年04月04日 13:29:15 +00:00Commented Apr 4, 2015 at 13:29
In addition to what kraskevich said (especially about smart pointers), I would also implement a proper copy constructor and assignment operator (or explicitly delete them).
Your current version would produce a shallow copy (and as a result UB when deleting referenced objects a second time). Kraskevich's solution doesn't have a copy constructor at all, as std::unique_ptr
cannot be copied. While this is better than having a faulty copy constructor, you should try to stay more in line with the behavior of the standard library containers.
Also missing is a size()
function:
#include <iostream>
#include <vector>
#include <memory>
#include <string>
#include <algorithm>
template <class T>
class pointerArray
{
std::vector<std::unique_ptr<T>> addresses; //<-- Use unique_ptr -> no delete function needed
public:
pointerArray(size_t initialSize = 100) : //<-- use size_t
addresses(initialSize)
{}
pointerArray(const pointerArray& other) { //<-- implement deep copy
addresses.resize(other.addresses.size());
for (size_t i = 0; i < addresses.size(); ++i) {
if (other.addresses[i] != nullptr) {
addresses[i] = std::unique_ptr<T>(new T(*other.addresses[i]));
}
}
}
pointerArray(pointerArray&& other) = default; //<-- default move constructor does exactly what it is supposed to do
pointerArray& operator=(pointerArray other) { //<-- could be made minimally more efficent by implementing separate copy and move assigment operators
addresses=std::move(other.addresses );
return *this;
}
size_t size() { return addresses.size(); } //<-- handy when passed as a parameter
T& operator[](size_t pos) {
if (pos >= addresses.size()) { //<-- compare against size, not capacity
addresses.resize(pos+1);
}
if (addresses[pos] == nullptr) { //<-- compare against nullptr
addresses[pos] = std::unique_ptr<T>(new T());
}
return *addresses[pos];
}
};
-
\$\begingroup\$ naive question, but what happens if T does not implement a copy constructor and pointerArray<T> is created in this case? \$\endgroup\$Akash– Akash2015年04月07日 16:43:08 +00:00Commented Apr 7, 2015 at 16:43
-
\$\begingroup\$ Also, I'm not so sure the size function makes a lot of sense.. the data structure is addressable at any arbitrary index, and size() will return the size of the underlying vector, which is set to 2x the max pos requested, and should probably not be exposed since its internal behaviour... \$\endgroup\$Akash– Akash2015年04月07日 16:46:07 +00:00Commented Apr 7, 2015 at 16:46
-
\$\begingroup\$ Which brings to mind a new question, if someone accesses a[100000], and not enough memory is available to resize size, how should I fail gracefully ? \$\endgroup\$Akash– Akash2015年04月07日 16:47:26 +00:00Commented Apr 7, 2015 at 16:47
-
\$\begingroup\$ @Akash: As long as you don't try to copy
pointerArray
, nothing will happen. If you do you will - depending on your compiler - get a very long and ugly compiletime error message (as always, when templates are involved). \$\endgroup\$MikeMB– MikeMB2015年04月07日 17:09:15 +00:00Commented Apr 7, 2015 at 17:09 -
\$\begingroup\$ There are a lot of situations, where you need to know, how many elements are inside a container. But you are right with the
2x
growth. I corrected that in the answer. Its not necessary anyway, as the usedstd::vector
usually already implements a quadratic growth algorithm. \$\endgroup\$MikeMB– MikeMB2015年04月07日 17:11:43 +00:00Commented Apr 7, 2015 at 17:11
std::vector<T*>
or (better)std::vector<std::unique_ptr<T>>
? Indeed, isn't it exactly the same thing as these standard library features? And shouldn't one prefer standard library implementations of common data structures over reinventing the wheel (unless the standard implementation has unacceptable performance, likestd::map
)? \$\endgroup\$