I rolled my own vector-like container with improved insert performance and had it reviewed. Here is the updated version.
#ifndef GUARD_HEADER_custom_2_h
#define GUARD_HEADER_custom_2_h
#include <vector>
namespace custom_2
{
template <class T> class container
{
public:
class iterator
{
public:
typedef int difference_type;
typedef T value_type;
typedef T& reference;
typedef T* pointer;
typedef std::forward_iterator_tag iterator_category;
iterator(const std::vector<int>::const_iterator& it, const std::vector<T>& el)
: m_Elements(*const_cast<std::vector<T>*>(&el)), m_Iterator(it)
{}
T& operator *() const
{
return m_Elements[*m_Iterator];
}
T* operator->() const
{
return &m_Elements[*m_Iterator];
}
iterator& operator ++()
{
++m_Iterator;
return *this;
}
iterator& operator --()
{
--m_Iterator;
return *this;
}
iterator operator + (int n) const
{
return iterator(m_Iterator + n, m_Elements);
}
iterator operator - (int n) const
{
return iterator(m_Iterator - n, m_Elements);
}
int operator - (const iterator& it) const
{
return static_cast<int>(m_Iterator - it.m_Iterator);
}
bool operator == (const iterator& it) const
{
return m_Iterator == it.m_Iterator;
}
bool operator != (const iterator& it) const
{
return m_Iterator != it.m_Iterator;
}
private:
std::vector<T>& m_Elements;
std::vector<int>::const_iterator m_Iterator;
};
container()
{
m_Elements.reserve(30);
m_Indices.reserve(30);
m_Size = 0;
}
void reserve(int n)
{
m_Elements.reserve(n);
m_Indices.reserve(n);
}
T& operator[](int n)
{
return m_Elements[m_Indices[n]];
}
T& at(int n)
{
return m_Elements[m_Indices[n]];
}
void erase(int n)
{
int deletedIndex = m_Indices.at(n);
m_Indices.erase(m_Indices.begin() + n);
m_Elements.at(deletedIndex).~T();
m_Indices.push_back(deletedIndex);
--m_Size;
}
void insert(int n, const T& value)
{
if (m_Size == m_Elements.size())
{
m_Indices.insert(m_Indices.begin() + n, m_Elements.size());
m_Elements.push_back(value);
++m_Size;
return;
}
int indexToInsert = m_Indices.back();
m_Indices.pop_back();
m_Indices.insert(m_Indices.begin() + n, indexToInsert);
new(&m_Elements.at(indexToInsert))T(value);
++m_Size;
}
void insert(const iterator& n, const T& value)
{
int index = n - begin();
insert(index, value);
}
void push_back(const T& value)
{
if (m_Size == m_Elements.size())
{
m_Indices.push_back(m_Elements.size());
m_Elements.push_back(value);
++m_Size;
return;
}
int index = m_Indices.at(m_Size);
new(&m_Elements.at(index))T(value);
++m_Size;
}
void pop_back()
{
--m_Size;
m_Elements[m_Indices[m_Size]].~T();
}
iterator begin() const
{
return iterator(m_Indices.begin(), m_Elements);
}
iterator end() const
{
return iterator(m_Indices.begin() + m_Size, m_Elements);
}
int size() const
{
return m_Size;
}
void shrink()
{
std::vector<T> elements;
elements.reserve(m_Size);
for (int i = 0; m_Size > i; ++i)
{
elements.push_back(m_Elements[m_Indices[i]]);
}
m_Indices.clear();
for (int i = 0; m_Size > i; ++i)
{
m_Indices.push_back(i);
}
m_Elements.clear();
m_Elements.swap(elements);
}
T& front()
{
return at(0);
}
T& back()
{
return at(m_Size - 1);
}
private:
std::vector<T> m_Elements;
std::vector<int> m_Indices;
int m_Size;
};
}
#endif//GUARD_HEADER_custom_2_h
The user Frank came up with the idea of implementing my container on top of an STL container, so that's what I did. The main problem is a large performance difference between systems: on a Ryzen 5 1600X CPU the performance of both implementations is close to equal, while on an i7-6600U the std::vector<T>-based implementation is much slower.
Here is the link to the original implementation: Vector-like custom container with improved insert performance
This container cannot serve as a vector replacement because it does not store items contiguously, and there seems to be no rationale behind the claimed performance improvements. Moreover, the given implementation is flawed: it will call the destructor on T elements twice if at least some of them were erased. This happens because inside erase you manually call the element destructor, and when the m_Elements vector is destroyed (or its clear is called), the destructor runs again. Also, the use of int as a size type causes signed/unsigned mixing everywhere.
-
It can; the shrink function restores the order of memory and indices. That way building the vector is faster, and if you call shrink, it's completely contiguous. The item destruction is a good point, and the signed/unsigned mix is not an issue, since this is only a prototype test implementation. – Mango, Nov 7, 2017 at 8:48
-
The improvement in performance comes from the lack of memory reordering during insert and erase operations, which directly affects performance. If you insert into a regular vector, every element on the "right" side of your insert index is shifted one element size to the right. This does not happen with this implementation until you call shrink. On smaller data types (int, double, etc.) the gain is minimal; on larger objects, on the other hand, it's enormous. If you don't believe it, try it. If you don't want to try it, don't make claims about the performance. – Mango, Nov 7, 2017 at 8:59
-
@Mango If you try to add a data method to obtain a pointer to the buffer (just like a normal vector), then you'll have to call shrink inside it, and it will turn out to be very ugly. In the const-qualified variant of data you'll have to use const_cast (or mark the field as mutable, which is even worse). The shrink implementation is rather slow and potentially throwing, as it must perform a mass reallocation and iterate over two vectors (and it suffers from the same double-destructor problem). – user7860670, Nov 7, 2017 at 8:59
-
Vector is indeed not the fastest when it comes to inserting in the middle, but you cannot claim "performance improvements" just because you've improved only one or two of vector's methods while potentially slowing the others considerably. Also note that your container suffers from extra memory overhead. – user7860670, Nov 7, 2017 at 9:02
-
I tested around a lot: push_back performance suffers close to nothing, while insert and erase gain a lot. Yes, the shrink call before a data call might get ugly, but you could also call it before you pass the container anywhere. And yes, indeed, the memory overhead is much higher; you won't get good performance and low memory at once. But if one needs contiguous memory and fast insert/erase/push_back performance, there is no good choice among the STL containers, which has been bugging me for a long time now. I never said it's a wonder container that handles everything perfectly, but it's an approach to a common problem. – Mango, Nov 7, 2017 at 9:09
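For reference, the shifting cost debated in this comment thread is the documented behavior of std::vector::insert; a minimal, self-contained illustration (the helper name insert_front is made up for this sketch):

```cpp
#include <vector>

// std::vector::insert moves every element at or after the insertion point
// one slot toward the back before constructing the new value, so a front
// insertion into an n-element vector costs O(n) element moves. This is the
// per-insert cost the custom container defers until shrink() is called.
std::vector<int> insert_front(std::vector<int> v, int value)
{
    v.insert(v.begin(), value);
    return v;
}
```

Whether deferring that shift pays off depends on how expensive the element type is to move, which matches the observation above that the gain is minimal for int or double and large for big objects.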
It's an interesting idea, though you should not make the mistake of thinking this is a generalized improvement to vector. On the contrary, it is much more narrowly applicable, since it slows down the common-case usage of vector for most people in favor of the rare scenario with many insertions and removals from the middle or beginning. Nevertheless, it might be useful for some exotic cases.
However, it's fundamentally flawed. You can't erase elements from the middle like that: leave them in the data array, destroy them at the time of removal, and then destroy them again when the container is destroyed. To avoid the linear-time removal of the data, you'll have to mark erased slots in some way or keep another data structure, such as another vector of ints, like vector<int> m_FreeIndices;
If you do it this way, then in your destructor (and you need a manual destructor) you can sort the free indices and then, in linear time, skip over destroying the elements at the sorted freed indices, which have already been destroyed. Shrinking can clear the free-index array.
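That destructor bookkeeping can be sketched as follows (names are hypothetical, not from the reviewed container; this assumes the element slots are raw storage destroyed manually, since a plain std::vector<T> would run every destructor again on its own):

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <vector>

// Destroy only the live slots: sort the free-index list once, then walk the
// slots and the sorted free indices in lockstep, skipping every slot that
// erase() already destroyed. Costs O(k log k) for the sort (k = number of
// freed slots) plus O(n) for the walk over all n slots.
void destroy_live_slots(std::vector<int>& freeIndices, int slotCount,
                        const std::function<void(int)>& destroySlot)
{
    std::sort(freeIndices.begin(), freeIndices.end());
    std::size_t next = 0; // cursor into the sorted free list
    for (int i = 0; i < slotCount; ++i)
    {
        if (next < freeIndices.size() && freeIndices[next] == i)
        {
            ++next;        // slot i was erased earlier; do not destroy twice
            continue;
        }
        destroySlot(i);    // e.g. invoke ~T() on the element in slot i
    }
}
```

In the container's destructor this would run with slotCount equal to the number of element slots, and shrink() can then clear the free-index list as described.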
As for whether or not you use vector to implement it, I actually don't care, as long as you unit-test this well and make sure it works in all possible cases, including edge cases; that it properly supports non-POD types of T, invoking their constructors and destructors as needed; and that it is exception-safe.
You might be able to make it much simpler if you reduce its applicability to POD types, at which point you can go back to using things like memcpy and realloc for the implementation. However, if you do that, my request is that you add a static assertion to make sure that T is, indeed, a plain-old-data type with trivial constructors and destructors, and emit compile errors if the collection is used with anything else.
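Since C++11 that check no longer needs compiler-specific machinery; the standard <type_traits> header expresses it portably. A minimal sketch (pod_container is a hypothetical name, not the reviewed class):

```cpp
#include <type_traits>

// Hypothetical skeleton of a POD-only container: the static_assert turns
// any instantiation with a non-trivial element type into a compile error,
// which is what makes a memcpy/realloc-based implementation safe to use.
template <class T>
class pod_container
{
    static_assert(std::is_trivially_copyable<T>::value
                      && std::is_trivially_destructible<T>::value,
                  "pod_container<T> requires a trivial T: memcpy/realloc "
                  "would bypass constructors and destructors");
public:
    // memcpy/realloc-based storage would go here
};

pod_container<int> ok; // compiles: int is trivial
// pod_container<std::string> rejected; // would fail the static_assert
```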
-
Thanks for the suggestion; I am aware that this is not a replacement for all purposes. Meanwhile, I went back to my older (a bit more complicated) vector implementation, since the double-vector implementation sadly underperforms compared to my original version while still having the same flaws and more. I am using this mostly for small gaming applications. – Mango, Dec 1, 2017 at 17:58
-
In my opinion, if you are doing this as an optimization for specific cases, I would actually go back to your old one and exploit the fact that you can just move things around as raw bits and bytes with memcpy and especially realloc, which vector can't exploit due to its design with std::allocator. However, that will make it unable to store C++ classes with non-trivial constructors and destructors. But hey, it's an optimization detail, something I wouldn't try to generalize to work in every possible use case. – user90268, Dec 1, 2017 at 18:00
-
... just try to make it so it doesn't silently accept the use cases it can't handle. If you use things like memcpy and realloc, my number one request and suggestion is to add static assertions to make sure that T is, indeed, a POD type (like a C-style struct), so that it's a build error to try to create, say, a container<std::string>. You can look up the docs for your compiler to do that; I've seen ways to detect this on MSVC and GCC at least. – user90268, Dec 1, 2017 at 18:03
- You may laugh, but I actually tried that too... In the end I went back to the double-pointer-based solution, since it suited my needs better. I fixed the double-destructor issue and basically have no problems with the older implementation at all, so it's still in use ;) – Mango, Dec 1, 2017 at 18:03
-
The "memcpy" container was the first attempt at optimizing a dynamic array for this special case. – Mango, Dec 1, 2017 at 18:05