I rolled my own vector-like container with improved insert performance and had it reviewed. Here is the updated version.
#ifndef GUARD_HEADER_custom_2_h
#define GUARD_HEADER_custom_2_h
#include <vector>
namespace custom_2
{
template <class T> class container
{
public:
class iterator
{
public:
typedef int difference_type;
typedef T value_type;
typedef T& reference;
typedef T* pointer;
typedef std::forward_iterator_tag iterator_category;
iterator(const std::vector<int>::const_iterator& it, const std::vector<T>& el)
: m_Elements(*const_cast<std::vector<T>*>(&el)), m_Iterator(it)
{}
T& operator *() const
{
return m_Elements[*m_Iterator];
}
T* operator->() const
{
return &m_Elements[*m_Iterator];
}
iterator& operator ++()
{
++m_Iterator;
return *this;
}
iterator& operator --()
{
--m_Iterator;
return *this;
}
iterator operator + (int n) const
{
return iterator(m_Iterator + n, m_Elements);
}
iterator operator - (int n) const
{
return iterator(m_Iterator - n, m_Elements);
}
int operator - (const iterator& it) const
{
return static_cast<int>(m_Iterator - it.m_Iterator);
}
bool operator == (const iterator& it) const
{
return m_Iterator == it.m_Iterator;
}
bool operator != (const iterator& it) const
{
return m_Iterator != it.m_Iterator;
}
private:
std::vector<T>& m_Elements;
std::vector<int>::const_iterator m_Iterator;
};
container()
{
m_Elements.reserve(30);
m_Indices.reserve(30);
m_Size = 0;
}
void reserve(int n)
{
m_Elements.reserve(n);
m_Indices.reserve(n);
}
T& operator[](int n)
{
return m_Elements[m_Indices[n]];
}
T& at(int n)
{
return m_Elements[m_Indices[n]];
}
void erase(int n)
{
int deletedIndex = m_Indices.at(n);
m_Indices.erase(m_Indices.begin() + n);
m_Elements.at(deletedIndex).~T();
m_Indices.push_back(deletedIndex);
--m_Size;
}
void insert(int n, const T& value)
{
if (m_Size == m_Elements.size())
{
m_Indices.insert(m_Indices.begin() + n, m_Elements.size());
m_Elements.push_back(value);
++m_Size;
return;
}
int indexToInsert = m_Indices.back();
m_Indices.pop_back();
m_Indices.insert(m_Indices.begin() + n, indexToInsert);
new(&m_Elements.at(indexToInsert))T(value);
++m_Size;
}
void insert(const iterator& n, const T& value)
{
int index = n - begin();
insert(index, value);
}
void push_back(const T& value)
{
if (m_Size == m_Elements.size())
{
m_Indices.push_back(m_Elements.size());
m_Elements.push_back(value);
++m_Size;
return;
}
int index = m_Indices.at(m_Size);
new(&m_Elements.at(index))T(value);
++m_Size;
}
void pop_back()
{
--m_Size;
m_Elements[m_Indices[m_Size]].~T();
}
iterator begin() const
{
return iterator(m_Indices.begin(), m_Elements);
}
iterator end() const
{
return iterator(m_Indices.begin() + m_Size, m_Elements);
}
int size() const
{
return m_Size;
}
void shrink()
{
std::vector<T> elements;
elements.reserve(m_Size);
for (int i = 0; m_Size > i; ++i)
{
elements.push_back(m_Elements[m_Indices[i]]);
}
m_Indices.clear();
for (int i = 0; m_Size > i; ++i)
{
m_Indices.push_back(i);
}
m_Elements.clear();
m_Elements.swap(elements);
}
T& front()
{
return at(0);
}
T& back()
{
return at(m_Size - 1);
}
private:
std::vector<T> m_Elements;
std::vector<int> m_Indices;
int m_Size;
};
}
#endif//GUARD_HEADER_custom_2_h
The user Frank came up with the idea of implementing my container on top of an STL container, so that's what I did. The main problem is a large performance difference between systems: on a Ryzen 5 1600X CPU the performance of both implementations is close to equal, while on an i7-6600U the std::vector<T>-based implementation is much slower.
Here is the link to the original implementation: Vector-like custom container with improved insert performance
This container cannot serve as a vector replacement because it does not store items contiguously, and there seems to be no rationale behind the claimed performance improvements. Moreover, the given implementation is flawed: it will call the destructor on T elements twice if at least some of them were erased. This happens because inside erase you manually call the element destructor, and when the m_Elements vector is destroyed (or its clear is called), the destructor runs again. Also, the use of int as a size type causes signed/unsigned mixing everywhere.
-
It can; the shrink function restores the order of memory and indices. That way building the vector is faster, and if you call shrink, it's completely contiguous. The item destruction is a good point, and the signed/unsigned mix is not an issue, since this is only a prototype test implementation. – Mango, Nov 7, 2017 at 8:48
-
The improvement in performance comes from the lack of memory reordering during insert and erase operations, which directly affects performance. If you insert into a regular vector, every element on the "right" side of your insert index is shifted one element size to the right. This does not happen with this implementation until you call shrink. On smaller data types (int, double, etc.) the gain is minimal; on larger objects, on the other hand, it's enormous. If you don't believe it, try it. If you don't want to try it, don't make claims about the performance. – Mango, Nov 7, 2017 at 8:59
-
@Mango If you try to add a data method to obtain a pointer to the buffer (just like a normal vector), then you'll have to call shrink inside it, and it will turn out to be very ugly. In the const-qualified variant of data you'll have to use const_cast (or mark the field as mutable, which is even worse). The shrink implementation is rather slow and potentially throwing, as it must perform a mass reallocation and iterate over two vectors (and it suffers from the same double-destructor problem). – user7860670, Nov 7, 2017 at 8:59
-
Vector is indeed not the fastest when it comes to inserting in the middle, but you cannot claim "performance improvements" just because you've improved only one or two of vector's methods while potentially slowing the others considerably. Also note that your container suffers from extra memory overhead. – user7860670, Nov 7, 2017 at 9:02
-
I tested around a lot: push_back performance suffers close to nothing, while insert and erase gain a lot. Yes, the shrink call before a data call might get ugly, but you could also call it before you pass the container anywhere. And yes, indeed, the memory overhead is much higher; you won't get good performance and low memory at once. But if one needs contiguous memory and fast insert/erase/push_back performance, there is no good choice among the STL containers, which has been bugging me for a long time now. I never said it's a wonder container that handles everything perfectly, but it's an approach to a common problem. – Mango, Nov 7, 2017 at 9:09
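For reference, the shifting cost debated in this comment thread is the documented behavior of std::vector::insert; a minimal, self-contained illustration (the helper name insert_front is made up for this sketch):

```cpp
#include <vector>

// std::vector::insert moves every element at or after the insertion point
// one slot toward the back before constructing the new value, so a front
// insertion into an n-element vector costs O(n) element moves. This is the
// per-insert cost the custom container defers until shrink() is called.
std::vector<int> insert_front(std::vector<int> v, int value)
{
    v.insert(v.begin(), value);
    return v;
}
```

Whether deferring that shift pays off depends on how expensive the element type is to move, which matches the observation above that the gain is minimal for int or double and large for big objects.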
It's an interesting idea, though you should not make the mistake of thinking this is a generalized improvement to vector. On the contrary, it is much more narrowly applicable, since it slows down the common-case usage of vector for most people in favor of the rare scenario with many insertions and removals from the middle or beginning. Nevertheless, it might be useful for some exotic cases.
However, it's fundamentally flawed. You can't erase elements from the middle like that: leave them in the data array, destroy them at the time of removal, and then destroy them again when the container is destroyed. To avoid the linear-time removal of the data, you'll have to mark erased slots in some way or keep another data structure, such as another vector of ints, like vector<int> m_FreeIndices;
If you do it this way, then in your destructor (and you need a manual destructor) you can sort the free indices and then, in linear time, skip over destroying the elements at the sorted freed indices, which have already been destroyed. Shrinking can clear the free-index array.
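That destructor bookkeeping can be sketched as follows (names are hypothetical, not from the reviewed container; this assumes the element slots are raw storage destroyed manually, since a plain std::vector<T> would run every destructor again on its own):

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <vector>

// Destroy only the live slots: sort the free-index list once, then walk the
// slots and the sorted free indices in lockstep, skipping every slot that
// erase() already destroyed. Costs O(k log k) for the sort (k = number of
// freed slots) plus O(n) for the walk over all n slots.
void destroy_live_slots(std::vector<int>& freeIndices, int slotCount,
                        const std::function<void(int)>& destroySlot)
{
    std::sort(freeIndices.begin(), freeIndices.end());
    std::size_t next = 0; // cursor into the sorted free list
    for (int i = 0; i < slotCount; ++i)
    {
        if (next < freeIndices.size() && freeIndices[next] == i)
        {
            ++next;        // slot i was erased earlier; do not destroy twice
            continue;
        }
        destroySlot(i);    // e.g. invoke ~T() on the element in slot i
    }
}
```

In the container's destructor this would run with slotCount equal to the number of element slots, and shrink() can then clear the free-index list as described.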
As for whether or not you use vector to implement it, I actually don't care, as long as you unit-test this well and make sure it works in all possible cases, including edge cases; that it properly supports non-POD types of T, invoking their constructors and destructors as needed; and that it is exception-safe.
You might be able to make it much simpler if you reduce its applicability to POD types, at which point you can go back to using things like memcpy and realloc for the implementation. However, if you do that, my request is that you add a static assertion to make sure that T is, indeed, a plain-old-data type with trivial constructors and destructors, and emit compile errors if the collection is used with anything else.
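Since C++11 that check no longer needs compiler-specific machinery; the standard <type_traits> header expresses it portably. A minimal sketch (pod_container is a hypothetical name, not the reviewed class):

```cpp
#include <type_traits>

// Hypothetical skeleton of a POD-only container: the static_assert turns
// any instantiation with a non-trivial element type into a compile error,
// which is what makes a memcpy/realloc-based implementation safe to use.
template <class T>
class pod_container
{
    static_assert(std::is_trivially_copyable<T>::value
                      && std::is_trivially_destructible<T>::value,
                  "pod_container<T> requires a trivial T: memcpy/realloc "
                  "would bypass constructors and destructors");
public:
    // memcpy/realloc-based storage would go here
};

pod_container<int> ok; // compiles: int is trivial
// pod_container<std::string> rejected; // would fail the static_assert
```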
-
Thanks for the suggestion; I am aware that this is not a replacement for all purposes. Meanwhile, I went back to my older (a bit more complicated) vector implementation, since the double-vector implementation sadly underperforms compared to my original version while still having the same flaws and more. I am using this mostly for small gaming applications. – Mango, Dec 1, 2017 at 17:58
-
In my opinion, if you are doing this as an optimization for specific cases, I would actually go back to your old one and exploit the fact that you can just move things around as raw bits and bytes with memcpy and especially realloc, which vector can't exploit due to its design with std::allocator. However, that will make it unable to store C++ classes with non-trivial constructors and destructors. But hey, it's an optimization detail, something I wouldn't try to generalize to work in every possible use case. – user90268, Dec 1, 2017 at 18:00
-
... just try to make it so it doesn't silently accept the use cases it can't handle. If you use things like memcpy and realloc, my number one request and suggestion is to add static assertions to make sure that T is, indeed, a POD type (like a C-style struct), so that it's a build error to try to create, say, a container<std::string>. You can look up the docs for your compiler to do that; I've seen ways to detect this on MSVC and GCC at least. – user90268, Dec 1, 2017 at 18:03
- You may laugh, but I actually tried that too... In the end I went back to the double-pointer-based solution, since it suited my needs better. I fixed the double-destructor issue and basically have no problems with the older implementation at all, so it's still in use ;) – Mango, Dec 1, 2017 at 18:03
-
The "memcpy" container was the first attempt at optimizing a dynamic array for this special case. – Mango, Dec 1, 2017 at 18:05