I have attempted to implement a similar version of the STL Vector; several functions are missing but I'd like a few words of advice on whether I am indeed on the right track or whether I should change my approach.
My objective is to understand how the vector works from behind the scenes and not to replace or use it for my own applications. I would like to eventually continue this process for all the STL containers.
namespace mystl
{
template <typename T>
class Vector
{
private:
std::unique_ptr<T[]> values;
std::size_t m_size;
std::size_t m_capacity;
unsigned int m_capacity_inc;
protected:
std::unique_ptr<T[]> get_values();
public:
/* Type definitions */
typedef T* iterator;
typedef const T* const const_iterator;
/* default constructor */
explicit Vector();
/* filling constructor*/
explicit Vector(std::size_t num, const T& val = T());
/* Copy constructor needs to use move since copy is a deleted function in unique_ptr*/
explicit Vector(const Vector<T>& v);
/* Move constructor */
explicit Vector(Vector<T>&& v);
/* Initializer list constructor */
explicit Vector(std::initializer_list<T> l);
/* iterator functions */
iterator begin();
iterator end();
/* capacity info functions */
std::size_t size() const;
std::size_t capacity() const;
bool empty() const;
/* modifier functions*/
void push_back(T t);
void pop_back();
void clear();
void resize(std::size_t new_size);
/* element access*/
T& operator[](int index)
{
return values[index];
}
};
/* move constructor */
template <typename T>
Vector<T>::Vector(Vector<T>&& v)
{
m_capacity = v.capacity();
m_size = v.size();
m_capacity_inc = m_capacity / 2;
values = (v.get_values());
v.clear();
}
/* initializer list constructor */
template <typename T>
Vector<T>::Vector(std::initializer_list<T> l)
{
m_size = 0;
m_capacity = l.size();
m_capacity_inc = m_size / 2;
values = std::make_unique<T[]>(m_capacity);
for (auto val : l)
{
push_back(val);
}
}
/* check is vector is empty */
template <typename T>
bool Vector<T>::empty() const
{
return m_size == 0;
}
/* returns pointer to first vector element */
template <typename T>
typename Vector<T>::iterator Vector<T>::begin()
{
return values.get();
}
/* returns pointer to last vector element */
template <typename T>
typename Vector<T>::iterator Vector<T>::end()
{
return values.get() + m_size;
}
/* size private member getter */
template <typename T>
std::size_t Vector<T>::size() const
{
return m_size;
}
/* capacity private member getter */
template <typename T>
std::size_t Vector<T>::capacity() const
{
return m_capacity;
}
/* pushes new value to the end of the vector*/
template <typename T>
void Vector<T>::push_back(T t)
{
if (m_size >= m_capacity)
{
/* size has reached capacity so we must reallocate a new, larger array */
m_capacity = m_capacity + m_capacity_inc;
/* double the capacity increase size */
m_capacity_inc = m_capacity_inc << 1;
/* allocate the new memory for the vector*/
std::unique_ptr<T[]> new_vec = std::make_unique<T[]>(m_capacity);
/* copy the old values to the new array */
for (std::size_t index = 0; index < m_size; ++index)
{
new_vec[index] = values[index];
}
/* move ownership of the new array to the current vector pointer */
values = std::move(new_vec);
}
/* input the new value and increase the size; */
values[m_size++] = t;
}
/* removes the last item and deletes the memory area related to that object and decreases the size. capacity is not affected.*/
template <typename T>
void Vector<T>::pop_back()
{
(values)[m_size - 1].~T();
--m_size;
}
/* clear function resets capacity info and deallocates memory.*/
template <typename T>
void Vector<T>::clear()
{
m_size = 0;
m_capacity = 0;
m_capacity_inc = 1;
values.reset();
}
/* resizes the container to the argument size. if the new size is less than the old size, values after the new_size are set as the default value of the
type of the template object */
template<typename T>
void Vector<T>::resize(std::size_t new_size)
{
auto new_array = std::make_unique<T[]>(new_size);
m_capacity = new_size;
m_capacity_inc = new_size;
int array_end = m_size < new_size ? m_size : new_size;
for (auto i = 0; i < array_end; i++)
{
new_array = values[i];
}
values = std::move(new_array);
m_size = new_size;
}
/* a protected function that returns the container array. */
template <typename T>
std::unique_ptr<T[]> Vector<T>::get_values()
{
/* returns an l value reference */
return std::move(values);
}
/* default vector constructor, with an empty array allocated of 0 size. */
template <typename T>
Vector<T>::Vector()
{
m_size = 0;
m_capacity = 0;
m_capacity_inc = 1;
values = std::make_unique<T[]>(m_capacity);
}
/* constructs container with size and capacity of the argument, with all values initiated with the value argument */
template <typename T>
Vector<T>::Vector(std::size_t num, const T& val)
{
/* allocate memory */
m_size = m_capacity = num;
m_capacity_inc = num / 2;
values = std::make_unique<T[]>(num);
/* initialize values */
for (std::size_t i = 0; i < m_size; ++i)
{
values[i] = val;
}
}
/* copy constructor: copies the capacity information and values from the argument vector*/
template <typename T>
Vector<T>::Vector(const Vector<T>& v)
{
m_capacity = v.m_capacity;
m_size = v.size();
m_capacity_inc = m_capacity / 2;
values = std::make_unique<T[]>(m_capacity);
for (auto i = 0; i < m_size; ++i)
{
values[i] = v.values[i];
}
}
}
I have omitted several functions which I will continue to implement - such as shrink_to_fit, reverse iterators, emplacers, operator overlaods.
Allocation in my case is being performed using smart pointers - should I opt to use the same allocator functionality implemented by the native vector or would that be an exercise in futility?
2 Answers 2
Placement New
Your main problem (as described by Jerry Coffin in the comments) is that you are creating objects when you should not be.
If we look at your copy constructor:
Vector<T>::Vector(const Vector<T>& v)
{
// Here you are allocating several objects of type T
// This actually calls the default constructor of T
values = std::make_unique<T[]>(m_capacity);
// Two problems with this:
// 1) Not all types have default constructors.
// 2) In the loop below you are then calling the assignment operator.
// Which means you are doing twice the work you need to do.
for (auto i = 0; i < m_size; ++i)
{
values[i] = v.values[i];
}
}
Next lets look at re-size
void Vector<T>::resize(std::size_t new_size)
{
// The problem here.
// If you re-size to double the number of elements (say an extra 'n')
// You are creating n elements that you don't need.
// Especially if the constructor of `T` is expensive of uses
// precious resources you don't want to do that. You do want to reserve
// the space but you don't actually want to construct the object
// until the size is correct
//
// And then you want to construct the object in place not call the
// constructor then perform a copy construction on top of that.
auto new_array = std::make_unique<T[]>(new_size);
}
This is where placement new comes in.
// Create your buffer where the data is going to be placed.
// But do not do any work on construction.
char* data = new char[sizeof(T) * countOfObjects];
// If you want to copy data into the buffer you use placement new.
T* dst = reinterpret_cast<T*>(data);
for(auto i = 0;i < src.size(); ++i) {
new (dst + i) T(src[i]); // Copy construct using placement new.
}
Memory Management
Normally there are two ways to do memory management. Smart pointers and containers. These are complimentary methods and you don't usually mix them together.
So I would expect a vector to do its own memory management and not delegate this work to a smart pointer (though you can).
-
\$\begingroup\$ In this context is
char* data = new char[sizeof(T) * countOfObjects];
equivalent tostatic_cast<T*>(::operator new(sizeof(T) * countofObjects))
? Both allocate that many Bytes without creating/placing theT
objects there. Am I correct? Great answer. \$\endgroup\$KeyC0de– KeyC0de2018年10月27日 10:05:31 +00:00Commented Oct 27, 2018 at 10:05 -
\$\begingroup\$ @Nik-Lz Yes. But also note that both guarantee correct alignment for objects of type
T
\$\endgroup\$Loki Astari– Loki Astari2018年10月27日 17:27:26 +00:00Commented Oct 27, 2018 at 17:27 -
\$\begingroup\$ How so? They don't specify alignment. It's just
char
s/bytes. Wouldn't we have to prepend withalignas(T)
? \$\endgroup\$KeyC0de– KeyC0de2018年10月27日 18:11:36 +00:00Commented Oct 27, 2018 at 18:11 -
\$\begingroup\$ @Nik-Lz: The
new-array-expression
(i.e.new Type[size]
) is a call to::operator new[]
. SeeSection: 8.5.2.4 New Paragraph 11
to see the alignment properties of::operator new[]
then seeSection 6.6.5 Alignment Paragraph 5
to see the guarantees that provide. See the comments at the bottom of this page: lokiastari.com/blog/2016/02/27/vector/index.html \$\endgroup\$Loki Astari– Loki Astari2018年10月28日 00:45:48 +00:00Commented Oct 28, 2018 at 0:45
I think it is well written for the most part. I see only a handful of things that you could change:
Use the standard algorithms
They make your code safer, shorter and convey intent more clearly.
/* initialize values */ for (std::size_t i = 0; i < m_size; ++i) { values[i] = val; }
The above is std::fill()
:
std::fill(values, values + m_size, val);
for (auto i = 0; i < m_size; ++i) { values[i] = v.values[i]; }
That's std::copy()
:
std::copy(values, values + m_size, v.values);
Focus on adding an iterator class
If you're going to extend this further, I'd suggest now adding real iterator types. Right now you're using plain pointers, which provide no error checking or type safety. Implementing iterators for the vector should be a good exercise. Be sure to also add global begin/end
overloads for the Vector
iterators so that you can use your vector type in a range-based for.
std::vector
's iterator category is theRandomAccessIterator
.See also
std::iterator
(which you should inherit from).
Other miscellaneous comments
Some parts of your code are excessively commented. The main case being inside
push_back
. Too obvious and verbose comments that just describe what each line is doing are distracting from the actual code. Let the code speak for itself.Why is
get_values
protected
? I don't suppose any other class inherits fromVector
, so the expected access level isprivate
.You need a
const
overload ofoperator []
, otherwise is is impossible to access elements on aconst Vector<T>
. Same forbegin/end
Declare overloads that are const and returnconst_iterator
. You should also provide the newcbegin/cend/
pair.Take a look at
std::vector::push_back
. It comes in two flavors, one taking a const reference and one taking a&&
move-ref. This avoids unnecessary copies when you're storing large expensive to copy objects. Also consideremplace_back
.Your
clear()
deallocates memory, but that might be quite inefficient. In most usage scenarios, when youclear
a vector you're just prepping to reuse the vector with a new collection, so if clear releases the memory you'd have to allocate again. Even though if the new collection is smaller, avoiding reallocations is still worth most of the time to avoid memory fragmentation overhead. Makeclear()
just destroy the objects, whileshrink_to_fit()
releases the memory.I think you made a good choice by using
unique_ptr
for the baking store. Of course that, as Jerry mentioned in a comment, usingmake_unique
(or evennew T[]
) creates default constructed objects, and that's not the behavior ofstd::vector
. When youreserve
with the standard vector, the new memory is left uninitialized. But I don't suppose that's a major concern in your case. Using aunique_ptr
is better than rawnew/delete
, any day of the week.Annotate the non-throwing methods with
noexcept
. Optional, but recommended in modern C++.
new[]
(even when/if hidden insidemake_unique
) can't work correctly. To make it work correctly, you want to allocate raw memory withoperator new
, then use placement new to create objects in that memory. As it is now, you're creating actual objects in the unused part of the memory, which isn't allowed. \$\endgroup\$