As a C++ beginner coming from Java, I have become increasingly confused about memory management and how to avoid memory leaks. Is the code below risking a memory leak that I'm not aware of? Any help or constructive feedback would be greatly appreciated.
#pragma once

template <class T>
class DynamicArray {
private:
    T *m_arr;
    int m_length;   //number of elements currently stored in the array
    int m_capacity; //actual size of the array
public:
    DynamicArray();
    ~DynamicArray();
    T get(int index);              //O(1)
    void add(T obj);               //no need to push any objects forward, O(1)
    void insert(int index, T obj); //pushes forward all objects in front of the given index, then sets obj at the given index, O(n)
    void set(int index, T obj);    //sets the given index of m_arr to obj, O(1)
    void remove(int index);        //removes the object at the given index and pushes all the array contents back, O(n)
    int size();                    //O(1)
    void print();
};
#include <iostream>
#include "Header.h"

template<class T>
DynamicArray<T>::DynamicArray() : m_arr(new T[1]), m_length(0), m_capacity(1) {}

template<class T>
DynamicArray<T>::~DynamicArray() {
    delete[] m_arr;
}

template<class T>
T DynamicArray<T>::get(int index) {
    if (index < m_length && index >= 0)
        return m_arr[index];
    else throw ("Index out of bounds!");
}

template<class T>
void DynamicArray<T>::set(int index, T obj) {
    if (index < m_length && index >= 0) {
        m_arr[index] = obj;
    } else throw ("Index out of bounds!");
}

template<class T>
void DynamicArray<T>::add(T obj) {
    if (m_length == m_capacity) {
        T *new_arr = new T[m_length * 2];
        for (int i = 0; i < m_length; i++) {
            new_arr[i] = m_arr[i];
        }
        delete[] m_arr;
        m_arr = new_arr;
        m_capacity = m_capacity * 2;
    }
    m_arr[m_length] = obj;
    m_length++;
}

template<class T>
void DynamicArray<T>::insert(int index, T obj) {
    if (index < m_length && index >= 0) {
        int size;
        if (m_length == m_capacity) size = m_length * 2;
        else size = m_capacity;
        T *new_arr = new T[size];
        for (int i = 0, j = 0; i < m_length; i++, j++) {
            if (i == index) {
                new_arr[j] = obj;
                j++;
            }
            new_arr[j] = m_arr[i];
        }
        delete[] m_arr;
        m_arr = new_arr;
        m_capacity = m_capacity * 2;
        m_length++;
    } else throw ("Index out of bounds!");
}

template<class T>
void DynamicArray<T>::remove(int index) {
    if (index < m_length && index >= 0) {
        T *new_arr = new T[m_capacity];
        for (int i = 0, j = 0; i < m_length; i++, j++) {
            if (i == index) i++;
            if (i < m_length) new_arr[j] = m_arr[i];
        }
        delete[] m_arr;
        m_arr = new_arr;
        m_capacity = m_capacity * 2;
        m_length--;
    } else throw ("Index out of bounds!");
}

template<class T>
int DynamicArray<T>::size() {
    return m_length;
}

template<class T>
void DynamicArray<T>::print() {
    std::cout << m_arr[0];
    for (int i = 1; i < m_length; i++) {
        std::cout << ", " << m_arr[i];
    }
}
Welcome to C++, and welcome to Code Review.
C++ memory management is, as you probably have realized, tough and error-prone. There are many things that can easily go wrong. Assuming that no exception is thrown, I don't see obvious memory leaks in your code; however, there are still some issues worth discussing.
You can take a look at my implementation of a non-resizable dynamic array or a stack-based full-fledged vector for some inspiration.
Special member functions
You have not defined a copy constructor, copy assignment operator, or their move counterparts, so the compiler will synthesize ones that simply copy all the members — which is completely wrong, as the two dynamic arrays will then point to the same memory. Not only are the elements shared between the copies, causing modifications to one array to affect the other, but the two copies will attempt to free the same memory upon destruction, leading to a double-free error, which is way more serious than a memory leak.
Initialization semantics
It is generally expected that the constructor of the element type is called \$n\$ times if \$n\$ elements are pushed into the dynamic array. In your code, however, this is not the case: the number of constructor calls is determined by the capacity of the dynamic array. Elements are first default-initialized, and then copy-assigned to.
The correct way to solve this problem requires allocating an uninitialized buffer, and using placement new (or equivalent features) to construct the elements, which is another can of worms.
Exception safety
Think of what happens when the construction of an element throws an exception — your code will halt halfway, and there will be a memory leak. Resolving this problem would require a manual try block, or standard library facilities like std::uninitialized_copy (which essentially do the same under the hood) if you switched to uninitialized buffers and manual construction.
Move semantics
All of the elements are copied every time, which is wasteful. Make good use of move semantics when appropriate.
Miscellaneous
- Use std::size_t instead of int to store sizes and indexes.1
- get, size, and print should be const. Moreover, get should return a const T&. In fact, get and set would idiomatically be replaced by operator[].
- Don't throw a const char*. Use a dedicated exception class like std::out_of_range instead.
- Manual loops like

      for (int i = 0; i < m_length; i++) {
          new_arr[i] = m_arr[i];
      }

  are better replaced with calls to std::copy (or std::move).
- Re-allocating every time insert is called doesn't seem like a good idea. A better trade-off might be to append an element and then std::rotate it into the correct position (assuming rotation doesn't throw).
- Also, print might take an std::ostream& (or perhaps std::basic_ostream<Char, Traits>&) argument for extra flexibility.

1 As Andreas H. pointed out in the comments, this recommendation is subject to debate, since the use of unsigned arithmetic has its pitfalls. An alternative is to use std::ptrdiff_t and std::ssize (C++20) instead. You can write your own version of ssize as shown on the cppreference page if C++20 is not accessible.
- 1\$\begingroup\$ I agree with everything, but the unsigned size_t for index and size needs a bit more attention. Any operation where index arithmetic is to be performed is going to fail very easily. Think of a convolution loop:

      for (size_t n = 0; n < a.size() + b.size(); n++) {
          out[n] = 0;
          for (size_t m = std::max(0U, a.size() - 1 - n); m < std::min(b.size(), n - 1); m++)
              out[n] += a[n - m] * b[m];
      }

  The min and max expressions are there to prevent out-of-bounds access for arrays a and b. But they don't. The code will in almost all cases produce a segfault. There is no compiler warning. \$\endgroup\$ – Andreas H., Feb 3, 2021 at 8:17
- \$\begingroup\$ TL/DR: I know the standard library uses unsigned for index and size_t, but given the arithmetic rules in C & C++ it is, except for the simple cases, error-prone and actually wrong to use an unsigned index when working with containers. So it is perhaps better to use a signed type for index and also size() in the first place (the latter because otherwise you always get an unsigned/signed comparison warning). \$\endgroup\$ – Andreas H., Feb 3, 2021 at 8:20
- \$\begingroup\$ @AndreasH. I understand your sentiment about the use of unsigned numbers - integral arithmetic in C++ has never been satisfactory. I personally choose to follow the precedent set by the standard library and use unsigned arithmetic to keep consistency and reduce complications due to signedness conversion, as I suppose many would do, so I recommended size_t, but the wrapping semantics does require special treatment, and signed arithmetic is indeed necessary in many scenarios involving negative numbers. The alternative, of course, is to consistently use ptrdiff_t and ssize instead. \$\endgroup\$ – L. F., Feb 3, 2021 at 9:02
- 1\$\begingroup\$ @AndreasH. I edited the answer to reflect our discussion. \$\endgroup\$ – L. F., Feb 3, 2021 at 9:08
- 2\$\begingroup\$ I just found: github.com/ericniebler/stl2/issues/182#issuecomment-287683189 There is a link to a video of a panel discussion where a few "C++ gurus" argue that using unsigned size types was a mistake in the STL in the first place. I am not saying your answer is wrong, I just think the unexplained recommendation to follow the precedent can be dangerous. Otherwise good answer. \$\endgroup\$ – Andreas H., Feb 3, 2021 at 9:10
- The delete in remove looks strange. There is no reason to create new_arr: everything can (and should) be done in place. BTW, m_capacity = m_capacity * 2; there is just wrong (I suspect a copy-paste from insert).
- An if in a tight loop incurs a strong performance penalty. Consider breaking the insertion loop into two:

      for (int i = 0; i < index; i++) {
          new_arr[i] = arr[i];
      }
      new_arr[index] = obj;
      for (int i = index; i < m_length; i++) {
          new_arr[i + 1] = arr[i];
      }

- Which brings us to the next point: no naked loops. The two loops above copy ranges. Factor them into a function:

      copy_range(new_arr, arr, index);
      new_arr[index] = obj;
      copy_range(new_arr + index + 1, arr + index, m_length - index);

  Idiomatically, copy_range is better expressed in terms of iterators rather than indices.
- Consider shrinking the array after too many removals.
Const Correctness
A member function that doesn't modify the underlying object should normally be a const
member function. For example:
T get(int index); //O(1)
This doesn't modify the contents of the DynamicArray, and returns the element by value (not reference), so client code can't get access to the DynamicArray's internal data to modify it either. As such, this should almost certainly be a const member function:
T get(int index) const;
Avoiding Unnecessary Copying
Right now your add, insert, and set member functions all accept T objects by value.
void add(T obj);
void insert(int index, T obj);
void set(int index, T obj);
This means that when you're adding/inserting/setting an element in the DynamicArray, you're starting with one copy of the object in the client code, then copying that object to pass as an argument, then doing a copy assignment to put it into the array. For types that are "cheap" to copy (e.g., int) that's fine, but for types that are expensive to copy (e.g., if somebody creates a DynamicArray<DynamicArray<int>> where each sub-array is several megabytes of data) this would get extremely slow.
To avoid (at least some of) that copying, you can pass the input by reference instead:
void add(T const &obj);
void insert(int index, T const &obj);
void set(int index, T const &obj);
This avoids creating the copy solely for argument passing, so you end up with only two instances of the object: one in the client code and one in the array itself, but not an extra one being passed as the parameter.
Using a reference to a const object allows the reference to refer to a temporary object. For example, assume I have some Complex type that represents a complex number. If I have an expression like myComplex + yourComplex, that will create a temporary object. A reference to a const object can refer to that temporary (but without the const qualifier it can't).
Rule of 0/3/5
In C++, there's kind of a general rule that if you need to implement any of assignment, copy construction, and destruction, you probably need to implement all three to get a class that manages its resources correctly.
You have a destructor, and yes, you really need to do something to deal with copy construction and copy assignment. For example, if you create a copy of a DynamicArray:
DynamicArray<int> foo;
DynamicArray<int> bar = foo;
...the result tends to be on the ugly side (e.g., typically a core dump).
So you generally want to either implement (at least) copy construction and copy assignment, or else prevent code that tries to do it from compiling, usually with something like this:
DynamicArray(DynamicArray const &) = delete;
DynamicArray &operator=(DynamicArray const &) = delete;
Although it's not necessary to get correct (non-crashing) behavior, you may want to support move construction and move assignment as well:
DynamicArray(DynamicArray &&);
DynamicArray &operator=(DynamicArray &&);
So the rule of three covers destruction, copy construction and copy assignment. Adding move assignment and move construction gives the rule of 5.
That leaves the rule of 0: don't do any of this. In many cases you can use something like an std::unique_ptr
to automate handling of the copying/assignment/destruction, so your class doesn't need to do any of the above.
- \$\begingroup\$ Could you elaborate on how I would use std::unique_ptr in this context? I mostly understand how smart pointers work, but not the practical usage of them. \$\endgroup\$ – ethan warco, Feb 2, 2021 at 18:45
- \$\begingroup\$ @ethanwarco: In this case, it's not clear to me that a unique_ptr actually would work at all well (that's why I only mentioned it in passing, but maybe I should have been more explicit). In a typical case, using a std::unique_ptr to your data will result in a type that can be moved but not copied, and in this case, I'd guess you probably want to allow copying. \$\endgroup\$ – Jerry Coffin, Feb 2, 2021 at 19:10
- \$\begingroup\$ @JerryCoffin, you can allow copying on a type that includes a unique_ptr, you just have to implement the copy constructor yourself, which you have to do in this case anyway. \$\endgroup\$ – Jan Hudec, Feb 2, 2021 at 19:23
- \$\begingroup\$ @JanHudec: Yes, you can, but in a case like this, I think you probably lose about as much as you gain. \$\endgroup\$ – Jerry Coffin, Feb 2, 2021 at 21:15
- \$\begingroup\$ @JerryCoffin, what would you lose? A unique_ptr would guard you from a bunch of mistakes, take care of all deleting and some exception safety, and you'd only have to be careful with swapping it when reallocating. \$\endgroup\$ – Jan Hudec, Feb 3, 2021 at 6:06
Your implementations for "insert" and "remove" are very inefficient. First, there is no obvious reason why "remove" would reallocate the array. And there is no reason why "insert" would always reallocate the array. Here's what I would do:
Add a private method where you pass the intended capacity. If the intended capacity is greater than the capacity, it increases the array size. If the intended capacity is much less than the capacity, decrease the array size.
For "add", set the intended capacity to length + 1, then add the element. For "insert", set the intended capacity to length + 1, copy the elements at the end of the vector using std::copy_backward, then store the new element. For "remove", use std::copy to move everything into the right place, then set the intended size to length - 1.
For "insert", you should allow the position (length), with the same effect as "add".
Using std::copy and std::copy_backward means that instead of a for-loop you will run the most optimised code possible to move the array elements.
PS. Apple uses an array implementation that makes inserting/removing array elements both at the beginning and the end of the array cheap (operations at index i take O(min(i, length - i)) steps); this is done by allowing element 0 to be stored anywhere in the array, and allowing the array to wrap around at the end of the storage.
PPS. If you shrink the array when it becomes too small, make sure that you can't run into a situation where inserting one element grows the array, then deleting one element shrinks it, etc. etc.
- \$\begingroup\$ std::vector? Implementing vector well is a fairly non-trivial task, even for people with a lot of experience. \$\endgroup\$