I have tried to implement a basic vector type in C++, yet I am not sure if I can do anything any more efficiently.
template < typename _Ty > class vector
{
public:
typedef _Ty *iterator;
typedef _Ty _Value_type;
typedef vector<_Ty> _Myt;
vector()
: __size(0), __capacity(10), __data((_Value_type *)calloc(10, sizeof(_Value_type)))
{
}
vector(_Myt &_Rhs)
: __data((_Value_type *)calloc((__capacity = _Rhs.size() + 10), sizeof(_Value_type)))
{
memcpy(__data, _Rhs.__data, (__size = _Rhs.size()) * sizeof(_Value_type));
}
~vector()
{
memset(__data, NULL, 1);
}
_Value_type *data() const
{
return __data;
}
_Myt &push_back(const _Value_type &_Rhs)
{
if (++__size > __capacity)
{
reserve(__capacity + 10);
}
__data[__size - 1] = _Rhs;
return *this;
}
_Myt &operator+=(const _Value_type &_Rhs)
{
return push_back(_Rhs);
}
UINT size() const
{
return __size;
}
UINT capacity() const
{
return __capacity;
}
iterator begin() const
{
return &__data[0];
}
iterator rbegin()
{
return reversed_adaptor().begin();
}
iterator end() const
{
return &__data[__size];
}
iterator rend()
{
return reversed_adaptor().end();
}
iterator find(const _Value_type &_Search) const
{
for (iterator i = begin(); i != end(); ++i)
{
if (*i == _Search)
{
return i;
}
}
return NULL;
}
bool contains(const _Value_type &_Search) const
{
return find(_Search) != NULL;
}
_Myt &operator=(_Myt &_Rhs)
{
reserve((__size = _Rhs.size()) + 10);
memcpy(__data, _Rhs.__data, _Rhs.size() * sizeof(_Value_type));
return *this;
}
_Value_type pop_back()
{
_Value_type temp = __data[__size -= 1];
resize(__size);
return temp;
}
const _Value_type &at(UINT _Base) const
{
if (_Base >= 0 && _Base < size())
{
return __data[_Base];
}
throw std::out_of_range("vector::at - out of range");
}
_Value_type &at(UINT _Base)
{
if (_Base >= 0 && _Base < size())
{
return __data[_Base];
}
throw std::out_of_range("vector::at - out of range");
}
_Value_type &operator[](UINT _Base)
{
return __data[_Base];
}
const _Value_type &operator[](UINT _Base) const
{
return __data[_Base];
}
_Myt &swap(_Myt &_Rhs)
{
std::swap(*this, _Rhs);
return *this;
}
_Myt &reserve(UINT _Capacity)
{
__data = (_Value_type *)realloc(__data, (__capacity = _Capacity) * sizeof(_Value_type));
return *this;
}
_Myt &resize(UINT _Size, _Value_type _Value = _Value_type())
{
int over = (_Size > __size), temp = __size;
__data = (_Value_type *)realloc(__data, (__capacity = (__size = _Size)) * sizeof(_Value_type));
if (over)
{
for (iterator i = &__data[temp]; i != end(); ++i)
{
*i = _Value;
}
}
return *this;
}
_Value_type erase(iterator _Iter)
{
if (_Iter == end())
{
return pop_back();
}
for (iterator i = _Iter; i + 1 != end(); ++i)
{
*i = *(i + 1);
}
return pop_back();
}
template < typename _Ty1 > bool operator==(const vector<_Ty1> &_Rhs)
{
if ((typeid(_Value_type) != typeid(_Ty1)) || size() != _Rhs.size())
{
return false;
}
for (iterator i = begin(), j = _Rhs.begin(); i != end(), j != _Rhs.end(); ++i, ++j)
{
if (*i != *j)
{
return false;
}
}
return true;
}
template < typename _Ty1 > bool operator!=(const vector<_Ty1> &_Rhs)
{
return !(*this == _Rhs);
}
bool empty()
{
return size > 0;
}
_Myt &reverse()
{
std::reverse(begin(), end());
return *this;
}
_Myt reversed_adaptor()
{
_Myt adaptor(*this);
return adaptor.reverse();
}
_Myt insert(iterator _Begin, iterator _End)
{
for (iterator i = _Begin; i != _End; ++i)
{
push_back(*i);
}
return *this;
}
private:
_Value_type *__data;
UINT __size, __capacity;
};
-
2\$\begingroup\$ Stop using underscore at the beginning of your identifiers (the rules are non trivial). An initial underscore followed by a capitol letter or double underscore anywhere are actually reserved for use by the implementation. stackoverflow.com/a/228797/14065 This means you should not use them. \$\endgroup\$Loki Astari– Loki Astari2014年02月21日 05:53:50 +00:00Commented Feb 21, 2014 at 5:53
2 Answers 2
Some other points, in completely random order as I came across them...
pop_back
,pop_front
anderase
should have return typevoid
The reason here is that most of the time the caller does not need the erased object and if you return it, the copy has to be made and it can be pretty expensive. Filling a vector (if done properly, see 3. below) amortizes to just 2 copies per inserted element, so vector is a good choice even for quite complex objects and you don't want useless extra copies.
calloc
is pointless, since you are obliged to construct the objects anyway.reserve(__capacity + 10);
inpush_back
is violation of the complexity contract.push_back
is required amortizedO(1)
, which means the reallocation has to be in geometric sequence. Usual values arereserve(__capacity * 2)
orreserve(__capacity + (__capacity+1)/2)
, and minimal capacity of 16 or so.Just like the
realloc
already mentioned by ChrisW, callingmemcpy
on array containing non-POD objects (and vector is for containing non-POD objects) is an UndefinedBehaviourTM.The compiler may choose to do anything it pleases from simply crashing all the way over formatting your hard drive to making daemons fly out of your nose. And modern compilers can be real bitches when punishing UndefinedBehaviourTM. Effect of some optimizations applied to incorrect code is next to impossible to understand.
You have to copy the elements one by one with
std::copy
or a loop to properly invoke constructors and you have to call destructors afterwards.Assigning to unconstructed memory in
push_back
andresize
is also UndefinedBehaviourTM.Non-POD object has to be constructed by constructor and destructed by destructor. Use the placement new:
new (i) _Value_type(_Rhs);
C++ does not have any type called
UINT
.The size type should always be
std::size_t
swap()
method is another violation of the complexity contract.Vector
swap()
is supposed to be O(1). It should simply swap the pointers. That's far from whatstd::swap
would do in C++03 (in C++11 it could do the right thing if you defined the move constructors, but you didn't).Furthermore
std::swap
should be overloaded for vector to call theswap()
method. Yes, you are allowed and sometimes required to overload functions/specialize classes instd
for types you define and this is one such case.You correctly use
reserve
inpush_back
, but you fail to do the same ininsert
.The special condition for
i == end()
inerase
is superfluous.It is not allowed to pass non-deferrable iterator (end or invalid) to
erase
.Your
erase
returns useless element (but see 1. anyway).You return the value of the final pop_back, which returns the last element. If you deleted any other element than the last, it's not the one that was just deleted. Completely useless. You shouldn't be returning anything anyway.
Const correctness failures.
A
const
method should only ever return pointers and references toconst
.Standard collections have
iterator
andconst_iterator
and all methods that return iterators (begin()
,end()
,find()
etc.) have non-const overload returningiterator
and const overload returningconst_iterator
. If you use plain pointers (which is legal, the iterator concept is designed so that pointers are iterators), you should defineconst_iterator
to_Value_type const *
(const applies to the part before it, so this is the correct syntax; even in C).rbegin
andrend
return dangling pointers (provided you fix the memory leak in destructor).Another instance of UndefinedBehaviourTM. The
reversed_adaptor
returns a new instance. By value, which is inefficient, but legal. However the return value is temporary and it will only have to havebegin
/end
called on it and than it will be destroyed at the next semicolon. When you fix the memory leak and actually free the memory in destructor, the returned pointers will point to unallocated memory.rbegin
andrend
don't return valid iterators to invocant.The reverse iterators have to point to the container, not to a copy. Since you can't use plain pointers for reverse operators, it's rather a lot of work to define them. Boost has a generic implementation
You are missing most of the required typedefs.
There should be
value_type
,reference
,pointer
,const_reference
,const_pointer
,const_iterator
,reverse_iterator
,const_reverse_iterator
,size_type
(should bestd::size_t
) anddifference_type
(should bestd::ptrdiff_t
). Andallocator_type
, which is next point.You should be using allocator instead of
malloc
andfree
.Allocator abstracts the memory allocation, wraps the funny placement new and explicit destructor call syntaxes and makes the allocation and construction separate actions as appropriate for likes of vector.
The default allocator,
std::allocator
uses global operatorsnew[]
anddelete[]
for allocation, but users may want to provide special allocators.Also the allocator will properly throw
std::bad_alloc
as willnew[]
anddelete[]
(except on non-conforming platforms like Microsoft Windows) if you use them directly, but when usingmalloc
, you have to check for NULL and throwstd::bad_alloc
on allocation failure yourself.
Plus all the problems already mentioned by chrisW are valid too and I am almost certain I still didn't find everything.
I think your ~vector()
destructor is doing the wrong thing:
- It should free the memory which you allocated using
calloc
- It shouldn't
memset
over the objects contained in the array - My guess is that it should explicitly destroy the objects in the array, using the opposite of placement new ... perhaps by doing something like:
~vector
{
for (iterator i = begin(); i != end(); ++i)
i->~_Ty();
free(__data);
}
Similarly the methods which create an object in the memory should use placement new, for example as described here.
The vector you have at the moment probably works for types which have a trivial/non-existent constructor and destructor (for example int
), but not for types which have a non-default constructor and destructor (for example std::string
).
Your operator==
should be const (so that it can be used when *this
is const).
I suspect it shouldn't include the typeid(_Value_type) != typeid(_Ty1)
expression: instead it shouldn't define the _Ty1
template parameter, so that it can only be used to compare vectors which have an identical type.
This suggests you code it as a free operator (an inline functions which takes two parameters), instead of as a method.
Your find
method should return end()
, not NULL
, if it fails to find.
I suspect it's not safe for your resize
to call realloc
. For example, I might have a type like this ...
class A
{
A* self;
public:
A() { self = this; }
A(const A&) { ... to be supplied ... }
const A& operator=(const A&) { ... to be supplied ... }
A(A&&) { ... to be supplied ... }
A& operator=(A&&) { ... to be supplied ... }
}
... which remembers where itself exists in memory. If you call realloc
that will do a bitwise copy of existing elements, which will cause the moved instance of A to have invalid self
pointers.
Instead you may need to create a new buffer, and explicitly move all existing instances from the old buffer into their new locations in the new buffer.
Your identifiers shouldn't begin with an _
underscore. Those identifiers are reserved for the compiler and/or the standard run-time library.
Your insert
method doesn't make much sense: its signature doesn't match any of the three standard insert methods.