Safe strcpy implementation with C++

Question 1

After reading this article, I've decided to take a crack at implementing a "safe_strcpy" function that would remove some of the errors that the author speaks about. Obviously, my safe_strcpy() function is inspired by his. Below, I've created a simple class that houses a std::array of one of the many different kinds of strings. My safe_strcpy() functions rely on always null-terminating the array given.

I have a few worries about my implementation:

The lines with out[N - 1] = static_cast<T>(0) seem wrong when the underlying character type is wchar_t. In this sense, I want the array to terminate with something to the effect of L"0". Will this work? If not, how can I fix it?
Should I be concerned with any performance issues or any "gotchas" that I didn't think of?
Is this a viable solution to the safe_strcpy conundrum? In other words, would you use this class in production code? What kind of improvements or extensions should this class use?

#include <cstdlib>
#include <cstdio>
#include <cstring>
#include <array>
#include <iostream>
#include <typeinfo>
using namespace std;
template <size_t N, class T = char>
struct char_array
{
 typedef T value_type;
 typedef const T const_value_type;
 typedef T& reference;
 typedef const T& const_reference;
 typedef std::array<T, N> container;
 typedef typename container::size_type size_type;
 typedef typename container::iterator iterator;
 typedef typename container::const_iterator const_iterator;
 typedef typename container::reverse_iterator reverse_iterator;
 typedef typename container::const_reverse_iterator const_reverse_iterator;
 /* Constructor */
 char_array()
 {
 static_assert((is_same<unsigned char, T>::value || is_same<char, T>::value ||
 is_same<wchar_t, T>::value || is_same<signed char, T>::value ||
 is_same<char16_t, T>::value || is_same<char32_t, T>::value) &&
 N > 0,
 "char_array initialized with invalid type or size");
 }
 /* Destructor */
 ~char_array() {}
 /* Copy constructor and assignment */
 char_array(const char_array<N,T> &other) = default;
 char_array& operator=(const char_array<N,T> &other) = default;
 /* Move constructor and assignment */
 char_array(char_array<N,T> &&other) { buf_ = std::move(other.buf_); }
 char_array& operator=(char_array<N,T> &&other) 
 { 
 buf_ = std::move(other.buf_);
 return *this;
 } 
 /* Delegate members of internal container */
 iterator begin() { return buf_.begin(); }
 const_iterator begin() const { return buf_.begin(); }
 iterator end() { return buf_.end(); }
 const_iterator end() const { return buf_.end(); }
 const_iterator cbegin() const { return buf_.cbegin(); } 
 const_iterator cend() const { return buf_.cend(); }
 reverse_iterator rbegin() { return buf_.rbegin(); }
 const_reverse_iterator rbegin() const { return buf_.rbegin(); }
 reverse_iterator rend() { return buf_.rend(); }
 const_reverse_iterator rend() const { return buf_.rend(); }
 const_reverse_iterator crbegin() const { return buf_.crbegin(); }
 const_reverse_iterator crend() const { return buf_.crend(); }
 reference operator[](size_type i) { return buf_[i]; }
 const_reference operator[](size_type i) const { return buf_[i]; }
 size_type size() const { return buf_.size(); }
 size_type max_size() const { return buf_.max_size(); }
 bool empty() const { return buf_.empty(); }
 reference at(size_type n) { return buf_.at(n); }
 const_reference at(size_type n) const { return buf_.at(n); }
 reference front(size_type n) { return buf_.front(n); }
 const_reference front(size_type n) const { return buf_.front(n); }
 reference back(size_type n) { return buf_.back(n); }
 const_reference back(size_type n) const { return buf_.back(n); }
 value_type* data() { return buf_.data(); }
 const value_type* data() const { return buf_.data(); }
 void fill(const value_type &val) { buf_.fill(val); }
 void swap(char_array &x) { buf_.swap(x.buf_); }
private:
 container buf_;
};
/*
 Allow "template magic" to automatically deduce
 the size of the given array. NOTE: this only works with
 statically-allocated arrays (e.g. arrays with a size known
 at compile time)
 @param[in] - out - a reference to the array
 @param[in] - src - the source string
 Usage:
 char buf[10];
 safe_strcpy(buf, "This is a string > 10 chars");
 */
template <class T, size_t N>
void safe_strcpy(T (&out)[N], const T *src)
{
 static_assert((is_same<wchar_t, T>::value || is_same<signed char, T>::value ||
 is_same<unsigned char, T>::value || is_same<char, T>::value || 
 is_same<char16_t, T>::value || is_same<char32_t, T>::value) &&
 N > 0, "Incompatible type for safe_strcpy");
 memcpy(out, src, N * sizeof(T));
 out[N - 1] = static_cast<T>(0);
}
template <class T, size_t N, class S>
void safe_strcpy(char_array<N,T> &out, const S *src)
{
 static_assert(is_same<T, S>::value, "Source content type different from char_array destination");
 memcpy(out.data(), src, N * sizeof(T));
 out[N - 1] = static_cast<T>(0);
}
template <class Os, size_t N, class T>
Os& operator<<(Os &os, const char_array<N,T> &arr)
{
 for (const auto &c : arr) os << c;
 return os;
}
int main()
{
 wchar_t buf[10];
 safe_strcpy(buf, L"really long string that's greater than 10 freaking characters!!!!");
 wcout << buf << "\n";
 char_array<10> buffer;
 safe_strcpy(buffer, "This is a string longer than 10 chars");
 cout << "Max size is: " << buffer.max_size() << "\n";
 cout << buffer << "\n";
}

Question 2

"... with something to the effect of L"0"" You probably meant L"0円".

Question 3

For #1, you can simply make a traits class and specialize for wchar. For example: null_terminator<T>::value is either "0円" or L"0円" depending on whether T is char or wchar, respectively.

Question 4

That's an amazingly over-engineered solution ;). I think something in the lines of safe_strcpy(char * dest, size_t destMax, const char * src, size_t scrLen) is already a big improvement, without 100 lines of template boilerplate involved. But then again, strcopying in C++ is pretty rare. std::string covers 99% of the cases for you already!

Question 5

BTW, please don't use namespace in a header file! stackoverflow.com/questions/1452721/…

Question 6

Break out your type trait

You have this large amount of is_same checks in a couple places. Break it out in its own trait, and call it something like is_char:

template <class T> is_char : std::false_type { };
template <> is_char<wchar_t> : std::true_type { };
template <> is_char<char> : std::true_type { };
template <> is_char<signed char> : std::true_type { };
...

What way, in your char_array, you can just have:

static_assert(is_char<T>::value, "invalid type, expected a char");
static_assert(N > 0, "invalid size, expected N > 0");

Furthermore, those asserts don't need to be in the constructor, they can be in the body of the class.

Lots of code Repetition

Pretty much all of your char_array code is just duplicating what std::array does. What does char_array actually give you? It costs you aggregate-initialization, and that's not nothing. Ultimateyl, it's really just a std::array<> with some extra conditions on T and N.

But we can just use those same conditions to SFINAE the safe_strcpy:

template <class T, size_t N, class S,
 class = std::enable_if_t<is_char<T>::value &&
 (N > 0) &&
 is_same<T, S>::value>>
void safe_strcpy(std::array<T,N> &out, const S *src)
{
 memcpy(out.data(), src, N * sizeof(T));
 out[N - 1] = 0;
}

And now we have no need for char_array at all.

If you really want to keep the type, I'd just inherit:

template <size_t N, class T=char>
struct char_array : std::array<T, N>
{
 static_assert(is_char<T>::value, "invalid type, expected a char");
 static_assert(N > 0, "invalid size, expected N > 0");
};

memcpy to std::copy

Instead of copying raw bytes, and having to remember to multiply, you can just use std::copy - which will under-the-hood do the same thing anyway, but will be less error-prone. So instead of:

memcpy(out.data(), src, N * sizeof(T));

Write:

std::copy(src, src+N, out.data());

Question 7

These points are good, but unfortunately your versions of the code still all overrun the source.

Question 8

@BenVoigt That's already covered in a different review.

Question 9

Gotcha

There is a potential for illegal access. While the code guarantees that the destination buffer is not overflown, it will however access beyond the source buffer if it is smaller than the destination. memcpy is just a wrong tool here.
Will I use it?

Given a very limited use case of this approach (static buffers only) and a template size specialization (a serious potential for code bloat) I will probably stay with the good old strncpy. Rumors of its un-safety are highly exaggerated.

Question 10

You're right about your first point. What would you recommend instead?

Question 11

strncpy is absolutely disastrously unsafe. The design is just horrible. When an overflow happens, you are left with something that isn't a C string.

Question 12

One hard cold truth is that C-Strings cannot be fixed: not knowing the size of the buffer (either inbound or outbound) is a recipe for failure, and the band-aids (strncpy, strlcpy, ...) are just attempting to patch the symptoms.

Another hard cold truth is that wchar_t is broken by design as it is non-portable (16 bits on Windows and 32 bits on Unix). Therefore you should not use it at all and instead rely on explicitly sized types char16_t and char32_t.

Your own code is subject to run-time issues as a result:

you never guarantee that src is not null, at the very least an assert would be useful and the documentation should reflect it, some compilers also have specific annotations (gcc's __attribute__((nonnull(2))) for example)
you never check that src is long enough and blindly copy N bytes, even though in general there is no guarantee that the source is longer than the destination

And has a number of design limitations:

you do not allow copying piecemeal (useful to concatenate...)
you only allow copying into statically allocated buffer
you treat the strings as a meaningless sequence of bytes, and truncate without regard for its encoding (but then, std::string does so too...)

If you cannot afford std::string, or wish to design an alternative where buffer ownership can be external (so as to use fixed-sized buffer), then I advise using a class approach and simply make it compatible with C-Strings for interoperability:

by allowing construction from a C-String
by keeping it NUL-terminated for cheap conversion into a C-String

And then, this class can easily implement safe copy!

Last note: std::string may not be as costly as you think it is, if you are avoiding it for performance reason, measure first...

A rough example (untested!) of a safer alternative to C-String:

//
// Fixed-Capacity String
//
// It never allocates or re-allocates, but instead uses the
// externally provided buffer.
//
template <typename T>
class fcstring {
public:
 // Construction
 explicit fcstring(T* cstring):
 _capacity(cstring ? std::strlen(cstring) : 0),
 _size(_capacity),
 _buffer(cstring)
 {}
 template <size_t N>
 explicit fcstring(T (&str)[N]):
 _capacity(N-1),
 _size(N-1),
 _buffer(str)
 {}
 // "buffer" is assumed to already point at a valid C-String
 // of size "size" (not counting the terminating NUL-byte).
 // "capacity" may be additionally provided if the buffer is known
 // to be larger than "size", it then represents the number of "T"
 // than the "buffer" may contain, not counting the NUL-byte.
 fcstring(T* buffer, size_t size, size_t capacity = 0):
 _capacity(capacity ? capacity : size),
 _size(size),
 _buffer(buffer)
 {}
 // Copy
 void assign(fcstring const& other) {
 size_t const copied = std::min(_capacity, other._size);
 std::memcpy(_buffer, other._buffer, copied);
 _buffer[copied] = T();
 _size = copied;
 }
 // Conversion to C-String
 // /!\ Be mindful that any embedded NUL character will
 // /!\ result in a truncated output
 T const* c_str() const { return _size ? _buffer : ""; }
private:
 size_t _capacity;
 size_t _size;
 T* _buffer;
}; // class fcstring

Barry Barry 18.5k1 gold badge40 silver badges92 bronze badges · Answer 1 · 2015-12-21 22:41:24Z

Break out your type trait

You have this large amount of is_same checks in a couple places. Break it out in its own trait, and call it something like is_char:

template <class T> is_char : std::false_type { };
template <> is_char<wchar_t> : std::true_type { };
template <> is_char<char> : std::true_type { };
template <> is_char<signed char> : std::true_type { };
...

What way, in your char_array, you can just have:

static_assert(is_char<T>::value, "invalid type, expected a char");
static_assert(N > 0, "invalid size, expected N > 0");

Furthermore, those asserts don't need to be in the constructor, they can be in the body of the class.

Lots of code Repetition

Pretty much all of your char_array code is just duplicating what std::array does. What does char_array actually give you? It costs you aggregate-initialization, and that's not nothing. Ultimateyl, it's really just a std::array<> with some extra conditions on T and N.

But we can just use those same conditions to SFINAE the safe_strcpy:

template <class T, size_t N, class S,
 class = std::enable_if_t<is_char<T>::value &&
 (N > 0) &&
 is_same<T, S>::value>>
void safe_strcpy(std::array<T,N> &out, const S *src)
{
 memcpy(out.data(), src, N * sizeof(T));
 out[N - 1] = 0;
}

And now we have no need for char_array at all.

If you really want to keep the type, I'd just inherit:

template <size_t N, class T=char>
struct char_array : std::array<T, N>
{
 static_assert(is_char<T>::value, "invalid type, expected a char");
 static_assert(N > 0, "invalid size, expected N > 0");
};

memcpy to std::copy

Instead of copying raw bytes, and having to remember to multiply, you can just use std::copy - which will under-the-hood do the same thing anyway, but will be less error-prone. So instead of:

memcpy(out.data(), src, N * sizeof(T));

Write:

std::copy(src, src+N, out.data());

These points are good, but unfortunately your versions of the code still all overrun the source.

vnp vnp 58.6k4 gold badges55 silver badges144 bronze badges · Answer 2 · 2015-12-21 22:38:03Z

Gotcha

There is a potential for illegal access. While the code guarantees that the destination buffer is not overflown, it will however access beyond the source buffer if it is smaller than the destination. memcpy is just a wrong tool here.
Will I use it?

Given a very limited use case of this approach (static buffers only) and a template size specialization (a serious potential for code bloat) I will probably stay with the good old strncpy. Rumors of its un-safety are highly exaggerated.

You're right about your first point. What would you recommend instead?
strncpy is absolutely disastrously unsafe. The design is just horrible. When an overflow happens, you are left with something that isn't a C string.

Matthieu M. Matthieu M. 6,4091 gold badge20 silver badges34 bronze badges · Answer 3 · 2015-12-22 10:07:12Z

One hard cold truth is that C-Strings cannot be fixed: not knowing the size of the buffer (either inbound or outbound) is a recipe for failure, and the band-aids (strncpy, strlcpy, ...) are just attempting to patch the symptoms.

Another hard cold truth is that wchar_t is broken by design as it is non-portable (16 bits on Windows and 32 bits on Unix). Therefore you should not use it at all and instead rely on explicitly sized types char16_t and char32_t.

Your own code is subject to run-time issues as a result:

you never guarantee that src is not null, at the very least an assert would be useful and the documentation should reflect it, some compilers also have specific annotations (gcc's __attribute__((nonnull(2))) for example)
you never check that src is long enough and blindly copy N bytes, even though in general there is no guarantee that the source is longer than the destination

And has a number of design limitations:

you do not allow copying piecemeal (useful to concatenate...)
you only allow copying into statically allocated buffer
you treat the strings as a meaningless sequence of bytes, and truncate without regard for its encoding (but then, std::string does so too...)

If you cannot afford std::string, or wish to design an alternative where buffer ownership can be external (so as to use fixed-sized buffer), then I advise using a class approach and simply make it compatible with C-Strings for interoperability:

by allowing construction from a C-String
by keeping it NUL-terminated for cheap conversion into a C-String

And then, this class can easily implement safe copy!

Last note: std::string may not be as costly as you think it is, if you are avoiding it for performance reason, measure first...

A rough example (untested!) of a safer alternative to C-String:

//
// Fixed-Capacity String
//
// It never allocates or re-allocates, but instead uses the
// externally provided buffer.
//
template <typename T>
class fcstring {
public:
 // Construction
 explicit fcstring(T* cstring):
 _capacity(cstring ? std::strlen(cstring) : 0),
 _size(_capacity),
 _buffer(cstring)
 {}
 template <size_t N>
 explicit fcstring(T (&str)[N]):
 _capacity(N-1),
 _size(N-1),
 _buffer(str)
 {}
 // "buffer" is assumed to already point at a valid C-String
 // of size "size" (not counting the terminating NUL-byte).
 // "capacity" may be additionally provided if the buffer is known
 // to be larger than "size", it then represents the number of "T"
 // than the "buffer" may contain, not counting the NUL-byte.
 fcstring(T* buffer, size_t size, size_t capacity = 0):
 _capacity(capacity ? capacity : size),
 _size(size),
 _buffer(buffer)
 {}
 // Copy
 void assign(fcstring const& other) {
 size_t const copied = std::min(_capacity, other._size);
 std::memcpy(_buffer, other._buffer, copied);
 _buffer[copied] = T();
 _size = copied;
 }
 // Conversion to C-String
 // /!\ Be mindful that any embedded NUL character will
 // /!\ result in a truncated output
 T const* c_str() const { return _size ? _buffer : ""; }
private:
 size_t _capacity;
 size_t _size;
 T* _buffer;
}; // class fcstring

Stack Exchange Network

Safe strcpy implementation with C++

3 Answers 3

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Safe strcpy implementation with C++

3 Answers 3

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions