Create a C style char** from a C++ vector<string>

Question 1

I am dealing with some older C style APIs, like Posix execve that take a char**. In the rest of my code, I prefer to use a fairly modern C++ style, so I have vector of std::string. Is this the best way to handle the glue between them?

char ** strlist(std::vector<std::string> &input) {
 char** result = new char*[input.size()];
 result[input.size()] = nullptr;
 for (int i=0; i<input.size(); i++) {
 char *temp = new char[input[i].size()];
 strcpy(temp, input[i].c_str());
 result[i] = temp;
 }
 return result;
}
bool del_strlist(char ** strings, int limit=1024) {
 bool finished = false;
 for (int i=0; i<limit; i++) {
 char *temp = strings[i];
 if (temp == nullptr) {
 finished = true;
 break;
 }
 delete temp;
 }
 delete strings;
 return !finished;
}

Question 2

Can you explain why del_strlist returns a boolean? What is the result used for?

Question 3

Could you tell us the version of C++ standard you want to stick to please? It can vary the interface and prefered code style a lot. For C there has not been as large of a chance in standards in the last decade. C++11 is quite different to before, C++14 is a significant change again (C++17 not so dramatic IMHO).

Question 4

Perhaps it makes more sense create an execve wrapper (that takes vector<string>) rather than trying to maintain a parallel vector<string>

Question 5

What is the lifespan of the char** you pass? Do the pointers need to remain valid throughout, or can they be discarded after the call?

Question 6

Your second statement overwrites unallocated memory already.

Question 7

Since you already have a std::vector<std::string>, it's much simpler to let that own the memory, and build a parallel std::vector<char*> which just keeps pointers into the original strings. The only difficulty is ensuring this isn't used after the owning vector goes out of scope.

The simplest implementation is something like:

std::vector<char*> strlist(std::vector<std::string> &input) {
 std::vector<char*> result;
 // remember the nullptr terminator
 result.reserve(input.size()+1);
 std::transform(begin(input), end(input),
 std::back_inserter(result),
 [](std::string &s) { return s.data(); }
 );
 result.push_back(nullptr);
 return result;
}

where the caller is responsible for managing the lifetimes of the two vectors. You get the required char ** by calling result.data().

If you want to avoid the potential lifetime problems, you can write a class to own both vectors for you so their lifetimes are synchronized.

Note that I deliberately haven't taken the vector by const ref, because execve takes an array of pointers to non-const char, but it doesn't mutate the contents so there's no need to make a copy. This lack of const-safety is common in old C interfaces.

PS. I didn't force myself to use transform when a simple loop would do hoping for some magic optimization. I used transform instead of a simple loop because it more clearly expressed the semantics of my code.

A "simple" loop is a powerful multipurpose language construct which could do anything. By comparison transform is a simple, special-purpose algorithm, and much easier to reason about as a consequence.

Not least, since I had to take a non-const reference to the argument despite having no intention of mutating it, anything that makes it easier to see at a glance that no mutation is possible, is helpful.

Question 8

I much prefer this approach. One minor point - I think you need to push_back a nullptr at the end. Also - if you go down the class path, you'll need to disable or do the right thing for copy construction and assignment.

Question 9

Good points both!

Question 10

Oh - another thing. I think you should use c_str() instead of data() (which isn't guaranteed to null terminate).

Question 11

... and all known implementations null-terminated the result of data() before C++11.

Question 12

@nurettin, the null pointer is there because the users need it to delimit the array (it's the interface of exec() family of functions).

Question 13

I can see a few potential problems here:

Since you allocated a char* array of input.size() elements, result[input.size()] is out of bounds.
Similarly, std::string's size() is the number of characters - it doesn't include the trailing 0円 needed for C-style strings. So every strcpy here risks doing a buffer overflow (risks, because it is possible for C++ std::strings to contain a null in the middle, terminating the strcpy mid way).
You have set a limit on the number of elements of strings you delete, but then delete strings irrespective of whether that limit was breached. This risks a memory leak if the limit was exceeded.
You use the array version of new new <type>[<size>]; This means you need to use the array version of delete delete [] <object> (Note the square brackets).

Question 14

Different approach

The approach in the question is somewhat procedural. My aim is to use RAII to facilitate ease of use, as the code currently can leak memory and relies on programmer to free it.

Usage cases

Lets first have a look at the usage:

Create cstring style array from std::strings.
Sink the created array into exec family of functions
Wait until child process exits
Reclaim the memory

Now, it clearly looks like constructor and destructor calls, as they operate on the same data, possibly even unmodified.

Code

Here is rough sketch of the class I had in mind:

class owned_cstrings {
 std::vector<char*> cstring_array;
public:
 owned_cstrings(const std::vector<std::string>& source) :
 cstring_array(source.size())
 {
 std::transform(source.begin(), source.end(), cstring_array.begin(), [](const auto& elem_str) {
 char* buffer = new char[elem_str.size() + 1];
 std::copy(elem_str.begin(), elem_str.end(), buffer);
 buffer[elem_str.size()] = 0;
 return buffer;
 });
 cstring_array.push_back(nullptr);
 }
 owned_cstrings(const owned_cstrings& other) = delete;
 owned_cstrings& operator=(const owned_cstrings& other) = delete;
 owned_cstrings(owned_cstrings&& other) = default;
 owned_cstrings& operator=(owned_cstrings&& other) = default;
 char** data() {
 return cstring_array.data();
 }
 ~owned_cstrings() {
 for (char* elem : cstring_array) {
 delete[] elem;
 }
 }
};

Design decisions

The code above would've been much more dangerous footgun without some thought put into it. First of all, it is not copyable, although it could do deep copy, I believe it is not intended. Not doing deep copy will result in more than one delete, which is disastrous. Second, the data access is somewhat limited, because the only usage case covered is sinking into exec family of functions.

Demo

Small demo on Wandbox:

#include <vector>
#include <string>
#include <algorithm>
class owned_cstrings {
 std::vector<char*> cstring_array;
public:
 owned_cstrings(const std::vector<std::string>& source) :
 cstring_array(source.size())
 {
 std::transform(source.begin(), source.end(), cstring_array.begin(), [](const auto& elem_str) {
 char* buffer = new char[elem_str.size() + 1];
 std::copy(elem_str.begin(), elem_str.end(), buffer);
 buffer[elem_str.size()] = 0;
 return buffer;
 });
 cstring_array.push_back(nullptr);
 }
 owned_cstrings(const owned_cstrings& other) = delete;
 owned_cstrings& operator=(const owned_cstrings& other) = delete;
 owned_cstrings(owned_cstrings&& other) = default;
 owned_cstrings& operator=(owned_cstrings&& other) = default;
 char** data() {
 return cstring_array.data();
 }
 ~owned_cstrings() {
 for (char* elem : cstring_array) {
 delete[] elem;
 }
 }
};
#include <iostream>
template <typename T>
std::ostream& operator<<(std::ostream& os, const std::vector<T>& v) {
 if (v.empty()) {
 return os;
 }
 os << v.front();
 for (std::size_t i = 1; i < v.size(); ++i) {
 os << ' ' << v[i];
 }
 return os;
}
std::ostream& operator<<(std::ostream& os, char** words) {
 while (*words) {
 os << *words++ << ' ';
 }
 return os;
}
int main() {
 std::vector<std::string> words = { "What", "a", "beautiful", "world" };
 std::cout << words << '\n';
 owned_cstrings cstring_words(words);
 std::cout << cstring_words.data() << '\n';
}

Question 15

Once you have a class like this, it's a pity not to make the std::vector<std::string> useless by providing insertion/deletion capabilities.

Question 16

I'm a bit unsure what you mean. I believe you meant some sort modification capabilities into the class? In that case, I would start from cstring class, as that would let them combine it in their own way. Anyway, please write down the idea and I'll try to implement/discuss it later this evening.

Question 17

I meant very simply to give owned_cstrings a push_back method, so you don't need to create a std::vector<std::string> before converting it into a owned_cstrings.

Question 18

@papagaga, perhaps range constructor along with initializer list would fit better?

Question 19

new char[] with delete p?

Question 20

Firstly, there's a memory leak possible, because strlist performs two allocations using new[]. If the first succeeds, but the second throws std::bad_alloc, then we have no reference to *result with which to delete[] it.

The answer by Incomputable shows how the interface can be greatly improved.

I have a minor point that might be overlooked when reading that answer, and an efficiency improvement that can be incorporated into the original or into the proposed RAII object.

The minor point is that strlist() requires a reference to a mutable vector for no good reason - the signature should be

char ** strlist(const std::vector<std::string> &input);
// ^^^^^

The efficiency improvement is that we know the total storage requirement for all the strings at the start of the function/constructor, so we can make a single allocation and place all our strings within that block instead of making separate allocations for each string to be accessed. See example code below.

From C++11 onwards, we could go further, and make our object be a view object, simply storing pointers to the data() of the input strings (which would now need to be mutable - consider pass-by-value, and call it using std::move() where that's useful).

Finally, is there a good reason that this should work only with std::vector and not with other containers?

Single-allocation method

Here's how to make two passes over input to save making several small allocations. I'm keeping (nearly) the original interface to make the changes more obvious, but I really recommend you create a type to ensure that the memory management is automatic.

#include <cstring>
#include <string>
#include <vector>
char *const *strlist(const std::vector<std::string>& input)
{
 char **result = new char*[input.size() + 1];
 std::size_t storage_size = 0;
 for (auto const& s: input) {
 storage_size += s.size() + 1;
 }
 try {
 char *storage = new char[storage_size];
 char *p = storage;
 char **q = result;
 for (auto const& s: input) {
 *q++ = std::strcpy(p, s.c_str());
 p += s.size() + 1;
 }
 *q = nullptr; // terminate the list
 return result;
 }
 catch (...) {
 delete[] result;
 throw;
 }
}
void del_strlist(char *const *strings)
{
 // First string is the allocated storage
 delete[] strings[0];
 delete[] strings;
}

#include <iostream>
int main()
{
 std::vector<std::string> args{ "/bin/ls", "ls", "-l" };
 auto v = strlist(args);
 for (auto p = v; *p; ++p) {
 std::cout << '\'' << *p << "'\n";
 }
 del_strlist(v);
}

Single-allocation method, with smart pointer

We can use std::unique_ptr to hold our data, if we don't mind using a custom deleter:

#include <cstring>
#include <memory>
#include <string>
#include <vector>
auto strlist(const std::vector<std::string>& input)
{
 static auto const deleter = [](char**p) {
 // First string is the allocated storage
 delete[] p[0];
 delete[]p;
 };
 std::unique_ptr<char*[], decltype(deleter)>
 result{new char*[input.size() + 1], deleter};
 // Ensure that destructor is safe (in case next 'new[]' fails)
 result[0] = nullptr;
 std::size_t storage_size = 0;
 for (auto const& s: input) {
 storage_size += s.size() + 1;
 }
 char *p = result[0] = new char[storage_size];
 char **q = result.get();
 for (auto const& s: input) {
 *q++ = std::strcpy(p, s.c_str());
 p += s.size() + 1;
 }
 *q = nullptr; // terminate the list
 return result;
}

#include <iostream>
int main()
{
 std::vector<std::string> args{ "/bin/ls", "ls", "-l" };
 auto v = strlist(args);
 for (auto p = v.get(); *p; ++p) {
 std::cout << '\'' << *p << "'\n";
 }
}

You should see that this is a bit simpler to write and to use than my first version.

P.S. Both demos compile with g++ -std=c++17 -Wall -Wextra -Wwrite-strings -Wno-parentheses -Wpedantic -Warray-bounds -Weffc++ and run under Valgrind with no warnings or errors, in case that's not already assumed.

Question 21

Agreed. I want to draw more focus towards std::string_view. If you don't need manage the memory of a string (read only and probably most size-preserving operations) should work on a string_view instead of a string&.

Question 22

May or may not be appropriate in this case - if we accept std::string_view as input, we'll need to copy the contents, in order to get the terminating NUL. So it's a case of swings and roundabouts here.

Question 23

if this were rust you could probably safely get away with temporarily replacing that zero byte in just for the duration of the c call. The same could be done here but it's questionable practice :D

Question 24

It should be noted however that if you were to pass this to the exec() family you must terminate it with a NULL pointer.

Question 25

@Pryftan - that's what the *q = nullptr is for, immediately before the return. I've even commented it to explain!

Question 26

There are some implicit problematic assumptions...

You must not forget to call del_strlist, and call it exactly once.
You clearly have to transport some data via unknown means from strlist call to the point where del_strlist takes place, namely the size of the vector.
The user has to know how each string was allocated. As your char** data structure is read/write, someone outside even might change an element to another, and try to use malloc/free.

All this data aggregation/"use the correct methods"-requirements strongly suggest to write a class.

You could avoid these problems by providing a class constructed from a either a std::vector<std::string> or a const char**, which would store the data in a local char** (and automatically deallocate it in the destructor). This already solves problem #1.

Then you could add a size_t member to your class containing the array size - this solves problem #2 - now the destructor knows the array size.

To tackle #3, you could even add some operator[]s which are able to access or even replace a char* at an index (and reallocate the memory in a correct way).

If you provide an operator char**() which returns the internal (private) member in which the allocated strings are stored, you may use this class anywhere where a char** is expected :)

If you need the stored data as C++ vector again, you might want to add a std::vector<std::string> get() const member to this class.

Question 27

Functions like execve actually take a char* const *, so an array of const pointers, to mutable strings. So strlist could return char* const * instead.
strlist should take the input argument by const reference, because it does not modify it.
char** result = new char*[input.size() + 1]; is needed, because an additional element (the trailing null pointer) is appended. (array[3] is an array of 3 items, with indices 0, 1, 2.)
Similarily, the strings need to allocated as char *temp = new char[input[i].size() + 1];, because std::string::size (or std::string::length) does not count the trailing 0 byte of the string.
std::strcpy is needed, except if a using declaration was used before. In the C++ headers like <cstring>, the C functions are in the std namespace.
in del_strlist, delete[] needs to be used two times instead of delete, because the memory was allocated using operator new[].
Because del_strlist is supposed to always finish (return false), it may be better to throw an exception if it does not, so that the user does not need to check the return value.

To avoid needing to have a deleting function del_strlist, it may be better to have a class that contains the allocated array, and deallocates it in the destructor. For example:

class strlist {
private:
 std::vector<char*> cstrings_;
public:
 template<typename It>
 strlist(It begin, It end) { // does not need std::vector as input
 for(It it = begin; it != end; ++it) {
 const std::string& string = *it;
 char* cstring = new char[string.length() + 1];
 std::strcpy(cstring, string.c_str());
 cstrings_.push_back(cstring);
 }
 cstrings_.push_back(nullptr);
 }
 ~strlist() {
 for(char* cstring : cstrings_)
 if(cstring != nullptr) delete[] cstring;
 }
 char* const * data() const {
 return cstring_.data();
 }
};

And used like:

void f() {
 std::vector<std::string> original_strings = ....
 strlist cstrings(original_strings.begin(), original_strings.end());
 execve("program", cstrings.data(), nullptr);
 // no need to call function to delete cstrings
}

Some older C APIs may take char* arrays as input, but not actually modify the array contents. In that case it is simpler to not copy new strings, but take pointers to strings in the original array, using const_cast. For example:

std::vector<char*> strlist(const std::vector<std::string>& strings) {
 std::vector<char*> cstrings;
 for(const std::string& string : strings)
 cstrings.push_back(const_cast<char*>(string.c_str()));
 cstrings.push_back(nullptr);
 return cstrings;
}

But this is a hack, and mostly useful only to gain performance / save memory.

Useless Useless 1,6939 silver badges10 bronze badges · Answer 1 · 2018-10-10 15:02:43Z

Since you already have a std::vector<std::string>, it's much simpler to let that own the memory, and build a parallel std::vector<char*> which just keeps pointers into the original strings. The only difficulty is ensuring this isn't used after the owning vector goes out of scope.

The simplest implementation is something like:

std::vector<char*> strlist(std::vector<std::string> &input) {
 std::vector<char*> result;
 // remember the nullptr terminator
 result.reserve(input.size()+1);
 std::transform(begin(input), end(input),
 std::back_inserter(result),
 [](std::string &s) { return s.data(); }
 );
 result.push_back(nullptr);
 return result;
}

where the caller is responsible for managing the lifetimes of the two vectors. You get the required char ** by calling result.data().

If you want to avoid the potential lifetime problems, you can write a class to own both vectors for you so their lifetimes are synchronized.

Note that I deliberately haven't taken the vector by const ref, because execve takes an array of pointers to non-const char, but it doesn't mutate the contents so there's no need to make a copy. This lack of const-safety is common in old C interfaces.

PS. I didn't force myself to use transform when a simple loop would do hoping for some magic optimization. I used transform instead of a simple loop because it more clearly expressed the semantics of my code.

A "simple" loop is a powerful multipurpose language construct which could do anything. By comparison transform is a simple, special-purpose algorithm, and much easier to reason about as a consequence.

Not least, since I had to take a non-const reference to the argument despite having no intention of mutating it, anything that makes it easier to see at a glance that no mutation is possible, is helpful.

I much prefer this approach. One minor point - I think you need to push_back a nullptr at the end. Also - if you go down the class path, you'll need to disable or do the right thing for copy construction and assignment.
Oh - another thing. I think you should use c_str() instead of data() (which isn't guaranteed to null terminate).
... and all known implementations null-terminated the result of data() before C++11.
@nurettin, the null pointer is there because the users need it to delimit the array (it's the interface of exec() family of functions).

muru muru 9734 silver badges14 bronze badges · Answer 2 · 2018-10-10 04:58:37Z

I can see a few potential problems here:

Since you allocated a char* array of input.size() elements, result[input.size()] is out of bounds.
Similarly, std::string's size() is the number of characters - it doesn't include the trailing 0円 needed for C-style strings. So every strcpy here risks doing a buffer overflow (risks, because it is possible for C++ std::strings to contain a null in the middle, terminating the strcpy mid way).
You have set a limit on the number of elements of strings you delete, but then delete strings irrespective of whether that limit was breached. This risks a memory leak if the limit was exceeded.
You use the array version of new new <type>[<size>]; This means you need to use the array version of delete delete [] <object> (Note the square brackets).

Incomputable Incomputable 9,7143 gold badges34 silver badges73 bronze badges · Answer 3 · 2018-10-10 05:51:03Z

Different approach

The approach in the question is somewhat procedural. My aim is to use RAII to facilitate ease of use, as the code currently can leak memory and relies on programmer to free it.

Usage cases

Lets first have a look at the usage:

Create cstring style array from std::strings.
Sink the created array into exec family of functions
Wait until child process exits
Reclaim the memory

Now, it clearly looks like constructor and destructor calls, as they operate on the same data, possibly even unmodified.

Code

Here is rough sketch of the class I had in mind:

class owned_cstrings {
 std::vector<char*> cstring_array;
public:
 owned_cstrings(const std::vector<std::string>& source) :
 cstring_array(source.size())
 {
 std::transform(source.begin(), source.end(), cstring_array.begin(), [](const auto& elem_str) {
 char* buffer = new char[elem_str.size() + 1];
 std::copy(elem_str.begin(), elem_str.end(), buffer);
 buffer[elem_str.size()] = 0;
 return buffer;
 });
 cstring_array.push_back(nullptr);
 }
 owned_cstrings(const owned_cstrings& other) = delete;
 owned_cstrings& operator=(const owned_cstrings& other) = delete;
 owned_cstrings(owned_cstrings&& other) = default;
 owned_cstrings& operator=(owned_cstrings&& other) = default;
 char** data() {
 return cstring_array.data();
 }
 ~owned_cstrings() {
 for (char* elem : cstring_array) {
 delete[] elem;
 }
 }
};

Design decisions

The code above would've been much more dangerous footgun without some thought put into it. First of all, it is not copyable, although it could do deep copy, I believe it is not intended. Not doing deep copy will result in more than one delete, which is disastrous. Second, the data access is somewhat limited, because the only usage case covered is sinking into exec family of functions.

Demo

Small demo on Wandbox:

#include <vector>
#include <string>
#include <algorithm>
class owned_cstrings {
 std::vector<char*> cstring_array;
public:
 owned_cstrings(const std::vector<std::string>& source) :
 cstring_array(source.size())
 {
 std::transform(source.begin(), source.end(), cstring_array.begin(), [](const auto& elem_str) {
 char* buffer = new char[elem_str.size() + 1];
 std::copy(elem_str.begin(), elem_str.end(), buffer);
 buffer[elem_str.size()] = 0;
 return buffer;
 });
 cstring_array.push_back(nullptr);
 }
 owned_cstrings(const owned_cstrings& other) = delete;
 owned_cstrings& operator=(const owned_cstrings& other) = delete;
 owned_cstrings(owned_cstrings&& other) = default;
 owned_cstrings& operator=(owned_cstrings&& other) = default;
 char** data() {
 return cstring_array.data();
 }
 ~owned_cstrings() {
 for (char* elem : cstring_array) {
 delete[] elem;
 }
 }
};
#include <iostream>
template <typename T>
std::ostream& operator<<(std::ostream& os, const std::vector<T>& v) {
 if (v.empty()) {
 return os;
 }
 os << v.front();
 for (std::size_t i = 1; i < v.size(); ++i) {
 os << ' ' << v[i];
 }
 return os;
}
std::ostream& operator<<(std::ostream& os, char** words) {
 while (*words) {
 os << *words++ << ' ';
 }
 return os;
}
int main() {
 std::vector<std::string> words = { "What", "a", "beautiful", "world" };
 std::cout << words << '\n';
 owned_cstrings cstring_words(words);
 std::cout << cstring_words.data() << '\n';
}

Once you have a class like this, it's a pity not to make the std::vector<std::string> useless by providing insertion/deletion capabilities.
I'm a bit unsure what you mean. I believe you meant some sort modification capabilities into the class? In that case, I would start from cstring class, as that would let them combine it in their own way. Anyway, please write down the idea and I'll try to implement/discuss it later this evening.
I meant very simply to give owned_cstrings a push_back method, so you don't need to create a std::vector<std::string> before converting it into a owned_cstrings.
@papagaga, perhaps range constructor along with initializer list would fit better?

Toby Speight Toby Speight 87.4k14 gold badges104 silver badges322 bronze badges · Answer 4 · 2018-10-10 08:10:22Z

Firstly, there's a memory leak possible, because strlist performs two allocations using new[]. If the first succeeds, but the second throws std::bad_alloc, then we have no reference to *result with which to delete[] it.

The answer by Incomputable shows how the interface can be greatly improved.

I have a minor point that might be overlooked when reading that answer, and an efficiency improvement that can be incorporated into the original or into the proposed RAII object.

The minor point is that strlist() requires a reference to a mutable vector for no good reason - the signature should be

char ** strlist(const std::vector<std::string> &input);
// ^^^^^

The efficiency improvement is that we know the total storage requirement for all the strings at the start of the function/constructor, so we can make a single allocation and place all our strings within that block instead of making separate allocations for each string to be accessed. See example code below.

From C++11 onwards, we could go further, and make our object be a view object, simply storing pointers to the data() of the input strings (which would now need to be mutable - consider pass-by-value, and call it using std::move() where that's useful).

Finally, is there a good reason that this should work only with std::vector and not with other containers?

Single-allocation method

Here's how to make two passes over input to save making several small allocations. I'm keeping (nearly) the original interface to make the changes more obvious, but I really recommend you create a type to ensure that the memory management is automatic.

#include <cstring>
#include <string>
#include <vector>
char *const *strlist(const std::vector<std::string>& input)
{
 char **result = new char*[input.size() + 1];
 std::size_t storage_size = 0;
 for (auto const& s: input) {
 storage_size += s.size() + 1;
 }
 try {
 char *storage = new char[storage_size];
 char *p = storage;
 char **q = result;
 for (auto const& s: input) {
 *q++ = std::strcpy(p, s.c_str());
 p += s.size() + 1;
 }
 *q = nullptr; // terminate the list
 return result;
 }
 catch (...) {
 delete[] result;
 throw;
 }
}
void del_strlist(char *const *strings)
{
 // First string is the allocated storage
 delete[] strings[0];
 delete[] strings;
}

#include <iostream>
int main()
{
 std::vector<std::string> args{ "/bin/ls", "ls", "-l" };
 auto v = strlist(args);
 for (auto p = v; *p; ++p) {
 std::cout << '\'' << *p << "'\n";
 }
 del_strlist(v);
}

Single-allocation method, with smart pointer

We can use std::unique_ptr to hold our data, if we don't mind using a custom deleter:

#include <cstring>
#include <memory>
#include <string>
#include <vector>
auto strlist(const std::vector<std::string>& input)
{
 static auto const deleter = [](char**p) {
 // First string is the allocated storage
 delete[] p[0];
 delete[]p;
 };
 std::unique_ptr<char*[], decltype(deleter)>
 result{new char*[input.size() + 1], deleter};
 // Ensure that destructor is safe (in case next 'new[]' fails)
 result[0] = nullptr;
 std::size_t storage_size = 0;
 for (auto const& s: input) {
 storage_size += s.size() + 1;
 }
 char *p = result[0] = new char[storage_size];
 char **q = result.get();
 for (auto const& s: input) {
 *q++ = std::strcpy(p, s.c_str());
 p += s.size() + 1;
 }
 *q = nullptr; // terminate the list
 return result;
}

#include <iostream>
int main()
{
 std::vector<std::string> args{ "/bin/ls", "ls", "-l" };
 auto v = strlist(args);
 for (auto p = v.get(); *p; ++p) {
 std::cout << '\'' << *p << "'\n";
 }
}

You should see that this is a bit simpler to write and to use than my first version.

P.S. Both demos compile with g++ -std=c++17 -Wall -Wextra -Wwrite-strings -Wno-parentheses -Wpedantic -Warray-bounds -Weffc++ and run under Valgrind with no warnings or errors, in case that's not already assumed.

Agreed. I want to draw more focus towards std::string_view. If you don't need manage the memory of a string (read only and probably most size-preserving operations) should work on a string_view instead of a string&.
May or may not be appropriate in this case - if we accept std::string_view as input, we'll need to copy the contents, in order to get the terminating NUL. So it's a case of swings and roundabouts here.
if this were rust you could probably safely get away with temporarily replacing that zero byte in just for the duration of the c call. The same could be done here but it's questionable practice :D
It should be noted however that if you were to pass this to the exec() family you must terminate it with a NULL pointer.
@Pryftan - that's what the *q = nullptr is for, immediately before the return. I've even commented it to explain!

jvb jvb 8997 silver badges12 bronze badges · Answer 5 · 2018-10-10 06:38:50Z

There are some implicit problematic assumptions...

You must not forget to call del_strlist, and call it exactly once.
You clearly have to transport some data via unknown means from strlist call to the point where del_strlist takes place, namely the size of the vector.
The user has to know how each string was allocated. As your char** data structure is read/write, someone outside even might change an element to another, and try to use malloc/free.

All this data aggregation/"use the correct methods"-requirements strongly suggest to write a class.

You could avoid these problems by providing a class constructed from a either a std::vector<std::string> or a const char**, which would store the data in a local char** (and automatically deallocate it in the destructor). This already solves problem #1.

Then you could add a size_t member to your class containing the array size - this solves problem #2 - now the destructor knows the array size.

To tackle #3, you could even add some operator[]s which are able to access or even replace a char* at an index (and reallocate the memory in a correct way).

If you provide an operator char**() which returns the internal (private) member in which the allocated strings are stored, you may use this class anywhere where a char** is expected :)

If you need the stored data as C++ vector again, you might want to add a std::vector<std::string> get() const member to this class.

tmlen tmlen 2341 silver badge2 bronze badges · Answer 6 · 2018-10-10 21:04:49Z

Functions like execve actually take a char* const *, so an array of const pointers, to mutable strings. So strlist could return char* const * instead.
strlist should take the input argument by const reference, because it does not modify it.
char** result = new char*[input.size() + 1]; is needed, because an additional element (the trailing null pointer) is appended. (array[3] is an array of 3 items, with indices 0, 1, 2.)
Similarily, the strings need to allocated as char *temp = new char[input[i].size() + 1];, because std::string::size (or std::string::length) does not count the trailing 0 byte of the string.
std::strcpy is needed, except if a using declaration was used before. In the C++ headers like <cstring>, the C functions are in the std namespace.
in del_strlist, delete[] needs to be used two times instead of delete, because the memory was allocated using operator new[].
Because del_strlist is supposed to always finish (return false), it may be better to throw an exception if it does not, so that the user does not need to check the return value.

To avoid needing to have a deleting function del_strlist, it may be better to have a class that contains the allocated array, and deallocates it in the destructor. For example:

class strlist {
private:
 std::vector<char*> cstrings_;
public:
 template<typename It>
 strlist(It begin, It end) { // does not need std::vector as input
 for(It it = begin; it != end; ++it) {
 const std::string& string = *it;
 char* cstring = new char[string.length() + 1];
 std::strcpy(cstring, string.c_str());
 cstrings_.push_back(cstring);
 }
 cstrings_.push_back(nullptr);
 }
 ~strlist() {
 for(char* cstring : cstrings_)
 if(cstring != nullptr) delete[] cstring;
 }
 char* const * data() const {
 return cstring_.data();
 }
};

And used like:

void f() {
 std::vector<std::string> original_strings = ....
 strlist cstrings(original_strings.begin(), original_strings.end());
 execve("program", cstrings.data(), nullptr);
 // no need to call function to delete cstrings
}

Some older C APIs may take char* arrays as input, but not actually modify the array contents. In that case it is simpler to not copy new strings, but take pointers to strings in the original array, using const_cast. For example:

std::vector<char*> strlist(const std::vector<std::string>& strings) {
 std::vector<char*> cstrings;
 for(const std::string& string : strings)
 cstrings.push_back(const_cast<char*>(string.c_str()));
 cstrings.push_back(nullptr);
 return cstrings;
}

But this is a hack, and mostly useful only to gain performance / save memory.

Stack Exchange Network

Create a C style char** from a C++ vector<string>

6 Answers 6

Different approach

Usage cases

Code

Design decisions

Demo

Single-allocation method

Single-allocation method, with smart pointer

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Create a C style char** from a C++ vector<string>

6 Answers 6

Different approach

Usage cases

Code

Design decisions

Demo

Single-allocation method

Single-allocation method, with smart pointer

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions