char* join(char** strings, size_t num_strings, char* seperator)
{
int cache = 0;
int count = 0;
for(int i = 0; i < num_strings; i++){
cache += strlen(strings[i]);
}
char * joined = malloc((cache + num_strings * strlen(seperator) + 1) * sizeof(char));
for(int j = 0; j < num_strings; j++){ //big loop
if(j != 0){
int j = 0;
while(j < strlen(seperator)){ //inserting seperators
joined[count] = seperator[j];
count++;
j++;
}
}
int insert_string = 0;
while( insert_string < strlen(strings[j])){ //inserting strings
joined[count] = strings[j][insert_string];
insert_string++;
count++;
}
}
joined[count] = 0; //Zero Termination
return joined;
}
I wrote this join()
function, which creates a string from num_strings
with specified separator, for example *
. Is there a way to make this more concise? Also is the termination correct?
3 Answers 3
char* join(char** strings, size_t num_strings, char* seperator)
The inputs can be const
as they should not be modified by this function: const char* join(const char** strings, size_t num_strings, const char* seperator)
int cache = 0;
cache
is not a very descriptive name. Prefer input_strings_length
or similar
for(int i = 0; i < num_strings; i++){
Format as for (int i = 0; i < num_strings; i++) {
(added spaces for readability)
char * joined
Why does the *
have spaces around both side, but the rest are char* name
? (space only on right side)
for(int j = 0; j < num_strings; j++){ if(j != 0){ int j = 0;
Prefer not using shadowed variables (same variable name in both inside and outside)
int j = 0; while(j < strlen(seperator)){ //inserting seperators joined[count] = seperator[j]; count++; j++; }
This is same as:
strcpy(joined + count, seperator);
count += strlen(seperator);
which I find more readable
// big loop
This comment is not descriptive at all. What does it mean?
I would rename variable in for(int j = 0; j < num_strings; j++){
to be for (int string_index = 0; string_index < num_strings; string_index++) {
The whole program:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
const char * array[] = {
"First entry",
"Second entry",
"Third entry",
};
const char* join(const char** strings, size_t num_strings, const char* seperator) {
int input_strings_length = 0;
for (int i = 0; i < num_strings; i++) {
input_strings_length += strlen(strings[i]);
}
size_t string_length = input_strings_length + num_strings * strlen(seperator);
char* joined = malloc((string_length + 1) * sizeof(char));
joined[string_length] = '0円';
int array_offset = 0;
for (int string_index = 0; string_index < num_strings; string_index++) {
if (string_index != 0) {
strcpy(joined + array_offset, seperator);
array_offset += strlen(seperator);
}
strcpy(joined + array_offset, strings[string_index]);
array_offset += strlen(strings[string_index]);
}
return joined;
}
int main() {
printf("%s\n", join(array, 3, "SEP"));
return 0;
}
-
\$\begingroup\$ Probably worth
const size_t sep_len - strlen(separator);
rather than callingstrlen()
n times. \$\endgroup\$Toby Speight– Toby Speight2021年05月17日 09:23:23 +00:00Commented May 17, 2021 at 9:23 -
1\$\begingroup\$ Why
int string_length
?size_t string_length
would less likely overflow. \$\endgroup\$chux– chux2021年05月17日 20:30:48 +00:00Commented May 17, 2021 at 20:30 -
\$\begingroup\$
string
should beconst
on both levels, sochar const* const* string
. I'd also always (for consistency) put theconst
to the right of what is not modifyable. \$\endgroup\$uli– uli2021年05月18日 07:13:00 +00:00Commented May 18, 2021 at 7:13
char * joined = malloc((cache + num_strings * strlen(seperator) + 1) * sizeof(char));
Trivial (spelling): separator.
Multiplying by sizeof (char)
(which is 1, by definition) is pointless.
We only need to reserve space for num_strings - 1
separators.
The bigger issue here is that we dereference joined
, even when malloc()
returns a null pointer. That's Undefined Behaviour.
char * joined = malloc(cache + (num_strings - 1) * strlen(separator) + 1);
if (!joined) {
/* allocation failed - pass null pointer back to caller */
return joined;
}
const
Use const
to allow greater uses of join
and employ select optimizations.
Paradigm shift
C2X promotes size before pointer
Avoid running down the strings multiple times to find the length.
One pass of size_t length_separator = strlen(separator); ... size_t len = strlen(string[i]);
is sufficient to know the length.
Check for overflow
Not too hard to add basic length checking.
Some unchecked code.
#define JOIN_N 64
char* join(size_t num_strings, const char** strings, const char* separator) {
// Efficiently handle first strings.
size_t lengths[JOIN_N];
// Find separator length
size_t length_separator = strlen(separator);
if (num_strings && SIZE_MAX/num_strings >= length_separator) {
return NULL; // Too big;
}
// Find total size
size_t size_needed = 1 + num_strings*length_separator;
for (size_t i = 0; i < num_strings; i++) {
size_t len = strlen(string[i]);
if (i < JOIN_N) {
lengths[i] = len; // Save for later
}
if (SIZE_MAX - size_needed > len) {
return NULL; // Too big
}
size_needed += len;
}
// Allocate
char *joined = malloc(size_needed);
if (joined == NULL) {
return NULL; // Too big, OOM
}
// Copy
char *offset = joined;
for (size_t i = 0; i < num_strings; i++) {
if (i > 0) {
strcpy(offset, separator);
offset += length_separator;
}
size_t len = (i < JOIN_N) ? lengths[i] : strlen(strings[i]);
strcpy(offset, strings[i]);
offset += len;
}
join[offset] = '0円';
return join;
}
Also research using memcpy()
or non-standard stpcpy()
instead of strcpy()
.
strcat
andmemcpy
? \$\endgroup\$