2
\$\begingroup\$
char* join(char** strings, size_t num_strings, char* seperator)
{
 int cache = 0;
 int count = 0;
 for(int i = 0; i < num_strings; i++){
 cache += strlen(strings[i]);
 }
 char * joined = malloc((cache + num_strings * strlen(seperator) + 1) * sizeof(char));
 for(int j = 0; j < num_strings; j++){ //big loop
 if(j != 0){
 int j = 0;
 while(j < strlen(seperator)){ //inserting seperators
 joined[count] = seperator[j];
 count++;
 j++;
 }
 }
 int insert_string = 0;
 while( insert_string < strlen(strings[j])){ //inserting strings
 joined[count] = strings[j][insert_string];
 insert_string++;
 count++;
 }
 }
 joined[count] = 0; //Zero Termination
 return joined;
}

I wrote this join() function, which creates a string from num_strings with specified separator, for example *. Is there a way to make this more concise? Also is the termination correct?

Toby Speight
87.2k14 gold badges104 silver badges322 bronze badges
asked May 16, 2021 at 22:31
\$\endgroup\$
3
  • 2
    \$\begingroup\$ Are you purposely avoiding strcat and memcpy? \$\endgroup\$ Commented May 16, 2021 at 23:00
  • \$\begingroup\$ idk of strcat and memcpy im new to c :D \$\endgroup\$ Commented May 16, 2021 at 23:06
  • \$\begingroup\$ There are two aspects lacking. Without going into too much detail, they are tests and documentation. Tests (consider Test-Driven Development/TDD) make sure your code behaves as expected even in corner cases. Documentation is mostly prose that tells the user of that function what its behaviour is and what its requirements are (preconditions and postconditions). Without documenting these guarantees, you often can't tell whether something is right or wrong. \$\endgroup\$ Commented May 18, 2021 at 7:45

3 Answers 3

6
\$\begingroup\$
char* join(char** strings, size_t num_strings, char* seperator)

The inputs can be const as they should not be modified by this function: const char* join(const char** strings, size_t num_strings, const char* seperator)


int cache = 0;

cache is not a very descriptive name. Prefer input_strings_length or similar


for(int i = 0; i < num_strings; i++){

Format as for (int i = 0; i < num_strings; i++) { (added spaces for readability)


char * joined

Why does the * have spaces around both side, but the rest are char* name? (space only on right side)


for(int j = 0; j < num_strings; j++){
 if(j != 0){
 int j = 0;

Prefer not using shadowed variables (same variable name in both inside and outside)


int j = 0;
while(j < strlen(seperator)){ //inserting seperators
 joined[count] = seperator[j];
 count++;
 j++;
}

This is same as:

strcpy(joined + count, seperator);
count += strlen(seperator);

which I find more readable


// big loop

This comment is not descriptive at all. What does it mean?

I would rename variable in for(int j = 0; j < num_strings; j++){ to be for (int string_index = 0; string_index < num_strings; string_index++) {


The whole program:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
const char * array[] = {
 "First entry",
 "Second entry",
 "Third entry",
};
const char* join(const char** strings, size_t num_strings, const char* seperator) {
 int input_strings_length = 0;
 for (int i = 0; i < num_strings; i++) {
 input_strings_length += strlen(strings[i]);
 }
 size_t string_length = input_strings_length + num_strings * strlen(seperator);
 char* joined = malloc((string_length + 1) * sizeof(char));
 joined[string_length] = '0円';
 int array_offset = 0;
 for (int string_index = 0; string_index < num_strings; string_index++) {
 if (string_index != 0) {
 strcpy(joined + array_offset, seperator);
 array_offset += strlen(seperator);
 }
 strcpy(joined + array_offset, strings[string_index]);
 array_offset += strlen(strings[string_index]);
 }
 return joined;
}
int main() {
 printf("%s\n", join(array, 3, "SEP"));
 return 0;
}
answered May 16, 2021 at 23:23
\$\endgroup\$
3
  • \$\begingroup\$ Probably worth const size_t sep_len - strlen(separator); rather than calling strlen() n times. \$\endgroup\$ Commented May 17, 2021 at 9:23
  • 1
    \$\begingroup\$ Why int string_length? size_t string_length would less likely overflow. \$\endgroup\$ Commented May 17, 2021 at 20:30
  • \$\begingroup\$ string should be const on both levels, so char const* const* string. I'd also always (for consistency) put the const to the right of what is not modifyable. \$\endgroup\$ Commented May 18, 2021 at 7:13
4
\$\begingroup\$
 char * joined = malloc((cache + num_strings * strlen(seperator) + 1)
 * sizeof(char));

Trivial (spelling): separator.

Multiplying by sizeof (char) (which is 1, by definition) is pointless.

We only need to reserve space for num_strings - 1 separators.

The bigger issue here is that we dereference joined, even when malloc() returns a null pointer. That's Undefined Behaviour.

char * joined = malloc(cache + (num_strings - 1) * strlen(separator) + 1);
if (!joined) {
 /* allocation failed - pass null pointer back to caller */
 return joined;
}
answered May 17, 2021 at 9:22
\$\endgroup\$
2
\$\begingroup\$

const

Use const to allow greater uses of join and employ select optimizations.

Paradigm shift

C2X promotes size before pointer

Avoid running down the strings multiple times to find the length.

One pass of size_t length_separator = strlen(separator); ... size_t len = strlen(string[i]); is sufficient to know the length.

Check for overflow

Not too hard to add basic length checking.


Some unchecked code.

#define JOIN_N 64
char* join(size_t num_strings, const char** strings, const char* separator) {
 // Efficiently handle first strings.
 size_t lengths[JOIN_N];
 // Find separator length
 size_t length_separator = strlen(separator);
 if (num_strings && SIZE_MAX/num_strings >= length_separator) {
 return NULL; // Too big;
 }
 // Find total size
 size_t size_needed = 1 + num_strings*length_separator;
 for (size_t i = 0; i < num_strings; i++) {
 size_t len = strlen(string[i]); 
 if (i < JOIN_N) {
 lengths[i] = len; // Save for later
 }
 if (SIZE_MAX - size_needed > len) {
 return NULL; // Too big
 }
 size_needed += len;
 }
 // Allocate
 char *joined = malloc(size_needed);
 if (joined == NULL) {
 return NULL; // Too big, OOM
 }
 // Copy
 char *offset = joined;
 for (size_t i = 0; i < num_strings; i++) {
 if (i > 0) {
 strcpy(offset, separator);
 offset += length_separator;
 }
 size_t len = (i < JOIN_N) ? lengths[i] : strlen(strings[i]);
 strcpy(offset, strings[i]);
 offset += len;
 }
 join[offset] = '0円';
 return join;
}

Also research using memcpy() or non-standard stpcpy() instead of strcpy().

answered May 17, 2021 at 21:23
\$\endgroup\$
0

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.