#include <stdio.h>
#include <string.h>
#include <stdlib.h>
typedef struct String {
char* string;
long long unsigned capacity; // uint32_t had trouble printing
long long unsigned len;
} String;
String string_new(char* stuff) {
int len = strlen(stuff);
String ret = {
stuff,
len,
len,
};
return ret;
}
void string_push(String* s, char push) {
if(s->len <= s->capacity) {
s->capacity *= 2;
char* new_stuff = calloc(sizeof(char), s->capacity);
memcpy(new_stuff, s->string, s->capacity);
s->string = new_stuff;
}
s->len += 1;
s->string[s->len-1] = push;
}
int main(void) {
String hello = string_new("");
char* text = "Hello, World!";
// because c treats integers as booleans, i could easily obfuscate this to just `text[i]` but i
// chose readability over code size.
for(int i = 0; text[i] != '0円'; i++) {
string_push(&hello, text[i]);
puts(hello.string);
}
return 0;
}
I'm a complete beginner to the C programming language, so any feedback would be appreciated.
From what I've heard, implementing vectors involves:
- create: allocate some memory
- push: allocate more memory, place previous string in new memory
- pop: decrement length
3 Answers 3
In principle this is the right approach, however there are some issues:
- In
string_new
you need to allocate/duplicate memory. This is because yourString
assumes ownership of thestring
data buffer, so you cannot store a pointer wich is passed by argument into the fieldstring
. I would suggest astrdup(stuff)
here. Because of the null termination character capacity would also be `len+1``. - You need to check for memory allocation failure (return value of NULL) from the
realloc
/calloc
call in string_push and thestrdup
in instring_new
. You need to decide what to do on failure, e.g. return a flag indicating the status or even terminating the program. string_push
is missing a free of the old buffer.string_push
is not wrong (apart from not checking return value and not freeing buffer) but I would prefer usingrealloc
as it is more efficient and shorter:new_string = realloc(s->string, s->capacity)
and after error checking assigns->string = new_string
. This will enlarge the buffer if possible and avoid the copy or if a new buffer had to be allocated, includes the memcpy and frees old buffer.- Probably you want to include more functions such as
string_cat
which concatenates a whole string as repeatedly callingstring_push
is inefficient. Also a deallocation would be needed likestring_free
. Of course this is not strictly wrong, but needs to be included for proper design, I would say. - You should also be aware that your type of string is different from the nullterminated strings in C. You have an explicit length field, which theoretically allows having a 0円 character in the string. Also your string is not necessarily including a null termination character. So you cannot pass your
string
to standard library string functions, including e.g.printf
, andputs
. While string_new ensures that a null termination char is included, this is not the case forstring_push
. I would suggest that string push always writes a null termination character as well, so that you can actually pass thestring
to standard library string functions. (Please be careful to get the len & capacity logic correct in this case).
-
1\$\begingroup\$
s->string = realloc(s->string, s->capacity)
is certainly wrong, as that removes the only way to access the allocated memory whenrealloc()
fails and returns a null pointer. \$\endgroup\$Toby Speight– Toby Speight2022年12月21日 10:33:48 +00:00Commented Dec 21, 2022 at 10:33 -
\$\begingroup\$ @TobySpeight Yes true, thanks. I was missing the obvious: checking for NULL return. \$\endgroup\$Andreas H.– Andreas H.2022年12月21日 21:41:34 +00:00Commented Dec 21, 2022 at 21:41
This code is wrong:
s->capacity *= 2; char* new_stuff = calloc(sizeof(char), s->capacity); memcpy(new_stuff, s->string, s->capacity);
If calloc()
fails, then new_stuff
will be a null pointer, and writing through it with memcpy()
is Undefined Behaviour. That means your code could do absolutely anything: from working as intended right through to overwriting the user's valuable data. (And why do we enter this code only when the capacity is already big enough, rather than when it's too small?)
Even when successful, it's wrong, because we attempt to copy s->capacity
from s->string
after doubling the capacity, so we are overrunning the source of the copy. Actually, when running the provided main()
, the initial capacity is zero, so we never make the string big enough for its contents.
Further, we never free()
any of the memory we allocate.
When I compile with a reasonable set of warning options, my compiler tells me many things that can all be fixed:
gcc-12 -std=c17 -fPIC -gdwarf-4 -g -Wall -Wextra -Wwrite-strings -Wno-parentheses -Wpedantic -Warray-bounds -Wconversion -Wstrict-prototypes -fanalyzer 282067.c -o 282067
282067.c: In function ‘string_new’:
282067.c:12:15: warning: conversion from ‘size_t’ {aka ‘long unsigned int’} to ‘int’ may change value [-Wconversion]
12 | int len = strlen(stuff);
| ^~~~~~
282067.c:15:9: warning: conversion to ‘long long unsigned int’ from ‘int’ may change the sign of the result [-Wsign-conversion]
15 | len,
| ^~~
282067.c:16:9: warning: conversion to ‘long long unsigned int’ from ‘int’ may change the sign of the result [-Wsign-conversion]
16 | len,
| ^~~
282067.c: In function ‘main’:
282067.c:33:31: warning: passing argument 1 of ‘string_new’ discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]
33 | String hello = string_new("");
| ^~
282067.c:11:25: note: expected ‘char *’ but argument is of type ‘const char *’
11 | String string_new(char* stuff) {
| ~~~~~~^~~~~
282067.c:34:18: warning: initialization discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]
34 | char* text = "Hello, World!";
| ^~~~~~~~~~~~~~~
282067.c: In function ‘string_push’:
282067.c:25:9: warning: use of possibly-NULL ‘new_stuff’ where non-null expected [CWE-690] [-Wanalyzer-possible-null-argument]
25 | memcpy(new_stuff, s->string, s->capacity);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
‘string_push’: events 1-4
|
| 22 | if(s->len <= s->capacity) {
| | ^
| | |
| | (1) following ‘true’ branch...
| 23 | s->capacity *= 2;
| | ~~~~~~~~~~~
| | |
| | (2) ...to here
| 24 | char* new_stuff = calloc(sizeof(char), s->capacity);
| | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| | |
| | (3) this call could return NULL
| 25 | memcpy(new_stuff, s->string, s->capacity);
| | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| | |
| | (4) argument 1 (‘new_stuff’) from (3) could be NULL where non-null expected
|
In file included from 282067.c:2:
/usr/include/string.h:43:14: note: argument 1 of ‘memcpy’ must be non-null
43 | extern void *memcpy (void *__restrict __dest, const void *__restrict __src,
| ^~~~~~
282067.c:26:19: warning: leak of ‘hello.string’ [CWE-401] [-Wanalyzer-malloc-leak]
26 | s->string = new_stuff;
| ~~~~~~~~~~^~~~~~~~~~~
‘main’: events 1-4
|
| 32 | int main(void) {
| | ^~~~
| | |
| | (1) entry to ‘main’
|......
| 37 | for(int i = 0; text[i] != '0円'; i++) {
| | ~~~~~~~~~~~~~~~
| | |
| | (2) following ‘true’ branch...
| 38 | string_push(&hello, text[i]);
| | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| | | |
| | | (3) ...to here
| | (4) calling ‘string_push’ from ‘main’
|
+--> ‘string_push’: events 5-7
|
| 21 | void string_push(String* s, char push) {
| | ^~~~~~~~~~~
| | |
| | (5) entry to ‘string_push’
| 22 | if(s->len <= s->capacity) {
| | ~
| | |
| | (6) following ‘true’ branch...
| 23 | s->capacity *= 2;
| | ~~~~~~~~~~~
| | |
| | (7) ...to here
|
<------+
|
‘main’: events 8-11
|
| 37 | for(int i = 0; text[i] != '0円'; i++) {
| | ~~~~~~~~~~~~~~~
| | |
| | (9) following ‘true’ branch...
| 38 | string_push(&hello, text[i]);
| | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
| | | |
| | | (10) ...to here
| | (8) returning to ‘main’ from ‘string_push’
| | (11) calling ‘string_push’ from ‘main’
|
+--> ‘string_push’: events 12-14
|
| 21 | void string_push(String* s, char push) {
| | ^~~~~~~~~~~
| | |
| | (12) entry to ‘string_push’
| 22 | if(s->len <= s->capacity) {
| | ~
| | |
| | (13) following ‘false’ branch...
|......
| 28 | s->len += 1;
| | ~~~~~~
| | |
| | (14) ...to here
|
<------+
|
‘main’: events 15-18
|
| 37 | for(int i = 0; text[i] != '0円'; i++) {
| | ~~~~~~~~~~~~~~~
| | |
| | (16) following ‘true’ branch...
| 38 | string_push(&hello, text[i]);
| | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
| | | |
| | | (17) ...to here
| | (15) returning to ‘main’ from ‘string_push’
| | (18) calling ‘string_push’ from ‘main’
|
+--> ‘string_push’: events 19-23
|
| 21 | void string_push(String* s, char push) {
| | ^~~~~~~~~~~
| | |
| | (19) entry to ‘string_push’
| 22 | if(s->len <= s->capacity) {
| | ~
| | |
| | (20) following ‘true’ branch...
| 23 | s->capacity *= 2;
| | ~~~~~~~~~~~
| | |
| | (21) ...to here
| 24 | char* new_stuff = calloc(sizeof(char), s->capacity);
| | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| | |
| | (22) allocated here
| 25 | memcpy(new_stuff, s->string, s->capacity);
| | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| | |
| | (23) assuming ‘new_stuff’ is non-NULL
|
<------+
|
‘main’: events 24-27
|
| 37 | for(int i = 0; text[i] != '0円'; i++) {
| | ~~~~~~~~~~~~~~~
| | |
| | (25) following ‘true’ branch...
| 38 | string_push(&hello, text[i]);
| | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
| | | |
| | | (26) ...to here
| | (24) returning to ‘main’ from ‘string_push’
| | (27) calling ‘string_push’ from ‘main’
|
+--> ‘string_push’: events 28-31
|
| 21 | void string_push(String* s, char push) {
| | ^~~~~~~~~~~
| | |
| | (28) entry to ‘string_push’
| 22 | if(s->len <= s->capacity) {
| | ~
| | |
| | (29) following ‘true’ branch...
| 23 | s->capacity *= 2;
| | ~~~~~~~~~~~
| | |
| | (30) ...to here
|......
| 26 | s->string = new_stuff;
| | ~~~~~~~~~~~~~~~~~~~~~
| | |
| | (31) ‘hello.string’ leaks here; was allocated at (22)
|
282067.c:30:1: warning: leak of ‘<unknown>’ [CWE-401] [-Wanalyzer-malloc-leak]
30 | }
| ^
‘main’: events 1-4
|
| 32 | int main(void) {
| | ^~~~
| | |
| | (1) entry to ‘main’
|......
| 37 | for(int i = 0; text[i] != '0円'; i++) {
| | ~~~~~~~~~~~~~~~
| | |
| | (2) following ‘true’ branch...
| 38 | string_push(&hello, text[i]);
| | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| | | |
| | | (3) ...to here
| | (4) calling ‘string_push’ from ‘main’
|
+--> ‘string_push’: events 5-9
|
| 21 | void string_push(String* s, char push) {
| | ^~~~~~~~~~~
| | |
| | (5) entry to ‘string_push’
| 22 | if(s->len <= s->capacity) {
| | ~
| | |
| | (6) following ‘true’ branch...
| 23 | s->capacity *= 2;
| | ~~~~~~~~~~~
| | |
| | (7) ...to here
| 24 | char* new_stuff = calloc(sizeof(char), s->capacity);
| | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| | |
| | (8) allocated here
| 25 | memcpy(new_stuff, s->string, s->capacity);
| | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| | |
| | (9) assuming ‘new_stuff’ is non-NULL
|
<------+
|
‘main’: events 10-13
|
| 37 | for(int i = 0; text[i] != '0円'; i++) {
| | ~~~~~~~~~~~~~~~
| | |
| | (11) following ‘true’ branch...
| 38 | string_push(&hello, text[i]);
| | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
| | | |
| | | (12) ...to here
| | (10) returning to ‘main’ from ‘string_push’
| | (13) calling ‘string_push’ from ‘main’
|
+--> ‘string_push’: events 14-16
|
| 21 | void string_push(String* s, char push) {
| | ^~~~~~~~~~~
| | |
| | (14) entry to ‘string_push’
| 22 | if(s->len <= s->capacity) {
| | ~
| | |
| | (15) following ‘false’ branch...
|......
| 28 | s->len += 1;
| | ~~~~~~
| | |
| | (16) ...to here
|
<------+
|
‘main’: events 17-20
|
| 37 | for(int i = 0; text[i] != '0円'; i++) {
| | ~~~~~~~~~~~~~~~
| | |
| | (18) following ‘true’ branch...
| 38 | string_push(&hello, text[i]);
| | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
| | | |
| | | (19) ...to here
| | (17) returning to ‘main’ from ‘string_push’
| | (20) calling ‘string_push’ from ‘main’
|
+--> ‘string_push’: events 21-23
|
| 21 | void string_push(String* s, char push) {
| | ^~~~~~~~~~~
| | |
| | (21) entry to ‘string_push’
| 22 | if(s->len <= s->capacity) {
| | ~
| | |
| | (22) following ‘false’ branch...
|......
| 28 | s->len += 1;
| | ~~~~~~
| | |
| | (23) ...to here
|
<------+
|
‘main’: events 24-27
|
| 37 | for(int i = 0; text[i] != '0円'; i++) {
| | ~~~~~~~~~~~~~~~
| | |
| | (25) following ‘true’ branch...
| 38 | string_push(&hello, text[i]);
| | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
| | | |
| | | (26) ...to here
| | (24) returning to ‘main’ from ‘string_push’
| | (27) calling ‘string_push’ from ‘main’
|
+--> ‘string_push’: events 28-31
|
| 21 | void string_push(String* s, char push) {
| | ^~~~~~~~~~~
| | |
| | (28) entry to ‘string_push’
| 22 | if(s->len <= s->capacity) {
| | ~
| | |
| | (29) following ‘true’ branch...
| 23 | s->capacity *= 2;
| | ~~~~~~~~~~~
| | |
| | (30) ...to here
|......
| 30 | }
| | ~
| | |
| | (31) ‘<unknown>’ leaks here; was allocated at (8)
|
Here's how I would go about rewriting this program:
Use size_t
for lengths:
typedef struct String {
char* string;
size_t capacity;
size_t len;
} String;
When creating a new String
object, we should copy the source string (including its 0円
terminator). This allows us to pass in a const char*
(e.g. from a string literal) and, crucially, means we can realloc()
its storage right from the beginning (we say that the String
owns the allocated memory):
String string_new(const char *stuff)
{
size_t len = strlen(stuff);
size_t capacity = len + 1; /* include 0円 terminator */
String s = {malloc(capacity), 0, 0};
if (s.string) {
memcpy(s.string, stuff, len+1);
s.len = len;
s.capacity = capacity;
}
return s;
}
When we push characters onto the end of the string, use realloc()
to update our allocation. If it needs to give us a new pointer, it will take care of copying the contents, and if not, then it avoids unnecessary copying. We need to be careful to do the right thing if realloc()
fails - in this case, leaving the String
object unmodified and returning an exit status that the caller must check. And we need to update the 0円
that terminates the string, too.
#include <stdbool.h>
bool string_push(String *s, char push)
{
if (s->len + 1 >= s->capacity) {
size_t new_capacity = 2 * s->capacity + 16;
char* new_stuff = realloc(s->string, new_capacity);
if (!new_stuff) {
/* failed! */
return false;
}
s->string = new_stuff;
s->capacity = new_capacity;
}
s->len += 1;
s->string[s->len-1] = push;
s->string[s->len] = '0円';
return true;
}
We should also provide a function to release the memory owned by the string:
void string_free(String *s)
{
free(s->string);
s->string = NULL;
s->len = s->capacity = 0;
}
Our main()
should take appropriate action when a memory allocation fails (and also use the correct type for string literals):
int main(void)
{
String hello = string_new("");
if (!hello.string) {
fputs("Out of memory.\n", stderr);
return EXIT_FAILURE;
}
const char *text = "Hello, World!";
for (const char *p = text; *p != '0円'; ++p) {
if (!string_push(&hello, *p)) {
fputs("Out of memory.\n", stderr);
return EXIT_FAILURE;
}
puts(hello.string);
}
string_free(&hello);
return EXIT_SUCCESS;
}
Full code
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
typedef struct String {
char* string;
size_t capacity;
size_t len;
} String;
String string_new(const char *stuff)
{
size_t len = strlen(stuff);
size_t capacity = len + 1; /* include 0円 terminator */
String s = {malloc(capacity), 0, 0};
if (s.string) {
memcpy(s.string, stuff, len+1);
s.len = len;
s.capacity = capacity;
}
return s;
}
bool string_push(String *s, char push)
{
if (s->len + 1 >= s->capacity) {
size_t new_capacity = 2 * s->capacity + 16;
char* new_stuff = realloc(s->string, new_capacity);
if (!new_stuff) {
/* failed! */
return false;
}
s->string = new_stuff;
s->capacity = new_capacity;
}
s->len += 1;
s->string[s->len-1] = push;
s->string[s->len] = '0円';
return true;
}
void string_free(String *s)
{
free(s->string);
s->string = NULL;
s->len = s->capacity = 0;
}
int main(void)
{
String hello = string_new("");
if (!hello.string) {
fputs("Out of memory.\n", stderr);
return EXIT_FAILURE;
}
const char *text = "Hello, World!";
for (const char *p = text; *p != '0円'; ++p) {
if (!string_push(&hello, *p)) {
fputs("Out of memory.\n", stderr);
return EXIT_FAILURE;
}
puts(hello.string);
}
string_free(&hello);
return EXIT_SUCCESS;
}
This improved code compiles without warnings, and also runs cleanly under Valgrind.
-
\$\begingroup\$ "also provide a function to release the memory" --> Perhaps also add
bool string_push_string(String *s, const char *push)
\$\endgroup\$chux– chux2022年12月21日 16:39:29 +00:00Commented Dec 21, 2022 at 16:39 -
\$\begingroup\$ Yes, I thought of suggesting that (and implementing both it and the push-char function in terms of
string_append_chars(String *s, const char *addend, size_t length)
). But this review already was long enough and didn't want to overdo it. \$\endgroup\$Toby Speight– Toby Speight2022年12月21日 20:02:47 +00:00Commented Dec 21, 2022 at 20:02
Common name
String is a common name. To reduce collisions with existing code, select a less common name. Example: jds
?
.capacity
ambiguity
.capacity
is not clear. Is that the size of the allocation - or the maximum length of a storable string? Perhaps length_max
. Or, IMO, use .size
which is the size of the allocation.
Missing header file
As the set of String
routines mature, it is C idiomatic to collect them in a String.c
and provide a String.h
. (or jds.h
considering the above.)
Example employing above ideas.
// jds.h
#ifndef _JDS_
#define _JDS_ 1
#include <stddef.h>
typedef struct jds {
char* string;
size_t size;
size_t len;
} jds;
jds jds_new(const char* stuff);
void jds_push(jds* s, char push);
#endif