I was reading a thread titled "strlen vs sizeof" on CodeGuru, and one of the replies states that "it's anyways [sic] bad practice to initialie [sic] a char
array with a string literal."
Is this true, or is that just his (albeit an "elite member") opinion?
Here is the original question:
#include <stdio.h>
#include<string.h>
main()
{
char string[] = "october";
strcpy(string, "september");
printf("the size of %s is %d and the length is %d\n\n", string, sizeof(string), strlen(string));
return 0;
}
right. the size should be the length plus 1 yes?
this is the output
the size of september is 8 and the length is 9
size should be 10 surely. its like its calculating the sizeof string before it is changed by strcpy but the length after.
Is there something wrong with my syntax or what?
Here is the reply:
It's anyways bad practice to initialie a char array with a string literal. So always do one of the following:
const char string1[] = "october";
char string2[20]; strcpy(string2, "september");
6 Answers 6
It's anyways bad practice to initialie a char array with a string literal.
The author of that comment never really justifies it, and I find the statement puzzling.
In C (and you've tagged this as C), that's pretty much the only way to initialize an array of char
with a string value (initialization is different from assignment). You can write either
char string[] = "october";
or
char string[8] = "october";
or
char string[MAX_MONTH_LENGTH] = "october";
In the first case, the size of the array is taken from the size of the initializer. String literals are stored as arrays of char
with a terminating 0 byte, so the size of the array is 8 ('o', 'c', 't', 'o', 'b', 'e', 'r', 0). In the second two cases, the size of the array is specified as part of the declaration (8 and MAX_MONTH_LENGTH
, whatever that happens to be).
What you cannot do is write something like
char string[];
string = "october";
or
char string[8];
string = "october";
etc. In the first case, the declaration of string
is incomplete because no array size has been specified and there's no initializer to take the size from. In both cases, the =
won't work because a) an array expression such as string
may not be the target of an assignment and b) the =
operator isn't defined to copy the contents of one array to another anyway.
By that same token, you can't write
char string[] = foo;
where foo
is another array of char
. This form of initialization will only work with string literals.
EDIT
I should amend this to say that you can also initialize arrays to hold a string with an array-style initializer, like
char string[] = {'o', 'c', 't', 'o', 'b', 'e', 'r', 0};
or
char string[] = {111, 99, 116, 111, 98, 101, 114, 0}; // assumes ASCII
but it's easier on the eyes to use string literals.
EDIT2
In order to assign the contents of an array outside of a declaration, you would need to use either strcpy/strncpy
(for 0-terminated strings) or memcpy
(for any other type of array):
if (sizeof string > strlen("october"))
strcpy(string, "october");
or
strncpy(string, "october", sizeof string); // only copies as many characters as will
// fit in the target buffer; 0 terminator
// may not be copied, but the buffer is
// uselessly completely zeroed if the
// string is shorter!
-
19
strncpy
is rarely the right answerKeith Thompson– Keith Thompson2013年01月17日 21:52:50 +00:00Commented Jan 17, 2013 at 21:52 -
1@KeithThompson: not disagreeing, just added it for completeness' sake.John Bode– John Bode2013年01月17日 21:58:37 +00:00Commented Jan 17, 2013 at 21:58
-
20Please note that
char[8] str = "october";
is bad practice. I had to literally char count myself to make sure it wasn't an overflow and it breaks under maintenance... e.g. correcting a spelling error fromseprate
toseparate
will break if size not updated.djechlin– djechlin2013年05月14日 21:22:52 +00:00Commented May 14, 2013 at 21:22 -
1I agree with djechlin, it is bad practice for the reasons given. JohnBode's answer doesn't comment at all on the "bad practice" aspect (which is the main part of the question!!), it just explains what you can or cannot do to initialize the array.mastov– mastov2015年06月11日 11:46:48 +00:00Commented Jun 11, 2015 at 11:46
-
1Minor: As 'length" value returned from
strlen()
does not include the null character, usingMAX_MONTH_LENGTH
to hold the maximum size needed forchar string[]
often looks wrong. IMO,MAX_MONTH_SIZE
would be better here.chux– chux2018年05月09日 19:21:19 +00:00Commented May 9, 2018 at 19:21
The only problem I recall is assigning string literal to char *
:
char var1[] = "september";
var1[0] = 'S'; // Ok - 10 element char array allocated on stack
char const *var2 = "september";
var2[0] = 'S'; // Compile time error - pointer to constant string
char *var3 = "september";
var3[0] = 'S'; // Modifying some memory - which may result in modifying... something or crash
For example take this program:
#include <stdio.h>
int main() {
char *var1 = "september";
char *var2 = "september";
var1[0] = 'S';
printf("%s\n", var2);
}
This on my platform (Linux) crashes as it tries to write to page marked as read-only. On other platforms it might print 'September' etc.
That said - initialization by literal makes the specific amount of reservation so this won't work:
char buf[] = "May";
strncpy(buf, "September", sizeof(buf)); // Result "Sep"
But this will
char buf[32] = "May";
strncpy(buf, "September", sizeof(buf));
As last remark - I wouldn't use strcpy
at all:
char buf[8];
strcpy(buf, "very long string very long string"); // Oops. We overwrite some random memory
While some compilers can change it into safe call strncpy
is much safer:
char buf[1024];
strncpy(buf, something_else, sizeof(buf)); // Copies at most sizeof(buf) chars so there is no possibility of buffer overrun. Please note that sizeof(buf) works for arrays but NOT pointers.
buf[sizeof(buf) - 1] = '0円';
-
There's still a risk for buffer overrun on that
strncpy
because it doesn't null terminate the copied string when length ofsomething_else
is greater thansizeof(buf)
. I usually set the last charbuf[sizeof(buf)-1] = 0
to protect from that, or ifbuf
is zero-initialized, usesizeof(buf) - 1
as the copy length.syockit– syockit2016年06月24日 18:30:02 +00:00Commented Jun 24, 2016 at 18:30 -
1Use
strlcpy
orstrcpy_s
or evensnprintf
if you have to.Stack Exchange Broke The Law– Stack Exchange Broke The Law2018年01月23日 01:30:29 +00:00Commented Jan 23, 2018 at 1:30 -
Fixed. Unfortunatly there is no easy portable way of doing this unless you have a luxury of working with newest compilers (
strlcpy
andsnprintf
are not directly accessible on MSVC, at least orders andstrcpy_s
are not on *nix).Maja Piechotka– Maja Piechotka2018年01月23日 01:47:25 +00:00Commented Jan 23, 2018 at 1:47 -
@MaciejPiechotka: Well, thank god Unix rejected the microsoft-sponsored annex k.Deduplicator– Deduplicator2018年01月23日 02:12:05 +00:00Commented Jan 23, 2018 at 2:12
Primarily because you won't have the size of the char[]
in a variable / construct that you can easily use within the program.
The code sample from the link:
char string[] = "october";
strcpy(string, "september");
string
is allocated on the stack as 7 or 8 characters long. I can't recall if it's null-terminated this way or not - the thread you linked to stated that it is.
Copying "september" over that string is an obvious memory overrun.
Another challenge comes about if you pass string
to another function so the other function can write into the array. You need to tell the other function how long the array is so it doesn't create an overrun. You could pass string
along with the result of strlen()
but the thread explains how this can blow up if string
is not null-terminated.
You're better off allocating a string with a fixed size (preferably defined as a constant) and then pass the array and fixed size to the other function. @John Bode's comment(s) are correct, and there are ways to mitigate these risks. They also require more effort on your part to use them.
In my experience, the value I initialized the char[]
to is usually too small for the other values I need to place in there. Using a defined constant helps avoid that issue.
sizeof string
will give you the size of the buffer (8 bytes); use the result of that expression instead of strlen
when you're concerned about memory.
Similarly, you can make a check before the call to strcpy
to see if your target buffer is large enough for the source string: if (sizeof target > strlen(src)) { strcpy (target, src); }
.
Yes, if you have to pass the array to a function, you'll need to pass its physical size as well: foo (array, sizeof array / sizeof *array);
. – John Bode
-
3
sizeof string
will give you the size of the buffer (8 bytes); use the result of that expression instead ofstrlen
when you're concerned about memory. Similarly, you can make a check before the call tostrcpy
to see if your target buffer is large enough for the source string:if (sizeof target > strlen(src)) { strcpy (target, src); }
. Yes, if you have to pass the array to a function, you'll need to pass its physical size as well:foo (array, sizeof array / sizeof *array);
.John Bode– John Bode2013年01月16日 17:03:05 +00:00Commented Jan 16, 2013 at 17:03 -
1@JohnBode - thanks, and those are good points. I have incorporated your comment into my answer.user53019– user530192013年01月16日 17:16:34 +00:00Commented Jan 16, 2013 at 17:16
-
1More precisely, most references to the array name
string
result in an implicit conversion tochar*
, pointing to the first element of the array. This loses the array bounds information. A function call is just one of the many contexts in which this happens.char *ptr = string;
is another. Evenstring[0]
is an example of this; the[]
operator works on pointers, not directly on arrays. Suggested reading: Section 6 of the comp.lang.c FAQ.Keith Thompson– Keith Thompson2013年01月17日 21:47:37 +00:00Commented Jan 17, 2013 at 21:47 -
Finally an answer that actually refers to the question!mastov– mastov2015年06月11日 11:48:54 +00:00Commented Jun 11, 2015 at 11:48
One thing that neither thread brings up is this:
char whopping_great[8192] = "foo";
vs.
char whopping_great[8192];
memcpy(whopping_great, "foo", sizeof("foo"));
The former will do something like:
memcpy(whopping_great, "foo", sizeof("foo"));
memset(&whopping_great[sizeof("foo")], 0, sizeof(whopping_great)-sizeof("foo"));
The latter only does the memcpy. The C standard insists that if any part of an array is initialized, it all is. So in this case, it's better to do it yourself. I think that may have been what treuss was getting at.
For sure
char whopping_big[8192];
whopping_big[0] = 0;
is better than either:
char whopping_big[8192] = {0};
or
char whopping_big[8192] = "";
p.s. For bonus points, you can do:
memcpy(whopping_great, "foo", (1/(sizeof("foo") <= sizeof(whopping_great)))*sizeof("foo"));
to throw a compile time divide by zero error if you're about to overflow the array.
I think the "bad practise" idea comes from the fact that this form :
char string[] = "october is a nice month";
makes implicitly a strcpy from the source machine code to the stack.
It is more efficient to handle only a link to that string. Like with :
char *string = "october is a nice month";
or directly :
strcpy(output, "october is a nice month");
(but of course in most code it probably doesn't matter)
-
Wouldn't it only make a copy if you attempt to modify it? I would think the compiler would be smarter than thatCole Tobin– Cole Tobin2015年09月02日 23:38:51 +00:00Commented Sep 2, 2015 at 23:38
-
2What about cases like
char time_buf[] = "00:00";
where you're going to be modifying a buffer? Achar *
initialized to a string literal is set to the address of the first byte, so trying to modify it results in undefined behavior because the method of the string literal's storage is unknown (implementation defined), while modifying the bytes of achar[]
is perfectly legal because the initialization copies the bytes to a writeable space allocated on the stack. To say that it's "less efficient" or "bad practice" without elaborating on the nuances ofchar* vs char[]
is misleading.Braden Best– Braden Best2016年08月08日 16:49:36 +00:00Commented Aug 8, 2016 at 16:49
Never is really long time, but you should avoid initialization char[] to string, because, "string" is const char*, and you are assigning it to char*. So if you pass this char[] to method who changes data you can have interesting behavior.
As commend said I mixed a bit char[] with char*, that is not good as they differs a bit.
There's nothing wrong about assigning data to char array, but as intention of using this array is to use it as 'string' (char *), it is easy to forget that you should not modify this array.
-
3Incorrect. The initialization copies the contents of the string literal into the array. The array object isn't
const
unless you define it that way. (And string literals in C are notconst
, though any attempt to modify a string literal does have undefined behavior.)char *s = "literal";
does have the kind of behavior you're talking about; it's better written asconst char *s = "literal";
Keith Thompson– Keith Thompson2013年01月17日 21:48:51 +00:00Commented Jan 17, 2013 at 21:48 -
3"And in generally "asdf" is a constant, so it should be declared as const." -- The same reasoning would call for a
const
onint n = 42;
, because42
is a constant.Keith Thompson– Keith Thompson2013年01月18日 08:21:15 +00:00Commented Jan 18, 2013 at 8:21 -
2It doesn't matter what machine you're on. The language standard guarantees that
c
is modifiable. It's exactly as strong a guarantee as the one that1 + 1
evaluates to2
. If the program I linked to above does anything other than printingEFGH
, it indicates a non-conforming C implementation.Keith Thompson– Keith Thompson2013年01月18日 15:40:16 +00:00Commented Jan 18, 2013 at 15:40 -
1@Dainus: the MSVC compiler has an optimisation called 'string pooling' which will put a single copy of identical strings into a read-only segment if it can guarantee that uses of them are read-only. Turn off the optimisation to see 'normal' behaviour. FYI the "Edit and Continue" requires this option to be on. More info here: msdn.microsoft.com/en-us/library/s0s0asdt.aspxJBRWilkinson– JBRWilkinson2013年05月06日 20:44:22 +00:00Commented May 6, 2013 at 20:44
-
1I think Dainius is suggesting that in many cases the error is that the variable itself should be marked
const char *const
to prevent modification of the bytes or the pointer itself, but in many cases programmers will leave one or both mutable allowing some runtime code to modify what appears to be a typed constant (but is not constant).T Percival– T Percival2015年03月20日 18:57:11 +00:00Commented Mar 20, 2015 at 18:57
Explore related questions
See similar questions with these tags.