The spec says the following about creating an object in existing storage, [intro.object#3]:
If a complete object is created ([expr.new]) in storage associated with another object e of type "array of N unsigned char" or of type "array of N std::byte" ([cstddef.syn]), that array provides storage for the created object [...]
[basic.types.general#4] also refers specifically to:
sequence of N unsigned char objects
as the potentially valid object representation of a complete object of type T.
From the above paragraphs, one might conclude that creating an object in storage associated with another object e that is NOT of type "array of N unsigned char" or "array of N std::byte" is undefined behavior, since the spec does not define it as valid.
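To make the quoted wording concrete, here is a minimal sketch of the case that [intro.object#3] does cover: an array of unsigned char that provides storage for an object created with placement new. The struct `Point` and the function `make_point_in_buffer` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <new>

// Hypothetical example type; any complete object type would do.
struct Point { int x; int y; };

// Creates a Point inside an unsigned char array and returns x + y.
int make_point_in_buffer(int x, int y) {
    // The array must be large enough and suitably aligned for Point.
    // Being an "array of N unsigned char", it provides storage for
    // the object created in it.
    alignas(Point) unsigned char buffer[sizeof(Point)];

    Point* p = new (buffer) Point{x, y};  // placement new into the buffer
    int sum = p->x + p->y;

    p->~Point();  // trivial destructor here, shown for symmetry
    return sum;
}
```

Declaring the array as `std::byte buffer[sizeof(Point)]` (with `<cstddef>`) would fall under the same paragraph.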
Another good reason to prefer unsigned char* over char* for binary buffers is to avoid confusion between a buffer of chars meant for managing text and a buffer of chars meant for managing bytes, for binary usage.
Given the above, why did the spec use char* for the read and write stream functions instead of unsigned char*?
[input.output#iosfwd.syn] - setting the charT template argument to be char:
using ostream = basic_ostream<char>;
using iostream = basic_iostream<char>;
using char_type = charT;
basic_istream& read(char_type* s, streamsize n);
using char_type = charT;
basic_ostream& write(const char_type* s, streamsize n);
If char* is not valid or not recommended for a buffer, why not use unsigned char* here?
1 Answer
In his book "The Design and Evolution of C++", Stroustrup explains one of the key design rules:
Always provide a transition path: C++ must grow gradually to serve its users and benefit from their feedback. This implies that great care must be taken to ensure that older code continues to work. (...) The general strategy for eliminating an unsafe (..) language feature is to first provide a better alternative, then recommend that people avoid the old feature or technique, and only years later - if at all - remove the offending feature.
For decades, people have worked with char-based streams for file I/O (in the first edition of "The C++ Programming Language", when templates did not even exist, streams were already defined with char*). There are now billions of lines of code out there that rely on this assumption, and nobody really wants to review it all if it is not absolutely necessary. I think this weighs heavily on any decision to change the standard library for the sake of definitions.
The use of unsigned char in the definition of the object representation is already relatively old. It was introduced with C++11. In C++98 it was still referred to as "char or unsigned char". So indeed, it would be tempting to align the standard library. But there are plenty of use cases where people really expect streams to deal with characters. It would then be difficult to decide when to use one rather than the other. (I also personally believe that writing binary chars or unsigned chars doesn't change our reality much, as long as you don't do math on them or print them as integers; but perhaps I'm too naive.)
Another important design principle that Stroustrup gives:
C++'s evolution must be driven by real problems: (...) The right motivation for a change to C++ is for several programmers to demonstrate how the language is insufficiently expressive for their projects. (...)
Here, I would argue that the language's expressivity is sufficient. Rather than changing the definition of istream and ostream, if you think it's essential, nothing prevents you from fully using the available template arguments, for example:
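// define your own serialization stream types
// (caveat: the standard only specifies std::char_traits for char, wchar_t,
// char8_t, char16_t and char32_t, so you may need to supply your own traits)
using slz_ostream = std::basic_ostream<unsigned char>;
using slz_iostream = std::basic_iostream<unsigned char>;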
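For comparison, here is a sketch of the more common workaround: keep the standard char-based streams and convert the pointer at the call site. The function name `round_trip` is hypothetical, and an in-memory `std::stringstream` stands in for a file stream. Accessing unsigned char storage through a char* is permitted, so the bytes come back unchanged.

```cpp
#include <cassert>
#include <cstddef>
#include <ios>
#include <sstream>
#include <vector>

// Writes the bytes to an in-memory binary stream and reads them back.
std::vector<unsigned char> round_trip(const std::vector<unsigned char>& bytes) {
    std::stringstream stream(std::ios::in | std::ios::out | std::ios::binary);

    // write() expects const char*, so the unsigned char buffer is viewed as char.
    stream.write(reinterpret_cast<const char*>(bytes.data()),
                 static_cast<std::streamsize>(bytes.size()));

    std::vector<unsigned char> result(bytes.size());
    // read() expects char*, same conversion in the other direction.
    stream.read(reinterpret_cast<char*>(result.data()),
                static_cast<std::streamsize>(result.size()));

    // Keep only the bytes that were actually read.
    result.resize(static_cast<std::size_t>(stream.gcount()));
    return result;
}
```

Values above 0x7F survive the trip because no arithmetic or formatting is applied to them; only the raw bytes are copied.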
Comments
open, close, read and write (or, in some non-UNIX systems, C's fopen, fclose, fread and fwrite). All of those dealt with chars, so iostreams did too (even though, as things developed in both C and C++, unsigned char would probably have been a better fit for both iostreams and fread, fwrite, etc.). But by the time that was obvious, they were in too wide use to change.