class template
<codecvt>

std::codecvt_utf8_utf16

template < class Elem, unsigned long MaxCode = 0x10ffffUL, codecvt_mode Mode = (codecvt_mode)0 > class codecvt_utf8_utf16 : public codecvt <Elem, char, mbstate_t>
Convert between UTF-8 and UTF-16

Converts between multibyte sequences encoded in UTF-8 and UTF-16.

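The facet uses Elem as its internal character type (encoded as UTF-16), and char as its external character type (encoded as UTF-8). Therefore:

Member in converts from UTF-8 to its UTF-16 equivalent.
Member out converts from UTF-16 to its UTF-8 equivalent.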

Template parameters

Elem
The internal character type, aliased as member intern_type. This shall be a wide character type: wchar_t, char16_t or char32_t.
For 32-bit wide characters, conversions in store one UTF-16 code unit per wide character (as a 32-bit value).
The external character type in this facet is always char.
MaxCode
The largest code point that will be translated without reporting a conversion error.
Mode
Bitmask value of type codecvt_mode:
label            value  description
consume_header   4      An optional initial header sequence (BOM) is read to determine whether a multibyte sequence converted in is big-endian or little-endian.
generate_header  2      An initial header sequence (BOM) shall be generated to indicate whether a multibyte sequence converted out is big-endian or little-endian.
little_endian    1      The multibyte sequence generated on conversions out shall be little-endian (as opposed to the default big-endian).

Member types

The following aliases are member types of codecvt_utf8_utf16, inherited from codecvt:

member type  definition                          notes
intern_type  The first template parameter (Elem)  The internal character type (encoded as UTF-16).
extern_type  char                                The external character type (encoded as UTF-8).
state_type   mbstate_t                           Conversion state type (see mbstate_t).
result       codecvt_base::result                Enum type with the result of a conversion operation (see codecvt_base::result).

Public member functions inherited from codecvt

(constructor)
codecvt constructor (public member function)

Conversion functions:
in
Translate in characters (public member function)
out
Translate out characters (public member function)
unshift
Unshift translation state (public member function)

Character encoding properties:
always_noconv
Return noconv characteristics (public member function)
encoding
Return encoding width (public member function)
length
Return length of translated sequence (public member function)
max_length
Return max length of one character (public member function)

Virtual protected member functions

The class defines its functionality through its virtual protected member functions:
member function    behavior in codecvt_utf8_utf16
do_always_noconv   Returns false (not all conversions will yield a noconv result).
do_encoding        Returns 0 (the external encoding is not fixed-width).
do_in              Converts from UTF-8 to UTF-16.
do_length          Returns the length of the translated sequence (for codecvt::length).
do_max_length      Returns the maximum length (in bytes) of a code point.
do_out             Converts from UTF-16 to UTF-8.
do_unshift         Brings the mbstate_t object to an initial state.
(destructor)       Releases resources.

Example

// codecvt_utf8_utf16 example
#include <iostream>
#include <locale>
#include <string>
#include <codecvt>

int main ()
{
  std::wstring_convert<std::codecvt_utf8_utf16<char16_t>,char16_t> conversion;
  std::string mbs = conversion.to_bytes( u"\u4f60\u597d" );  // ni hao (你好)

  // print out hex value of each byte (cast through unsigned char to avoid sign extension):
  std::cout << std::hex;
  for (std::size_t i=0; i<mbs.length(); ++i)
    std::cout << int(static_cast<unsigned char>(mbs[i])) << ' ';
  std::cout << '\n';

  return 0;
}

Output:

e4 bd a0 e5 a5 bd 


See also

codecvt
Convert codeset facet (class template)
codecvt_utf8
Convert UTF-8 (class template)
codecvt_utf8_utf16
Convert between UTF-8 and UTF-16 (class template)
