unicode.cpp File Reference
#include "unicode.h"
#include "invariant.h"
#include <codecvt>
#include <cstdint>
#include <iomanip>
#include <locale>
#include <sstream>
Function Documentation
◆ codepoint_hex_to_utf16_native_endian()
char16_t codepoint_hex_to_utf16_native_endian
(
const std::string &
hex )
- Parameters
-
hex representation of a BMP codepoint as a four-digit string (e.g. "0041" for \u0041)
- Returns
- the codepoint encoded as a single UTF-16 code unit in architecture-native endianness
Definition at line 378 of file unicode.cpp.
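As an illustration of the conversion described above, the following is a minimal sketch that parses a four-digit hex string into one UTF-16 code unit. The name sketch_hex_to_utf16 is hypothetical and the body is not taken from unicode.cpp; it only assumes the input contract stated in the parameter description (exactly four hex digits).

```cpp
#include <cassert>
#include <string>

// Hypothetical helper for illustration; not the function in unicode.cpp.
// Parses exactly four hex digits into one UTF-16 code unit.
char16_t sketch_hex_to_utf16(const std::string &hex)
{
  assert(hex.size() == 4); // precondition taken from the parameter description
  unsigned value = 0;
  for(const char c : hex)
  {
    unsigned digit = 0;
    if(c >= '0' && c <= '9')
      digit = static_cast<unsigned>(c - '0');
    else if(c >= 'a' && c <= 'f')
      digit = static_cast<unsigned>(c - 'a') + 10;
    else
    {
      assert(c >= 'A' && c <= 'F'); // anything else is not a hex digit
      digit = static_cast<unsigned>(c - 'A') + 10;
    }
    value = value * 16 + digit;
  }
  // Four hex digits never exceed 0xFFFF, so the cast is lossless.
  return static_cast<char16_t>(value);
}
```

Because the value is held in a char16_t rather than as a byte sequence, it is in native endianness by construction; no byte swapping is involved in this sketch.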
◆ codepoint_hex_to_utf8()
std::string codepoint_hex_to_utf8
(
const std::string &
hex )
- Parameters
-
hex representation of a BMP codepoint as a four-digit string (e.g. "0041" for \u0041)
- Returns
- UTF-8 encoding of the codepoint
Definition at line 384 of file unicode.cpp.
◆ narrow() [1/2]
std::string narrow
(
const std::wstring &
s )
◆ narrow() [2/2]
◆ narrow_argv()
std::vector< std::string > narrow_argv
(
int
argc,
)
◆ utf16_append_code()
std::wstring &
result
)
static
◆ utf16_native_endian_to_java() [1/2]
- Parameters
-
ch UTF-16 character in architecture-native endianness encoding
- Returns
- String in US-ASCII format, with \uxxxx escapes for other characters
Definition at line 335 of file unicode.cpp.
◆ utf16_native_endian_to_java() [2/2]
std::ostringstream &
result,
)
static
Escapes non-printable characters, whitespace except for spaces, double- and single-quotes and backslashes.
This should yield a valid Java identifier.
- Parameters
-
ch UTF-16 character in architecture-native endianness encoding
result stream to receive string in US-ASCII format, with \uxxxx escapes for other characters
loc locale to check for printable characters
Definition at line 316 of file unicode.cpp.
◆ utf16_native_endian_to_java_string() [1/2]
std::string utf16_native_endian_to_java_string
(
const std::wstring &
in )
Escapes non-printable characters, whitespace except for spaces, double quotes and backslashes.
This should yield a valid Java string literal. Note that this specifically does not escape single quotes, as these are not required to be escaped for Java string literals.
- Parameters
-
in String in UTF-16 (native endianness) format
- Returns
- Valid Java string literal in US-ASCII format, with \uxxxx escapes for other characters
Definition at line 350 of file unicode.cpp.
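The escaping rules described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in unicode.cpp: the name sketch_java_string_escape is hypothetical, the input is a std::u16string rather than a std::wstring, and "printable" is approximated by the ASCII range 0x20 to 0x7E instead of a locale check.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper for illustration; not the function in unicode.cpp.
// Produces the body of a Java string literal in US-ASCII.
std::string sketch_java_string_escape(const std::u16string &in)
{
  std::ostringstream out;
  for(const char16_t ch : in)
  {
    switch(ch)
    {
    case u'"':  out << "\\\""; break; // double quote must be escaped
    case u'\\': out << "\\\\"; break; // backslash must be escaped
    case u'\n': out << "\\n"; break;  // whitespace other than space
    case u'\r': out << "\\r"; break;
    case u'\t': out << "\\t"; break;
    default:
      if(ch >= 0x20 && ch < 0x7F)
        out << static_cast<char>(ch); // printable ASCII, single quote included
      else
        out << "\\u" << std::hex << std::setw(4) << std::setfill('0')
            << static_cast<unsigned>(ch); // everything else as \uxxxx
    }
  }
  return out.str();
}
```

Line terminators get dedicated escapes in this sketch because Java processes \uxxxx escapes before lexing, so a \u000a inside a string literal would end the line in the middle of the literal.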
◆ utf16_native_endian_to_java_string() [2/2]
std::ostringstream &
result,
)
static
Escapes non-printable characters, whitespace except for spaces, double quotes and backslashes.
This should yield a valid Java string literal. Note that this specifically does not escape single quotes, as these are not required to be escaped for Java string literals.
- Parameters
-
ch UTF-16 character in architecture-native endianness encoding
result stream to receive string in US-ASCII format, with \uxxxx escapes for other characters
loc locale to check for printable characters
Definition at line 272 of file unicode.cpp.
◆ utf16_native_endian_to_utf8() [1/2]
std::string utf16_native_endian_to_utf8
(
char16_t
utf16_char )
- Parameters
-
utf16_char UTF-16 character in architecture-native endianness encoding
- Returns
- UTF-8 encoding of the same codepoint
Definition at line 359 of file unicode.cpp.
◆ utf16_native_endian_to_utf8() [2/2]
std::string utf16_native_endian_to_utf8
(
const std::u16string &
utf16_str )
- Parameters
-
utf16_str UTF-16 string in architecture-native endianness encoding
- Returns
- UTF-8 encoding of the string
Definition at line 364 of file unicode.cpp.
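The step that is specific to UTF-16 input is combining surrogate pairs into full codepoints before they are re-encoded. The following sketch shows only that step; the name sketch_utf16_to_code_points is hypothetical, and whether the function in unicode.cpp handles surrogates this way is not stated in this reference.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper for illustration; not the function in unicode.cpp.
// Decodes UTF-16 code units into Unicode codepoints.
std::u32string sketch_utf16_to_code_points(const std::u16string &in)
{
  std::u32string out;
  for(std::size_t i = 0; i < in.size(); ++i)
  {
    char32_t cp = in[i];
    if(cp >= 0xD800 && cp <= 0xDBFF)
    {
      // A high surrogate must be followed by a low surrogate.
      assert(i + 1 < in.size());
      const char32_t low = in[i + 1];
      assert(low >= 0xDC00 && low <= 0xDFFF);
      cp = 0x10000 + ((cp - 0xD800) << 10) + (low - 0xDC00);
      ++i; // the low surrogate has been consumed
    }
    out += cp;
  }
  return out;
}
```

Each resulting codepoint can then be written out as one to four UTF-8 bytes.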
◆ utf32_native_endian_to_utf8()
std::string utf32_native_endian_to_utf8
(
const std::basic_string<
char32_t > &
s )
- Parameters
-
s UTF-32 encoded wide string
- Returns
- UTF-8 encoded string with the same Unicode characters as the input.
Definition at line 136 of file unicode.cpp.
◆ utf8_append_code()
std::string &
result
)
static
Appends a Unicode character to a UTF-8 encoded string.
- parameters: character to append, string to append to
Definition at line 110 of file unicode.cpp.
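For readers unfamiliar with the encoding, appending a codepoint to a UTF-8 string amounts to choosing a one- to four-byte form by magnitude. The sketch below illustrates that scheme; sketch_utf8_append and its parameter order are hypothetical and do not reproduce the static helper documented here.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper for illustration; not the function in unicode.cpp.
// Appends the UTF-8 encoding of codepoint cp to result.
void sketch_utf8_append(char32_t cp, std::string &result)
{
  assert(cp <= 0x10FFFF); // largest valid Unicode codepoint
  if(cp < 0x80)
  {
    result += static_cast<char>(cp); // 0xxxxxxx
  }
  else if(cp < 0x800)
  {
    result += static_cast<char>(0xC0 | (cp >> 6)); // 110xxxxx
    result += static_cast<char>(0x80 | (cp & 0x3F)); // 10xxxxxx
  }
  else if(cp < 0x10000)
  {
    result += static_cast<char>(0xE0 | (cp >> 12)); // 1110xxxx
    result += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
    result += static_cast<char>(0x80 | (cp & 0x3F));
  }
  else
  {
    result += static_cast<char>(0xF0 | (cp >> 18)); // 11110xxx
    result += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
    result += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
    result += static_cast<char>(0x80 | (cp & 0x3F));
  }
}
```

The sketch does not reject surrogate values (0xD800 to 0xDFFF), which a strict encoder would treat as invalid input.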
◆ utf8_to_utf16_native_endian()
std::wstring utf8_to_utf16_native_endian
(
const std::string &
in )
Convert a UTF-8 encoded string to UTF-16 with architecture-native endianness.
- parameters: String in UTF-8 format
- Returns
- String in UTF-16 format, using the native endianness of the architecture.
Definition at line 191 of file unicode.cpp.
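On the output side of such a conversion, each decoded codepoint becomes either one UTF-16 code unit or a surrogate pair. A minimal sketch of that step follows; sketch_utf16_append is a hypothetical name, and it writes to a std::u16string rather than the std::wstring used by the function documented here.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper for illustration; not the function in unicode.cpp.
// Appends the UTF-16 encoding of codepoint cp to result.
void sketch_utf16_append(char32_t cp, std::u16string &result)
{
  assert(cp <= 0x10FFFF); // largest valid Unicode codepoint
  if(cp < 0x10000)
  {
    // BMP codepoints fit in a single code unit.
    result += static_cast<char16_t>(cp);
  }
  else
  {
    // Supplementary codepoints are split into a surrogate pair.
    const char32_t v = cp - 0x10000;
    result += static_cast<char16_t>(0xD800 + (v >> 10));   // high surrogate
    result += static_cast<char16_t>(0xDC00 + (v & 0x3FF)); // low surrogate
  }
}
```

Since the code units are stored as char16_t values, they are in native endianness without any explicit byte swapping.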
◆ utf8_to_utf32()
std::u32string utf8_to_utf32
(
const std::string &
utf8_str )
Convert a UTF-8 encoded string to UTF-32 with architecture-native endianness.
- parameters: String in UTF-8 format
- Returns
- String in UTF-32 format.
Definition at line 205 of file unicode.cpp.
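Decoding UTF-8 into UTF-32 can be sketched as reading a lead byte, which determines the sequence length, and then folding in six bits from each continuation byte. The function sketch_utf8_decode below is a hypothetical illustration, not the implementation in unicode.cpp, and it assumes well-formed input.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper for illustration; not the function in unicode.cpp.
// Decodes well-formed UTF-8 into a sequence of codepoints.
std::u32string sketch_utf8_decode(const std::string &in)
{
  std::u32string out;
  std::size_t i = 0;
  while(i < in.size())
  {
    const unsigned char lead = static_cast<unsigned char>(in[i]);
    char32_t cp = 0;
    std::size_t extra = 0; // number of continuation bytes
    if(lead < 0x80)
      cp = lead;
    else if((lead & 0xE0) == 0xC0)
    {
      cp = lead & 0x1F;
      extra = 1;
    }
    else if((lead & 0xF0) == 0xE0)
    {
      cp = lead & 0x0F;
      extra = 2;
    }
    else
    {
      assert((lead & 0xF8) == 0xF0); // only four-byte leads remain
      cp = lead & 0x07;
      extra = 3;
    }
    assert(i + extra < in.size()); // sequence must not be truncated
    for(std::size_t k = 1; k <= extra; ++k)
    {
      const unsigned char cont = static_cast<unsigned char>(in[i + k]);
      assert((cont & 0xC0) == 0x80); // continuation bytes are 10xxxxxx
      cp = (cp << 6) | (cont & 0x3F);
    }
    out += cp;
    i += extra + 1;
  }
  return out;
}
```

The sketch does not detect overlong encodings or encoded surrogates; a validating decoder would need extra checks for both.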
◆ widen() [1/2]
◆ widen() [2/2]
std::wstring widen
(
const std::string &
s )