I'm using Visual Studio and C++ on Windows to work with small caps text like ʜᴇʟʟᴏ ꜱᴛᴀᴄᴋᴏᴠᴇʀꜰʟᴏᴡ (generated using e.g. this website). Whenever I read this text from a file or put it directly into my source code as a std::string, the text visualizer in Visual Studio shows it in the wrong encoding; presumably the visualizer assumes the Windows (ANSI) code page. How can I force Visual Studio to let me work with UTF-8 strings properly?
std::string message_or_file_path = "...";
auto message = message_or_file_path;
// If the path refers to an existing file, read the message from that file.
// GetFileAttributesA is the narrow-character variant that matches std::string;
// a stale GetLastError() value is not consulted when the call succeeds.
if (GetFileAttributesA(message_or_file_path.c_str()) != INVALID_FILE_ATTRIBUTES)
{
    std::ifstream file_stream(message_or_file_path);
    std::string text_file_contents((std::istreambuf_iterator<char>(file_stream)),
                                   std::istreambuf_iterator<char>());
    message = text_file_contents; // Displayed in wrong encoding
    message = "ʜᴇʟʟᴏ ꜱᴛᴀᴄᴋᴏᴠᴇʀꜰʟᴏᴡ"; // Displayed in wrong encoding
    std::wstring wide_message = L"ʜᴇʟʟᴏ ꜱᴛᴀᴄᴋᴏᴠᴇʀꜰʟᴏᴡ"; // Displayed in correct encoding
}
I tried adding the /utf-8 compiler option and setting the locale:
std::locale::global(std::locale(""));
std::cout.imbue(std::locale());
Neither of those fixed the encoding issue.
2 Answers
From What’s Wrong with My UTF-8 Strings in Visual Studio?, there are a couple of ways to see the contents of a std::string with UTF-8 encoding.
Let's say you have a variable with the following initialization:
std::string s2 = "\x7a\xc3\x9f\xe6\xb0\xb4\xf0\x9f\x8d\x8c";
Use a Watch window.
- Add the variable to Watch.
- In the Watch window, add ,s8 to the variable name to display its contents as UTF-8.
Here's what I see in Visual Studio 2015: [screenshot of the Watch window]
Use the Command Window.
- In the Command Window, use ? &s2[0],s8 to display the text as UTF-8.
Here's what I see in Visual Studio 2015: [screenshot of the Command Window]
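Independently of how the debugger renders the string, the bytes held by s2 can be checked in code. The following is a minimal sketch and is not taken from the linked article: decode_utf8 is a hypothetical helper that turns UTF-8 bytes into code points. It rejects invalid lead bytes, bad continuation bytes, and truncated sequences, but it does not check for overlong encodings or surrogates.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not part of any library): decodes UTF-8 bytes into
// Unicode code points. Overlong forms and surrogates are not rejected.
std::vector<char32_t> decode_utf8(const std::string& bytes)
{
    std::vector<char32_t> code_points;
    std::size_t i = 0;
    while (i < bytes.size()) {
        const unsigned char lead = static_cast<unsigned char>(bytes[i]);
        char32_t cp = 0;
        std::size_t extra = 0; // number of continuation bytes expected

        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else throw std::runtime_error("invalid UTF-8 lead byte");

        if (i + extra >= bytes.size())
            throw std::runtime_error("truncated UTF-8 sequence");

        // Each continuation byte has the form 10xxxxxx and adds 6 bits.
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(bytes[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::runtime_error("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }

        code_points.push_back(cp);
        i += extra + 1;
    }
    return code_points;
}
```

For the 10 bytes of s2 above, this yields the four code points U+007A, U+00DF, U+6C34 and U+1F34C, so the std::string contains valid UTF-8 even when a visualizer shows it as ANSI text.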
3 Comments
… std::string object to the clipboard and when I paste it, it's screwed up. With a std::wstring version it works fine.
A working solution was simply rewriting all std::strings as std::wstrings and adjusting the code logic properly to work with std::wstrings, as indicated in the question as well. Now everything works as expected.
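Where UTF-8 data still arrives as a std::string (for example from a file), it has to be converted before it can be used as a std::wstring. Below is a minimal sketch of one way to do that with std::wstring_convert; utf8_to_wide is a hypothetical helper name, and this is not necessarily what the answer's author did. Note that std::wstring_convert and <codecvt> are deprecated since C++17, so compilers may warn; on Windows, MultiByteToWideChar with CP_UTF8 is a common alternative.

```cpp
#include <codecvt>
#include <locale>
#include <string>

// Hypothetical helper: converts UTF-8 bytes into a wide string holding
// UTF-16 code units. from_bytes() throws std::range_error on invalid input.
std::wstring utf8_to_wide(const std::string& utf8)
{
    std::wstring_convert<std::codecvt_utf8_utf16<wchar_t>> converter;
    return converter.from_bytes(utf8);
}
```

The small caps letters in the question all lie in the Basic Multilingual Plane, so each one becomes a single wchar_t in the result.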
Comments
… std::ifstream in binary mode to avoid any data conversions while reading the chars. That will at least ensure the std::string has the correct bytes. That doesn't mean the IDE will display it correctly, though. Otherwise, use std::wstring instead, as you already discovered. You can read it with a std::wifstream that has a UTF-8 locale imbue()'ed into it. Or read the raw bytes first and then use MultiByteToWideChar() or std::wstring_convert to convert the bytes to std::wstring.
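The binary-mode read suggested in this comment could look roughly like the sketch below. read_file_bytes is a hypothetical helper name; the only change from the question's code is the std::ios::binary flag, which stops the stream from translating line endings so the std::string receives the file's bytes unchanged.

```cpp
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>

// Hypothetical helper: returns the raw bytes of a file, with no newline
// translation or other conversion applied by the stream.
std::string read_file_bytes(const std::string& path)
{
    std::ifstream file_stream(path, std::ios::binary);
    if (!file_stream)
        throw std::runtime_error("cannot open file: " + path);

    return std::string((std::istreambuf_iterator<char>(file_stream)),
                       std::istreambuf_iterator<char>());
}
```

The returned bytes can then be passed to a UTF-8 to wide conversion such as MultiByteToWideChar() or std::wstring_convert, as the comment describes.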