I am printing some ASCII art to the Serial monitor from an Arduino UNO, with some success. However, string literals use more memory than I would like. I wanted to try constructing the strings/chars from other data types so that I can manipulate the data and print the ASCII art without storing it in string literals.
However, I have found that there seems to be no way to print UTF-8 characters other than from a string literal. Is this the case? Is there no way to construct a string containing characters whose numeric values are too big for char?
For example, printing "▓" works fine as a string literal but, it seems, in no other way.
Serial.println("▓"); // works fine
Serial.println('▓'); // char can't store value
Serial.println(String('▓')); // char can't store value
Serial.println(0x2593); // just prints the numeric value
Serial.println((char)0x2593); // char can't store value
Serial.println((wchar_t)0x2593); // doesn't work
Serial.println(String(0x2593)); // doesn't work
Serial.println(String((wchar_t)0x2593)); // doesn't work
Similarly with write() instead of print():
Serial.write("▓"); // Works fine
Serial.write('▓'); // char can't store value
Serial.write(0x2593); // just prints the numeric value
Serial.write((char)0x2593); // char can't store value
Serial.write((wchar_t)0x2593); // doesn't work
I also tried deriving new component strings from a string literal using substring() and charAt(). Neither works: both produce � in the output.
String((char)65) constructs the string "A" from the numeric value 65. Neither String((char)0x2593) nor String((wchar_t)0x2593) produces the desired result. Is there a way to construct a string from numeric values that are too big to store in a char?
1 Answer
As you have noticed, Serial doesn't know how to deal with wchar_t.
If you are building your strings algorithmically from Unicode code
points, you need to convert those code points to UTF-8 for printing. I
am not aware of any built-in function that does that. You may want to
search the library manager for a library providing this functionality.
Alternatively, you could write the conversion yourself: it is not that complicated. For example, here is a function that converts any code point from the BMP (i.e. < 2^16) to a UTF-8 string:
// A BMP multi-byte character, with terminating NUL byte.
struct Mbchar {
    char utf8[4];
};

// Convert a wide character to UTF-8. Only works within the BMP.
Mbchar wchar_to_utf8(wchar_t c) {
    Mbchar result;
    if (c < 128) {                 // 0xxx.xxxx
        result.utf8[0] = c;
        result.utf8[1] = 0;
    } else if (c < 2048) {         // 110x.xxxx 10xx.xxxx
        result.utf8[0] = 0xc0 | (c >> 6);
        result.utf8[1] = 0x80 | (c & 0x3f);
        result.utf8[2] = 0;
    } else {                       // 1110.xxxx 10xx.xxxx 10xx.xxxx
        result.utf8[0] = 0xe0 | (c >> 12);
        result.utf8[1] = 0x80 | (c >> 6 & 0x3f);
        result.utf8[2] = 0x80 | (c & 0x3f);
        result.utf8[3] = 0;
    }
    return result;
}
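The function above stops at the BMP. If code points beyond it are ever needed, one possible extension is sketched here. It takes a uint32_t rather than a wchar_t, so it does not depend on how wide (or how signed) wchar_t happens to be on a given board. The name codepoint_to_utf8 and the caller-supplied 5-byte buffer are illustrative assumptions, not part of the Arduino core.

```cpp
#include <stdint.h>
#include <stddef.h>

// Hypothetical helper (not an Arduino API).
// Writes the UTF-8 encoding of `cp`, followed by a NUL byte, into `out`,
// which must have room for at least 5 bytes. Returns the number of encoded
// bytes (1 to 4), or 0 if `cp` is a surrogate or lies above U+10FFFF.
size_t codepoint_to_utf8(uint32_t cp, char out[5]) {
    size_t n = 0;
    if (cp < 0x80) {                               // 0xxx.xxxx
        out[n++] = (char)cp;
    } else if (cp < 0x800) {                       // 110x.xxxx 10xx.xxxx
        out[n++] = (char)(0xC0 | (cp >> 6));
        out[n++] = (char)(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {                     // 1110.xxxx + 2 continuation bytes
        if (cp >= 0xD800 && cp <= 0xDFFF) {        // surrogates are not valid code points
            out[0] = 0;
            return 0;
        }
        out[n++] = (char)(0xE0 | (cp >> 12));
        out[n++] = (char)(0x80 | ((cp >> 6) & 0x3F));
        out[n++] = (char)(0x80 | (cp & 0x3F));
    } else if (cp <= 0x10FFFF) {                   // 1111.0xxx + 3 continuation bytes
        out[n++] = (char)(0xF0 | (cp >> 18));
        out[n++] = (char)(0x80 | ((cp >> 12) & 0x3F));
        out[n++] = (char)(0x80 | ((cp >> 6) & 0x3F));
        out[n++] = (char)(0x80 | (cp & 0x3F));
    } else {                                       // beyond the Unicode range
        out[0] = 0;
        return 0;
    }
    out[n] = 0;                                    // terminating NUL
    return n;
}
```

On an Arduino, the filled buffer could then be passed to Serial.print() like any other C string.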
Keep in mind that, depending on the Arduino you are using, wchar_t may not support characters outside the BMP. The AVR-based Arduinos definitely don't.
wchar_to_utf8() can be used like this:
Serial.println(wchar_to_utf8(0x2593).utf8); // prints "▓"
Might be worth adding that their '▓' usage will not produce a wchar_t type, it'll just produce some useless char value. They'd need L'▓', which they should be able to use in conjunction with your code in Serial.println(wchar_to_utf8(L'▓').utf8);
timemage, commented Jan 5, 2024 at 17:31
Serial.println("\u2593");