musl/src/multibyte/btowc.c, branch master musl - an implementation of the standard library for Linux-based systems byte-based C locale, phase 1: multibyte character handling functions 2015年06月16日T05:28:48+00:00 Rich Felker dalias@aerifal.cx 2015年06月16日T04:44:17+00:00 1507ebf837334e9e07cfab1ca1c2e88449069a80 this patch makes the functions which work directly on multibyte characters treat the high bytes as individual abstract code units rather than as multibyte sequences when MB_CUR_MAX is 1. since MB_CUR_MAX is presently defined as a constant 4, all of the new code added is dead code, and optimizing compilers' code generation should not be affected at all. a future commit will activate the new code. as abstract code units, bytes 0x80 to 0xff are represented by wchar_t values 0xdf80 to 0xdfff, at the end of the surrogates range. this ensures that they will never be misinterpreted as Unicode characters, and that all wctype functions return false for these "characters" without needing locale-specific logic. a high range outside of Unicode such as 0x7fffff80 to 0x7fffffff was also considered, but since C11's char16_t also needs to be able to represent conversions of these bytes, the surrogate range was the natural choice.
this patch makes the functions which work directly on multibyte
characters treat the high bytes as individual abstract code units
rather than as multibyte sequences when MB_CUR_MAX is 1. since
MB_CUR_MAX is presently defined as a constant 4, all of the new code
added is dead code, and optimizing compilers' code generation should
not be affected at all. a future commit will activate the new code.
as abstract code units, bytes 0x80 to 0xff are represented by wchar_t
values 0xdf80 to 0xdfff, at the end of the surrogates range. this
ensures that they will never be misinterpreted as Unicode characters,
and that all wctype functions return false for these "characters"
without needing locale-specific logic. a high range outside of Unicode
such as 0x7fffff80 to 0x7fffffff was also considered, but since C11's
char16_t also needs to be able to represent conversions of these
bytes, the surrogate range was the natural choice.
fix btowc corner case 2015年06月16日T04:21:38+00:00 Rich Felker dalias@aerifal.cx 2015年06月16日T04:21:38+00:00 38e2f727237230300fea6aff68802db04625fd23 btowc is required to interpret its argument by conversion to unsigned char, unless the argument is equal to EOF. since the conversion to produces a non-character value anyway, we can just unconditionally convert, for now.
btowc is required to interpret its argument by conversion to unsigned
char, unless the argument is equal to EOF. since the conversion to
produces a non-character value anyway, we can just unconditionally
convert, for now.
initial check-in, version 0.5.0 2011年02月12日T05:22:29+00:00 Rich Felker dalias@aerifal.cx 2011年02月12日T05:22:29+00:00 0b44a0315b47dd8eced9f3b7f31580cf14bbfc01

AltStyle によって変換されたページ (->オリジナル) /