Grapheme clusters, a.k.a. real characters

Steven D'Aprano steve at pearwood.info
Sun Jul 16 01:44:38 EDT 2017


On Sun, 16 Jul 2017 12:33:10 +1000, Ben Finney wrote:
> And yet the ASCII and Unicode standard says code point 0x0A (U+000A LINE
> FEED) is a character, by definition.
[...]
>> Is an acute accent a character?
> Yes, according to Unicode. ‘´’ (U+0301 ACUTE ACCENT) is a character.

Do you have references for those claims?

Because I'm pretty sure that Unicode is very, very careful to never use
the word "character" in a formal or normative manner, only as an informal
term for "the kinds of things that regular folk consider letters or
characters or similar".

And I don't think regular folks would know what a line feed was if it
jumped out of their computer and bit them :-) They would know what an
accent is, and I doubt they would consider an accent not on a base letter
to be a character. (I know I don't.)
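
For anyone who wants to poke at this in the interpreter, here is a rough
sketch of how the standard unicodedata module classifies the code points
under discussion. The describe() helper is just a name made up for this
example, and none of this settles what "character" means; it only shows
what the Unicode Character Database records for each code point. Note that
the spacing accent '´' is U+00B4, while U+0301 is the combining form.

```python
import unicodedata

def describe(ch):
    """Return (code point, name, general category, combining class)."""
    return (
        f"U+{ord(ch):04X}",
        unicodedata.name(ch, "<no name>"),  # control characters have no name
        unicodedata.category(ch),           # e.g. Cc = control, Mn = nonspacing mark
        unicodedata.combining(ch),          # 0 means "not a combining mark"
    )

# Line feed, spacing acute accent, combining acute accent.
for ch in ("\n", "\u00b4", "\u0301"):
    print(describe(ch))

# One user-perceived letter, two code points until it is composed.
decomposed = "e\u0301"
composed = unicodedata.normalize("NFC", decomposed)
print(len(decomposed), len(composed))
```

As far as I know the standard library has nothing that splits a string into
grapheme clusters, so len() here counts code points, not what a regular
person would call letters.
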
-- 
Steve

