Message144683
| Author |
loewis |
| Recipients |
Arfrever, ezio.melotti, gvanrossum, loewis, tchrist, terry.reedy, vstinner |
| Date |
2011-09-30.10:36:38 |
| Message-id |
<4E859BAF.2050505@v.loewis.de> |
| In-reply-to |
<1317353803.72.0.585410325489.issue12737@psf.upfronthosting.co.za> |
| Content |
> Martin, do you think that str.title() should follow the Unicode standard?
I don't think that "follow the Unicode standard" has any meaning in this
context: the Unicode standard doesn't specify (AFAIK) what a .title()
method in a programming language should do.
> Should string methods work with all the normalizations or just with NFC?
When we know what .title() should do, it should do so correctly for all
strings. I'll try to propose a definition for .title():
"Split S into words. Change the first letter in a word to upper-case,
and all subsequent letters to lower case. A word is a sequence that
starts with a letter, followed by letter-related characters."
Letters are all characters with the "Alphabetic" property, i.e.
Lu+Ll+Lt+Lm+Lo+Nl + Other_Alphabetic.
"letter-related" characters are letters + marks (Mn, Mc, Me). |
|
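The definition proposed in the message can be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not the implementation used by CPython: the names `proposed_title`, `is_letter` and `is_letter_related` are hypothetical, and because the `unicodedata` module does not expose the Other_Alphabetic property, "letter" is approximated here by the general categories Lu, Ll, Lt, Lm, Lo and Nl only.

```python
import unicodedata

# General categories the message lists as making up "Alphabetic".
# Assumption: Other_Alphabetic is left out, since unicodedata does not expose it.
LETTER_CATEGORIES = {"Lu", "Ll", "Lt", "Lm", "Lo", "Nl"}
# Marks, which the message counts as "letter-related".
MARK_CATEGORIES = {"Mn", "Mc", "Me"}


def is_letter(ch):
    """Approximate test for the proposed notion of a letter."""
    return unicodedata.category(ch) in LETTER_CATEGORIES


def is_letter_related(ch):
    """Letters plus combining marks."""
    return is_letter(ch) or unicodedata.category(ch) in MARK_CATEGORIES


def proposed_title(s):
    """Upper-case the first letter of each word, lower-case the rest.

    A word starts with a letter and continues through any following
    letter-related characters, as in the proposed definition.
    """
    out = []
    in_word = False
    for ch in s:
        if in_word and is_letter_related(ch):
            out.append(ch.lower())   # subsequent character of the word
        elif is_letter(ch):
            out.append(ch.upper())   # first letter of a new word
            in_word = True
        else:
            out.append(ch)           # not part of any word
            in_word = False
    return "".join(out)


print(proposed_title("hello wORLD"))                 # Hello World
print(ascii(proposed_title("de\u0301ja\u0300 VU")))  # 'De\u0301ja\u0300 Vu'
```

Under this sketch a decomposed (NFD) string such as "de\u0301ja\u0300" is treated as a single word, because the combining accents are marks and so do not end the word. Characters such as the apostrophe are not letter-related under the definition as stated, so they still split words.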