Message155370
| Author |
loewis |
| Recipients |
Arfrever, Nicholas.Cole, ezio.melotti, inigoserna, loewis, poq, tchrist, vstinner, zeha |
| Date |
2012-03-11.03:14:51 |
| Message-id |
<1331435693.42.0.0207181032933.issue12568@psf.upfronthosting.co.za> |
| In-reply-to |
| Content |
Tom: I don't think Unicode::GCString implements UAX#11 correctly (but this is really out of scope for this issue). In particular, it contains an ad-hoc decision to introduce an EA_Z East Asian Width value that UAX#11 doesn't define.
In most cases, it's probably reasonable to introduce this EA_Z feature. However, there are some significant deviations from UAX#11 here:
- combining characters are given EA_Z in sombok/data/custom.pl, even though UAX#11 assigns them A or N. UAX#11 points out that the advance width depends on whether the terminal performs character combination. It's not clear whether Unicode::GCString aims for "strict" UAX#11 width or for "advance width".
- control characters are also given EA_Z, even though UAX#11 gives them EA_N. In this case, it's neither UAX#11 width nor advance width, since control characters have various effects on the terminal (the tab character in particular). |
|
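To make the two deviations concrete, here is a minimal sketch that asks Python's own unicodedata module for the East_Asian_Width property of one combining character and one control character. The helper name `describe` and the choice of sample characters are only illustrative assumptions; the results reflect the Unicode data bundled with the running interpreter, not what Unicode::GCString does.

```python
import unicodedata

def describe(ch):
    """Return (East_Asian_Width, general category, combining class) for one character."""
    return (
        unicodedata.east_asian_width(ch),  # UAX#11 property value, e.g. 'A', 'N', 'W'
        unicodedata.category(ch),          # e.g. 'Mn' for nonspacing marks, 'Cc' for controls
        unicodedata.combining(ch),         # canonical combining class, 0 if not combining
    )

# Illustrative samples: a combining mark, a control character, a wide ideograph.
samples = [
    ("COMBINING ACUTE ACCENT", "\u0301"),
    ("CHARACTER TABULATION", "\t"),
    ("CJK ideograph U+4E2D", "\u4e2d"),
]

for label, ch in samples:
    width, category, combining_class = describe(ch)
    print(f"U+{ord(ch):04X} {label}: EA={width} category={category} ccc={combining_class}")
```

The property data reports A for the combining accent and N for the tab character, so a "zero width" answer for either has to come from a policy layered on top of UAX#11 (such as the EA_Z class), not from the property itself.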