Message122296
| Author |
belopolsky |
| Recipients |
belopolsky, eric.smith, pitrou |
| Date |
2010-11-24.19:06:15 |
| Message-id |
<AANLkTi=_vgBhVBnt3+4DVt7xOPEruyA=GssoZcyL_Cjx@mail.gmail.com> |
| In-reply-to |
<1290612823.89.0.808626412832.issue10521@psf.upfronthosting.co.za> |
| Content |
On Wed, Nov 24, 2010 at 10:33 AM, Antoine Pitrou <report@bugs.python.org> wrote:
..
> The question is, what should it do with such an input?
I think the rule for such functions should be that if
input.encode('utf-8') is the same on wide and narrow builds, then the
output.encode('utf-8') should be the same.
> Pretend it's a single char (but other chars in the source string won't get the same treatment)?
Yes, *and* surrogate pairs in the source string should count for one
char as well.
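
A rough illustration of the counting difference, written for a current Python where every build stores full code points. The helper name `utf16_units` is made up here; it only mimics what len() would have reported on a narrow build, where a non-BMP character was stored as a surrogate pair:

```python
def utf16_units(s):
    """Number of UTF-16 code units in s.

    Hypothetical helper: approximates len(s) on a narrow build,
    where each non-BMP character occupied two code units.
    """
    return len(s.encode('utf-16-le')) // 2


s = 'a\U00010400b'            # one non-BMP character between two ASCII letters

code_points = len(s)          # counts the non-BMP character once
narrow_len = utf16_units(s)   # counts it twice, as a surrogate pair

# Padding computed from code points treats the pair as one char,
# which is the behaviour argued for above.
centered = s.center(5, '-')
```

With the rule above, the padding would be computed from code_points rather than narrow_len, so the UTF-8 encoding of the result would not depend on the build.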
> Treat it as a two-char string (but then center() and friends should logically be
> extended to accept strings of arbitrary lengths)?
No. For better or worse, on wide builds these methods effectively
operate on code points. They don't interpret multi-code-point
graphemes or take grapheme width into account:
--------------------
123
--------------------
Application code has to ascertain that it is dealing with fixed-width
characters in the target font before using these methods for
text alignment. |
|
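
A minimal sketch of the code-point behaviour described in the message, assuming a current Python. The helpers `base_char_count` and `center_ignoring_combining` are hypothetical names introduced only for illustration. They skip combining marks and nothing else, so they do not handle double-width East Asian characters or other zero-width code points:

```python
import unicodedata


def base_char_count(s):
    """Count code points that are not combining marks (hypothetical helper)."""
    return sum(1 for ch in s if unicodedata.combining(ch) == 0)


def center_ignoring_combining(s, width, fill=' '):
    """Like str.center(), but pad as if combining marks took no column.

    A sketch only: it widens the target by the number of combining
    marks in s, so the base characters of the result fill `width`.
    """
    extra = len(s) - base_char_count(s)
    return s.center(width + extra, fill)


word = 'e\u0301'   # 'e' + COMBINING ACUTE ACCENT: two code points, one glyph

plain = word.center(5, '-')                          # pads by code points
adjusted = center_ignoring_combining(word, 5, '-')   # pads by base characters
```

Here plain contains five code points but only four base characters, so it would be expected to render one column short in a fixed-width font, while adjusted contains five base characters.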