[Python-Dev] Divorcing str and unicode (no more implicit conversions).

Neil Hodgson nyamatongwe at gmail.com
Tue Oct 25 01:13:51 CEST 2005


M.-A. Lemburg:
> Unicode has the concept of combining code points, e.g. you can
> store an "é" (e with an accent) as "e" + "'". Now if you slice
> off the accent, you'll break the character that you encoded
> using combining code points.
> ...
> next_<indextype>(u, index) -> integer
> Returns the Unicode object index for the start of the next
> <indextype> found after u[index] or -1 in case no next element
> of this type exists.

 Should entity breakage be further discouraged by returning a slice
here rather than an object index?
 Something like:
    i = first_grapheme(u)
    x = 0
    while x < width and u[i] != "\n":
        x, _ = draw(u[i], (x, y))
        i = next_grapheme(u, i)
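
 For concreteness, here is a minimal sketch of the slice-returning variant.
It assumes a grapheme can be approximated as one base code point plus any
combining marks that follow it, which is a simplification and not the full
Unicode text segmentation rules. The names first_grapheme and next_grapheme
follow the loop above; the helper _grapheme_at and the use of None to signal
the end of the text are assumptions made for illustration, not a proposed API.

```python
import unicodedata


def _grapheme_at(u, start):
    """Hypothetical helper: slice covering the code point at `start`
    plus any combining marks that immediately follow it."""
    if start >= len(u):
        return None  # assumption: None marks "no further grapheme"
    end = start + 1
    # unicodedata.combining() returns a non-zero class for combining marks
    while end < len(u) and unicodedata.combining(u[end]) != 0:
        end += 1
    return slice(start, end)


def first_grapheme(u):
    """Slice for the first grapheme of u, or None if u is empty."""
    return _grapheme_at(u, 0)


def next_grapheme(u, current):
    """Slice for the grapheme following the slice `current`, or None."""
    return _grapheme_at(u, current.stop)


# Walk the first line of the text one grapheme at a time.
text = "cafe\u0301\nmore"  # "e" followed by COMBINING ACUTE ACCENT
cells = []
g = first_grapheme(text)
while g is not None and text[g] != "\n":
    cells.append(text[g])
    g = next_grapheme(text, g)
```

 Because the caller only ever holds a slice, u[i] always yields the base
character together with its accent, so there is no index that splits them.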
 Neil

