Message299706
| Author |
Guillaume Sanchez |
| Recipients |
Guillaume Sanchez, Socob, benjamin.peterson, ezio.melotti, lemburg, loewis, mrabarnett, r.david.murray, serhiy.storchaka, steven.daprano, terry.reedy, vstinner |
| Date |
2017-08-03.13:05:32 |
| Message-id |
<1501765532.54.0.799356866824.issue30717@psf.upfronthosting.co.za> |
| In-reply-to |
| Content |
I have a few criticisms of that proto-PEP:
http://mail.python.org/pipermail/python-dev/2001-July/015938.html
In particular, because all of those functions take and return a bare index, no state can be kept between calls.
That's a problem because:
> next_<indextype>(u, index) -> integer
As you've seen, grapheme clustering (like word and line breaking) needs an automaton to decide where the break points are, which means that starting at an arbitrary index is not possible.
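To make that concrete, here is a minimal sketch, not the real UAX #29 automaton. It assumes a toy rule set with a single rule: regional indicator code points pair up two by two (flags), and every other code point is its own cluster. The names `is_regional_indicator` and `toy_breaks` are hypothetical and exist only for this illustration. Even with that one rule, the answer depends on state carried from the beginning of the string.

```python
def is_regional_indicator(ch):
    # Regional indicator symbols occupy U+1F1E6..U+1F1FF.
    return 0x1F1E6 <= ord(ch) <= 0x1F1FF


def toy_breaks(text, start=0):
    """Yield break positions after `start` under the toy rule set.

    The only state is whether an unpaired regional indicator
    immediately precedes the current position.
    """
    pending_ri = False
    for i in range(start, len(text)):
        ch = text[i]
        if is_regional_indicator(ch) and pending_ri:
            # Second half of a pair: no break before it, pair is closed.
            pending_ri = False
        else:
            if i > start:
                yield i
            pending_ri = is_regional_indicator(ch)
    if len(text) > start:
        yield len(text)


# Two flags: FR followed by DE (four code points in total).
flags = "\U0001F1EB\U0001F1F7\U0001F1E9\U0001F1EA"
print(list(toy_breaks(flags)))           # [2, 4]
print(list(toy_breaks(flags, start=1)))  # [3, 4]
```

Starting at index 1 pairs the second half of the first flag with the first half of the second one, so the reported break lands at 3 instead of 2. A function that only receives `(u, index)` has no way to know which pairing is the right one without rescanning.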
> prev_<indextype>(u, index) -> integer
Is it really necessary? It means implementing the same logic to go backward. In our current case, we'd need a backward grapheme cluster break automaton too.
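As a rough illustration of what going backward costs, here is a sketch under the same kind of toy assumption: only the pairing of regional indicators is modelled, everything else is its own cluster. `toy_prev_break` is a hypothetical name, not a proposed API. To find the previous break it has to walk back to the start of the whole run of regional indicators and count parity, so the work grows with the length of the run rather than being constant.

```python
def is_regional_indicator(ch):
    # Regional indicator symbols occupy U+1F1E6..U+1F1FF.
    return 0x1F1E6 <= ord(ch) <= 0x1F1FF


def toy_prev_break(text, index):
    """Return the closest break position strictly before `index`.

    Assumes 1 <= index <= len(text) and the toy rule set only.
    """
    last = index - 1
    if not is_regional_indicator(text[last]):
        # Any other code point is its own cluster in this toy model.
        return last
    # Walk back to the first regional indicator of the run.
    run_start = last
    while run_start > 0 and is_regional_indicator(text[run_start - 1]):
        run_start -= 1
    # Pairs start at even offsets from the start of the run.
    return run_start + ((last - run_start) // 2) * 2


# "a" followed by the FR and DE flags: breaks are at 0, 1, 3 and 5.
text = "a\U0001F1EB\U0001F1F7\U0001F1E9\U0001F1EA"
print(toy_prev_break(text, 5))  # 3
print(toy_prev_break(text, 3))  # 1
```

The full grapheme cluster rules have more context-dependent cases than this one, so a real backward automaton would need at least this much lookbehind.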
> <indextype>_start(u, index) -> integer
> <indextype>_end(u, index) -> integer
Not doable in O(1), for the same reason as next_<indextype>(). We need context: the code point alone does not give enough information to tell whether it is the start or end of a given indextype. |
|
History
|
|---|
| Date |
User |
Action |
Args |
| 2017-08-03 13:05:32 | Guillaume Sanchez | set | recipients:
+ Guillaume Sanchez, lemburg, loewis, terry.reedy, vstinner, benjamin.peterson, ezio.melotti, mrabarnett, steven.daprano, r.david.murray, serhiy.storchaka, Socob |
| 2017-08-03 13:05:32 | Guillaume Sanchez | set | messageid: <1501765532.54.0.799356866824.issue30717@psf.upfronthosting.co.za> |
| 2017-08-03 13:05:32 | Guillaume Sanchez | link | issue30717 messages |
| 2017-08-03 13:05:32 | Guillaume Sanchez | create |
|