Message102089
| Author |
lemburg |
| Recipients |
dangra, ezio.melotti, lemburg, sjmachin |
| Date |
2010-04-01.13:19:03 |
| SpamBayes Score |
0.0029944375 |
| Marked as misclassified |
No |
| Message-id |
<4BB49D46.7010209@egenix.com> |
| In-reply-to |
<1270123022.09.0.351284182872.issue8271@psf.upfronthosting.co.za> |
| Content |
John Machin wrote:
>
> John Machin <sjmachin@users.sourceforge.net> added the comment:
>
> Unicode has been frozen at 0x10FFFF. That's it. There is no such thing as a valid 5-byte or 6-byte UTF-8 string.
The UTF-8 codec was written at a time when UTF-8 still allowed
5-byte and 6-byte sequences:
http://www.rfc-editor.org/rfc/rfc2279.txt
Use of those encodings has always raised an error, though. For error
handling purposes, the codec still has to recognize those possibilities. |
|
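As a rough illustration of the point above, the following sketch maps a lead byte to the sequence length it implied under RFC 2279, and checks that a strict decode of 5-byte and 6-byte forms fails. The helper names `rfc2279_sequence_length` and `decodes_strictly` are hypothetical and are not part of the codec; the sketch only assumes the lead-byte bit patterns from RFC 2279 and the behaviour of the built-in `bytes.decode` in Python 3.

```python
def rfc2279_sequence_length(lead_byte: int) -> int:
    """Return the sequence length a lead byte implied under RFC 2279.

    Returns 0 for bytes that can never start a sequence
    (continuation bytes 0x80-0xBF, and 0xFE/0xFF).
    """
    if lead_byte < 0x80:
        return 1  # 0xxxxxxx
    if 0xC0 <= lead_byte <= 0xDF:
        return 2  # 110xxxxx
    if 0xE0 <= lead_byte <= 0xEF:
        return 3  # 1110xxxx
    if 0xF0 <= lead_byte <= 0xF7:
        return 4  # 11110xxx
    if 0xF8 <= lead_byte <= 0xFB:
        return 5  # 111110xx (dropped from UTF-8 by later specifications)
    if 0xFC <= lead_byte <= 0xFD:
        return 6  # 1111110x (dropped from UTF-8 by later specifications)
    return 0


def decodes_strictly(data: bytes) -> bool:
    """Return True if data decodes as UTF-8 with the default strict handler."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Hypothetical sample inputs: a 5-byte and a 6-byte form in the RFC 2279 layout.
five_byte_form = b"\xf8\x88\x80\x80\x80"
six_byte_form = b"\xfc\x84\x80\x80\x80\x80"

print(rfc2279_sequence_length(five_byte_form[0]), decodes_strictly(five_byte_form))
print(rfc2279_sequence_length(six_byte_form[0]), decodes_strictly(six_byte_form))
```

The length table is only useful to an error handler for deciding how many bytes belong to the malformed sequence; it does not make those sequences decodable.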