Re: UTF-8 testing
- Subject: Re: UTF-8 testing
- From: Javier Guerra Giraldez <javier@...>
- Date: Fri, 7 Jan 2011 09:55:05 -0500
On Fri, Jan 7, 2011 at 9:30 AM, Tony Finch <dot@dotat.at> wrote:
> That's incorrect. Codepoints in UTF-8 can be at most 4 octets long.
Unicode is defined as 32 bits at most (I think), but UTF-8 needs more
than 4 octets to encode 32 bits. UTF-8 is defined up to 6 octets (5
'trailing' bytes in this snippet).
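
To make the octet counts concrete, here is a rough sketch (in Python, not
taken from the snippet under discussion) of how many octets the original
1-to-6-octet UTF-8 scheme of RFC 2279 needs for a given value. The name
utf8_len is made up for illustration. Note that the 6-octet form carries
31 payload bits, and that the later RFC 3629 restricts UTF-8 to 4 octets
(up to U+10FFFF), which is the limit Tony refers to.

```python
def utf8_len(cp: int) -> int:
    """Octets needed for cp under the original 1-6 octet UTF-8 scheme (RFC 2279).

    Hypothetical helper for illustration only; RFC 3629 later capped
    UTF-8 at 4 octets (code points up to U+10FFFF).
    """
    if cp < 0 or cp > 0x7FFFFFFF:
        raise ValueError("value does not fit in 31 bits")
    # Payload bits available in a sequence of 1, 2, 3, 4, 5 and 6 octets.
    for octets, bits in enumerate((7, 11, 16, 21, 26, 31), start=1):
        if cp < (1 << bits):
            return octets
    raise AssertionError("unreachable: range was checked above")


if __name__ == "__main__":
    for value in (0x7F, 0x7FF, 0xFFFF, 0x10FFFF, 0x3FFFFFF, 0x7FFFFFFF):
        print(hex(value), utf8_len(value))
```

Everything up to U+10FFFF fits in 4 octets; only values above 0x1FFFFF
need the 5- and 6-octet forms.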
--
Javier