Re: unicode support in lua
- Subject: Re: unicode support in lua
- From: David Jones <drj@...>
- Date: 26 Apr 2007 14:38:12 +0100
On 26 Apr 2007, at 13:35, David Kastrup wrote:
> It may also be considered somewhat counterintuitive that the call
> unicode.utf8.byte(unicode.utf8.char(5000))
> returns 5000, something which naive people like myself would not
> exactly choose to call a "byte".
string.char and string.byte are inverses, and it seems sensible to
extend this inverse into the unicode.utf8 domain.
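
A minimal sketch of the inverse relationship mentioned above, using only the standard string library (the unicode.utf8 module is not assumed to be installed). The helper name roundtrips is hypothetical, and the 0 to 255 range assumes a stock Lua build where string elements are 8 bits wide:

```lua
-- Hypothetical helper: does string.byte undo string.char for this value?
local function roundtrips(n)
  return string.byte(string.char(n)) == n
end

-- Check every value a string element can hold in a stock Lua build
-- (assumed here to be 0 through 255).
local all_ok = true
for n = 0, 255 do
  if not roundtrips(n) then
    all_ok = false
  end
end

print(all_ok)
```

If the same property is carried over to unicode.utf8, as suggested above, the value returned for a character created from 5000 would be 5000 rather than an 8-bit quantity.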
When I implemented Lua in Java, strings were implemented using
java.lang.String (so using Java's 16-bit unsigned char type). I took
a similar position: string.byte returned an integer between 0 and 65535.
string.byte should probably be named string.code to avoid any
emotional attachment to byte.
Whilst almost all bytes are 8-bit (octets), byte does have other
meanings apart from 8-bit number. Google for "14-bit byte", etc.
David Jones