Re: unicode support in lua

On 26 Apr 2007, at 13:35, David Kastrup wrote:
> It may also be considered somewhat counterintuitive that the call
> unicode.utf8.byte(unicode.utf8.char(5000))
> returns 5000, something which naive people like myself would not
> exactly choose to call a "byte".

string.char and string.byte are inverses, and it seems sensible to extend this inverse relationship into the unicode.utf8 domain. When I implemented Lua in Java, strings were implemented using java.lang.String (and so used Java's 16-bit unsigned char type). I took a similar position: string.byte returned an integer between 0 and 65535. string.byte should probably be named string.code, to avoid any emotional attachment to "byte". Whilst almost all bytes are 8-bit (octets), "byte" does have meanings other than an 8-bit number. Google for "14-bit byte", etc.
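
To illustrate the inverse relationship on top of java.lang.String, here is a minimal sketch. It is not the actual implementation mentioned above; the class name LuaStringSketch and the helpers luaChar and luaByte are hypothetical names chosen for this example, and the 1-based indexing is assumed to mirror Lua's.

```java
public class LuaStringSketch {
    // Hypothetical analogue of string.char: builds a string from
    // 16-bit code units, each in the range 0..65535.
    public static String luaChar(int... codes) {
        StringBuilder sb = new StringBuilder();
        for (int code : codes) {
            if (code < 0 || code > 0xFFFF) {
                throw new IllegalArgumentException("code out of range: " + code);
            }
            sb.append((char) code);
        }
        return sb.toString();
    }

    // Hypothetical analogue of string.byte: returns the code unit at
    // 1-based position i (assumed to follow Lua's indexing) as an int.
    public static int luaByte(String s, int i) {
        return s.charAt(i - 1);
    }

    public static void main(String[] args) {
        // Round trip: the "byte" that comes back is the code that went in.
        System.out.println(luaByte(luaChar(5000), 1));  // prints 5000
        System.out.println(luaByte(luaChar(65535), 1)); // prints 65535
    }
}
```

Because Java's char is an unsigned 16-bit type, the value returned by luaByte is always between 0 and 65535, so the round trip holds for every code in that range.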
David Jones
