
Re: Changes in the validation of UTF-8



>>>>> "Dirk" == Dirk Laurie <dirk.laurie@gmail.com> writes:
 Dirk> Lua in no way even comes close to validating against the current
 Dirk> UTF-8 standard. We've been through this before. Marc Balmer in
 Dirk> particular has been quite trenchant on this point.
Other than the fact that it fails to reject encoded surrogates, which
invalid sequences does the code in Lua 5.3.5 accept?
 Dirk> All that Lua does is to verify that a string satisfies the basic
 Dirk> UTF-8 encoding: ASCII or a starting byte whose introductory
 Dirk> string of 1's says how many bytes in total are being encoded,
 Dirk> followed by the right number of 10... bytes.
That's ... not what the 5.3.5 utf8_decode does. Did you read it? Test
it?
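
To make the distinction concrete, here is a minimal sketch in C of the
two kinds of check being contrasted. It is not the Lua source: the names
structure_only_len and strict_decode, and the reject_surrogates flag,
are hypothetical and exist only for this illustration. The first
function checks only what the quoted description covers (lead byte plus
the right number of 10xxxxxx bytes). The second also looks at the
decoded value, so it can reject overlong forms and values above
U+10FFFF, and optionally encoded surrogates.

```c
#include <stddef.h>

/* Hypothetical helper: structure-only check of one sequence at s.
   Returns its length in bytes, or 0 if the lead byte and continuation
   bytes do not match up. The decoded value is never inspected. */
static size_t structure_only_len(const unsigned char *s, size_t n) {
    size_t need, i;
    if (n == 0) return 0;
    if (s[0] < 0x80) return 1;                  /* ASCII */
    else if ((s[0] & 0xE0) == 0xC0) need = 1;   /* 110xxxxx */
    else if ((s[0] & 0xF0) == 0xE0) need = 2;   /* 1110xxxx */
    else if ((s[0] & 0xF8) == 0xF0) need = 3;   /* 11110xxx */
    else return 0;          /* stray continuation or invalid lead byte */
    if (n < need + 1) return 0;                 /* truncated sequence */
    for (i = 1; i <= need; i++)
        if ((s[i] & 0xC0) != 0x80) return 0;    /* not 10xxxxxx */
    return need + 1;
}

/* Hypothetical helper: decode one sequence and also check its value.
   Returns the code point, or -1 for any invalid sequence. */
static long strict_decode(const unsigned char *s, size_t n,
                          int reject_surrogates) {
    /* smallest code point that legitimately needs 1, 2, 3, 4 bytes */
    static const long min_for_len[] = {0, 0x0, 0x80, 0x800, 0x10000};
    size_t len = structure_only_len(s, n);
    size_t i;
    long cp;
    if (len == 0) return -1;
    if (len == 1) return s[0];
    cp = s[0] & (0xFF >> (len + 1));            /* payload bits of lead */
    for (i = 1; i < len; i++)
        cp = (cp << 6) | (s[i] & 0x3F);         /* 6 bits per cont. byte */
    if (cp < min_for_len[len]) return -1;       /* overlong encoding */
    if (cp > 0x10FFFF) return -1;               /* beyond Unicode range */
    if (reject_surrogates && cp >= 0xD800 && cp <= 0xDFFF)
        return -1;                              /* encoded surrogate */
    return cp;
}
```

With these definitions the bytes C0 80 (an overlong encoding of U+0000)
pass the structure-only check but are rejected by strict_decode. The
bytes ED A0 80 (an encoded U+D800) are rejected by strict_decode only
when reject_surrogates is nonzero.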
-- 
Andrew.
