Lua 5.3.0-work2: When does utf8.offset work?


The manual says:
---
utf8.offset (s, n [, i])
Returns the byte index where the encoding of the n-th character of s starts,
counting from position i. A negative n gets characters before position i.
The default for i is 1. Returns nil if the subject does not have such character.
As a special case, when n is 0 the function returns the start of the encoding
of the character that contains the i-th byte of s.
This function assumes that s is a valid UTF-8 string.
---
In practice, the routine always seems to return something, even when s is not
valid UTF-8. For n>0, the result seems to be correct as long as the first n-1
characters of s are valid UTF-8.
> s='voilà'
> #s
6
> utf8.offset(s,6)
7
> s=s:sub(1,-2).."\xFC"
> s
voil�
> utf8.offset(s,5)
5
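
One way to avoid relying on this behaviour is to validate the string before asking for an offset. The sketch below is only an illustration: safe_offset is a hypothetical helper, not part of the utf8 library, and it assumes utf8.len behaves as in the released Lua 5.3 manual (it returns a false value plus the position of the first invalid byte when s is not valid UTF-8). The work2 build may differ.

```lua
-- Hypothetical helper (not part of the standard utf8 library).
-- Assumes utf8.len(s) returns a false value plus the byte position of the
-- first invalid byte when s is not valid UTF-8, as in released Lua 5.3.
local function safe_offset(s, n, i)
  local len, badpos = utf8.len(s)
  if not len then
    -- s is not valid UTF-8: refuse to compute an offset and report
    -- where the first invalid byte was found
    return nil, badpos
  end
  -- s is valid, so the manual's precondition for utf8.offset holds
  return utf8.offset(s, n, i)
end

print(safe_offset("voil\xC3\xA0", 6))  -- valid string: same result as utf8.offset
print(safe_offset("voil\xFC", 5))      -- invalid string: nil plus a byte position
```

This validates the whole string on every call, so it costs a full pass over s. For repeated lookups on the same string it would be cheaper to validate once and then call utf8.offset directly.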
