
Re: Issues: Character 160 - Non-breaking space + Additional Issue with UTF-8



It was thus said that the Great Gregg Reynolds once stated:
> On Tue, Jul 10, 2018, 4:44 PM Dirk Laurie <dirk.laurie@gmail.com> wrote:
> ...
> 
> >
> > I. Am. Not. Asking. For. Unicode.
> >
> > I am merely asking for extra functions along the lines of what the
> > utf8 library already does.
> > E.g. Sam's examples:
> >
> > > s1 = "Hélène"
> > > s2 = "Hélène"
 They look similar, but they are constructed differently.
> FYI these look identical on Android.
> 
> > > utf8.len(s1)
> > 6
> > > utf8.len(s2)
> > 7
> >
> > If you really do not understand what I mean, I can elaborate.
> 
> Please do.
> 
> What does "len" mean? Number of Unicode chars or number of bytes?
 The number of Unicode code points, not the number of bytes. The second
string has a letter 'e' followed by a combining accent (I'm not sure which
accent is the combining one), so it contains one more code point than the
first.
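
 A minimal sketch of the difference, for Lua 5.3 or later. The archived
strings no longer show which accent was decomposed, so the choice below
(the first 'é' written as 'e' plus U+0301 COMBINING ACUTE ACCENT) is an
assumption made only to reproduce the 6 versus 7 result quoted above:

```lua
-- Precomposed form: U+00E9 (é) and U+00E8 (è) are single code points.
local s1 = "H\u{E9}l\u{E8}ne"

-- Partly decomposed form (assumed for illustration): 'e' followed by
-- U+0301 COMBINING ACUTE ACCENT; the second accent stays precomposed.
local s2 = "He\u{301}l\u{E8}ne"

-- utf8.len counts code points; the # operator counts bytes.
print(utf8.len(s1), #s1)   --> 6   8
print(utf8.len(s2), #s2)   --> 7   9

-- List each code point of s2 with its starting byte position.
for pos, cp in utf8.codes(s2) do
  print(pos, string.format("U+%04X", cp))
end
```

 Both strings render the same way, but neither count is the number of
characters a reader sees; the utf8 library has no notion of combining
sequences.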
 -spc
