Re: Issues: Character 160 - Non-breaking space + Additional Issue with UTF-8


Subject: Re: Issues: Character 160 - Non-breaking space + Additional Issue with UTF-8
From: "Soni \"They/Them\" L." <fakedme@...>
Date: Tue, 10 Jul 2018 18:28:02 -0300

On 2018-07-10 06:20 PM, Gregg Reynolds wrote:

Your point being?

I mean, it's a joke, really, but if I were to actually redesign Unicode, I'd throw away all those annoying character tables and encode them as part of the bits. It would solve all practical problems with Unicode. But we aren't gonna have that, so we should instead stick with no Unicode support for the time being. At least until they finally decide that Unicode was a huge mistake and restart the whole thing.

On Tue, Jul 10, 2018, 4:15 PM Soni "They/Them" L. <fakedme@gmail.com> wrote:

 On 2018-07-10 05:31 PM, Gregg Reynolds wrote:
 >
 > On Tue, Jul 10, 2018, 9:00 AM Dirk Laurie <dirk.laurie@gmail.com> wrote:
 >
 >     2018-07-10 15:30 GMT+02:00 Lorenzo Donati <lorenzodonatibz@tiscali.it>:
 >
 >     > Unicode is great for typesetting (I regularly use LaTeX and it's
 >     > fun to find almost every symbol you may imagine, even ancient
 >     > German runic scripts!), but it sucks (IMHO) for general programming
 >     > or computer-related stuff. Too much mind overhead to use correctly
 >     > for little gain.
 >
 >     Yes, yes, but, if you will allow me to return to Lua and UTF-8,
 >     there would be more gain for a programmer if we had (if it is not
 >     too late already for Lua 5.4) utf8 versions of find, sub, match,
 >     gsub, gmatch, reverse. Just those, not asking for upper/lower,
 >     operating only on simple codepoints, no combining characters, no
 >     need for a C library.
 >
 >
 > Utf8 != Unicode. It's an encoding; you don't get to pick a subset and
 > still claim Unicode support.
 >
 > "Simple codepoints"? Does Unicode define that? If not, who decides
 > what that means? Zero-width space is pretty simple.
 >
 > No combining chars? Ok, but that would not be Unicode. Practical
 > result: massive confusion and complaining. You cannot accept Unicode
 > and reject combining chars.
 >
 >
 >
 >     utf8.find ("Hélène",'n')  --> 5 5
 >     utf8.sub ("Hélène",5)   --> 'ne'
 >     utf8.gsub ("Hélène","[éè]","e")  --> 'Helene' 2
 >     utf8.reverse ("Hélène")   --> 'enèléH'
 >
 https://gist.github.com/SoniEx2/ecd119507f160d9c26e3eabd9e012dc0
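
For readers following the thread, the code-point-level functions proposed above could be approximated in pure Lua on top of the utf8 library that ships with Lua 5.3. The following is only a rough sketch, not the proposal itself and not the code behind the gist link: the table `u` and its functions `find`, `sub` and `reverse` are hypothetical names, `find` handles plain text only (no patterns, so the `gsub` example with `[éè]` is not covered), and every input is assumed to be valid UTF-8.

```lua
-- Hypothetical helpers; none of these exist in Lua's standard utf8 library.
-- Requires Lua 5.3+ and assumes every input string is valid UTF-8.
local u = {}

-- Plain-text find (no patterns) that returns code point positions.
function u.find(s, needle)
  local b = s:find(needle, 1, true)        -- byte position of the match
  if not b then return nil end
  local first = utf8.len(s, 1, b - 1) + 1  -- code points before the match, plus one
  return first, first + utf8.len(needle) - 1
end

-- Like string.sub, but i and j count code points, not bytes.
function u.sub(s, i, j)
  local n = utf8.len(s)
  i, j = i or 1, j or -1
  if i < 0 then i = math.max(n + i + 1, 1) end
  if j < 0 then j = n + j + 1 end
  i, j = math.max(i, 1), math.min(j, n)
  if i > j then return "" end
  -- utf8.offset(s, n + 1) is #s + 1, so j == n is handled too.
  return s:sub(utf8.offset(s, i), utf8.offset(s, j + 1) - 1)
end

-- Reverses the order of code points (not of grapheme clusters).
function u.reverse(s)
  local chars = {}
  for ch in s:gmatch(utf8.charpattern) do chars[#chars + 1] = ch end
  local out = {}
  for k = #chars, 1, -1 do out[#out + 1] = chars[k] end
  return table.concat(out)
end

-- The source file itself must be saved as UTF-8 for these literals to work.
print(u.find("Hélène", "n"))   --> 5   5
print(u.sub("Hélène", 5))      --> ne
print(u.reverse("Hélène"))     --> enèléH
```

The sketch also illustrates the objection about combining characters: it treats each code point as a unit, so reversing a decomposed sequence such as "e\u{301}x" moves the combining accent onto the x.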

Follow-Ups:
- Re: Issues: Character 160 - Non-breaking space + Additional Issue with UTF-8, Gregg Reynolds

References:
- Issues: Character 160 - Non-breaking space + Additional Issue with UTF-8, Alysson Cunha
- Re: Issues: Character 160 - Non-breaking space + Additional Issue with UTF-8, Hugo Musso Gualandi
- Re: Issues: Character 160 - Non-breaking space + Additional Issue with UTF-8, Alysson Cunha
- Re: Issues: Character 160 - Non-breaking space + Additional Issue with UTF-8, Axel Kittenberger
- Re: Issues: Character 160 - Non-breaking space + Additional Issue with UTF-8, Lorenzo Donati
- Re: Issues: Character 160 - Non-breaking space + Additional Issue with UTF-8, Albert Chan
- Re: Issues: Character 160 - Non-breaking space + Additional Issue with UTF-8, Sean Conner
- Re: Issues: Character 160 - Non-breaking space + Additional Issue with UTF-8, Lorenzo Donati
- Re: Issues: Character 160 - Non-breaking space + Additional Issue with UTF-8, Dirk Laurie
- Re: Issues: Character 160 - Non-breaking space + Additional Issue with UTF-8, Gregg Reynolds
- Re: Issues: Character 160 - Non-breaking space + Additional Issue with UTF-8, Soni "They/Them" L.
- Re: Issues: Character 160 - Non-breaking space + Additional Issue with UTF-8, Gregg Reynolds
