Re: Issues: Character 160 - Non-breaking space + Additional Issue with UTF-8
- Subject: Re: Issues: Character 160 - Non-breaking space + Additional Issue with UTF-8
- From: Lorenzo Donati <lorenzodonatibz@...>
- Date: 10 Jul 2018 15:30:13 +0200
On 09/07/2018 22:06, Sean Conner wrote:
> It was thus said that the Great Albert Chan once stated:
>> BTW, non-breaking "fake" space in filename is a bad idea.
>> http://boston.conman.org/2018/02/28.2
>
>   What's bad about it?
>
>   -spc (Or are you in the "no spaces in filename" camp?)
Well, it's a nice and smart trick, but I'm in the camp of "If I see a
space, I want to know it's a space (0x20)".
Moreover, it is not ASCII, and I tend to avoid non-ASCII names. I work
mainly on Windows, which doesn't support UTF-8 natively the way Linux
does, so handling characters outside the ASCII set can be a nightmare.
BTW, try handing such a file to someone (especially a non-programmer)
who is unaware of the trick and watch the puzzlement in his eyes when he
cannot understand why he can't delete "my invoice.pdf" from the command
line, or why he sees "my invoice.pdf" and "my invoice.pdf" in the same
directory!!!
If you want to be very evil, put /two/ spaces between words, where the
first is ASCII 0x20 and the second is char 160!
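To make the difference concrete, here is a minimal Lua sketch of the idea
(an illustration of mine; it assumes the \u{} escape of Lua 5.3 or later):

  -- Two file names that render (almost) identically in many fonts:
  local plain  = "my invoice.pdf"          -- ASCII space, 0x20
  local sneaky = "my\u{00A0}invoice.pdf"   -- char 160 (NO-BREAK SPACE)

  print(plain == sneaky)   --> false: different strings
  print(#plain, #sneaky)   --> 14   15   (the NBSP takes two bytes in UTF-8: 0xC2 0xA0)

  -- Dump the bytes so the hidden difference becomes visible:
  for _, s in ipairs{plain, sneaky} do
    print((s:gsub(".", function (c) return string.format("%02X ", c:byte()) end)))
  end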
Moreover, I have a gut feeling that there are plenty of badly written
Windows programs/scripts that will choke on char 160 when the code page
is set differently from what the programmer assumed.
Yes, the underscore is not nice to look at, but at least I know exactly
which character it is (well, in ASCII at least; I'm sure there is some
obscure Unicode code point that is almost identical to an underscore and
will appear identical in some font!).
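One candidate (my own illustration, not something from the thread) is
U+FF3F, FULLWIDTH LOW LINE; how close it looks to "_" depends on the
font, but to the machine it is a completely different character:

  local underscore = "_"          -- U+005F LOW LINE, one byte
  local lookalike  = "\u{FF3F}"   -- U+FF3F FULLWIDTH LOW LINE (Lua 5.3 \u{} escape)
  print(underscore == lookalike)  --> false
  print(#underscore, #lookalike)  --> 1   3  (U+FF3F is three bytes in UTF-8: 0xEF 0xBC 0xBF)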
Unicode is great for typesetting (I regularly use LaTeX and it's fun to
find almost every symbol you can imagine, even ancient Germanic runic
scripts!), but IMHO it sucks for general programming or computer-related
stuff: too much mental overhead to use correctly, for little gain.
<disclaimer>
I know my view is a bit "western-centric" (or "Latin-centric"), and
people speaking languages that need thousands of symbols to be written
(especially Asian languages) might think differently.
Anyway, I'm curious to know how, say, Chinese programmers see this.
Would they find coding easier if they could write programs using
ideograms, or do they prefer transliterating their words into the
Latin alphabet?
BTW, I religiously code "in English" (even comments), and I teach my
students to try to do the same, but I understand that this sometimes
requires better English skills than many programmers have.
</disclaimer>
On a related note: sometimes I've dreamt of a universally
*standardized* "extended ASCII" charset for programming, without all the
human-language-related stuff of Unicode. A 16-bit "universal" charset
should be big enough to accommodate any symbol useful in programming
(e.g. common math operators and symbols, Greek letters, currency
symbols), but I'm digressing. :-)
Cheers!
-- Lorenzo