1.7.0-48: [BUG] Passing characters above 128 from bash command line
Edward Lam
edward@sidefx.com
Fri May 29 19:43:00 GMT 2009
IWAMURO Motonori wrote:
> I think that you should set "export LANG=en_US.ISO-8859-1" instead of
> "export LANG=LANG=en_US.ISO-8859-1".
Ah, sorry, copy/paste error. Yes, that finally works. Thank you!
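A quick way to see which character set a given setting selects is `locale charmap`. The following is only a sketch, assuming a POSIX `locale` utility is installed (it may not be on every Cygwin install); `charmap_for` is a hypothetical helper name, not something from this thread. An unsupported name, such as the mistyped one above, may simply report the same character set as C, so compare the lines rather than trusting any single one.

```shell
# charmap_for is a hypothetical helper, not part of Cygwin or this thread.
# It reports the character set that a given locale name selects.
# LC_ALL is used because it overrides LANG and every LC_* variable.
charmap_for() {
    LC_ALL="$1" locale charmap 2>/dev/null
}

# Compare the C locale, the intended setting, and the copy/paste typo.
for name in C en_US.ISO-8859-1 "LANG=en_US.ISO-8859-1"; do
    printf '%s -> %s\n' "$name" "$(charmap_for "$name")"
done
```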
I think there is still a bug here, though. If I set LANG=C, shouldn't it
just NOT do any encoding, and thus work? If I do this on Linux, it works.
If I use a cygwin-compiled app, it also works.
-Edward
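
For anyone reproducing this, it helps to look at the raw bytes involved. This is a minimal sketch, assuming `od` and `iconv` are available (the thread below used `piconv`; `iconv` is a substitution here), and `hex_bytes` is a hypothetical helper name. It shows only what the two test files contain, not what cygwin does internally.

```shell
# hex_bytes is a hypothetical helper: print stdin as space-separated hex bytes.
hex_bytes() {
    # Word splitting on the unquoted substitution collapses od's padding.
    echo $(od -An -tx1)
}

# The single byte in copyright.txt: 0xA9, written here as octal \251.
printf '\251' | hex_bytes                                    # a9

# The same character after ISO-8859-1 -> UTF-8 conversion (fubar.txt).
printf '\251' | iconv -f ISO-8859-1 -t UTF-8 | hex_bytes     # c2 a9
```

A lone 0xA9 is a UTF-8 continuation byte, so it is not valid UTF-8 on its own, while the two-byte sequence C2 A9 is. That would be consistent with a UTF-8 decode of the command line failing on the first file and succeeding on the second.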
> 2009/5/30 Edward Lam <edward@sidefx.com>:
>> IWAMURO Motonori wrote:
>>> The encoding of C locale is ASCII, and not ISO-8859-1.
>>> I don't think ASCII is the same as ISO-8859-1.
>>> Does it work on LANG=en_US.ISO-8859-1?
>> No, it doesn't. Mind you though, I haven't managed to get piconv to
>> recognize any of my LANG settings other than C in cygwin 1.7.
>>
>> $ export LANG=LANG=en_US.ISO-8859-1
>> $ piconv
>> perl: warning: Setting locale failed.
>> perl: warning: Please check that your locale settings:
>> LC_ALL = (unset),
>> LANG = "LANG=en_US.ISO-8859-1"
>> are supported and installed on your system.
>>
>> (... usage omitted...)
>>
>> $ ./bug arg1 "before `cat copyright.txt` after" arg3
>> 0: E:\cygwin1.7\tmp\bug.exe
>> 1: arg1
>> 2: before
>>
>> Regards,
>> -Edward
>>
>>> 2009/5/29 Edward Lam <edward@sidefx.com>:
>>>> Alexey Borzenkov wrote:
>>>>> On Thu, May 28, 2009 at 7:28 PM, Edward Lam <edward@sidefx.com> wrote:
>>>>>> PS. In case you haven't noticed, copyright.txt is not a long file. It
>>>>>> consists of a single byte, 0xA9.
>>>>> Did you try utf-8 encoding copyright.txt? Perhaps your locale is utf-8
>>>>> and the encoder fails.
>>>> How is one supposed to determine one's locale in cygwin? I do NOT have
>>>> LANG, or any of the LC environment variables set. I even tried explicitly
>>>> setting LANG=C and it still fails.
>>>>
>>>> The problem does seem to stem from the new UTF-8 support in cygwin 1.7.
>>>> However, I think something is going on here that is unexpected because
>>>> trying something similar on Linux has no problems. To confirm that it was
>>>> a UTF-8 related problem, let me repeat the steps slightly differently
>>>> again.
>>>> Here we assume that I've already got bug.exe compiled which simply prints
>>>> out its arguments.
>>>>
>>>> $ export LANG=C
>>>> $ ./bug arg1 "before `cat copyright.txt` after" arg3
>>>> 0: E:\cygwin1.7\tmp\bug.exe
>>>> 1: arg1
>>>> 2: before
>>>>
>>>> *Notice that argc is 3 when it should be 4!*
>>>>
>>>> $ piconv -f iso-8859-1 -t utf8 < copyright.txt > fubar.txt
>>>> $ ./bug arg1 "before `cat fubar.txt` after" arg3
>>>> 0: E:\cygwin1.7\tmp\bug.exe
>>>> 1: arg1
>>>> 2: before © after
>>>> 3: arg3
>>>>
>>>> *So now everything works because I converted the character into UTF-8.*
>>>>
>>>> I think what this points to is some form of invalid source encoding of
>>>> the command line argument when spawning NATIVE applications.
>>>>
>>>> Here's what happens when I try to compile bug.c using cygwin's gcc:
>>>>
>>>> $ gcc bug.c -o bug-gcc.exe
>>>> $ ./bug-gcc arg1 "before `cat copyright.txt` after" arg3
>>>> 0: ./bug-gcc
>>>> 1: arg1
>>>> 2: before © after
>>>> 3: arg3
>>>>
>>>> So there seems to be some sort of special marshaling of the command line
>>>> arguments that only works when spawning cygwin apps, but breaks when
>>>> running under native apps.
>>>>
>>>> Regards,
>>>> -Edward
>>>>
>>>> --
>>>> Unsubscribe info: http://cygwin.com/ml/#unsubscribe-simple
>>>> Problem reports: http://cygwin.com/problems.html
>>>> Documentation: http://cygwin.com/docs.html
>>>> FAQ: http://cygwin.com/faq/