[BUG REPORT] sed -e 's/[B-D]/_/g' replaces unexpected characters

Corinna Vinschen corinna-cygwin@cygwin.com
Tue Jun 25 19:44:00 GMT 2013


On Jun 25 18:03, Corinna Vinschen wrote:
> On Jun 25 15:38, Lavrentiev, Anton (NIH/NLM/NCBI) [C] wrote:
> > > Your locale is zh_CN.UTF-8. What you're expecting is only guaranteed
> > > in the C locale:
> > 
> > I'm not quite sure it applies here. I'm using US English Windows 7.
> > 
> > LANG = 'en_US.UTF-8'
> > 
> > I get the same result:
> > 
> > $ echo abcdeABCDE | sed -e 's/[B-D]/_/g'
> > ab__eA___E
> > 
> > BUT:
> > 
> > $ echo abcdeABCDE | LANG=C sed 's/[B-D]/_/g'
> > abcdeA___E
> > 
> > This is very weird, indeed.
> > 
> > OTOH, in Linux I have the same LANG setup, yet it does work
> > correctly:
> > 
> > > echo $LANG
> > en_US.UTF-8
> > > echo abcdeABCDE | sed -e 's/[B-D]/_/g'
> > abcdeA___E
> > 
> > I believe that an en_US UTF-8 string representation for
> > "abcdeABCDE" is not any different from ASCII.
> 
> Wrong. Try this:
> 
> $ sort
> a
> b
> c
> d
> e
> A
> B
> C
> D
> E
> <Ctrl-D>
> a
> A
> b
> B
> c
> C
> d
> D
> e
> E

Which also means, AFAICS, Cygwin's sed is doing it right, Linux' sed
is doing it wrong. Yes, that puzzles me a bit at the moment, too.
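
For anyone who needs the ASCII-only behaviour regardless of the locale's collation order, here is a minimal sketch of two workarounds, assuming a POSIX shell and sed. The variable names are purely illustrative and not part of anything above: the first command forces the C locale for the sed call only, the second lists the characters explicitly instead of using a range.

```shell
# Illustrative input, same string as in the report.
input=abcdeABCDE

# Workaround 1: evaluate the range [B-D] in the C locale, where ranges
# follow byte values. LC_ALL overrides LANG and all other LC_* variables.
c_locale=$(printf '%s\n' "$input" | LC_ALL=C sed -e 's/[B-D]/_/g')

# Workaround 2: avoid the range and list the characters explicitly, so
# the result no longer depends on the collation order of the locale.
explicit=$(printf '%s\n' "$input" | sed -e 's/[BCD]/_/g')

printf '%s\n' "$c_locale" "$explicit"
```

Both commands should print abcdeA___E, matching the LANG=C result quoted above.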

Corinna
-- 
Corinna Vinschen Please, send mails regarding Cygwin to
Cygwin Maintainer cygwin AT cygwin DOT com
Red Hat
--
Problem reports: http://cygwin.com/problems.html
FAQ: http://cygwin.com/faq/
Documentation: http://cygwin.com/docs.html
Unsubscribe info: http://cygwin.com/ml/#unsubscribe-simple

