1.3.18: BUG: Piping DOS files to grep (v2.5) doesn't work properly

Stacey Sheldon ssheldon@catena.com
Thu Jan 16 06:31:00 GMT 2003


Mailing list search didn't find this, nor does it appear
in the FAQ... hopefully this isn't old news to all of you.
Files read from a pipe are treated differently by grep
than files read directly. This results in some unexpected
(by me) behaviour when using grep on files which use
a DOS line end (cr/nl). This looks like a bug to me.
I'd expect the following commands to have equivalent
results:
 grep myregex blah
 grep myregex < blah
 cat blah | grep myregex
They are equivalent when the regular file blah uses
Unix line ends, but they differ for a file blahdos which
uses DOS line ends. It appears to me as though grep
is treating its input as binary when reading from a pipe,
but correctly using "undossify_input()" in other cases.
Here is an example. I've created two files, blah (nl line-end)
and blahdos (cr/nl line-end).
 $ cat blah
 foobarTest
 $ od -Ax -a blah
 000000 f o o b a r T e s t nl
 00000b
 $ od -Ax -a blahdos
 000000 f o o b a r T e s t cr nl
 00000c
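For anyone who wants to reproduce this, here is a minimal sketch of one way to create the two test files. It assumes a POSIX shell with printf and wc available; the file names are the ones used in this report, and the original files may have been created some other way.

```shell
# Recreate the two test files (names taken from the report above).
printf 'foobarTest\n'   > blah      # Unix line end (nl)
printf 'foobarTest\r\n' > blahdos   # DOS line end (cr/nl)

# The byte counts should be 11 and 12, matching the final od offsets
# 00000b and 00000c shown above.
wc -c blah blahdos
```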
These files should match the regex 'Test$' in all cases,
but grep on blahdos fails for this case:
 $ cat blahdos | grep 'Test$'
 $
And here's why (note the -v to invert the match so we have
something to look at):
 $ cat blahdos | grep -v 'Test$' | od -Ax -a
 000000 f o o b a r T e s t cr nl
 00000c
There's still a cr/nl on the output which wouldn't be there if
grep had interpreted its input as having DOS line ends. Here's
what a successful grep of the UNIX line end file looks like:
 $ cat blah | grep 'Test$' | od -Ax -a
 000000 f o o b a r T e s t nl
 00000b
In fact, if I read the blahdos file in any other way except through
a pipe, it successfully matches (note the stripped out cr on the output):
 $ grep 'Test$' blahdos | od -Ax -a
 000000 f o o b a r T e s t nl
 00000b
 $ grep 'Test$' < blahdos | od -Ax -a
 000000 f o o b a r T e s t nl
 00000b
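The three ways of feeding a file to grep can be compared side by side with a small script. This is only a sketch: the helper name check_forms is made up for illustration, and the results depend on the platform and grep build. With the setup described in this report, only the pipe form on blahdos would be expected to report "no match"; on a system where grep does no DOS line-end handling at all, all three forms would fail on blahdos.

```shell
# check_forms FILE REGEX
# Hypothetical helper: runs grep on FILE in the three forms discussed
# above and reports whether each one matched.
check_forms() {
    file=$1
    regex=$2

    if grep "$regex" "$file" > /dev/null; then r="match"; else r="no match"; fi
    echo "$file as argument: $r"

    if grep "$regex" < "$file" > /dev/null; then r="match"; else r="no match"; fi
    echo "$file via redirect: $r"

    if cat "$file" | grep "$regex" > /dev/null; then r="match"; else r="no match"; fi
    echo "$file via pipe: $r"
}

# Create the test files so the sketch is self-contained.
printf 'foobarTest\n'   > blah
printf 'foobarTest\r\n' > blahdos

check_forms blah    'Test$'
check_forms blahdos 'Test$'
```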
Just in case you might think that this has something to do with cat
(I did), here's the output of cat for each file:
 $ cat blah | od -Ax -a
 000000 f o o b a r T e s t nl
 00000b
 $ cat blahdos | od -Ax -a
 000000 f o o b a r T e s t cr nl
 00000c
Using head instead of cat gives the same results as well, just to 
completely remove cat from the picture.
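A possible workaround in the meantime, which I offer as an assumption rather than a fix, is to strip the carriage returns before grep sees the data. Note that tr removes every cr in the stream, not only those directly before a nl, so this is only suitable for plain text.

```shell
# Create a DOS-style test file so the sketch is self-contained.
printf 'foobarTest\r\n' > blahdos

# Delete all carriage returns, then grep the result through the pipe.
# The line now ends in a bare nl, so 'Test$' can match.
tr -d '\r' < blahdos | grep 'Test$'
```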
I'm currently running these versions of tools on win2k:
 cygwin 1.3.18-1
 textutils 2.0.21 (cat, od, head)
 grep 2.5
 bash 2.05b.0(8)-release
I also tried this out with cygwin 1.3.17-1 with identical results.
If you need any further information, please cc me directly since I
don't read the mailing lists very often.
Stacey.
--
Unsubscribe info: http://cygwin.com/ml/#unsubscribe-simple
Bug reporting: http://cygwin.com/bugs.html
Documentation: http://cygwin.com/docs.html
FAQ: http://cygwin.com/faq/

