Package: grep;
Reported by: Valerio Bozzolan <bozzolan.valerio <at> educ.di.unito.it>
Date: Sun, 8 Nov 2015 21:58:02 UTC
Severity: wishlist
To reply to this bug, email your comments to 21865 AT debbugs.gnu.org.
bug-grep <at> gnu.org: bug#21865; Package grep.
(Sun, 8 Nov 2015 21:58:02 GMT)
Valerio Bozzolan <bozzolan.valerio <at> educ.di.unito.it>: bug-grep <at> gnu.org.
(Sun, 8 Nov 2015 21:58:02 GMT)
Message #5 received at submit <at> debbugs.gnu.org:
From: Valerio Bozzolan <bozzolan.valerio <at> educ.di.unito.it>
To: bug-grep <at> gnu.org
Subject: Parenthesis subexpressions
Date: Sun, 8 Nov 2015 21:42:44 +0100
Hi,
(First time in a GNU mailing list!)
I've already asked this question to my local GNU/Linux user group and in #grep <at> Freenode... I'm still confused.
GNU Grep doesn't have an argument to choose the subexpression, right?
A silly example:
echo abcde | grep -o -E 'b([a-z])d'
=> "bcd"
What if I want the first subexpression ("b")? GNU Grep can't do it, can it? (Why?)
I currently use GNU Awk, or GNU Bash with ${BASH_REMATCH[$n_sub]}.
Thank you for the clarification!
--
Valerio Bozzolan
Email sent from Android (CyanogenMod) using K-9 Mail.
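Until grep grows such an option, one common way to print only the parenthesised part is a sed substitution with a backreference. The following is a minimal sketch, assuming a sed that accepts -E (GNU and BSD sed do); the helper name first_group is made up for illustration and is not a grep or sed feature.

```shell
# first_group is a hypothetical helper, not part of grep or sed.
# Usage: first_group ERE   (reads lines on stdin)
first_group() {
    # -n suppresses the default output; the trailing p prints only lines
    # where the substitution succeeded. The surrounding .* swallow the
    # rest of the line so that only capture group 1 is left.
    sed -n -E "s/.*$1.*/\\1/p"
}

printf '%s\n' abcde | first_group 'b([a-z])d'   # prints "c"
```

Unlike grep -o, this prints at most one result per line, and it breaks if the pattern contains a "/", so it is only a stopgap.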
bug-grep <at> gnu.org: bug#21865; Package grep.
(Sun, 8 Nov 2015 22:36:02 GMT)
Message #8 received at submit <at> debbugs.gnu.org:
From: Valerio Bozzolan <bozzolan.valerio <at> educ.di.unito.it>
To: bug-grep <at> gnu.org
Subject: Re: Parenthesis subexpressions
Date: Sun, 8 Nov 2015 21:49:03 +0100
Sorry... typo...
echo abcde | grep -o -E 'b([a-z])d'
=> "bcd"
Can't I choose to have only "c"?
Thanks again!
--
Valerio Bozzolan
Email sent from Android (CyanogenMod) using K-9 Mail.
http://boz.reyboz.it
bug-grep <at> gnu.org: bug#21865; Package grep.
(Mon, 9 Nov 2015 13:52:01 GMT)
Message #11 received at 21865 <at> debbugs.gnu.org:
From: Stephane Chazelas <stephane.chazelas <at> gmail.com>
To: Valerio Bozzolan <bozzolan.valerio <at> educ.di.unito.it>
Cc: 21865 <at> debbugs.gnu.org
Subject: Re: bug#21865: Parenthesis subexpressions
Date: Mon, 9 Nov 2015 13:50:46 +0000
2015-11-08 21:49:03 +0100, Valerio Bozzolan:
> Sorry... typo...
>
> echo abcde | grep -o -E 'b([a-z])d'
> => "bcd"
>
> Can't I choose to have only "c"?
[...]

That's correct, GNU grep doesn't have that capability (yet).
Recent versions of pcregrep do:

$ echo abc | pcregrep -o1 '.(.).'
b

Now, I'm not a GNU grep maintainer but I suppose the question is
how far do we want to take grep away from its original purpose
(print the lines that match a pattern which is what g/re/p
stands for).

GNU grep is already doing find's job with -r, part of sed's job
with -o/--colour.

Having said that, I do agree it's the logical continuation after
-o.

Note that for now, you can already do:

$ echo abcde | grep -o -P 'b\K[a-z](?=d)'
c

--
Stephane
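The \K and lookahead workaround relies on -o printing only what the match officially spans: \K drops everything matched so far from the reported match, and (?=d) checks for a following "d" without consuming it. Here is a small sketch, assuming a GNU grep built with PCRE support (-P is not available in every grep); the second input string is made up for illustration.

```shell
# Requires a grep with PCRE support (-P); treat this as an assumption.
# \K resets the start of the reported match; (?=d) is a lookahead that
# must match but is not included in the output.
printf '%s\n' abcde | grep -o -P 'b\K[a-z](?=d)'

# -o reports every match on a line, each on its own output line.
printf '%s\n' 'bcd bxd' | grep -o -P 'b\K[a-z](?=d)'
```

The first command prints "c"; the second prints "c" and then "x". The pattern has to be restructured by hand for each case, which is why a group-selecting option would be more convenient.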
bug-grep <at> gnu.org: bug#21865; Package grep.
(Mon, 9 Nov 2015 16:05:01 GMT)
Message #14 received at 21865 <at> debbugs.gnu.org:
From: Valerio Bozzolan <bozzolan.valerio <at> educ.di.unito.it>
Cc: 21865 <at> debbugs.gnu.org
Subject: Re: bug#21865: Parenthesis subexpressions
Date: Mon, 9 Nov 2015 17:03:39 +0100
Thanks for agreeing with the evolution of the meaning of "-o".

Just to make you a laugh: I was reproducing egrep with $BASH_REMATCH:
https://gist.github.com/valerio-bozzolan/6787675e931dce1ba7e9

Definitely not beautiful... but really effective for me.

So something like "egrep -o $n regex" also can save the world from code similar to mine.

On 9 November 2015 14:50:46 CET, Stephane Chazelas <stephane.chazelas <at> gmail.com> wrote:
>2015-11-08 21:49:03 +0100, Valerio Bozzolan:
>> Sorry... typo...
>>
>> echo abcde | grep -o -E 'b([a-z])d'
>> => "bcd"
>>
>> Can't I choose to have only "c"?
>[...]
>
>That's correct, GNU grep doesn't have that capability (yet).
>Recent versions of pcregrep do:
>
>$ echo abc | pcregrep -o1 '.(.).'
>b
>
>Now, I'm not a GNU grep maintainer but I suppose the question is
>how far do we want to take grep away from its original purpose
>(print the lines that match a pattern which is what g/re/p
>stands for).
>
>GNU grep is already doing find's job with -r, part of sed's job
>with -o/--colour.
>
>Having said that, I do agree it's the logical continuation after
>-o.
>
>Note that for now, you can already do:
>
>$ echo abcde | grep -o -P 'b\K[a-z](?=d)'
>c
>
>
>--
>Stephane
--
Valerio Bozzolan
Email sent from Android (CyanogenMod) using K-9 Mail.
bug-grep <at> gnu.org: bug#21865; Package grep.
(Mon, 9 Nov 2015 17:28:02 GMT)
Message #17 received at 21865 <at> debbugs.gnu.org:
From: Stephane Chazelas <stephane.chazelas <at> gmail.com>
To: Valerio Bozzolan <bozzolan.valerio <at> educ.di.unito.it>
Subject: Re: bug#21865: Parenthesis subexpressions
Date: Mon, 9 Nov 2015 17:16:19 +0000
2015-11-09 17:00:36 +0100, Valerio Bozzolan:
> Thanks for agreeing with the evolution of the meaning of "-o".
>
> Just to make you a laugh: I was reproducing egrep with $BASH_REMATCH:
> https://gist.github.com/valerio-bozzolan/6787675e931dce1ba7e9
>
> Definitely not beautiful... but really effective for me.

You may want to read:

https://unix.stackexchange.com/questions/169716/why-is-using-a-shell-loop-to-process-text-considered-bad-practice
https://unix.stackexchange.com/questions/209123/understand-ifs-read-r-line
https://unix.stackexchange.com/questions/65803/why-is-printf-better-than-echo

Here, if there wasn't a pcregrep already, I'd rather do it in
perl or GNU sed than bash. Like:

perl -lne 'print for /a([a-z])c/g'

Also note that:

echo abac | pcregrep -o1 'a(.)'
b
c

> So something like "egrep -o $n regex" also can save the world from code similar to mine.

GNU grep can't add it like that as that would break backward
compatibility.

grep -o 1 regex file

is currently meant to print the occurrences of "1" in the
"regex" and "file" files.

Even adding it as:

grep -o1 regex file

would probably not be a good idea as that would mean some ad-hoc
parsing of the options (in "grep -o1 regexp", "1" would be an
argument to "-o" while in "grep -oi regexp", "i" currently is a
separate "-i" option).

So reasonably, it should probably be a separate option like -O 1.

--
Stephane
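The backward-compatibility point can be checked directly. In the sketch below, the function name demo and the file contents are made up for the demonstration. It assumes a grep that supports -o (GNU and BSD grep do). Because -o takes no argument, "1" is parsed as the pattern and "regex" and "file" as file names, while "-oi" is read as the two separate options -o and -i.

```shell
# demo is a hypothetical name; it runs in a subshell and a scratch
# directory so the files called "regex" and "file" are our own.
demo() (
    tmp=$(mktemp -d) || exit 1
    trap 'rm -rf "$tmp"' EXIT
    cd "$tmp" || exit 1
    printf '%s\n' 'a1b' > regex
    printf '%s\n' 'no digits here' > file

    # "1" is the pattern; "regex" and "file" are the files searched.
    # With several files, each match is prefixed with its file name.
    grep -o 1 regex file

    # "-oi" is -o plus -i, so "A" matches the lowercase "a" in "regex".
    grep -oi A regex
)

demo
```

With the contents above this prints "regex:1" followed by "a". Any existing script relying on either behaviour would change meaning if -o started accepting a number, which is the case for a separate option.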
Paul Eggert <eggert <at> cs.ucla.edu>
to control <at> debbugs.gnu.org.
(Thu, 31 Dec 2015 08:56:02 GMT)