I've seen a bunch of examples, but I just can't seem to make this work. Can grep output only specified groupings that match? It seems like it should work, but I get either errors or no output at all.
I want to do:
pathname="/a/long/path/of/mine/2x02 - bar.mp4"
All of the examples will be the long path, one or two digits, x and then 2 digits followed by a space, a - and a filename.
I want to parse this for the 02 value: https://regex101.com/ shows that \d{1,2}x(\d\d) should have match 1 = 02 in this example.
What I can't figure out is if I have
echo "$pathname" | sed -n 's/.*\d{1,2}x\(\d\d\)/\1/p'
or
echo $pathname | grep -oP '\d{1,2}x(\d\d)'
I get nothing. I can do:
echo $pathname | grep -oP '(\d\d)'
but there could be cases where there are other 2 digit values in a row, like if I had
/a/long/path/of/mine/12x02 - bar.mp4
in which case I don't think the above will specify the second match, so I prefer the more specific regex... IF I can use matching groups or something. I'm trying to do this in bash on Scientific Linux 7.1.
3 Answers
As you were using grep with PCRE (-P), you can use this regex pattern:
grep -Po '\d{1,2}x\K\d{2}(?= )' <<<"$pathname"
\d{1,2}x will match one or two digits followed by x, then \K will discard everything matched so far, \d{2} will match exactly two digits, and the zero-width positive lookahead (?= ) ensures that there is a space after the two digits.
So this should fulfill your requirements.
Example :
$ grep -Po '\d{1,2}x\K\d{2}(?= )' <<<'/a/long/path/of/mine/2x02 - bar.mp4'
02
$ grep -Po '\d{1,2}x\K\d{2}(?= )' <<<'/a/long/path/of/mine/34x12 - bar.mp4'
12
$ grep -Po '\d{1,2}x\K\d{2}(?= )' <<<'/a/long/path/of/mine/0x1 - bar.mp4'
## No match
$ grep -Po '\d{1,2}x\K\d{2}(?= )' <<<'/a/long/path/of/mine/00x1 - bar.mp4'
## No match
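To apply the same pattern to several paths, the grep call can be wrapped in a small function. This is only a sketch: the name episode_of and the second sample path are made up for illustration, and it assumes a GNU grep built with PCRE support (-P). It feeds grep through printf and a pipe, which behaves the same as the here-string above.

```shell
# episode_of is a hypothetical helper name, not part of grep or bash.
# It prints the two digits after "x", or nothing if the path does not match.
episode_of() {
    printf '%s\n' "$1" | grep -Po '\d{1,2}x\K\d{2}(?= )'
}

# Sample paths; the second one is invented for this example.
for f in '/a/long/path/of/mine/2x02 - bar.mp4' \
         '/a/long/path/of/mine/12x07 - baz.mp4'; do
    printf '%s\n' "$(episode_of "$f")"
done
# prints 02, then 07
```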
-
I think part of my confusion was not knowing about doing <<< rather than feeding grep from echo | ... The other info about regex is very helpful also, so I'm marking this as the answer. – jmp242, Sep 10, 2015 at 12:29
Using sed
With sed in basic mode, the braces need to be escaped:
$ echo "$pathname" | sed -n 's/.*[[:digit:]]\{1,2\}x\([[:digit:]][[:digit:]]\).*/\1/p'
02
For greater portability, I used [[:digit:]] in place of \d. I also added .* to the end so as to remove the trailing text.
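If escaping the braces and parentheses feels noisy, the same substitution can probably be written with extended regular expressions. The sketch below assumes a sed that accepts -E (recent GNU sed and BSD sed do; older GNU versions may need -r instead). The variable name episode is just for illustration.

```shell
pathname='/a/long/path/of/mine/2x02 - bar.mp4'

# With -E, {1,2} and ( ) are regex operators without backslashes.
# -n plus the p flag prints only lines where the substitution succeeded.
episode=$(printf '%s\n' "$pathname" |
    sed -nE 's/.*[[:digit:]]{1,2}x([[:digit:]]{2}).*/\1/p')

printf '%s\n' "$episode"
# prints 02
```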
Using grep -P
grep -P
supports a look-behind feature but the look-behind text has to be of fixed length. So, we can look for a single digit preceding the x
preceding the two digits that we want to display:
$ echo "$pathname" | grep -oP '(?<=\dx)(\d\d)'
02
Alternate path
Both of the above also work with the alternate path:
$ echo '/a/long/path/of/mine/12x02 - bar.mp4' | grep -oP '(?<=\dx)(\d\d)'
02
$ echo '/a/long/path/of/mine/12x02 - bar.mp4' | sed -n 's/.*[[:digit:]]\{1,2\}x\([[:digit:]][[:digit:]]\).*/\1/p'
02
Using just a POSIX shell
p=$pathname
p=${p##*/}
p=${p#*x}
p=${p%% *}
echo "$p"
#or on one line
p=${pathname##*/};p=${p#*x};p=${p%% *};echo "$p"
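For anyone unsure what each expansion does, here is the same idea as a commented function. It is a sketch only: the name episode_from_path is hypothetical, and it assumes the file name always has the form "<digits>x<digits> - name" with no earlier x in the file name itself.

```shell
# episode_from_path is a hypothetical name. The body runs in a subshell
# ( ... ) so that p does not leak into the calling shell.
episode_from_path() (
    p=$1
    p=${p##*/}   # drop the longest prefix ending in "/": leaves the file name
    p=${p#*x}    # drop the shortest prefix ending in "x": leaves "12 - bar.mp4"
    p=${p%% *}   # drop the longest suffix starting at a space: leaves "12"
    printf '%s\n' "$p"
)

episode_from_path '/a/long/path/of/mine/2x12 - bar.mp4'
# prints 12
```

Unlike the regex versions, this does no validation: a file name without an x or a space simply comes back partly or wholly unchanged.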
-
That was cool, but it just gives the last digit. I.e.: pathname="/a/long/path/of/mine/2x12 - bar.mp4" still gives me "2" rather than "12". – jmp242, Sep 10, 2015 at 12:26
-
@jmp242 I made an edit to meet your requirement. – fd0, Sep 10, 2015 at 13:14
egrep (or grep -E, which is the same) to enable some of the regexp functionality.
-
-E) regular expressions have "no difference in available functionality". Also, the OP is already using something more powerful than -E. He is using grep -P, which means that his grep supports Perl-style regexes.