Extract substring using python re.match

I have a string such as

sg_ts_feature_name_01_some_xyz

From this, I want to extract the two words that come after the prefix sg_ts, together with the underscore separating them.

The result should be

feature_name

This regex,

import re

st = 'sg_ts_my_feature_01'
a = re.match('sg_ts_([a-zA-Z_]*)_*', st)
print(a.group())

returns,

sg_ts_my_feature_

whereas I expect

my_feature

asked Sep 26, 2015 at 9:12

user2879704

Have a look at this demo.

– Wiktor Stribiżew, commented Sep 26, 2015 at 9:19
stribizhev is too humble to put his best answer as just a comment and leave without traces....

– user2879704, commented Sep 26, 2015 at 9:24
No, I just was looking after my 2 children, I have no time to write a full answer. Glad you could solve your issue with others' help. Have a great weekend.

– Wiktor Stribiżew, commented Sep 26, 2015 at 9:59

2 Answers

The problem is that you are asking for the whole match, not just the capture group. From the manual:

group([group1, ...]) Returns one or more subgroups of the match. If there is a single argument, the result is a single string; if there are multiple arguments, the result is a tuple with one item per argument. Without arguments, group1 defaults to zero (the whole match is returned). If a groupN argument is zero, the corresponding return value is the entire matching string; if it is in the inclusive range [1..99], it is the string matching the corresponding parenthesized group.

You called a.group(), which is equivalent to a.group(0) and returns the whole match. Calling a.group(1) instead returns only the text captured by the parentheses.
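
As a minimal sketch of the difference, using the pattern and string from the question (the names m, whole and captured are only for illustration):

```python
import re

st = 'sg_ts_my_feature_01'
m = re.match(r'sg_ts_([a-zA-Z_]*)_*', st)

whole = m.group(0)     # same as m.group(): the entire matched text
captured = m.group(1)  # only the text matched inside the parentheses

print(whole)     # sg_ts_my_feature_
print(captured)  # my_feature_
```

Note that the captured group still ends with an underscore, because the greedy [a-zA-Z_]* consumes it before the trailing _* gets a chance to.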

answered Sep 26, 2015 at 9:23

msw

You can ask for the group surrounded by the parentheses, 'a.group(1)', which returns

'my_feature_'

In addition, if your string always has this form, you could anchor the pattern with the end-of-string character $ and make the inner match lazy instead of greedy (so it doesn't swallow the trailing _).

a = re.match(r'sg_ts_([a-zA-Z_]*?)[_0-9]*$', st)
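
A short sketch of how this behaves on both strings from the question. The helper names feature_lazy and first_two_words are hypothetical, and the second pattern is not part of this answer: it is one possible alternative, assuming each "word" consists only of ASCII letters.

```python
import re

def feature_lazy(s):
    # Pattern from this answer: lazy group, then only underscores/digits up to the end.
    m = re.match(r'sg_ts_([a-zA-Z_]*?)[_0-9]*$', s)
    return m.group(1) if m else None

def first_two_words(s):
    # Assumed alternative: capture exactly two letter-only words joined by one underscore.
    m = re.match(r'sg_ts_([a-zA-Z]+_[a-zA-Z]+)', s)
    return m.group(1) if m else None

print(feature_lazy('sg_ts_my_feature_01'))                # my_feature
print(feature_lazy('sg_ts_feature_name_01_some_xyz'))     # None
print(first_two_words('sg_ts_feature_name_01_some_xyz'))  # feature_name
```

The anchored pattern only matches when nothing but underscores and digits follows the name, so it returns no match for the longer string at the top of the question; the two-word pattern handles both inputs under the stated assumption.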

edited Sep 26, 2015 at 9:32

answered Sep 26, 2015 at 9:20

Steve
