I'm trying to split the following string using a regular expression:
Friday 1Friday 11 JAN 11
The output I want to achieve is:
['Friday 1', 'Friday 11', ' JAN 11']
My snippet so far is not producing the desired results:
>>> import re
>>> p = re.compile(r'(Sunday|Monday|Tuesday|Wednesday|Thursday|Friday|Saturday)\s*\d{1,2}')
>>> filter(None, p.split('Friday 1Friday 11 JAN 11'))
['Friday', 'Friday', ' JAN 11']
What am I doing wrong with my regex?
asked Feb 14, 2011 at 18:33
Jared Knipp
3 Answers
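The problem is the capturing parentheses. When a pattern contains a capturing group, re.split puts the text of that group into the result instead of the whole match, and your group covers only the day name. The syntax (?:...) makes a group non-capturing, so make the alternation non-capturing and wrap the whole pattern in a single capturing group. Try: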
p = re.compile(r'((?:Friday|Saturday)\s*\d{1,2})')
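To see why the grouping matters, here is a minimal sketch comparing the three variants discussed in this thread. It assumes Python 3, so a list comprehension stands in for the filter(None, ...) idiom from the question; the helper split_nonempty and the variable names are only illustrative.

```python
import re

text = 'Friday 1Friday 11 JAN 11'
days = r'Sunday|Monday|Tuesday|Wednesday|Thursday|Friday|Saturday'

def split_nonempty(pattern, s):
    """Split s on pattern and drop the empty strings re.split leaves behind."""
    return [part for part in re.split(pattern, s) if part]

# Only the day name is captured, so only the day name survives the split.
only_day = split_nonempty(r'(' + days + r')\s*\d{1,2}', text)

# The whole match is captured; the inner alternation is non-capturing.
whole_match = split_nonempty(r'((?:' + days + r')\s*\d{1,2})', text)

# Two nested capturing groups: every match contributes both groups.
nested = split_nonempty(r'((' + days + r')\s*\d{1,2})', text)

print(only_day)     # ['Friday', 'Friday', ' JAN 11']
print(whole_match)  # ['Friday 1', 'Friday 11', ' JAN 11']
print(nested)       # ['Friday 1', 'Friday', 'Friday 11', 'Friday', ' JAN 11']
```

The last variant shows what happens when the inner parentheses stay capturing: re.split returns the text of every group, so each match appears twice.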
answered Feb 14, 2011 at 18:48
scoffey
2 Comments
Jared Knipp
That's exactly what I was after! I knew it was something small. Thanks
Jared Knipp
I was getting close with p = re.compile(r'((Friday|Saturday)\s*\d{1,2})') but didn't understand why I was getting 2 results for each group. Makes complete sense now though, it was producing the result + the group name back reference.
You can also use the 're.findall' function:
>>> val
'Friday 1Friday 11 JAN 11 '
>>> pat = re.compile(r'(\w+\s*\d*)')
>>> m = re.findall(pat, val)
>>> m
['Friday 1', 'Friday 11', 'JAN 11']
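As a rough sketch of how this differs from splitting (assuming Python 3): findall returns only the matched text, so the leading space in ' JAN 11' is not kept, and the generic \w+ pattern accepts any word, not just day names. The stricter days_only pattern below is an assumption added for comparison, not part of this answer.

```python
import re

val = 'Friday 1Friday 11 JAN 11 '

# Generic pattern from the answer: a word, optional whitespace, optional digits.
generic = re.findall(r'\w+\s*\d*', val)

# Stricter variant (illustrative assumption): only day names followed by a
# one- or two-digit number. The group is non-capturing, so findall returns
# the whole match rather than just the day name.
days_only = re.findall(r'(?:Friday|Saturday)\s*\d{1,2}', val)

print(generic)    # ['Friday 1', 'Friday 11', 'JAN 11']
print(days_only)  # ['Friday 1', 'Friday 11']
```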
om-nom-nom
answered Feb 14, 2011 at 18:52
sateesh
p = re.compile(r'(Friday\s\d+|Saturday)')
om-nom-nom
answered Feb 14, 2011 at 18:47
Asterisk