\$\begingroup\$

My regex pattern contains two repeated fragments: \d{2}\. and <p>(.*)</p>. I want to get rid of this repetition and wonder whether there is a way to loop in Python's regular expression implementation.

Note: I am not asking for help parsing an XML file. There are many great tutorials, how-tos and libraries for that. I am looking for a way to express repetition in regex patterns.

My code:

import re
# Raw string so that \w and \d reach the regex engine unchanged.
pattern = r'''
<menu>
<day>\w{2} (\d{2}\.\d{2})\.</day>
<description>
<p>(.*)</p>
<p>(.*)</p>
<p>(.*)</p>
</description>
'''
my_example_string = '''
<menu>
<day>Mi 03.04.</day>
<description>
<p>Knoblauchcremesuppe</p>
<p>Rindsbraten "Esterhazy" (Gem&uuml;serahmsauce)</p>
<p>mit H&ouml;rnchen und Salat</p>
</description>
</menu>
'''
re.findall(pattern, my_example_string, re.MULTILINE)
asked Apr 1, 2013 at 13:38
\$\endgroup\$
  • \$\begingroup\$ Parsing XML with regex is usually wrong, what are you really trying to accomplish? \$\endgroup\$ Commented Apr 1, 2013 at 14:30
  • \$\begingroup\$ The XML is malformed, which prevents the use of lxml and XPath. I can easily retrieve the desired data, but I want to find a way to avoid these repetitions in regex patterns in general. \$\endgroup\$ Commented Apr 1, 2013 at 14:37

1 Answer

\$\begingroup\$

Firstly, just for anyone who might read this: DO NOT take this as an excuse to parse your XML with regular expressions. It is generally a really bad idea! In this case the XML is malformed, so it's the best we can do.

The looping constructs of regular expressions are quantifiers such as * and {4}, which you are already using. But this is Python, so you can construct your regular expression using Python:

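# Raw string so that \w and \d reach the regex engine unchanged.
expression = r"""
<menu>
<day>\w{2} (\d{2}\.\d{2})\.</day>
<description>
"""
for x in range(3):  # xrange on Python 2
    # One capturing group per <p> line; the newline keeps each <p> on its own line.
    expression += "<p>(.*)</p>\n"
expression += """</description>
</menu>
"""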
answered Apr 1, 2013 at 17:45
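As a rough illustration of why the pattern text has to be repeated rather than just quantified: when a quantifier is applied to a capturing group, Python's re module keeps only the last repetition of that group. The sketch below uses a shortened sample string of my own (not the original menu) and compares both variants.

```python
import re

# Shortened, made-up sample; only the <description> block matters here.
sample = (
    "<description>\n"
    "<p>Knoblauchcremesuppe</p>\n"
    "<p>Rindsbraten</p>\n"
    "<p>mit Salat</p>\n"
    "</description>\n"
)

# A quantified capturing group is overwritten on every repetition,
# so only the text of the last <p> is available afterwards.
quantified = re.search(
    r"<description>\n(?:<p>(.*)</p>\n){3}</description>", sample
)
last_only = quantified.groups()

# Repeating the pattern text itself creates three separate groups.
expanded = re.search(
    r"<description>\n" + r"<p>(.*)</p>\n" * 3 + r"</description>", sample
)
all_three = expanded.groups()

print(last_only)
print(all_three)
```

So {3} does loop, but it does not give you one capture per iteration, which is why building the pattern string in Python is the practical route here.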
\$\endgroup\$
  • \$\begingroup\$ What about expression += "<p>(.*)</p>\n" * 3 ? \$\endgroup\$ Commented Apr 1, 2013 at 18:40
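
The string-multiplication idea from this comment could be wrapped in a small helper, sketched here under some assumptions: build_menu_pattern is a hypothetical name, the pattern parts are written as raw strings with explicit \n, and the sample is the string from the question.

```python
import re

def build_menu_pattern(paragraphs=3):
    # Hypothetical helper: one capturing group per <p> line.
    head = r"<menu>\n<day>\w{2} (\d{2}\.\d{2})\.</day>\n<description>\n"
    body = r"<p>(.*)</p>\n" * paragraphs
    tail = r"</description>\n</menu>"
    return head + body + tail

my_example_string = '''
<menu>
<day>Mi 03.04.</day>
<description>
<p>Knoblauchcremesuppe</p>
<p>Rindsbraten "Esterhazy" (Gem&uuml;serahmsauce)</p>
<p>mit H&ouml;rnchen und Salat</p>
</description>
</menu>
'''

# Each match is a tuple: the date followed by one entry per <p> line.
matches = re.findall(build_menu_pattern(3), my_example_string)
print(matches)
```

Note that the number of <p> lines has to be known in advance; with a different count the pattern simply does not match.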
