I'm having trouble extracting the text of one specific element from a SOAP response. Other elements seem to work fine.
I have tried the following:
Python 3.13.3 (main, Apr 8 2025, 13:54:08) [Clang 16.0.0 (clang-1600026.6)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from lxml import etree
>>> xml = '''<soap:Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope">
... <soap:Body>
... <soap:Fault>
... <soap:Code>
... <soap:Value>soap:Sender</soap:Value>
... <soap:Subcode>
... <soap:Value xmlns:ns1="http://docs.oasis-open.org/wss/oasis-wss-wssecurity-secext-1.1.xsd">
... ns1:unauthorized
... </soap:Value>
... </soap:Subcode>
... </soap:Code>
... <soap:Reason>
... <soap:Text xml:lang="en">AccessResult: result: Access Denied | AuthenticationAsked: true |
... ErrorCode: IDP_ERROR:
... 137 | ErrorReason: null</soap:Text>
... </soap:Reason>
... <soap:Detail>
... <WebServiceFault xmlns="http://www.taleo.com/ws/integration/toolkit/2005/07">
... <code>SystemError</code>
... <message>AccessResult: result: Access Denied | AuthenticationAsked: true | ErrorCode:
... IDP_ERROR: 137 |
... ErrorReason: null</message>
... </WebServiceFault>
... </soap:Detail>
... </soap:Fault>
... </soap:Body>
... </soap:Envelope>'''
>>> root = etree.fromstring(xml)
>>> print(root)
<Element {http://www.w3.org/2003/05/soap-envelope}Envelope at 0x1032c4680>
>>> ns = { 'soap':'http://www.w3.org/2003/05/soap-envelope', 'ns1':'http://docs.oasis-open.org/wss/oasis-wss-wssecurity-secext-1.1.xsd' }
>>> print(root.xpath('//soap:Subcode/soap:Value',namespaces=ns)[0].text)
ns1:unauthorized
>>> print(root.xpath('//soap:Reason/soap:Text',namespaces=ns)[0].text)
AccessResult: result: Access Denied | AuthenticationAsked: true |
ErrorCode: IDP_ERROR:
137 | ErrorReason: null
>>> print(root.xpath('//soap:Detail/WebServiceFault/message',namespaces=ns)[0].text)
Traceback (most recent call last):
File "<python-input-7>", line 1, in <module>
print(root.xpath('//soap:Detail/WebServiceFault/message',namespaces=ns)[0].text)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^
IndexError: list index out of range
For some reason I can't get the text of the message element, even though the same approach works for the soap-prefixed elements.
I'd appreciate any help.
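The WebServiceFault element declares a default namespace, so it and its children belong to that namespace even though they carry no prefix. Add that namespace to the ns dict under a prefix of your choice (ns2 here) and use the prefix in the XPath: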
<WebServiceFault xmlns="http://www.taleo.com/ws/integration/toolkit/2005/07">
ns = { 'soap':'http://www.w3.org/2003/05/soap-envelope', 'ns1':'http://docs.oasis-open.org/wss/oasis-wss-wssecurity-secext-1.1.xsd', 'ns2': 'http://www.taleo.com/ws/integration/toolkit/2005/07' }
print(root.xpath('//soap:Detail/ns2:WebServiceFault/ns2:message',namespaces=ns)[0].text)
#Output
AccessResult: result: Access Denied | AuthenticationAsked: true | ErrorCode:
IDP_ERROR: 137 |
ErrorReason: null
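For comparison, here is a minimal sketch of two lookups that need no prefix mapping at all: the fully qualified `{uri}tag` form (the same notation `print(root)` shows in the question) and the `{*}` wildcard. It assumes the standard library's `xml.etree.ElementTree` on Python 3.8 or later so that it runs without extra packages, and the XML is a shortened stand-in for the real response with an abbreviated message text.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for the fault response; the message text is abbreviated.
xml = """<soap:Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope">
  <soap:Body>
    <soap:Fault>
      <soap:Detail>
        <WebServiceFault xmlns="http://www.taleo.com/ws/integration/toolkit/2005/07">
          <code>SystemError</code>
          <message>Access Denied</message>
        </WebServiceFault>
      </soap:Detail>
    </soap:Fault>
  </soap:Body>
</soap:Envelope>"""

TALEO = "http://www.taleo.com/ws/integration/toolkit/2005/07"
root = ET.fromstring(xml)

# Fully qualified names: each step is written as {namespace-uri}localname.
by_uri = root.findtext(f".//{{{TALEO}}}WebServiceFault/{{{TALEO}}}message")

# Wildcard: {*} matches the local name in any namespace (or in none).
by_wildcard = root.findtext(".//{*}WebServiceFault/{*}message")

print(by_uri)
print(by_wildcard)
```

The wildcard is convenient for one-off lookups, but it also matches same-named elements from unrelated namespaces, so the explicit prefix mapping is the stricter choice.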
ads
Unfortunately this is a response from a web service that I am calling, and I cannot make any changes to the XML.
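The prefix mapping does not require touching the XML: it exists only in the Python dict passed to the query, and the prefix name is arbitrary as long as the URI matches. A minimal sketch, assuming the standard library's `xml.etree.ElementTree` (lxml's `xpath()` accepts a `namespaces` mapping in the same way) and a shortened stand-in for the response:

```python
import xml.etree.ElementTree as ET

# The response is parsed exactly as received; nothing in it is edited.
# (Shortened stand-in; the message text is abbreviated.)
xml = """<soap:Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope">
  <soap:Body>
    <soap:Fault>
      <soap:Detail>
        <WebServiceFault xmlns="http://www.taleo.com/ws/integration/toolkit/2005/07">
          <code>SystemError</code>
          <message>Access Denied</message>
        </WebServiceFault>
      </soap:Detail>
    </soap:Fault>
  </soap:Body>
</soap:Envelope>"""

root = ET.fromstring(xml)

# "fault" is a prefix invented here, on the Python side only.
# Only the URI has to match the xmlns declaration in the document.
ns = {
    "soap": "http://www.w3.org/2003/05/soap-envelope",
    "fault": "http://www.taleo.com/ws/integration/toolkit/2005/07",
}

message = root.findtext(
    ".//soap:Detail/fault:WebServiceFault/fault:message", namespaces=ns
)

# Without a prefix the path looks for elements in no namespace and finds nothing.
unprefixed = root.find(".//soap:Detail/WebServiceFault/message", namespaces=ns)

print(message)
print(unprefixed)
```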
Instead of hard-coding the namespaces, you can collect them from each element's nsmap:
from lxml import etree


def collect_nsmap(file):
    """Collect all unique namespaces, and prevent a None prefix."""
    root = etree.parse(file)
    nsmap = {}
    for element in root.iter():
        if element.nsmap:
            for prefix, uri in element.nsmap.items():
                if prefix is None:
                    nsmap['ns0'] = uri  # assign a default prefix placeholder
                elif prefix not in nsmap:
                    nsmap[prefix] = uri
    return root, nsmap


def main(root, nsmap):
    """Parse your content."""
    text_result = root.xpath('//soap:Reason/soap:Text', namespaces=nsmap)
    print(text_result[0].text.strip())
    fault_code = root.xpath('//ns0:WebServiceFault/ns0:code', namespaces=nsmap)
    print(fault_code[0].text)


if __name__ == "__main__":
    xml_file = "soap.xml"
    root, nsmap = collect_nsmap(xml_file)
    main(root, nsmap)
Output:
AccessResult: result: Access Denied | AuthenticationAsked: true |
ErrorCode: IDP_ERROR: 137 | ErrorReason: null
SystemError
answered May 25, 2025 at 12:45
Hermann12
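One caveat with collecting namespaces automatically: if a document declares more than one default namespace, a single placeholder such as ns0 ends up bound to whichever declaration is seen last. The sketch below is not taken from either answer. It is a hedged variant that assumes the standard library's `xml.etree.ElementTree`, whose `iterparse` reports every declaration as a `start-ns` event, and it gives each unprefixed namespace its own generated prefix. The second sample document and its `urn:example:*` namespaces are hypothetical.

```python
import io
import xml.etree.ElementTree as ET


def collect_nsmap(source):
    """Return (root, nsmap), giving every default namespace its own prefix."""
    nsmap = {}
    parser = ET.iterparse(source, events=("start-ns",))
    for _event, (prefix, uri) in parser:
        if uri in nsmap.values():
            continue  # this namespace already has a usable prefix
        if not prefix or prefix in nsmap:
            # Default namespace, or a prefix already bound to another URI:
            # generate the first free placeholder ns0, ns1, ...
            n = 0
            while f"ns{n}" in nsmap:
                n += 1
            prefix = f"ns{n}"
        nsmap[prefix] = uri
    # The root element is available once the source has been read completely.
    return parser.root, nsmap


# Shortened stand-in for the fault response.
soap = b"""<soap:Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope">
  <soap:Body>
    <WebServiceFault xmlns="http://www.taleo.com/ws/integration/toolkit/2005/07">
      <code>SystemError</code>
    </WebServiceFault>
  </soap:Body>
</soap:Envelope>"""

root, nsmap = collect_nsmap(io.BytesIO(soap))
fault_code = root.findtext(".//ns0:WebServiceFault/ns0:code", namespaces=nsmap)
print(nsmap)
print(fault_code)

# Hypothetical document with two different default namespaces.
nested = b'<a xmlns="urn:example:one"><b xmlns="urn:example:two"/></a>'
_, nested_map = collect_nsmap(io.BytesIO(nested))
print(nested_map)
```

Generated prefixes depend on declaration order, so for a service whose namespaces are known in advance a hand-written mapping is easier to read and more stable.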