This issue tracker has been migrated to GitHub and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.
Created on 2011-05-20 22:02 by Kyle.Keating, last changed 2022-04-11 14:57 by admin.
| Files | | |
|---|---|---|
| File name | Uploaded | Description |
| xmlNameVerification.py | jocassid, 2013-07-28 02:49 | code to validate XML element/attribute names |
| Messages (7) | |||
|---|---|---|---|
| msg136402 | Author: Kyle Keating (Kyle.Keating) | Date: 2011-05-20 22:02 | |
While testing this library, I noticed that XML element and attribute names can be created containing malformed XML, because special characters that can break validation are not escaped or converted from their literal forms. Only the attribute values are cleaned, not the names.
For example
import xml.dom
...
doc.createElement("p></p>")
...
will just embed a pair of p tags in the xml result. I thought that the xml spec did not permit <, >, &, \n etc. in the element name or attribute name? Could I get some clarification on this, thanks!
|
|||
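For readers who want to try the fragment from the report, here is a minimal, self-contained sketch. It assumes `xml.dom.minidom` as the DOM implementation (the report only imports `xml.dom`); the root name `root` and the helper `serialize_with_tag` are made up for illustration. An implementation that validates names would raise instead, so the helper returns `None` in that case.

```python
from xml.dom import minidom


def serialize_with_tag(tag_name):
    """Create <root> with one child named tag_name and serialize it.

    Returns None if the DOM implementation rejects the name.
    """
    doc = minidom.getDOMImplementation().createDocument(None, "root", None)
    try:
        child = doc.createElement(tag_name)
    except Exception:  # assumption: a validating implementation raises here
        return None
    doc.documentElement.appendChild(child)
    return doc.toxml()


if __name__ == "__main__":
    # Where no name validation happens, the markup inside the tag name
    # is copied verbatim into the serialized document.
    print(serialize_with_tag("p></p>"))
```

If the name is accepted, the serialized text contains a literal `<p></p>` pair that was never created as an element of that name.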
| msg137142 | Author: Terry J. Reedy (terry.reedy) * (Python committer) | Date: 2011-05-28 18:35 | |
I suspect you are right, but do not know the rules, and have never used the module. There is no particular person maintaining xml.dom.X at present. Could you please fill in the ... after the import to give a complete minimal example that fails? Someone could then test it on 3.2 |
|||
| msg137487 | Author: Kyle Keating (Kyle.Keating) | Date: 2011-06-02 17:10 | |
This looks to break pretty good... I did confirm this on 3.0, I'm guessing 3.2 is the same.
import sys
import xml.dom
doc = xml.dom.getDOMImplementation().createDocument(None, 'xml', None)
doc.firstChild.appendChild(doc.createElement('element00'))
element01 = doc.createElement('element01')
element01.setAttribute('attribute', "script><![CDATA[alert('script!');]]></script>")
doc.firstChild.appendChild(element01)
element02 = doc.createElement("script><![CDATA[alert('script!');]]></script>")
doc.firstChild.appendChild(element02)
element03 = doc.createElement("new line \n")
element03.setAttribute('attribute-name','new line \n')
doc.firstChild.appendChild(element03)
print(doc.toprettyxml(indent=" "))
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
output:
<?xml version="1.0" ?>
<xml>
<element/>
<element01 attribute="script><![CDATA[alert('script!');]]></script
>"/>
<script><![CDATA[alert('script!');]]></script>/>
<new line
attribute-name="new line
"/>
</xml>
|
|||
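One way to see what the output above means to a consumer is to feed pieces of it back to a parser. The sketch below is only an illustration: the two strings are adapted from the reported output and wrapped in a root element, and the helper names are hypothetical. The first fragment is still well-formed but parses into a different tree than the one that was built; the second is rejected outright.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

# Fragments adapted from the reported output, each wrapped in a root element.
INJECTED = "<xml><script><![CDATA[alert('script!');]]></script>/></xml>"
BROKEN = '<xml><new line \nattribute-name="new line \n"/></xml>'


def child_node_names(xml_text):
    """Parse xml_text and return the node names of the root's children."""
    root = minidom.parseString(xml_text).documentElement
    return [node.nodeName for node in root.childNodes]


def is_well_formed(xml_text):
    """Return True if expat can parse xml_text, False otherwise."""
    try:
        minidom.parseString(xml_text)
    except ExpatError:
        return False
    return True


if __name__ == "__main__":
    # The single "element" created in the report comes back as a real
    # <script> element followed by a stray text node containing "/>".
    print(child_node_names(INJECTED))
    # The tag name with a space and newline does not parse at all.
    print(is_well_formed(BROKEN))
```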
| msg137488 | Author: Kyle Keating (Kyle.Keating) | Date: 2011-06-02 17:13 | |
oops, the first xml element in the output should read "<element00/>" not "<element/>" just a typo! don't get confused! |
|||
| msg193804 | Author: John Cassidy (jocassid) | Date: 2013-07-28 02:49 | |
I added the line print(str(doc)) after the call to getDOMImplementation and verified that the errors I'm seeing come from the xml.dom.minidom implementation of xml.dom. Checking minidom.py, I did not see any validation of the tagName that gets passed to createElement. http://www.w3.org/TR/xml11/#NT-NameStartChar lists the format of allowed names. Attached is a file containing the functions I was working on. My thinking is that if the tagName is not valid, a ValueError should be raised. |
|||
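The attached file is not reproduced on this page, so the following is not its content. It is a rough sketch of the kind of check described in the message, built from the NameStartChar and NameChar character ranges of the XML Name production. The names `is_xml_name` and `check_xml_name` are hypothetical.

```python
import re

# Character ranges allowed as the first character of an XML name.
_NAME_START = (
    ":A-Z_a-z"
    "\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u02FF"
    "\u0370-\u037D\u037F-\u1FFF\u200C-\u200D"
    "\u2070-\u218F\u2C00-\u2FEF\u3001-\uD7FF"
    "\uF900-\uFDCF\uFDF0-\uFFFD"
    "\U00010000-\U000EFFFF"
)
# Later characters may also be '-', '.', digits, and a few combining marks.
_NAME_CHAR = _NAME_START + r"\-.0-9" + "\u00B7\u0300-\u036F\u203F-\u2040"

_NAME_RE = re.compile("[{0}][{1}]*".format(_NAME_START, _NAME_CHAR))


def is_xml_name(name):
    """Return True if name matches the XML Name production."""
    # fullmatch (rather than match with '$') also rejects a trailing newline.
    return _NAME_RE.fullmatch(name) is not None


def check_xml_name(name):
    """Return name unchanged, or raise ValueError if it is not a valid name."""
    if not is_xml_name(name):
        raise ValueError("invalid XML name: {!r}".format(name))
    return name


if __name__ == "__main__":
    for candidate in ("element01", "attribute-name", "p></p>", "new line \n"):
        print(repr(candidate), is_xml_name(candidate))
```

Note that the Name production permits `:`; namespace-aware code would need a stricter rule, which this sketch does not attempt.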
| msg258344 | Author: Martin Panter (martin.panter) * (Python committer) | Date: 2016-01-16 00:57 | |
My limited understanding is that xml.dom and minidom are supposed to implement particular interfaces. So do these DOM interfaces specify if this validation should be done? If so, this would be a bug. Or is it just a question of whether Python should do extra validation not specified by the underlying DOM API? |
|||
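As background to the question above: the W3C DOM Core specification lists an INVALID_CHARACTER_ERR exception for createElement when the name contains an illegal character, and `xml.dom` already defines a matching `InvalidCharacterErr` class. Whether minidom should raise it is exactly what this issue leaves open. A hedged sketch of a checked wrapper follows; `create_element_checked` and the ASCII-only pattern are illustrative assumptions, and the full Name production also admits many non-ASCII characters.

```python
import re
import xml.dom
from xml.dom import minidom

# Simplified, ASCII-only approximation of an XML name (an assumption made
# for brevity; the real production also allows many non-ASCII characters).
_ASCII_NAME_RE = re.compile(r"[A-Za-z_:][A-Za-z0-9_:.\-]*")


def create_element_checked(doc, tag_name):
    """Call doc.createElement only after the name passes a basic check."""
    if _ASCII_NAME_RE.fullmatch(tag_name) is None:
        raise xml.dom.InvalidCharacterErr(
            "illegal character in element name: {!r}".format(tag_name))
    return doc.createElement(tag_name)


if __name__ == "__main__":
    doc = minidom.getDOMImplementation().createDocument(None, "xml", None)
    print(create_element_checked(doc, "element01").tagName)
    try:
        create_element_checked(doc, "p></p>")
    except xml.dom.InvalidCharacterErr as exc:
        print("rejected:", exc)
```

Raising a DOMException subclass rather than ValueError keeps the wrapper within the DOM exception hierarchy; either choice is a design decision, not something this thread settles.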
| msg283873 | Author: Pradeep (pdeep5693) | Date: 2016-12-23 08:39 | |
minidom.py in the xml package needs extra validation in setAttribute for certain special characters, depending on the attribute name. Attribute values cannot have special characters like < and > and can't be nested, as in the example below:
element01 = doc.createElement('element01')
element01.setAttribute('attribute', "script><![CDATA[alert('script!');]]></script>")
doc.firstChild.appendChild(element01)
A script shouldn't be allowed as an attribute value, and I feel it should raise an exception (ValueError). As described above, < and > shouldn't be allowed, since attributes are more like key-value pairs. Could someone tell me if this is right? If it is, then minidom.py needs this extra level of validation.
|
|||
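The first message in this thread notes that attribute values, unlike names, are already cleaned on output. A small sketch to check that for the value quoted above: it serializes the element, parses the result back, and compares the attribute value. The helper name `roundtrip_attribute` is hypothetical, and `xml.dom.minidom` is assumed as the implementation.

```python
from xml.dom import minidom

VALUE = "script><![CDATA[alert('script!');]]></script>"


def roundtrip_attribute(value):
    """Serialize an element carrying value, then parse it back.

    Returns the serialized text and the attribute value seen after parsing.
    """
    doc = minidom.getDOMImplementation().createDocument(None, "xml", None)
    element = doc.createElement("element01")
    element.setAttribute("attribute", value)
    serialized = element.toxml()
    reparsed = minidom.parseString(serialized).documentElement
    return serialized, reparsed.getAttribute("attribute")


if __name__ == "__main__":
    text, value_back = roundtrip_attribute(VALUE)
    print(text)
    # If escaping works, the parser hands back the original string.
    print(value_back == VALUE)
```

If the value survives the round trip unchanged, the markup characters were escaped as data rather than emitted as markup, which suggests the attribute value case is of a different kind than the element name case.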
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:57:17 | admin | set | github: 56338 |
| 2019-04-27 11:39:42 | scoder | unlink | issue5166 dependencies |
| 2016-12-23 08:39:36 | pdeep5693 | set | nosy: + pdeep5693; messages: + msg283873 |
| 2016-01-16 00:57:27 | martin.panter | set | versions: + Python 3.5, Python 3.6; nosy: + martin.panter; messages: + msg258344; components: + XML, - Library (Lib); stage: test needed -> |
| 2016-01-16 00:44:53 | martin.panter | link | issue5166 dependencies |
| 2013-07-28 02:49:49 | jocassid | set | files: + xmlNameVerification.py; nosy: + jocassid; messages: + msg193804 |
| 2011-06-02 17:13:17 | Kyle.Keating | set | messages: + msg137488 |
| 2011-06-02 17:10:39 | Kyle.Keating | set | messages: + msg137487 |
| 2011-05-28 18:35:29 | terry.reedy | set | nosy: + terry.reedy; messages: + msg137142; stage: test needed |
| 2011-05-20 22:02:10 | Kyle.Keating | create | |