This issue tracker has been migrated to GitHub ,
and is currently read-only.
For more information,
see the GitHub FAQs in the Python's Developer Guide.
Created on 2007年11月05日 00:58 by drtomc, last changed 2022年04月11日 14:56 by admin. This issue is now closed.
| Files | ||||
|---|---|---|---|---|
| File name | Uploaded | Description | Edit | |
| bug.py | drtomc, 2007年11月05日 01:01 | |||
| minidom_comment.patch | vdupras, 2008年02月12日 19:27 | |||
| Messages (21) | |||
|---|---|---|---|
| msg57115 - (view) | Author: Thomas Conway (drtomc) | Date: 2007年11月05日 01:01 | |
The attached script yields a non-well-formed xml document. |
|||
| msg57118 - (view) | Author: Martin v. Löwis (loewis) * (Python committer) | Date: 2007年11月05日 05:31 | |
That's not a bug in Python, but in your script. You should not pass such a string to createComment. |
|||
| msg57119 - (view) | Author: Thomas Conway (drtomc) | Date: 2007年11月05日 05:55 | |
Either it is a bug in the DOM implementation, which should reject comments containing -->, a bug in toxml() which should refuse to serialize unserializable documents, or it is a bug in the documentation. cheers, Tom |
|||
| msg57120 - (view) | Author: Martin v. Löwis (loewis) * (Python committer) | Date: 2007年11月05日 06:07 | |
It's not a bug in the DOM implementation, as createCommment does not specify an exception in this case. It may be a bug in the W3 DOM specification; please report that to the W3 consortium. It's not a bug in toxml, which should always serialize the DOM tree if possible. As for a bug in the documentation: can you propose a change to the documentation that would make you happy? |
|||
| msg57140 - (view) | Author: Thomas Conway (drtomc) | Date: 2007年11月05日 21:36 | |
Hi Martin,
You write:
It's not a bug in toxml, which should always serialize the
DOM tree if possible.
Right! In this case it is *not* possible. The generated serialization is
not a well formed XML document.
Having just consulted the DOM technical reports, I see that
createComment is specified as not generating any exceptions, so although
it would be quite a sensible place to check that one was not creating an
insane document, probably one should not do the check there. I think
you're right that this *is* a bug in DOM, and I will report it there.
Having said that, I still think that toxml should throw an exception. In
general, if toxml succeeds, I would expect to be able to parse the result.
I can propose a doco change, but I think such would only be a partial
solution. Something like the following addition to the description for
createComment
Note that comments containing the string C{-->} may make the document
unserializable.
cheers,
Tom
|
|||
| msg57143 - (view) | Author: Martin v. Löwis (loewis) * (Python committer) | Date: 2007年11月05日 22:18 | |
I'm not willing to change minidom unless there is precedence of how to deal with this case. So can you find DOM implementations in other languages that meet the DOM spec an still reject your code? |
|||
| msg57144 - (view) | Author: Thomas Conway (drtomc) | Date: 2007年11月05日 22:38 | |
Hi Martin, toxml() is not part of the DOM, so it could be changed to throw an exception. However, I suggest doing nothing, for the moment - I've posted to the dom mailing list at w3, so I'll see what wisdom we get from its members. cheers, Tom |
|||
| msg57184 - (view) | Author: Thomas Conway (drtomc) | Date: 2007年11月06日 22:37 | |
The W3 guys had some information that helps. The DOM3 Core specification contains the following No lexical check is done on the content of a comment and it is therefore possible to have the character sequence "--" (double-hyphen) in the content, which is illegal in a comment per section 2.5 of [XML 1.0]. The presence of this character sequence must generate a fatal error during serialization. This suggest that toxml is does not comply with DOM3 at any rate. cheers, Tom |
|||
| msg57187 - (view) | Author: Paul Pogonyshev (_doublep) | Date: 2007年11月07日 00:31 | |
Looks like bad design on W3 part: postponing an error happening, though it wouldn't be difficult to check right in createComment(). But I guess minidom should still be changed to conform to standard. |
|||
| msg57188 - (view) | Author: Martin v. Löwis (loewis) * (Python committer) | Date: 2007年11月07日 00:37 | |
Would anybody want to provide a patch, then? |
|||
| msg57189 - (view) | Author: Thomas Conway (drtomc) | Date: 2007年11月07日 00:56 | |
FWIW, the DOM guys considered mandating a check in createComment, but decided that the performance penalty was too great. I'm not sure I agree with them, but there you have it. Here are links to my query about the issue: http://lists.w3.org/Archives/Public/www-dom/2007OctDec/0017.html http://lists.w3.org/Archives/Public/www-dom/2007OctDec/0018.html |
|||
| msg57190 - (view) | Author: Paul Pogonyshev (_doublep) | Date: 2007年11月07日 01:08 | |
Well, it seems that allows createComment() in minidom to raise something implementation/language specific. I personally would prefer this (e.g. a ValueError) instead of raising on serialization step, as I prefer early error checks, when these checks obviously relate to some place in the code. In any way, I think that either createComment() should raise or toxml(), but generating non-well formed output is a bug. |
|||
| msg57191 - (view) | Author: Martin v. Löwis (loewis) * (Python committer) | Date: 2007年11月07日 06:26 | |
It may the intention that these functions may raise exceptions in other cases as well - but can you also support that possibility from the text of the DOM spec itself? Adding an exception there may break existing applications, which already have except clauses to deal with all "possible" exceptions; now they break if additional exceptions become possible. |
|||
| msg57192 - (view) | Author: Thomas Conway (drtomc) | Date: 2007年11月07日 07:00 | |
I think the specification is reasonably clear: createComment may not throw an exception. The serializer must throw an exception. (Personally, I think they have it round the wrong way - every time you write a serializer you have to write code to do the check; if it was in createComment, you'd only have to do it once. Never mind!) The problem of compatibility is, as always, a nasty one: whether or not to potentially break code that previously worked. In this case, I think modifying toxml (and the other serializing functions in the same library) to throw an exception is pretty unlikely to break existing code. The *only* way to trigger the error is if you call createComment with bad text. Moreover, the programs which "succeeded" before which now fail were almost certainly producing wrong output before, which if it did not break downstream processing, would at least produce strange bits of extra character data. If the library is changed to throw an exception, at least it will alert the author/maintainer to the problem. I would estimate the expected number of programs to be broken by such a change to be about 0. :-) This is certainly not the first time in the history of software development the break or not to break issue has come up. Is there a precedent in the python libraries for how to deal with this kind of issue? Can we add a quickAndBuggy = True default parameter to toxml, then in a couple of releases make it mandatory, then in a couple of further releases remove it and the old behaviour? cheers, Tom |
|||
| msg57195 - (view) | Author: Martin v. Löwis (loewis) * (Python committer) | Date: 2007年11月07日 08:55 | |
The standard procedure for an incompatible change would be to add such a parameter to 2.6, and then change the default behavior in 2.7 (or rather 3.1). Of course, people will not notice the change in 2.6, and then be surprised as much about the default change in 2.7, assuming this problem arises in some application (and this issue is proof that the problem does arise in real life - unless drtomc brought up an artificial problem). |
|||
| msg57278 - (view) | Author: Paul Pogonyshev (_doublep) | Date: 2007年11月08日 21:06 | |
I think unexpected exception in toxml() is not worse than producing unreadable output. With createComment() it is different, since you can e.g. create a temporary DOM tree only to discard it later and never serialize. |
|||
| msg62329 - (view) | Author: Virgil Dupras (vdupras) (Python triager) | Date: 2008年02月12日 19:27 | |
I wanted to start contributing to python for quite a while, so here's my very first try (cleaning out old patchless open tickets). So, whatever is the final decision on this, here's a patch. CDATASection.writexml() already raises ValueError when finding invalid data, so it seems consistent to me to extend the behavior to Comment.writexml() note: I can't add the "patch" keyword myself? |
|||
| msg62331 - (view) | Author: Thomas Conway (drtomc) | Date: 2008年02月12日 19:46 | |
On Feb 13, 2008 6:27 AM, Virgil Dupras <report@bugs.python.org> wrote: > CDATASection.writexml() already raises ValueError when finding invalid data, > so it seems consistent to me to extend the behavior to Comment.writexml() That looks fine to me. |
|||
| msg64104 - (view) | Author: Sean Reifschneider (jafo) * (Python committer) | Date: 2008年03月19日 21:26 | |
Martin: What do you think of this patch? |
|||
| msg64106 - (view) | Author: Thomas Conway (drtomc) | Date: 2008年03月19日 21:30 | |
On Thu, Mar 20, 2008 at 8:26 AM, Sean Reifschneider <report@bugs.python.org> wrote: > > Sean Reifschneider <jafo@tummy.com> added the comment: > > Martin: What do you think of this patch? Looks fine. |
|||
| msg67247 - (view) | Author: Martin v. Löwis (loewis) * (Python committer) | Date: 2008年05月23日 15:19 | |
Thanks for the patch. Committed as r63563. Because of the new exception, I won't backport the change to 2.5. |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022年04月11日 14:56:27 | admin | set | github: 45731 |
| 2008年05月23日 15:19:08 | loewis | set | status: open -> closed resolution: fixed messages: + msg67247 |
| 2008年03月19日 21:30:35 | drtomc | set | messages: + msg64106 |
| 2008年03月19日 21:26:49 | jafo | set | priority: normal assignee: loewis messages: + msg64104 nosy: + jafo |
| 2008年02月15日 17:47:25 | akuchling | set | keywords: + patch |
| 2008年02月12日 19:46:17 | drtomc | set | messages: + msg62331 |
| 2008年02月12日 19:27:53 | vdupras | set | files:
+ minidom_comment.patch nosy: + vdupras messages: + msg62329 |
| 2007年11月08日 21:06:58 | _doublep | set | messages: + msg57278 |
| 2007年11月07日 08:55:32 | loewis | set | messages: + msg57195 |
| 2007年11月07日 07:00:38 | drtomc | set | messages: + msg57192 |
| 2007年11月07日 06:26:38 | loewis | set | messages: + msg57191 |
| 2007年11月07日 01:08:23 | _doublep | set | messages: + msg57190 |
| 2007年11月07日 00:56:10 | drtomc | set | messages: + msg57189 |
| 2007年11月07日 00:37:18 | loewis | set | messages: + msg57188 |
| 2007年11月07日 00:31:03 | _doublep | set | nosy:
+ _doublep messages: + msg57187 |
| 2007年11月06日 22:37:46 | drtomc | set | messages: + msg57184 |
| 2007年11月05日 22:38:31 | drtomc | set | messages: + msg57144 |
| 2007年11月05日 22:18:24 | loewis | set | messages: + msg57143 |
| 2007年11月05日 21:36:50 | drtomc | set | messages: + msg57140 |
| 2007年11月05日 06:07:10 | loewis | set | messages: + msg57120 |
| 2007年11月05日 05:55:42 | drtomc | set | messages: + msg57119 |
| 2007年11月05日 05:31:13 | loewis | set | nosy:
+ loewis messages: + msg57118 |
| 2007年11月05日 01:01:04 | drtomc | set | files:
+ bug.py messages: + msg57115 |
| 2007年11月05日 00:58:05 | drtomc | create | |