Message60331
| Author | kingswood |
| Date | 2004-04-03.18:04:36 |
| Content |
This problem is actually more widespread than previously
indicated. Every call to self.error() has to cope with the
case where that function returns, and then recover, because
HTMLParser defines that every character in the input will be
visited exactly once. Other modules are affected as well.
In particular, feeding HTML (from spam) containing the tag
<!12345> into HTMLParser causes markupbase._scan_name to
report an error from which it now needs to recover.
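
To illustrate the failure mode, here is a minimal sketch. It is a toy class, not the real markupbase code: the names TolerantScanner and scan_name_unpatched and the simplified name pattern are assumptions made for the example. It shows what happens when error() returns and the scanning method falls off the end while its caller still expects a (name, position) pair.

```python
import re

# Simplified, assumed stand-in for the declaration-name pattern.
_NAME = re.compile(r"[a-zA-Z][-_.a-zA-Z0-9]*\s*")


class TolerantScanner:
    """Toy scanner (not the real markupbase class) whose error() returns."""

    def __init__(self, rawdata):
        self.rawdata = rawdata
        self.errors = []

    def error(self, message):
        # A lenient parser records the problem instead of raising.
        self.errors.append(message)

    def scan_name_unpatched(self, i):
        m = _NAME.match(self.rawdata, i)
        if m:
            return m.group().strip().lower(), m.end()
        self.error("expected name token")
        # No return statement here, so the caller receives None.

    def parse_declaration(self, i):
        # The caller unpacks a (name, position) pair, skipping the "<!".
        name, j = self.scan_name_unpatched(i + 2)
        return name, j


scanner = TolerantScanner("<!12345>")
try:
    scanner.parse_declaration(0)
    outcome = "ok"
except TypeError:
    # Unpacking None fails once error() no longer raises.
    outcome = "TypeError"

print(outcome, scanner.errors)
```

In this sketch the error is recorded, but the parse still aborts with a TypeError, which is why each call site needs an explicit recovery path.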
The patch in #917188 may be better than the one suggested
here, as it deals with all places where self.error() can return.
Even so, more is needed to fix the problem completely.
In markupbase.py, at least this change is necessary:
--- markupbase.py.orig Sat Apr 03 17:43:48 2004
+++ markupbase.py Sat Apr 03 18:02:48 2004
@@ -377,6 +377,8 @@
         else:
             self.updatepos(declstartpos, i)
             self.error("expected name token")
+            return None,rawdata.find(">",i)
     # To be overridden -- handlers for unknown objects
     def unknown_decl(self, data): |
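
The recovery idea in the patch above can be sketched in isolation. This is a standalone approximation under stated assumptions, not the real markupbase code: the free function scan_name, the report_error callback, and the simplified name pattern are hypothetical. After reporting the error it returns the same value as the patch: no name, plus the index of the next ">" (or -1 if there is none yet).

```python
import re

# Simplified, assumed stand-in for the declaration-name pattern.
_NAME = re.compile(r"[a-zA-Z][-_.a-zA-Z0-9]*\s*")


def scan_name(rawdata, i, report_error):
    """Return (name, next_index); on a bad name, report it and skip to '>'."""
    m = _NAME.match(rawdata, i)
    if m:
        return m.group().strip().lower(), m.end()
    report_error("expected name token")
    # Same recovery value as the patch: position of the next '>' or -1.
    return None, rawdata.find(">", i)


errors = []
result_bad = scan_name("<!12345> text", 2, errors.append)
result_good = scan_name("<!DOCTYPE html>", 2, errors.append)
result_open = scan_name("<!123", 2, errors.append)

print(result_bad, result_good, result_open)
```

Because str.find returns -1 when no ">" is present, an unterminated declaration yields a negative position, so a caller has to check for that case before using the index.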
|
History
| Date | User | Action | Args |
|---|---|---|---|
| 2008-01-20 09:56:06 | admin | link | issue736428 messages |
| 2008-01-20 09:56:06 | admin | create | |