Created on 2013-04-20 10:58 by bmispelon, last changed 2022-04-11 14:57 by admin. This issue is now closed.
| Files | ||
|---|---|---|
| File name | Uploaded | Description |
| issue17802-unittest.patch | Thomas.Barlow, 2013-04-22 19:26 | Patch for unit tests to reproduce issue 17802 |
| issue17802.diff | ezio.melotti, 2013-04-23 05:32 | |
| Messages (6) | |||
|---|---|---|---|
| msg187414 | Author: Baptiste Mispelon (bmispelon) * | Date: 2013-04-20 10:58 | |
When trying to parse the string `a&b`, the parser raises an UnboundLocalError:
{{{
>>> from html.parser import HTMLParser
>>> p = HTMLParser()
>>> p.feed('a&b')
>>> p.close()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.3/html/parser.py", line 149, in close
    self.goahead(1)
  File "/usr/lib/python3.3/html/parser.py", line 252, in goahead
    if k <= i:
UnboundLocalError: local variable 'k' referenced before assignment
}}}
Granted, the HTML is invalid, but this error looks like it might have been an oversight.
|
|||
| msg187416 | Author: R. David Murray (r.david.murray) * (Python committer) | Date: 2013-04-20 11:43 | |
Thanks for the report. Yes, that's in a complicated bit of error recovery code, and clearly you found a path through it that doesn't have a corresponding test :) |
|||
| msg187582 | Author: Thomas Barlow (Thomas.Barlow) * | Date: 2013-04-22 19:26 | |
Just adding a patch here with a few unit tests to demonstrate the issue; comments are welcome. This is my first patch, and I believe I have put the tests in the correct place. It appears the problem only occurs when there is an incomplete XML entity in which a sequence of valid characters (for an XML entity's name) leads to the end of the file. The test case for "a&b " passes, as the parser detects the space as an illegal character for the entity name. |
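The two inputs described above can be compared with a short sketch. `DataCollector` and `parse_text` are hypothetical helpers written for this illustration and are not taken from the attached patch. On an interpreter that includes the fix, `close()` should return normally for both inputs; on an affected 3.3 build, the first input is the one expected to raise.

```python
from html.parser import HTMLParser


class DataCollector(HTMLParser):
    """Hypothetical helper: records every chunk passed to handle_data."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def parse_text(text):
    """Feed text, close the parser, and return the collected data as one string."""
    parser = DataCollector()
    parser.feed(text)
    # close() forces the parser to process whatever is still buffered;
    # this is the call that raised UnboundLocalError in the report.
    parser.close()
    return "".join(parser.chunks)


if __name__ == "__main__":
    # "a&b" ends inside a possible entity name; "a&b " ends it with a space.
    for sample in ("a&b", "a&b "):
        print(repr(sample), "->", repr(parse_text(sample)))
```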
|||
| msg187608 | Author: Ezio Melotti (ezio.melotti) * (Python committer) | Date: 2013-04-23 05:32 | |
Thanks for the patch, Thomas! Starting from your work I made an updated patch that fixes the bug, but at the same time the tests revealed another possible issue: in case of invalid character references, HTMLParser still calls handle_entityref instead of reporting them as 'data'. I'm not sure which behavior is preferable, but in any case this is a separate issue. |
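A rough way to observe which handler receives an unrecognized reference is to record the callbacks. This is only a sketch under assumptions: `EventRecorder`, `record_events`, and the entity name `nosuchentity` are made up for illustration, and `convert_charrefs=False` is a constructor argument from later Python releases that keeps `handle_entityref` in use (it is not mentioned in this thread).

```python
from html.parser import HTMLParser


class EventRecorder(HTMLParser):
    """Hypothetical helper: logs which handler each piece of input reaches."""

    def __init__(self):
        # Assumption: convert_charrefs=False so that references are passed
        # to handle_entityref/handle_charref instead of being converted.
        super().__init__(convert_charrefs=False)
        self.events = []

    def handle_data(self, data):
        self.events.append(("data", data))

    def handle_entityref(self, name):
        self.events.append(("entityref", name))

    def handle_charref(self, name):
        self.events.append(("charref", name))


def record_events(text):
    """Parse text and return the list of (handler, value) pairs seen."""
    parser = EventRecorder()
    parser.feed(text)
    parser.close()
    return parser.events


if __name__ == "__main__":
    # "nosuchentity" is not a defined HTML entity name.
    for event in record_events("a &nosuchentity; b"):
        print(event)
```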
|||
| msg188222 | Author: Roundup Robot (python-dev) (Python triager) | Date: 2013-05-01 13:20 | |
New changeset 9cb90c1a1a46 by Ezio Melotti in branch '3.3':
#17802: Fix an UnboundLocalError in html.parser. Initial tests by Thomas Barlow.
http://hg.python.org/cpython/rev/9cb90c1a1a46

New changeset 20be90a3a714 by Ezio Melotti in branch 'default':
#17802: merge with 3.3.
http://hg.python.org/cpython/rev/20be90a3a714 |
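For readers who want to check their own interpreter, a regression test along these lines could be used. It is a hypothetical sketch, not the test committed in the changesets above; the class name, the helper `run_regression_tests`, and the extra inputs beyond `a&b` and `a&b ` are assumptions chosen for illustration.

```python
import unittest
from html.parser import HTMLParser


class IncompleteEntityAtEOFTest(unittest.TestCase):
    """Hypothetical regression test: close() must not raise on these inputs."""

    def test_close_does_not_raise(self):
        # Inputs that end inside, or right after, a possible entity name.
        for text in ("a&b", "a&", "&b", "a&b "):
            with self.subTest(text=text):
                parser = HTMLParser()
                parser.feed(text)
                parser.close()  # raised UnboundLocalError for "a&b" in the report


def run_regression_tests():
    """Run the test case and report whether every subtest passed."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(
        IncompleteEntityAtEOFTest
    )
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful()


if __name__ == "__main__":
    run_regression_tests()
```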
|||
| msg188224 | Author: Ezio Melotti (ezio.melotti) * (Python committer) | Date: 2013-05-01 13:25 | |
Fixed, thanks for the report! |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:57:44 | admin | set | github: 62002 |
| 2013-05-01 13:25:05 | ezio.melotti | set | status: open -> closed; resolution: fixed; messages: + msg188224; stage: patch review -> resolved |
| 2013-05-01 13:20:15 | python-dev | set | nosy: + python-dev; messages: + msg188222 |
| 2013-04-23 05:33:00 | ezio.melotti | set | files: + issue17802.diff; messages: + msg187608; stage: needs patch -> patch review |
| 2013-04-22 19:26:41 | Thomas.Barlow | set | files: + issue17802-unittest.patch; nosy: + Thomas.Barlow; messages: + msg187582; keywords: + patch |
| 2013-04-20 11:48:08 | ezio.melotti | set | assignee: ezio.melotti |
| 2013-04-20 11:43:20 | r.david.murray | set | type: crash -> behavior; versions: + Python 3.4; keywords: + easy; nosy: + r.david.murray, ezio.melotti; messages: + msg187416; stage: needs patch |
| 2013-04-20 10:58:16 | bmispelon | create | |