
This issue tracker has been migrated to GitHub and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: HTMLParser parses attributes incorrectly.
Type: behavior
Stage: resolved
Components: Library (Lib)
Versions: Python 3.2, Python 3.3, Python 2.7
process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To: ezio.melotti
Nosy List: Michael.Brooks, ezio.melotti, python-dev
Priority: high
Keywords:

Created on 2011-11-06 19:09 by Michael.Brooks, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name | Uploaded | Description
red_test.html | Michael.Brooks, 2011-11-06 19:09 | HTML incorrectly parsed by HTMLParser
Messages (7)
msg147169 - Author: Michael Brooks (Michael.Brooks) Date: 2011-11-06 19:09
Open the attached file "red_test.html" in a browser. The "bad" elements are blue because the style tag isn't parsed by any known browser. However, the HTMLParser library will incorrectly recognize them.
msg147170 - Author: Ezio Melotti (ezio.melotti) (Python committer) Date: 2011-11-06 19:14
Thanks for the report.
Could you try with the latest 2.7 and see if you can reproduce the problem? (see the devguide for instructions.)
If you can reproduce the issue even on the latest 2.7, it would be great if you could provide a patch with a test case like the ones in Lib/test/test_htmlparser.py.
msg147177 - Author: Michael Brooks (Michael.Brooks) Date: 2011-11-06 19:54
Yes, I am running the latest version, which is Python 2.7.2.
msg147179 - Author: Ezio Melotti (ezio.melotti) (Python committer) Date: 2011-11-06 19:56
I mean 2.7.3 (i.e. the development version).
You need to get a clone of Python as explained here: http://docs.python.org/devguide/ 
msg147182 - Author: Michael Brooks (Michael.Brooks) Date: 2011-11-06 20:26
Python 2.7.3 is still affected by both of these issues.
msg147615 - Author: Roundup Robot (python-dev) (Python triager) Date: 2011-11-14 16:57
New changeset 3c3009f63700 by Ezio Melotti in branch '2.7':
#1745761, #755670, #13357, #12629, #1200313: improve attribute handling in HTMLParser.
http://hg.python.org/cpython/rev/3c3009f63700
New changeset 16ed15ff0d7c by Ezio Melotti in branch '3.2':
#1745761, #755670, #13357, #12629, #1200313: improve attribute handling in HTMLParser.
http://hg.python.org/cpython/rev/16ed15ff0d7c
New changeset 426f7a2b1826 by Ezio Melotti in branch 'default':
#1745761, #755670, #13357, #12629, #1200313: merge with 3.2.
http://hg.python.org/cpython/rev/426f7a2b1826 
msg147804 - Author: Ezio Melotti (ezio.melotti) (Python committer) Date: 2011-11-17 15:25
I verified with the red_test.html you provided and now HTMLParser seems to parse everything correctly, so I'm closing this.
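
For readers who want to run a similar check themselves, the following is a minimal sketch of how one might list the attributes HTMLParser reports for each start tag. It assumes Python 3, where the module is `html.parser` (in 2.7 it is named `HTMLParser`). The class `AttributeCollector`, the helper `collect_attributes`, and the sample markup are hypothetical names and data chosen for illustration; the sample is not the contents of red_test.html.

```python
from html.parser import HTMLParser


class AttributeCollector(HTMLParser):
    """Record (tag, attrs) for every start tag the parser reports."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs as parsed by HTMLParser.
        self.seen.append((tag, attrs))


def collect_attributes(markup):
    """Feed markup to a fresh parser and return the recorded start tags."""
    parser = AttributeCollector()
    parser.feed(markup)
    parser.close()
    return parser.seen


# Hypothetical sample markup, not the contents of red_test.html.
sample = '<p style="color: red" class=note>bad</p>'
for tag, attrs in collect_attributes(sample):
    print(tag, attrs)
```

Feeding the text of a file such as red_test.html to `collect_attributes` and comparing the reported attributes with what a browser applies is one way to repeat the comparison described in this issue.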
History
Date | User | Action | Args
2022-04-11 14:57:23 | admin | set | github: 57566
2011-11-17 15:25:10 | ezio.melotti | set | status: open -> closed; versions: + Python 3.2, Python 3.3; messages: + msg147804; resolution: fixed; stage: test needed -> resolved
2011-11-14 16:57:16 | python-dev | set | nosy: + python-dev; messages: + msg147615
2011-11-14 12:44:10 | ezio.melotti | set | assignee: ezio.melotti
2011-11-07 05:45:58 | rhettinger | set | priority: normal -> high
2011-11-06 20:26:10 | Michael.Brooks | set | messages: + msg147182
2011-11-06 19:56:24 | ezio.melotti | set | messages: + msg147179
2011-11-06 19:54:06 | Michael.Brooks | set | messages: + msg147177
2011-11-06 19:14:18 | ezio.melotti | set | nosy: + ezio.melotti; messages: + msg147170; stage: test needed
2011-11-06 19:09:06 | Michael.Brooks | create |
