Created on 2011-12-11 01:58 by ezio.melotti, last changed 2022-04-11 14:57 by admin. This issue is now closed.
| Files | | |
|---|---|---|
| File name | Uploaded | Description |
| issue13576.diff | ezio.melotti, 2011-12-11 01:58 | Tests against 3.2. |
| Messages (2) | | | |
|---|---|---|---|
| msg149204 | Author: Ezio Melotti (ezio.melotti) (Python committer) | Date: 2011-12-11 01:58 | |
The attached patch adds a few tests for the handling of broken conditional comments (condcoms).

A valid condcom looks like <!--[if ie 6]>...<![endif]-->.
An invalid one looks like <![if ie 6]>...<![endif]>.

This seems to be a common mistake, and it is found even on popular sites like adobe, linkedin, deviantart.

Currently, HTMLParser calls unknown_decl() passing e.g. 'if ie 6', and if strict=True an error is raised. With strict=False no error is raised and the unknown declaration is ignored.

The HTML5 spec says:

"""
[After '<!',] If the next two characters are both U+002D HYPHEN-MINUS characters (-), consume those two characters, [...] Otherwise, this is a parse error. Switch to the bogus comment state.[0]

[Once in the bogus comment state,] Consume every character up to and including the first U+003E GREATER-THAN SIGN character (>) or the end of the file (EOF), whichever comes first. Emit a comment token whose data is the concatenation of all the characters starting from and including the character that caused the state machine to switch into the bogus comment state, up to and including the character immediately before the last consumed character (i.e. up to the character just before the U+003E or EOF character), but with any U+0000 NULL characters replaced by U+FFFD REPLACEMENT CHARACTER characters. (If the comment was started by the end of the file (EOF), the token is empty.)[1]
"""

So, IIUC, '<![if ie 6]>...<![endif]>' should emit a '[if ie 6]' comment, parse the '...' normally, and emit a '[endif]' comment.

However, I think it's fine to leave the current behavior for the following reasons:

1) backward compatibility;
2) handling broken condcoms in unknown_decl is easier than doing it in handle_comment, where all the other comments are sent;
3) probably no one cares about them anyway.

[0]: http://www.w3.org/TR/html5/tokenization.html#markup-declaration-open-state
[1]: http://www.w3.org/TR/html5/tokenization.html#bogus-comment-state
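For anyone who wants to check the behavior described above on their own interpreter, here is a minimal sketch. The class name `CondcomRecorder` and the helper `record()` are made up for this illustration and are not part of the attached patch. It assumes a current Python 3, where the `strict` argument no longer exists. Which callback receives the broken condcom may depend on the Python version: the message above describes `unknown_decl('if ie 6')`, while a parser following the HTML5 bogus-comment rule would call `handle_comment('[if ie 6]')` instead, so the sketch records both.

```python
from html.parser import HTMLParser


class CondcomRecorder(HTMLParser):
    """Hypothetical helper: record comment, unknown-declaration and data events."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_comment(self, data):
        # Regular comments (and bogus comments, on parsers that emit them).
        self.events.append(("comment", data))

    def unknown_decl(self, data):
        # Marked sections such as '<![if ie 6]>' on parsers that
        # report them as unknown declarations.
        self.events.append(("unknown_decl", data))

    def handle_data(self, data):
        self.events.append(("data", data))


def record(markup):
    """Feed markup to a fresh parser and return the recorded events."""
    parser = CondcomRecorder()
    parser.feed(markup)
    parser.close()
    return parser.events


# A valid condcom is an ordinary comment.
valid_events = record("<!--[if ie 6]>...<![endif]-->")

# A broken condcom; the callback used here is version-dependent.
broken_events = record("<![if ie 6]>...<![endif]>")

print(valid_events)
print(broken_events)
```

In both cases the '...' between the broken markers should still be reported as normal data; only the callback used for the '<![if ie 6]>' and '<![endif]>' markers is expected to differ between versions.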
|||
| msg149819 | Author: Roundup Robot (python-dev) (Python triager) | Date: 2011-12-19 05:36 | |
New changeset 9c60fd12664f by Ezio Melotti in branch '2.7':
#13576: add tests about the handling of (possibly broken) condcoms.
http://hg.python.org/cpython/rev/9c60fd12664f

New changeset 4ddbb756b602 by Ezio Melotti in branch '3.2':
#13576: add tests about the handling of (possibly broken) condcoms.
http://hg.python.org/cpython/rev/4ddbb756b602

New changeset 6452edbc5f12 by Ezio Melotti in branch 'default':
#13576: merge with 3.2.
http://hg.python.org/cpython/rev/6452edbc5f12
|||
| History | | | |
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:57:24 | admin | set | github: 57785 |
| 2011-12-19 05:46:34 | ezio.melotti | set | status: open -> closed; type: behavior -> enhancement; resolution: fixed; stage: commit review -> resolved |
| 2011-12-19 05:36:13 | python-dev | set | nosy: + python-dev; messages: + msg149819 |
| 2011-12-11 01:58:33 | ezio.melotti | create | |