Created on 2002-02-06 17:55 by glchapman, last changed 2022-04-10 16:04 by admin. This issue is now closed.
| Messages (4) | |||
|---|---|---|---|
| msg61076 | Author: Greg Chapman (glchapman) | Date: 2002-02-06 17:55 | |
The parsers defined in htmllib and sgmllib do not provide any facilities for unescaping a tag attribute which has an embedded HTML entityref (i.e., they do not provide a way to convert `"a&amp;b"` to `"a&b"`). The parser in HTMLParser unescapes all tag attributes automatically. I'm not sure that's the right approach for sgmllib and htmllib (since it might break existing code), but it seems to me that one of the modules ought to provide a function or method which can do the unescaping if needed. (I'm not familiar with either the SGML or the HTML specification, but I assume one of them mandates the escaping of, e.g., '&' in tag attributes. If so, then it seems appropriate for one of the modules to provide a function which undoes the mandated transformation.) |
|||
| msg61077 | Author: Fred Drake (fdrake) (Python committer) | Date: 2006-06-22 03:57 | |
This request is making me reconsider some other changes that have already been made on the trunk (and are now in 2.5b1). Reading this, I thought "Doesn't it already do that?" Turns out that in Python 2.4, it doesn't. Both versions handle this in parsed character data; the difference is confined to attribute values. I'd like to propose adding a Boolean configuration attribute on the parser instance that, when set, causes the parser to decode entity and character references in attribute values. By default, it would be unset. This would support backward compatibility and make it easier to get attribute value decoding. Another possibility would be to revert the new feature and add a separate method to perform the decoding. |
|||
| msg114175 | Author: Mark Lawrence (BreamoreBoy) * | Date: 2010-08-17 21:41 | |
Is anyone aware whether this was implemented in 2.5 or later, as hinted at in msg61077? If yes, please close this. If not, is there any point in putting this into 3.2? |
|||
| msg185129 | Author: Ezio Melotti (ezio.melotti) * (Python committer) | Date: 2013-03-24 11:33 | |
See also #2927. |
|||
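
For readers on a current Python 3 (3.4 or later), the conversion requested in msg61076 can be sketched with the standard library's `html.unescape` function, and the automatic attribute decoding that the thread attributes to HTMLParser can be observed with `html.parser.HTMLParser`. This is only an illustration under those assumptions, not the resolution recorded in this issue; the `AttrCollector` class and the sample markup are hypothetical names made up for the example.

```python
from html import unescape
from html.parser import HTMLParser


class AttrCollector(HTMLParser):
    """Hypothetical helper: record each start tag and the attributes reported for it."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs supplied by HTMLParser.
        self.seen.append((tag, attrs))


# Standalone function: undo the escaping on a raw attribute value.
raw_value = "a&amp;b"
decoded = unescape(raw_value)

# HTMLParser decodes entity references in attribute values
# before it calls handle_starttag.
parser = AttrCollector()
parser.feed('<a href="page?x=1&amp;y=2">link</a>')
parser.close()

print(decoded)
print(parser.seen)
```

The standalone function corresponds to the "function or method which can do the unescaping if needed" asked for in the original report, while the parser subclass shows the always-on behaviour that the report describes for HTMLParser.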
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-10 16:04:57 | admin | set | github: 36039 |
| 2013-11-18 09:54:25 | ezio.melotti | set | status: open -> closed; assignee: ezio.melotti; superseder: expose html.parser.unescape; resolution: duplicate; stage: test needed -> resolved |
| 2013-03-24 11:33:06 | ezio.melotti | set | messages: + msg185129; versions: + Python 3.4, - Python 3.2 |
| 2013-03-23 22:22:01 | ezio.melotti | set | nosy: + ezio.melotti |
| 2010-08-17 21:41:06 | BreamoreBoy | set | nosy: + BreamoreBoy; messages: + msg114175; versions: + Python 3.2, - Python 2.7 |
| 2009-02-12 20:03:12 | ajaksu2 | set | keywords: + easy; stage: test needed; versions: + Python 2.7 |
| 2002-02-06 17:55:02 | glchapman | create | |