
This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python Developer Guide.

Author thomaspinckney3
Recipients thomaspinckney3
Date 2008-05-20.05:43:48
SpamBayes Score 0.0009664701
Marked as misclassified No
Message-id <1211262235.18.0.705526453605.issue2927@psf.upfronthosting.co.za>
In-reply-to
Content
There is currently a private method inside html.parser.HTMLParser to 
unescape HTML &...;-style escapes. It would be useful to expose this to 
other users who want to unescape a piece of HTML.
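As a point of reference for readers on Python 3.4 or later, the standard library provides a public html.unescape() function, so the private parser method does not need to be called directly. A minimal usage sketch, where the sample string is made up for illustration:

```python
import html

# Hypothetical input containing named and numeric character references.
escaped = "&lt;p&gt;caf&eacute; &amp; tea&lt;/p&gt;"

# html.unescape() converts the references to the characters they stand for.
unescaped = html.unescape(escaped)
print(unescaped)
```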
Additionally, many websites don't use proper Unicode or ISO-8859-1 
encodings and accidentally use Microsoft Code Page 1252 extensions. I 
added code to map these to their appropriate Unicode values.
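The Code Page 1252 mapping described above can be sketched roughly as follows. This is not the code from the patch: the function names and the regular expression are hypothetical, and the sketch only handles numeric references (not named ones such as &amp;). It assumes that a numeric reference in the range 0x80-0x9F was meant as a CP1252 byte rather than a C1 control character.

```python
import re

# Matches decimal (&#147;) and hexadecimal (&#x93;) numeric character references.
_NUMERIC_REF = re.compile(r"&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));")


def _decode_codepoint(n):
    """Return the character for reference number n, treating 0x80-0x9F as CP1252."""
    if 0x80 <= n <= 0x9F:
        try:
            return bytes([n]).decode("cp1252")
        except UnicodeDecodeError:
            pass  # byte is undefined in CP1252; fall back to the plain code point
    try:
        return chr(n)
    except (ValueError, OverflowError):
        return "\ufffd"  # out of Unicode range: use the replacement character


def unescape_numeric_refs(text):
    """Replace numeric character references in text with their characters."""
    def replace(match):
        hex_digits, dec_digits = match.groups()
        n = int(hex_digits, 16) if hex_digits else int(dec_digits)
        return _decode_codepoint(n)

    return _NUMERIC_REF.sub(replace, text)


sample = unescape_numeric_refs("&#147;quoted&#148; &#x96; &#65;")
print(sample)
```

With this approach, &#147; and &#148; come out as curly quotation marks (U+201C, U+201D) and &#x96; as an en dash (U+2013) instead of invisible control characters.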
The unescaping logic was slightly simplified too.
This is my first Python patch submission, so please let me know if I've 
done anything wrong.
A new test case was also added for this functionality.
History
Date User Action Args
2008-05-20 05:43:55 thomaspinckney3 set spambayes_score: 0.00096647 -> 0.0009664701
    recipients: + thomaspinckney3
2008-05-20 05:43:55 thomaspinckney3 set spambayes_score: 0.00096647 -> 0.00096647
    messageid: <1211262235.18.0.705526453605.issue2927@psf.upfronthosting.co.za>
2008-05-20 05:43:53 thomaspinckney3 link issue2927 messages
2008-05-20 05:43:52 thomaspinckney3 create
