

Author eric.araujo
Recipients Hunanyan, Matt.Basta, cpalmer, eric.araujo, ezio.melotti, fantoozler, fdrake, friday, georg.brandl, gsf, momat, orsenthil, r.david.murray, yotam
Date 2011-07-27.15:12:06
Message-id <1311779527.62.0.34688829476.issue670664@psf.upfronthosting.co.za>
In-reply-to
Content
Ezio wrote:
 >>> myhp.feed('<script><p>foo</p></script>')
 data: '<p>foo' # where's the </p>?
http://www.w3.org/TR/html4/types#type-cdata says:
 Although the STYLE and SCRIPT elements use CDATA for their data
 model, for these elements, CDATA must be handled differently by user
 agents. Markup and entities must be treated as raw text and passed to
 the application as is. The first occurrence of the character sequence
 "</" (end-tag open delimiter) is treated as terminating the end of
 the element's content. In valid documents, this would be the end tag
 for the element.
So I think the example is invalid (the "<" of "</p>" inside the script should be escaped), and that HTMLParser is not buggy.
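To make the quoted rule concrete, here is a minimal sketch of it in plain Python. It is not how HTMLParser is implemented; the helper name split_html4_cdata is hypothetical, and the backslash escape in the second call is only one common way to hide "</" inside a JavaScript string literal, assumed here for illustration.

```python
def split_html4_cdata(content):
    """Split the text that follows a <script> or <style> start tag.

    Applies the HTML 4 rule quoted above: the first "</" ends the
    element's content, whatever tag name comes after it.
    Returns (raw_text, remainder).
    """
    end = content.find("</")
    if end == -1:
        # No end-tag open delimiter: everything is raw text.
        return content, ""
    return content[:end], content[end:]


# Ezio's example: the "</" of "</p>" terminates the script content.
raw, rest = split_html4_cdata("<p>foo</p></script>")
print(repr(raw))   # '<p>foo'
print(repr(rest))  # '</p></script>'

# With the "</" escaped as "<\/", the content runs up to "</script>".
raw_escaped, rest_escaped = split_html4_cdata(r"<p>foo<\/p></script>")
print(repr(raw_escaped))   # '<p>foo<\\/p>'
print(repr(rest_escaped))  # '</script>'
```

Under this reading, the data: '<p>foo' reported above is exactly what the HTML 4 text asks for.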
