DOCTYPE + SAX

jdownie jdownie at gmail.com
Sat Apr 9 19:48:09 EDT 2011


On Apr 10, 1:47 am, Alain Ketterlin <al... at dpt-info.u-strasbg.fr>
wrote:
> jdownie <jdow... at gmail.com> writes:
>
> > I'm trying to get xml.sax to interpret a file that begins with…
> >
> > <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
> > "http://www.w3.org/TR/html4/loose.dtd">
> >
> > After a while I get...
> >
> > http://www.w3.org/TR/html4/loose.dtd:31:2:error in processing
> > external entity reference
> >
> > …although…
> >
> > time curl http://www.w3.org/TR/html4/loose.dtd
> > [works]
>
> You're mistaken. There is no problem fetching the file, but there is a
> problem while parsing the file (at line 31, where you find a comment in
> an entity declaration, which is not acceptable in XML).
>
> You're trying to use HTML's SGML DTD in a XML document. Direct your
> doctype to XHTML's DTD, and everything will be fine (hopefully).
>
> BTW, your installation will probably let you use a locally cached copy
> of the DTD, instead of fetching a file at every parse. How this works
> depends somehow on the parser you use.
>
> -- Alain.

Excellent. I think I understand that. I'll look around for the xhtml
version of the html4/loose DTD and try what you suggest. Thanks very
much.
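
For reference, here is a minimal sketch of the two suggestions above with
xml.sax: the document uses the XHTML 1.0 Transitional doctype (the XHTML
counterpart of html4/loose), and an EntityResolver answers the DTD request
locally so nothing is fetched during the parse. The names LOCAL_DTDS,
LocalDTDResolver, ElementNames and parse_with_local_dtd are made up for
illustration, and the empty byte string is only a stand-in for a saved copy
of the real DTD file. With an empty stub, named entities such as &nbsp; are
not defined, so the sample document avoids them. Recent Python 3 releases
(3.7.1 and later) no longer load external entities by default, so the sketch
switches feature_external_ges on to make the resolver take part at all.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, EntityResolver, feature_external_ges
from xml.sax.xmlreader import InputSource

XHTML_DOC = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>demo</title></head>
  <body><p>hello</p></body>
</html>
"""

# Hypothetical local cache: public identifier -> DTD bytes.
# The empty stub stands in for a saved copy of the real DTD file.
LOCAL_DTDS = {
    "-//W3C//DTD XHTML 1.0 Transitional//EN": b"",
}


class LocalDTDResolver(EntityResolver):
    """Answer DTD requests from LOCAL_DTDS instead of the network."""

    def __init__(self):
        self.requests = []

    def resolveEntity(self, publicId, systemId):
        # Record what the parser asked for, then hand back local bytes.
        self.requests.append((publicId, systemId))
        source = InputSource(systemId)
        # Unknown identifiers also get an empty stub, so nothing is fetched.
        source.setByteStream(io.BytesIO(LOCAL_DTDS.get(publicId, b"")))
        return source


class ElementNames(ContentHandler):
    """Collect element names in document order."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElement(self, name, attrs):
        self.names.append(name)


def parse_with_local_dtd(data):
    parser = xml.sax.make_parser()
    # Off by default since Python 3.7.1; enabled so the resolver is consulted.
    parser.setFeature(feature_external_ges, True)
    resolver = LocalDTDResolver()
    handler = ElementNames()
    parser.setEntityResolver(resolver)
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(data))
    return handler.names, resolver.requests


if __name__ == "__main__":
    names, requests = parse_with_local_dtd(XHTML_DOC)
    print(names)
    print(requests)
```

If the entity definitions are not needed at all, leaving
feature_external_ges switched off is the simpler route, since the parser
then skips the external DTD entirely.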


