This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: xmllib unable to parse in UTF8 format
Type:
Stage: resolved
Components: XML
Versions: Python 2.7
process
Status: closed
Resolution: wont fix
Dependencies:
Superseder:
Assigned To:
Nosy List: enrico.scame, serhiy.storchaka
Priority: normal
Keywords:

Created on 2016-05-25 09:09 by enrico.scame, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Files
File name Uploaded Description
xmllib.py enrico.scame, 2016-05-25 13:14
Messages (4)
msg266322 - Author: Enrico (enrico.scame) Date: 2016-05-25 09:09
The xmllib.XMLParser seems to be unable to parse an XML file that contains Cyrillic characters.
 File "xmllib.pyc", line 172, in feed
 File "xmllib.pyc", line 268, in goahead
 File "xmllib.pyc", line 798, in syntax_error
 Error: Syntax error at line 8: illegal character in content
msg266339 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-05-25 12:36
Could you please provide a minimal reproducer? A minimal script and minimal data that expose the issue.
msg266344 - Author: Enrico (enrico.scame) Date: 2016-05-25 13:14
I have attached xmllib.py. This file is in the python23\lib folder.
The strings in the XML file are in Cyrillic.
My code:
import xmllib

class Parser(xmllib.XMLParser):
    # a simple styling engine
    def __init__(self):
        xmllib.XMLParser.__init__(self)
        self.cursupervisore = None
        self.curdata = ''
        self.elements = {'Superv':(self.starttag_superv, self.endtag_superv)
........
        }

    def load(self, file):
        while 1:
            s = file.readline()
            if not s:
                break
            self.feed(s)
        self.close()

def read_plant_tree(filexml):
    c = Parser()
    c.load(filexml)
msg266479 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2016-05-27 06:02
See also issue222587. It seems this was the reason why the xmllib module was deprecated.
Use the xml package for parsing XML (xml.etree.ElementTree, xml.dom.minidom, xml.sax, etc.).
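As an illustration of the suggestion above, here is a minimal sketch of the same line-by-line feeding loop written against xml.etree.ElementTree instead of xmllib. It is written for Python 3 and is not code from this issue: the sample document, the SupervCollector class and the read_superv_texts function are all hypothetical, and only the element name "Superv" comes from the report.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample data: only the element name "Superv" comes from the
# report; the root element, attribute and text are made up for illustration.
SAMPLE = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<Plant>\n'
    '  <Superv name="Насос">Главный узел</Superv>\n'
    '</Plant>\n'
).encode("utf-8")


class SupervCollector:
    """Parser target that records the text of every <Superv> element."""

    def __init__(self):
        self.in_superv = False
        self.curdata = []
        self.results = []

    def start(self, tag, attrib):
        # Called for every opening tag.
        if tag == "Superv":
            self.in_superv = True
            self.curdata = []

    def data(self, text):
        # Called with decoded (str) character data, possibly in pieces.
        if self.in_superv:
            self.curdata.append(text)

    def end(self, tag):
        # Called for every closing tag.
        if tag == "Superv":
            self.results.append("".join(self.curdata))
            self.in_superv = False

    def close(self):
        # The return value becomes the result of XMLParser.close().
        return self.results


def read_superv_texts(fileobj):
    """Feed a binary file object to the parser one line at a time."""
    parser = ET.XMLParser(target=SupervCollector())
    for line in fileobj:
        parser.feed(line)
    return parser.close()


texts = read_superv_texts(io.BytesIO(SAMPLE))
```

The file is opened in binary mode here (io.BytesIO stands in for open(path, "rb")) so that the parser itself honours the encoding named in the XML declaration.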
History
Date User Action Args
2022-04-11 14:58:31 admin set github: 71307
2016-05-27 06:03:07 serhiy.storchaka set status: open -> closed; stage: test needed -> resolved
2016-05-27 06:02:48 serhiy.storchaka set resolution: wont fix; messages: + msg266479
2016-05-25 13:14:10 enrico.scame set files: + xmllib.py; messages: + msg266344
2016-05-25 12:36:20 serhiy.storchaka set nosy: + serhiy.storchaka; messages: + msg266339; stage: test needed
2016-05-25 09:09:33 enrico.scame create
