[Python-checkins] CVS: python/dist/src/Lib sgmllib.py,1.30,1.31
Guido van Rossum
gvanrossum@users.sourceforge.net
Mon, 21 May 2001 13:17:19 -0700
Update of /cvsroot/python/python/dist/src/Lib
In directory usw-pr-cvs1:/tmp/cvs-serv4001
Modified Files:
sgmllib.py
Log Message:
parse_declaration(): be more lenient in what we accept. We now
basically accept <!...> where the dots can be single- or double-quoted
strings or any other character except >.
Background: I found a real-life example that failed to parse with
the old assumption: http://www.opensource.org/licenses/jabberpl.html
contains a few constructs of the form <![if !supportLists]>...<![endif]>.
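
For illustration only, here is a minimal standalone sketch of the lenient
scanning loop described above. It is not the sgmllib method itself: the
function name scan_declaration is hypothetical, the two patterns are copied
from revision 1.31 in the diff below, and the handling of ">" (returning the
index just past it) is an assumption, since those lines are not shown in
this diff.

```python
import re

# Patterns as they appear in revision 1.31 of the diff.
decldata = re.compile(r'[^>\'\"]+')
declstringlit = re.compile(r'(\'[^\']*\'|"[^"]*")\s*')


def scan_declaration(rawdata, i):
    """Scan a <!...> construct starting at rawdata[i].

    Returns the index just past the closing '>' (an assumption, see the
    lead-in), or -1 if the buffer ends before the construct is closed.
    """
    j = i + 2                      # skip the leading "<!"
    n = len(rawdata)
    while j < n:
        c = rawdata[j:j+1]
        if c == ">":
            return j + 1           # assumed: end of the declaration
        if c in "\"'":
            # a quoted string may itself contain '>'
            m = declstringlit.match(rawdata, j)
            if not m:
                return -1          # unterminated string literal
            j = m.end()
        else:
            # any run of characters other than '>' and quotes
            m = decldata.match(rawdata, j)
            if not m:
                return -1
            j = m.end()
    return -1                      # end of buffer between tokens


if __name__ == "__main__":
    sample = "<![if !supportLists]>"
    print(scan_declaration(sample + "text", 0) == len(sample))
```

With the old declname pattern, the "[" after "<!" would have raised
SGMLParseError; with decldata it is simply consumed up to the next ">".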
Index: sgmllib.py
===================================================================
RCS file: /cvsroot/python/python/dist/src/Lib/sgmllib.py,v
retrieving revision 1.30
retrieving revision 1.31
diff -C2 -r1.30 -r1.31
*** sgmllib.py 2001/04/15 13:01:41 1.30
--- sgmllib.py 2001/05/21 20:17:17 1.31
***************
*** 40,44 ****
r'(\'[^\']*\'|"[^"]*"|[-a-zA-Z0-9./:;+*%?!&$\(\)_#=~]*))?')
! declname = re.compile(r'[a-zA-Z][-_.a-zA-Z0-9]*\s*')
declstringlit = re.compile(r'(\'[^\']*\'|"[^"]*")\s*')
--- 40,44 ----
r'(\'[^\']*\'|"[^"]*"|[-a-zA-Z0-9./:;+*%?!&$\(\)_#=~]*))?')
! decldata = re.compile(r'[^>\'\"]+')
declstringlit = re.compile(r'(\'[^\']*\'|"[^"]*")\s*')
***************
*** 213,218 ****
          rawdata = self.rawdata
          j = i + 2
!         # in practice, this should look like: ((name|stringlit) S*)+ '>'
!         while 1:
              c = rawdata[j:j+1]
              if c == ">":
--- 213,218 ----
          rawdata = self.rawdata
          j = i + 2
!         n = len(rawdata)
!         while j < n:
              c = rawdata[j:j+1]
              if c == ">":
***************
*** 226,242 ****
                      return -1
                  j = m.end()
!             elif c in "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ":
!                 m = declname.match(rawdata, j)
                  if not m:
                      # incomplete or an error?
                      return -1
                  j = m.end()
!             elif i == len(rawdata):
!                 # end of buffer between tokens
!                 return -1
!             else:
!                 raise SGMLParseError(
!                     "unexpected char in declaration: %s" % `rawdata[i]`)
!         assert 0, "can't get here!"

      # Internal -- parse processing instr, return length or -1 if not terminated
--- 226,237 ----
                      return -1
                  j = m.end()
!             else:
!                 m = decldata.match(rawdata, j)
                  if not m:
                      # incomplete or an error?
                      return -1
                  j = m.end()
!         # end of buffer between tokens
!         return -1

      # Internal -- parse processing instr, return length or -1 if not terminated