[Python-checkins] CVS: python/dist/src/Include unicodeobject.h,2.20,2.21

21 May 2001 13:30:17 -0700

Update of /cvsroot/python/python/dist/src/Include
In directory usw-pr-cvs1:/tmp/cvs-serv5748/Include
Modified Files:
	unicodeobject.h 
Log Message:
This patch changes the behaviour of the UTF-16 codec family. Only the
UTF-16 codec will now interpret and remove a *leading* BOM mark.
Subsequent BOM characters are no longer interpreted or removed.
UTF-16-LE and UTF-16-BE pass all BOM mark characters through unchanged.
These changes should bring the UTF-16 codec more in line with what
the Unicode FAQ recommends with respect to BOM marks.
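
The behaviour described above can be illustrated from the Python level.
This is only a sketch: it assumes a present-day Python 3 interpreter,
whose "utf-16", "utf-16-le" and "utf-16-be" codecs behave as this log
message describes, and the byte strings and variable names are made up
for the example rather than taken from the patch.

```python
# Illustrative input data (not part of the patch).
le_data = b"\xff\xfeA\x00\xff\xfeB\x00"  # LE BOM, "A", a second BOM, "B"
be_data = b"\xfe\xff\x00A"               # BE BOM, "A"

# "utf-16": only the leading BOM is interpreted and removed.
# The later BOM stays in the result as U+FEFF.
auto_le = le_data.decode("utf-16")
auto_be = be_data.decode("utf-16")

# "utf-16-le" / "utf-16-be": no BOM interpretation at all, so every
# BOM, including the leading one, is passed through as U+FEFF.
fixed_le = le_data.decode("utf-16-le")
fixed_be = be_data.decode("utf-16-be")

print(ascii(auto_le))   # 'A\ufeffB'
print(ascii(auto_be))   # 'A'
print(ascii(fixed_le))  # '\ufeffA\ufeffB'
print(ascii(fixed_be))  # '\ufeffA'
```

Callers that use the fixed-endian codecs on data that may start with a
BOM therefore have to strip the leading U+FEFF themselves.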
Index: unicodeobject.h
===================================================================
RCS file: /cvsroot/python/python/dist/src/Include/unicodeobject.h,v
retrieving revision 2.20
retrieving revision 2.21
diff -C2 -r2.20 -r2.21
*** unicodeobject.h	2001/04/23 14:44:21	2.20
--- unicodeobject.h	2001/05/21 20:30:15	2.21
***************
*** 460,467 ****
 	*byteorder == 1: big endian

! and then switches according to all BOM marks it finds in the input
! data. BOM marks are not copied into the resulting Unicode string.
! After completion, *byteorder is set to the current byte order at
! the end of input data.

 If byteorder is NULL, the codec starts in native order mode.
--- 460,468 ----
 	*byteorder == 1: big endian

! In native mode, the first two bytes of the stream are checked for a
! BOM mark. If found, the BOM mark is analysed, the byte order
! adjusted and the BOM skipped. In the other modes, no BOM mark
! interpretation is done. After completion, *byteorder is set to the
! current byte order at the end of input data.

 If byteorder is NULL, the codec starts in native order mode.