[Python-checkins] CVS: python/dist/src/Misc unicode.txt,3.7,3.8

Thu, 8 Jun 2000 10:51:35 -0700

Update of /cvsroot/python/python/dist/src/Misc
In directory slayer.i.sourceforge.net:/tmp/cvs-serv8496/Misc
Modified Files:
	unicode.txt 
Log Message:
Marc-Andre Lemburg <mal@lemburg.com>:
Updated to version 1.5. Includes typo fixes by Andrew Kuchling
and a new section on the default encoding.
Index: unicode.txt
===================================================================
RCS file: /cvsroot/python/python/dist/src/Misc/unicode.txt,v
retrieving revision 3.7
retrieving revision 3.8
diff -C2 -r3.7 -r3.8
*** unicode.txt	2000/05/09 19:58:19	3.7
--- unicode.txt	2000/06/08 17:51:33	3.8
***************
*** 20,28 ****
 The latest version of this document is always available at:

! http://starship.skyport.net/~lemburg/unicode-proposal.txt

 Older versions are available as:

! http://starship.skyport.net/~lemburg/unicode-proposal-X.X.txt

--- 20,28 ----
 The latest version of this document is always available at:

! http://starship.python.net/~lemburg/unicode-proposal.txt

 Older versions are available as:

! http://starship.python.net/~lemburg/unicode-proposal-X.X.txt

***************
*** 102,106 ****
 needed, but if you include Latin-1 characters not defined in ASCII, it
 may well be worthwhile including a hint since people in other
! countries will want to be able to read you source strings too.

--- 102,106 ----
 needed, but if you include Latin-1 characters not defined in ASCII, it
 may well be worthwhile including a hint since people in other
! countries will want to be able to read your source strings too.

***************
*** 170,174 ****

 In containment tests ('a' in u'abc' and u'a' in 'abc') both sides
! should be coerced to Unicode before applying the test. Errors occuring
 during coercion (e.g. None in u'abc') should not be masked.

--- 170,174 ----

 In containment tests ('a' in u'abc' and u'a' in 'abc') both sides
! should be coerced to Unicode before applying the test. Errors occurring
 during coercion (e.g. None in u'abc') should not be masked.

***************
*** 185,189 ****

 All string methods should delegate the call to an equivalent Unicode
! object method call by converting all envolved strings to Unicode and
 then applying the arguments to the Unicode method of the same name,
 e.g.
--- 185,189 ----

 All string methods should delegate the call to an equivalent Unicode
! object method call by converting all involved strings to Unicode and
 then applying the arguments to the Unicode method of the same name,
 e.g.
***************
*** 200,204 ****
 -----------

! UnicodeError is defined in the exceptions module as subclass of
 ValueError. It is available at the C level via PyExc_UnicodeError.
 All exceptions related to Unicode encoding/decoding should be
--- 200,204 ----
 -----------

! UnicodeError is defined in the exceptions module as a subclass of
 ValueError. It is available at the C level via PyExc_UnicodeError.
 All exceptions related to Unicode encoding/decoding should be
***************
*** 269,273 ****

 'utf-8': 8-bit variable length encoding
! 'utf-16': 16-bit variable length encoding (litte/big endian)
 'utf-16-le': utf-16 but explicitly little endian
 'utf-16-be': utf-16 but explicitly big endian
--- 269,273 ----

 'utf-8': 8-bit variable length encoding
! 'utf-16': 16-bit variable length encoding (little/big endian)
 'utf-16-le': utf-16 but explicitly little endian
 'utf-16-be': utf-16 but explicitly big endian
***************
*** 285,289 ****

 All other encodings such as the CJK ones to support Asian scripts
! should be implemented in seperate packages which do not get included
 in the core Python distribution and are not a part of this proposal.

--- 285,289 ----

 All other encodings such as the CJK ones to support Asian scripts
! should be implemented in separate packages which do not get included
 in the core Python distribution and are not a part of this proposal.

***************
*** 325,329 ****
 def encode(self,input,errors='strict'):

! """ Encodes the object intput and returns a tuple (output
 object, length consumed).

--- 325,329 ----
 def encode(self,input,errors='strict'):

! """ Encodes the object input and returns a tuple (output
 object, length consumed).

***************
*** 332,336 ****

 The method may not store state in the Codec instance. Use
! SteamCodec for codecs which have to keep state in order to
 make encoding/decoding efficient.

--- 332,336 ----

 The method may not store state in the Codec instance. Use
! StreamCodec for codecs which have to keep state in order to
 make encoding/decoding efficient.

***************
*** 351,355 ****

 The method may not store state in the Codec instance. Use
! SteamCodec for codecs which have to keep state in order to
 make encoding/decoding efficient.

--- 351,355 ----

 The method may not store state in the Codec instance. Use
! StreamCodec for codecs which have to keep state in order to
 make encoding/decoding efficient.

***************
*** 491,495 ****
 .readline() method -- there is currently no support for
 line breaking using the codec decoder due to lack of line
! buffering. Sublcasses should however, if possible, try to
 implement this method using their own knowledge of line
 breaking.
--- 491,495 ----
 .readline() method -- there is currently no support for
 line breaking using the codec decoder due to lack of line
! buffering. Subclasses should however, if possible, try to
 implement this method using their own knowledge of line
 breaking.
***************
*** 528,532 ****

 Note that no stream repositioning should take place.
! This method is primarely intended to be able to recover
 from decoding errors.

--- 528,532 ----

 Note that no stream repositioning should take place.
! This method is primarily intended to be able to recover
 from decoding errors.

***************
*** 554,558 ****
 It is not required by the Unicode implementation to use these base
 classes, only the interfaces must match; this allows writing Codecs as
! extensions types.

 As guideline, large mapping tables should be implemented using static
--- 554,558 ----
 It is not required by the Unicode implementation to use these base
 classes, only the interfaces must match; this allows writing Codecs as
! extension types.

 As guideline, large mapping tables should be implemented using static
***************
*** 629,634 ****

 Support for these is left to user land Codecs and not explicitly
! intergrated into the core. Note that due to the Internal Format being
! implemented, only the area between \uE000 and \uF8FF is useable for
 private encodings.

--- 629,634 ----

 Support for these is left to user land Codecs and not explicitly
! integrated into the core. Note that due to the Internal Format being
! implemented, only the area between \uE000 and \uF8FF is usable for
 private encodings.

***************
*** 650,654 ****

 It is the Codec's responsibility to ensure that the data they pass to
! the Unicode object constructor repects this assumption. The
 constructor does not check the data for Unicode compliance or use of
 surrogates.
--- 650,654 ----

 It is the Codec's responsibility to ensure that the data they pass to
! the Unicode object constructor respects this assumption. The
 constructor does not check the data for Unicode compliance or use of
 surrogates.
***************
*** 657,661 ****
 set of all UTF-16 addressable characters (around 1M characters).

! The Unicode API should provide inteface routines from <PythonUnicode>
 to the compiler's wchar_t which can be 16 or 32 bit depending on the
 compiler/libc/platform being used.
--- 657,661 ----
 set of all UTF-16 addressable characters (around 1M characters).

! The Unicode API should provide interface routines from <PythonUnicode>
 to the compiler's wchar_t which can be 16 or 32 bit depending on the
 compiler/libc/platform being used.
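
A few of the statements touched by this diff can be checked in an interpreter. The following is a minimal sketch, assuming a current Python 3 interpreter (which postdates this proposal and where str is the Unicode type), not the implementation described in unicode.txt. The variable names are illustrative only.

```python
import codecs

# The proposal defines UnicodeError as a subclass of ValueError.
is_value_error = issubclass(UnicodeError, ValueError)

# A stateless encoder returns a tuple (output object, length consumed),
# matching the Codec.encode() docstring quoted in the diff.
encode_utf8 = codecs.getencoder("utf-8")
output, consumed = encode_utf8("abc")

# 'utf-16-le' and 'utf-16-be' fix the byte order explicitly.
le = "a".encode("utf-16-le")
be = "a".encode("utf-16-be")

# Plain 'utf-16' uses the platform byte order when encoding and
# marks it with a byte order mark (BOM) at the start of the output.
with_bom = "a".encode("utf-16")
starts_with_bom = with_bom.startswith(codecs.BOM_UTF16)
```

Because the byte order chosen by plain 'utf-16' depends on the platform, the sketch only checks for the BOM rather than assuming a particular order.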