[Python-checkins] CVS: python/dist/src/Include unicodeobject.h,2.35,2.36

2001-10-18 19:01:33 -0700

Update of /cvsroot/python/python/dist/src/Include
In directory usw-pr-cvs1:/tmp/cvs-serv8577/Include
Modified Files:
	unicodeobject.h 
Log Message:
SF patch #470578: Fixes to synchronize unicode() and str()
 This patch implements what we discussed on python-dev in late
 September: str(obj) and unicode(obj) should behave similarly, while
 the old behaviour is retained for unicode(obj, encoding, errors).
 The patch also adds a new feature with which objects can provide
 unicode(obj) with input data: the __unicode__ method. Currently no
 new tp_unicode slot is implemented; this is left as an option for
 the future.
 Note that PyUnicode_FromEncodedObject() no longer accepts Unicode
 objects as input. The API name already suggests that Unicode
 objects do not belong in the list of acceptable objects and the
 functionality was only needed because
 PyUnicode_FromEncodedObject() was being used directly by
 unicode(). The latter was changed in the discussed way:
 * unicode(obj) calls PyObject_Unicode() 
 * unicode(obj, encoding, errors) calls PyUnicode_FromEncodedObject() 
 One thing left open to discussion is whether to leave the
 PyUnicode_FromObject() API as a thin API extension on top of
 PyUnicode_FromEncodedObject() or to turn it into a (macro) alias
 for PyObject_Unicode() and deprecate it. Doing so would have some
 surprising consequences though, e.g. u"abc" + 123 would turn out
 as u"abc123"...
[Marc-Andre didn't have time to check this in before the deadline. I
hope this is OK, Marc-Andre! You can still make changes and commit
them on the trunk after the branch has been made, but then please mail
Barry a context diff if you want the change to be merged into the
2.2b1 release branch. GvR]
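
The dispatch described in the log message can be modelled roughly as
follows. This is only a sketch in modern Python 3, not the C code from
the patch: to_unicode(), from_encoded_object(), DEFAULT_ENCODING and
Temperature are hypothetical names standing in for unicode(),
PyUnicode_FromEncodedObject(), the interpreter's default encoding and a
user class; Python 3 str plays the role of the Unicode type and bytes
the role of the 8-bit string type.

```python
# Simplified Python 3 model of the post-patch unicode() dispatch.
# All names are hypothetical; the real logic lives in C.

DEFAULT_ENCODING = "ascii"  # assumption: the usual Python 2.x default


def from_encoded_object(obj, encoding=None, errors=None):
    """Model of PyUnicode_FromEncodedObject(): decode byte data only."""
    if isinstance(obj, str):
        # After the patch, Unicode objects are no longer accepted here.
        raise TypeError("decoding Unicode is not supported")
    if isinstance(obj, (bytes, bytearray)):
        return bytes(obj).decode(encoding or DEFAULT_ENCODING,
                                 errors or "strict")
    raise TypeError("need string or buffer, %s found" % type(obj).__name__)


def to_unicode(obj, encoding=None, errors=None):
    """Model of unicode(obj) and unicode(obj, encoding, errors)."""
    if encoding is not None or errors is not None:
        # Old behaviour: explicit decoding of byte data.
        return from_encoded_object(obj, encoding, errors)
    # New behaviour, modelled on PyObject_Unicode().
    if isinstance(obj, str):
        return obj
    hook = getattr(type(obj), "__unicode__", None)
    if hook is not None:
        return hook(obj)  # the object supplies its own Unicode data
    if isinstance(obj, (bytes, bytearray)):
        return from_encoded_object(obj)
    return str(obj)  # fall back to the object's string conversion


class Temperature:
    """Hypothetical class that provides the __unicode__ method."""

    def __init__(self, degrees):
        self.degrees = degrees

    def __unicode__(self):
        return "%d\u00b0" % self.degrees
```

In this model to_unicode(123) returns a string while
from_encoded_object(123) raises TypeError. That difference is the
reason aliasing PyUnicode_FromObject() to PyObject_Unicode() would make
an expression such as u"abc" + 123 succeed instead of failing.
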
Index: unicodeobject.h
===================================================================
RCS file: /cvsroot/python/python/dist/src/Include/unicodeobject.h,v
retrieving revision 2.35
retrieving revision 2.36
diff -C2 -d -r2.35 -r2.36
*** unicodeobject.h	2001/09/20 10:35:45	2.35
--- unicodeobject.h	2001/10/19 02:01:31	2.36
***************
*** 455,466 ****
 Coercion is done in the following way:

! 1. Unicode objects are passed back as-is with incremented
! refcount.
! 
! 2. String and other char buffer compatible objects are decoded
 under the assumptions that they contain data using the current
 default encoding. Decoding is done in "strict" mode.

! 3. All other objects raise an exception.

 The API returns NULL in case of an error. The caller is responsible
--- 455,464 ----
 Coercion is done in the following way:

! 1. String and other char buffer compatible objects are decoded
 under the assumptions that they contain data using the current
 default encoding. Decoding is done in "strict" mode.

! 2. All other objects (including Unicode objects) raise an
! exception.

 The API returns NULL in case of an error. The caller is responsible
***************
*** 475,484 ****
 );

! /* Shortcut for PyUnicode_FromEncodedObject(obj, NULL, "strict");
! which results in using the default encoding as basis for 
! decoding the object.
! 
! Coerces obj to an Unicode object and return a reference with
 *incremented* refcount.

 The API returns NULL in case of an error. The caller is responsible
--- 473,483 ----
 );

! /* Coerce obj to an Unicode object and return a reference with
 *incremented* refcount.
+ 
+ Unicode objects are passed back as-is (subclasses are converted to
+ true Unicode objects), all other objects are delegated to
+ PyUnicode_FromEncodedObject(obj, NULL, "strict") which results in
+ using the default encoding as basis for decoding the object.

 The API returns NULL in case of an error. The caller is responsible