[Python-checkins] CVS: python/dist/src/Modules _tkinter.c,1.97,1.98

Guido van Rossum python-dev@python.org
Thu, 4 May 2000 11:07:19 -0400 (EDT)


Update of /projects/cvsroot/python/dist/src/Modules
In directory eric:/projects/python/develop/guido/src/Modules
Modified Files:
	_tkinter.c 
Log Message:
Two changes to improve (I hope) Unicode support.
1. In Tcl 8.2 and later, use Tcl_NewUnicodeObj() when passing a Python
Unicode object rather than going through UTF-8. (This function
doesn't exist in Tcl 8.1, so there the original UTF-8 code is still
used; in Tcl 8.0 there is no support for Unicode.) This assumes that
Tcl_UniChar is the same thing as Py_UNICODE; a run-time error is
issued if this is not the case.
2. In Tcl 8.1 and later (i.e., whenever Tcl supports Unicode), when a
string returned from Tcl contains bytes with the top bit set, we
assume it is encoded in UTF-8, and decode it into a Unicode string
object.
Notes:
- Passing Unicode strings to Tcl 8.0 does not do the right thing; this
isn't worth fixing.
- When passing an 8-bit string to Tcl 8.1 or later that has bytes with
the top bit set, Tcl tries to interpret it as UTF-8; it seems to fall
back on Latin-1 for non-UTF-8 bytes. I'm not sure what to do about
this besides telling the user to disambiguate such strings by
converting them to Unicode (forcing the user to be explicit about the
encoding).
- Obviously it won't be possible to get binary data out of Tk this
way. Do we need that ability? How to do it?
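For illustration only, the top-bit test behind change 2 can be sketched as a standalone helper. The name has_top_bit_byte is hypothetical and does not appear in _tkinter.c; the patch performs the same scan inline and then chooses between a plain string and a UTF-8 decode.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of _tkinter.c: return 1 if the
   NUL-terminated string s contains any byte with the top bit set
   (the case the patch treats as UTF-8), 0 if it is pure 7-bit ASCII. */
int
has_top_bit_byte(const char *s)
{
	const unsigned char *p = (const unsigned char *)s;

	assert(s != NULL);
	while (*p != '\0') {
		if (*p & 0x80)
			return 1;
		p++;
	}
	return 0;
}
```

A caller would keep the result as an 8-bit string when this returns 0 and attempt a UTF-8 decode otherwise. As the last note says, that leaves no way to return arbitrary binary data.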
Index: _tkinter.c
===================================================================
RCS file: /projects/cvsroot/python/dist/src/Modules/_tkinter.c,v
retrieving revision 1.97
retrieving revision 1.98
diff -C2 -r1.97 -r1.98
*** _tkinter.c	2000/05/03 23:44:31	1.97
--- _tkinter.c	2000/05/04 15:07:16	1.98
***************
*** 551,554 ****
--- 551,556 ----
 	}
 	else if (PyUnicode_Check(value)) {
+ #if TKMAJORMINOR <= 8001
+ 		/* In Tcl 8.1 we must use UTF-8 */
 		PyObject* utf8 = PyUnicode_AsUTF8String (value);
 		if (!utf8)
***************
*** 558,561 ****
--- 560,574 ----
 		Py_DECREF(utf8);
 		return result;
+ #else /* TKMAJORMINOR > 8001 */
+ 		/* In Tcl 8.2 and later, use Tcl_NewUnicodeObj() */
+ 		if (sizeof(Py_UNICODE) != sizeof(Tcl_UniChar)) {
+ 			/* XXX Should really test this at compile time */
+ 			PyErr_SetString(PyExc_SystemError,
+ 					"Py_UNICODE and Tcl_UniChar differ in size");
+ 			return 0;
+ 		}
+ 		return Tcl_NewUnicodeObj(PyUnicode_AS_UNICODE(value),
+ 					 PyUnicode_GET_SIZE(value));
+ #endif /* TKMAJORMINOR > 8001 */
 	}
 	else {
***************
*** 625,632 ****
 	if (i == TCL_ERROR)
 		Tkinter_Error(self);
! 	else
 		/* We could request the object result here, but doing
 		 so would confuse applications that expect a string. */
! 		res = PyString_FromString(Tcl_GetStringResult(interp));
 
 	LEAVE_OVERLAP_TCL
--- 638,661 ----
 	if (i == TCL_ERROR)
 		Tkinter_Error(self);
! 	else {
 		/* We could request the object result here, but doing
 		 so would confuse applications that expect a string. */
! 		char *s = Tcl_GetStringResult(interp);
! 		char *p = s;
! 		/* If the result contains any bytes with the top bit set,
! 		 it's UTF-8 and we should decode it to Unicode */
! 		while (*p != '\0') {
! 			if (*p & 0x80)
! 				break;
! 			p++;
! 		}
! 		if (*p == '\0')
! 			res = PyString_FromStringAndSize(s, (int)(p-s));
! 		else {
! 			/* Convert UTF-8 to Unicode string */
! 			p = strchr(p, '\0');
! 			res = PyUnicode_DecodeUTF8(s, (int)(p-s), "ignore");
! 		}
! 	}
 
 	LEAVE_OVERLAP_TCL
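
The XXX comment in the second hunk asks for the size check to happen at compile time. One possible sketch, assuming a plain C89 compiler, is the negative-array-size idiom. The demo_ typedefs below are stand-ins invented for this example; in the real module the types would be Py_UNICODE and Tcl_UniChar from Python.h and tcl.h, and their widths depend on the build.

```c
#include <stddef.h>

/* Stand-in types for illustration only (assumption): the real
   Py_UNICODE and Tcl_UniChar come from Python.h and tcl.h. */
typedef unsigned short demo_py_unicode;
typedef unsigned short demo_tcl_unichar;

/* An array type of size -1 is invalid, so this typedef fails to
   compile whenever the two character types differ in size. */
typedef char demo_unichar_size_check[
	(sizeof(demo_py_unicode) == sizeof(demo_tcl_unichar)) ? 1 : -1];
```

With a check like this a mismatch would stop the build, whereas the patch as committed raises SystemError at run time.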
