Created on 2007-12-13 21:41 by filip, last changed 2022-04-11 14:56 by admin. This issue is now closed.
| Files | | |
|---|---|---|
| File name | Uploaded | Description |
| strxfrm-unicode.diff | filip, 2007-12-13 21:41 | |
| Messages (5) | |||
|---|---|---|---|
| msg58592 | Author: Filip Salomonsson (filip) | Date: 2007-12-13 21:41 | |
locale.strxfrm currently does not handle non-ASCII strings:
$ ./python
Python 3.0a2 (py3k:59482, Dec 13 2007, 21:27:14)
[GCC 4.1.2 20070626 (Red Hat 4.1.2-14)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import locale
>>> locale.setlocale(locale.LC_COLLATE, "en_US.utf8")
'en_US.utf8'
>>> locale.strxfrm("a")
'\x0c\x01\x08\x01\x02'
>>> locale.strxfrm("\N{LATIN SMALL LETTER A WITH DIAERESIS}")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: strxfrm() argument 1 must be string without null bytes, not str
The attached patch tries to fix this:
$ ./python
Python 3.0a2 (py3k:59482M, Dec 13 2007, 21:58:09)
[GCC 4.1.2 20070626 (Red Hat 4.1.2-14)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import locale
>>> locale.setlocale(locale.LC_COLLATE, "en_US.utf8")
'en_US.utf8'
>>> locale.strxfrm("a")
'.\x01\x10\x01\x02'
>>> locale.strxfrm("\N{LATIN SMALL LETTER A WITH DIAERESIS}")
'.\x01\x19\x01\x02'
>>> alist = list("aboåäöABOÅÄÖñÑ")
>>> sorted(alist, cmp=locale.strcoll) == sorted(alist, key=locale.strxfrm)
True
The patch does not include what's needed to define HAVE_WCSXFRM, since I
really don't know how to do that properly (I edited 'configure' and
'pyconfig.h.in' manually to compile it).
|
|||
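For readers on a current Python 3, the sorting check from the session above can be reproduced roughly as follows. This is a minimal sketch, not part of the attached patch: the `set_collation` helper and its list of candidate locale names are assumptions made for illustration, and `functools.cmp_to_key` stands in for the `cmp=` argument, which `sorted` no longer accepts.

```python
import functools
import locale


def set_collation(candidates=("en_US.UTF-8", "en_US.utf8", "C")):
    """Hypothetical helper: try each locale name and return the first accepted one."""
    for name in candidates:
        try:
            return locale.setlocale(locale.LC_COLLATE, name)
        except locale.Error:
            continue
    raise locale.Error("no candidate locale available")


active = set_collation()
letters = list("aboåäöABOÅÄÖñÑ")

# strcoll is a two-argument comparison, so it is wrapped with cmp_to_key.
by_strcoll = sorted(letters, key=functools.cmp_to_key(locale.strcoll))

# strxfrm maps each string to a key whose ordinary ordering follows the locale.
by_strxfrm = sorted(letters, key=locale.strxfrm)

same_order = by_strcoll == by_strxfrm
print(active, same_order)
```

Whether the two orderings agree depends on the platform's wide-character collation functions being consistent with each other, so the printed result is best read as a check of the local C library rather than a guarantee.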
| msg58596 | Author: Martin v. Löwis (loewis) * (Python committer) | Date: 2007-12-13 22:18 | |
locale.strxfrm needs to be removed in Python 3, probably along with the entire locale module. We can't support it anymore. |
|||
| msg58599 | Author: Christian Heimes (christian.heimes) * (Python committer) | Date: 2007-12-14 00:29 | |
What's wrong with the locale module? |
|||
| msg58615 | Author: Martin v. Löwis (loewis) * (Python committer) | Date: 2007-12-14 06:49 | |
It operates on char*, not Unicode strings. |
|||
| msg63396 | Author: Martin v. Löwis (loewis) * (Python committer) | Date: 2008-03-08 10:55 | |
I found a way to fix this, using wchar_t functions. Fixed in r61307. |
|||
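The `char*` objection in msg58615 can be illustrated with a small sketch. It is not taken from r61307; it only shows, assuming a UTF-8 encoding, why a byte-oriented API sees a different sequence than the `str` itself, and what a `wchar_t` buffer holds instead. All variable names are illustrative.

```python
import ctypes

text = "\N{LATIN SMALL LETTER A WITH DIAERESIS}"

# A char*-based API would receive the encoded bytes, not the code points.
as_bytes = text.encode("utf-8")
code_points = len(text)      # number of Unicode code points in the str
byte_units = len(as_bytes)   # number of bytes a char* function would see

# A wchar_t-based API receives wide characters; the width is platform-dependent.
wchar_size = ctypes.sizeof(ctypes.c_wchar)
wide = ctypes.create_unicode_buffer(text)  # NUL-terminated wchar_t array

print(code_points, byte_units, wchar_size, wide.value == text)
```

The size of `wchar_t` differs between platforms, which is why the sketch prints it instead of assuming a value.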
| History | | | |
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:56:28 | admin | set | github: 45959 |
| 2008-03-08 10:55:18 | loewis | set | status: open -> closed; resolution: fixed; messages: + msg63396 |
| 2007-12-14 06:49:53 | loewis | set | messages: + msg58615 |
| 2007-12-14 00:29:50 | christian.heimes | set | nosy: + christian.heimes; messages: + msg58599 |
| 2007-12-13 22:18:57 | loewis | set | nosy: + loewis; messages: + msg58596 |
| 2007-12-13 21:41:48 | filip | create | |