| Files |
| File name | Uploaded | Description |
| python_mbstring_diff.txt | hyeshik.chang, 2001-11-09 07:10 | patch to Objects/stringobject.c |
| configure.in.diff.txt | hyeshik.chang, 2001-12-10 03:20 | 2nd) autoconf detection for mbtowc(), iswprint() |
| pyconfig.h.in.diff.txt | hyeshik.chang, 2001-12-10 03:21 | 2nd) autoconf detection for mbtowc(), iswprint() |
| stringobject.c.diff.txt | hyeshik.chang, 2001-12-10 03:22 | 2nd) new, clean (in my view) patch for Objects/stringobject.c |
| mb3.diff | hyeshik.chang, 2002-04-01 18:06 | 3rd) revised (includes patch for stringobject.c, configure.in and pyconfig.h.in) |
| Messages (10) |
|
msg38131 |
Author: Hyeshik Chang (hyeshik.chang) * (Python committer) |
Date: 2001-11-09 07:10 |
Many users of multibyte languages have difficulty seeing
native characters when printing lists, dictionaries, and so on.
This patch allows printing multibyte characters on UNIX98-compatible
machines; mbtowc() is a standard C library function
(ISO/IEC 9899:1990).
|
|
msg38132 |
Author: Martin v. Löwis (loewis) * (Python committer) |
Date: 2001-11-09 21:21 |
Even though I think this patch is correct in principle, I
see a few problems with it:
1. Since it doesn't fix a bug, it probably cannot go into 2.2.
2. There is no autoconf test for mbtowc. You should test
this in configure, and then conditionalize your code on
HAVE_MBTOWC.
3. There is too much code duplication. Try to find a
solution which special-cases the escape codes (\something)
only once. For example, you may implement a trivial mbtowc
redefinition if mbtowc is not available, and then use mbtowc
always.
|
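Points 2 and 3 of the message above could be sketched roughly as follows. This is an illustration only, not the code from any of the attached patches: HAVE_MBTOWC is the macro name suggested in the message, while the wrapper name py_mbtowc and the byte-at-a-time fallback are assumptions made for the example.

```c
#include <stddef.h>
#include <stdlib.h>
#include <wchar.h>

/* HAVE_MBTOWC is assumed to be defined by configure when mbtowc() exists. */
#ifdef HAVE_MBTOWC
#define py_mbtowc(pwc, s, n) mbtowc((pwc), (s), (n))
#else
/* Hypothetical trivial replacement: every byte is one character, so the
   caller can use the same escaping logic with or without mbtowc(). */
static int
py_mbtowc(wchar_t *pwc, const char *s, size_t n)
{
    if (s == NULL)
        return 0;               /* no shift state to report */
    if (n == 0)
        return -1;              /* nothing to convert */
    if (pwc != NULL)
        *pwc = (wchar_t)(unsigned char)*s;
    return *s != '\0';          /* 0 for NUL, otherwise 1 byte consumed */
}
#endif
```

With a wrapper like this, the escape handling only has to be written once, because the caller always goes through py_mbtowc() regardless of what configure found.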
|
msg38133 |
Author: Guido van Rossum (gvanrossum) * (Python committer) |
Date: 2001-12-04 19:08 |
I don't understand the point of using mbtowc() here.
The code extracts a wide character, but then it uses
isprint() on it, and as far as I know, isprint() is not
defined on wide characters, only on 'unsigned char' (and on
-1).
Isn't what the author wants simply to use isprint(c) instead
of (c < ' ' || c >= 0x7f)???
|
|
msg38134 |
Author: Martin v. Löwis (loewis) * (Python committer) |
Date: 2001-12-06 15:12 |
You are right, the code should use iswprint instead.
The point is that multiple subsequent bytes can make up a
single printable character. Not every character above 127 is
necessarily printable (e.g. in Latin-1, only characters
above 160 are printable). Likewise, a single byte may not be
printable, but a combination will print fine. So this code
is supposed to catch only those cases where printing will
actually work.
|
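The idea described above, that several bytes may form one printable character while a lone byte may not be printable, can be illustrated with a small sketch. It is not the code from the patch: the function name escape_bytes, its signature, and the choice to escape everything else as \xNN are assumptions for the example, and the quote, backslash, and \t, \n, \r handling that a real repr needs is left out.

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper: copy `len` bytes of `s` into `out`, keeping printable
   (possibly multibyte) characters verbatim and writing every other byte as
   \xNN. Returns the output length, or (size_t)-1 if `out` is too small. */
static size_t
escape_bytes(const char *s, size_t len, char *out, size_t outsize)
{
    size_t i = 0, o = 0;

    if (outsize == 0)
        return (size_t)-1;
    mbtowc(NULL, NULL, 0);              /* reset any shift state */
    while (i < len) {
        wchar_t wc;
        int cr = mbtowc(&wc, s + i, len - i);

        if (cr > 0 && iswprint((wint_t)wc)) {
            /* One printable character made of cr bytes: copy it as is. */
            if (o + (size_t)cr >= outsize)
                return (size_t)-1;
            for (int k = 0; k < cr; k++)
                out[o++] = s[i++];
        }
        else {
            /* NUL (cr == 0), invalid sequence (cr < 0), or non-printable
               character: escape one byte and retry from the next one. */
            if (cr < 0)
                mbtowc(NULL, NULL, 0);
            if (o + 4 >= outsize)
                return (size_t)-1;
            o += (size_t)snprintf(out + o, outsize - o, "\\x%02x",
                                  (unsigned char)s[i]);
            i++;
        }
    }
    out[o] = '\0';
    return o;
}
```

The result depends on the current LC_CTYPE locale, so a program would normally call setlocale(LC_CTYPE, "") before using it; in the default "C" locale only plain ASCII characters are copied through unchanged.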
|
msg38135 |
Author: Hyeshik Chang (hyeshik.chang) * (Python committer) |
Date: 2001-12-07 06:38 |
Yes, it should be changed to iswprint on Linux systems
(but isprint on BSD systems was designed for wide
characters).
As loewis said, the EUC encodings of Korea, Japan, and Taiwan don't
use 0x7F-0x9F for printable characters. So I think that
using mbtowc is unavoidable.
|
|
msg38136 |
Author: Guido van Rossum (gvanrossum) * (Python committer) |
Date: 2001-12-07 13:21 |
Still, the patch as it exists is unacceptable -- it needs
configure support to decide whether to use mbtowc() and
whether to use iswprint() or isprint() (I would hope on BSD
there is also an iswprint(), to be standard-conforming).
|
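The configure-driven choice between iswprint() and isprint() requested above might look something like the sketch below. HAVE_ISWPRINT is an assumed macro name that a configure check would define, and PY_ISPRINT is a hypothetical wrapper used here for illustration (the thread itself refers to a macro called _ISPRINT).

```c
#include <ctype.h>
#include <wchar.h>
#include <wctype.h>

/* HAVE_ISWPRINT is assumed to be defined by configure when iswprint()
   is available; otherwise fall back to isprint(), which the thread says
   handles wide characters on BSD systems. */
#ifdef HAVE_ISWPRINT
#define PY_ISPRINT(c) iswprint((wint_t)(c))
#else
#define PY_ISPRINT(c) isprint((int)(c))
#endif
```

Either way the calling code tests characters through one name and never needs its own #ifdef.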
|
msg38137 |
Author: Hyeshik Chang (hyeshik.chang) * (Python committer) |
Date: 2001-12-10 03:26 |
I uploaded a second set of patches, which contain configure support.
Unfortunately, Citrus (the new-generation locale support for
*BSDs) hasn't implemented iswprint() yet, but the *BSDs
support wide characters via the Rune Locale isprint() function.
|
|
msg38138 |
Author: Hyeshik Chang (hyeshik.chang) * (Python committer) |
Date: 2001-12-10 03:38 |
Oops, one mistake, sorry.
At stringobject.c:646, change
else if (_ISPRINT(c)) {
->
else if (cr > 0 && _ISPRINT(c)) {
(to detect whether mbtowc failed to convert)
|
|
msg38139 |
Author: Martin v. Löwis (loewis) * (Python committer) |
Date: 2002-10-07 13:58 |
Thanks for the patch, committed as
configure 1.343;
configure.in 1.354;
pyconfig.h.in 1.51;
stringobject.c 2.190;
I'm not quite sure that your correction is correct: If we
invoke iswprint, cr is already guaranteed to be >0, since we
otherwise goto nonprintable.
|
|
msg38140 |
Author: Martin v. Löwis (loewis) * (Python committer) |
Date: 2002-10-11 05:38 |
The patch was causing too many problems, so I had to back it
out.
|