Message140637
| Author |
vstinner |
| Recipients |
Arfrever, Nicholas.Cole, akuchling, cben, gpolo, inigoserna, python-dev, r.david.murray, schodet, vstinner, zeha |
| Date |
2011-07-19.00:19:49 |
| Message-id |
<1311034792.46.0.229441503017.issue12567@psf.upfronthosting.co.za> |
| In-reply-to |
| Content |
Patch the _curses module to improve Unicode support:
- add an encoding attribute to window objects (only visible in C), initialized from the locale encoding
- encode characters and character strings to the window encoding if the ncursesw library is NOT used
- addch(), addstr(), addnstr(), insstr() and insnstr() use the wide character functions if the ncursesw library is used
- PyCurses_ConvertToChtype() checks for integer overflow and rejects values outside [0; 255]
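The range check in the last item can be illustrated with a minimal Python sketch. The function name `convert_to_chtype` is hypothetical and only the integer case is shown; the exception type is an assumption, since the patch does this in C inside PyCurses_ConvertToChtype() and the message does not say which error is raised.

```python
def convert_to_chtype(value):
    """Hypothetical Python analogue of the range check in PyCurses_ConvertToChtype()."""
    if not isinstance(value, int):
        raise TypeError("expected an int, got %s" % type(value).__name__)
    # Reject anything that does not fit in a single byte.
    if not 0 <= value <= 255:
        # Assumption: the message does not name the exception type.
        raise OverflowError("character code %d is not in [0; 255]" % value)
    return value
```

With this sketch, `convert_to_chtype(65)` returns 65, while 256 or -1 is rejected.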
The check on the ncursesw library availability is done in setup.py because the library linked to _curses depends on the readline library (see issues #7384 and #9408).
I don't know whether the wide character functions can also be available in the plain curses or ncurses libraries.
Details:
- locale encoding: use GetConsoleOutputCP() on Windows, nl_langinfo(CODESET) if available, or "utf-8"
- don't encode a character to the window encoding if its code is in [0; 127] (use the Unicode code point directly): all encodings are compatible with ASCII... except a few like JIS X 0201, where byte 0x5C decodes to the yen sign (U+00A5) instead of a backslash (U+005C).
- if an encoded character is longer than 1 byte, raise an OverflowError. For example, U+00E9 (é) encoded to UTF-8 gives b'\xC3\xA9' (two bytes).
- copy the encoding when creating a subwindow.
- use a global variable, screen_encoding, in PyCurses_UnCtrl() and PyCurses_UngetCh()
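The per-character rule from the list above (pass ASCII through, otherwise encode and require a single byte) can be sketched in Python. This is only an illustration of the logic, not the C code of the patch; the name `encode_char` is hypothetical.

```python
def encode_char(ch, encoding):
    """Hypothetical sketch: map one character to a single byte value."""
    code = ord(ch)
    # ASCII range: use the code point as-is, without encoding.
    if code < 128:
        return code
    encoded = ch.encode(encoding)
    # A chtype cell holds one byte, so multibyte results are rejected.
    if len(encoded) != 1:
        raise OverflowError(
            "character U+%04X is encoded to %d bytes in %s"
            % (code, len(encoded), encoding))
    return encoded[0]
```

For example, `encode_char("\u00e9", "latin-1")` gives 0xE9, whereas the same character with "utf-8" raises OverflowError because it encodes to two bytes.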
It's not possible to specify an encoding.
GetConsoleOutputCP() may not return the right code page on Windows if a text application doesn't run in a Windows console (e.g. if it uses its own terminal emulator). GetOEMCP() may be a better choice, or a function should be added to specify the encoding used by the _curses module (overriding the "locale encoding").
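For reference, the lookup order given under Details (GetConsoleOutputCP() on Windows, nl_langinfo(CODESET) if available, or "utf-8") could look like this in Python. The helper name `get_locale_encoding` is hypothetical, and mapping the code page number to a "cpNNN" codec name is an assumption made for the sketch; the patch does the equivalent in C.

```python
import locale
import sys


def get_locale_encoding():
    """Hypothetical helper mirroring the lookup order listed under Details."""
    if sys.platform == "win32":
        import ctypes
        codepage = ctypes.windll.kernel32.GetConsoleOutputCP()
        # Assumption: a result of 0 is treated as "no usable code page".
        if codepage:
            return "cp%d" % codepage
    elif hasattr(locale, "nl_langinfo"):
        codeset = locale.nl_langinfo(locale.CODESET)
        if codeset:
            return codeset
    # Fallback when neither source gives an answer.
    return "utf-8"
```

Swapping GetConsoleOutputCP() for GetOEMCP() would only change the single Windows call in this sketch.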
If a function is added to specify the encoding, I think it would be better to add a single global function than to add an argument to every function that creates a new window object (initscr(), getwin(), subwin(), derwin(), newpad()). |
|