Created on 2014-03-28 12:01 by malin, last changed 2022-04-11 14:58 by admin. This issue is now closed.
| Files | | |
|---|---|---|
| File name | Uploaded | Description |
| idle_fix_non_bmp.patch | serhiy.storchaka, 2014-07-12 11:20 | |
| nonbmp_except_check.patch | malin, 2014-07-25 00:59 | |
| nonbmp_except_check_v2.patch | malin, 2014-07-25 03:10 | |
| Messages (14) | |||
|---|---|---|---|
| msg215038 | Author: Ma Lin (malin) * | Date: 2014-03-28 12:01 | |
When opening a file that contains characters above the range (U+0000-U+FFFF), IDLE quits without any report. For example, open this file: \Lib\test\test_re.py
The traceback is below; the last line gives the reason. I just hope IDLE says something before quitting, so we can know what happened.
I have checked Python 3.3.5 and 3.4.0; they have the same problem. I didn't find a 3.5 build, so I can't test this under 3.5.
=============================================
Exception in Tkinter callback
Traceback (most recent call last):
  File "C:\Python33\lib\tkinter\__init__.py", line 1489, in __call__
    return self.func(*args)
  File "C:\Python33\lib\idlelib\IOBinding.py", line 186, in open
    flist.open(filename)
  File "C:\Python33\lib\idlelib\FileList.py", line 36, in open
    edit = self.EditorWindow(self, filename, key)
  File "C:\Python33\lib\idlelib\PyShell.py", line 126, in __init__
    EditorWindow.__init__(self, *args)
  File "C:\Python33\lib\idlelib\EditorWindow.py", line 288, in __init__
    if io.loadfile(filename):
  File "C:\Python33\lib\idlelib\IOBinding.py", line 236, in loadfile
    self.text.insert("1.0", chars)
  File "C:\Python33\lib\idlelib\Percolator.py", line 25, in insert
    self.top.insert(index, chars, tags)
  File "C:\Python33\lib\idlelib\UndoDelegator.py", line 81, in insert
    self.addcmd(InsertCommand(index, chars, tags))
  File "C:\Python33\lib\idlelib\UndoDelegator.py", line 116, in addcmd
    cmd.do(self.delegate)
  File "C:\Python33\lib\idlelib\UndoDelegator.py", line 219, in do
    text.insert(self.index1, self.chars, self.tags)
  File "C:\Python33\lib\idlelib\ColorDelegator.py", line 85, in insert
    self.delegate.insert(index, chars, tags)
  File "C:\Python33\lib\idlelib\WidgetRedirector.py", line 104, in __call__
    return self.tk_call(self.orig_and_operation + args)
_tkinter.TclError: character U+1d518 is above the range (U+0000-U+FFFF) allowed by Tcl |
|||
| msg215039 | Author: Ma Lin (malin) * | Date: 2014-03-28 12:02 | |
When opening a file that contains characters above the range (U+0000-U+FFFF), IDLE quits without any report. For example, open this file: C:\Python33\lib\test\test_re.py
The traceback is below; the last line gives the reason. I just hope IDLE says something before quitting, so we can know what happened.
I have checked Python 3.3.5 and 3.4.0; they have the same problem. I didn't find a 3.5 build, so I can't test this under 3.5.
=============================================
Exception in Tkinter callback
Traceback (most recent call last):
  File "C:\Python33\lib\tkinter\__init__.py", line 1489, in __call__
    return self.func(*args)
  File "C:\Python33\lib\idlelib\IOBinding.py", line 186, in open
    flist.open(filename)
  File "C:\Python33\lib\idlelib\FileList.py", line 36, in open
    edit = self.EditorWindow(self, filename, key)
  File "C:\Python33\lib\idlelib\PyShell.py", line 126, in __init__
    EditorWindow.__init__(self, *args)
  File "C:\Python33\lib\idlelib\EditorWindow.py", line 288, in __init__
    if io.loadfile(filename):
  File "C:\Python33\lib\idlelib\IOBinding.py", line 236, in loadfile
    self.text.insert("1.0", chars)
  File "C:\Python33\lib\idlelib\Percolator.py", line 25, in insert
    self.top.insert(index, chars, tags)
  File "C:\Python33\lib\idlelib\UndoDelegator.py", line 81, in insert
    self.addcmd(InsertCommand(index, chars, tags))
  File "C:\Python33\lib\idlelib\UndoDelegator.py", line 116, in addcmd
    cmd.do(self.delegate)
  File "C:\Python33\lib\idlelib\UndoDelegator.py", line 219, in do
    text.insert(self.index1, self.chars, self.tags)
  File "C:\Python33\lib\idlelib\ColorDelegator.py", line 85, in insert
    self.delegate.insert(index, chars, tags)
  File "C:\Python33\lib\idlelib\WidgetRedirector.py", line 104, in __call__
    return self.tk_call(self.orig_and_operation + args)
_tkinter.TclError: character U+1d518 is above the range (U+0000-U+FFFF) allowed by Tcl |
|||
| msg215040 | Author: Ezio Melotti (ezio.melotti) * (Python committer) | Date: 2014-03-28 12:31 | |
See #13153. |
|||
| msg222817 | Author: Mark Lawrence (BreamoreBoy) * | Date: 2014-07-12 01:03 | |
Accidentally set to pending, I take it. |
|||
| msg222834 | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2014-07-12 11:20 | |
Yes, this is very similar to issue13153. The two issues could share one solution or have different ones. This issue relates to a more realistic situation and is therefore more important. Here is a simple and almost working solution for this issue. Unfortunately, it works incorrectly when astral characters are encountered in raw string literals. A more mature solution should parse the source and convert raw string literals containing astral characters to non-raw string literals. But this will not work with invalid Python files or non-Python files. I am afraid this issue has no perfect solution. The question is which imperfect solution and compromise we will decide is acceptable enough. |
|||
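The raw-string caveat above can be illustrated with a small sketch. It assumes the "simple" approach means rewriting each astral character in the source text as a `\Uxxxxxxxx` escape; the attached patch itself is not reproduced here, and `escape_astral` is a hypothetical helper name used only for illustration.

```python
import ast
import re

# Matches any single character outside the BMP (code point above U+FFFF).
ASTRAL = re.compile(r'[^\x00-\uffff]')


def escape_astral(source):
    """Replace each astral character in source text with a \\Uxxxxxxxx escape.

    Hypothetical helper: an assumption about what a simple fix could do.
    """
    return ASTRAL.sub(lambda m: '\\U%08x' % ord(m.group()), source)


char = '\U0001d518'  # the character named in the TclError above

# In an ordinary string literal, the escape evaluates back to the same character.
plain = escape_astral("'" + char + "'")
assert ast.literal_eval(plain) == char

# In a raw string literal, the escape stays as ten literal characters,
# so the rewritten source no longer means the same thing.
raw = escape_astral("r'" + char + "'")
assert ast.literal_eval(raw) == '\\U0001d518'
assert len(ast.literal_eval(raw)) == 10
```

This is why a purely textual replacement is only "almost working": the rewrite is safe inside ordinary literals but changes the value of raw ones.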
| msg223007 | Author: Ma Lin (malin) * | Date: 2014-07-14 09:15 | |
I suggest not changing the content of the file, and just showing a message such as: IDLE can't display non-BMP characters (codepoint above 0xFFFF). A non-BMP character was found at line 23, position 8 of aaaa.py; please open this file with another editor. |
|||
| msg223843 | Author: Ma Lin (malin) * | Date: 2014-07-24 14:47 | |
I wrote this code, but I don't know how to make a patch.
Insert this code in C:\Python34\Lib\idlelib\IOBinding.py
around line 234, before this line:
        self.text.delete("1.0", "end")

        # check non-bmp characters
        line_count = 1
        position_count = 1
        for char in chars:
            if char == '\n':
                line_count += 1
                position_count = 1
            if ord(char) > 0xFFFF:
                nonbmp_msg = ("IDLE can't display non-BMP characters "
                              "(codepoint above 0xFFFF).\n"
                              "A non-BMP character found at line %d, "
                              "position %d of file %s, codepoint 0x%X.\n"
                              "Please open this file with another editor.")
                tkMessageBox.showerror("non-BMP character",
                                       nonbmp_msg %
                                       (line_count, position_count,
                                        filename, ord(char)),
                                       parent=self.text)
                return False
            position_count += 1
|
|||
| msg223846 | Author: Ma Lin (malin) * | Date: 2014-07-24 14:56 | |
Changing the second "if" to "elif" is better. I'm sorry, I have never submitted a patch. If somebody can give a hand, feel free to modify this code. |
|||
| msg223848 | Author: Ezio Melotti (ezio.melotti) * (Python committer) | Date: 2014-07-24 15:26 | |
See https://docs.python.org/devguide/patch.html |
|||
| msg223912 | Author: Ma Lin (malin) * | Date: 2014-07-25 00:59 | |
Feel free to modify this patch. |
|||
| msg223915 | Author: Ma Lin (malin) * | Date: 2014-07-25 03:10 | |
nonbmp_except_check_v2.patch changes character numbers to 0-based, the same as IDLE. Quote from www.tkdocs.com: "for historical conventions related to how programmers normally refer to lines and characters, line numbers are 1-based, and character numbers are 0-based." |
|||
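The check discussed in the messages above can be sketched as a standalone function, separated from the Tk message box. This is not the content of either attached patch; `find_first_non_bmp` is a hypothetical name, and the sketch assumes the 1-based line and 0-based column convention quoted from tkdocs.

```python
def find_first_non_bmp(chars):
    """Return (line, column, codepoint) of the first non-BMP character, or None.

    Lines are 1-based and columns are 0-based, following the Tk convention.
    """
    line = 1
    column = 0
    for char in chars:
        if char == '\n':
            # A newline starts the next line at column 0.
            line += 1
            column = 0
        elif ord(char) > 0xFFFF:
            return line, column, ord(char)
        else:
            column += 1
    return None


sample = 'abc\nde\U0001d518f\n'
assert find_first_non_bmp(sample) == (2, 2, 0x1D518)
assert find_first_non_bmp('only BMP text\n') is None
```

Using `elif`/`else` keeps the newline itself from being counted as a column on the following line, which is one reason the "elif" suggestion matters when positions are reported to the user.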
| msg266443 | Author: Terry J. Reedy (terry.reedy) * (Python committer) | Date: 2016-05-26 16:07 | |
Tk Text (and other widgets, but Text is the main issue) has two display problems: astral chars and long lines (over a thousand chars, say). These problems can manifest in various places: file names, shell input (keyboard or clipboard), shell output, editor input (keyboard, clipboard, or file). IDLE needs to take more control over what is displayed to work around both problems.
Tk Text also has a display feature: substring tagging. I have been hesitant to simply replace astral chars with their \U000hhhhh expansion because of the aliasing problem: in shell output, for instance, the user would not know if the program wrote 1 char or 10. It would also be impossible to know if a reverse transformation might be needed. Tagging astral expansions would solve both problems.
import re
astral = re.compile(r'([^\x00-\uffff])')
s = 'X\U00011111Y\U00011112\U00011113Z'
for i, ss in enumerate(re.split(astral, s)):
    if not i%2:
        print(ss, end='')
    else:
        print(r'\\U%08x' % ord(ss), end='')
# prints X\\U00011111Y\\U00011112\\U00011113Z
Now replace print with text.insert, with an 'astral' tag for the second. tk will not double '\'s. The astral tag could switch, for instance, to an underlined version of the current font. This should work with any color scheme.
[Separate but related issue: augment Format or context menu with functions to convert between literal char, escape string, and name representation (using unicodedatabase).] |
|||
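The tagging idea above can be sketched without a running Tk instance by first computing the pieces that would be inserted. `tagged_segments` and the tag name `'astral'` are assumptions for illustration, and the sketch uses a single backslash in the expansion rather than the doubled one in the snippet above.

```python
import re

# The capturing group makes re.split keep the astral characters:
# even indexes are ordinary text, odd indexes are single astral characters.
ASTRAL = re.compile(r'([^\x00-\uffff])')


def tagged_segments(s):
    """Split s into (chars, tags) pairs, expanding astral chars to \\U escapes."""
    segments = []
    for i, part in enumerate(ASTRAL.split(s)):
        if i % 2:
            segments.append(('\\U%08x' % ord(part), ('astral',)))
        elif part:
            # Skip the empty strings produced between adjacent astral chars.
            segments.append((part, ()))
    return segments


s = 'X\U00011111Y\U00011112\U00011113Z'
assert tagged_segments(s) == [
    ('X', ()),
    ('\\U00011111', ('astral',)),
    ('Y', ()),
    ('\\U00011112', ('astral',)),
    ('\\U00011113', ('astral',)),
    ('Z', ()),
]

# With a tkinter Text widget named text (not created here), each pair
# could then be inserted as: text.insert('end', chars, *tags)
```

Because the expansion carries its own tag, a reader (or a reverse transformation) can tell a ten-character expansion apart from ten characters the program actually wrote.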
| msg353933 | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2019-10-04 12:03 | |
Fixed by PR 16545 (see issue13153). |
|||
| msg353969 | Author: Terry J. Reedy (terry.reedy) * (Python committer) | Date: 2019-10-04 18:50 | |
As noted on #13153, files with astral chars can now be read without an exception, but the presence of astral chars messes up editing text that follows at least on the same line by misplacing the cursor. I will open a new issue about replacing such with \U escapes. |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:58:00 | admin | set | github: 65283 |
| 2019-10-04 18:50:19 | terry.reedy | set | messages: + msg353969 |
| 2019-10-04 12:03:40 | serhiy.storchaka | set | status: open -> closed; resolution: fixed; messages: + msg353933; stage: needs patch -> resolved |
| 2018-02-09 22:39:45 | buhtz | set | nosy: + buhtz |
| 2016-05-28 21:13:39 | BreamoreBoy | set | nosy: - BreamoreBoy |
| 2016-05-26 16:07:45 | terry.reedy | set | versions: + Python 3.6, - Python 2.7, Python 3.4; nosy: + terry.reedy; messages: + msg266443; resolution: duplicate -> (no value) |
| 2015-12-06 12:59:11 | THRlWiTi | set | nosy: + THRlWiTi |
| 2014-07-25 03:10:31 | malin | set | files: + nonbmp_except_check_v2.patch; messages: + msg223915 |
| 2014-07-25 00:59:10 | malin | set | files: + nonbmp_except_check.patch; messages: + msg223912 |
| 2014-07-24 15:26:46 | ezio.melotti | set | messages: + msg223848 |
| 2014-07-24 14:56:43 | malin | set | messages: + msg223846 |
| 2014-07-24 14:47:04 | malin | set | messages: + msg223843 |
| 2014-07-15 18:56:17 | serhiy.storchaka | set | stage: needs patch |
| 2014-07-14 10:15:09 | rhettinger | set | priority: normal -> high |
| 2014-07-14 09:15:13 | malin | set | messages: + msg223007 |
| 2014-07-12 11:20:02 | serhiy.storchaka | set | files: + idle_fix_non_bmp.patch; assignee: serhiy.storchaka; components: + Tkinter, Unicode; versions: + Python 2.7, Python 3.5, - Python 3.3; keywords: + patch; nosy: + vstinner; messages: + msg222834 |
| 2014-07-12 01:03:50 | BreamoreBoy | set | status: pending -> open; nosy: + BreamoreBoy; messages: + msg222817 |
| 2014-03-28 12:31:15 | ezio.melotti | set | status: open -> pending; nosy: + ezio.melotti, serhiy.storchaka; messages: + msg215040; type: crash -> behavior; resolution: duplicate |
| 2014-03-28 12:02:51 | malin | set | messages: + msg215039 |
| 2014-03-28 12:01:05 | malin | create | |