Created on 2008-02-08 21:21 by josephoenix, last changed 2022-04-11 14:56 by admin. This issue is now closed.
| Files | |
|---|---|
| File name | Uploaded |
| issue2052.diff | berker.peksag, 2014-05-13 19:25 |
| issue2052_html5.diff | berker.peksag, 2014-05-13 19:26 |
| issue2052_html5_v2.diff | berker.peksag, 2014-05-18 01:20 |
| issue2052_v2.diff | berker.peksag, 2015-03-13 19:50 |
| issue2052_v3.diff | berker.peksag, 2015-03-14 17:08 |
| issue2052_v4.diff | berker.peksag, 2015-03-14 17:50 |
| Messages (15) | |||
|---|---|---|---|
| msg62208 | Author: josephoenix | Date: 2008-02-08 21:21 | |
When passed Unicode strings, difflib.HtmlDiff.make_file and make_table fail with a UnicodeEncodeError. Also, the HTML output by make_file seems to be hardcoded to use charset=ISO-8859-1 (line 1584 of difflib.py). |
|||
| msg62209 | Author: josephoenix | Date: 2008-02-08 21:34 | |
Oops, please close this. Apparently this was fixed in 2.5.1, and I'm just behind. |
|||
| msg62211 | Author: josephoenix | Date: 2008-02-08 21:51 | |
After installing 2.5.1, the UnicodeEncodeError is gone, but the charset is still hardcoded in difflib._file_template. So, I guess this is still a separate bug. |
|||
| msg116949 | Author: Mark Lawrence (BreamoreBoy) | Date: 2010-09-20 14:59 | |
difflib._file_template is still hard-coded in py3k SVN. I'm unsure whether this is a feature request, a behaviour issue, or not an issue at all. Can someone please advise? Thanks. |
|||
| msg117234 | Author: R. David Murray (r.david.murray) (Python committer) | Date: 2010-09-23 21:11 | |
I believe that charset is the standard default for html, which would make this a feature request. |
|||
| msg184726 | Author: Terry J. Reedy (terry.reedy) (Python committer) | Date: 2013-03-20 02:57 | |
In 3.2, it is line 1629: content="text/html; charset=ISO-8859-1" /> That charset was only standard for Western European documents limited to that charset. Now, even such limited-char docs often use 'utf-8' (python.org does). The result of putting an incorrect charset designation in an html file is that the browser will not display the file correctly. For instance, I tried an input sequence containing line 'c\u3333', which displays in IDLE as 'c㌳'. The string from HtmlDiff.make_file() must be written to a file opened with encoding='utf-8', not the above or equivalent. Firefox then reads the three bytes of the utf-8 encoding as three separate characters and displays 'cãŒ³'. To check: >>> 'c㌳'.encode().decode(encoding='Latin-1') 'cã\x8c³' To me the clear implication of "returns a string which is a complete HTML file containing a table showing line by line differences with inter-line and intra-line changes highlighted." is that the resulting file will display correctly. The current template charset prevents that; changing to 'utf-8' results in a file that displays correctly (tested). So the current behavior and the code that causes it is to me clearly a bug. I would like to fix it before 2.7.4 comes out. |
|||
| msg184751 | Author: Terry J. Reedy (terry.reedy) (Python committer) | Date: 2013-03-20 11:15 | |
After thinking about it more, the real problem is that the charset setting must match the chars used and how they are encoded, so no one setting is right for all uses. An alternative to changing the default in existing versions is to at least document what it is and explain how to work around it with .replace, for instance output.replace('ISO-8859-1', 'utf-8'). I agree that adding a parameter (charset=xxx) is a new feature. |
|||
| msg184755 | Author: Ezio Melotti (ezio.melotti) (Python committer) | Date: 2013-03-20 12:13 | |
I haven't looked at the code, but if an HTML page is generated it should probably be updated to use HTML5 and <meta charset="utf-8">. |
|||
| msg218479 | Author: Berker Peksag (berker.peksag) (Python committer) | Date: 2014-05-13 19:25 | |
Attaching two patches: issue2052.diff adds a "charset" keyword argument to HtmlDiff.make_file(). issue2052_html5.diff also adds a "charset" keyword argument to HtmlDiff.make_file() and updates the markup of HtmlDiff() to HTML5. I tested it with Firefox 29 and Chrome 34. |
|||
| msg218726 | Author: Berker Peksag (berker.peksag) (Python committer) | Date: 2014-05-18 01:20 | |
Attaching a new version of issue2052_html5.diff. Changes: switch from px to em in CSS; clean up the markup a bit (e.g. delete redundant colgroup tags). |
|||
| msg237383 | Author: Serhiy Storchaka (serhiy.storchaka) (Python committer) | Date: 2015-03-06 21:26 | |
Maybe updating the markup to HTML5 should be a different issue. issue2052_html5_v2.diff not only adds charset in HTML5 format, it totally changes the template. This is definitely a separate issue. |
|||
| msg238050 | Author: Berker Peksag (berker.peksag) (Python committer) | Date: 2015-03-13 19:50 | |
Here is an updated patch. Thanks for the review, Serhiy. I will open a new issue for the HTML 5 part of the patch. |
|||
| msg238097 | Author: Serhiy Storchaka (serhiy.storchaka) (Python committer) | Date: 2015-03-14 19:37 | |
LGTM |
|||
| msg238104 | Author: Roundup Robot (python-dev) (Python triager) | Date: 2015-03-14 23:18 | |
New changeset e058423d3ca4 by Berker Peksag in branch 'default': Issue #2052: Add charset parameter to HtmlDiff.make_file(). https://hg.python.org/cpython/rev/e058423d3ca4 |
|||
| msg238105 | Author: Berker Peksag (berker.peksag) (Python committer) | Date: 2015-03-14 23:19 | |
Thanks Serhiy. |
|||
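
The charset mismatch described in msg184726 and the `charset` keyword added by changeset e058423d3ca4 can be sketched roughly as follows. This is a minimal illustration, assuming Python 3.5 or later (where `HtmlDiff.make_file()` accepts the keyword); `render_diff` and the sample lines are hypothetical names chosen for this example and are not part of difflib.

```python
import difflib


def render_diff(fromlines, tolines, charset="utf-8"):
    """Hypothetical helper: build the HtmlDiff page and return it as bytes.

    The same charset is used for the <meta> declaration and for the
    byte encoding, so the two cannot disagree.
    """
    html = difflib.HtmlDiff().make_file(fromlines, tolines, charset=charset)
    # Encode with the charset that the generated <meta> tag declares.
    return html.encode(charset)


# Sample input taken from the discussion: a line containing U+3333.
old_lines = ["c\u3333"]
new_lines = ["c\u3334"]

page = render_diff(old_lines, new_lines)

# What a browser effectively does when UTF-8 bytes are labelled as
# ISO-8859-1: each byte is read as its own character.
mojibake = "c\u3333".encode("utf-8").decode("latin-1")
print(ascii(mojibake))
```

Encoding the returned string with the same charset that was passed to `make_file()` keeps the bytes and the declared charset in agreement, which is the condition msg184751 identifies as the real requirement.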
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:56:30 | admin | set | github: 46328 |
| 2015-03-14 23:19:45 | berker.peksag | set | status: open -> closed; resolution: fixed; messages: + msg238105; stage: commit review -> resolved |
| 2015-03-14 23:18:50 | python-dev | set | nosy: + python-dev; messages: + msg238104 |
| 2015-03-14 19:37:49 | serhiy.storchaka | set | assignee: berker.peksag; messages: + msg238097; stage: patch review -> commit review |
| 2015-03-14 17:50:11 | berker.peksag | set | files: + issue2052_v4.diff |
| 2015-03-14 17:08:49 | berker.peksag | set | files: + issue2052_v3.diff |
| 2015-03-13 19:50:32 | berker.peksag | set | files: + issue2052_v2.diff; messages: + msg238050 |
| 2015-03-06 21:26:14 | serhiy.storchaka | set | nosy: + serhiy.storchaka; messages: + msg237383 |
| 2014-07-30 10:41:30 | hashimo | set | nosy: + hashimo |
| 2014-05-18 01:20:43 | berker.peksag | set | files: + issue2052_html5_v2.diff; messages: + msg218726 |
| 2014-05-13 19:26:11 | berker.peksag | set | files: + issue2052_html5.diff |
| 2014-05-13 19:25:57 | berker.peksag | set | files: + issue2052.diff; versions: + Python 3.5, - Python 3.2; keywords: + patch; nosy: + berker.peksag; messages: + msg218479; stage: needs patch -> patch review |
| 2014-04-19 16:00:57 | orsenthil | set | nosy: - orsenthil |
| 2014-02-03 18:38:49 | BreamoreBoy | set | nosy: - BreamoreBoy |
| 2013-03-20 12:13:17 | ezio.melotti | set | messages: + msg184755 |
| 2013-03-20 11:15:31 | terry.reedy | set | messages: + msg184751 |
| 2013-03-20 02:57:37 | terry.reedy | set | nosy: + ezio.melotti, terry.reedy, orsenthil, - tim.peters; messages: + msg184726 |
| 2010-09-23 21:11:26 | r.david.murray | set | assignee: tim.peters -> (no value); type: behavior -> enhancement; versions: + Python 3.2, - Python 2.6; nosy: + r.david.murray; messages: + msg117234; stage: test needed -> needs patch |
| 2010-09-20 14:59:10 | BreamoreBoy | set | nosy: + BreamoreBoy; messages: + msg116949 |
| 2010-01-29 04:40:41 | brian.curtin | set | stage: test needed; versions: + Python 2.6, - Python 2.5 |
| 2008-03-18 19:03:06 | jafo | set | priority: normal; assignee: tim.peters; nosy: + tim.peters; title: Lack of difflib.HtmlDiff unicode support -> Allow changing difflib._file_template character encoding. |
| 2008-02-08 21:51:40 | josephoenix | set | messages: + msg62211 |
| 2008-02-08 21:34:08 | josephoenix | set | messages: + msg62209 |
| 2008-02-08 21:21:53 | josephoenix | create | |