classification
Title: Write unescaped unicode characters (Japanese, Chinese, etc) in JSON module when "ensure_ascii=False"
Type: enhancement Stage:
Components: Extension Modules, Unicode Versions: Python 3.3, Python 2.7
process
Status: closed Resolution: works for me
Dependencies: Superseder:
Assigned To: Nosy List: Michael.Kuss, ezio.melotti, r.david.murray, serhiy.storchaka, vstinner
Priority: normal Keywords:

Created on 2014-10-22 18:55 by Michael.Kuss, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Messages (6)
msg229830 - Author: Michael Kuss (Michael.Kuss) Date: 2014-10-22 18:55
When running the following:
>> json.dump(['name': "港区"], myfile.json, indent=4, separators=(',', ': '), ensure_ascii=False)
the function escapes the unicode, even though I have explicitly asked to not force to ascii:
\u6E2F\u533A
By changing "__init__.py" such that the fp.write call encodes the text as utf-8, the output json file displays the human-readable text required (see below).
OLD (starting line 167):
if (not skipkeys and ensure_ascii and
        check_circular and allow_nan and
        cls is None and indent is None and separators is None and
        encoding == 'utf-8' and default is None and not kw):
    iterable = _default_encoder.iterencode(obj)
else:
    if cls is None:
        cls = JSONEncoder
    iterable = cls(skipkeys=skipkeys, ensure_ascii=ensure_ascii,
        check_circular=check_circular, allow_nan=allow_nan, indent=indent,
        separators=separators, encoding=encoding,
        default=default, **kw).iterencode(obj)
for chunk in iterable:
    fp.write(chunk)
NEW:
if (not skipkeys and ensure_ascii and
        check_circular and allow_nan and
        cls is None and indent is None and separators is None and
        encoding == 'utf-8' and default is None and not kw):
    iterable = _default_encoder.iterencode(obj)
    for chunk in iterable:
        fp.write(chunk)
else:
    if cls is None:
        cls = JSONEncoder
    iterable = cls(skipkeys=skipkeys, ensure_ascii=ensure_ascii,
        check_circular=check_circular, allow_nan=allow_nan, indent=indent,
        separators=separators, encoding=encoding,
        default=default, **kw).iterencode(obj)
    for chunk in iterable:
        fp.write(chunk.encode('utf-8'))
msg229834 - Author: R. David Murray (r.david.murray) (Python committer) Date: 2014-10-22 19:39
If I fix your example so it runs:
json.dump({'name': "港区"}, open('myfile.json', 'w'), indent=4, separators=(',', ': '), ensure_ascii=False)
I get the expected output:
rdmurray@pydev:~/python/p34>cat myfile.json 
{
    "name": "港区"
}
That example won't work in python2, of course, so you'd have to show us your actual code there.
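
To make the behaviour under discussion concrete, here is a minimal sketch, assuming Python 3, of how `ensure_ascii` changes the string the `json` module produces. The variable names are illustrative only and are not taken from the reporter's script.

```python
import json

data = {"name": "港区"}

# Default behaviour: ensure_ascii=True escapes every non-ASCII character.
escaped = json.dumps(data)

# With ensure_ascii=False the characters are emitted as-is.
readable = json.dumps(data, ensure_ascii=False)

print(escaped)
print(readable)
```

If the second line prints the characters unescaped, the `json` module itself is doing what was asked, and any escaping seen in the output file is coming from the file object or its encoding settings.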
msg230365 - Author: Ezio Melotti (ezio.melotti) (Python committer) Date: 2014-10-31 18:29
The example works for me with both python 2 and 3. I'm going to close this in a while if OP doesn't reply.
$ python2 -c "import json; json.dump({'name': '港区'}, open('py2.json', 'w'), indent=4, separators=(',', ': '), ensure_ascii=False)" && cat py2.json
{
    "name": "港区"
}
$ python3 -c "import json; json.dump({'name': '港区'}, open('py3.json', 'w'), indent=4, separators=(',', ': '), ensure_ascii=False)" && cat py3.json
{
    "name": "港区"
}
msg230417 - Author: Michael Kuss (Michael.Kuss) Date: 2014-11-01 00:50
Pardon the delay - this json dump function is embedded in a much larger script, so it took some untangling to get it running on Python 3.3, and scrub some personal identifying info from it. This script also does not work in Python 3.3:
  File "C:/Users/mkuss/PycharmProjects/TestJSON\dump_list_to_json_file.py", line 319, in dump_list_to_json_file
    json.dump(addresses, outfile, indent=4, separators=(',', ': '))
  File "C:\Python33\lib\json\__init__.py", line 184, in dump
    fp.write(chunk)
TypeError: 'str' does not support the buffer interface
In python 2.7, I still get escaped unicode when I try writing this dictionary using json.dump, so the work-around that I pasted originally is how I'm choosing to accomplish the task for now.
If you'd like, I can spend more time debugging the issue I'm running into with the script in Python 3.3, but it may be next week before I have enough time to solve it. THANKS --mike
msg230421 - Author: R. David Murray (r.david.murray) (Python committer) Date: 2014-11-01 01:47
That error message indicates you've opened the output file in binary mode instead of text mode.
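
The diagnosis above can be sketched as follows, assuming Python 3. The `addresses` value and the temporary file path are hypothetical stand-ins for the reporter's data, which is not shown in the thread. In Python 3, `json.dump` writes `str` chunks, so a file opened in binary mode rejects them, while a text-mode file with an explicit encoding accepts them.

```python
import json
import os
import tempfile

addresses = {"name": "港区"}  # hypothetical stand-in for the reporter's data
path = os.path.join(tempfile.mkdtemp(), "addresses.json")

# Binary mode: json.dump passes str chunks to write(), which raises TypeError.
binary_mode_error = None
try:
    with open(path, "wb") as outfile:
        json.dump(addresses, outfile, indent=4, separators=(",", ": "))
except TypeError as exc:
    binary_mode_error = exc

# Text mode with an explicit encoding: the str chunks are encoded on write.
with open(path, "w", encoding="utf-8") as outfile:
    json.dump(addresses, outfile, indent=4, separators=(",", ": "),
              ensure_ascii=False)

with open(path, encoding="utf-8") as infile:
    written = infile.read()

print(type(binary_mode_error).__name__)
print(written)
```

Passing `encoding="utf-8"` explicitly avoids depending on the platform's default encoding, which on Windows may not be able to represent these characters.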
msg231994 - Author: Serhiy Storchaka (serhiy.storchaka) (Python committer) Date: 2014-12-02 13:19
It looks like you have either opened the file with the backslashreplace error handler or run Python with PYTHONIOENCODING set so that it selects the backslashreplace error handler.
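
One way this could happen is sketched below, assuming Python 3. It is a hypothetical reproduction, not the reporter's code: an ASCII text stream using the `backslashreplace` error handler produces escaped output even though `json.dump` was called with `ensure_ascii=False`.

```python
import io
import json

data = {"name": "港区"}

# Hypothetical reproduction: an in-memory byte buffer wrapped in an ASCII
# text stream whose error handler is backslashreplace.
raw = io.BytesIO()
stream = io.TextIOWrapper(raw, encoding="ascii", errors="backslashreplace")

# json emits the characters unescaped; the stream escapes them while encoding.
json.dump(data, stream, ensure_ascii=False)
stream.flush()

on_disk = raw.getvalue()
print(on_disk)
```

Here the escaping is done by the stream's encoder, not by the `json` module, which is why changing `ensure_ascii` alone does not affect the result.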
History
Date                 User              Action  Args
2022-04-11 14:58:09  admin             set     github: 66890
2015-02-10 08:43:39  serhiy.storchaka  set     status: pending -> closed
2014-12-02 13:19:23  serhiy.storchaka  set     status: open -> pending; nosy: + serhiy.storchaka; messages: + msg231994
2014-11-01 01:47:03  r.david.murray    set     messages: + msg230421
2014-11-01 00:50:19  Michael.Kuss      set     status: pending -> open; messages: + msg230417
2014-10-31 18:29:07  ezio.melotti      set     status: open -> pending; resolution: works for me; messages: + msg230365
2014-10-22 19:39:34  r.david.murray    set     nosy: + r.david.murray; messages: + msg229834
2014-10-22 18:55:23  Michael.Kuss      create