Author vstinner
Recipients Arfrever, ishimoto, loewis, methane, mrabarnett, ncoghlan, pitrou, serhiy.storchaka, vstinner
Date 2012-08-07.02:01:33
Message-id <1344304897.02.0.734555445654.issue15216@psf.upfronthosting.co.za>
Content
Here is a Python implementation of TextIOWrapper.set_encoding().
The main limitation is that the encoding cannot be changed on a non-seekable stream after the first read (that is, if the read buffer is not empty because decoded characters are still pending).
+        # flush read buffer, may require to seek backward in the underlying
+        # file object
+        if self._decoded_chars:
+            if not self.seekable():
+                raise UnsupportedOperation(
+                    "It is not possible to set the encoding "
+                    "of a non seekable file after the first read")
+            assert self._snapshot is not None
+            dec_flags, next_input = self._snapshot
+            offset = self._decoded_chars_used - len(next_input)
+            if offset:
+                self.buffer.seek(offset, SEEK_CUR)
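To see why the patch may have to seek backward, it helps to observe that a text wrapper reads ahead of what the caller has actually consumed. The following is a minimal sketch, assuming an in-memory `io.BytesIO` as a stand-in for a seekable file; the variable names are illustrative and not part of the patch. How many bytes are read ahead is an implementation detail, so the sketch only records the position rather than relying on a specific value.

```python
import io

# 30,000 ASCII bytes, so the wrapper has plenty to read ahead.
raw = io.BytesIO(b"abc" * 10000)
text = io.TextIOWrapper(raw, encoding="ascii")

# The caller consumes a single character...
first_char = text.read(1)

# ...but the wrapper has already pulled a larger chunk from the
# underlying stream, so its position is ahead of the consumed data.
bytes_pulled = raw.tell()

print(first_char, bytes_pulled)
```

On a seekable stream the unread bytes can be recovered by seeking back; on a pipe or socket they cannot, which is the limitation described above.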
--
I don't have a use case for setting the encoding of sys.stdout/stderr after startup, but I would like to be able to control the *error handler* after startup! My implementation allows changing either or both (encoding only, errors only, or encoding and errors).
For example, Lib/test/regrtest.py uses the following function to force the backslashreplace error handler on sys.stdout. It changes the error handler to avoid UnicodeEncodeError when displaying the result of tests.
def replace_stdout():
    """Set stdout encoder error handler to backslashreplace (as stderr error
    handler) to avoid UnicodeEncodeError when printing a traceback"""
    import atexit
    stdout = sys.stdout
    sys.stdout = open(stdout.fileno(), 'w',
                      encoding=stdout.encoding,
                      errors="backslashreplace",
                      closefd=False,
                      newline='\n')
    def restore_stdout():
        sys.stdout.close()
        sys.stdout = stdout
    atexit.register(restore_stdout)
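The effect of swapping the error handler can be reproduced without touching the real sys.stdout. The sketch below is an assumption-laden illustration, not what regrtest does: instead of reopening a file descriptor, it uses an in-memory `io.BytesIO` and `TextIOWrapper.detach()` to re-wrap the same binary buffer with `backslashreplace`. All variable names are hypothetical.

```python
import io

raw = io.BytesIO()  # stand-in for the binary stream behind sys.stdout
strict = io.TextIOWrapper(raw, encoding="ascii", errors="strict")

try:
    strict.write("caf\xe9")  # U+00E9 cannot be encoded as ASCII
    failed_strict = False
except UnicodeEncodeError:
    failed_strict = True

# detach() flushes the old wrapper and hands back the binary buffer;
# the old wrapper is unusable afterwards.
buffer = strict.detach()
lenient = io.TextIOWrapper(buffer, encoding="ascii",
                           errors="backslashreplace", newline="\n")
lenient.write("caf\xe9")  # the unencodable character is escaped instead
lenient.flush()

output = raw.getvalue()
print(failed_strict, output)
```

Unlike the regrtest function, this approach discards the original wrapper, so it only fits cases where nothing else holds a reference to it.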
The doctest module uses another trick to change the error handler:
    save_stdout = sys.stdout
    if out is None:
        encoding = save_stdout.encoding
        if encoding is None or encoding.lower() == 'utf-8':
            out = save_stdout.write
        else:
            # Use backslashreplace error handling on write
            def out(s):
                s = str(s.encode(encoding, 'backslashreplace'), encoding)
                save_stdout.write(s)
    sys.stdout = self._fakeout
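The core of the doctest trick is the encode/decode round trip, which can be tried in isolation. This is a minimal sketch; the helper name `escape_unencodable` is hypothetical and does not appear in doctest.

```python
def escape_unencodable(text, encoding):
    """Round-trip text through `encoding`, escaping what it cannot represent.

    Characters the codec cannot encode are replaced by backslash escapes,
    so the resulting string can be written to a strict stream using the
    same encoding without raising UnicodeEncodeError.
    """
    return str(text.encode(encoding, "backslashreplace"), encoding)


escaped = escape_unencodable("snowman: \u2603", "ascii")
unchanged = escape_unencodable("plain text", "ascii")
print(escaped)
print(unchanged)
```

The stream's own error handler is never changed here; the text is made safe before it reaches the stream.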
History
Date                 User      Action  Args
2012-08-07 02:01:37  vstinner  set     recipients: + vstinner, loewis, ishimoto, ncoghlan, pitrou, mrabarnett, Arfrever, methane, serhiy.storchaka
2012-08-07 02:01:37  vstinner  set     messageid: <1344304897.02.0.734555445654.issue15216@psf.upfronthosting.co.za>
2012-08-07 02:01:36  vstinner  link    issue15216 messages
2012-08-07 02:01:35  vstinner  create