
This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: input() doesn't catch _PyUnicode_AsString() exception; io.StringIO().encoding is None
Type: behavior
Stage: resolved
Components: IO, Library (Lib)
Versions: Python 3.7, Python 3.6, Python 3.5
process
Status: closed
Resolution: fixed
Dependencies: 24402
Superseder:
Assigned To:
Nosy List: aliles, amaury.forgeotdarc, belopolsky, dangyogi, erik.bray, flox, gruszczy, martin.panter, pitrou, r.david.murray, serhiy.storchaka, vstinner
Priority: normal
Keywords: patch

Created on 2010-03-28 20:15 by dangyogi, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name Uploaded Description
bug.py dangyogi, 2010-03-28 20:15 small bug demo program
8256_3.patch gruszczy, 2010-03-30 20:35
input_stdout_encoding.patch vstinner, 2010-05-14 01:20
input_stdout_none_encoding.patch vstinner, 2011-11-03 20:25
p1345978092.diff aliles, 2012-08-26 10:59
input_fallback.patch serhiy.storchaka, 2015-10-06 19:09
Pull Requests
URL Status Linked
PR 517 merged serhiy.storchaka, 2017-03-06 13:34
PR 640 closed serhiy.storchaka, 2017-03-12 11:55
PR 641 merged serhiy.storchaka, 2017-03-12 12:31
PR 642 merged serhiy.storchaka, 2017-03-12 12:41
PR 703 larry, 2017-03-17 21:00
Messages (32)
msg101874 - Author: Bruce Frederiksen (dangyogi) Date: 2010-03-28 20:15
I'm getting a "TypeError: bad argument type for built-in operation" on a print() with no arguments. This seems to be a problem in both 3.1 and 3.1.2 (haven't tried 3.1.1).
I've narrowed the problem down in a very small demo program that you can run to reproduce the bug. Just do "python3.1 bug.py" and hit <ENTER> at the "prompt:".
Removing the doctest call (and calling "foo" directly) doesn't get the error. Also removing the "input" call (and leaving the doctest call in) doesn't get the error.
The startup banner on my python3.1 is:
Python 3.1.2 (r312:79147, Mar 26 2010, 16:55:44) 
[GCC 4.3.3] on linux2
I compiled python 3.1.2 with ./configure, make, make altinstall without any options. I'm running ubuntu 9.04 with the 2.6.28-18-generic (32-bit) kernel.
msg101876 - Author: Florent Xicluna (flox) * (Python committer) Date: 2010-03-28 22:46
Confirmed.
There's something wrong around the doctest._SpoofOut class.
This script triggers the same bug (both 3.x and 3.1).
Output:
$ ./python issue8256_case.py 
prompt:
Traceback (most recent call last):
  File "issue8256_case.py", line 13, in <module>
    foo()
  File "issue8256_case.py", line 7, in foo
    print()
TypeError: bad argument type for built-in operation
msg101879 - Author: Filip Gruszczyński (gruszczy) Date: 2010-03-28 22:59
The bug is triggered by input, not by print. The exact place is _PyUnicode_AsStringAndSize, where the unicode check happens. input leaves the resulting error set, and print then checks PyErr_Occurred and picks it up. Either this error should not be raised, or it should be cleared before input finishes.
I'd love to provide a patch, but I have no idea what should be corrected and how. If someone would tutor me a little, I would be very happy to learn and code this.
msg101880 - Author: Florent Xicluna (flox) * (Python committer) Date: 2010-03-28 23:01
Right. It does not involve doctest.
import io, sys
original_stdout = sys.stdout
try:
    sys.stdout = io.StringIO()
    input("prompt:")
    print()
finally:
    sys.stdout = original_stdout
msg101886 - Author: Filip Gruszczyński (gruszczy) Date: 2010-03-29 12:20
The problem occurs in this line in bltinmodule.c:
		po = PyUnicode_AsEncodedString(stringpo,
			_PyUnicode_AsString(stdout_encoding), NULL);
Where _PyUnicode_AsString returns NULL, since stdout_encoding is Py_None and that won't pass PyUnicode_Check in _PyUnicode_AsStringAndSize. To what object can _PyUnicode_AsString be turned and then passed to _PyUnicode_AsStringAndSize? Is there some default 'utf-8' encoding object?
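The condition described above can be observed from Python without reading the C sources. The following is a minimal sketch, not the code in bltinmodule.c: `encode_prompt` is a hypothetical helper that only illustrates the kind of check the C code skipped, namely verifying that the stream's encoding is a string before using it.

```python
import io


def encode_prompt(prompt, stream):
    """Encode `prompt` for `stream`, failing loudly if no encoding is set.

    Hypothetical helper for illustration; it is not part of CPython.
    """
    encoding = getattr(stream, "encoding", None)
    if not isinstance(encoding, str):
        # This is the case that went unchecked on the C side.
        raise TypeError(
            f"stream encoding must be str, not {type(encoding).__name__}"
        )
    return prompt.encode(encoding)


# io.StringIO stores text, so it reports no encoding at all.
buffer = io.StringIO()
print(buffer.encoding)  # None
```

Calling `encode_prompt("prompt:", io.StringIO())` raises a TypeError that names the offending type, which is easier to diagnose than an error surfacing later in an unrelated print() call.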
msg101888 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2010-03-29 13:05
Whatever the solution to this issue is, it certainly looks like a bug that the return value of that function isn't being checked for errors.
msg101900 - Author: Filip Gruszczyński (gruszczy) Date: 2010-03-29 19:21
I have written a small patch, that solves the problem, but is disgusting. Could anyone tell me, how I can get some default encoding from Python internals (I have no idea where to look) and return it inside _PyUnicode_AsStringAndSize? Anyway, now when the error happens inside input, it raises an Exception properly. So now I only need to know, how to correct the bug in an elegant fashion.
msg101904 - Author: Filip Gruszczyński (gruszczy) Date: 2010-03-29 20:56
Ok, I have found Py_FileSystemDefaultEncoding and use it, however I had to cast it to (char *), because it's a const char *. Maybe I could do it better?
msg101956 - Author: Filip Gruszczyński (gruszczy) Date: 2010-03-30 20:35
I have read that I shouldn't use Py_FileSystemDefaultEncoding directly and should rather use PyUnicode_GetDefaultEncoding, so I have changed the code a little.
msg105422 - Author: Filip Gruszczyński (gruszczy) Date: 2010-05-09 23:52
Bump! Is there anything happening about this bug? Is my patch any good or should I try to work on something different?
msg105435 - Author: R. David Murray (r.david.murray) * (Python committer) Date: 2010-05-10 14:15
Victor, you've been dealing with Python's default encoding lately, care to render an opinion on the correct fix for this bug?
@Filip: the patch will need a unit test, which will also help with assessing the validity of the fix.
msg105436 - Author: Filip Gruszczyński (gruszczy) Date: 2010-05-10 14:19
I'll try to code a small test this evening.
msg105439 - Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2010-05-10 14:54
The patch is wrong: _PyUnicode_AsString(Py_None) should not return "utf8"!
I suggest that since PyOS_Readline() writes the prompt to stderr, the conversion uses the encoding of stderr.
msg105555 - Author: Filip Gruszczyński (gruszczy) Date: 2010-05-11 22:37
Amaury, could you elaborate a little more on this? I am pretty new to all this and I would happily write the patch, if only you could give me some clue on how I should approach this.
msg105616 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-05-13 00:14
This issue is directly related to issue #6697. The first problem is that the builtin input() function doesn't check that the _PyUnicode_AsString() result is not NULL.
The second problem is that io.StringIO().encoding is None. I don't understand why it is None whereas it uses utf8 (it calls the TextIOWrapper constructor with encoding="utf8" and errors="strict").
It will be difficult to write a unit test because the issue only occurs if stdin and stdout are TTYs: input() calls PyOS_Readline(stdin, stdout, prompt).
--
@gruszczy: Your patch is just a workaround, not the right fix. The problem should be fixed in input(), not in PyUnicode methods. _PyUnicode_AsString() expects a unicode argument; it should raise an error if the argument is None (and not return a magical value).
msg105676 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-05-14 01:20
Here is a patch catching the _PyUnicode_AsString() error.
input() uses sys.stdout.encoding to encode the prompt to a byte string, but PyOS_StdioReadline() writes the prompt to stderr (it should use sys_stdout).
I don't know which encoding should be used if sys.stdout.encoding is None (e.g. if sys.stdout is a StringIO() object).
StringIO() of the _io module has no encoding because it stores unicode characters, not bytes. StringIO() of the _pyio module is based on BytesIO() and uses utf8 encoding, but the reference implementation is now _io.
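The difference between a text-only buffer and a byte-backed text stream can be checked directly. This is a small sketch using only the public io module; the variable names are illustrative and nothing here reproduces the patch itself.

```python
import io

text_only = io.StringIO()
byte_backed = io.TextIOWrapper(io.BytesIO(), encoding="utf-8", errors="strict")

# StringIO holds str objects directly, so there is nothing to encode
# and both attributes are None.
print(text_only.encoding, text_only.errors)

# TextIOWrapper sits on top of a byte stream and must know how to encode.
print(byte_backed.encoding, byte_backed.errors)

# Writing through the wrapper produces encoded bytes in the underlying buffer.
byte_backed.write("é")
byte_backed.flush()
encoded = byte_backed.buffer.getvalue()
```

Any code that assumes `sys.stdout.encoding` is a string therefore has to handle the StringIO case explicitly.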
msg105699 - Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) Date: 2010-05-14 11:21
Since the prompt is written to stderr, why is sys.stdout.encoding used instead of sys.stderr.encoding?
msg105700 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2010-05-14 11:31
amaury> since the prompt is written to stderr, why is sys.stdout.encoding
amaury> used instead of sys.stderr.encoding?
input() calls PyOS_Readline() but PyOS_Readline() has multiple implementations:
 - PyOS_StdioReadline() if sys_stdin or sys_stdout is not a TTY
 - or the PyOS_ReadlineFunctionPointer callback:
    - vms__StdioReadline() (VMS only)
    - PyOS_StdioReadline()
    - call_readline() when the readline module is loaded
call_readline() calls rl_callback_handler_install() with the prompt, which writes the prompt to *stdout* (try ./python 2>/dev/null).
I don't think that it really matters that the prompt is written to stderr with stdout encoding, because both outputs always use the same encoding.
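Before any of these implementations is reached, input() has to decide whether the current sys.stdin and sys.stdout are really the console. The sketch below approximates that decision in Python. It assumes the check is "the stream wraps descriptor 0 or 1 and that descriptor is a terminal"; both function names are hypothetical and the real logic lives in C.

```python
import io
import os


def looks_like_console(stream, expected_fd):
    """Return True if `stream` wraps `expected_fd` and that fd is a terminal.

    Hypothetical helper, loosely modelled on the checks discussed in this issue.
    """
    try:
        fd = stream.fileno()
    except (AttributeError, OSError, ValueError):
        # StringIO and similar in-memory streams have no descriptor.
        return False
    return fd == expected_fd and os.isatty(fd)


def would_use_readline_path(fin, fout):
    # Descriptors 0 and 1 are the process-level stdin and stdout.
    return looks_like_console(fin, 0) and looks_like_console(fout, 1)


# An in-memory replacement for stdout can never qualify.
print(would_use_readline_path(io.StringIO(), io.StringIO()))  # False
```

Under this assumption, a StringIO installed as sys.stdout should send input() down the plain file-based path, which never needs an encoding.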
msg146590 - Author: Florent Xicluna (flox) * (Python committer) Date: 2011-10-29 02:04
Confirmed in 3.3.
The patch does not apply cleanly on trunk.
msg146969 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-11-03 20:07
A patch similar to input_stdout_encoding.patch has been applied to 3.2 and 3.3 for the issue #6697: see changeset 846866aa0eb6.
msg146973 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2011-11-03 20:25
input_stdout_none_encoding.patch uses UTF-8 if sys.stdout.encoding is None.
msg168873 - Author: Aaron Iles (aliles) * Date: 2012-08-22 12:01
Replicated this issue on Python 3.3b2. The cause is the 'encoding' and 'errors' attributes on io.StringIO() being None. Doctest replaces sys.stdout with a StringIO subclass. The exception raised is still a TypeError.
At this point I'm unsure what the fix should be:
1. Should the exception raised be more descriptive of the problem?
2. Should io.StringIO have real values for encoding and errors?
3. Should Doctest's StringIO class provide encoding and errors?
msg168880 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012-08-22 12:48
> I suggest that since PyOS_Readline() writes the prompt to stderr, the
> conversion uses the encoding of stderr.
Agreed with Amaury.
msg169166 - Author: Aaron Iles (aliles) * Date: 2012-08-26 10:59
Uploaded a new patch that uses encoding and errors from stderr if the stdout values are not valid unicode strings. It includes a unit test in test_builtin.py.
With this patch I am no longer able to replicate this issue.
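The fallback order described in this message can be sketched in Python as follows. This is not the uploaded diff; `pick_encoding_and_errors` and `_is_text` are hypothetical names, and treating "invalid" as "not a str" is an assumption.

```python
import io


def _is_text(value):
    # Assumption: a value counts as valid only if it is a str.
    return isinstance(value, str)


def pick_encoding_and_errors(stdout, stderr):
    """Prefer stdout's settings; fall back to stderr's if they are not strings."""
    encoding = getattr(stdout, "encoding", None)
    errors = getattr(stdout, "errors", None)
    if not (_is_text(encoding) and _is_text(errors)):
        encoding = getattr(stderr, "encoding", None)
        errors = getattr(stderr, "errors", None)
    if not (_is_text(encoding) and _is_text(errors)):
        raise TypeError("neither stdout nor stderr provides encoding and errors")
    return encoding, errors


fake_stdout = io.StringIO()  # encoding and errors are both None
fake_stderr = io.TextIOWrapper(
    io.BytesIO(), encoding="utf-8", errors="backslashreplace"
)
chosen = pick_encoding_and_errors(fake_stdout, fake_stderr)
```

Here `chosen` comes from the stderr stand-in because the StringIO stand-in for stdout reports None for both attributes.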
msg252421 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-10-06 18:28
I would fall back to PyFile_WriteObject(prompt, fout, Py_PRINT_RAW) if stdout has no encoding attribute or it is not a string.
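In Python terms, this fallback amounts to writing the prompt through the stream's own write() method and reading a line from the input stream, so no encoding is needed. The sketch below is an approximation under that reading; `input_via_file_api` is a hypothetical name and the real change is in C.

```python
import io


def input_via_file_api(prompt, fin, fout):
    """Write the prompt with fout.write() and read one line from fin."""
    fout.write(str(prompt))
    flush = getattr(fout, "flush", None)
    if flush is not None:
        flush()
    line = fin.readline()
    if not line:
        raise EOFError("EOF when reading a line")
    # Like input(), strip exactly one trailing newline.
    return line[:-1] if line.endswith("\n") else line


fake_stdin = io.StringIO("hello\n")
fake_stdout = io.StringIO()
answer = input_via_file_api("prompt: ", fake_stdin, fake_stdout)
```

Because the prompt is handed to the stream as a str, a StringIO (or any object with a write() method) works without ever consulting `encoding` or `errors`.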
msg252422 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2015-10-06 19:09
Here is a patch.
msg252437 - Author: Martin Panter (martin.panter) * (Python committer) Date: 2015-10-06 23:04
Serhiy, your patch looks like a worthwhile improvement because it adds proper error checking and handling. However I suspect this original bug is actually a side effect of Issue 24402. The code in question shouldn’t even be running, because sys.stdout is not the original output file descriptor, and is not a terminal.
msg255217 - Author: Erik Bray (erik.bray) * (Python triager) Date: 2015-11-23 20:17
I just recently discovered this myself. In the process of debugging the issue I also noticed the same bug that is now fixed via Issue 24402.
While I agree that Issue 24402 mostly mitigates the issue I think this patch is still worthwhile, as the current behavior still leads to cryptic, hard to debug errors. For example (although this is not great code, bear with me...) one could write a stdout wrapper like:
>>> import sys
>>> class WrappedStream:
...     encoding = 'utf8'
...     errors = None
...     def __getattr__(self, attr):
...         return getattr(sys.__stdout__, attr)
...
>>> sys.stdout = WrappedStream()
>>> sys.stdout.fileno()
1
>>> sys.stdout.isatty()
True
>>> sys.stdout.errors
>>> input('Prompt: ')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: bad argument type for built-in operation
This still goes down the path for ttys, but because the 'errors' attribute does not defer to the underlying stream it still leads to a hard to debug exception. To be clear, I think the above code *should* break, just not as cryptically.
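The reason `errors` does not defer to the underlying stream is ordinary attribute lookup: `__getattr__` is only consulted when normal lookup fails, and a class attribute set to None is found first. The sketch below shows this with an in-memory stream standing in for the real stdout; both wrapper classes are hypothetical.

```python
import io


class ShadowingWrapper:
    """Class attributes win, so __getattr__ is never asked for `errors`."""

    encoding = "utf8"
    errors = None

    def __init__(self, stream):
        self._stream = stream

    def __getattr__(self, attr):
        # Only reached for names that normal lookup cannot find.
        return getattr(self._stream, attr)


class DeferringWrapper:
    """No class-level overrides, so every lookup reaches the wrapped stream."""

    def __init__(self, stream):
        self._stream = stream

    def __getattr__(self, attr):
        return getattr(self._stream, attr)


underlying = io.TextIOWrapper(io.BytesIO(), encoding="utf-8", errors="strict")
shadowed = ShadowingWrapper(underlying)
deferred = DeferringWrapper(underlying)
print(shadowed.errors, deferred.errors)
```

The first wrapper reports None for `errors` even though the wrapped stream has a real value, which is the situation that produced the cryptic TypeError above.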
msg255220 - Author: Erik Bray (erik.bray) * (Python triager) Date: 2015-11-23 20:25
> I think the above code *should* break
Actually, I see now that Serhiy's patch would allow this example to just pass through to the non-interactive fallback. So I take it back that my example should break--I think using the fallback would also be fine.
msg290194 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-03-24 22:21
New changeset a16894ebf8823f0e09036aacde9288c00e8d9058 by Serhiy Storchaka in branch '3.5':
[3.5] bpo-8256: Fixed possible failing or crashing input() (#642)
https://github.com/python/cpython/commit/a16894ebf8823f0e09036aacde9288c00e8d9058
msg290195 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-03-24 22:21
New changeset aac875fa2f03cab61ceeaa2621c4c5534c7bcfc2 by Serhiy Storchaka in branch '3.6':
[3.6] bpo-8256: Fixed possible failing or crashing input() (#641)
https://github.com/python/cpython/commit/aac875fa2f03cab61ceeaa2621c4c5534c7bcfc2
msg290200 - Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) Date: 2017-03-24 22:23
New changeset c2cf12857187aa147c268651f10acd6da2c9cb74 by Serhiy Storchaka in branch 'master':
bpo-8256: Fixed possible failing or crashing input() (#517)
https://github.com/python/cpython/commit/c2cf12857187aa147c268651f10acd6da2c9cb74
History
Date User Action Args
2022-04-11 14:56:59  admin  set  github: 52503
2017-03-24 22:23:04  serhiy.storchaka  set  messages: + msg290200
2017-03-24 22:21:44  serhiy.storchaka  set  messages: + msg290195
2017-03-24 22:21:37  serhiy.storchaka  set  messages: + msg290194
2017-03-17 21:11:46  serhiy.storchaka  set  status: open -> closed
    resolution: fixed
    stage: patch review -> resolved
2017-03-17 21:00:31  larry  set  pull_requests: + pull_request577
2017-03-12 12:41:28  serhiy.storchaka  set  pull_requests: + pull_request531
2017-03-12 12:31:34  serhiy.storchaka  set  pull_requests: + pull_request530
2017-03-12 11:55:48  serhiy.storchaka  set  pull_requests: + pull_request529
2017-03-06 13:34:57  serhiy.storchaka  set  pull_requests: + pull_request427
2017-03-06 13:33:37  serhiy.storchaka  set  versions: + Python 3.7, - Python 3.4
2015-11-23 20:25:16  erik.bray  set  messages: + msg255220
2015-11-23 20:17:46  erik.bray  set  nosy: + erik.bray
    messages: + msg255217
2015-10-06 23:04:35  martin.panter  set  nosy: + martin.panter
    dependencies: + input() uses sys.__stdout__ instead of sys.stdout for prompt
    messages: + msg252437
2015-10-06 19:09:04  serhiy.storchaka  set  files: + input_fallback.patch
    messages: + msg252422
    stage: needs patch -> patch review
2015-10-06 18:28:21  serhiy.storchaka  set  nosy: + serhiy.storchaka
    messages: + msg252421
    versions: + Python 3.4, Python 3.5, Python 3.6, - Python 3.2, Python 3.3
2012-08-26 10:59:20  aliles  set  files: + p1345978092.diff
    messages: + msg169166
2012-08-22 12:48:34  pitrou  set  nosy: + pitrou
    messages: + msg168880
2012-08-22 12:01:33  aliles  set  nosy: + aliles
    messages: + msg168873
2011-11-03 20:25:56  vstinner  set  files: + input_stdout_none_encoding.patch
    messages: + msg146973
2011-11-03 20:07:08  vstinner  set  messages: + msg146969
2011-10-29 02:04:40  flox  set  stage: test needed -> needs patch
    messages: + msg146590
    versions: + Python 3.3, - Python 3.1
2010-05-14 11:31:06  vstinner  set  messages: + msg105700
2010-05-14 11:21:10  amaury.forgeotdarc  set  messages: + msg105699
2010-05-14 01:20:13  vstinner  set  files: + input_stdout_encoding.patch
    messages: + msg105676
2010-05-13 00:15:14  vstinner  set  title: TypeError: bad argument type for built-in operation -> input() doesn't catch _PyUnicode_AsString() exception; io.StringIO().encoding is None
2010-05-13 00:14:24  vstinner  set  messages: + msg105616
2010-05-11 22:37:11  gruszczy  set  messages: + msg105555
2010-05-11 15:44:19  belopolsky  set  nosy: + belopolsky
2010-05-10 14:54:22  amaury.forgeotdarc  set  nosy: + amaury.forgeotdarc
    messages: + msg105439
2010-05-10 14:19:14  gruszczy  set  messages: + msg105436
2010-05-10 14:15:36  r.david.murray  set  nosy: + vstinner
    messages: + msg105435
2010-05-09 23:52:19  gruszczy  set  messages: + msg105422
2010-03-30 20:35:48  gruszczy  set  files: - 8256_2.patch
2010-03-30 20:35:44  gruszczy  set  files: - 8256_1.patch
2010-03-30 20:35:31  gruszczy  set  files: + 8256_3.patch
    messages: + msg101956
2010-03-29 20:56:03  gruszczy  set  files: + 8256_2.patch
    messages: + msg101904
2010-03-29 19:21:43  gruszczy  set  files: + 8256_1.patch
    keywords: + patch
    messages: + msg101900
2010-03-29 13:05:35  r.david.murray  set  nosy: + r.david.murray
    messages: + msg101888
2010-03-29 12:20:59  gruszczy  set  messages: + msg101886
2010-03-28 23:01:53  flox  set  messages: + msg101880
2010-03-28 22:59:36  flox  set  files: - issue8256_case.py
2010-03-28 22:59:10  gruszczy  set  nosy: + gruszczy
    messages: + msg101879
2010-03-28 22:46:01  flox  set  files: + issue8256_case.py
    priority: normal
    components: - Interpreter Core
    versions: + Python 3.2
    nosy: + flox
    messages: + msg101876
    stage: test needed
2010-03-28 20:15:14  dangyogi  create
