homepage

This issue tracker has been migrated to GitHub , and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: io.open() doesn't check for embedded NUL characters
Type: behavior Stage: resolved
Components: IO, Library (Lib) Versions: Python 3.2, Python 3.3, Python 2.7
process
Status: closed Resolution: fixed
Dependencies: Superseder:
Assigned To: Nosy List: alex, hynek, pitrou, python-dev, terry.reedy
Priority: normal Keywords: easy, patch

Created on 2012年01月24日 02:03 by pitrou, last changed 2022年04月11日 14:57 by admin. This issue is now closed.

Files
File name Uploaded Description Edit
open-nul.patch hynek, 2012年01月24日 15:52 Fix for non-Win32 platforms review
open-nul.patch hynek, 2012年01月29日 12:47 Fix for all platforms, adds _PyUnicode_HasNULChars review
open-nul.patch hynek, 2012年01月29日 13:33 Fix for all platforms, adds _PyUnicode_HasNULChars, caches nul string review
Messages (13)
msg151876 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012年01月24日 02:03
>>> open("\x00abc")
Traceback (most recent call last):
 File "<stdin>", line 1, in <module>
FileNotFoundError: [Errno 2] No such file or directory: ''
Contrast with 2.x open():
>>> open("\x00abc")
Traceback (most recent call last):
 File "<stdin>", line 1, in <module>
TypeError: file() argument 1 must be encoded string without NULL bytes, not str
msg151899 - (view) Author: Hynek Schlawack (hynek) * (Python committer) Date: 2012年01月24日 12:55
I took a deep dive into parts of CPython that were unknown to me :) and dug up the following:
Methods like os.stat or even os.open convert the file name using "et" in PyArg_ParseTuple[AndKeywords].
OTOH, open() and io.open() just hand through the object as "O" to the respective low-level io module.
The result in 2.7 is that file() tries to convert it for it's own usage eventually – which fails as seen. While a more explicit error message wouldn't hurt, this seems safe to me insofar.
In 3.3, file() aka Modules/_io/fileio.c , io_open does no such thing because it seems to handle fds as "nameobj" as well and does a wide range of checks on the argument.
After io_open is certain that nameobj is a file name, it uses PyObject_AsCharBuffer()on bytes and PyUnicode_FromObject() + encoding magic on unicode to get an encoded string as a file name.
Neither does a check for NUL bytes so the (w)open(er) that follows reacts as demonstrated by Antoine.
I presume fixing/breaking PyObject_AsCharBuffer()/PyUnicode_FromObject() is out of question so the most obvious part to fix would be the conversion block inside io_open.
Should I have a stab at it or do you disagree with my approach?
msg151902 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012年01月24日 13:20
Yes, fixing the conversion block is probably the right approach.
Apparently posixmodule.c uses PyUnicode_FSConverter, perhaps that would work?
(also make sure that the case where a bytes string is given is fixed too:
>>> open(b"\x00")
Traceback (most recent call last):
 File "<stdin>", line 1, in <module>
FileNotFoundError: [Errno 2] No such file or directory: ''
)
msg151906 - (view) Author: Hynek Schlawack (hynek) * (Python committer) Date: 2012年01月24日 13:42
JFTR, file()'s C equivalent is fileio_init and not io_open, I lost track of all the opens. ;)
msg151913 - (view) Author: Hynek Schlawack (hynek) * (Python committer) Date: 2012年01月24日 15:52
So I have good news and bad news. The good is: I fixed it for non-Win platforms and the patch is truly beautiful:
 Lib/test/test_builtin.py | 6 ++++++
 Modules/_io/fileio.c | 25 ++++---------------------
 2 files changed, 10 insertions(+), 21 deletions(-)
;)
Two problems:
 1. I'm not sure if it's okay for me to put the test where I put it? 
 2. I'm not sure how to fix it for Win32 (and I also can't test it :().
It's just about the case when it's called with a Unicode path name. The current code looks like as following:
 if (PyUnicode_Check(nameobj)) {
 widename = PyUnicode_AsUnicode(nameobj);
 if (widename == NULL)
 return -1;
 }
We can't use the nifty PyUnicode_FSConverter because we want to keep Unicode. So I assume the way to go would be the C equivalent of somthing like:
if '0円' in widename:
 raise TypeError()
Right?
I hope someone would be so kind to implement it, otherwise the patch attached passes all test on Linux and Mac (except for test_recursion_limit but that fails currently for me on the Mac even without my patch).
msg152075 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012年01月27日 09:28
Indeed, there seems to be no mechanism available to forbid NUL chars under Windows (for Python 3.x):
>>> open("LICENSE\x00foobar")
<_io.TextIOWrapper name='LICENSE\x00foobar' mode='r' encoding='cp1252'>
>>> os.stat("LICENSE\x00foobar")
nt.stat_result(st_mode=33206, st_ino=2251799813779714, st_dev=0, st_nlink=1, st_uid=0, st_gid=0, st_size=15132, st_atime=8589934592, st_mtime=8589934592, st_ctime=1301169903)
I think PyUnicode_AsUnicode should grow a NUL char check in Python 3.3, since it doesn't return the size anyway. I don't think we can do that in previous versions, though, so we need an alternate strategy. Scanning the unicode string for NUL characters is enough. That should be easy by using PyUnicode_AsUnicodeAndSize.
As for the patch:
- the test should be in test_io; you may also add a separate in test_fileio
- conv_name is never decref'ed, and so there will be a memory leak
msg152077 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012年01月27日 09:32
Since the NUL-scanning will be useful for Modules/posixmodule.c as well, perhaps it should be done as a private _PyUnicode_HasNULChars() function.
msg152139 - (view) Author: Terry J. Reedy (terry.reedy) * (Python committer) Date: 2012年01月27日 23:26
See also #13849 
msg152224 - (view) Author: Hynek Schlawack (hynek) * (Python committer) Date: 2012年01月29日 12:47
I have fixed the refleak, added a _PyUnicode_HasNULChars and integrated it into the Win32-unicode-if-branch. Couldn't test it due to lack of win32 – the function itself is tested though.
msg152225 - (view) Author: Hynek Schlawack (hynek) * (Python committer) Date: 2012年01月29日 13:33
With Georg's kind help I added some improvements:
 - I've been reluctant to waste heap for caching the nul string but he convinced me that I was being ridiculous ;)
 - For some reason there was a stray character inside, that should be fixed too.
In related news, I'm also adding tests for fileio since the last patch.
msg152244 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012年01月29日 17:15
The patch works under Windows here (on branch default).
msg152248 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2012年01月29日 17:47
New changeset 572bb8c265c0 by Antoine Pitrou in branch '3.2':
Issue #13848: open() and the FileIO constructor now check for NUL characters in the file name.
http://hg.python.org/cpython/rev/572bb8c265c0
New changeset 6bb05ce1cd1f by Antoine Pitrou in branch 'default':
Issue #13848: open() and the FileIO constructor now check for NUL characters in the file name.
http://hg.python.org/cpython/rev/6bb05ce1cd1f 
msg152250 - (view) Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2012年01月29日 17:50
I've made small changes and committed the patch in 3.2 and 3.3.
2.7 would need further changes and I don't think it's worth the bother.
Thanks!
History
Date User Action Args
2022年04月11日 14:57:26adminsetgithub: 58056
2012年01月29日 17:50:08pitrousetstatus: open -> closed
resolution: fixed
messages: + msg152250

stage: patch review -> resolved
2012年01月29日 17:47:29python-devsetnosy: + python-dev
messages: + msg152248
2012年01月29日 17:15:24pitrousetmessages: + msg152244
stage: needs patch -> patch review
2012年01月29日 13:33:43hyneksetfiles: + open-nul.patch

messages: + msg152225
2012年01月29日 12:47:16hyneksetfiles: + open-nul.patch

messages: + msg152224
2012年01月27日 23:26:32terry.reedysetnosy: + terry.reedy
messages: + msg152139
2012年01月27日 09:32:32pitrousetmessages: + msg152077
2012年01月27日 09:28:30pitrousetmessages: + msg152075
2012年01月24日 15:52:13hyneksetfiles: + open-nul.patch
keywords: + patch
messages: + msg151913
2012年01月24日 13:42:01hyneksetmessages: + msg151906
2012年01月24日 13:20:50pitrousetmessages: + msg151902
2012年01月24日 12:55:31hyneksetmessages: + msg151899
2012年01月24日 02:06:27alexsetnosy: + alex
2012年01月24日 02:03:19pitroucreate

AltStyle によって変換されたページ (->オリジナル) /