This issue tracker has been migrated to GitHub
and is currently read-only.
For more information,
see the GitHub FAQs in the Python Developer Guide.
Created on 2011-10-13 18:40 by Alexander.Steppke, last changed 2022-04-11 14:57 by admin. This issue is now closed.
| Messages (8) | |||
|---|---|---|---|
| msg145477 - (view) | Author: Alexander Steppke (Alexander.Steppke) | Date: 2011-10-13 18:40 | |
The tempfile module shows strange behavior under certain conditions. This might lead to data leaks or other problems.
The test session looks as follows:
Python 2.7.2 (default, Jun 12 2011, 15:08:59) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import tempfile
>>> tmp = tempfile.TemporaryFile()
>>> tmp.read()
''
>>> tmp.write('test')
>>> tmp.read()
'P\xf6D\x02\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\ [omitted]'
or similar behavior in text mode:
Python 2.7.2 (default, Jun 12 2011, 15:08:59) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import tempfile
>>> tmp = tempfile.TemporaryFile('w+t')
>>> tmp.read()
''
>>> tmp.write('test')
>>> tmp.read()
'\x00\xa5\x8b\x02int or long, hash(a) is used instead.\n i\x10 [omitted]'
>>> tmp.seek(0)
>>> tmp.readline()
'test\x00\xa5\x8b\x02int or long, hash(a) is used instead.\n'
This bug seems to be triggered by calling tmp.read() before tmp.seek(). I am running Python 2.7.2 on Windows 7 x64; other people have reproduced the problem on Windows XP, but not under Linux or Cygwin (see also http://stackoverflow.com/questions/7757663/python-tempfile-broken-or-am-i-doing-it-wrong).
Thank you for looking into this.
Alexander
|
|||
| msg145480 - (view) | Author: R. David Murray (r.david.murray) * (Python committer) | Date: 2011-10-13 19:11 | |
I wonder if it is a bug in Windows? Have you tried similar experiments with regular files? tempfile is really just about *where* the files are located (and what happens when they are closed), not about their fundamental nature as OS file objects. (I could be wrong about that on Windows of course, I'm more familiar with Linux.) |
|||
| msg145501 - (view) | Author: Alexander Steppke (Alexander.Steppke) | Date: 2011-10-14 09:13 | |
Hi David,
I followed your suggestion and tried to reproduce the problem without the tempfile module. It turns out that there is indeed an underlying issue. I am not sure what the root cause is, but this is now an even bigger problem: read() returns information from some file or memory that it was never intended to access.
The session looks similar to the tempfile session:
>>> tmp = open('tmp', 'w+t')
>>> tmp.read()
''
>>> tmp.write('test')
>>> tmp.read()
'hp\'\x02\xe4\xb9>7\x80\x88\x81\x02\x01\x00\x00\x00\x00\x00\x00\x00\x12\x00\x00\
x00\xe86(\x02p\x11\x8d\x02\x01\x00\x00\x00@\xfd)\x02\xe7Y\x9aN\x01\x00\x00\x00\x
00\x00\x00\x00\x14\x00\x00\x00\x087(\x02\x00\x00\x00\x00\xe9Y\x0b\xa2\x00\x93+\x
02\x01\x00\x00\x00\x00\x00\x00\x00\x00\x9b,\x02\x02\x00\x00\x00\xe06(\x02\xc0W5\
At the moment the bug could only be reproduced using CPython 2.7.1 on Windows XP and Windows 7.
Alexander
|
|||
| msg145502 - (view) | Author: Alexander Steppke (Alexander.Steppke) | Date: 2011-10-14 09:20 | |
Additionally, after calling tmp.close(), the file 'tmp' contains the string 'test' followed by about 4 kB of binary data similar to the previous output of tmp.read(). |
|||
| msg145508 - (view) | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2011-10-14 11:37 | |
This issue is a duplicate of issue #1394612, which has been closed as invalid. Read the following message: http://bugs.python.org/issue1394612#msg27200 I suppose that Python 3 is not affected by this issue because it doesn't use fread/fwrite anymore, but calls read/write directly (the low-level, unbuffered API). It looks like Python cannot do anything about this issue, except document this surprising behaviour. Would you like to write a patch for the documentation? |
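For readers who want to check the Python 3 side of this supposition, here is a minimal sketch, assuming a Python 3 interpreter where the `io` module backs `tempfile.TemporaryFile()`. The function name `write_then_read` is only illustrative. It repeats the read/write/read sequence from the original report and returns what the second `read()` yields, both directly and after a `seek(0)`.

```python
import tempfile


def write_then_read(data=b"test"):
    """Repeat the sequence from the report and return both read results."""
    with tempfile.TemporaryFile() as tmp:  # default mode is 'w+b'
        tmp.read()             # initial read on the empty file
        tmp.write(data)
        direct = tmp.read()    # read directly after write, no seek in between
        tmp.seek(0)
        rewound = tmp.read()   # read after rewinding to the start
    return direct, rewound


if __name__ == "__main__":
    print(write_then_read())
```

On Python 3 the direct read should simply report end-of-file rather than unrelated memory, while the rewound read returns the data that was written. This says nothing about Python 2.7 on Windows, where the report above was made.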
|||
| msg145513 - (view) | Author: Alexander Steppke (Alexander.Steppke) | Date: 2011-10-14 12:37 | |
Thank you for the update, Victor. It seems to me that this is exactly the same issue. The current documentation says (http://docs.python.org/library/stdtypes.html#bltin-file-objects): "Note: This function is simply a wrapper for the underlying fread() C function, and will behave the same in corner cases, such as whether the EOF value is cached." This hints at the current behavior, but I would not conclude from it that file.read() can return arbitrary data if used directly after file.write(). Maybe one could include a link to, or a snippet of, the C standard, which states that one shall not do this: "When a file is opened with update mode ('+' as the second or third character in the above list of mode argument values), both input and output may be performed on the associated stream. However, output shall not be directly followed by input without an intervening call to the fflush function or to a file positioning function (fseek, fsetpos, or rewind), and input shall not be directly followed by output without an intervening call to a file positioning function, unless the input operation encounters end-of-file." (from http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1124.pdf, page 272) |
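The write-then-read half of the quoted rule can be sketched in Python as follows. This is only an illustration under stated assumptions: `read_back` is a hypothetical helper, not part of any library, and it assumes a seekable file opened with '+' in the mode. It places a positioning call between the output and the input, which is what the quoted paragraph requires.

```python
import tempfile


def read_back(f, data):
    """Write data at the current position, then read that same data back.

    f is assumed to be a seekable file object opened with '+' in its mode.
    """
    start = f.tell()
    f.write(data)
    f.seek(start)              # positioning call between output and input
    return f.read(len(data))


if __name__ == "__main__":
    with tempfile.TemporaryFile() as tmp:  # binary update mode, 'w+b'
        print(read_back(tmp, b"test"))
```

The `seek()` is the important part: without it, the C-level behaviour described in the quote is undefined for streams built on fread/fwrite.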
|||
| msg145541 - (view) | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2011-10-14 15:55 | |
On 14/10/2011 14:37, Alexander Steppke wrote:
> "When a file is opened with update mode ('+' as the second or third character in the above list of mode argument values),
You can just say " '+' in the file mode ".
> the fflush function or to a file positioning function (fseek, fsetpos, or rewind),
You should translate these names into Python method names:
fflush -> file.flush()
fseek/fsetpos -> file.seek()
rewind -> (not exposed in Python)
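As a rough sketch of how these Python method names fit the rule, the example below switches from reading to writing on one file opened in update mode. The helper name `patch_after_prefix` is hypothetical. `seek(0, os.SEEK_CUR)` is used as a positioning call that leaves the position unchanged, and `flush()` is called after the write.

```python
import os
import tempfile


def patch_after_prefix(f, prefix_len, patch):
    """Read a prefix, overwrite the bytes right after it, return both views."""
    f.seek(0)
    head = f.read(prefix_len)
    f.seek(0, os.SEEK_CUR)     # file.seek(): stands in for fseek/fsetpos
    f.write(patch)
    f.flush()                  # file.flush(): stands in for fflush
    f.seek(0)
    return head, f.read()


if __name__ == "__main__":
    with tempfile.TemporaryFile() as tmp:
        tmp.write(b"hello world")
        print(patch_after_prefix(tmp, 5, b"-"))
```

Calling `seek(0, os.SEEK_CUR)` is one common way to satisfy the "intervening call to a file positioning function" requirement without moving the file position.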
|
|||
| msg145577 - (view) | Author: Terry J. Reedy (terry.reedy) * (Python committer) | Date: 2011-10-15 00:43 | |
This issue has come up enough (tracker and python-list) that I think adding a mild adaptation of the C standard paragraph might be a good idea. Changing to a doc issue. |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:57:22 | admin | set | github: 57380 |
| 2020-05-31 12:07:21 | serhiy.storchaka | set | status: open -> closed resolution: out of date stage: resolved |
| 2011-10-15 00:43:19 | terry.reedy | set | nosy: + terry.reedy, docs@python messages: + msg145577 assignee: docs@python components: + Documentation, - Library (Lib), Windows, IO |
| 2011-10-14 15:55:20 | vstinner | set | messages: + msg145541 |
| 2011-10-14 12:37:29 | Alexander.Steppke | set | messages: + msg145513 |
| 2011-10-14 11:37:12 | vstinner | set | messages: + msg145508 |
| 2011-10-14 09:30:09 | vstinner | set | nosy: + vstinner |
| 2011-10-14 09:20:44 | Alexander.Steppke | set | messages: + msg145502 |
| 2011-10-14 09:15:11 | Alexander.Steppke | set | components: + IO title: Bug in tempfile module -> Bug in file.read(), can access unknown data. |
| 2011-10-14 09:13:47 | Alexander.Steppke | set | messages: + msg145501 |
| 2011-10-13 19:11:22 | r.david.murray | set | nosy: + r.david.murray messages: + msg145480 |
| 2011-10-13 18:40:10 | Alexander.Steppke | create | |