Created on 2011-05-19 16:04 by vstinner, last changed 2022-04-11 14:57 by admin. This issue is now closed.
| Messages (5) | |||
|---|---|---|---|
| msg136296 - (view) | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2011-05-19 16:04 | |
Example:
with open("setup.py", "rb") as f:
    # read smaller than the file size to fill the readahead buffer
    f.read(1)
    # seek doesn't seek
    f.seek(0)
    print("f pos=", f.tell())
    print("f.raw pos=", f.raw.tell())
Output:
f pos= 0
f.raw pos= 4096
I expect f.raw.tell() to be 0.
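The snippet above depends on a local setup.py. A self-contained sketch of the same reproduction follows; the temporary file, its size, and the explicit buffering=4096 are assumptions chosen here for illustration and are not part of the report.

```python
import os
import tempfile

# Assumption: an explicit buffer size, so the result does not depend on
# platform defaults.
BUFFER_SIZE = 4096

# Scratch file several buffers long (hypothetical stand-in for setup.py).
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as tmp:
    tmp.write(b"x" * (4 * BUFFER_SIZE))

try:
    with open(path, "rb", buffering=BUFFER_SIZE) as f:
        f.read(1)   # fills the readahead buffer from the raw file
        f.seek(0)   # target lies inside the buffer, so the raw file is not moved
        buffered_pos = f.tell()
        raw_pos = f.raw.tell()
finally:
    os.remove(path)

print("f pos=", buffered_pos)
print("f.raw pos=", raw_pos)
```

With the C implementation of io, the buffered position should come back as 0 while the raw position stays one buffer ahead.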
Extract of Modules/_io/buffered.c:
if (whence != 2 && self->readable) {
    Py_off_t current, avail;
    /* Check if seeking leaves us inside the current buffer,
       so as to return quickly if possible. Also, we needn't take the
       lock in this fast path.
       Don't know how to do that when whence == 2, though. */
    /* NOTE: RAW_TELL() can release the GIL but the object is in a stable
       state at this point. */
    current = RAW_TELL(self);
    avail = READAHEAD(self);
    printf("current=%" PY_PRIdOFF ", avail=%" PY_PRIdOFF "\n", current, avail);
    if (avail > 0) {
        Py_off_t offset;
        if (whence == 0)
            offset = target - (current - RAW_OFFSET(self));
        else
            offset = target;
        printf("offset=%" PY_PRIdOFF "\n", offset);
        if (offset >= -self->pos && offset <= avail) {
            printf("NO SEEK!\n");
            self->pos += offset;
            return PyLong_FromOff_t(current - avail + offset);
        }
    }
}
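To make the arithmetic of this fast path easier to follow, here is a rough Python model of the extract. The function name and its parameters are hypothetical; they mirror the C variables (current, avail, self->pos, and the value of RAW_OFFSET(self)), and the self->readable check is left out. The sample values assume the state after f.read(1) with a 4096-byte buffer, as in the example output.

```python
def buffered_seek_fast_path(target, whence, current, pos, avail, raw_offset):
    """Model of the in-buffer seek check (hypothetical helper, not CPython API).

    Returns (new_pos_in_buffer, reported_position) when the seek can be
    satisfied from the buffer, or None when a real raw seek would be needed.
    """
    if whence == 2 or avail <= 0:
        return None
    if whence == 0:
        # Distance from the current logical position to the absolute target.
        offset = target - (current - raw_offset)
    else:
        # whence == 1: the target is already a relative displacement.
        offset = target
    if -pos <= offset <= avail:
        return pos + offset, current - avail + offset
    return None


# State assumed after f.read(1): the raw file is at 4096, one byte of the
# buffer has been consumed, and 4095 bytes remain unread.
inside = buffered_seek_fast_path(
    target=0, whence=0, current=4096, pos=1, avail=4095, raw_offset=4095
)
outside = buffered_seek_fast_path(
    target=10000, whence=0, current=4096, pos=1, avail=4095, raw_offset=4095
)
print(inside)
print(outside)
```

In the first call the offset is -1, which lies within [-pos, avail], so only the in-buffer position changes and the raw file stays at 4096. In the second call the target lies beyond the buffered data, so the model reports that a real seek is required.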
I found this weird behaviour when trying to understand why:
with open("setup.py", 'rb') as f:
    encoding, lines = tokenize.detect_encoding(f.readline)
with open("setup.py", 'r', encoding=encoding) as f:
    imp.load_module("setup", f, "setup.py", (".py", "r", imp.PY_SOURCE))
is different than:
with tokenize.open("setup.py") as f:
    imp.load_module("setup", f, "setup.py", (".py", "r", imp.PY_SOURCE))
imp.load_module() clones the file using something like fd = os.dup(f.fileno()); clone = os.fdopen(fd, "r").
For tokenize.open(), a workaround is to replace:
buffer.seek(0)
by
buffer.seek(0); buffer.raw.seek(0)
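The effect of the workaround on a cloned descriptor can be sketched as follows. The helper name, the temporary file, and the explicit buffer size are assumptions made for this illustration; the clone is created with os.dup() and os.fdopen() as described above. Replies later in this thread state that touching the raw file under a buffered wrapper is not supported, so this is a demonstration rather than a recommendation.

```python
import os
import tempfile

# Assumption: an explicit buffer size for a predictable readahead.
BUFFER_SIZE = 4096


def first_byte_seen_by_clone(path, apply_workaround):
    """Return the first byte read through a dup()'ed descriptor (hypothetical helper)."""
    with open(path, "rb", buffering=BUFFER_SIZE) as buffer:
        buffer.read(1)          # raw file position moves to BUFFER_SIZE
        buffer.seek(0)          # satisfied from the buffer, raw file untouched
        if apply_workaround:
            buffer.raw.seek(0)  # the workaround from the report
        clone_fd = os.dup(buffer.fileno())  # shares the file offset with the raw file
        with os.fdopen(clone_fd, "rb") as clone:
            return clone.read(1)


# First buffer's worth of data is b"a", the second is b"b", so the byte read
# by the clone reveals where the shared offset was.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as tmp:
    tmp.write(b"a" * BUFFER_SIZE + b"b" * BUFFER_SIZE)

try:
    without_workaround = first_byte_seen_by_clone(path, apply_workaround=False)
    with_workaround = first_byte_seen_by_clone(path, apply_workaround=True)
finally:
    os.remove(path)

print("without workaround:", without_workaround)
print("with workaround:", with_workaround)
```

Without the extra raw seek, the clone starts reading at the raw position left behind by the readahead; with it, the clone starts at the beginning of the file.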
|
|||
| msg136297 - (view) | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2011-05-19 16:07 | |
Note: _pyio.BufferedReader(), _pyio.BufferedWriter(), _pyio.BufferedRandom() don't use this optimization. They might be patched too. |
|||
| msg136298 - (view) | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2011-05-19 16:16 | |
This is by design. |
|||
| msg136306 - (view) | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2011-05-19 16:39 | |
And how can I seek the raw file to zero?
Using buffer.raw.seek(0), buffer.tell() becomes inconsistent:
$ ./python
Python 3.2.1b1 (3.2:bd5e4d8c8080, May 15 2011, 10:22:54)
>>> buffer=open('setup.py', 'rb')
>>> buffer.read(1)
>>> buffer.tell()
1
>>> buffer.raw.tell()
4096
>>> buffer.raw.seek(0)
0
>>> buffer.raw.tell()
0
>>> buffer.tell()
-4095
Same problem with os.lseek():
$ ./python
Python 3.2.1b1 (3.2:bd5e4d8c8080, May 15 2011, 10:22:54)
>>> import os
>>> buffer=open("setup.py", "rb")
>>> buffer.read(1)
>>> os.lseek(buffer.fileno(), 0, 0)
0
>>> buffer.raw.tell()
0
>>> buffer.tell()
-4095
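The value -4095 in both sessions follows from simple arithmetic, assuming the buffered object reports its position as the raw position minus the bytes that were read ahead but not yet consumed. The helper below is a hypothetical model of that calculation, not CPython code.

```python
def buffered_tell(raw_tell, unread_bytes):
    """Logical position as seen through the buffer (hypothetical model)."""
    return raw_tell - unread_bytes


# After read(1), 4096 bytes were fetched from the raw file and one was consumed.
unread = 4096 - 1

before_raw_seek = buffered_tell(4096, unread)  # raw file still at 4096
after_raw_seek = buffered_tell(0, unread)      # raw file moved behind the buffer's back

print(before_raw_seek)
print(after_raw_seek)
```

The buffer still believes 4095 unread bytes sit ahead of the raw position, so moving the raw file to 0 without telling the buffer yields 0 - 4095.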
|
|||
| msg136309 - (view) | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2011-05-19 16:44 | |
Simple: you are not supposed to use the raw file if you wrapped it inside a buffered file. |
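One way to stay within that rule when the OS-level file position matters is to avoid the buffered layer altogether by opening the file with buffering=0. This is only an illustration and was not proposed in the thread; the temporary file is an assumption of the example.

```python
import io
import os
import tempfile

# Scratch file for the demonstration (hypothetical stand-in for setup.py).
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as tmp:
    tmp.write(b"x" * 8192)

try:
    # buffering=0 in binary mode returns the raw FileIO object directly,
    # so there is no readahead buffer to get out of sync.
    with open(path, "rb", buffering=0) as raw:
        is_raw = isinstance(raw, io.FileIO)
        raw.read(1)
        logical_pos = raw.tell()
        os_pos = os.lseek(raw.fileno(), 0, os.SEEK_CUR)
        os.lseek(raw.fileno(), 0, os.SEEK_SET)
        after_lseek = raw.tell()
finally:
    os.remove(path)

print(is_raw, logical_pos, os_pos, after_lseek)
```

Here the Python-level position and the descriptor's position always agree, at the cost of one system call per read.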
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:57:17 | admin | set | github: 56325 |
| 2011-05-19 16:44:17 | pitrou | set | messages: + msg136309 |
| 2011-05-19 16:39:34 | vstinner | set | messages: + msg136306 |
| 2011-05-19 16:16:41 | pitrou | set | status: open -> closed resolution: not a bug messages: + msg136298 |
| 2011-05-19 16:07:19 | vstinner | set | messages: + msg136297 |
| 2011-05-19 16:04:45 | vstinner | create | |