homepage

This issue tracker has been migrated to GitHub , and is currently read-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title: shutil.copyfileobj should internally use os.sendfile when possible
Type: enhancement Stage: resolved
Components: Library (Lib) Versions:
process
Status: closed Resolution: rejected
Dependencies: Superseder:
Assigned To: Nosy List: desbma, giampaolo.rodola, martin.panter, python-dev, r.david.murray, vstinner
Priority: normal Keywords:

Created on 2015年09月10日 18:38 by desbma, last changed 2022年04月11日 14:58 by admin. This issue is now closed.

Messages (7)
msg250401 - (view) Author: desbma (desbma) * Date: 2015年09月10日 18:38
Sorry if it has already been discussed, or if this is a stupid idea.
By looking at the signature, my thought was that the only use of shutil.copyfileobj was to wrap the use of sendfile, and use a fallback if it is not available on the system or not usable with "fake" Python files (not having a file descriptor, eg. like gzip.GzipFile).
By looking at the implementation, I was surprised that it does not try to call os.sendfile.
msg250406 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2015年09月10日 19:10
No, it's to copy file like objects that aren't necessarily true file system objects. The only requirement is that they support a minimal file-like object interface. There is no suggestion or requirement that the object supports file descriptors (which would make it a real os file object).
I think changing it to use os.sendfile if possible would be too risky in terms of causing unexpected breakage (objects that look like they have a file descriptor but really don't), especially since handling file system files is explicitly not the purpose of copyfileobj.
msg250426 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2015年09月10日 22:31
I agree it is a nice idea, but have the same concern as David. Many pseudo file objects wrap real file descriptors. E.g. BufferedReader hides unread data in a buffer, GzipFile decompresses data, HTTPResponse decodes chunks. Detecting when sendfile() is appropriate would be hard to do. And I understand some platforms only support sockets.
Also I notice that 3.5 has a new socket.sendfile() method (Issue 17552), which partly overlaps with this proposal. BTW the 3.4 os documentation should not mention the socket method, but it does!
msg250503 - (view) Author: Roundup Robot (python-dev) (Python triager) Date: 2015年09月11日 23:56
New changeset 3d9cbbad8a04 by Martin Panter <vadmium> in branch '3.4':
Issue #25063: socket.sendfile() does not exist in 3.4
https://hg.python.org/cpython/rev/3d9cbbad8a04 
msg250578 - (view) Author: desbma (desbma) * Date: 2015年09月13日 17:18
If I understand you correctly, a naive implementation like this:
use_chunk_copy = False
try:
 fdsrc = fsrc.fileno()
 fddst = fdst.fileno()
except OSError:
 # fallback on current chunk based Python copy
 use_chunk_copy = True
else:
 # we have 2 fd, try sendfile
 try:
 os.sendfile(fdsrc, fddst)
 except AttributeError:
 # sendfile not available
 use_chunk_copy = True
if use_chunk_copy:
 # use current Python based chunk copy
...would not work because some file like objects expose fileno(), even though it does not accurately represent the file because they wrap it to add special behavior (like GzipFile).
I don't understand, if they alter the file behavior, why do they expose fileno()?
msg250586 - (view) Author: Martin Panter (martin.panter) * (Python committer) Date: 2015年09月13日 20:50
I don’t know the exact reasoning, but fileno() could be useful for other purposes, e.g. passing to stat() to find the last modification time, passing to select() to see if data is available.
msg250657 - (view) Author: R. David Murray (r.david.murray) * (Python committer) Date: 2015年09月14日 12:02
I'm going to reject this. If you think there is enough value in this to warrant the complications that would be required to ensure backward compatibility (assuming that is even possible), you could bring it up again on python-ideas.
History
Date User Action Args
2022年04月11日 14:58:20adminsetgithub: 69249
2018年05月19日 11:27:44giampaolo.rodolasetnosy: + giampaolo.rodola
2015年09月14日 12:02:59r.david.murraysetstatus: open -> closed
resolution: rejected
messages: + msg250657

stage: resolved
2015年09月13日 20:50:40martin.pantersetmessages: + msg250586
2015年09月13日 20:22:19vstinnersetnosy: + vstinner
2015年09月13日 17:18:00desbmasetmessages: + msg250578
2015年09月11日 23:56:37python-devsetnosy: + python-dev
messages: + msg250503
2015年09月10日 22:31:57martin.pantersettype: enhancement

messages: + msg250426
nosy: + martin.panter
2015年09月10日 19:10:04r.david.murraysetnosy: + r.david.murray
messages: + msg250406
2015年09月10日 18:38:50desbmacreate

AltStyle によって変換されたページ (->オリジナル) /