
This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: Add a new os.read_into() function to avoid memory copies
Type: Stage:
Components: Library (Lib) Versions: Python 3.5
process
Status: closed Resolution: postponed
Dependencies: Superseder:
Assigned To: Nosy List: martin.panter, piotr.dobrogost, pitrou, vstinner
Priority: normal Keywords:

Created on 2015-03-23 21:16 by vstinner, last changed 2022-04-11 14:58 by admin. This issue is now closed.

Messages (5)
msg239069 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-03-23 21:16
Sockets have a recv_into() method and io.IOBase has a readinto() method, but there is no os.read_into() function. Such a function would avoid memory copies. It would benefit the Python implementation of FileIO (the readall() and readinto() methods); see issue #21859.
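For illustration only, here is a minimal sketch of what such a function might look like when built on the existing os.readv() call (Unix only), which already reads from a file descriptor into caller-supplied writable buffers. The name read_into and the demo() helper are hypothetical and are not part of the os module or of this proposal's eventual design.

```python
import os


def read_into(fd, buffer):
    """Hypothetical helper: read from fd directly into a writable buffer.

    Returns the number of bytes read (0 at end of file). No intermediate
    bytes object is created; the data lands in `buffer`.
    """
    # os.readv() takes a sequence of mutable bytes-like objects.
    return os.readv(fd, [buffer])


def demo():
    """Write a few bytes to a pipe and read them back into a bytearray."""
    read_fd, write_fd = os.pipe()
    try:
        os.write(write_fd, b"hello")
        buf = bytearray(16)           # preallocated, reused by the caller
        n = read_into(read_fd, buf)   # fills the start of buf
        return bytes(buf[:n])         # copy made here only for display
    finally:
        os.close(read_fd)
        os.close(write_fd)
```

The caller owns the buffer and can reuse it across calls, which is the same pattern as socket.recv_into().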
msg239072 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-03-23 21:26
os.read_into() may be used by the following functions.
subprocess.Popen._execute_child():
    # Wait for exec to fail or succeed; possibly raising an
    # exception (limited in size)
    errpipe_data = bytearray()
    while True:
        part = os.read(errpipe_read, 50000)
        errpipe_data += part
        if not part or len(errpipe_data) > 50000:
            break
subprocess.Popen.communicate():
    self._fileobj2output = {}
    if self.stdout:
        self._fileobj2output[self.stdout] = []
    ...
    data = os.read(key.fd, 32768)
    if not data:
        ...
    self._fileobj2output[key.fileobj].append(data)
    ...
    stdout = b''.join(...)
multiprocessing.Connection._recv():
    def _recv(self, size, read=_read):
        buf = io.BytesIO()
        handle = self._handle
        remaining = size
        while remaining > 0:
            chunk = read(handle, remaining)
            n = len(chunk)
            if n == 0:
                if remaining == size:
                    raise EOFError
                else:
                    raise OSError("got end of file during message")
            buf.write(chunk)
            remaining -= n
        return buf
multiprocessing.read_unsigned():
    def read_unsigned(fd):
        data = b''
        length = UNSIGNED_STRUCT.size
        while len(data) < length:
            s = os.read(fd, length - len(data))
            if not s:
                raise EOFError('unexpected EOF')
            data += s
        return UNSIGNED_STRUCT.unpack(data)[0]
The problem is that some functions are still required to return a bytes object, not a bytearray or something else. Converting a bytearray to bytes still requires a memory copy...
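As a rough sketch of how a read loop such as read_unsigned() might look with a preallocated buffer, the following uses os.readv() (Unix only) as a stand-in for the proposed function. The helper read_exactly_into(), the demo() function, and the "Q" format chosen for UNSIGNED_STRUCT are assumptions made for this example, not code from the standard library.

```python
import os
import struct

# Assumption for this sketch: an unsigned 64-bit native integer.
UNSIGNED_STRUCT = struct.Struct("Q")


def read_exactly_into(fd, buffer):
    """Hypothetical helper: fill `buffer` completely from fd, or raise EOFError."""
    view = memoryview(buffer)
    total = 0
    while total < len(view):
        # Slicing a memoryview does not copy; readv writes into the tail.
        n = os.readv(fd, [view[total:]])
        if n == 0:
            raise EOFError("unexpected EOF")
        total += n
    return total


def read_unsigned(fd):
    """Variant of the loop above that never concatenates bytes objects."""
    buf = bytearray(UNSIGNED_STRUCT.size)
    read_exactly_into(fd, buf)
    # unpack_from() accepts a bytearray, so no conversion to bytes is needed.
    return UNSIGNED_STRUCT.unpack_from(buf)[0]


def demo():
    """Send one packed integer through a pipe and decode it."""
    read_fd, write_fd = os.pipe()
    try:
        os.write(write_fd, UNSIGNED_STRUCT.pack(12345))
        return read_unsigned(read_fd)
    finally:
        os.close(read_fd)
        os.close(write_fd)
```

This avoids the repeated `data += s` reallocation, but only because the result is consumed through the buffer protocol. A function that must hand back a bytes object would still need `bytes(buf)`, which is the copy mentioned above.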
msg239079 - Author: Antoine Pitrou (pitrou) * (Python committer) Date: 2015-03-23 22:40
Why do you want to optimize the pure Python FileIO?
msg239550 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-03-30 01:25
> Why do you want to optimize the pure Python FileIO?
I gave more examples than FileIO in this issue.
msg244059 - Author: STINNER Victor (vstinner) * (Python committer) Date: 2015-05-25 22:51
Without more interest, I have chosen to defer this issue. Feel free to reopen it if you need it for more use cases, or if you are interested in implementing it.
History
Date                 User             Action  Args
2022-04-11 14:58:14  admin            set     github: 67942
2015-05-25 22:51:43  vstinner         set     status: open -> closed; resolution: postponed; messages: + msg244059
2015-03-30 01:25:07  vstinner         set     messages: + msg239550
2015-03-23 23:47:06  martin.panter    set     nosy: + martin.panter
2015-03-23 22:55:05  piotr.dobrogost  set     nosy: + piotr.dobrogost
2015-03-23 22:40:14  pitrou           set     nosy: + pitrou; messages: + msg239079
2015-03-23 21:26:50  vstinner         set     messages: + msg239072
2015-03-23 21:16:11  vstinner         create
