This issue tracker has been migrated to GitHub and is currently read-only.
For more information, see the GitHub FAQs in the Python Developer Guide.
Created on 2010-11-21 06:16 by v+python, last changed 2022-04-11 14:57 by admin.
| Messages (9) | |||
|---|---|---|---|
| msg121871 | Author: Glenn Linderman (v+python) * | Date: 2010-11-21 06:16 | |
.communicate is a nice API for programs that produce little output, which can therefore be buffered. While that may cover a wide range of uses, it doesn't cover launching CGI programs, as http.server does. The http.server documentation now carries nice warnings about that issue. However, while .communicate has the building blocks for more general solutions, it neither exposes them to the user nor separates them into reusable pieces; it is a monolith inside ._communicate.
For example, it would be nice to have an API that would "capture a stream using a thread", which could be used for either stdout or stderr, and is what ._communicate does under the covers for both of them. It would also be nice to have an API that would "pump a bytestream to .stdin as a background thread". ._communicate doesn't provide that one; it uses the foreground thread for that, and it requires the input to be fully buffered. It would be most useful for http.server if this API could connect a file handle and an optional maximum read count to .stdin, yet do it in a background thread.
That would leave the foreground thread able to process stdout. The correct behavior for http.server (not what it presently does, but I'll be entering that enhancement request soon) is to read the first line from the CGI program, transform it, add a few more headers, send that to the browser, and then hook up .stdout to the browser (shutil.copyfileobj can be used for the rest of the stream). However, the facilities currently provided by subprocess offer no deadlock-free way of achieving this sort of solution: capturing stderr to be logged, not buffering a potentially large file upload, and transforming stdout.
Breaking apart some of the existing building blocks, and adding an additional one for .stdin processing, would allow a real http.server implementation, as well as being more general for other complex uses. You see, for http.server, the stdin |
|||
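The arrangement described in msg121871 (stdin pumped from a background thread, stderr captured by another, stdout handled in the foreground) can be sketched roughly as follows. This is a minimal illustration, not code from subprocess or http.server: `run_streaming` and its parameters are hypothetical names, and it assumes the input source can simply be read to EOF.

```python
import shutil
import subprocess
import threading


def run_streaming(cmd, body, out, chunk_size=8192):
    """Hypothetical helper: run cmd, feeding `body` to its stdin in the background.

    Returns (first_line, stderr_bytes, returncode). Everything after the
    first stdout line is copied to `out` chunk by chunk.
    """
    proc = subprocess.Popen(cmd, stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    captured = []

    def pump_stdin():
        # Background writer: the upload is never held in memory as a whole.
        try:
            shutil.copyfileobj(body, proc.stdin, chunk_size)
        except OSError:
            pass  # the child stopped reading before all input was sent
        finally:
            try:
                proc.stdin.close()
            except OSError:
                pass

    def capture_stderr():
        # Background reader: stderr is assumed small enough to buffer.
        captured.append(proc.stderr.read())

    workers = [threading.Thread(target=pump_stdin, daemon=True),
               threading.Thread(target=capture_stderr, daemon=True)]
    for worker in workers:
        worker.start()

    first_line = proc.stdout.readline()   # foreground: inspect the first line
    shutil.copyfileobj(proc.stdout, out)  # then stream the remainder
    for worker in workers:
        worker.join()
    proc.stdout.close()
    proc.stderr.close()
    return first_line, captured[0], proc.wait()
```

Because each pipe has its own thread or the foreground loop draining it, no pipe can fill up while the parent is blocked on another one, which is the deadlock the message is concerned with.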
| msg122264 | Author: Glenn Linderman (v+python) * | Date: 2010-11-24 07:56 | |
So I've experimented a bit, and it looks like simply exposing ._readerthread as an external API would handle the buffered case for stdout or stderr. For http.server CGI scripts, I think it is fine to buffer stderr, as it should not be a high-volume channel... but not both stderr and stdout, as stdout can be huge. And not stdin, because it can be huge also.
For stdin, something like the following might work nicely for some cases, including http.server (with revisions):
    def _writerthread(self, fhr, fhw, length):
        while length > 0:
            buf = fhr.read(min(8196, length))
            if not buf:  # EOF before length bytes arrived
                break
            fhw.write(buf)
            length -= len(buf)
        fhw.close()
When the stdin data is buffered, but the application wishes to be stdout-centric instead of stdin-centric (like the current ._communicate code), a variation could be made replacing fhr by a data buffer, and writing it gradually (or fully) to the pipe, but from a secondary thread.
Happily, this sort of code (the above is extracted from a test version of http.server) can be implemented in the server, but would be more usefully provided by subprocess, in my opinion. To include the above code inside subprocess would just be a matter of tweaking references to class members instead of parameters. |
|||
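The variation mentioned at the end of msg122264, where the input is already an in-memory buffer but the caller wants to stay focused on stdout, might look something like this. It is a sketch under assumptions: `feed_buffer` and `filter_through` are hypothetical names, not subprocess APIs.

```python
import subprocess
import threading


def feed_buffer(data, pipe, chunk_size=8192):
    """Write `data` to `pipe` in chunks, then close it (meant to run in a thread)."""
    view = memoryview(data)
    try:
        for start in range(0, len(view), chunk_size):
            pipe.write(view[start:start + chunk_size])
    except OSError:
        pass  # the child closed its stdin early
    finally:
        try:
            pipe.close()
        except OSError:
            pass


def filter_through(cmd, data, chunk_size=8192):
    """Send `data` to cmd's stdin from a secondary thread; yield stdout chunks."""
    proc = subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    writer = threading.Thread(target=feed_buffer,
                              args=(data, proc.stdin, chunk_size), daemon=True)
    writer.start()
    with proc.stdout:
        # The foreground only ever reads, so it can process output as it arrives.
        for chunk in iter(lambda: proc.stdout.read(chunk_size), b""):
            yield chunk
    writer.join()
    proc.wait()
```

A caller would iterate over `filter_through(cmd, payload)` and handle each chunk as it arrives, instead of waiting for `.communicate` to return the whole output at once.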
| msg123058 | Author: Glenn Linderman (v+python) * | Date: 2010-12-02 06:59 | |
Here's an updated _writerthread idea that handles more cases:
    def _writerthread(self, fhr, fhw, length=None):
        if length is None:
            flag = True
            while flag:
                buf = fhr.read(512)
                fhw.write(buf)
                if len(buf) == 0:
                    flag = False
        else:
            while length > 0:
                buf = fhr.read(min(512, length))
                if not buf:  # EOF before length bytes arrived
                    break
                fhw.write(buf)
                length -= len(buf)
            # throw away additional data [see bug #427345]
            while select.select([fhr._sock], [], [], 0)[0]:
                if not fhr._sock.recv(1):
                    break
        fhw.close() |
|||
| msg123059 | Author: Glenn Linderman (v+python) * | Date: 2010-12-02 07:02 | |
Sorry, left some extraneous code in the last message, here is the right code:
    def _writerthread(self, fhr, fhw, length=None):
        if length is None:
            flag = True
            while flag:
                buf = fhr.read(512)
                fhw.write(buf)
                if len(buf) == 0:
                    flag = False
        else:
            while length > 0:
                buf = fhr.read(min(512, length))
                if not buf:  # EOF before length bytes arrived
                    break
                fhw.write(buf)
                length -= len(buf)
        fhw.close() |
|||
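For comparison, the two branches of the _writerthread idea in msg123059 can be folded into a single loop. The following is only a sketch: `pump` is a hypothetical standalone function rather than a subprocess method, and returning the byte count is an added convenience.

```python
def pump(src, dst, length=None, chunk_size=512):
    """Copy from src to dst, then close dst.

    Copies until EOF when length is None, otherwise at most `length` bytes.
    Returns the number of bytes actually copied.
    """
    copied = 0
    try:
        while length is None or copied < length:
            want = chunk_size if length is None else min(chunk_size, length - copied)
            buf = src.read(want)
            if not buf:  # EOF, possibly before `length` bytes arrived
                break
            dst.write(buf)
            copied += len(buf)
    finally:
        dst.close()  # signals end-of-input to whatever reads the other end
    return copied
```

It would typically be started with `threading.Thread(target=pump, args=(source, p.stdin, nbytes))`, where `p` is a Popen object created with `stdin=subprocess.PIPE`.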
| msg123454 | Author: Alyssa Coghlan (ncoghlan) * (Python committer) | Date: 2010-12-06 04:41 | |
The general idea is sound. My work colleagues have certainly had to implement their own reader/writer thread equivalents to keep subprocess from blocking. It makes sense to provide more robust public support for such techniques in subprocess itself. |
|||
| msg123517 | Author: Glenn Linderman (v+python) * | Date: 2010-12-07 03:01 | |
Looking at the code the way I've used it in my modified server.py:
    stderr = []
    stderr_thread = threading.Thread(target=self._readerthread,
                                     args=(p.stderr, stderr))
    stderr_thread.daemon = True
    stderr_thread.start()
    self.log_message("writer: %s" % str(nbytes))
    stdin_thread = threading.Thread(target=self._writerthread,
                                    args=(self.rfile, p.stdin, nbytes))
    stdin_thread.daemon = True
    stdin_thread.start()
and later
    stderr_thread.join()
    stdin_thread.join()
    p.stderr.close()
    p.stdout.close()
    if stderr:
        stderr = stderr[0].decode("UTF-8")
It seems like this sort of code (possibly passing in the encoding) could be bundled back inside subprocess (I borrowed it from there).
It also seems from recent discussion on npopdev that the cheat-sheet "how to replace" other sys and os popen functions would be better done as wrapper functions for the various cases. Someone pointed out that the hard cases probably aren't cross-platform, but that currently the easy cases all get harder when using subprocess than when using the deprecated facilities. They shouldn't. The names may need to be a bit more verbose to separate the various use cases, but each use case should remain at least as simple as the prior function.
So perhaps instead of just subprocess.PIPE to select particular handling for stdin, stdout, and stderr, subprocess should implement some varieties to handle attaching different types of reader and writer threads to the handles... of course, parameters need to come along for the ride too: maybe the additional variations would be object references with parameters supplied, instead of just a manifest constant like .PIPE.
|
|||
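The reader-thread bundling suggested in msg123517, including the idea of passing in the encoding, could be packaged along these lines. This is a sketch, not an existing subprocess API: `start_capture` is a hypothetical name, and it assumes the captured stream is small enough to hold in memory.

```python
import threading


def start_capture(stream, encoding=None):
    """Hypothetical helper: read `stream` to EOF in a daemon thread.

    Returns (thread, result). After thread.join(), result[0] holds the
    captured bytes, or a str if an encoding was given.
    """
    result = []

    def reader():
        data = stream.read()
        stream.close()
        result.append(data.decode(encoding) if encoding else data)

    thread = threading.Thread(target=reader, daemon=True)
    thread.start()
    return thread, result
```

With a Popen object `p` created with `stderr=subprocess.PIPE`, a caller would write `thread, captured = start_capture(p.stderr, "utf-8")`, process `p.stdout` in the foreground, and read `captured[0]` after `thread.join()`.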
| msg123521 | Author: Alyssa Coghlan (ncoghlan) * (Python committer) | Date: 2010-12-07 04:01 | |
Or various incarnations of functools.partial applied to subprocess.Popen. |
|||
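The functools.partial suggestion in msg123521 could take a shape like the one below. The preset names `popen_read` and `popen_write` are made up for illustration; only functools.partial and subprocess.Popen are real APIs here.

```python
import functools
import subprocess
import sys

# Hypothetical presets: each one fixes the stream handling for one use case.
popen_read = functools.partial(subprocess.Popen,
                               stdout=subprocess.PIPE,
                               stderr=subprocess.DEVNULL)
popen_write = functools.partial(subprocess.Popen, stdin=subprocess.PIPE)

# Using a preset looks like calling Popen with fewer arguments.
with popen_read([sys.executable, "-c", "print('hello')"]) as proc:
    output = proc.stdout.read()
```

Keyword arguments passed at call time still override the preset ones, so a preset remains a thin convenience layer over Popen rather than a separate API.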
| msg222894 | Author: Mark Lawrence (BreamoreBoy) * | Date: 2014-07-12 22:53 | |
@Glenn can you provide a formal patch so we can take this forward? |
|||
| msg245560 | Author: Martin Panter (martin.panter) * (Python committer) | Date: 2015-06-20 12:50 | |
Related: Issue 1260171, essentially proposing streaming readers and writers for communicate() instead of fixed buffers, but without using OS threads. |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:57:09 | admin | set | github: 54691 |
| 2019-03-15 21:59:23 | BreamoreBoy | set | nosy: - BreamoreBoy |
| 2015-06-20 12:50:44 | martin.panter | set | messages: + msg245560 |
| 2015-03-22 15:11:03 | akira | set | nosy: + akira |
| 2015-03-22 07:19:35 | martin.panter | set | nosy: + martin.panter |
| 2014-07-12 22:53:20 | BreamoreBoy | set | nosy: + BreamoreBoy; messages: + msg222894; versions: + Python 3.5, - Python 3.4 |
| 2013-07-09 11:40:25 | christian.heimes | set | stage: needs patch; versions: + Python 3.4, - Python 3.3 |
| 2012-02-05 20:25:58 | brandjon | set | nosy: + brandjon |
| 2012-02-05 14:56:15 | rosslagerwall | set | nosy: + rosslagerwall |
| 2010-12-07 04:01:10 | ncoghlan | set | messages: + msg123521 |
| 2010-12-07 03:01:38 | v+python | set | messages: + msg123517 |
| 2010-12-06 04:41:24 | ncoghlan | set | nosy: + ncoghlan; messages: + msg123454; versions: + Python 3.3, - Python 3.2 |
| 2010-12-02 07:02:46 | v+python | set | messages: + msg123059 |
| 2010-12-02 06:59:29 | v+python | set | messages: + msg123058 |
| 2010-11-24 07:56:19 | v+python | set | messages: + msg122264 |
| 2010-11-21 06:16:01 | v+python | create | |