I'm having a problem with subprocess's poll() not returning the return code when the process has finished.

I found out how to set a timeout on subprocess.Popen and used that as the basis for my code. However, I have a call to a Java program whose return code is never reported, so each call "times out" even though it has actually finished. I know the process has finished because, when I remove the poll timeout check, the call runs without issue, returns a good exit code, and completes within the time limit.

Here is the code I am testing with.

import subprocess
import time

def execute(command):
    print('start command: {}'.format(command))
    process = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    print('wait')
    wait = 10
    while process.poll() is None and wait > 0:
        time.sleep(1)
        wait -= 1
    print('done')
    if wait == 0:
        print('terminate')
        process.terminate()
    print('communicate')
    stdout, stderr = process.communicate()
    print('rc')
    exit_code = process.returncode
    if exit_code != 0:
        print('got bad rc')

if __name__ == '__main__':
    execute(['ping','-n','15','127.0.0.1']) # correctly times out
    execute(['ping','-n','5','127.0.0.1']) # correctly runs within the time limit
    # incorrectly times out
    execute(['C:\\dev\\jdk8\\bin\\java.exe', '-jar', 'JMXQuery-0.1.8.jar', '-url', 'service:jmx:rmi:///jndi/rmi://localhost:18080/jmxrmi', '-json', '-q', 'java.lang:type=Runtime;java.lang:type=OperatingSystem'])

You can see that the first example is designed to time out and the second is not, and both work correctly. However, the final one (using JMXQuery to get Tomcat metrics) never appears to finish as far as poll() is concerned, so it "times out" and has to be terminated, which then causes it to return an exit code of 1.

Is there something I am missing in the way subprocess poll is interacting with this Java process that is causing it to not return an exit code? Is there a way to get a timeout option to work with this?

asked Nov 18, 2019 at 17:27
  • Does that process produce more than a few kilobytes of output? Commented Nov 22, 2019 at 1:48
  • @DavisHerring it certainly can but I don't think all of the requests do. Specifically, there is one large and one small one within the actual script. Commented Nov 22, 2019 at 14:31
  • And does the small one complete as expected? (Do they both complete quickly if you just skip the polling and terminating completely?) Commented Nov 22, 2019 at 14:37
  • @DavisHerring yes, they both typically complete within a second without the polling in place. But I had an issue where the call hung, so now I'm looking to put a timeout in place. Neither call works with this timeout, though: process.poll() seems to think the process is still running until the timeout expires. Commented Nov 22, 2019 at 15:14

2 Answers

This has the same cause as a number of existing questions, but the desire to impose a timeout requires a different answer.

The OS deliberately gives only a small amount of buffer space to each pipe. When a process writes to one that is full (because the reader has not yet consumed the previous output), it blocks. (The reason is that a producer that is faster than its consumer would otherwise be able to quickly use a great deal of memory for no gain.) Therefore, if you want to do more than one of the following with a subprocess, you have to interleave them rather than doing each in turn:

  1. Read from standard output
  2. Read from standard error (unless it’s merged via subprocess.STDOUT)
  3. Wait for the process to exit, or for a timeout to elapse
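The blocking described above can be reproduced with a small sketch. It assumes a hypothetical child (a Python one-liner standing in for the Java call) that writes about 1 MiB to standard output, which is more than a pipe holds by default on common platforms. The parent polls without reading, as in the question, so the child stays blocked in its write and poll() keeps returning None until something drains the pipe.

```python
import subprocess
import sys
import time

# Hypothetical child: writes about 1 MiB to stdout, more than a default pipe buffer holds.
CHILD = "import sys; sys.stdout.write('x' * (1 << 20))"

def poll_without_reading(seconds=2.0):
    """Poll the child for `seconds` without reading its output, then drain it."""
    process = subprocess.Popen([sys.executable, "-c", CHILD],
                               stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    deadline = time.monotonic() + seconds
    while process.poll() is None and time.monotonic() < deadline:
        time.sleep(0.1)
    # The child is blocked writing to the full pipe, so it has not exited yet.
    still_running = process.poll() is None
    # communicate() reads both pipes to EOF, which lets the child finish.
    stdout, _ = process.communicate()
    return still_running, len(stdout), process.returncode

if __name__ == "__main__":
    print(poll_without_reading())
```

Lengthening the polling window does not help here: the child only exits once the parent starts reading.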

Of course, the subprocess might close its streams before it exits, write useful output after you notice the timeout and before you kill it, and/or start additional processes that keep the pipe open indefinitely, so you might want to have multiple timeouts. Probably the most informative signal is EOF on the pipe, so repeatedly use something like select to wait for (however much is left of) the timeout and issue single reads on the streams that are ready. Once you see EOF, wait for the process to exit (with another timeout if you're concerned about hangs after an early stream closure). If the timeout occurs instead, (try to) kill the subprocess, and consider issuing non-blocking reads (or another timeout loop) to get any last available output before closing the pipes.
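As a rough sketch of the interleaving, the version below swaps select for one reader thread per pipe, since select on Windows accepts only sockets and the question runs on Windows. The name run_with_reader_threads, the 4096-byte read size, and the one-second join limit are assumptions chosen for illustration, not part of the answer.

```python
import subprocess
import sys
import threading

def run_with_reader_threads(command, timeout):
    """Drain both pipes in background threads while waiting with a timeout."""
    process = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    chunks = {"stdout": [], "stderr": []}

    def drain(stream, sink):
        # Read until EOF so the child never blocks on a full pipe.
        for block in iter(lambda: stream.read(4096), b""):
            sink.append(block)
        stream.close()

    readers = [
        threading.Thread(target=drain, args=(process.stdout, chunks["stdout"]), daemon=True),
        threading.Thread(target=drain, args=(process.stderr, chunks["stderr"]), daemon=True),
    ]
    for reader in readers:
        reader.start()

    timed_out = False
    try:
        process.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        timed_out = True
        process.kill()
        process.wait()

    # Bounded join: a grandchild could keep a pipe open after the child is gone.
    for reader in readers:
        reader.join(timeout=1)
    return process.returncode, b"".join(chunks["stdout"]), b"".join(chunks["stderr"]), timed_out

if __name__ == "__main__":
    cmd = [sys.executable, "-c", "import sys; sys.stdout.write('x' * (1 << 20))"]
    rc, out, err, timed_out = run_with_reader_threads(cmd, timeout=10)
    print(rc, len(out), timed_out)
```

Because the readers start before the wait, the child can finish writing and exit, so the timeout only fires for a process that is genuinely still running.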

answered Nov 22, 2019 at 22:28

Using the other answer by @DavisHerring as the basis for more research, I came across a concept that worked for my original case. Here is the code that came out of that.

import subprocess
import threading
import time

def execute(command):
    print('start command: {}'.format(command))
    process = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    timer = threading.Timer(10, terminate_process, [process])
    timer.start()
    print('communicate')
    stdout, stderr = process.communicate()
    print('rc')
    exit_code = process.returncode
    timer.cancel()
    if exit_code != 0:
        print('got bad rc')

def terminate_process(p):
    try:
        p.terminate()
    except OSError:
        pass # ignore error

It uses threading.Timer to make sure that the process doesn't go over the time limit, terminating the process if it does. Otherwise, communicate() reads the output until the process finishes, and the timer is then cancelled.
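For comparison, here is a minimal sketch of the same idea without a timer thread, assuming Python 3.3 or later, where communicate() accepts a timeout argument and raises subprocess.TimeoutExpired. The function name run_with_communicate_timeout is made up for this example; the kill-then-communicate pattern follows the one shown in the subprocess documentation.

```python
import subprocess
import sys

def run_with_communicate_timeout(command, timeout=10):
    """Read both pipes with a time limit; kill the child if the limit is exceeded."""
    process = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    timed_out = False
    try:
        stdout, stderr = process.communicate(timeout=timeout)
    except subprocess.TimeoutExpired:
        timed_out = True
        process.kill()
        # A second communicate() collects whatever output was already produced.
        stdout, stderr = process.communicate()
    return process.returncode, stdout, stderr, timed_out

if __name__ == "__main__":
    cmd = [sys.executable, "-c", "print('hello')"]
    print(run_with_communicate_timeout(cmd, timeout=10))
```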

answered Dec 16, 2020 at 20:23
