Why does shell=True work when piping commands together?

Question 1

I have a couple of subprocess instances that I'd like to string together into a pipeline, but I am stuck and would like some advice.

For example, to mimic:

cat data | foo - | bar - > result

Or:

foo - < data | bar - > result

...I first tried the following, which hangs:

import subprocess, sys
firstProcess = subprocess.Popen(['foo', '-'], stdin=subprocess.PIPE,
 stdout=subprocess.PIPE)
secondProcess = subprocess.Popen(['bar', '-'], stdin=firstProcess.stdout,
 stdout=sys.stdout)
for line in sys.stdin:
 firstProcess.stdin.write(line)
 firstProcess.stdin.flush()
firstProcess.stdin.close()
firstProcess.wait()

My second attempt uses one subprocess instance with the shell=True parameter, which works:

import subprocess, sys
pipedProcess = subprocess.Popen(" ".join(['foo', '-', '|', 'bar', '-']),
 stdin=subprocess.PIPE, shell=True)
for line in sys.stdin:
 pipedProcess.stdin.write(line)
 pipedProcess.stdin.flush()
pipedProcess.stdin.close()
pipedProcess.wait()

What am I doing wrong with the first, chained subprocess approach? I have read that it is best to avoid shell=True, so I would like to understand why the chained version fails. Thanks for your advice.

EDIT

I fixed a typo in my question and fixed the stdin parameter of secondProcess. It still hangs.

I also tried removing firstProcess.wait(), which resolves the hang, but then result is a 0-byte file.

I'll stick with the pipedProcess, since it works fine. But if anyone knows why the first setup hangs or makes a 0-byte file as output, I'd be interested to know why as well.

Question 2

Shouldn't the stdin for bar be the stdout of foo rather than its stdin?

Question 3

It should: I had a typo. I have fixed it in my question.

Question 4

If you're only copying your stdin to the child's stdin, then firstProcess = Popen(['foo', '-'], stdin=sys.stdin, stdout=PIPE) works.

Question 5

shell=True works because you're asking the shell to interpret your entire command line and handle the piping itself. It is effectively as if you typed foo - | bar - directly into the shell.

(This is also why it can be unsafe to use shell=True; there are many ways to fool the shell into doing bad things that won't happen if you directly pass the command and arguments in as a list that isn't subject to parsing by any intermediaries.)
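As a rough illustration of that difference, here is a sketch that assumes a POSIX shell with echo and tr available. SHOW_ARGS, args_seen_by_child and shell_output are hypothetical names made up for this example, not part of subprocess. With a list, shell metacharacters reach the child as literal text; with shell=True, the shell acts on them:

```python
import subprocess
import sys

# Hypothetical child: prints every argument it receives on its own line.
SHOW_ARGS = [sys.executable, "-c", "import sys; print('\\n'.join(sys.argv[1:]))"]


def args_seen_by_child(user_text):
    """Pass user_text as a single list element; no shell ever parses it."""
    out = subprocess.check_output(SHOW_ARGS + [user_text])
    return out.decode().splitlines()


def shell_output(command_line):
    """Hand the whole string to the shell, which interprets |, ; and so on."""
    out = subprocess.check_output(command_line, shell=True)
    return out.decode().splitlines()


if __name__ == "__main__":
    # The pipe and semicolon stay literal: the child gets one argument.
    print(args_seen_by_child("a | b; echo injected"))
    # The shell builds the pipeline itself, as if typed at a prompt.
    print(shell_output("echo hello | tr a-z A-Z"))
```

If the command string is ever assembled from untrusted input, the second form is the one that lets extra commands slip in.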

Question 6

I think I added a typo and meant to have it the way you have it. And indeed I checked and that's the case. I'll edit my question accordingly.

Question 7

Gotcha. I removed the code as it is now superfluous. I should add that it worked for me with some basic standins for foo and bar though (sort and uniq, actually), reading some random text on stdin. (On further thought, that's probably because uniq wouldn't exit before sort...)

Question 8

To fix the first example, add foo_process.stdout.close() as the docs suggest. The following code emulates the foo - | bar - command:

#!/usr/bin/python
from subprocess import Popen, PIPE
foo_process = Popen(['foo', '-'], stdout=PIPE)
bar_process = Popen(['bar', '-'], stdin=foo_process.stdout)
foo_process.stdout.close() # allow foo to know if bar ends
bar_process.communicate() # equivalent to bar_process.wait() in this case

You don't need to use sys.stdin and sys.stdout explicitly here unless they're different from sys.__stdin__ and sys.__stdout__.
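To see what the foo_process.stdout.close() line buys you, here is a small sketch for Python 3. The two inline scripts are hypothetical stand-ins for foo and bar: an endless producer and a consumer that quits after one line. Once the parent drops its copy of the read end, the producer gets a broken pipe when the consumer exits, instead of blocking forever on a full pipe:

```python
import sys
from subprocess import DEVNULL, PIPE, Popen

# Hypothetical stand-in for foo: prints integers forever.
PRODUCER_SRC = "import itertools\nfor n in itertools.count(): print(n)"
# Hypothetical stand-in for bar: echoes one line, then exits.
CONSUMER_SRC = "import sys; sys.stdout.write(sys.stdin.readline())"


def run_until_consumer_exits():
    producer = Popen([sys.executable, "-c", PRODUCER_SRC],
                     stdout=PIPE, stderr=DEVNULL)  # hide the broken-pipe traceback
    consumer = Popen([sys.executable, "-c", CONSUMER_SRC],
                     stdin=producer.stdout, stdout=PIPE)
    producer.stdout.close()  # the parent must not keep the read end open
    first_line, _ = consumer.communicate()
    # The timeout is only a safety net for this sketch.
    producer_status = producer.wait(timeout=30)
    return first_line, producer_status


if __name__ == "__main__":
    line, status = run_until_consumer_exits()
    print(line, status)
```

The producer ends with a non-zero status here because its write failed, which is the same thing a shell pipeline does when the right-hand command exits first.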

To emulate the foo - < data | bar - > result command:

#!/usr/bin/python
from subprocess import Popen, PIPE
with open('data','rb') as input_file, open('result', 'wb') as output_file:
 foo = Popen(['foo', '-'], stdin=input_file, stdout=PIPE)
 bar = Popen(['bar', '-'], stdin=foo.stdout, stdout=output_file)
 foo.stdout.close() # allow foo to know if bar ends
bar.wait()

If you want to feed modified input line by line to the foo process, i.e., to emulate the python modify_input.py | foo - | bar - command:

#!/usr/bin/python
import sys
from subprocess import Popen, PIPE
foo_process = Popen(['foo', '-'], stdin=PIPE, stdout=PIPE)
bar_process = Popen(['bar', '-'], stdin=foo_process.stdout)
foo_process.stdout.close() # allow foo to know if bar ends
for line in sys.stdin:
 print >>foo_process.stdin, "PY", line, # modify input, feed it to `foo`
foo_process.stdin.close() # tell foo there is no more input
bar_process.wait()
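The snippet above uses Python 2 print syntax. A possible Python 3 version is sketched below, assuming Python 3.7+ for text=True. feed_pipeline and its parameters are hypothetical names, not part of the original answer. It takes any iterable of lines, so an open file works as well as sys.stdin:

```python
import sys
from subprocess import PIPE, Popen


def feed_pipeline(lines, first_cmd, second_cmd, output_file=None):
    """Write modified lines to first_cmd, whose stdout feeds second_cmd."""
    first = Popen(first_cmd, stdin=PIPE, stdout=PIPE, text=True)
    second = Popen(second_cmd, stdin=first.stdout, stdout=output_file)
    first.stdout.close()  # allow first_cmd to know if second_cmd ends
    for line in lines:
        first.stdin.write("PY " + line)  # modify input, feed it to first_cmd
    first.stdin.close()  # tell first_cmd there is no more input
    second.wait()
    return first.wait(), second.returncode


if __name__ == "__main__":
    # Same shape as the answer: python modify_input.py | foo - | bar -
    feed_pipeline(sys.stdin, ["foo", "-"], ["bar", "-"])
```

Pass an open binary file as output_file to get the effect of > result. Leaving it as None lets the second command write to the parent's stdout.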

Question 9

Why would I use bar_process.communicate()? Wouldn't I write data to foo_process?

Question 10

communicate() would likely be standing in for wait(), since bar_process will never provide output to Python.

Question 11

@AlexReynolds: the code emulates your first example. There is no need to read from stdin only to write it straight to foo; the foo process can read from stdin by itself.

Question 12

@AlexReynolds: bar_process.stdin is None so indeed bar_process.communicate() is just bar_process.wait() in this case.
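A quick way to check this yourself is sketched below. The inline child scripts and both function names are hypothetical, chosen only for the example. With no PIPE arguments, communicate() has nothing to send or collect and simply waits:

```python
import sys
from subprocess import PIPE, Popen


def communicate_without_pipes():
    """No stdin/stdout/stderr pipes: communicate() behaves like wait()."""
    child = Popen([sys.executable, "-c", "pass"])
    result = child.communicate()
    return child.stdin, result, child.returncode


def communicate_with_stdout_pipe():
    """With stdout=PIPE, communicate() also collects the child's output."""
    child = Popen([sys.executable, "-c", "print('hi')"], stdout=PIPE)
    return child.communicate()


if __name__ == "__main__":
    print(communicate_without_pipes())
    print(communicate_with_stdout_pipe())
```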

Question 13

Unfortunately, this didn't work. I need to process data one line at a time, in any case, and also be able to handle file handles other than stdin. It doesn't look like for line in sys.stdin: ... bar_process.communicate(line) works.

Mattie B · Accepted Answer · 2013-03-12 23:50:06Z


