Normally, pipelines in Unix are used to connect two commands, so that the output of the first command becomes the input of the second. However, I recently came up with the idea (which may not be new, but I didn't find much by Googling) of using a pipeline to run several commands in parallel, like this:
command1 | command2
This will invoke command1 and command2 in parallel, even if command2 does not read from standard input and command1 does not write to standard output. A minimal example to illustrate this is (please run it in an interactive shell):
ls . -R 1>&2 | ls . -R
My question is: are there any downsides to using a pipeline to parallelize the execution of two commands in this way? Is there anything I have missed in this idea?
Thank you very much in advance.
2 Answers
Command pipelines already run in parallel. With the command:
command1 | command2
Both command1 and command2 are started. If command2 is scheduled and the pipe is empty, it blocks waiting to read. If command1 tries to write to the pipe and it's full, command1 blocks until there's room to write. Otherwise, both command1 and command2 execute in parallel, writing to and reading from the pipe.
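A minimal way to see this for yourself (a sketch; the timestamps go to stderr because the left side's stdout is the pipe, which nobody reads here):

# Both sides of the pipe start immediately; the two timestamps,
# printed to stderr so the pipe does not swallow the left one,
# show the same second.
( date '+left  side started: %T' >&2; sleep 2 ) |
( date '+right side started: %T' >&2; sleep 2 )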
- Thank you for your explanation. So I can use this to parallelize commands without resorting to parallel, and there are no downsides; furthermore, this is normal practice. Is it fair to say that? – Weijun Zhou, Dec 8, 2017 at 22:00
- The behavior that you described is today's reality. If you run command1 | command2, where command1 does not write to standard output and command2 does not read from standard input, they will run in parallel. – Andy Dalton, Dec 8, 2017 at 22:02
- Thank you for clarifying, especially about the blocking mechanism. – Weijun Zhou, Dec 8, 2017 at 22:13
There are downsides...
- you cannot see the output of command1 (its stdout is connected to the pipe)
- if command2 doesn't read the output of command1, the latter will hang once the pipe buffer fills. The capacity is 4 KiB on old Linux kernels and 64 KiB by default on modern ones; exactly when the writer blocks also depends on the buffering of the runtime used by command1 (experimentally, a Python writer stalls at around 58 KiB; see below)
- if command2 stops before command1 and command1 then writes to its stdout, it will get [Errno 32] Broken pipe (SIGPIPE)
Experiment:

cmd1:

#! /usr/bin/python3
import sys, time

# Write 1 KiB to stdout on each iteration (64 KiB total), reporting
# progress on stderr so it stays visible outside the pipe.
for i in range(64):
    print("*" * 1023, file=sys.stdout)
    print("cmd1 here (%d)" % i, file=sys.stderr)
    time.sleep(.1)
print("cmd1 exiting", file=sys.stderr)
cmd2:

#! /usr/bin/python3
import sys, time

# Never read stdin; just report progress on stderr, then exit.
for i in range(16):
    print("cmd2 here (%d)" % i, file=sys.stderr)
    time.sleep(1)
print("cmd2 exiting", file=sys.stderr)
Run:
./cmd1 | ./cmd2
You will see:

- cmd1 stalling at iteration 58 (because cmd2 never reads anything from the pipe)
- cmd1 crashing (broken pipe) when cmd2 exits
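Both failure modes can also be reproduced with standard tools (a sketch; PIPESTATUS is bash-specific):

# head reads one line and exits; the next write by yes then fails,
# which is the same broken-pipe error cmd1 hits above.
yes | head -n 1
echo "${PIPESTATUS[@]}"   # typically "141 0": 141 = 128 + SIGPIPE (signal 13)

# sleep never reads, so yes blocks once the pipe buffer is full,
# then dies of SIGPIPE when sleep exits after 3 seconds.
yes | sleep 3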
So yes, maybe it can work. And maybe not.
- Thank you for providing more insight into the problem. I have tested it, and the result is as you described if cmd1 does write to standard output and I silently ignore it. However, I see no problems if cmd1 does not write to standard output at all (commenting out the line that writes to standard output), which is what I asked about in the original question. – Weijun Zhou, Dec 8, 2017 at 23:00
- If you want to, you can do cmd >/dev/null | othercmd so you don't have the blocking problem, at least regarding the output of cmd. It looks rather silly, but it works in bash, ksh and dash (not in zsh, but I think zsh splits the output to both redirections). – ilkkachu, Dec 8, 2017 at 23:40
- Why not command1 & command2?
- If you replace | with & in my example and run it in an interactive shell, you can see the difference. Thank you anyway for your comment. – Weijun Zhou
- Replacing | with & in your example created exactly the same output. The difference is that & is specifically designed to execute several commands in parallel. A pipeline is simply the wrong tool for the task.
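For comparison, here is the question's example rewritten with & (a sketch; in an interactive shell the visible difference is the job-control notices bash prints for the background job):

# Start the first ls in the background, run the second in the
# foreground at the same time, then wait for the background job.
ls . -R 1>&2 &
ls . -R
wait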