What explains this very odd behavior of GNU grep interacting with buffering and pipes and how to stop it?

Question 1

I feel this is best illustrated with an example:

{ printf 'foo\nbar\n' ; sleep 2 ; } | grep -m1 foo
{ printf 'foo\n' ; sleep 2 ; printf 'bar\n' ; sleep 2 ; } | grep -m1 foo

Both of these commands, run in Bash with GNU grep, for whatever reason behave in exactly the same way:

First grep prints out "foo" with a newline behind it but keeps blocking.
Then grep waits for 2 seconds, and exits.

What I expect in both cases is that grep prints "foo" with a newline and then exits immediately, without waiting two seconds. After all, it has already satisfied its condition of exactly one match, and it knows that whatever further input it receives can't change that. Indeed, if I do this:

{ printf 'foo' ; sleep 2 ; } | grep -m1 foo

Without the newline after "foo", it first waits for two seconds doing nothing, and then it exits, printing "foo" with a newline behind it. This makes sense to me: grep has not yet received a newline, so it doesn't know what could still follow on that line, and it can't yet print the matching line.

But I particularly don't understand why the first two commands behave as they do. In the second case, GNU grep exits immediately upon receiving the second line and does not wait out the further 2 seconds of sleep before the command in front of the pipe exits, whereas in the first command it receives foo\nbar\n right away, as two lines, and yet it does not exit once it has received the second line. I assume this has something to do with how the buffering works.

If you want my real use case and why I'm investigating this: I'm using this in a script with udevadm monitor to grep for a specific event, and when that event occurs I want to stop blocking, so I use udevadm monitor | egrep -m1 <regex>. So far so good, except that I noticed it does not stop blocking when the event I'm searching for is the last one reported by udevadm; it then only stops blocking after a new event is sent. For whatever reason, in that case grep only exits and prints the first match upon receiving the next line or an end of file. Why does it do that, and how do I make it stop doing that?

Question 2

A recent similar question: unix.stackexchange.com/q/791984/170373

Question 3

I actually wrote this question and then someone else helped me towards the answer but I felt I should make the topic anyway and answer it simply to spread the knowledge since I feel many might be caught by this.

The simple answer is that a pipeline in the shell does not in fact finish when its last command exits, but only when all of its commands have exited. If even one of them contains an infinite loop, the entire pipeline will continue to run forever.
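A minimal sketch to see this for yourself, assuming only a POSIX-style shell and a `date` that supports `+%s`. The variable names are mine, for illustration: the right-hand side exits at once, yet the pipeline takes as long as its slowest member.

```shell
# "true" exits immediately, but the shell waits for every member of
# the pipeline, so the whole thing lasts as long as "sleep 2".
start=$(date +%s)
sleep 2 | true
elapsed=$(( $(date +%s) - start ))
echo "pipeline returned after about ${elapsed}s"
```

Nothing here involves grep at all, which is a hint that the delay in the question comes from the pipeline rather than from grep.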

The grep portion does in fact exit as soon as it has found and printed the first matching line. The reason the entire pipeline only ends after two seconds is that the writing side does not learn that the pipe has hung up until it tries to write again: only on that attempted write does it receive SIGPIPE and exit. Perhaps in an ideal world it would receive SIGPIPE the moment the other end hangs up (though I suppose the other end could be opened again), but it only finds out once it actually attempts to write. It asks for forgiveness, not for permission, which is why it only ends then.
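This can be observed directly, assuming Bash (the `PIPESTATUS` array is Bash-specific) and the default SIGPIPE disposition. In this sketch the reader is gone almost immediately, but the writer only dies one second later, at the moment it tries to write:

```shell
# The reader ("true") exits at once. The writer notices nothing while
# it sleeps; it is only terminated when "echo" attempts the write.
{ sleep 1; echo late; } | true
writer_status=${PIPESTATUS[0]}   # must be read right after the pipeline
# Typically 141 on Linux: 128 + 13, where 13 is SIGPIPE's signal number.
echo "writer exit status: $writer_status"
```

If SIGPIPE happens to be ignored in the calling environment, the write fails with an error instead of killing the writer, so the exact status can differ, but it should still be nonzero.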

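In the case of writing two lines at once, the writer sends both of them in one buffered go, and the grep process hangs up after doing its job. Since the writer never attempts another write, it never receives SIGPIPE, and the pipeline lasts until its sleep has finished.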
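As a rough check of this case, again assuming Bash for `PIPESTATUS`, the first command from the question can be timed while recording how the writer ended. The variable names are illustrative:

```shell
# Both lines are written before grep hangs up, and nothing is written
# afterwards, so the writer runs its sleep to completion undisturbed.
start=$(date +%s)
{ printf 'foo\nbar\n'; sleep 2; } | grep -m1 foo
writer_status=${PIPESTATUS[0]}   # must be read right after the pipeline
elapsed=$(( $(date +%s) - start ))
echo "writer exit status: $writer_status, elapsed: about ${elapsed}s"
```

The writer should report a normal exit status of 0 here rather than a SIGPIPE death, which is the difference from the case where it writes again after grep has gone.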

The way to solve this in this particular script is to set the pipe up manually with an anonymous named pipe, so that the script waits only on grep:

# create an anonymous fifo on file descriptor 3: make a named pipe,
# open a file descriptor to it, then unlink it again
# we use a directory to make sure this is atomic
fifodir=$(mktemp -d)
mkfifo -- "$fifodir/fifo"
# open read-write so that the open itself does not block
exec 3<>"$fifodir/fifo"
rm -r -- "$fifodir"
# now perform the pipe manually
udevadm monitor <args> >&3 & udevadmpid=$!
# and wait only on the grep process
grep -E -m1 <pattern> <&3 & wait "$!"
# kill the writer explicitly: its stdout is the read-write fifo,
# so it would never receive SIGPIPE on its own
kill "$udevadmpid"
# finally close our own copy of the fifo
exec 3>&-
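To try the same technique without udev, here is a self-contained sketch in which a hypothetical `producer` function stands in for `udevadm monitor`. The function, the 'wanted event' text and the variable names are assumptions for illustration only.

```shell
# Hypothetical stand-in for "udevadm monitor": emits a matching line and
# then idles. "exec" makes the background PID the sleep itself, so that
# killing that PID really ends it.
producer() {
    printf 'noise\nwanted event\n'
    exec sleep 30
}

fifodir=$(mktemp -d)
mkfifo -- "$fifodir/fifo"
exec 3<>"$fifodir/fifo"          # read-write, so the open does not block
rm -r -- "$fifodir"

start=$(date +%s)
producer >&3 & producerpid=$!
match=$(grep -m1 'wanted' <&3)   # returns as soon as grep itself exits
kill "$producerpid"
wait "$producerpid" 2>/dev/null  # reap it quietly
exec 3>&-
elapsed=$(( $(date +%s) - start ))
echo "matched '$match' after about ${elapsed}s"
```

The script continues as soon as grep has its match instead of waiting out the producer's 30 seconds, because nothing here waits on the producer until it has been killed.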

Question 4

Use the --line-buffered output option of GNU grep.

Otherwise, the output from grep sits in the STDOUT buffer until it is flushed when grep exits.

Question 5

Shouldn't grep -m1 stop at the first match though? That is the heart of the question.

Question 6

Yes, the --line-buffered option makes no difference here. Try it with the OP's examples. For instance, { printf 'foo\nbar\n' ; sleep 2 ; } | grep -m1 --line-buffered foo still takes two seconds to exit. The issue is the pipe, as explained in the OP's answer, not grep.

Question 7

And in the commands here, the output of grep goes to the terminal, so it's already line-buffered by default.

Zorf · Accepted Answer · 2025-03-09 21:25:50Z

