I would like to track the progress of a slow operation using pv. The size of the input of this operation is known in advance, but the size of its output is not. This forced me to put pv to the left of the operation in the pipe.
The problem is that the long-running command immediately consumes its whole input because of buffering. This is somewhat similar to the "Turn off buffering in pipe" question, but in my case it is the consuming operation that is slow, not the producing one, and none of the answers to the other question seem to work here.
Here is a simple example demonstrating the problem:
seq 20 | pv -l -s 20 | while read line; do sleep 1; done
20 0:00:00 [13.8k/s] [=====================================>] 100%
Instead of being updated every second, the progress bar immediately jumps to 100% and stays there for the entire 20 seconds it takes to process the input. pv could only measure the progress if the lines were processed one by one, but the entire input of the last command seems to be read into a buffer.
A somewhat longer example that also demonstrates the unknown number of output lines:
#! /bin/bash
limit=10
seq 20 | \
    pv -l -s 20 | \
    while read num
    do
        sleep 1
        if [ $num -gt $limit ]
        then
            echo $num
        fi
    done
Any suggestions for a workaround? Thanks!
1 Answer
In your setup the data has already passed through pv while it is still being processed on the right side. You could try moving pv to the rightmost side, like this:
seq 20 | while read line; do sleep 1; echo ${line}; done | pv -l -s 20 > /dev/null
Update: Regarding your update, maybe the easiest solution is to use a named pipe and a subshell to monitor the progress:
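#! /bin/bash
# Kill the whole process group on exit so the background monitor does not linger.
trap "trap - SIGTERM && kill -- -$$" SIGINT SIGTERM EXIT
# Background subshell: (re)create the named pipe and let pv count the lines written to it.
(rm -f /tmp/progress.pipe; mkfifo /tmp/progress.pipe; tail -f /tmp/progress.pipe | pv -l -s 20 > /dev/null)&
limit=10
seq 20 | \
    while read num
    do
        sleep 1
        if [ $num -gt $limit ]
        then
            echo $num
        fi
        # One line per processed input line drives the progress bar.
        echo $num > /tmp/progress.pipe
    done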
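As a possible variation on the same idea (a sketch only, not part of the original answer), the progress lines can be sent through an extra file descriptor instead of a named pipe. The function name process_numbers and the use of file descriptor 3 are hypothetical choices made for this illustration; the pv options are the ones used above. Because pv sees end-of-file as soon as the loop finishes, this form should not need tail -f or a cleanup trap.

```shell
#!/bin/bash
# Hypothetical helper: reads numbers on stdin, prints those above a limit on
# stdout, and writes one empty "tick" line per processed input line to fd 3.
process_numbers() {
    local limit=$1 delay=$2 num
    while read -r num; do
        sleep "$delay"                  # stand-in for the slow processing
        if [ "$num" -gt "$limit" ]; then
            echo "$num"                 # real output, count not known in advance
        fi
        echo >&3                        # progress tick, exactly one per input line
    done
}

# Intended use (requires pv; fd 3 is attached to pv via process substitution):
#   seq 20 | process_numbers 10 1 3> >(pv -l -s 20 > /dev/null)
```

The real output still goes to stdout, so it can be redirected or piped onward independently of the progress display, which pv draws on stderr.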
Thanks, that's a good suggestion. Unfortunately, the processing that I substituted with sleep 1 also involves printing some lines, and the number of those is not known in advance. I updated my question with this detail. – Zoltan, Oct 22, 2016 at 17:03
Thanks, this is exactly what I was looking for. I also tried using mkfifo, but without tail -f it didn't work; that seems to be the critical bit that I missed. – Zoltan, Oct 22, 2016 at 19:34
Please note that this still needs some minor tweaking, as in its current form it leaves behind a pv process running in the background. This should be easy to take care of, so I will handle it on my own; I'm just mentioning it to make other users of your answer aware. – Zoltan, Oct 22, 2016 at 19:54
@Zoltan you are right. I've updated the script to clean up the child processes on exit. – FloHimself, Oct 22, 2016 at 20:21
strace. seq 20 | pv -ls 20 | pv -qL 10 shows the same behavior.
pv to the left of the processing you are interested in. What's happening here is that seq 20 immediately outputs the entire sequence, and pv dutifully reads the whole thing and copies it to stdout, which does not block because pipes are buffered.
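The buffering described here can be observed without pv. The following is a minimal sketch, assuming only bash, seq, and mktemp; the log file and the one-second delay are illustrative choices and do not come from the posts above. The writer records that it has finished, and the reader records when it starts reading.

```shell
#!/bin/bash
# Record the order of events in a temporary log file.
log=$(mktemp)

# Writer: emits 20 short lines into the pipe, then notes that it has finished.
# Reader: waits one second before it reads anything from the pipe.
{ seq 20; echo "writer finished" >> "$log"; } |
    { sleep 1; echo "reader starts reading" >> "$log"; cat > /dev/null; }

events=$(cat "$log")
rm -f "$log"
echo "$events"
```

The log shows "writer finished" before "reader starts reading": the 20 lines fit into the pipe's buffer, so the writer never has to wait for the slow reader. This is the same reason pv reports 100% right away when it sits to the left of the slow command.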