I have a variable which contains the multiline output of a command. What's the most efficient way to read the output line by line from the variable?
For example:
jobs="$(jobs)"
if [ "$jobs" ]; then
# read lines from $jobs
fi
7 Answers
You can use a while loop with process substitution:
while read -r line
do
echo "$line"
done < <(jobs)
A robust way to read a multiline variable is to set a blank IFS and printf the variable with a trailing newline:
# printf '%s\n' "$var" is necessary: with printf '%s' "$var", if the variable
# doesn't end with a newline, the while loop will miss its last line.
while IFS= read -r line
do
echo "$line"
done < <(printf '%s\n' "$var")
Note: as per ShellCheck SC2031, process substitution is preferable to a pipe because it avoids subtly creating a subshell.
Also, note that naming the variable jobs may cause confusion, since jobs is also the name of a common shell command.
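As an illustrative sketch of that subshell difference (generic data stands in for the jobs output; the counter name is made up here): with a pipe the loop body may run in a subshell and variable changes are lost, whereas with process substitution the loop runs in the current shell:
count=0
printf '%s\n' one two three | while IFS= read -r line; do
    count=$((count + 1))
done
echo "$count"    # prints 0 in bash: the loop body ran in a subshell

count=0
while IFS= read -r line; do
    count=$((count + 1))
done < <(printf '%s\n' one two three)
echo "$count"    # prints 3: the loop ran in the current shell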
- If you want to keep all your whitespace, then use while IFS= read .... If you want to prevent \ interpretation, then use read -r. – Peter.O, Mar 21, 2011 at 15:41
- I've fixed the points fred.bear mentioned, as well as changed echo to printf %s, so that your script would work even with non-tame input. – Gilles 'SO- stop being evil', Mar 21, 2011 at 20:57
- To read from a multiline variable, a herestring is preferable to piping from printf (see l0b0's answer). – ata, Nov 25, 2011 at 20:34
- @ata Though I've heard this "preferable" often enough, it must be noted that a herestring always requires the /tmp directory to be writable, as it relies on being able to create a temporary work file. Should you ever find yourself on a restricted system with /tmp being read-only (and not changeable by you), you will be happy about the possibility of using an alternate solution, e.g. with the printf pipe. – syntaxerror, Dec 4, 2014 at 22:07
- In the second example, if the multiline variable doesn't contain a trailing newline you will lose the last element. Change it to: printf '%s\n' "$var" | while IFS= read -r line – David H. Bennett, Jan 15, 2015 at 3:15
To process the output of a command line by line (explanation):
jobs |
while IFS= read -r line; do
process "$line"
done
If you have the data in a variable already:
printf %s "$foo" | ...
printf %s "$foo" is almost identical to echo "$foo", but prints $foo literally, whereas echo "$foo" might interpret $foo as an option to the echo command if it begins with a -, and might expand backslash sequences in $foo in some shells.
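A small illustrative sketch of that difference (the exact echo behavior varies between shells, so treat the comments as typical bash output):
foo='-n'
echo "$foo"            # bash's echo treats -n as an option and prints nothing
printf '%s\n' "$foo"   # prints: -n
foo='a\tb'
echo "$foo"            # some shells expand \t to a tab here
printf '%s\n' "$foo"   # always prints a\tb literally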
Note that in some shells (ash, bash, pdksh, but not ksh or zsh), the right-hand side of a pipeline runs in a separate process, so any variable you set in the loop is lost. For example, the following line-counting script prints 0 in these shells:
n=0
printf %s "$foo" |
while IFS= read -r line; do
n=$(($n + 1))
done
echo $n
A workaround is to put the remainder of the script (or at least the part that needs the value of $n
from the loop) in a command list:
n=0
printf %s "$foo" | {
while IFS= read -r line; do
n=$(($n + 1))
done
echo $n
}
If acting on the non-empty lines is good enough and the input is not huge, you can use word splitting:
IFS='
'
set -f
for line in $(jobs); do
# process line
done
set +f
unset IFS
Explanation: setting IFS to a single newline makes word splitting occur at newlines only (as opposed to any whitespace character under the default setting). set -f turns off globbing (i.e. wildcard expansion), which would otherwise happen to the result of a command substitution $(jobs) or a variable substitution $foo. The for loop acts on all the pieces of $(jobs), which are all the non-empty lines in the command output. Finally, restore the globbing and IFS settings to values that are equivalent to the defaults.
- I have had trouble with setting IFS and unsetting IFS. I think the right thing to do is to store the old value of IFS and set IFS back to that old value. I'm not a bash expert, but in my experience, this gets you back to the original behavior. – Bjorn Roche, Jun 3, 2013 at 20:06
- @BjornRoche: inside a function, use local IFS=something. It won't affect the global-scope value. IIRC, unset IFS doesn't get you back to the default (and certainly doesn't work if it wasn't the default beforehand). – Peter Cordes, Aug 31, 2015 at 1:24
- I am wondering whether using set in the way shown in the last example is correct. The code snippet assumes that set +f was active at the beginning, and therefore restores that setting at the end. However, this assumption might be wrong. What if set -f was active at the beginning? – Binarus, Aug 21, 2020 at 6:43
- @Binarus I only restore settings equivalent to the defaults. Indeed, if you want to restore the original settings, you need to do more work. For set -f, save the original $-. For IFS, it's annoyingly fiddly if you don't have local and you want to support the unset case; if you do want to restore it, I recommend enforcing the invariant that IFS remains set (a restore sketch follows these comments). – Gilles 'SO- stop being evil', Aug 21, 2020 at 7:32
- Using local would indeed be the best solution, because local - makes the shell options local, and local IFS makes IFS local. Unfortunately, local is only valid within functions, which makes code restructuring necessary. Your suggestion to introduce the policy that IFS is always set also sounds very reasonable and solves the biggest part of the problem. Thanks! – Binarus, Aug 21, 2020 at 8:06
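A hedged sketch of that restore logic, based on the suggestions above (saving the noglob flag from $- and treating IFS as always set; the helper variable names are illustrative):
case $- in *f*) had_noglob=1;; *) had_noglob=;; esac   # remember whether set -f was already on
old_IFS=$IFS                                           # assumes IFS is set (the invariant above)
IFS='
'
set -f
for line in $(jobs); do
    : # process "$line"
done
[ -n "$had_noglob" ] || set +f                         # re-enable globbing only if it was on before
IFS=$old_IFS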
jobs="$(jobs)"
while IFS= read -r line
do
echo "$line"
done <<< "$jobs"
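Regarding the trailing-newline concern raised in the comments below: a herestring supplies its own terminating newline, so the last line is not lost even if the variable lacks one. A quick illustrative check:
var=$'first\nlast (no trailing newline)'
while IFS= read -r line; do
    printf 'got: %s\n' "$line"
done <<< "$var"
# got: first
# got: last (no trailing newline)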
- -r is a good idea too; it prevents \ interpretation... (it is in your links, but it's probably worth mentioning, just to round out your IFS=, which is essential to prevent losing whitespace). – Peter.O, Mar 21, 2011 at 15:33
- Only this solution worked for me. Thanks brah. – GeneCode, Jan 16, 2020 at 2:32
- Doesn't this solution suffer from the same problem which is mentioned in the comments to @dogbane's answer? What if the last line of the variable is not terminated by a newline character? – Binarus, Aug 21, 2020 at 6:57
- This answer provides the cleanest way to feed the content of a variable to the while read construct. – Mladen B., Feb 12, 2021 at 8:44
- This and the printf solution in the accepted answer are the only things that seem to work when the variable has already been defined with command substitution earlier, and this way is a lot cleaner than the printf hackiness. – Hashim Aziz, Oct 3, 2024 at 17:38
Problem: if you pipe into a while loop, the loop runs in a subshell and all variables set inside it are lost. Solution: use a for loop
# change delimiter (IFS) to new line.
IFS_BAK=$IFS
IFS=$'\n'
for line in $variableWithSeveralLines; do
echo "$line"
# return IFS back if you need to split new line by spaces:
IFS=$IFS_BAK
IFS_BAK=
lineConvertedToArraySplittedBySpaces=( $line )
echo "${lineConvertedToArraySplittedBySpaces[0]}"
# return IFS back to newline for "for" loop
IFS_BAK=$IFS
IFS=$'\n'
done
# return delimiter to previous value
IFS=$IFS_BAK
IFS_BAK=
- THANK YOU SO MUCH!! All of the above solutions failed for me. – nullByteMe, Jun 11, 2013 at 18:33
- Piping into a while read loop in bash means the while loop is in a subshell, so variables aren't global. while read; do ...; done <<< "$var" makes the loop body not a subshell. (Recent bash has an option to put the body of a cmd | while loop not in a subshell, like ksh has always had; see the sketch after these comments.) – Peter Cordes, Aug 31, 2015 at 1:30
- Also see this related post. – Wildcard, Feb 17, 2016 at 8:15
- In similar situations, I found it surprisingly difficult to treat IFS correctly. This solution has a problem as well: what if IFS is not set at all in the beginning (i.e. is undefined)? It will be defined in every case after that code snippet; this doesn't seem to be correct. – Binarus, Aug 21, 2020 at 6:53
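A minimal sketch of the bash option Peter Cordes refers to above (shopt -s lastpipe, available in bash 4.2+; it only takes effect when job control is off, which is the default in scripts):
#!/bin/bash
shopt -s lastpipe
n=0
printf '%s\n' "$variableWithSeveralLines" | while IFS= read -r line; do
    n=$((n + 1))
done
echo "$n"    # the count survives: the last pipeline element ran in the current shell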
In recent bash versions, use mapfile or readarray to efficiently read command output into arrays:
$ readarray test < <(ls -ltrR)
$ echo ${#test[@]}
6305
Disclaimer: horrible example, but you can probably come up with a better command to use than ls yourself
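If the data is already in a variable, a hedged sketch combining mapfile with a herestring (bash 4+; -t strips the trailing newline from each element; the array name is illustrative):
var=$(ls -ltrR)                  # any multiline data
mapfile -t lines <<< "$var"      # one array element per line
echo "${#lines[@]}"              # number of lines
printf '%s\n' "${lines[0]}"      # first line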
- It's a nice way, but litters /var/tmp with temporary files on my system. +1 anyway – Eugene Yarmash, Mar 22, 2011 at 13:18
- @eugene: that's funny. What system (distro/OS) is that on? – sehe, Mar 23, 2011 at 0:41
- It's FreeBSD 8. How to reproduce: put readarray in a function and call the function a few times. – Eugene Yarmash, Mar 23, 2011 at 8:16
- Nice one, @sehe. +1 – admirabilis, Feb 19, 2013 at 21:58
The common patterns to solve this issue have been given in the other answers.
However, I'd like to add my approach, although I am not sure how efficient it is. It is (at least for me) quite understandable, does not alter the original variable (all solutions which use read need the variable to end with a trailing newline and therefore add one, which alters the variable), does not create subshells (which all pipe-based solutions do), does not use here-strings (which have their own issues), and does not use process substitution (nothing against it, but it is a bit hard to understand sometimes).
Actually, I don't understand why bash's integrated REs are used so rarely. Perhaps they are not portable, but since the OP has used the bash tag, that won't stop me :-)
#!/bin/bash
function ProcessLine() {
printf '%s' "1ドル"
}
function ProcessText1() {
local Text=1ドル
local Pattern=$'^([^\n]*\n)(.*)$'
while [[ "$Text" =~ $Pattern ]]; do
ProcessLine "${BASH_REMATCH[1]}"
Text="${BASH_REMATCH[2]}"
done
ProcessLine "$Text"
}
function ProcessText2() {
local Text=1ドル
local Pattern=$'^([^\n]*\n)(.*)$'
while [[ "$Text" =~ $Pattern ]]; do
ProcessLine "${BASH_REMATCH[1]}"
Text="${BASH_REMATCH[2]}"
done
}
function ProcessText3() {
local Text=1ドル
local Pattern=$'^([^\n]*\n?)(.*)$'
while [[ ("$Text" != '') &&
("$Text" =~ $Pattern) ]]; do
ProcessLine "${BASH_REMATCH[1]}"
Text="${BASH_REMATCH[2]}"
done
}
MyVar1=$'a1\nb1\nc1\n'
MyVar2=$'a2\n\nb2\nc2'
MyVar3=$'a3\nb3\nc3'
ProcessText1 "$MyVar1"
ProcessText1 "$MyVar2"
ProcessText1 "$MyVar3"
Output:
root@cerberus:~/scripts# ./test4
a1
b1
c1
a2
b2
c2a3
b3
c3root@cerberus:~/scripts#
A few notes:
The behavior depends on what variant of ProcessText you use. In the example above, I have used ProcessText1.
Note that:
- ProcessText1 keeps newline characters at the end of lines.
- ProcessText1 processes the last line of the variable (which contains the text c3) although that line does not contain a trailing newline character. Because of the missing trailing newline, the command prompt after the script execution is appended to the last line of the output without being separated from it.
- ProcessText1 always considers the part between the last newline in the variable and the end of the variable as a line, even if it is empty; of course, that line, whether empty or not, does not have a trailing newline character. That is, even if the last character in the variable is a newline, ProcessText1 will treat the empty part (null string) between that last newline and the end of the variable as a (yet empty) line and will pass it to line processing. You can easily prevent this behavior by wrapping the second call to ProcessLine in an appropriate check-if-empty condition; however, I think it is more logical to leave it as-is.
ProcessText1 needs to call ProcessLine in two places, which might be inconvenient if you would like to place a block of code there that directly processes the line instead of calling a function; you would have to repeat that code, which is error-prone.
In contrast, ProcessText3 processes the line (or calls the respective function) in only one place, which makes replacing the function call with a code block a no-brainer. This comes at the cost of two while conditions instead of one. Apart from the implementation differences, ProcessText3 behaves exactly like ProcessText1, except that it does not consider the part between the last newline character in the variable and the end of the variable to be a line if that part is empty. That is, ProcessText3 will not go into line processing after the last newline character of the variable if that newline character is the last character in the variable.
ProcessText2 works like ProcessText1, except that lines must have a trailing newline character. That is, the part between the last newline character in the variable and the end of the variable is not considered to be a line and is silently thrown away. Consequently, if the variable does not contain any newline character, no line processing happens at all.
I like that approach more than the other solutions shown above, but probably I have missed something (not being very experienced in bash programming, and not being interested in other shells very much).
You can use <<< to simply read from the variable containing the newline-separated data:
while read -r line
do
echo "A line of input: $line"
done <<<"$lines"
- Welcome to Unix & Linux! This essentially duplicates an answer from four years ago. Please don't post an answer unless you have something new to contribute. – G-Man Says 'Reinstate Monica', May 4, 2015 at 6:30