I know that a custom IFS value can be set for the scope of a single command or built-in. Is there a way to set a custom IFS value for a single statement? Apparently not: as the transcript below shows, the global IFS value is changed when this is attempted.
#check environment IFS value, it is space-tab-newline
printf "%s" "$IFS" | od -bc
0000000 040 011 012
\t \n
0000003
#invoke built-in with custom IFS
IFS=$'\n' read -r -d '' -a arr <<< "$str"
#environment IFS value remains unchanged as seen below
printf "%s" "$IFS" | od -bc
0000000 040 011 012
\t \n
0000003
#now attempt to set IFS for a single statement
IFS=$'\n' a=($str)
#BUT environment IFS value is overwritten as seen below
printf "%s" "$IFS" | od -bc
0000000 012
\n
0000001
7 Answers
In some shells (including bash):
IFS=: command eval 'p=($PATH)'
(with bash, you can omit the command prefix if not in sh/POSIX emulation). But beware that when using unquoted variables, you also generally need set -f, and there's no local scope for that in most shells.
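As a rough illustration of the command eval form, here is a minimal sketch for bash. The variable names dirs, parts and ifs_before are made up for the example, and it assumes globbing was enabled beforehand (so that set +f restores the previous state):

```shell
#!/usr/bin/env bash
# Hypothetical colon-separated list standing in for $PATH.
dirs='/usr/bin:/bin:/opt/my tools/*'
ifs_before=$IFS

# IFS=: is in effect only while eval runs. set -f keeps the '*' from being
# glob-expanded; set +f re-enables globbing (assumes it was enabled before).
IFS=: command eval 'set -f; parts=($dirs); set +f'

printf 'fields: %s\n' "${#parts[@]}"
printf 'last field: %s\n' "${parts[2]}"
[ "$IFS" = "$ifs_before" ] && echo 'IFS is unchanged'
```

The array assigned inside eval stays visible afterwards because eval runs in the current shell; only the prefixed IFS assignment is temporary.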
With zsh, you can do:
(){ local IFS=:; p=($=PATH); }
$=PATH is to force word splitting, which is not done by default in zsh (globbing upon variable expansion is not done either, so you don't need set -f unless in sh emulation).
However, in zsh, you'd rather use $path, which is an array tied to $PATH, or, to split with arbitrary delimiters: p=(${(s[:])PATH}) or p=("${(s[:]@)PATH}") to preserve empty elements.
(){...} (or function {...}) are called anonymous functions and are typically used to set a local scope. With other shells that support local scope in functions, you could do something similar with:
e() { eval "$@"; }
e 'local IFS=:; p=($PATH)'
To implement a local scope for variables and options in POSIX shells, you can also use the functions provided at https://github.com/stephane-chazelas/misc-scripts/blob/master/locvar.sh. Then you can use it as:
. /path/to/locvar.sh
var=3,2,2
call eval 'locvar IFS; locopt -f; IFS=,; set -- $var; a=$1 b=$2 c=$3'
(by the way, it's invalid to split $PATH that way above except in zsh, as in other shells IFS is a field delimiter, not a field separator).
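The delimiter-versus-separator point can be seen with a small bash sketch. The sample values and the names trailing, middle, t and m are invented for the illustration:

```shell
#!/usr/bin/env bash
# Hypothetical PATH-like values. In $PATH, an empty entry means the
# current directory, so losing one changes the meaning.
trailing='/bin:/usr/bin:'     # empty entry at the end
middle='/bin::/usr/bin'       # empty entry in the middle

IFS=: command eval 'set -f; t=($trailing); m=($middle); set +f'

# ':' terminates a field rather than separating two fields, so the
# trailing empty entry disappears while the middle one survives.
echo "trailing: ${#t[@]} fields"
echo "middle:   ${#m[@]} fields"
```

In bash this reports 2 fields for the first value and 3 for the second, which is why word splitting is not a faithful way to parse $PATH outside zsh.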
IFS=$'\n' a=($str)
is just two assignments, one after the other, just like a=1 b=2.
A note of explanation on var=value cmd:
In:
var=value cmd arg
The shell executes /path/to/cmd in a new process and passes cmd and arg in argv[] and var=value in envp[]. That's not really a variable assignment, but more passing environment variables to the executed command. In the Bourne or Korn shell, with set -k, you can even write it cmd var=value arg.
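A minimal sketch of that environment-passing behaviour, using sh -c as the external command; the variable name greeting is made up for the example:

```shell
#!/bin/sh
unset greeting

# The assignment is placed in the child's environment only.
greeting=hello sh -c 'echo "child sees: ${greeting-<unset>}"'

# The parent shell never had the variable set.
echo "parent sees: ${greeting-<unset>}"
```

The child prints hello, while the parent still reports the variable as unset.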
Now, that doesn't apply to builtins or functions, which are not executed as separate processes. In the Bourne shell, in var=value some-builtin, var ends up being set afterwards, just like with var=value alone. That means, for instance, that the behaviour of var=value echo foo (which is not useful) varies depending on whether echo is builtin or not.
POSIX and/or ksh changed that: the Bourne behaviour now only happens for a category of builtins called special builtins. eval is a special builtin, read is not. For a non-special builtin, var=value builtin sets var only for the execution of the builtin, which makes it behave similarly to when an external command is being run.
The command builtin can be used to remove the special attribute of those special builtins. What POSIX overlooked, though, is that for the eval and . builtins, that would mean that shells would have to implement a variable stack (even though it doesn't specify the local or typeset scope-limiting commands), because you could do:
a=0; a=1 command eval 'a=2 command eval echo \$a; echo $a'; echo $a
Or even:
a=1 command eval myfunction
with myfunction being a function using or setting $a and potentially calling command eval.
That was really an oversight because ksh (which the spec is mostly based on) didn't implement it (and AT&T ksh and zsh still don't), but nowadays, except for those two, most shells implement it. Behaviour varies among shells, though, in things like:
a=0; a=1 command eval a=2; echo "$a"
Using local on shells that support it is a more reliable way to implement local scope.
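For comparison, a small sketch of the local approach, assuming a shell that provides local (bash here). The function name scoped and the variable inner_value are hypothetical:

```shell
#!/usr/bin/env bash
a=0

scoped() {
    local a=1        # shadows the global for the duration of this call
    a=2              # modifies the local copy, not the global
    inner_value=$a   # record what the function saw (global, for display)
}

scoped
echo "inside: $inner_value, after: $a"
```

Here the assignment made inside the function does not leak out, regardless of how a given shell treats a=1 command eval a=2.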
- Weirdly, IFS=: command eval ... sets IFS only for the duration of the eval, as mandated by POSIX, in dash, pdksh and bash, but not in ksh 93u. It's unusual to see ksh being the odd-non-compliant-one-out. – Gilles 'SO- stop being evil', Sep 24, 2013 at 21:29
Standard save-and-restore taken from "The Unix Programming Environment" by Kernighan and Pike:
#!/bin/sh
old_IFS=$IFS
IFS="something_new"
some_program_or_builtin
IFS=${old_IFS}
- thank you and +1. Yes, I am aware of this option, but I would like to know if there is a "cleaner" option, if you know what I mean. – iruvar, Sep 24, 2013 at 16:46
- You could jam it onto one line with semi-colons, but I don't think that's cleaner. It might be nice if everything you wanted to express had special syntactic support, but then we'd probably have to learn carpentry or sumptin instead of coding ;) – msw, Sep 24, 2013 at 16:49
- That fails to restore $IFS correctly if it was previously unset. – Stéphane Chazelas, Sep 24, 2013 at 17:17
- If it's unset, Bash treats it as $'\t\n'' ', as explained here: wiki.bash-hackers.org/syntax/expansion/… – davide, Mar 15, 2015 at 1:36
- @davide, that would be $' \t\n'. Space has to be first as that's used for "$*". Note that it's the same in all Bourne-like shells. – Stéphane Chazelas, May 12, 2015 at 10:00
This snippet from the question:
IFS=$'\n' a=($str)
is interpreted as two separate global variable assignments evaluated from left to right, and is equivalent to:
IFS=$'\n'; a=($str)
or
IFS=$'\n'
a=($str)
This explains both why the global IFS was modified, and why the word-splitting of $str into array elements was performed using the new value of IFS.
You might be tempted to use a subshell to limit the effect of the IFS modification like this:
str="value 0:value 1"
a=( old values )
( # Following code runs in a subshell
IFS=":"
a=($str)
printf 'Subshell IFS: %q\n' "${IFS}"
echo "Subshell: a[0]='${a[0]}' a[1]='${a[1]}'"
)
printf 'Parent IFS: %q\n' "${IFS}"
echo "Parent: a[0]='${a[0]}' a[1]='${a[1]}'"
but you will quickly notice that the modification of a is also limited to the subshell:
Subshell IFS: :
Subshell: a[0]='value 0' a[1]='value 1'
Parent IFS: $' \t\n'
Parent: a[0]='old' a[1]='values'
Next, you would be tempted to save/restore IFS using the solution from this previous answer by @msw, or to try and use a local IFS inside a function as suggested by @helpermethod.
But pretty soon, you notice you are in all sorts of trouble, especially if you are a library author who needs to be robust against misbehaving invoking scripts:
- What if IFS was initially unset?
- What if we are running with set -u (a.k.a. set -o nounset)?
- What if IFS was made read-only via declare -r IFS?
- What if I need the save/restore mechanism to work with recursion and/or asynchronous execution (such as a trap handler)?
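To make the first two pitfalls concrete, here is a hedged sketch for bash. It runs the naive save step in a subshell so the failure cannot affect the calling script; the names outcome, state and saved are illustrative only:

```shell
#!/usr/bin/env bash
# Naive save step: with IFS unset and nounset active, expanding $IFS is an
# error, so the subshell exits with a non-zero status.
if (unset IFS; set -u; saved=$IFS) 2>/dev/null; then
    outcome='ok'
else
    outcome='failed'
fi
echo "naive save under set -u: $outcome"

# Testing for set/unset with ${IFS+set} does not trip nounset.
state=$(unset IFS; set -u; [ -n "${IFS+set}" ] && echo 'set' || echo 'unset')
echo "IFS state detected safely: $state"
```

This only covers the unset and nounset cases; a read-only IFS or re-entrant use from a trap handler would need separate handling.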
Please don't save/restore IFS. Instead, stick to temporary modifications:
- To limit the variable modification to a single command, built-in or function invocation, use IFS="value" command.
  - To read into multiple variables by splitting on a specific character (: used below as example), use:
    IFS=":" read -r var1 var2 <<< "$str"
  - To read into an array use (do this instead of array_var=( $str )):
    IFS=":" read -r -a array_var <<< "$str"
- Limit the effects of modifying the variable to a subshell.
  - To output an array's elements separated by comma:
    (IFS=","; echo "${array[*]}")
  - To capture that into a string:
    csv="$(IFS=","; echo "${array[*]}")"
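The two recommendations can be combined into a short round trip. This is only a sketch for bash, with a made-up input string and the hypothetical names fields and ifs_before:

```shell
#!/usr/bin/env bash
str='value 0:value 1'     # hypothetical input
ifs_before=$IFS

# Split on ':' for the duration of read only.
IFS=':' read -r -a fields <<< "$str"

# Join with ',' inside a command substitution, which runs in a subshell.
csv=$(IFS=','; echo "${fields[*]}")

echo "count: ${#fields[@]}"
echo "csv:   $csv"
[ "$IFS" = "$ifs_before" ] && echo 'IFS is unchanged'
```

Neither step touches the IFS of the calling shell, so nothing has to be saved or restored.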
- This is an excellent answer, and extremely underrated. Thank you! – Dan Moulding, Dec 3, 2019 at 23:21
- @DanMoulding: This answer was written 5 years later, that's why it is underrated. – Igor Chubin, Oct 27, 2021 at 7:41
Put your script into a function and invoke that function, passing the command-line arguments to it. Because IFS is declared local, changes to it don't affect the global IFS.
main() {
local IFS='/'
# the rest goes here
}
main "$@"
For this command:
IFS=$'\n' a=($str)
There is an alternative solution: to give the first assignment (IFS=$'\n') a command to execute (a function):
$ split(){ a=( $str ); }
$ IFS=$'\n' split
That will put IFS in the environment to call split, but will not be retained in the present environment.
This also avoids the always risky use of eval.
- In ksh93 and mksh, and bash and zsh when in POSIX mode, that still leaves $IFS set to $'\n' afterwards, as required by POSIX. – Stéphane Chazelas, Jan 9, 2019 at 12:26
The proposed answer from @helpermethod is certainly an interesting approach. But it's also a bit of a trap because in Bash, local variable scope extends from the caller to the called function. Therefore, setting IFS in main() will result in that value persisting to functions called from main(). Here's an example:
#!/usr/bin/env bash
#
func() {
# local IFS='\'
local args=${@}
echo -n "$FUNCNAME A"
for ((i=0; i<${#args[@]}; i++)); do
printf "[%s]: %s" "${i}" "${args[$i]}"
done
echo
local f_args=( $(echo "${args[0]}") )
echo -n "$FUNCNAME B"
for ((i=0; i<${#f_args[@]}; i++)); do
printf "[%s]: %s" "${i}" "${f_args[$i]} "
done
echo
}
main() {
local IFS='/'
# the rest goes here
local args=${@}
echo -n "$FUNCNAME A"
for ((i=0; i<${#args[@]}; i++)); do
printf "[%s]: %s" "${i}" "${args[$i]}"
done
echo
local m_args=( $(echo "${args[0]}") )
echo -n "$FUNCNAME B"
for ((i=0; i<${#m_args[@]}; i++)); do
printf "[%s]: %s" "${i}" "${m_args[$i]} "
done
echo
func "${m_args[*]}"
}
main "$@"
And the output...
main A[0]: ick/blick/flick
main B[0]: ick [1]: blick [2]: flick
func A[0]: ick/blick/flick
func B[0]: ick [1]: blick [2]: flick
If IFS declared in main() wasn't still in scope in func(), then the array would not have been properly parsed in func() B. Uncomment the first line in func() and you get this output:
main A[0]: ick/blick/flick
main B[0]: ick [1]: blick [2]: flick
func A[0]: ick/blick/flick
func B[0]: ick/blick/flick
Which is what you should get if IFS had gone out of scope.
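The same dynamic-scoping effect can be reduced to a few lines. This is a condensed sketch for bash, not the script above; the function names inner and outer and the counters are hypothetical:

```shell
#!/usr/bin/env bash
inner() {
    # Unquoted on purpose: split using whatever IFS is visible here.
    parts=($1)
}

outer() {
    local IFS='/'
    inner 'ick/blick/flick'   # inner sees outer's local IFS
}

outer
count_via_outer=${#parts[@]}

inner 'ick/blick/flick'       # called directly: default IFS applies
count_direct=${#parts[@]}

echo "via outer: $count_via_outer, direct: $count_direct"
```

The callee splits into three elements when reached through outer, but keeps the string whole when called directly, so a local IFS is not invisible to the functions it calls.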
A far better solution, IMHO, is to forgo changing or relying on IFS at the global/local level. Instead, spawn a new shell and fiddle with IFS in there. For instance, if you were to call func() in main() as follows, passing the array as a string with a backslash field separator:
func $(IFS='\'; echo "${m_args[*]}")
...that change to IFS will not be reflected within func(). The array will be passed as a string:
ick\blick\flick
...but inside of func() the IFS will still be "/" (as set in main()) unless changed locally in func().
More information about isolating changes to IFS can be viewed at the following links:
How do I convert a bash array variable to a string delimited with newlines?
Hints and Tips for general shell script programing -- See "NOTE the use of sub-shells..."
- "Bash string to array with IFS" IFS=$'\n' declare -a astr=(...) perfect thanks! – Aquarius Power, Nov 16, 2014 at 4:54
The most straightforward solution is to take a copy of the original $IFS, as in e.g. the answer of msw. However, this solution does not distinguish between an unset IFS and an IFS set equal to the empty string, which is important for many applications. Here is a more general solution which captures this distinction:
# Functions taking care of IFS
set_IFS(){
if [ -z "${IFS+x}" ]; then
IFS_ori="__unset__"
else
IFS_ori="$IFS"
fi
IFS="$1"
}
reset_IFS(){
if [ "${IFS_ori}" = "__unset__" ]; then
unset IFS
else
IFS="${IFS_ori}"
fi
}
# Example of use
set_IFS "something_new"
some_program_or_builtin
reset_IFS