28

How do I correctly run a few commands with an altered value of the IFS variable (to change the way field splitting works and how "$*" is handled), and then restore the original value of IFS?

I know I can do

(
 IFS='my value here'
 my-commands here
)

to localize the change of IFS to the sub-shell, but I don't really want to start a sub-shell, especially not if I need to change or set the values of variables that need to be visible outside of the sub-shell.

I know I can use

saved_IFS=$IFS; IFS='my value here'
my-commands here
IFS=$saved_IFS

but that seems to not restore IFS correctly in the case that the original IFS was actually unset.

Looking for answers that are shell agnostic (but POSIX).

Clarification: That last line above means that I'm not interested in a bash-exclusive solution. In fact, the system I'm using most, OpenBSD, does not even come with bash installed at all by default, and bash is not a shell I use for anything much other than to answer questions on this site. It's much more interesting to see solutions that I may use in bash or other POSIX-like shells without making an effort to write non-portable code.

asked Mar 19, 2021 at 11:03
12
  • 2
    @CharlesDuffy, in bash like in all shells with scoping, you'd rather use local (though it works best with shells with static scoping or with zsh's private instead (not that you'd use $IFS in zsh)) . The output of bash's declare -p is not always safe for evaling. Commented Mar 19, 2021 at 19:20
  • 2
    @CharlesDuffy see Escape a variable for use as content of another script. IOW, I wouldn't use eval on anything that has been quoted with anything other than the single-quote based approaches there (and even then, it's best to avoid evaling arbitrary data if that can be avoided) Commented Mar 19, 2021 at 19:35
  • 1
    @StéphaneChazelas, mm, I have a hard time telling how that answer (to "Escape a variable for use as content of another script") would tell why declare -p within a single Bash script would be a problem? It seems to focus on differences between shells, and mentions a number of different ways for producing quoted versions of a variable, so it's rather hard to pick up what issue you're referring to. Commented Mar 20, 2021 at 9:53
  • 1
    @CharlesDuffy, yes, just still means that the unset case needs special treatment with declare too. A bit like with the unset IFS [ -n "${save+set}" ] && IFS=$save; case below (it's exactly the same workaround of course, since in the other direction you can just declare -p IFS 2> /dev/null) Commented Mar 20, 2021 at 13:47
  • 1
    If it’s only a Single command you can probably also use IFS="Xy" command Commented Mar 20, 2021 at 17:13

6 Answers

30

Yes, in the case when IFS is unset, restoring the value from $saved_IFS would actually set the value of IFS (to an empty value).

This would affect the way field splitting of unquoted expansions is done, it would affect field splitting for the read built-in utility, and it would affect the way the positional parameters are combined into a string when using "$*".

With an unset IFS these things would happen as if IFS had the value of a space, a tab character, and a newline character, but with an empty value, there would be no field splitting and the positional parameters would be concatenated into a string with no delimiter when using "$*". So, there's a difference.
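To make the difference concrete, here is a minimal sketch that joins the same positional parameters with "$*" under an unset, an empty, and a single-character IFS. The variable names (joined_unset and so on) are only illustrative.

```shell
# Join the positional parameters with "$*" under three IFS states.
set -- a b c

unset IFS
joined_unset="$*"      # unset IFS: parameters are joined with a single space

IFS=
joined_empty="$*"      # empty IFS: parameters are concatenated with no separator

IFS=:
joined_colon="$*"      # otherwise the first character of IFS is the separator

unset IFS              # go back to default splitting for the rest of the script
printf '%s\n' "$joined_unset" "$joined_empty" "$joined_colon"
```

This prints "a b c", "abc" and "a:b:c" on separate lines, so a restore that turns an unset IFS into an empty one changes behaviour in a visible way.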

To correctly restore IFS, consider setting saved_IFS only if IFS is actually set to something.

unset saved_IFS
[ -n "${IFS+set}" ] && saved_IFS=$IFS

The parameter substitution ${IFS+set} expands to the string set only if IFS is set, even if it is set to an empty string. If IFS is unset, it expands to an empty string, which means that the -n test would be false and saved_IFS would remain unset.

Now, saved_IFS is unset if IFS was initially unset, or it has the value that IFS had, and you can set whatever value you want for IFS and run your code.

When restoring IFS, you do a similar thing:

unset IFS
[ -n "${saved_IFS+set}" ] && { IFS=$saved_IFS; unset saved_IFS; }

The final unset saved_IFS isn't really necessary, but it may be good to clean up old variables from the environment.
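Put together, the save and restore steps can be checked with a short round trip. This is only a sketch: it starts from an unset IFS on purpose, and the names during and state are made up for the demonstration.

```shell
# Start from the awkward case: IFS is unset.
unset IFS

# Save: saved_IFS stays unset when IFS is unset.
unset saved_IFS
[ -n "${IFS+set}" ] && saved_IFS=$IFS

# Work with a temporary value.
IFS=:
set -- x y
during="$*"

# Restore: unset first, then set again only if there was a saved value.
unset IFS
[ -n "${saved_IFS+set}" ] && { IFS=$saved_IFS; unset saved_IFS; }

state=${IFS+set}       # empty when IFS is unset
printf 'during=%s, IFS is %s afterwards\n' "$during" "${state:-unset}"
```

The last line reports "during=x:y, IFS is unset afterwards", which is the state the plain saved_IFS=$IFS approach fails to reproduce.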


An alternative way of doing this, suggested by LL3 in comments (now deleted), relies on prefixing the unset command by :, a built-in utility that does nothing, effectively commenting out the unset, when it's not needed:

saved_IFS=$IFS
${IFS+':'} unset saved_IFS

This sets saved_IFS to the value of $IFS, but then unsets it if IFS was unset.

Then set IFS to your value and run your commands. Then restore with

IFS=$saved_IFS
${saved_IFS+':'} unset IFS

(possibly followed by unset saved_IFS if you want to clean up that variable too).

Note that : must be quoted, as above, or escaped as \:, so that it isn't modified by $IFS containing : (the unquoted parameter substitution invokes field splitting, after all).
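As a small check of the quoting remark, the following sketch runs the : variant while the original IFS itself contains a colon. The starting value ':,' and the variable restored are assumptions chosen for the demonstration.

```shell
IFS=':,'                       # original value contains a colon on purpose

saved_IFS=$IFS
${IFS+':'} unset saved_IFS     # runs ": unset saved_IFS", a no-op, because IFS is set

IFS=' '
# ... commands that need the temporary IFS would go here ...

IFS=$saved_IFS
${saved_IFS+':'} unset IFS     # again a no-op, because saved_IFS is set
unset saved_IFS

restored=$IFS                  # keep a copy of the restored value for inspection
unset IFS                      # reset to default behaviour after the demonstration
printf 'restored IFS: [%s]\n' "$restored"
```

Because the colon is quoted inside the expansion, it is not removed by field splitting even though IFS contains ":" at that point, and the original value comes back intact.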

answered Mar 19, 2021 at 11:03
4
  • 5
    Note that those kinds of approaches are not re-entrant in that for instance, in between the setting and restoring, you can't call a function that uses the same approach. Commented Mar 19, 2021 at 18:55
  • 2
    Your $IFS+: approach reminds me of groups.google.com/g/comp.unix.shell/c/25QYE-0toQA/m/… :-) Commented Mar 19, 2021 at 19:04
  • 1
    groups.google.com/g/comp.unix.shell/c/00mMle2zpgc/m/… is probably where it was invented. You'll notice Laura Fairhead participated in that thread who coined a few shell idiom pearls. Commented Mar 19, 2021 at 19:12
  • 2
    The change from ${IFS+:} to ${IFS:+':'} would have been as a work around for older versions of zsh, where in sh emulation ${IFS+:} would have expanded to two empty strings if $IFS contained : (: undergoing IFS-splitting) Commented Mar 19, 2021 at 19:17
7

Inside a bash function, you can use local IFS=$'\n' or whatever to shadow the global (or parent function's local) value of IFS while inside the scope of this function. Further assignment to IFS will still be modifying your local version.

In bash,

It is an error to use local when not within a function.

So this doesn't help if you're not writing a function, or if you're using a shell without local (or an equivalent), but if you are in a function (and you know which IFS values you want at all points until it returns), there is an easy and good solution.

A function doesn't involve a subshell as long as you define it with
foo(){ ...; } instead of foo() ( ... ).
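A minimal sketch of that pattern follows. The function name join_with_comma and the global joined are hypothetical, and local is not specified by POSIX, so this assumes bash or another shell that provides it.

```shell
# Hypothetical helper: join all arguments with commas.
join_with_comma() {
    local IFS=,        # shadows the caller's IFS until the function returns
    joined="$*"        # deliberately global, so the result survives the call
}

IFS=';'                # some value the caller happens to use
join_with_comma a b c
after=$IFS             # the caller's value is back once the function returns

unset IFS              # reset to default behaviour after the demonstration
printf '%s\n' "$joined"
```

No save or restore code is needed here: the shell puts the caller's IFS back when the function returns, and since the body uses { } rather than ( ), joined is still set afterwards.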

answered Mar 19, 2021 at 22:26
13
  • 2
    local isn't POSIX, but Bash/Dash/Busybox do have it. Ksh is a problem here, though. Commented Mar 20, 2021 at 10:01
  • @ilkkachu: Oh, I missed the part of the question that was asking for shell-agnostic / POSIX. Even so, I wanted to post for future readers who come across this question without that limitation, because it's enough nicer that it's worth knowing about. Commented Mar 20, 2021 at 10:12
  • 1
    @zwol, yeah, that argument could be used for about half the questions and answers on this site. Also, given that checkbashisms exists, not every script author seems to have gotten that memo. (sure, it's better nowadays, but, still.) Commented Mar 21, 2021 at 8:34
  • 3
    @BrianDrake: No, zwol seems to be arguing that the only reason to write a shell script is portability, which means only using POSIX sh features. (And only features that aren't known to be buggy on any important shells, see zwol's answer). With that mindset, there's never a reason to write bash-only scripts. (This is of course flawed logic; e.g. autocomplete scripts are very shell-specific, and for performance and other reasons are written in the shell's own language.) Commented Mar 21, 2021 at 8:45
  • 2
    @Kusalananda, about that, see List of shells that support `local` keyword for defining local variables Commented Mar 21, 2021 at 13:04
4

In sufficiently old shells, unset either doesn't exist at all or is unusably buggy (comments in Autoconf's source code say that unset IFS may crash the process). Kusalananda's answer cannot be used with such shells.

If you have to worry about shells this old, your best bet is to set IFS to a space, a tab, and a newline, in that order, as early as possible:

# There is a hard tab between the second pair of single quotes.
IFS=' '' ''
'

This setting has the same effect as an unset IFS, but it can be safely saved and restored with the second construct from the question:

saved_IFS="$IFS"; IFS='my value here'
my commands here
IFS="$saved_IFS"

(Double-quoting the right hand side of variable=$othervariable is technically not necessary, but it makes life easier for everyone who might have to read your shell script in the future if you don't make them remember that.)

answered Mar 20, 2021 at 23:52
11
  • +1 Simple, shell agnostic and double-quotes the variable expansions (which the question and other answers failed to do). I suggest you add an explanation about that last point. Commented Mar 21, 2021 at 3:50
  • @BrianDrake, note that foo=$bar is one of the cases where double-quoting is not necessary. (bar e.g. some earlier buggy cases with Certain Shells.) Commented Mar 21, 2021 at 8:43
  • Can you mention a shell that does not have unset? Would this be a POSIX shell? Commented Mar 21, 2021 at 12:18
  • 3
    @Kusalananda POSIX does require unset. The problem is, /bin/sh on several of the most popular surviving proprietary Unixes isn't POSIX compliant -- its behavior was intentionally frozen without the changes required by Unix95. And since /bin/sh is the only shell that is guaranteed to exist, and the one run by system and similar... Commented Mar 21, 2021 at 14:06
  • 1
    @zwol Sorry, but I'm intrigued. What other current popular commercial Unix contains an original Bourne shell? The Korn shell playing the role of sh on AIX has no issue with its unset AFAIK. Only the old SunOS sh on Solaris is documented to not be able to unset IFS (or PATH, or MAILCHECK or the prompt variables). macOS sh is bash, so there should be no issue there. Commented Mar 21, 2021 at 20:52
0

In Bash, I'd do it this way:

[ -v IFS ] && oldIFS="$IFS" || unset oldIFS
IFS=something
some commands
[ -v oldIFS ] && IFS="$oldIFS" || unset IFS

or this way:

[ "${IFS+set}" ] && oldIFS="$IFS" || unset oldIFS
IFS=something
some commands
[ "${oldIFS+set}" ] && IFS="$oldIFS" || unset IFS
answered Mar 21, 2021 at 4:08
7
  • Did you mean [[ instead of [? Your answer mentions Bash and according to the manpage [(1) on my system, there is no -v test. Commented Mar 21, 2021 at 4:22
  • @Brian Drake: Did you try it? I never had a reason to look into [[. Commented Mar 21, 2021 at 4:26
  • From reading the bash manpage more carefully, it turns out that it has its own version of [, which supports the same tests as [[. I do not understand why there are two forms, nor how POSIX-compatible either of them is. Commented Mar 21, 2021 at 4:39
  • Anyway, both [ and [[ work in bash --posix. Perhaps my new question should be: Why did you mention Bash at all? The question asked for a shell-agnostic answer. Commented Mar 21, 2021 at 4:49
  • 3
    @BrianDrake Running bash in POSIX mode does not disable all non-POSIX features. The fact that [[ works in POSIX mode in bash does not mean that [[ is a POSIX feature. The fact that it's not mentioned in the POSIX standard (other than "causing unspecified result") means it's not a POSIX feature. It's allowed to be interpreted (in an unspecified way) by a shell running in POSIX mode. Commented Mar 21, 2021 at 8:32
0

copy

The initial goal is to copy a variable (a) to another (b).
Doing a simple b=$a works if a is set (either to "" or to some value), but if a is unset, b needs to be unset as well. Otherwise, b ends up set to "".

An unset IFS works differently than a null IFS (in bash):

                          $' \t\n'       unset          null ("")
Split expansions          default        default        no splitting
Join arguments with "$*"  "$1c$2c..."    "$1 $2 ..."    "$1$2"

So, we need two steps, copy the value and unset the copied variable (if needed). A variable copy from a to b could be done in several ways:

if [ -n "${a+set}" ]; then b="$a"; else unset b; fi
[ -n "${a+set}" ] && b="$a" || unset b
[ "${a+set}" ] && b="$a" || unset b
${a+'false'} && unset b || b=$a

Then, for IFS, we can copy it to oldIFS, change the value of IFS as needed, and restore it after use:

${IFS+'false'} && unset oldIFS || oldIFS=$IFS
IFS='new value'
${oldIFS+'false'} && unset IFS || IFS=$oldIFS

function(s)

The only way to improve this is to use a function, and yes, a function would be able to copy two vars:

copyIFS () { ${IFS+'false'} && unset oldIFS || oldIFS=$IFS; }

provided that the names of the variables to modify are known before writing the function as the function must access such variables at the global scope. No local possible, no use of declare/typeset.

Without resorting to eval, it is not possible in sh to write a generic function such as copyvars var1 var2 (with var1 and var2 passed in as variable names). That would require name references (namerefs), which POSIX sh does not have.

The restore function (using the swapped variable names) is:

restoreIFS () { ${oldIFS+'false'} && unset IFS || IFS=$oldIFS; }

Defining those two functions, we can do:

copyIFS
IFS='a new value'
restoreIFS

Probably simpler, less prone to mistakes.
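A save/restore pair like this is easy to get backwards, so it is worth running once with IFS unset and once with IFS empty. The sketch below reuses the names copyIFS and restoreIFS but writes the bodies as plain if statements on ${var+set}; the state_* and value_empty variables are only there for the check.

```shell
# Same function names as above; bodies written as explicit if/else tests.
copyIFS ()    { if [ -n "${IFS+set}" ]; then oldIFS=$IFS; else unset oldIFS; fi; }
restoreIFS () { if [ -n "${oldIFS+set}" ]; then IFS=$oldIFS; else unset IFS; fi; }

# Round trip 1: IFS starts unset and must end unset.
unset IFS
copyIFS
IFS=:
restoreIFS
state_unset=${IFS+set}          # empty if IFS is unset again

# Round trip 2: IFS starts empty and must end empty (but still set).
IFS=
copyIFS
IFS=:
restoreIFS
state_empty=${IFS+set}          # "set" if IFS exists
value_empty=$IFS                # and its value should be the empty string

unset IFS                       # reset to default behaviour after the demonstration
printf 'unset case: [%s]  empty case: [%s] value=[%s]\n' \
    "$state_unset" "$state_empty" "$value_empty"
```

The expected report is "unset case: []  empty case: [set] value=[]": the first round trip leaves IFS unset, the second leaves it set to the empty string.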

answered Apr 11, 2021 at 20:09
0

Not an expert, but in zsh you can also use an anonymous function.

myArray=($'\1', $'\1')
printf "before: "
typeset -p IFS
function {
 local IFS=$'\0'
 joinedArray=${(j::)myArray}
 printf "during: "
 typeset -p IFS
}
printf "after: "
typeset -p IFS

This prints:

before: typeset IFS=$' \t\n\C-@'
during: typeset IFS=$'\C-@'
after: typeset IFS=$' \t\n\C-@'

So the value of IFS is restored. I'm guessing this is probably more lightweight than a subshell.

answered Jul 8, 2023 at 22:50
