A basic, innocent question: in bash scripting, why ever use a function if you can set a variable to a command substitution that contains the essence of the function, that is, a command or set of commands that is supposed to output a certain value?
In other words: does it matter whether one defines a variable or a function for a certain desired output? When and why should it be implemented as a variable? When and why is it better to implement it as a function?
Example: Let's say there is a directory on your system that contains a lot of first-level sub-directories, and you want a bash script to find out which one was modified most recently.
In a bash script, you can define a variable rece_dir for it, and print out its content on demand:
#!/bin/bash
# latest-directory-displayer
#
# Copyleft 🄯 2024
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU Affero General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU Affero General Public License for more details.
#
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
# Displays what's the most recently modified sub-directory within
# current directory.
# Tool variable set
find="/usr/bin/find"
sort="/usr/bin/sort"
tail="/usr/bin/tail"
grep="/bin/grep"
sed="/bin/sed"
# Variable set
rece_dir="`"$find" . -maxdepth 1 -type d -printf '%T+ %p\n' | \
"$sort" | "$tail" -1 | "$grep" -o "/.*" | \
"$sed" 's/ /\\\&/g;s+$+/+'`"
# Function set
################################## Main part ###################################
printf "$rece_dir\n"
exit
Cute. But why not do it like this?
#!/bin/bash
# latest-directory-displayer
#
# Copyleft 🄯 2024
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU Affero General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU Affero General Public License for more details.
#
# You should have received a copy of the GNU Affero General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
# Displays what's the most recently modified sub-directory within
# current directory.
# Tool variable set
find="/usr/bin/find"
sort="/usr/bin/sort"
tail="/usr/bin/tail"
grep="/bin/grep"
sed="/bin/sed"
# Variable set
# Function set
display_latest_dir() {
"$find" ./ -maxdepth 1 -type d -printf '%T+ %p\n' |
"$sort" |
"$tail" -1 |
"$grep" -o "/.*" |
"$sed" 's/ /\\&/g;s+$+/+'
}
################################## Main part ###################################
display_latest_dir
exit
Same output, same basic inner workings, but one goes through a variable while the other goes through a function.
A minor difference I've spotted is the escaping requirements. As soon as you put your set of commands into a variable as a command substitution, it probably needs one more \ wherever you've escaped with a backslash.
Why not always use variables, instead of functions? Why not always use functions, instead of variables?
Please read mywiki.wooledge.org/BashFAQ/050. Also copy/paste your first script into shellcheck.net and fix the issues it tells you about. – Ed Morton, commented Jun 8, 2024 at 0:20
2 Answers
In the first example, the find is executed when the variable is set, so the contents of the variable are static. This means that if the directory contents change, the variable will be "wrong".
You can test this with:
rmdir a b
mkdir a
printf '%s\n' "$rece_dir"
mkdir b
printf '%s\n' "$rece_dir"
The first time I ran this I got two blank lines because my test directory was empty; the second time I ran it I got two "b" results.
In the second case, the find is executed each time the function is called.
Compare:
rmdir a b
mkdir a
display_latest_dir
mkdir b
display_latest_dir
This time I correctly get "a" and "b" output.
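The same effect can be reproduced in a throwaway directory. The following is only a minimal sketch: count_dirs, snapshot and workdir are hypothetical names that do not appear in the scripts above, and it counts sub-directories instead of comparing timestamps so that the result is predictable.

```shell
#!/bin/bash
# Work in a scratch directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical helper: print the number of sub-directories of the
# current directory. It does the work every time it is called.
count_dirs() {
    local n=0 d
    for d in */; do
        [ -d "$d" ] && n=$((n + 1))
    done
    printf '%s\n' "$n"
}

mkdir a
snapshot=$(count_dirs)   # command substitution runs once, right here
mkdir b

printf 'variable: %s\n' "$snapshot"       # still the value from before "mkdir b"
printf 'function: %s\n' "$(count_dirs)"   # recomputed at this point
```

The variable still reports 1 after b has been created, while the function call reports 2.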
If you only want static output, then for a simple command (and yours is sufficiently simple) I wouldn't use a function. But if there was a lot of work involved (e.g. loops) then I might make it into one.
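BTW, as a matter of coding style you may want to use $(...) instead of the older backtick notation. Note that the sed expression then needs one backslash fewer (\\& rather than \\\&), because $(...) does not strip a level of backslashes the way backticks do:
rece_dir=$("$find" . -maxdepth 1 -type d -printf '%T+ %p\n' |
"$sort" | "$tail" -1 | "$grep" -o "/.*" |
"$sed" 's/ /\\&/g;s+$+/+')
And, of course, you can set a variable to the output of a function: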
rece_dir=$(display_latest_dir)
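The choice of $(...) over backticks also bears on the extra backslash noticed in the question. Inside backticks, a backslash followed by another backslash, a dollar sign or a backtick is consumed before the command runs, whereas $(...) passes the text through unchanged. A small sketch of that, using the hypothetical variable names old and new:

```shell
#!/bin/bash
# Same command text in both forms; only the substitution syntax differs.
old=`printf '%s' 'x\\y'`     # backticks collapse \\ to \ before printf runs
new=$(printf '%s' 'x\\y')    # $(...) leaves both backslashes alone

printf 'backticks:    %s (%d chars)\n' "$old" "${#old}"
printf 'dollar-paren: %s (%d chars)\n' "$new" "${#new}"
```

The backtick result is three characters long and the $(...) result is four, which is why the backtick version of the sed expression needs one backslash more than the same expression written directly in a function.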
Variables are for data, functions are for code. (And see also How can we run a command stored in a variable?)
While you could store the path to some program in a variable to make sure you get the correct instance (and I've done that too), it'd still be cleaner at the use site with a function. You wouldn't need all those quotes and dollar signs.
E.g. compare:
foo=/usr/bin/foo
"$foo" bar whatever
vs.
foo() {
/usr/bin/foo "$@"
}
foo bar whatever
(Though /bin and /usr/bin likely are in $PATH anyway, so that shouldn't be too necessary.)
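The "$@" in the wrapper matters: it passes every argument through exactly as given, including arguments that contain spaces. A minimal sketch of that, where show_args, wrapper and sloppy_wrapper are hypothetical stand-ins for /usr/bin/foo and the foo function:

```shell
#!/bin/bash
# Hypothetical stand-in for the real program: prints each argument in <>.
show_args() {
    printf '<%s>' "$@"
    printf '\n'
}

# Wrapper in the style of foo() above; "$@" keeps each argument intact.
wrapper() {
    show_args "$@"
}

# For contrast: an unquoted $* re-splits the arguments on whitespace.
sloppy_wrapper() {
    show_args $*
}

wrapper "two words" three          # prints <two words><three>
sloppy_wrapper "two words" three   # prints <two><words><three>
```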
The significant difference in your example of rece_dir= vs. display_latest_dir seems to be that in one case you store the output of the pipeline in a variable, and in the other you let it get printed to the terminal. The choice between those depends on what you want to do. Function or not, there's no need to use a command substitution to store something in a variable just to print it out again. And if you do need it in a variable, you can still use a function with the command substitution too.
Both of these work:
findsomething() {
find ... | blah | blah
}
result=$(findsomething)
and
result=$(find ... | blah | blah)
Usually the difference is that functions can be useful for reuse and splitting code into logical pieces for easier reading.
(Note that filenames can contain newlines, so a pipeline like find -printf "...\n" | ... is not safe in general. And you could probably replace tail | grep | sed with just a single sed.)
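One way around the newline problem is to avoid parsing text output at all. The sketch below is not the pipeline from the question: it swaps in a plain bash loop with the -nt ("newer than") test, and latest_dir is a hypothetical name. It also behaves a little differently from find, since it skips hidden directories unless dotglob is set and treats symlinks to directories as directories.

```shell
#!/bin/bash
# Hypothetical helper: print the most recently modified sub-directory of
# the current directory, without parsing any line-based output.
latest_dir() {
    local d latest=
    for d in ./*/; do
        [[ -d $d ]] || continue      # unmatched glob stays literal; skip it
        if [[ -z $latest || $d -nt $latest ]]; then
            latest=$d                # $d has a newer mtime than the best so far
        fi
    done
    # Print only if something was found; otherwise return non-zero.
    [[ -n $latest ]] && printf '%s\n' "$latest"
}

# Usage: capture the result in a variable, or just call the function.
if result=$(latest_dir); then
    printf 'Most recent: %s\n' "$result"
fi
```

Because the name never passes through a line-oriented tool, a directory whose name contains spaces or newlines comes back intact.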