Background
I’ve started putting all my configuration files (which include many shell files) under version control. In the Git repository, I’ve configured a pre-commit hook that runs the checkbashisms
utility from Debian’s devscripts package on each sh
file in the repository as a simple sanity check to ensure that I don’t inadvertently introduce Bash-specific syntax into scripts that should be POSIX compatible.
Reason for using POSIX sh
I’m using /bin/sh
as the interpreter for this pre-commit hook as it’s quicker for systems where /bin/sh
is a link to dash
. Committing files shouldn’t take a noticeable amount of time.
I don’t want to run checkbashisms
if the only files staged for committing are non-shell files and for performance, I only want to execute checkbashisms
command once and supply as arguments the names of all the relevant files that should be checked.
Arguments that may contain spaces
While I don’t normally create files with names containing unusual characters, I tried to write this script to be reasonably robust (without becoming overly complex, harder to read and prone to error on various inputs).
In a more fully-featured shell such as Bash, I’d process a list of file name arguments using an array. However, a plain POSIX shell (such as dash) doesn’t support array variables.
My solution
When using a POSIX shell, the positional parameters can often be used as a substitute but in this case, I couldn’t see how I could easily modify them to only add the filenames of shell scripts. The technique I used was to split fields using newlines (not spaces) and to use a string to store the arguments to be passed to checkbashisms
.
I’ve verified that it works with filenames containing spaces and single quotes – but due to to how the shell represents names containing other characters, it doesn’t work with filenames containing double quotes, tabs, etc. This is fine as I really don’t expect to have to process such filenames.
However, it feels hacky and I don’t know if I’m introducing the potential for other errors ("unknown unknowns"). Modifying IFS
is explicitly not recommended by Greg’s Wiki. I thought I’d post here in the hope that the robustness and/or maintainability of the code can be improved.
Script
#!/bin/sh
# Check shell scripts for Bashisms if the `checkbashisms` utility is installed.
# Called by "git commit" with no arguments. This hook should exit with a
# non-zero status after issuing an appropriate message if it wants to stop the
# commit.
# Allows for files with spaces or single quotes in their name – but not any
# other unusual characters.
set -u # -o nounset – exit with failure if an unset shell variable is referenced.
set -e # -o errexit – exit script if any command fails.
set -o noclobber # prevent shell redirection from over-writing files.
if ! command -v checkbashisms >/dev/null 2>&1; then
echo "‘checkbashisms’ is not available; it can be installed from the ‘devscripts’ package."
else
echo "Checking for Bashisms in shell scripts..."
# Use only newline characters (not spaces) to split filenames.
IFS="
" # POSIX way to set IFS to newline (be careful with indentation).
set -f # Disable globbing of pathnames.
# Arrays are not defined by POSIX so store arguments in a string variable.
filenames=""
# Get list of filenames that have been staged for committing.
for file in $(git diff --cached --name-only); do
# Only process shell scripts.
case "$file" in *.sh|shell/*.sh)
# Add file to list of arguments (separated by newlines).
filenames="$filenames
$file" # again, be careful with indentation of the code.
esac
done;
# If any shell files have been staged, check their syntax for Bashisms.
if [ "$filenames" ]; then
if checkbashisms $filenames; then # no quotes for field-splitting.
printf "(No Bashisms found)"
else
exit 1
fi
fi
# Restore file globbing and IFS.
unset IFS
set +f
fi
# Further code to check for other issues ...
1 Answer 1
First thing, you could turn the check if the checkbashisms
tool is installed into a guard clause, shaving off one level of indentation:
if ! command -v checkbashisms >/dev/null 2>&1; then
echo "‘checkbashisms’ is not available; it can be installed from the ‘devscripts’ package."
exit 1
fi
The remaining logic can be simplified a lot with the xargs
tool.
if git diff --cached --name-only | grep "\\.sh$" | xargs checkbashisms; then
printf "(No Bashisms found)"
else
exit 1
fi
I understand that the code you showed it only part of a bigger script and you still want to use the generated list of files. However, calling git diff
repeatedly will have negligible overhead and you can wrap the code in a shell function like this:
staged_scripts() {
git diff --cached --name-only | grep "\\.sh$"
}
staged_scripts | xargs checkbashisms
Or even
check_staged_scripts() {
git diff --cached --name-only | grep "\\.sh$" | xargs 1ドル
}
check_staged_scripts "checkbashisms"
-
\$\begingroup\$ Thanks Anselm, that’s a great review. I’ve since modified my script somewhat but your answer is still applicable. I like your
xargs
pipeline – it’s much more elegant than the for loop with its conditional statements. BTW, I see you only recently joined the Stack Exchange network so I’d like to say, "Wilkommen". \$\endgroup\$Anthony Geoghegan– Anthony Geoghegan2016年05月26日 08:38:37 +00:00Commented May 26, 2016 at 8:38 -
1\$\begingroup\$ Dankeschön! It was my pleasure. \$\endgroup\$Anselm Helbig– Anselm Helbig2016年05月26日 11:21:47 +00:00Commented May 26, 2016 at 11:21
-
\$\begingroup\$ Just pointing out that some people sometimes name bash files
.bash
, so I would add that to the valid file extension regex. \$\endgroup\$Chev_603– Chev_6032017年07月06日 05:02:50 +00:00Commented Jul 6, 2017 at 5:02