POSIX shell script using a string to store command arguments

Question 1

Background

I’ve started putting all my configuration files (which include many shell files) under version control. In the Git repository, I’ve configured a pre-commit hook that runs the checkbashisms utility from Debian’s devscripts package on each sh file in the repository as a simple sanity check to ensure that I don’t inadvertently introduce Bash-specific syntax into scripts that should be POSIX compatible.

Reason for using POSIX sh

I’m using /bin/sh as the interpreter for this pre-commit hook as it’s quicker for systems where /bin/sh is a link to dash. Committing files shouldn’t take a noticeable amount of time.

I don’t want to run checkbashisms if the only files staged for committing are non-shell files and for performance, I only want to execute checkbashisms command once and supply as arguments the names of all the relevant files that should be checked.

Arguments that may contain spaces

While I don’t normally create files with names containing unusual characters, I tried to write this script to be reasonably robust (without becoming overly complex, harder to read and prone to error on various inputs).

In a more fully-featured shell such as Bash, I’d process a list of file name arguments using an array. However, a plain POSIX shell (such as dash) doesn’t support array variables.

My solution

When using a POSIX shell, the positional parameters can often be used as a substitute but in this case, I couldn’t see how I could easily modify them to only add the filenames of shell scripts. The technique I used was to split fields using newlines (not spaces) and to use a string to store the arguments to be passed to checkbashisms.

I’ve verified that it works with filenames containing spaces and single quotes – but due to to how the shell represents names containing other characters, it doesn’t work with filenames containing double quotes, tabs, etc. This is fine as I really don’t expect to have to process such filenames.

However, it feels hacky and I don’t know if I’m introducing the potential for other errors ("unknown unknowns"). Modifying IFS is explicitly not recommended by Greg’s Wiki. I thought I’d post here in the hope that the robustness and/or maintainability of the code can be improved.

Script

#!/bin/sh
# Check shell scripts for Bashisms if the `checkbashisms` utility is installed.
# Called by "git commit" with no arguments. This hook should exit with a
# non-zero status after issuing an appropriate message if it wants to stop the
# commit.
# Allows for files with spaces or single quotes in their name – but not any
# other unusual characters.
set -u # -o nounset – exit with failure if an unset shell variable is referenced.
set -e # -o errexit – exit script if any command fails.
set -o noclobber # prevent shell redirection from over-writing files.
if ! command -v checkbashisms >/dev/null 2>&1; then
 echo "‘checkbashisms’ is not available; it can be installed from the ‘devscripts’ package."
else
 echo "Checking for Bashisms in shell scripts..."
 # Use only newline characters (not spaces) to split filenames.
 IFS="
" # POSIX way to set IFS to newline (be careful with indentation).
 set -f # Disable globbing of pathnames.
 # Arrays are not defined by POSIX so store arguments in a string variable.
 filenames=""
 # Get list of filenames that have been staged for committing.
 for file in $(git diff --cached --name-only); do
 # Only process shell scripts.
 case "$file" in *.sh|shell/*.sh)
 # Add file to list of arguments (separated by newlines).
 filenames="$filenames
$file" # again, be careful with indentation of the code.
 esac
 done;
 # If any shell files have been staged, check their syntax for Bashisms.
 if [ "$filenames" ]; then
 if checkbashisms $filenames; then # no quotes for field-splitting.
 printf "(No Bashisms found)"
 else
 exit 1
 fi
 fi
 # Restore file globbing and IFS.
 unset IFS
 set +f
fi
# Further code to check for other issues ...

Question 2

First thing, you could turn the check if the checkbashisms tool is installed into a guard clause, shaving off one level of indentation:

if ! command -v checkbashisms >/dev/null 2>&1; then
 echo "‘checkbashisms’ is not available; it can be installed from the ‘devscripts’ package."
 exit 1
fi

The remaining logic can be simplified a lot with the xargs tool.

if git diff --cached --name-only | grep "\\.sh$" | xargs checkbashisms; then
 printf "(No Bashisms found)"
else
 exit 1
fi

I understand that the code you showed it only part of a bigger script and you still want to use the generated list of files. However, calling git diff repeatedly will have negligible overhead and you can wrap the code in a shell function like this:

staged_scripts() {
 git diff --cached --name-only | grep "\\.sh$"
}
staged_scripts | xargs checkbashisms

Or even

check_staged_scripts() {
 git diff --cached --name-only | grep "\\.sh$" | xargs 1ドル
}
check_staged_scripts "checkbashisms"

Question 3

Thanks Anselm, that’s a great review. I’ve since modified my script somewhat but your answer is still applicable. I like your xargs pipeline – it’s much more elegant than the for loop with its conditional statements. BTW, I see you only recently joined the Stack Exchange network so I’d like to say, "Wilkommen".

Question 4

Dankeschön! It was my pleasure.

Question 5

Just pointing out that some people sometimes name bash files .bash , so I would add that to the valid file extension regex.

Anselm Helbig 1061 silver badge4 bronze badges · Accepted Answer · 2016-05-25 21:02:06Z

First thing, you could turn the check if the checkbashisms tool is installed into a guard clause, shaving off one level of indentation:

if ! command -v checkbashisms >/dev/null 2>&1; then
 echo "‘checkbashisms’ is not available; it can be installed from the ‘devscripts’ package."
 exit 1
fi

The remaining logic can be simplified a lot with the xargs tool.

if git diff --cached --name-only | grep "\\.sh$" | xargs checkbashisms; then
 printf "(No Bashisms found)"
else
 exit 1
fi

I understand that the code you showed it only part of a bigger script and you still want to use the generated list of files. However, calling git diff repeatedly will have negligible overhead and you can wrap the code in a shell function like this:

staged_scripts() {
 git diff --cached --name-only | grep "\\.sh$"
}
staged_scripts | xargs checkbashisms

Or even

check_staged_scripts() {
 git diff --cached --name-only | grep "\\.sh$" | xargs 1ドル
}
check_staged_scripts "checkbashisms"

Thanks Anselm, that’s a great review. I’ve since modified my script somewhat but your answer is still applicable. I like your xargs pipeline – it’s much more elegant than the for loop with its conditional statements. BTW, I see you only recently joined the Stack Exchange network so I’d like to say, "Wilkommen".
Just pointing out that some people sometimes name bash files .bash , so I would add that to the valid file extension regex.

Stack Exchange Network

POSIX shell script using a string to store command arguments

Background

Reason for using POSIX sh

Arguments that may contain spaces

My solution

Script

1 Answer 1

You must log in to answer this question.

Hot Network Questions

POSIX shell script using a string to store command arguments

Background

Reason for using POSIX sh

Arguments that may contain spaces

My solution

Script

1 Answer 1

You must log in to answer this question.

Related

Hot Network Questions