\$\begingroup\$

This is my best attempt so far at a Bash script argument parser written without GNU getopt or Bash getopts.

The first two functions, usage and err, can be more or less ignored, but I plan on adding the ability to specify an exit code when calling err.

Now for the first section of code in the main function:

shopt -s extglob
args=()
for (( i = 1; i <= "$#"; i++ )); do
    arg="${!i}"
    case "${arg}" in
        -[[:alpha:]?]+([[:alpha:]?]))
            for (( j = 1; j < "${#arg}"; j++ )); do
                args+=("-${arg:j:1}")
            done ;;
        -[[:alpha:]]=*|--*=*)
            args+=("${arg%%=*}")
            args+=("${arg#*=}") ;;
        *)
            args+=("${arg}") ;;
    esac
done
set -- "${args[@]}"
shopt -u extglob

This section de-concatenates flags and separates flags joined to their values with an =. As an example, ./script -xYz --type=value would come out as ./script -x -Y -z --type value. Most of this could likely have been combined with the second part, but I think there's at least a little value in being able to access/save the intermediate form, and it made debugging easier. Single-letter flags with = can also be processed, but as I understand it, this isn't a very common thing to see anyway. Flags that are already in the correct format, flags that could not be reformatted due to user error (./script -xYz=value, for instance), and positional parameters are all passed to the second part without modification. This hasn't caused any issues yet, but I am considering trying to further differentiate between good and bad input. At the very end, the reformatted args are set for later use.
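To make the intermediate form easier to inspect, here is a minimal sketch of the same first pass wrapped in a function. The names normalize_args and NORMALIZED are hypothetical and only exist for this illustration; the patterns are the ones from the snippet above.

```bash
#!/usr/bin/env bash
# Sketch only: normalize_args and NORMALIZED are hypothetical names.
shopt -s extglob   # must be enabled before the function below is parsed

normalize_args() {
    NORMALIZED=()
    local arg j
    for arg in "$@"; do
        case "$arg" in
            -[[:alpha:]?]+([[:alpha:]?]))
                # Bundled short flags: -xYz becomes -x -Y -z
                for (( j = 1; j < ${#arg}; j++ )); do
                    NORMALIZED+=("-${arg:j:1}")
                done ;;
            -[[:alpha:]]=*|--*=*)
                # flag=value becomes two words, split at the first '='
                NORMALIZED+=("${arg%%=*}" "${arg#*=}") ;;
            *)
                # Anything else is passed through unchanged
                NORMALIZED+=("$arg") ;;
        esac
    done
}

normalize_args -xYz --type=a=b -xYz=value plain
printf '%s\n' "${NORMALIZED[@]}"
```

Printing one word per line shows where each split happened. Note that a value containing = survives intact, because only the first = is used as the separator, and that -xYz=value matches neither pattern and comes through as a single word.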

Part two of the main function:

    args=()
    for (( i = 1; i <= "$#"; i++ )); do
        arg="${!i}"
        case "${arg}" in
            --)
                break ;;
            -*)
                case "${arg}" in
                    -h|--help|-\?)
                        usage ;;
                    -x|-Y|-z) ;;
                    -t|--type)
                        # ${!i+1} parses as ${!i} with the +word operator and yields "1",
                        # so compute the index of the next argument explicitly
                        next=$(( i + 1 ))
                        arg2="${!next}"
                        if [[ -n "${arg2}" ]] && [[ "${arg2:0:1}" != "-" ]]; then
                            type="${arg2}"
                            (( i++ ))
                        else
                            err "Invalid option: ${arg} requires an argument"
                        fi ;;
                    -*)
                        err "Invalid option: ${arg}" ;;
                esac ;;
            *)
                args+=("${arg}")
        esac
    done
    set -- "${args[@]}"

This is where flags and their values actually get processed. Flags (and their arguments, if applicable) are checked one at a time, and only the unused positional parameters are put back in the array to be set once again. For example, ./script -x -Y -z --type value hello world would come out as ./script hello world.
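A minimal sketch of this second pass as a standalone function, assuming the arguments have already been normalized. The names parse_flags, opt_type and POSITIONAL are hypothetical, and errors are returned rather than routed through err so the function can be called repeatedly. The sketch looks up the following argument through a separate index variable, because `${!i+1}` is parsed as `${!i}` combined with the `+word` operator and expands to the literal 1.

```bash
#!/usr/bin/env bash
# Sketch only: parse_flags, opt_type and POSITIONAL are hypothetical names.
parse_flags() {
    opt_type=""
    POSITIONAL=()
    local i arg next value
    for (( i = 1; i <= $#; i++ )); do
        arg="${!i}"
        case "$arg" in
            -x|-Y|-z) ;;                 # boolean flags, nothing to store here
            -t|--type)
                next=$(( i + 1 ))        # index of the argument after the flag
                value="${!next}"         # empty if the flag was the last word
                if [[ -n "$value" && "${value:0:1}" != "-" ]]; then
                    opt_type="$value"
                    (( i++ ))            # skip the value that was just consumed
                else
                    echo "Invalid option: $arg requires an argument" >&2
                    return 1
                fi ;;
            -*)
                echo "Invalid option: $arg" >&2
                return 1 ;;
            *)
                POSITIONAL+=("$arg") ;;  # unused words stay positional
        esac
    done
}

parse_flags -x -Y -z --type value hello world
echo "type=${opt_type} positional=${POSITIONAL[*]}"
```

Only hello and world remain in POSITIONAL after the call; value is consumed as the argument of --type.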

I did what I could to prevent any special cases slipping through, and to account for as many common formats as possible, but I couldn't find a list of either, so I'd really appreciate advice on both of those issues.

I also have very little experience writing bash scripts, so general bash scripting advice would also be greatly appreciated.

Full code:

#!/bin/bash
usage() {
    echo "help me"
    exit 0
}
err() {
    echo "$*" >&2
    exit 1
}
shopt -s extglob
main() {
    shopt -s extglob
    args=()
    for (( i = 1; i <= "$#"; i++ )); do
        arg="${!i}"
        case "${arg}" in
            -[[:alpha:]?]+([[:alpha:]?]))
                for (( j = 1; j < "${#arg}"; j++ )); do
                    args+=("-${arg:j:1}")
                done ;;
            -[[:alpha:]]=*|--*=*)
                args+=("${arg%%=*}")
                args+=("${arg#*=}") ;;
            *)
                args+=("${arg}") ;;
        esac
    done
    set -- "${args[@]}"
    shopt -u extglob
    echo "$0 $@"
    args=()
    for (( i = 1; i <= "$#"; i++ )); do
        arg="${!i}"
        case "${arg}" in
            --)
                break ;;
            -*)
                case "${arg}" in
                    -h|--help|-\?)
                        usage ;;
                    -x|-Y|-z) ;;
                    -t|--type)
                        # ${!i+1} parses as ${!i} with the +word operator and yields "1",
                        # so compute the index of the next argument explicitly
                        next=$(( i + 1 ))
                        arg2="${!next}"
                        if [[ -n "${arg2}" ]] && [[ "${arg2:0:1}" != "-" ]]; then
                            type="${arg2}"
                            (( i++ ))
                        else
                            err "Invalid option: ${arg} requires an argument"
                        fi ;;
                    -*)
                        err "Invalid option: ${arg}" ;;
                esac ;;
            *)
                args+=("${arg}")
        esac
    done
    set -- "${args[@]}"
    echo "$0 ${@:-No positional parameters set}"
    echo "type: ${type:-Type not set}"
}
shopt -u extglob
main "$@"
asked Oct 29, 2021 at 5:18
\$\endgroup\$

1 Answer

\$\begingroup\$

I think this is pretty nicely written Bash.

I see the parsing happens in two passes:

  • Convert the argument list to some sort of canonical form
  • Validate the argument list

This is easy to understand and I think it makes sense.

Handling arguments after --

As written, the program ignores all further arguments after --.

The common practice is to take all arguments after -- verbatim, without further parsing. For example this behavior makes it possible to use the rm command to delete a file named -f if you ever need it. You would do that with rm -- -f instead of rm -f (which usually does nothing).
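As a minimal sketch of that convention (split_at_double_dash, FLAGS and OPERANDS are hypothetical names used only for illustration), option parsing stops at -- and every remaining word is kept exactly as given:

```bash
#!/usr/bin/env bash
# Sketch only: split_at_double_dash, FLAGS and OPERANDS are hypothetical names.
split_at_double_dash() {
    FLAGS=()
    OPERANDS=()
    while (( $# > 0 )); do
        case "$1" in
            --)
                shift                # drop the "--" marker itself
                OPERANDS+=("$@")     # the rest is taken verbatim, even "-f"
                break ;;
            -*)
                FLAGS+=("$1") ;;
            *)
                OPERANDS+=("$1") ;;
        esac
        shift
    done
}

split_at_double_dash -v file1 -- -f --help
echo "flags: ${FLAGS[*]}"
echo "operands: ${OPERANDS[*]}"
```

Here -f and --help end up among the operands instead of being treated as options, which is the behavior rm -- -f relies on.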

Keeping things "simple"

I'm not a fan of the advanced features of Bash. I think they push the limits of the language, and they are a common source of bugs and of code that's difficult to understand.

Look at what extglob forces you to do:

  • shopt -s extglob before the declaration of main and cleaning up with shopt -u extglob after it, so that main can be parsed
  • Then inside main, again shopt -s extglob before you need it, and cleaning up with shopt -u extglob when you no longer need it

I find this double activation / deactivation dirty.

If you gotta use it, you gotta use it. If I have a chance to do without it, I would. And here I see an opportunity. By reorganizing the conditions, you could achieve something similar:

args=()
for (( i = 1; i <= $#; i++ )); do
    arg="${!i}"
    case "${arg}" in
        -[[:alpha:]]=*|--*=*)
            args+=("${arg%%=*}")
            args+=("${arg#*=}") ;;
        --)
            for (( j = i; j <= $#; j++ )); do
                args+=("${!j}")
            done
            break ;;
        --*)
            args+=("${arg}") ;;
        -*)
            for (( j = 1; j < "${#arg}"; j++ )); do
                args+=("-${arg:j:1}")
            done ;;
        *)
            args+=("${arg}") ;;
    esac
done
set -- "${args[@]}"

The difference from your original is that -[[:alpha:]?]+([[:alpha:]?]) is replaced with plain -*. I think the practical implication is that an argument like -c9 would be converted to -c -9 instead of being kept as -c9.

I don't know if this would be acceptable to you. If yes, then you could get rid of all the shopt, and I think that would be a good thing.
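If splitting -c9 is not acceptable, one possible middle ground is to test for bundled flags with a regular expression in [[ ... =~ ... ]], which needs no shopt at all. This is a different technique from the case patterns above, shown only as a sketch; split_bundle and SPLIT are hypothetical names.

```bash
#!/usr/bin/env bash
# Sketch only: split_bundle and SPLIT are hypothetical names.
split_bundle() {
    SPLIT=()
    local arg="$1" j
    # Same shape as -[[:alpha:]?]+([[:alpha:]?]): a dash followed by
    # two or more letters or question marks.
    local re='^-[[:alpha:]?]{2,}$'
    if [[ "$arg" =~ $re ]]; then
        for (( j = 1; j < ${#arg}; j++ )); do
            SPLIT+=("-${arg:j:1}")
        done
    else
        SPLIT=("$arg")   # -c9, -x, --long and plain words are left alone
    fi
}

split_bundle -xYz
printf '%s\n' "${SPLIT[@]}"
```

Keeping the regex in a variable and expanding it unquoted on the right-hand side of =~ avoids quoting surprises, since a quoted right-hand side is matched as a literal string.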

shellcheck

As you are new to Bash, it's probably good to point out shellcheck.net (also available as a command line tool), a nice tool to check Bash code against common mistakes and bad practices. It finds just a minor issue about echo "$0 $@", where the recommended usage would be echo "$0 $*".
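A small sketch of why the two forms differ (count_words and demo are hypothetical helpers): inside a larger quoted string, "$@" still expands to one word per argument, while "$*" joins everything into a single word.

```bash
#!/usr/bin/env bash
# Sketch only: count_words and demo are hypothetical helpers.
count_words() { echo "$#"; }

demo() {
    # "$@" inside a string: the prefix attaches to the first argument only,
    # and each remaining argument becomes its own word.
    count_words "prefix $@"
    # "$*" joins all arguments into one word, separated by spaces by default.
    count_words "prefix $*"
}

demo a b c
```

With three arguments, the first call receives three words and the second receives one. For echo the visible output happens to be the same either way, which is why the issue is only a minor one here.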

answered Oct 29, 2021 at 14:39
\$\endgroup\$
1
  • 1
    \$\begingroup\$ So instead of just ending the parse loop, everything after -- should just be passed without modification to be used as positional parameters? That makes sense, and explains why set -- behaves the way it does. That should also let me remove the nested case in part 2. Are there any other special cases (like --) I may have missed? \$\endgroup\$ Commented Oct 29, 2021 at 19:04
