POSIX shell compatible and POSIX conforming dirname implementation

Question 1

The discussion in the comments of this answer made me wonder how hard it would be to write a built-in only version of dirname.

This is what I ended up with but I'm reasonably confident it should be able to be improved (I haven't spent much time thinking about it yet).

Any improvements/suggestions are welcome.

#!/bin/sh
p=1ドル
alt=2ドル
skip78=
stripslashes() {
 i=1ドル
 while [ "$i" != "$o" ]; do
 o=$i
 i=${i%/}
 done
 eval "2ドル=\$i"
}
[ "$p" = '//' ] || {
 case "$p" in
 *[!/]*|'')
 stripslashes "$p" p
 case "$p" in
 */*)
 p=${p%/*}
 ;;
 *)
 p=.
 skip78=skip78
 ;;
 esac
 ;;
 *)
 p=/
 skip78=skip78
 ;;
 esac
}
[ -n "$skip78" ] || {
 { [ "$p" != '//' ] || [ -z "$alt" ]; } && {
 stripslashes "$p" p
 [ -z "$p" ] && p=/
 }
}
printf -- %s\\n "$p"

The POSIX spec for dirname is:

If string is //, skip steps 2 to 5.

If string consists entirely of <slash> characters, string shall be set to a single <slash> character. In this case, skip steps 3 to 8.

If there are any trailing <slash> characters in string, they shall be removed.

If there are no <slash> characters remaining in string, string shall be set to a single <period> character. In this case, skip steps 5 to 8.

If there are any trailing non- <slash> characters in string, they shall be removed.

If the remaining string is //, it is implementation-defined whether steps 7 and 8 are skipped or processed.

If there are any trailing <slash> characters in string, they shall be removed.

If the remaining string is empty, string shall be set to a single <slash> character.

Update: Made a gist.

Question 2

bash is the only option possible or a awk, sed, ... could be used also ?

Question 3

@NeronLeVelu The idea was something to replace the sub-shell needed for calling dirname. So anything that needs a sub-shell mostly negates the purpose. In theory this script would be turned into a function like stripslashes that operates on its arguments.

Question 4

I don't think there's a really compelling reason for this eval in stripslashes:

eval "2ドル=\$i"

It's a dirty hack to update the parameter variable in-place. But this is not a common practice, I don't know other commands or built-ins that work that way, and I would find such usage unexpected. I would find it more usable if stripslashes behaved more like other commands do:

print the result on stdout
set the result in a designated variable (like REPLY)

That would be more in-line with common UNIX practices, and unsurprising, familiar.

As a minor nitpick, I avoid quoting when it's unnecessary. For example instead of [ "$p" != '//' ], I would write [ "$p" != // ]

Instead of this:

 [ -z "$p" ] && p=/

This form is shorter:

 [ "$p" ] || p=/

This might be subjective, but I would rewrite this part using regular if-else:

[ -n "$skip78" ] || {
 { [ "$p" != '//' ] || [ -z "$alt" ]; } && {
 stripslashes "$p" p
 [ -z "$p" ] && p=/
 }
}

Question 5

We can't use stdout for stripslashes because that requires a sub-shell and at that point we are back to "just use dirname". We could just write into p directly by name but that makes stripslashes a one-use helper function as opposed to being generally useful written this way. With bash as opposed to sh I'd use printf -v there instead of eval. As to the quoting and the -z test those were to avoid possibly confusing people less familiar with the shell though [ -n "$p" ] || p=/ might be better for that purpose as I imagine -n is much more commonly used than -z.

Question 6

But actually I just realized that : "${p:=/}" is clearly the right way to do that bit.

Question 7

Whenever one sets out to reimplement some existing functions, it is wise to prepare a test suite to ensure the old behaviors have been faithfully preserved. I begin my review by presenting my test script that generates test cases using fuzzing and hand-picked edge cases:

#!/bin/sh
for dir in '' '/' '//' $(tr -dc abc/ </dev/urandom | head -c800 | split -b8 --filter='cat; echo' -)
do
 printf '%s' "Testing with input '$dir'..."
 my_output=$(./dirname.sh "$dir")
 reference_output=$(dirname "$dir")
 if [ "$my_output" != "$reference_output" ]
 then
 printf '%s\n' "my output ($my_output) differs from the reference output ($reference_output)!"
 exit 1
 else
 printf '%s\n' 'OK!'
 fi
done

It assumes the reimplemented function is named dirname.sh with the read and execute bits set, and is placed in and run from the same directory as the test script.

Congratulations! Your solution passes my tests! This gives us confidence in the correctness of your implementation. However, it turns out my test suite wasn't perfect, as it missed out on an extreme edge case that may or may not be important for you. If the user has set the nounset shell option, which may be inherited by scripts in the case of bash at least, or from the current environment if you decide to turn this into a function, your implementation will fail with an error message that may look like this:

./dirname.sh: line 4: 2ドル: unbound variable
# or
./dirname.sh: line 9: o: unbound variable

while the original dirname utility will not be affected by the presence or absence of this shell option, unless an undefined variable is passed directly as an argument to it.

One little nitpick: nowhere is it mentioned on the printf page that the utility shall conform to XBD Utility Syntax Guidelines, so it is unclear if -- works as intended on any POSIX-conforming printf implementation.

As the other answer has mentioned, eval should be seen as a last resort hack because it is extremely easy to shoot yourself in the foot with this potentially dangerous command, and you're even using it on unsanitized user input!

I gave the challenge a try and came up with my own solution that, in my opinion, is more readable and follows the outlined steps more closely. It does not spawn any subshells, staying true to your intentions. It also doesn't use any eval, and uses only one local variable that I promptly clean up to not clutter the environment so that it can readily be turned into a litter-free POSIX function. The environment variables SKIP_STEP_7 and SKIP_STEP_8 can be set to mimick implementation-dependent behaviors that are allowed by the spec. Here is my solution presented as an overall improvement (no function call cost and fewer lines of code, etc.):

#!/bin/sh
dir=1ドル
case $dir in
 '//') # Step 1
 ;;
 *[!/]* | '')
 dir=${dir%"${dir##*[!/]}"} # Step 3
 case $dir in
 */*)
 dir=${dir%"${dir##*/}"} # Step 5
 if [ "$dir" != '//' ]
 then
 # The following steps are NOT implementation defined!
 dir=${dir%"${dir##*[!/]}"} # Step 7
 dir=${dir:-/} # Step 8
 fi
 ;;
 *)
 dir=. # Step 4
 ;;
 esac
 ;;
 *)
 dir=/ # Step 2
 ;;
esac
if [ "$dir" = '//' ] # Step 6
then
 # The following steps ARE implementation defined!
 [ "${SKIP_STEP_7+skip}" ] || dir=${dir%"${dir##*[!/]}"} # Step 7
 [ "${SKIP_STEP_8+skip}" ] || dir=${dir:-/} # Step 8
fi
printf '%s\n' "$dir"
unset -v dir

janos janos 113k15 gold badges154 silver badges396 bronze badges · Answer 1 · 2015-10-16 19:25:43Z

I don't think there's a really compelling reason for this eval in stripslashes:

eval "2ドル=\$i"

It's a dirty hack to update the parameter variable in-place. But this is not a common practice, I don't know other commands or built-ins that work that way, and I would find such usage unexpected. I would find it more usable if stripslashes behaved more like other commands do:

print the result on stdout
set the result in a designated variable (like REPLY)

That would be more in-line with common UNIX practices, and unsurprising, familiar.

As a minor nitpick, I avoid quoting when it's unnecessary. For example instead of [ "$p" != '//' ], I would write [ "$p" != // ]

Instead of this:

 [ -z "$p" ] && p=/

This form is shorter:

 [ "$p" ] || p=/

This might be subjective, but I would rewrite this part using regular if-else:

[ -n "$skip78" ] || {
 { [ "$p" != '//' ] || [ -z "$alt" ]; } && {
 stripslashes "$p" p
 [ -z "$p" ] && p=/
 }
}

We can't use stdout for stripslashes because that requires a sub-shell and at that point we are back to "just use dirname". We could just write into p directly by name but that makes stripslashes a one-use helper function as opposed to being generally useful written this way. With bash as opposed to sh I'd use printf -v there instead of eval. As to the quoting and the -z test those were to avoid possibly confusing people less familiar with the shell though [ -n "$p" ] || p=/ might be better for that purpose as I imagine -n is much more commonly used than -z.
But actually I just realized that : "${p:=/}" is clearly the right way to do that bit.

Gao Gao 1,2209 silver badges21 bronze badges · Answer 2 · 2025-08-12 15:15:32Z

Whenever one sets out to reimplement some existing functions, it is wise to prepare a test suite to ensure the old behaviors have been faithfully preserved. I begin my review by presenting my test script that generates test cases using fuzzing and hand-picked edge cases:

#!/bin/sh
for dir in '' '/' '//' $(tr -dc abc/ </dev/urandom | head -c800 | split -b8 --filter='cat; echo' -)
do
 printf '%s' "Testing with input '$dir'..."
 my_output=$(./dirname.sh "$dir")
 reference_output=$(dirname "$dir")
 if [ "$my_output" != "$reference_output" ]
 then
 printf '%s\n' "my output ($my_output) differs from the reference output ($reference_output)!"
 exit 1
 else
 printf '%s\n' 'OK!'
 fi
done

It assumes the reimplemented function is named dirname.sh with the read and execute bits set, and is placed in and run from the same directory as the test script.

Congratulations! Your solution passes my tests! This gives us confidence in the correctness of your implementation. However, it turns out my test suite wasn't perfect, as it missed out on an extreme edge case that may or may not be important for you. If the user has set the nounset shell option, which may be inherited by scripts in the case of bash at least, or from the current environment if you decide to turn this into a function, your implementation will fail with an error message that may look like this:

./dirname.sh: line 4: 2ドル: unbound variable
# or
./dirname.sh: line 9: o: unbound variable

while the original dirname utility will not be affected by the presence or absence of this shell option, unless an undefined variable is passed directly as an argument to it.

One little nitpick: nowhere is it mentioned on the printf page that the utility shall conform to XBD Utility Syntax Guidelines, so it is unclear if -- works as intended on any POSIX-conforming printf implementation.

As the other answer has mentioned, eval should be seen as a last resort hack because it is extremely easy to shoot yourself in the foot with this potentially dangerous command, and you're even using it on unsanitized user input!

I gave the challenge a try and came up with my own solution that, in my opinion, is more readable and follows the outlined steps more closely. It does not spawn any subshells, staying true to your intentions. It also doesn't use any eval, and uses only one local variable that I promptly clean up to not clutter the environment so that it can readily be turned into a litter-free POSIX function. The environment variables SKIP_STEP_7 and SKIP_STEP_8 can be set to mimick implementation-dependent behaviors that are allowed by the spec. Here is my solution presented as an overall improvement (no function call cost and fewer lines of code, etc.):

#!/bin/sh
dir=1ドル
case $dir in
 '//') # Step 1
 ;;
 *[!/]* | '')
 dir=${dir%"${dir##*[!/]}"} # Step 3
 case $dir in
 */*)
 dir=${dir%"${dir##*/}"} # Step 5
 if [ "$dir" != '//' ]
 then
 # The following steps are NOT implementation defined!
 dir=${dir%"${dir##*[!/]}"} # Step 7
 dir=${dir:-/} # Step 8
 fi
 ;;
 *)
 dir=. # Step 4
 ;;
 esac
 ;;
 *)
 dir=/ # Step 2
 ;;
esac
if [ "$dir" = '//' ] # Step 6
then
 # The following steps ARE implementation defined!
 [ "${SKIP_STEP_7+skip}" ] || dir=${dir%"${dir##*[!/]}"} # Step 7
 [ "${SKIP_STEP_8+skip}" ] || dir=${dir:-/} # Step 8
fi
printf '%s\n' "$dir"
unset -v dir

Stack Exchange Network

POSIX shell compatible and POSIX conforming dirname implementation

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

POSIX shell compatible and POSIX conforming dirname implementation

2 Answers 2

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions