I would welcome feedback about this script to automate the building of docker images from a set of dockerfiles.
The dockerfiles are provided to the script via stdin. The images are saved to the first (and only) argument of this script. In normal use either one or two specific images are removed manually from the machine so that they get rebuilt or all images are removed (using a different script) for them all to be rebuilt.
Example invocation:
$ find dockerfiles -mindepth 1 -maxdepth 1 -type d | dockerfiles-build images
Actual output:
Reading dockerfile directories from std input. Building can take a long time...
Building golang-build:2017年03月05日T22-28GMT from dockerfiles/golang-build
Sending build context to Docker daemon 4.608 kB
<Lots of docker build process text>
Successfully built cc287d22a661
Saving golang-build:2017年03月05日T22-28GMT.save.xz from cc287d22a661
Use images/run-golang-build to use the image
Reading next dockerfile...
$
My code (MIT license):
#!/bin/bash
# Script to build and save specific docker containers to the specified directory with the paths to the dockerfile directory arriving on the standard input
set -eu -o pipefail
if [ -z "${1-}" ] || ! [ -d "${1-}" ]; then
echo "Correct invocation is: $(basename "0ドル") _out-directory_ <_dockerfile-directory-paths_" >&2
exit 1
fi
outdir="${1-out}"
timestamp=$(date '+%Y-%m-%dT%H-%M%Z')
echo "Reading dockerfile directories from std input. Building can take a long time..." >&2
while read dockerfilepath; do
# Filter out empty input
if [ -n "$dockerfilepath" ]; then
# Process the path name
name="$(basename $(readlink -f "$dockerfilepath"))"
# Check that image does not already exist
if ! docker images | grep "$name"; then
# Get docker to build the image
echo "Building $name:$timestamp from $dockerfilepath" >&2
docker build -t "$name":$timestamp "$dockerfilepath"
# Output the image (but only one if there are duplicates with different tags)
image="$(docker images | grep "$name" | awk '{print 3ドル}' | sort -u)"
if [ "$(echo "$image" | wc -w)" -ne "1" ]; then
echo "Multiple images detected for $name; aborting" >&2
exit 1
fi
echo "Saving $name:$timestamp.save.xz from $image" >&2
docker save "$image" | xz -z9 > "$outdir/$name:$timestamp.save.xz"
# output the image specific run script, enforcing a naming convention
cp "$dockerfilepath"/run* "$outdir/run-$name"
echo "Use $outdir/run-$name to use the image" >&2
else
echo "A $name image already exists; remove it to build it again" >&2
fi
else
echo "Skipping empty path" >&2
fi
echo "Reading next dockerfile..." >&2
done
Points of particular interest:
- Are there any unsafe filepath manipulations
- Are variables treated safely
- Could any names break the script
- Anything that could lead to the script being unreliable for unsupervised operation
- Any general optimisations that you could suggest
1 Answer 1
This is a good shell script. You're checking for the existence of directories and files before taking any action and using set -eu
and set -o pipefail
.
In no particular order, here's what comes to mind.
1) In my opinion, the logic is getting slightly too complex for a shell script. I think you'd be better off in the long run with Python / Perl / Ruby.
2) reading from stdin and processing input from stdin in a loop.
See Stéphane Chazelas' answer to this question.
If you're going to do it anyway, you should probably to use IFS= read -r dockerfilepath
.
3) name="$(basename $(readlink -f "$dockerfilepath"))"
should have quotations around $(readlink ...)
, i.e.
name="$(basename "$(readlink -f "$dockerfilepath")")"
.
I don't think quotes around the whole expression are necessary in this case, but it's probably better to keep them there for peace of mind.
also readlink -f
is not portable.
I'm also not sure why you're using it. Are you expecting lots of symlinks in your environment?
4) ! docker images | grep "$name"
You can use grep -q -F -- "$name"
if you don't need the output (and just want the exit status) and want to treat "$name"
as a fixed string.
I think there are a few other places where you're using grep with $name. Always use -F --
unless you want the regex behavior.
5) cp "$dockerfilepath"/run* "$outdir/run-$name"
This is actually fine from a whitespace perspective. (http://mywiki.wooledge.org/glob)
but you may want to set nullglob
or failglob
just to be safe.
Also should just check for the existence of the "$outdir/run-$name" directory earlier so that the operation fails before we attempt to build the container.
I'm not sure what it means if the directory doesn't exist, but a simple
[ -d "$outdir/run-$name" ]
immediately after computing name
would be useful.
6) At the beginning test for the existence of "$outdir"
outdir is just like 1ドル
except it hasa default value of out
. You can move the test for the existence of ${1-}
until after outdir is set.
7) since you're using sort(1)
, set LC_ALL=C; export LC_ALL
. Also, set LC_COLLATE=C; export LC_COLLATE
for good measure.
8) If you aren't going to go the portable route, you can use [[ .. ]]
instead of [ .. ]
for tests.
9) use printf
over echo
More explanation here, basically echo
does weird things with backslashes. This isn't a huge deal since you're only using echo for diagnostic information, except in $(echo "$images" | wc -w)
which should be changed to $(printf "%s" "$images" | wc -w)
.
https://unix.stackexchange.com/questions/65803/why-is-printf-better-than-echo
Here's one possible script with the changes incorporated. I changed the logic somewhat so it now says Reading next dockerfile
at the very beginning and uses continue
to skip the current iteration rather than having multiple levels of if-else in the main loop.
#!/bin/bash
# Script to build and save specific docker containers to the specified directory with the paths to the dockerfile directory arriving on the standard input
set -eu
set -o pipefail
shopt -s failglob
LC_ALL=C; export LC_ALL
LC_COLLATE=C; export LC_COLLATE
outdir="${1-out}"
timestamp=$(date '+%Y-%m-%dT%H-%M%Z')
if [[ ! -d $outdir ]]; then
echo "Correct invocation is: $(basename "0ドル") _out-directory_ <_dockerfile-directory-paths_" >&2
exit 1
fi
diag() {
>&2 printf '%s\n' "1ドル"
}
diag "Reading dockerfile directories from std input. Building can take a long time..."
while IFS= read -r dockerfilepath; do
diag "Reading next dockerfile..."
# Filter out empty input
if [[ -z "$dockerfilepath" ]]; then
diag "skipping empty path"
continue
fi
# Process the path name
name="$(basename "$(readlink -f "$dockerfilepath")")"
outpath="$outdir/run-$name"
# Check that image does not already exist
if docker images | grep -q -F -- "$name"; then
diag "A $name image already exists; remove it to build it again"
continue
fi
# create output directory if it doesn't exist
if [[ ! -d $outpath ]]; then
mkdir -p "$outpath"
fi
# Get docker to build the image
diag "Building $name:$timestamp from $dockerfilepath"
docker build -t "$name":"$timestamp" "$dockerfilepath"
# Output the image (but only one if there are duplicates with different tags)
image="$(docker images | grep -F -- "$name" | awk '{print 3ドル}' | sort -u)"
if [[ "$(printf "%s" "$image" | wc -w)" -ne 1 ]]; then
diag "Multiple images detected for $name; aborting"
exit 1
fi
diag "Saving $name:$timestamp.save.xz from $image"
docker save "$image" | xz -z9 > "$outdir/$name:$timestamp.save.xz"
# output the image specific run script, enforcing a naming convention
cp "$dockerfilepath"/run* "$outpath"
diag "Use $outpath to use the image"
done
-
\$\begingroup\$ Is there a portable equivalent of readlink -f? \$\endgroup\$RidiculousRichard– RidiculousRichard2017年03月06日 07:34:25 +00:00Commented Mar 6, 2017 at 7:34
-
\$\begingroup\$ Is there any way not performing a readlink -f could cause the script to fail? \$\endgroup\$RidiculousRichard– RidiculousRichard2017年03月06日 07:35:24 +00:00Commented Mar 6, 2017 at 7:35
-
\$\begingroup\$ I don't think you need
readlink -f
. If you don't have it, the docker containers will just be named after the last component of the non-canonical path to the directory containing the dockerfile. The most convenient portable-ish equivalent ofreadlink -f
I can think of isperl -MCwd -e 'print Cwd::realpath($ARGV[0])'
. I would only bother with it if you need the script to run without modification on OS X as well and you're using docker machine or something. \$\endgroup\$Greg Nisbet– Greg Nisbet2017年03月06日 13:19:22 +00:00Commented Mar 6, 2017 at 13:19 -
\$\begingroup\$ Are you expecting to run this script as root? \$\endgroup\$Greg Nisbet– Greg Nisbet2017年03月06日 13:24:24 +00:00Commented Mar 6, 2017 at 13:24
-
\$\begingroup\$ I don't; on Centos7 the user is added to a docker group which allows them to perform docker operations. \$\endgroup\$RidiculousRichard– RidiculousRichard2017年03月06日 17:42:12 +00:00Commented Mar 6, 2017 at 17:42
set -o pipefail
? That seems to be the only bashism that you're using. Also, how concerned are you about portability (e.g.readlink -f
is a GNU extension). \$\endgroup\$