I recently wrote a small script to deploy a mongodb cluster on a single machine. The cluster is composed of :
- 3 configurations server that holds the same informations (they are replicas )
- 2 shard server wich will store the data
- one mongos which manage connection to the cluster
The script parse a configuration file holding the cluster configuration, and then starts the different member in a specific order.
Here is the config file:
#config file to store host and port
#config server list, in folowing format:
# config=<host>:<port>
# started with --dbpath /data/configX where
# X is the position of the server in the list
config=localhost:27018
config=localhost:27019
config=localhost:27020
#mongos instance in folowing format:
# mongos=<host>:<port>
mongos=localhost:27017
#shard list in folowinf format:
# shard=<host>:<port>
# started with --dbpath /data/shardX where
# X is the position of the shard in the list
shard=localhost:27021
shard=localhost:27022
and here is the script:
#!/bin/bash
# this script should be launched on the server where you want the
# mongos to run. It should be run like this:
# ./deploy.sh path/to/config.txt
# make sure that mongod, mongos and mongo are linked correctly
# you can achieved this using the following command:
# sudo ln -s /path/to/mongo/bin/mongo /bin/mongo
# host:port for config server
CONFIG=()
# host:port for shards
SHARD=()
# text colors
red=`tput setaf 1`
green=`tput setaf 2`
reset=`tput sgr0`
# clear log file
echo "" > log.txt
# ignore empty or comment line
sed '/^#/d;/^$/d' 1ドル |
{
while read -r line
do
IFS='=' read -a array <<< "$line"
val="${array[1]}"
key="${array[0]}"
if [ "$key" == "config" ]; then
CONFIG+=("$val")
echo "mongod config server: $val"
fi
if [ "$key" == "mongos" ]; then
MONGOS="$val"
echo "mongos instance: $val"
fi
if [ "$key" == "shard" ]; then
SHARD+=("$val")
echo "shard instance: $val"
fi
done
#start config servers
index=0
for conf in "${CONFIG[@]}"
do
# ping each host to make sure it's reacheable
IFS=':' read -a config <<< ${CONFIG[$index]}
mkdir /data/config$index
echo "starting config server $index"
mongod --configsvr --port ${config[1]} --dbpath /data/config$index --replSet conf > log.txt&
index=$(($index + 1))
sleep 1
done
sleep 10
echo "${green}config servers deployed${reset}"
# setup the config replica set. Only neccessary on first launch
IFS=':' read -a config0 <<< ${CONFIG[0]}
mongo --host ${config0[0]} --port ${config0[1]} --eval "rs.initiate( { _id: \"conf\", members: [ {_id: 0, host:\"${CONFIG[0]}\"}, {_id: 1, host:\"${CONFIG[1]}\"}, {_id: 2, host:\"${CONFIG[2]}\"} ]})"&
# sleep so a primary shard can be designed among config servers
sleep 15
# get mongos infos
IFS=':' read -a mongos <<< $MONGOS
#start mongos
mongos --port ${mongos[1]} --configdb "conf/${CONFIG[0]},${CONFIG[1]},${CONFIG[2]}" > log.txt&
echo "${green}mongos instance configured${reset}"
sleep 5
# start each shard and add them to the cluster
shardnb=0
for shard in "${SHARD[@]}"
do
IFS=':' read -a sh <<< ${SHARD[$shardnb]}
mkdir /data/shard$shardnb
echo "starting shard $shardnb"
mongod --shardsvr --port ${sh[1]} --dbpath /data/shard$shardnb > log.txt&
sleep 5
mongo --host ${mongos[0]} --port ${mongos[1]} --eval "sh.addShard(\"$shard\");"&
sleep 5
shardnb=$(($shardnb + 1))
echo "${green}shard $shard added${reset}"
done
# make sure that the sharded cluster has been deployed correctly
mongo --host ${mongos[0]} --port ${mongos[1]} --eval "sh.status();"
}
I use this to quickly deploy a cluster in a docker container in order to run automated integration test. As this is one of my first bach script, I'm looking for advises on style / readability !
1 Answer 1
To begin with, I agree with Jesse_b that command substitution should use $()
. There's no reason to use backticks in a bash script that uses arrays (that's to say, you're not particularly aiming for portability with some obscure shell).
Indentation
I didn't notice until the end of the script that the majority of your code is in a {...}
block. Indenting that portion of the code more than the braces would have helped.
Reading the config file
Continuing from the previous point, the majority of your code seems to be in the block because of the pipe used for reading the file.
Avoid the pipe
You can use process substitution to avoid the pipe:
while read -r line
do
...
done < <(sed '...' file)
This way, you eliminate the pipe and the need to keep most of your code in a subshell. You can also make the comments more friendly by allowing leading spaces:
sed '/^[[:space:]]*#/d;/^[[:space:]]*$/d'
Or:
sed '/^[[:space:]]*\(#.*\)\?$/d'
Eliminate unnecessary reads and variables
That said, the code for reading the lines seems to be unnecessarily convoluted: a read
first and then again to split? Why not do it in one op? You also use an array, then immediately proceed to assign the array elements to two named variables and thereafter ignore the array completely. Just set those variables via read
directly;
while IFS='=' read -r key value
do
...
done < <(sed '/^[[:space:]]*\(#.*\)\?$/d' file)
Bug, again, later on you split the value
s read here, because they are all host-port pairs. If so, then do the splitting now, and save them in separate arrays in the first place:
declare -a CONFIG_HOSTS CONFIG_PORTS SHARD_HOSTS SHARD_PORTS
while IFS='=:' read -r key host port
do
...
done < <(...)
Repetitive if statements
Next, the chain of if
s are so repetitive that I suggest taking out the echo
s, by making an array out of the labels you print:
declare -A key_labels
key_labels["config"]="mongod config server"
key_labels["mongos"]="mongos instance"
key_labels["shard"]="shard instance"
Then, a single printf
should do. And the block of if
s also look like they fit the role of a case
statement. So:
printf '%s: %s:%s\n' "${key_labels[$key]}" "$host" "$port"
case key in
config)
CONFIG_HOST+=("$host")
CONFIG_PORT+=("$port")
;;
mongos)
MONGOS_HOST="$host"
MONGOS_PORT="$port"
;;
shard)
SHARD_HOST+=("$host")
SHARD_PORT+=("$port")
;;
esac
So, the entire config reading section becomes:
declare -a CONFIG_HOSTS CONFIG_PORTS SHARD_HOSTS SHARD_PORTS
declare -A key_labels
key_labels["config"]="mongod config server"
key_labels["mongos"]="mongos instance"
key_labels["shard"]="shard instance"
while IFS='=:' read -r key host port
do
printf '%s: %s:%s\n' "${key_labels[$key]}" "$host" "$port"
case key in
config)
CONFIG_HOSTS+=("$host")
CONFIG_PORTS+=("$port")
;;
mongos)
MONGOS_HOST="$host"
MONGOS_PORT="$port"
;;
shard)
SHARD_HOSTS+=("$host")
SHARD_PORTS+=("$port")
;;
esac
done < <(sed '/^[[:space:]]*\(#.*\)\?$/d' file)
Looping with an index
Here, since you need the indices for creating the directories, you can just ask bash for the indices directly. Instead of:
index=0
for conf in "${CONFIG[@]}"
do
...
index=$(($index + 1))
done
Do:
for index in "${!CONFIG_HOSTS[@]}"
do
mkdir /data/config"$index"
echo "starting config server $index"
mongod --configsvr --port "${CONFIG_PORT[index]}" --dbpath /data/config"$index" --replSet conf > log.txt &
sleep 1
done
Other notes on this block:
- Note how I eliminated the
IFS=: read ...
and replaced"${config[1]}"
with"${CONFIG_PORT[index]}"
, which immediately makes the meaning of that variable clearer. - The indentation was mixed up, both in this and the other loop. Be consistent.
- The
echo "" > log.txt
from earlier was unnecessary, since you use> log.txt
here, the file will be truncated anyway. - You are losing the output from all but the last
mongod
process, since you use>
instead of>>
. - Always quote your variables.
This index method can be applied to the other loop as well. Also, every instance of IFS=: read ... <<<"$foo"
can now be eliminated, and the resultant variables replaced with the arrays we created earlier, as shown in point 1 above. So, things like config0[1]
become ${CONFIG_PORT[0]}
.
Explore related questions
See similar questions with these tags.
$()
instead of backticks, it is the current POSIX standard and is better for several reasons: Your tput lines should change fromred=
tput setaf 1` tored=$(tput setaf 1)
\$\endgroup\$