I'm a chemistry student working on a simple bash script to automate HPC density functional theory calculations using Quantum Espresso (pw.x). I am very much a novice at shell scripting and would appreciate some criticism/feedback before I try to implement it on the cluster. The script calls in-gen.py which is a Python script I wrote for generating new input files if certain parameter thresholds in the log file are not met.
#!/bin/bash
#SBATCH --job-name=2x2x6_PbCO3 # job name
#SBATCH --output=slurm.out # Output file name
#SBATCH --error=slurm.err # Error file name
#SBATCH --partition=batch # Partition
#SBATCH --qos=medium+ # Queue
#SBATCH --time=24:00:00 # Time limit
#SBATCH --nodes=3 # Number of nodes
#SBATCH --ntasks-per-node=16 # MPI processes per node
errors='MPI_ABORT|error|aborted|SIGTERM|TIME|CANCELLED|SIGCONT|terminated|fork'
job='2x2x6_PbCO3'
max_runs=5
prefix=PbCO3
#-------------------------------------------------------------------------------
Clear ()
{
# Checks for input/log from a completed batch script
if [[ -f "$prefix.in" && -f "$prefix.log" ]]; then
# If both are found a new input file is generated
python ~/path-to-in-gen.py $prefix.log $prefix.in $job override
if [ -z "$(ls -A ../logs_and_inputs)" ]; then
index=1
else
indices=()
# Old input/log are tagged and stored
for entry in "../logs_and_inputs"/$prefix.in-*; do
indices+=("${entry: -1}")
done
max=${indices[0]}
for n in "${indices[@]}" ; do
((n > max)) && max=$n
done
index=$((max+1))
rm -r *x*x*
mv $prefix.in ../logs_and_inputs/$prefix.in-$index
mv $prefix.log ../logs_and_inputs/$prefix.log-$index
mv $prefix.in-new $prefix.in
fi
fi
}
#-------------------------------------------------------------------------------
# Creates a directory for old logs and inputs to be stored with a numerical tag from 1 to k
Sort ()
{
if [ ! -d "../logs_and_inputs" ]; then
mkdir ../logs_and_inputs
iter=1
else
if [ -z "$(ls -A ../logs_and_inputs)" ]; then
iter=1
else
indices=()
for entry in "../logs_and_inputs"/$prefix.in-*; do
indices+=("${entry: -1}")
done
max=${indices[0]}
for n in "${indices[@]}" ; do
((n > max)) && max=$n
done
iter=$((max+1))
fi
fi
max_iter=$((iter+max_runs))
}
#-------------------------------------------------------------------------------
Automode ()
{
mpirun ~/path-to-pw.x < $prefix.in > $prefix.log
# Runs pw.x job, checks for completion and errors
if grep -E -q -- $errors "slurm.err"; then
exit 1
fi
if grep -q "job DONE." "$prefix.log"; then
# Clear wavefunction files and temp directories
rm -r *x*x*
# Generates new input file
python ~/path-to-in-gen.py $prefix.log $prefix.in $job
# Tags and stores the previous input and log
mv $prefix.in ../logs_and_inputs/$prefix.in-$iter
mv $prefix.log ../logs_and_inputs/$prefix.log-$iter
else
exit 1
fi
# Checks for generation of new input file
if test -f "$prefix.in-new"; then
mv $prefix.in-new $prefix.in
else
exit 1
fi
let "iter=iter+1"
}
#-------------------------------------------------------------------------------
Clear
Sort
while ((iter < max_iter)); do
Automode
done
1 Answer 1
Specific suggestions:
- Function names are by convention
snake_case
. - Use More QuotesTM.
[[
should be used instead of[
, because it's safer.[[ EXPRESSION ]] && [[ EXPRESSION ]]
would be clearer than[[ EXPRESSION && EXPRESSION ]]
, in my opinion.set -o errexit -o noclobber -o nounset -o pipefail
andshopt -s globfail
would make the error handling much stricter.- This is no longer trivial code, so I'd recommend implementing it in a non-shell language like Python.
Tool suggestions:
- Running
shellcheck
on your script regularly is a good way to capture possible issues. In this case the only thing it finds is forrm -r *x*x*
:SC2035: Use ./glob or -- glob so names with dashes won't become options.
-
\$\begingroup\$ Thank you for pointing these issues out. Where would I use
[[ -e PATH ]]
in this script? In place of[ ! -d "../logs_and_inputs" ]
? \$\endgroup\$R.T– R.T2020年09月01日 03:25:04 +00:00Commented Sep 1, 2020 at 3:25 -
1\$\begingroup\$ Clarified, cheers. \$\endgroup\$l0b0– l0b02020年09月01日 03:28:16 +00:00Commented Sep 1, 2020 at 3:28
-
1\$\begingroup\$ I may be misunderstanding, but I used
[ -z "$(ls -A ../logs_and_inputs)" ]
for the scenario where../logs_and_inputs
already existed, but was an empty directory, in which case we start withiter=1
for the input/log tags \$\endgroup\$R.T– R.T2020年09月01日 03:38:19 +00:00Commented Sep 1, 2020 at 3:38 -
1
-
1\$\begingroup\$ Will implement all suggestions aside from 6, which I will read up on how to do before attempting. Much appreciated. \$\endgroup\$R.T– R.T2020年09月01日 04:01:50 +00:00Commented Sep 1, 2020 at 4:01