3
\$\begingroup\$

I'm a chemistry student working on a simple bash script to automate HPC density functional theory calculations using Quantum Espresso (pw.x). I am very much a novice at shell scripting and would appreciate some criticism/feedback before I try to implement it on the cluster. The script calls in-gen.py which is a Python script I wrote for generating new input files if certain parameter thresholds in the log file are not met.

#!/bin/bash
 
#SBATCH --job-name=2x2x6_PbCO3 # job name 
#SBATCH --output=slurm.out # Output file name 
#SBATCH --error=slurm.err # Error file name 
#SBATCH --partition=batch # Partition 
#SBATCH --qos=medium+ # Queue 
#SBATCH --time=24:00:00 # Time limit 
#SBATCH --nodes=3 # Number of nodes 
#SBATCH --ntasks-per-node=16 # MPI processes per node 
errors='MPI_ABORT|error|aborted|SIGTERM|TIME|CANCELLED|SIGCONT|terminated|fork'
job='2x2x6_PbCO3'
max_runs=5
prefix=PbCO3
#-------------------------------------------------------------------------------
Clear () 
{
# Checks for input/log from a completed batch script
if [[ -f "$prefix.in" && -f "$prefix.log" ]]; then
 # If both are found a new input file is generated
 python ~/path-to-in-gen.py $prefix.log $prefix.in $job override
 if [ -z "$(ls -A ../logs_and_inputs)" ]; then
 index=1
 else
 indices=()
 # Old input/log are tagged and stored
 for entry in "../logs_and_inputs"/$prefix.in-*; do
 indices+=("${entry: -1}")
 done
 max=${indices[0]}
 for n in "${indices[@]}" ; do
 ((n > max)) && max=$n
 done
 index=$((max+1))
 rm -r *x*x*
 mv $prefix.in ../logs_and_inputs/$prefix.in-$index
 mv $prefix.log ../logs_and_inputs/$prefix.log-$index
 mv $prefix.in-new $prefix.in
 fi
fi
}
#-------------------------------------------------------------------------------
# Creates a directory for old logs and inputs to be stored with a numerical tag from 1 to k
Sort ()
{
if [ ! -d "../logs_and_inputs" ]; then
 mkdir ../logs_and_inputs
 iter=1
else
 if [ -z "$(ls -A ../logs_and_inputs)" ]; then
 iter=1
 else
 indices=()
 for entry in "../logs_and_inputs"/$prefix.in-*; do
 indices+=("${entry: -1}")
 done
 max=${indices[0]}
 for n in "${indices[@]}" ; do
 ((n > max)) && max=$n
 done
 iter=$((max+1))
 fi
fi
max_iter=$((iter+max_runs))
}
#------------------------------------------------------------------------------- 
Automode () 
{
mpirun ~/path-to-pw.x < $prefix.in > $prefix.log
# Runs pw.x job, checks for completion and errors
if grep -E -q -- $errors "slurm.err"; then 
 exit 1
fi
if grep -q "job DONE." "$prefix.log"; then
 # Clear wavefunction files and temp directories
 rm -r *x*x*
 # Generates new input file
 python ~/path-to-in-gen.py $prefix.log $prefix.in $job
 # Tags and stores the previous input and log
 mv $prefix.in ../logs_and_inputs/$prefix.in-$iter
 mv $prefix.log ../logs_and_inputs/$prefix.log-$iter
else
 exit 1
fi
 # Checks for generation of new input file
if test -f "$prefix.in-new"; then
 mv $prefix.in-new $prefix.in
else
 exit 1
fi
let "iter=iter+1"
}
#------------------------------------------------------------------------------- 
Clear
Sort
while ((iter < max_iter)); do
 Automode
done
asked Sep 1, 2020 at 2:34
\$\endgroup\$

1 Answer 1

4
\$\begingroup\$

Specific suggestions:

  1. Function names are by convention snake_case.
  2. Use More QuotesTM.
  3. [[ should be used instead of [, because it's safer.
  4. [[ EXPRESSION ]] && [[ EXPRESSION ]] would be clearer than [[ EXPRESSION && EXPRESSION ]], in my opinion.
  5. set -o errexit -o noclobber -o nounset -o pipefail and shopt -s globfail would make the error handling much stricter.
  6. This is no longer trivial code, so I'd recommend implementing it in a non-shell language like Python.

Tool suggestions:

  1. Running shellcheck on your script regularly is a good way to capture possible issues. In this case the only thing it finds is for rm -r *x*x*:

    SC2035: Use ./glob or -- glob so names with dashes won't become options.

answered Sep 1, 2020 at 3:02
\$\endgroup\$
5
  • \$\begingroup\$ Thank you for pointing these issues out. Where would I use [[ -e PATH ]] in this script? In place of [ ! -d "../logs_and_inputs" ]? \$\endgroup\$ Commented Sep 1, 2020 at 3:25
  • 1
    \$\begingroup\$ Clarified, cheers. \$\endgroup\$ Commented Sep 1, 2020 at 3:28
  • 1
    \$\begingroup\$ I may be misunderstanding, but I used [ -z "$(ls -A ../logs_and_inputs)" ] for the scenario where ../logs_and_inputs already existed, but was an empty directory, in which case we start with iter=1 for the input/log tags \$\endgroup\$ Commented Sep 1, 2020 at 3:38
  • 1
    \$\begingroup\$ Ah, this one? Ok, that actually looks like a legit use of ls in a script, the first one I've seen so far. Removed that line. \$\endgroup\$ Commented Sep 1, 2020 at 3:44
  • 1
    \$\begingroup\$ Will implement all suggestions aside from 6, which I will read up on how to do before attempting. Much appreciated. \$\endgroup\$ Commented Sep 1, 2020 at 4:01

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.