I'm a student working as a research assistant and wrote this script to automate density functional theory HPC tasks with SLURM. When a calculation is complete, the script checks the log file for the total force. If the force is above the desired threshold, it generates a new input file, file.in.new, with the relaxed atomic positions from the log; that file is then passed to another script in the automation process. Please point out any issues with formatting, syntax, etc., and anything I could do to simplify things.
Example of use: python generate.py file.log file.in
    from sys import argv

    def arg_parse(argv):
        try:
            log_file=argv[1]
            input_file=argv[2]
            override=False
        except IndexError as e:
            raise SystemExit
        if len(argv)==4 and argv[3].strip("-")=='o':
            override=True
        scan_and_write(log_file, input_file, override)
    #---------------------------------------------------------------------
    def scan_and_write(log_file, input_file, override):
        with open(log_file, 'r+') as log:
            total_force=[float(line.split()[3]) for line in log if line.rfind('Total force =') != -1][-1]
            tolerance(total_force, override, input_file)
            log.seek(0)
            total_cycles=sum([1 for line in log if line.rfind('ATOMIC_POSITIONS (crystal)') != -1])
            log.seek(0)
            index=[int(line.split("=")[1]) for line in log if ("number of atoms/cell" in line)][0]
            log.seek(0)
            for line in log:
                if line.rfind('ATOMIC_POSITIONS (crystal)') != -1:
                    atomic_positions=[log.readline().split() for i in range(index)]
        new_input=open(input_file.replace('.in', '.in.new'), "w+")
        fmt = '{:2} {:12.9f} {:12.9f} {:12.9f}\n'
        with open(input_file, 'r+') as old_input:
            for line in old_input:
                if len(line.split()) != 4 and not line[0].isnumeric():
                    new_input.write(line)
                if ('ATOMIC_POSITIONS') in line:
                    for position in atomic_positions:
                        new_input.write(fmt.format(position[0],*[float(xred) for xred in position[1:4]]))
    #---------------------------------------------------------------------
    def tolerance(force, override, file_name):
        print('A total force of {} was achieved in the last SCF cycle'.format(force))
        if (force < 0.001 and not override):
            print("Relaxation sufficient, total force = %s...terminating" %force)
            raise SystemExit
        if (force < 0.001 and override):
            print("Relaxation sufficient...total force = %s\n\nOverriding threshold"\
                  " limit and generating %s" %(force, file_name.replace('.in', '.in.new')))
    #---------------------------------------------------------------------
    if __name__ == "__main__":
        arg_parse(argv)
1 Answer
One tip: use the argparse module.
import argparse
parser = argparse.ArgumentParser()
parser.add_argument("--log_file", dest="log_file", type=str, required=True, help="Add some help text here")
parser.add_argument("--input_file", dest="input_file", type=str, required=True, help="Add some help text here")
args = parser.parse_args()
# show the values, will reach this point only if the two parameters were provided
print(f"log_file: {args.log_file}")
print(f"input_file: {args.input_file}")
Then you call your script like this:
python3 generate.py --log_file test.log --input_file test.txt
Parameter order is free.
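If you would rather keep the original calling convention (python generate.py file.log file.in, plus an optional override switch), argparse also handles positional arguments and boolean flags. The following is only a sketch: the function name build_parser and the long option --override are my own choices for illustration; the original script only checks for a bare "o" after stripping dashes.

```python
import argparse


def build_parser():
    """Build a parser for: generate.py LOG_FILE INPUT_FILE [-o]."""
    parser = argparse.ArgumentParser(
        description="Generate a new input file from a relaxation log."
    )
    # Positional arguments keep the original argument order.
    parser.add_argument("log_file", help="log file written by the calculation")
    parser.add_argument("input_file", help="input file used for the calculation")
    # Hypothetical spelling of the override switch; defaults to False.
    parser.add_argument(
        "-o", "--override",
        action="store_true",
        help="generate the new input even if the force is below the threshold",
    )
    return parser


# Passing an explicit list instead of sys.argv makes the parser easy to try out.
args = build_parser().parse_args(["file.log", "file.in", "-o"])
print(args.log_file, args.input_file, args.override)
```

With this layout a missing file name produces a usage message automatically, so the try/except around argv is no longer needed.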
I have to admit I don't understand much about the purpose since I don't know about your input files. If they are CSV then you might consider using the csv module. Then you should be able to simplify some statements like this one:
total_force=[float(line.split()[3]) for line in log if line.rfind('Total force =') != -1][-1]
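Even without a dedicated parsing module, that one-liner filters, parses, and picks the last match all at once, so a small named helper may be easier to read. This is a sketch only: last_total_force is a hypothetical name, and the sample lines are made up to resemble the "Total force =" lines the script searches for.

```python
def last_total_force(lines):
    """Return the value from the last 'Total force =' line, or None if absent."""
    force = None
    for line in lines:
        if "Total force =" in line:
            # Same field the original one-liner takes: the 4th whitespace-separated token.
            force = float(line.split()[3])
    return force


# Made-up sample lines, for illustration only.
sample = [
    "     Total force =     0.012345     Total SCF correction =     0.000100\n",
    "     some other output\n",
    "     Total force =     0.000800     Total SCF correction =     0.000010\n",
]
print(last_total_force(sample))
```

Because the helper only iterates, it should work equally well when given an open file object. Returning None when nothing matches also avoids the IndexError that [-1] raises on an empty list.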
Something is badly lacking in your script: comments. They will help you too, especially when you go back to review code you wrote a few months ago. By then you will probably have forgotten the details and will have to reanalyze your own code.
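As a rough illustration of the kind of documentation meant here, the threshold check could be written with a named constant and a docstring. The names FORCE_THRESHOLD and is_relaxed are hypothetical; the value 0.001 is the one hard-coded in the script.

```python
# Cutoff below which the structure is treated as relaxed (value taken from the script).
FORCE_THRESHOLD = 0.001


def is_relaxed(force, threshold=FORCE_THRESHOLD):
    """Return True if the total force is below the convergence threshold.

    force: total force parsed from the last 'Total force =' line of the log.
    threshold: cutoff below which no new input file is needed.
    """
    return force < threshold


print(is_relaxed(0.0005), is_relaxed(0.01))
```

Returning a boolean instead of raising SystemExit inside the check leaves the decision to stop with the caller, which tends to make the flow easier to follow.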
The argparse tip is super helpful, thank you! The logs and inputs are all just formatted plain text; we're using programs like Quantum Espresso and Abinit. – R.T, Sep 23, 2020 at 20:01
[...] log_file four times. It looks like log_file can contain multiple lines with "ATOMIC_POSITIONS". If so, each occurrence overwrites the previous atomic_positions, which seems incorrect.
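One way to address both points is to collect everything in a single pass and make the "keep the last block" behaviour explicit. The sketch below rests on several assumptions: the name scan_log is hypothetical, the "number of atoms/cell" line is assumed to appear before the first positions block, the last ATOMIC_POSITIONS block is assumed to be the one wanted, and the sample text is made up to match the strings the script looks for.

```python
def scan_log(lines):
    """Collect the values the script needs in one pass over the log lines.

    Returns (n_atoms, last_force, last_positions).
    """
    n_atoms = None
    last_force = None
    last_positions = None
    lines = iter(lines)  # one shared iterator, so next() below consumes block lines
    for line in lines:
        if "number of atoms/cell" in line:
            n_atoms = int(line.split("=")[1])
        elif "Total force =" in line:
            last_force = float(line.split()[3])
        elif "ATOMIC_POSITIONS (crystal)" in line:
            # Deliberately keep only the most recent block; earlier ones are replaced.
            # Assumes n_atoms has already been read from an earlier line.
            last_positions = [next(lines).split() for _ in range(n_atoms)]
    return n_atoms, last_force, last_positions


# Made-up sample log, for illustration only.
sample = """\
     number of atoms/cell      =            2
     Total force =     0.012345     Total SCF correction =     0.000100
ATOMIC_POSITIONS (crystal)
Si  0.000000000  0.000000000  0.000000000
Si  0.250000000  0.250000000  0.250000000
     Total force =     0.000800     Total SCF correction =     0.000010
ATOMIC_POSITIONS (crystal)
Si  0.001000000  0.000000000  0.000000000
Si  0.251000000  0.250000000  0.250000000
""".splitlines()

n_atoms, force, positions = scan_log(sample)
print(n_atoms, force, positions)
```

If a different block is wanted, for example the first one or all of them, only the ATOMIC_POSITIONS branch would need to change.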