I have a lot of subfolders in a parent folder and inside each subfolder, there is a log file. Inside the log file I have a lot of data like this:
> Rotational constants (GHZ): 0.0423083 0.0029364
> 0.0027927 Standard basis: 6-31G(d,p) (6D, 7F) There are 1566 symmetry adapted cartesian basis functions of A symmetry. There are
> 1566 symmetry adapted basis functions of A symmetry. 1566 basis
> functions, 3052 primitive gaussians, 1566 cartesian basis functions
> 355 alpha electrons 355 beta electrons
> nuclear repulsion energy 15971.0567247177 Hartrees. NAtoms= 130 NActive= 130 NUniq= 130 SFac= 1.00D+00 NAtFMM= 60
> NAOKFM=T Big=T Integral buffers will be 131072 words long.
> Raffenetti 2 integral format. Two-electron integral symmetry is
> turned on. One-electron integrals computed using PRISM. NBasis=
> 1566 RedAO= T EigKep= 2.31D-04 NBF= 1566 NBsUse= 1566 1.00D-06
> EigRej= -1.00D+00 NBFU= 1566 Initial guess from the checkpoint file:
> > 0.000000 0.000000 0.000000
> Rot= 1.000000 -0.000006 0.000001 -0.000001 Ang= 0.00 deg. Requested convergence on RMS density matrix=1.00D-08 within 128 cycles. Requested convergence on MAX density matrix=1.00D-06.
> Requested convergence on energy=1.00D-06. No special
> actions if energy rises. SCF Done: E(RB3LYP) = -8526.66394979
> A.U. after 6 cycles
> NFock= 6 Conv=0.72D-08 -V/T= 2.0055 Calling FoFJK, ICntrl= 2127 FMM=T ISym2X=0 I1Cent= 0 IOpClX= 0 NMat=1 NMatS=1
> NMatT=0.
As an example, I am looking for SCF Done: E(RB3LYP) = -8526.66394979 in the text above. The value after the = changes in each file. What I need is to extract all the values and put them in a text file in the parent folder. For example, I have 3 folders: bar, baz, and foo. Now I need the following result:
bar : -8526.66394979
baz : -112232.123391
foo : 12312313.34574
After running the following script, I get only one value (i.e., -8526.66394979). Could you please help me fix the problem?
#!/bin/bash
for file_name in *
do
cd $file_name
EE=$(grep -i 'scf done' *.log | tail -1 | awk 'NR==1 {print $5}')
echo "Electronic Energy : $EE" | column -t -s ":" > ${file_name%%.*}.txt
mv ${file_name%%.*}.txt ../
done
3 Answers
If you only have one log file in each directory and you want to keep the last value from that log file and store it in a text file whose name is the name of the directory, you can do something like this:
for dir in */; do
grep -i 'scf done' "$dir"/*.log |
awk 'END{print "Electronic Energy : "$5}' |
column -t -s ":" > "${dir///}".txt
done
For example, I used the following setup:
$ tree
.
├── dir1
│ └── file.log
├── dir10
│ └── file.log
├── dir2
│ └── file.log
├── dir3
│ └── file.log
├── dir4
│ └── file.log
├── dir5
│ └── file.log
├── dir6
│ └── file.log
├── dir7
│ └── file.log
├── dir8
│ └── file.log
└── dir9
└── file.log
Each file.log contained this:
$ cat dir1/file.log
a b scf done 123
Running the for loop above resulted in:
$ ls *txt
dir10.txt dir1.txt dir2.txt dir3.txt dir4.txt dir5.txt dir6.txt dir7.txt dir8.txt dir9.txt
Each of which contained:
$ cat dir1.txt
Electronic Energy 123
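If a single summary file in the parent folder is wanted instead of one file per directory, the same loop can be adapted. The following is only a minimal sketch: the sample directories, log file names, and the output name Energy.txt are made up for illustration, and it assumes the matching lines look like the "SCF Done: E(RB3LYP) = ..." line in the question, so that the energy is the fifth whitespace-separated field.

```shell
#!/bin/bash
# Hypothetical sample data so the sketch can be run anywhere; all names are made up.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir bar baz
printf ' SCF Done:  E(RB3LYP) =  -8526.66394979     A.U. after    6 cycles\n' > bar/b1.log
{
  printf ' SCF Done:  E(RB3LYP) =  -7777777.22222     A.U. after    9 cycles\n'
  printf ' SCF Done:  E(RB3LYP) =  -112232.123391     A.U. after    6 cycles\n'
} > baz/b2.log

# One output line per directory: "<directory> : <last SCF Done energy>"
for dir in */; do
    # -h suppresses the file-name prefix, so the energy stays in field 5
    # even when a directory holds more than one log file.
    energy=$(grep -hi 'scf done' "$dir"*.log | tail -n 1 | awk '{print $5}')
    printf '%s : %s\n' "${dir%/}" "$energy"
done > Energy.txt

cat Energy.txt
```

Because the loop never changes directory, there is no need to cd back after each iteration, and the redirection after done collects all lines into one file.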
If that doesn't work for you, please update your question and show us the relevant directory structure, file names, example input and expected output.
I have edited the question. – msndm, Jun 5, 2022 at 18:02
I prepared this one:
#!/bin/bash
for dir in */; do
grep -i 'scf done' "$dir"/*.log |
awk 'END{print ""$5}'|
column -t -s ":" > "${dir///}".tmp
done
for file_name in *.tmp
do
echo "${file_name%%.*} : "
cat "$file_name"
done > tmp
awk 'NR%2{printf "%s ",$0;next;}1' tmp > tmp2
sort -k 3 tmp2 > Energy.txt
rm *.tmp tmp tmp2
cat Energy.txt
It works and covers everything I need. However, I am looking for a more advanced approach that uses more efficient commands.
Hoping I understood (and reproduced) your directory structure correctly, try this:
awk '/SCF Done/ {print FILENAME ": " $NF}' */*.log
bar/b1.log: -8526.66394979
baz/b2.log: -7777777.22222
baz/b2.log: -112232.123391
foo/f3.log: -7777777.22222
foo/f3.log: 12312313.34574
If you need the directory only, split the FILENAME and use the first element of the resulting array. If you need the last entry in every file only, try
awk '/SCF Done/ {print FILENAME ": " $NF}' */*.log | tac | sort -u -k1,1
bar/b1.log: -8526.66394979
baz/b2.log: -112232.123391
foo/f3.log: 12312313.34574
which is easier and more straightforward than arranging it in awk.
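For readers who do want to arrange it in awk, splitting FILENAME on "/" and keeping only the last match per directory could look roughly like this. It is a sketch under assumptions, not a tested solution for the real data: the sample directories and files are hypothetical, the output name Energy.txt is made up, the energy is taken from field 5 as in the "SCF Done: E(RB3LYP) = ..." line of the question, and each directory is assumed to hold one log file.

```shell
#!/bin/bash
# Hypothetical sample data; directory and file names are made up for illustration.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir bar baz foo
printf ' SCF Done:  E(RB3LYP) =  -8526.66394979     A.U. after    6 cycles\n' > bar/b1.log
{
  printf ' SCF Done:  E(RB3LYP) =  -7777777.22222     A.U. after    9 cycles\n'
  printf ' SCF Done:  E(RB3LYP) =  -112232.123391     A.U. after    6 cycles\n'
} > baz/b2.log
{
  printf ' SCF Done:  E(RB3LYP) =  -7777777.22222     A.U. after    9 cycles\n'
  printf ' SCF Done:  E(RB3LYP) =  12312313.34574     A.U. after    6 cycles\n'
} > foo/f3.log

awk '/SCF Done/ {
         split(FILENAME, parts, "/")   # "bar/b1.log" -> parts[1]="bar", parts[2]="b1.log"
         last[parts[1]] = $5           # a later match overwrites an earlier one
     }
     END {
         for (d in last) print d " : " last[d]
     }' */*.log | sort > Energy.txt    # sort, because awk array order is unspecified

cat Energy.txt
```

The array last is indexed by directory name, so only the final SCF Done value read for each directory survives until the END block.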
It is better, but there are some errors. First, we need to use $5 instead of $NF. Also, bar/bar.log should change to just "bar". And this script extracts every scf done line in all files; we need only the last scf done in each file. – msndm, Jun 6, 2022 at 13:20
So the sample in your question, where the desired number IS $NF, does not represent your real-world data? And did you read the remarks in the post? – RudiC, Jun 6, 2022 at 13:27
I don't know what splitting the file name means! Sorry, I am an absolute neophyte. – msndm, Jun 6, 2022 at 13:36
scf done
?