
I have a lot of subfolders in a parent folder and inside each subfolder, there is a log file. Inside the log file I have a lot of data like this:

> Rotational constants (GHZ): 0.0423083 0.0029364 
> 0.0027927 Standard basis: 6-31G(d,p) (6D, 7F) There are 1566 symmetry adapted cartesian basis functions of A symmetry. There are
> 1566 symmetry adapted basis functions of A symmetry. 1566 basis
> functions, 3052 primitive gaussians, 1566 cartesian basis functions 
> 355 alpha electrons 355 beta electrons
> nuclear repulsion energy 15971.0567247177 Hartrees. NAtoms= 130 NActive= 130 NUniq= 130 SFac= 1.00D+00 NAtFMM= 60
> NAOKFM=T Big=T Integral buffers will be 131072 words long. 
> Raffenetti 2 integral format. Two-electron integral symmetry is
> turned on. One-electron integrals computed using PRISM. NBasis= 
> 1566 RedAO= T EigKep= 2.31D-04 NBF= 1566 NBsUse= 1566 1.00D-06
> EigRej= -1.00D+00 NBFU= 1566 Initial guess from the checkpoint file:
> > 0.000000 0.000000 0.000000
> Rot= 1.000000 -0.000006 0.000001 -0.000001 Ang= 0.00 deg. Requested convergence on RMS density matrix=1.00D-08 within 128 cycles. Requested convergence on MAX density matrix=1.00D-06. 
> Requested convergence on energy=1.00D-06. No special
> actions if energy rises. SCF Done: E(RB3LYP) = -8526.66394979 
> A.U. after 6 cycles
> NFock= 6 Conv=0.72D-08 -V/T= 2.0055 Calling FoFJK, ICntrl= 2127 FMM=T ISym2X=0 I1Cent= 0 IOpClX= 0 NMat=1 NMatS=1
> NMatT=0.

As an example, I am looking for SCF Done: E(RB3LYP) = -8526.66394979 in the text above. The value after the = changes in each file. What I need is to extract all the values and put them in a single text file in the parent folder. For example, I have 3 folders: bar, baz, and foo. Now I need the following result:

bar : -8526.66394979
baz : -112232.123391
foo : 12312313.34574

After running the following script, I get only one value (i.e., -8526.66394979). Could you please help me fix the problem?

#!/bin/bash
for file_name in *
do
cd $file_name
EE=$(grep -i 'scf done' *.log | tail -1 | awk 'NR==1 {print $5}')
echo "Electronic Energy : $EE" | column -t -s ":" > ${file_name%%.*}.txt
mv ${file_name%%.*}.txt ../
done
asked Jun 5, 2022 at 17:04
  • What problem? How does this fail? It is needlessly complicated for what I think you need, but since you don't tell us how it fails, or show us anything about the directories and files you need to process, it is hard to know. Can you please edit your question and i) show us the directory structure, ii) show some example files, iii) tell us how this failed and, most importantly, iv) show us the output you expect? Commented Jun 5, 2022 at 17:13
  • For instance, your script is only keeping the last value from the last file. Is that what you want or do you want the last value from each file? Commented Jun 5, 2022 at 17:14
  • I have tried to elaborate on what I need. Your code is very good, but it saves the values in different .txt files. I need only one .txt file in the parent folder. Thank you so much for your attention. Commented Jun 5, 2022 at 17:41
  • Oh. OK, your code did the same thing, that's why I did it that way. Do you also need that last occurrence of scf done in each file? Can there be more than one or do all your files just have one scf done? Commented Jun 5, 2022 at 20:43
  • Thank you for answering. I prepared the following code: #!/bin/bash for dir in */; do grep -i 'scf done' "$dir"/*.log | awk 'END{print ""$5}' | column -t -s ":" > "${dir//\/}".tmp; done; for file_name in *.tmp; do echo "${file_name%%.*} : "; cat "$file_name"; done > tmp; awk 'NR%2{printf "%s ",$0;next;}1' tmp > tmp2; sort -k 3 tmp2 > Energy.txt; rm *.tmp tmp tmp2; cat Energy.txt It does work! But it is rather stupid. Can we convert it to a more efficient script using advanced commands? Commented Jun 5, 2022 at 20:53

3 Answers


If you only have one log file in each directory and you want to keep the last value from that log file and store it in a text file whose name is the name of the directory, you can do something like this:

for dir in */; do
    grep -i 'scf done' "$dir"/*.log |
    awk 'END{print "Electronic Energy : "$5}' |
    column -t -s ":" > "${dir//\/}".txt
done

For example, I used the following setup:

$ tree
.
├── dir1
│  └── file.log
├── dir10
│  └── file.log
├── dir2
│  └── file.log
├── dir3
│  └── file.log
├── dir4
│  └── file.log
├── dir5
│  └── file.log
├── dir6
│  └── file.log
├── dir7
│  └── file.log
├── dir8
│  └── file.log
└── dir9
   └── file.log

Each file.log contained this:

$ cat dir1/file.log 
a b scf done 123

Running the for loop above resulted in:

$ ls *txt
dir10.txt dir1.txt dir2.txt dir3.txt dir4.txt dir5.txt dir6.txt dir7.txt dir8.txt dir9.txt

Each of which contained:

$ cat dir1.txt 
Electronic Energy 123

If that doesn't work for you, please update your question and show us the relevant directory structure, file names, example input and expected output.

answered Jun 5, 2022 at 17:22
  • I have edited the question. Commented Jun 5, 2022 at 18:02

I prepared this one:

#!/bin/bash
for dir in */; do
    grep -i 'scf done' "$dir"/*.log |
    awk 'END{print ""$5}' |
    column -t -s ":" > "${dir//\/}".tmp
done
for file_name in *.tmp
do
    echo "${file_name%%.*} : "
    cat "$file_name"
done > tmp
awk 'NR%2{printf "%s ",$0;next;}1' tmp > tmp2
sort -k 3 tmp2 > Energy.txt
rm *.tmp tmp tmp2
cat Energy.txt

It works and covers everything I need. However, I am looking for a more advanced way of writing it, probably using more efficient commands.
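One possible way to get the same kind of output without the temporary files is sketched below. It is only a sketch under a few assumptions: the energy is the 5th whitespace-separated field of the "SCF Done" line (as in the sample log in the question), only the last match per directory is wanted, and the function name collect_energies is made up for this illustration.

```shell
#!/bin/bash
# Sketch: print "<directory> : <last SCF Done energy>" for each subdirectory.
# collect_energies is a hypothetical helper name, not an existing command.
collect_energies() {
    local dir name energy
    for dir in */; do
        name=${dir%/}    # strip the trailing slash left by the */ glob
        # Remember field 5 of every matching line; print only the last one.
        energy=$(grep -hi 'scf done' "$dir"*.log 2>/dev/null |
                 awk '{ e = $5 } END { if (NR) print e }')
        # Skip directories that have no log file or no matching line.
        if [ -n "$energy" ]; then
            printf '%s : %s\n' "$name" "$energy"
        fi
    done
}
```

Run from the parent folder, collect_energies > Energy.txt would write one line per directory. Piping through sort -k 3 first, as the script above does, orders the lines by the energy column instead of by directory name.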

answered Jun 5, 2022 at 20:57
  • Is this an answer or a new question? Commented Jun 9, 2022 at 6:46
  • It is an answer, but it is not beautiful. I think it calls for a more advanced way of coding. Commented Jun 9, 2022 at 23:26

Hoping I understood (and reproduced) your directory structure correctly, try this:

awk '/SCF Done/ {print FILENAME ": " $NF}' */*.log
bar/b1.log: -8526.66394979
baz/b2.log: -7777777.22222
baz/b2.log: -112232.123391
foo/f3.log: -7777777.22222
foo/f3.log: 12312313.34574

If you need the directory only, split the FILENAME and use the first element of the resulting array. If you need the last entry in every file only, try

awk '/SCF Done/ {print FILENAME ": " $NF}' */*.log | tac | sort -u -k1,1
bar/b1.log: -8526.66394979
baz/b2.log: -112232.123391
foo/f3.log: 12312313.34574

which is easier and more straightforward than arranging it in awk.
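For readers who do want to arrange it in awk, the FILENAME-splitting remark above could look roughly like the following sketch. It assumes one log file per directory and, following the asker's comment below, takes the energy from field 5 rather than $NF; the function name last_scf_per_dir is hypothetical.

```shell
#!/bin/bash
# Sketch: keep only the last "SCF Done" energy per directory, in one awk call.
# last_scf_per_dir is a made-up name used for illustration only.
last_scf_per_dir() {
    awk '
        /SCF Done/ {
            split(FILENAME, parts, "/")   # "bar/b1.log" -> parts[1] = "bar"
            energy[parts[1]] = $5         # a later match overwrites an earlier one
        }
        END {
            for (d in energy) print d " : " energy[d]
        }
    ' */*.log | sort                      # for-in order is unspecified, so sort
}
```

Called from the parent folder as last_scf_per_dir > Energy.txt, this would produce lines such as "bar : -8526.66394979". If a directory held several log files, the value from the file that the glob lists last would win.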

answered Jun 6, 2022 at 9:09
  • It is better, but there are some errors. First, we need to use $5 instead of $NF. Second, bar/bar.log should change to just "bar". Third, this script extracts every scf done in all files; we need only the last scf done in each file. Commented Jun 6, 2022 at 13:20
  • So the sample in your question, where the desired number IS $NF, does not represent your real world data? And, did you read the remarks in the post? Commented Jun 6, 2022 at 13:27
  • I don't know what splitting the file name means! Sorry, I am an absolute neophyte. Commented Jun 6, 2022 at 13:36
