I have a thousand files and I need to verify that they all have exactly the same information in the second column, up to a certain number of lines. An example is below. I would like to print the names of the files whose first 5 lines of the second column are not equal. If that were the case for file1.txt and file2.txt, the result should show: "difference between files file1.txt and file2.txt"
file1.txt
jose 50
maria 50
fernando 50
andres 50
martin 30
pablo 30
.
.
.
file2.txt
julia 50
julio 50
alan 50
ruth 50
ana 40
manuel 40
.
.
.
-
Are you only going to compare two at a time? – Nasir Riley, Apr 18, 2019 at 20:11
-
No, I want to compare the files file2.txt, ..., file999.txt with the first (file1.txt). – user140259, Apr 19, 2019 at 4:55
1 Answer
Hmm. I think I would do a for loop through the files and compare them with comm.
/tmp ❯ comm -3 <(awk '{print $2}' file1.txt) <(awk '{print $2}' file2.txt)
30
30
40
40
Note that the 30s and 40s are values output from the second columns of the files. Some basic usage of comm:
comm -1 -3 <(sort -u FILE1.txt) <(sort -u FILE2.txt)
- -1 suppress lines unique to FILE1
- -2 suppress lines unique to FILE2
- -3 suppress lines that appear in both files
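To see what each flag combination leaves behind, here is a small sketch on two throwaway lists. The file names a.txt and b.txt and their contents are made up for illustration, and both lists are already sorted, which comm expects of its inputs.

```shell
# Hypothetical sample data in a scratch directory; both lists are sorted.
tmpdir=$(mktemp -d)
printf '%s\n' apple banana cherry > "$tmpdir/a.txt"
printf '%s\n' banana cherry date  > "$tmpdir/b.txt"

# -2 and -3 together leave only the lines unique to the first file
only_a=$(comm -23 "$tmpdir/a.txt" "$tmpdir/b.txt")
# -1 and -3 together leave only the lines unique to the second file
only_b=$(comm -13 "$tmpdir/a.txt" "$tmpdir/b.txt")
# -1 and -2 together leave only the lines that appear in both files
both=$(comm -12 "$tmpdir/a.txt" "$tmpdir/b.txt")

echo "only in a.txt: $only_a"
echo "only in b.txt: $only_b"
echo "in both: $both"
rm -rf "$tmpdir"
```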
So, to put all this together, something like:
cd /path/to/files && find . -type f -name "*.txt" | while read -r filename
do
    # Show the second-column values that are not common to both files
    echo "*** Checking $filename ***"; comm -3 <(awk '{print $2}' reference.txt) <(awk '{print $2}' "$filename"); echo ""
done
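The loop above compares the whole second column and prints the differing values. If you only want the message from the question, limited to the first 5 lines, a plain string comparison may be enough. The following is a minimal sketch, not a tested production script: the function name compare_col2 and its argument order are my own invention, and it assumes the columns are whitespace-separated.

```shell
# compare_col2 REF N FILE...
# Prints a message for every FILE whose second column differs from that of
# REF within the first N lines. (Hypothetical helper, not a standard tool.)
compare_col2() {
    ref=$1
    n=$2
    shift 2
    # Second column of the first n lines of the reference file
    ref_col=$(awk -v n="$n" 'NR > n { exit } { print $2 }' "$ref")
    for f in "$@"; do
        col=$(awk -v n="$n" 'NR > n { exit } { print $2 }' "$f")
        if [ "$col" != "$ref_col" ]; then
            echo "difference between files $ref and $f"
        fi
    done
}

# Example call (file1.txt also matches the glob, but it compares equal to itself):
# compare_col2 file1.txt 5 file*.txt
```

With the two sample files from the question, this would report file2.txt, because the fifth value is 30 in file1.txt and 40 in file2.txt. Unlike comm, it compares line by line in the original order, so no sorting is needed.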