I have a thousand files and I need to verify whether they all contain exactly the same values in the second column, up to a certain number of lines. An example is below. I would like to print the names of the files whenever the first 5 lines of the second column differ, as they do for file1.txt and file2.txt here. In this case the result should be: "difference between files file1.txt and file2.txt"

file1.txt

jose 50
maria 50
fernando 50
andres 50
martin 30
pablo 30
.
.
.

file2.txt

julia 50
julio 50
alan 50
ruth 50
ana 40
manuel 40
.
.
. 
asked Apr 18, 2019 at 19:56
  • Are you only going to compare two at a time? Commented Apr 18, 2019 at 20:11
  • No, I want to compare the files file2.txt, ..., file999.txt with the first (file1.txt) Commented Apr 19, 2019 at 4:55

1 Answer

Hmm. I think I would loop over the files and compare them with comm.

/tmp ❯ comm -3 <(awk '{print $2}' file1.txt) <(awk '{print $2}' file2.txt)
30
30
 40
 40

Note that the 30s and 40s are the second-column values that differ between the two files. Some basic usage of comm, which expects sorted input: comm -1 -3 <(sort -u FILE1.txt) <(sort -u FILE2.txt)

  • -1 suppress lines unique to FILE1
  • -2 suppress lines unique to FILE2
  • -3 suppress lines that appear in both files
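As a small illustration of how these flags combine, here is a sketch with two made-up sorted files (a.txt and b.txt are hypothetical names, created in a temporary directory just for the demo):

```shell
# Two small, already sorted inputs (comm expects sorted input).
dir=$(mktemp -d)
printf 'apple\nbanana\ncherry\n' > "$dir/a.txt"
printf 'banana\ncherry\ndate\n'  > "$dir/b.txt"

only_a=$(comm -23 "$dir/a.txt" "$dir/b.txt")   # hide columns 2 and 3: lines only in a.txt
only_b=$(comm -13 "$dir/a.txt" "$dir/b.txt")   # hide columns 1 and 3: lines only in b.txt
common=$(comm -12 "$dir/a.txt" "$dir/b.txt")   # hide columns 1 and 2: lines in both files

echo "only in a.txt: $only_a"
echo "only in b.txt: $only_b"
echo "in both: $common"
```

If the -23 and -13 results are both empty, the two inputs contain the same set of lines.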

So, putting all this together, something like:

cd /path/to/files && find . -type f -name "*.txt" | while read -r filename
do
 echo "*** Checking $filename ***"; comm -3 <(awk '{print $2}' reference.txt) <(awk '{print $2}' "$filename"); echo ""
done
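The loop above compares whole columns and prints the raw differences, while the question asks only about the first 5 lines and wants one message per differing file. A minimal sketch of that variant follows. It swaps comm for a plain string comparison so that line order matters and no sorting is needed. The function names second_col and compare_to_reference are made up for this example, and file1.txt is assumed to be the reference, as in the comments on the question.

```shell
# second_col FILE N: print column 2 of the first N lines of FILE.
second_col() {
    awk -v n="$2" 'NR > n { exit } { print $2 }' "$1"
}

# compare_to_reference REF N FILE...: report each FILE whose first N
# second-column values differ from those of REF.
# Note: command substitution strips trailing newlines, so trailing empty
# values within the first N lines are ignored by this comparison.
compare_to_reference() {
    ref=$1
    n=$2
    shift 2
    ref_col=$(second_col "$ref" "$n")   # extract the reference column once
    for f in "$@"; do
        [ "$f" = "$ref" ] && continue   # skip the reference itself
        if [ "$(second_col "$f" "$n")" != "$ref_col" ]; then
            echo "difference between files $ref and $f"
        fi
    done
}

# Example call, using the file names from the question:
# compare_to_reference file1.txt 5 file*.txt
```

With the two sample files from the question, the example call should report file2.txt, since its fifth value is 40 where file1.txt has 30.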
answered Apr 18, 2019 at 20:06
