How to make diff command ignore certain lines of second file(bash)?

Question 1

For example:

file1.txt:

I need to buy apples.
I need to run the laundry.
I need to wash the dog.
I need to get the car detailed.

file2.txt

I need to buy apples.
I need to run the laundry.
I need to wash the car.
I need to get the car detailed.

file3.txt

I need to wash the car.

When I run diff file1.txt file2.txt, any line listed in file3.txt should be ignored by diff if it appears in file2.txt. So, in this case, there should be no difference.

Using the ignore flag (diff -I "$line") won't help, since it matches the pattern in both files.

How can I do this?

Question 2

I don't think there's a direct way - you'll have to strip off lines in file1 corresponding to those lines in file2.

Question 3

@muru That was what came to my mind, but it seems like a tough way to do it. I am using the diff command inside an if condition, so I thought I would first filter out the lines from file2 and the corresponding lines from file1 somehow, and then count the lines of diff output.

Question 4

Please revise your question so it makes sense. If "the statements present in file3.txt should be ignored by diff command if it is present in file2.txt", then you are logically comparing a four-line file to a three-line file, and so there is a difference. Please state the rule that you want the answer to implement that would justify not reporting that "I need to wash the dog." is present in file1.txt but not file2.txt.

Question 5

A workaround would be to strip off the corresponding lines and then diff the results. That is, both file1 and file2 would effectively look like:

I need to buy apples.
I need to run the laundry.
I need to get the car detailed.

You can do this using a combination of grep, perl and sed:

$ lines_to_ignore=$(grep -nFf file3 file2 | perl -pe 's|^(\d+):.*|$1s/.//g;|')
$ echo $lines_to_ignore
3s/.//g;
$ diff <(sed "$lines_to_ignore" file1) <(sed "$lines_to_ignore" file2)
$ echo $?
0

I use grep to get the matching lines (along with their line numbers) in file2.
Then I use perl to extract the line numbers from the grep output and build sed commands from them (Ns/.//g deletes every character on line N, leaving it empty).
Then I use process substitution to feed diff the result of sed running these commands on both files.
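The three steps above can be wrapped in a small bash function. This is only a sketch under a few assumptions: the name diff_ignoring and its argument order are made up for illustration, cut and sed stand in for the perl step, and grep is given -x so that only whole-line matches count.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not from the answer): diff FILE1 against FILE2 while
# blanking, in both files, every line number at which FILE2 holds a line
# that is listed in IGNORE.
diff_ignoring() {
    local file1=$1 file2=$2 ignore=$3 script
    # grep -n prints "N:text" for each match; -x and -F require a literal,
    # whole-line match. cut keeps N, and sed turns it into "Ns/.*//".
    script=$(grep -nxFf "$ignore" "$file2" | cut -d: -f1 | sed 's|$|s/.*//|')
    # With no matches the script is empty and sed copies its input unchanged.
    diff <(sed "$script" "$file1") <(sed "$script" "$file2")
}
```

Because the function returns diff's exit status, it can be used directly in a condition, for example: if diff_ignoring file1.txt file2.txt file3.txt >/dev/null; then echo "no relevant differences"; fi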

Question 6

You could combine diff with the combine utility (from the moreutils package) here:

$ diff file1.txt <(combine file2.txt NOT file3.txt)
3d2
< I need to wash the dog.

Updated to reflect changes in OP.

Question 7

"So, in this case, there should be no difference." - from question.

Question 8

"the statements present in file3.txt should be ignored by diff command if it is present in file2.txt. So, in this case, there should be no difference."

Question 9

And how do you interpret that?

Question 10

Sorry, question edited. The difference is only in third line.

Question 11

This won't work then. How would you decide which entry corresponds to another changed entry? Are the line numbers fixed, as in @muru's example?

Question 12

Use grep to filter the lines out of both files before diffing:

$ diff f1 f2
3c3
< I need to wash the dog.
---
> I need to wash the car.
$ diff <( grep -v -f f3 -x f1) <( grep -v -f f3 -x f2)
3d2
< I need to wash the dog.

where

<( ) is bash process substitution, which lets diff read a command's output as if it were a file
in grep
- -x matches whole lines only
- -f f3 takes the patterns from file f3
- -v prints only the lines that do not match
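One caveat with the command above: without -F, grep treats each line of f3 as a regular expression, so the "." at the end of a sentence matches any character. A minimal sketch that adds -F for literal matching follows; the function name diff_without is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: drop every line listed in IGNORE from both files,
# then diff what is left.
diff_without() {
    local file1=$1 file2=$2 ignore=$3
    # -v: keep non-matching lines; -x: whole-line match;
    # -F: treat each line of IGNORE as a fixed string, not a regex.
    diff <(grep -vxFf "$ignore" "$file1") <(grep -vxFf "$ignore" "$file2")
}
```

Called as diff_without f1 f2 f3 on the files from the question, this still reports "I need to wash the dog." as present only in f1, since that line is not listed in f3.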

Question 13

This looks good but the corresponding difference in f1 should also be removed.

Question 14

diff might not be the right tool. It sounds like you need to use comm, which classifies each line as being in one file, the other file, or common to both.

The key limitation, though, is that comm requires both input files to be sorted.
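As a rough sketch of the comm approach, assuming line order does not matter for the comparison: sort both files on the fly, keep only the lines unique to the second file, and then drop the ones listed in the ignore file. The function name only_in_second is made up for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print lines that occur only in FILE2, excluding
# any line listed in IGNORE. The original line order is lost by sorting.
only_in_second() {
    local file1=$1 file2=$2 ignore=$3
    # comm -13 suppresses columns 1 and 3, leaving lines unique to FILE2.
    # LC_ALL=C keeps sort and comm in agreement about collation order.
    LC_ALL=C comm -13 <(LC_ALL=C sort "$file1") <(LC_ALL=C sort "$file2") |
        grep -vxFf "$ignore"
}
```

The function's exit status is that of the final grep, so it is non-zero when nothing is printed. Lines unique to the first file can be listed the same way with comm -23.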

Accepted Answer (the grep, perl and sed workaround given above under Question 5) · muru · 2015-04-09 11:26:48Z
