For example:
file1.txt:
I need to buy apples.
I need to run the laundry.
I need to wash the dog.
I need to get the car detailed.
file2.txt:
I need to buy apples.
I need to run the laundry.
I need to wash the car.
I need to get the car detailed.
file3.txt:
I need to wash the car.
If I do diff file1.txt file2.txt, the statements present in file3.txt should be ignored by the diff command if they are also present in file2.txt. So, in this case, there should be no difference.
Using the ignore flag (diff -I "$line") won't help, since it finds the pattern in both files.
How can I do this?
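To make the -I problem concrete, here is a minimal sketch, assuming GNU diff and the sample files above recreated in a scratch directory (the directory is illustrative). -I only suppresses a hunk when every changed line in it matches the pattern, and the file1.txt side of this hunk (the "dog" line) does not:

```shell
#!/usr/bin/env bash
# Recreate the sample files in a scratch directory (paths are illustrative).
dir=$(mktemp -d)
printf '%s\n' 'I need to buy apples.' 'I need to run the laundry.' \
    'I need to wash the dog.' 'I need to get the car detailed.' > "$dir/file1.txt"
printf '%s\n' 'I need to buy apples.' 'I need to run the laundry.' \
    'I need to wash the car.' 'I need to get the car detailed.' > "$dir/file2.txt"
printf '%s\n' 'I need to wash the car.' > "$dir/file3.txt"

# Use each line of file3.txt as the -I pattern. The hunk survives because
# only its file2.txt side matches; the "dog" line on the other side does not.
while IFS= read -r line; do
    if diff -I "$line" "$dir/file1.txt" "$dir/file2.txt"; then
        echo "no differences reported"
    else
        echo "diff still reports the hunk"
    fi
done < "$dir/file3.txt"
```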
4 Answers
A workaround would be to strip out the corresponding lines and then diff the results. That is to say, both file1 and file2 would look like:
I need to buy apples.
I need to run the laundry.
I need to get the car detailed.
You can do this using a combination of grep, perl and sed:
$ lines_to_ignore=$(grep -nFf file3 file2 | perl -pe 's|^(\d+):.*|$1s/.//g;|')
$ echo $lines_to_ignore
3s/.//g;
$ diff <(sed "$lines_to_ignore" file1) <(sed "$lines_to_ignore" file2)
$ echo $?
0
- I use grep to get the matching lines (along with line numbers) in file2.
- Then I use perl to get the line numbers from the grep output and make sed commands from them (Ns/.//g deletes every character on line N).
- Then I use process substitution to feed the result of sed running these commands on the files to diff.
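The three steps above can be wrapped into one function. This is only a sketch: the name diff_ignoring is made up, perl is replaced by a second sed to keep the dependencies down, and grep gets an extra -x so that only whole-line matches count (the original command also matches substrings).

```shell
#!/usr/bin/env bash
# diff_ignoring FILE1 FILE2 IGNORE   (hypothetical helper name)
# Blanks, in both files, every line number at which FILE2 holds a line
# listed in IGNORE, then diffs the results.
diff_ignoring() {
    local file1=$1 file2=$2 ignore=$3 script
    # grep -n prefixes each match with "N:"; -x and -F ask for a literal,
    # whole-line match. The sed stage rewrites "N:text" into "Ns/.*//".
    script=$(grep -nxFf "$ignore" "$file2" | sed 's|^\([0-9]*\):.*|\1s/.*//|')
    # Apply the same blanking commands to both inputs and compare.
    diff <(sed "$script" "$file1") <(sed "$script" "$file2")
}
```

With the sample files, diff_ignoring file1.txt file2.txt file3.txt should print nothing and return 0, while a difference on a line that file3.txt does not list is still reported.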
You could combine diff and combine (from moreutils) here:
$ diff file1.txt <(combine file2.txt NOT file3.txt)
3d2
< I need to wash the dog.
Updated to reflect changes in OP.
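combine ships with moreutils, which may not be installed everywhere. A rough stand-in for combine file2.txt NOT file3.txt can be sketched with awk, assuming only literal, whole-line matches matter; the function name lines_not_in is made up for this example.

```shell
#!/usr/bin/env bash
# lines_not_in KEEP_FROM EXCLUDE   (hypothetical helper name)
# Prints the lines of KEEP_FROM that do not appear verbatim in EXCLUDE,
# keeping them in their original order.
lines_not_in() {
    awk 'FILENAME == ARGV[1] { skip[$0] = 1; next }  # EXCLUDE: remember each line
         !($0 in skip)                               # KEEP_FROM: print the rest
        ' "$2" "$1"
}
```

It slots into the same place as combine: diff file1.txt <(lines_not_in file2.txt file3.txt).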
- "So, in this case, there should be no difference." - from question. – muru, Apr 9, 2015 at 11:04
- "the statements present in file3.txt should be ignored by diff command if it is present in file2.txt. So, in this case, there should be no difference." – FloHimself, Apr 9, 2015 at 11:09
- And how do you interpret that? – muru, Apr 9, 2015 at 11:10
- Sorry, question edited. The difference is only in the third line. – Menon, Apr 9, 2015 at 11:21
- This won't work then. How would you decide which entry corresponds to another changed entry? Are the line numbers fixed like in @muru's example? – FloHimself, Apr 9, 2015 at 11:34
Use grep options to filter the lines out of the files:
$ diff f1 f2
3c3
< I need to wash the dog.
---
> I need to wash the car.
$ diff <( grep -v -f f3 -x f1) <( grep -v -f f3 -x f2)
3d2
< I need to wash the dog.
where
- <( ) is bash syntax (process substitution) that presents a command's output as a temporary file
- in grep:
  - -x matches whole lines only
  - -f f3 takes the patterns from file f3
  - -v shows the lines that do not match
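A small sketch of what -x changes, using made-up sample lines (the file names and contents below are illustrative, not taken from the question). The -F flag is an addition that the command above does not use: it makes grep read each pattern as a literal string, so the "." that ends a sentence is not treated as a regex wildcard.

```shell
#!/usr/bin/env bash
dir=$(mktemp -d)
printf '%s\n' 'I need to wash the car.' > "$dir/patterns"
printf '%s\n' 'I need to wash the car.' \
              'I need to wash the car. Twice.' \
              'I need to wash the cart' > "$dir/todo"

# Without -x a pattern may match anywhere in a line, and without -F the
# trailing "." matches any character, so all three lines are dropped.
loose=$(grep -v -f "$dir/patterns" "$dir/todo") || true

# With -x and -F only the exact, literal line is dropped.
strict=$(grep -v -x -F -f "$dir/patterns" "$dir/todo") || true

printf 'loose:\n%s\nstrict:\n%s\n' "$loose" "$strict"
```

Whether the looser matching is acceptable depends on the data; for to-do style sentences like the ones in the question, the stricter form is probably the safer default.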
- This looks good but the corresponding difference in f1 should also be removed. – Menon, Apr 9, 2015 at 12:03
diff might not be the right tool. It sounds like you need to use comm, which classifies each line as being in one file, the other file, or common to both.
The key limitation, though, is that comm requires both input files to be sorted.
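One way to satisfy the sorting requirement is to sort on the fly. The sketch below assumes bash (for process substitution) and uses a made-up function name, classify_lines; it pins the locale so that sort and comm agree on the ordering.

```shell
#!/usr/bin/env bash
# classify_lines FILE_A FILE_B   (hypothetical helper name)
# Prints comm's three tab-indented columns:
#   column 1: lines only in FILE_A
#   column 2: lines only in FILE_B
#   column 3: lines common to both
classify_lines() {
    # Sort and compare under the same locale so comm sees the order it expects.
    LC_ALL=C comm <(LC_ALL=C sort "$1") <(LC_ALL=C sort "$2")
}
```

Called as classify_lines file1.txt file2.txt, this lists the "dog" line in the first column and the "car" line in the second. The trade-off is that sorting discards the original line order, so unlike diff the output cannot say where in a file a line changed.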
if condition, so I thought I would first filter out the line from file2 and the corresponding lines from file1 somehow, and then count the lines of diff output.