7

Problem:

  1. Compare two files,
  2. remove from the first file every line that also appears in the second,
  3. then append the remaining lines of file1 to file2.

Illustration by example

Suppose the two files are test1 and test2.

$ cat test2
www.xyz.com/abc-2
www.xyz.com/abc-3
www.xyz.com/abc-4
www.xyz.com/abc-5
www.xyz.com/abc-6

And test1 is

$ cat test1
www.xyz.com/abc-1
www.xyz.com/abc-2
www.xyz.com/abc-3
www.xyz.com/abc-4
www.xyz.com/abc-5

Compare test1 to test2 and remove the duplicates from test1.

Result Required:

$ cat test1
www.xyz.com/abc-1

and then append this test1 data to test2:

$ cat test2
www.xyz.com/abc-2
www.xyz.com/abc-3
www.xyz.com/abc-4
www.xyz.com/abc-5
www.xyz.com/abc-6
www.xyz.com/abc-1

Solutions Tried:

join -v1 -v2 <(sort test1) <(sort test2)

which produced this (the wrong output):

$ join -v1 -v2 <(sort test1) <(sort test2)
www.xyz.com/abc-1
www.xyz.com/abc-6

Another solution I tried was:

fgrep -vf test1 test2

which produced no output.

Andreas Louv
asked May 28, 2016 at 19:46

4 Answers

13

Remove lines from test1 because they are in test2:

$ grep -vxFf test2 test1
www.xyz.com/abc-1

To overwrite test1:

grep -vxFf test2 test1 >test1.tmp && mv test1.tmp test1

To append the new test1 to the end of test2:

cat test1 >>test2

The grep options

grep normally prints matching lines. -v tells grep to do the reverse: it prints only the lines that do not match.

-x tells grep to do whole-line matches.

-F tells grep that we are using fixed strings, not regular expressions.

-f test2 tells grep to read those fixed strings, one per line, from file test2.
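To see why -x matters here, the following is a minimal sketch using two hypothetical scratch files (known and candidates are illustrative names, not part of the question). It assumes a POSIX shell with mktemp available. Without -x, the fixed string for abc-1 also matches inside abc-10, so a line that is not a true duplicate gets dropped.

```shell
#!/bin/sh
# Hypothetical demo files in a scratch directory (names are illustrative).
dir=$(mktemp -d)
printf '%s\n' www.xyz.com/abc-1 > "$dir/known"
printf '%s\n' www.xyz.com/abc-1 www.xyz.com/abc-10 > "$dir/candidates"

# Without -x, the fixed string abc-1 also matches inside abc-10,
# so both candidate lines are filtered out.
without_x=$(grep -vFf "$dir/known" "$dir/candidates")

# With -x, only exact whole-line matches are filtered out.
with_x=$(grep -vxFf "$dir/known" "$dir/candidates")

printf 'without -x: [%s]\n' "$without_x"
printf 'with -x:    [%s]\n' "$with_x"

rm -r "$dir"
```

Note that grep exits with status 1 when it prints no lines, which is expected in the first call and is not an error here.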

answered May 28, 2016 at 19:59

1 Comment

$ grep -vxFf test2 test1 produces nothing for me. No output.
8

With awk:

% awk 'NR == FNR{ a[$0] = 1;next } !a[$0]' test2 test1
www.xyz.com/abc-1

Breakdown:

NR == FNR {    # Run for test2 only (the first file read)
  a[$0] = 1    # Store the whole line as a key in an associative array
  next         # Skip the next block
}
!a[$0]         # Print the lines from test1 that are not in a
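The awk filter can be combined with the overwrite and append steps from the question. The following is a sketch only: it recreates the sample files in a scratch directory (assuming mktemp is available) so the real test1 and test2 are not touched, and the variable names are illustrative.

```shell
#!/bin/sh
# Recreate the sample data from the question in a scratch directory.
dir=$(mktemp -d)
printf 'www.xyz.com/abc-%s\n' 1 2 3 4 5 > "$dir/test1"
printf 'www.xyz.com/abc-%s\n' 2 3 4 5 6 > "$dir/test2"

# Steps 1 and 2: keep only the lines of test1 that do not occur in test2.
# Write to a temporary file first; redirecting straight back to test1
# would truncate it before awk reads it.
awk 'NR == FNR { seen[$0] = 1; next } !seen[$0]' "$dir/test2" "$dir/test1" \
  > "$dir/test1.tmp" && mv "$dir/test1.tmp" "$dir/test1"

# Step 3: append what is left of test1 to test2.
cat "$dir/test1" >> "$dir/test2"

test1_result=$(cat "$dir/test1")
test2_last=$(tail -n 1 "$dir/test2")
test2_count=$(wc -l < "$dir/test2" | tr -d ' ')

printf 'test1 now holds: %s\n' "$test1_result"
printf 'test2 has %s lines, ending in: %s\n' "$test2_count" "$test2_last"

rm -r "$dir"
```

One caveat of the NR == FNR idiom: if test2 is empty, the condition is also true while reading test1, so nothing is printed.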
answered May 28, 2016 at 20:30


3

Solution to problems 1 and 2:

diff test1 test2 | grep "<" | sed 's/< \+//g' > test1.tmp && mv test1.tmp test1

Here is the output:

$ cat test1
www.xyz.com/abc-1

Solution to problem 3:

cat test1 >> test2

Here is the output:

$ cat test2
www.xyz.com/abc-2
www.xyz.com/abc-3
www.xyz.com/abc-4
www.xyz.com/abc-5
www.xyz.com/abc-6
www.xyz.com/abc-1
answered May 28, 2016 at 21:04

2 Comments

The output of $ cat test1 is < www.xyz.com/abc-1. Why is this < there?
I have tested this in bash; which shell are you using? sed 's/< \+//g' already handles it. Please make sure to keep the files in the same order as in the diff command above.
0

If the lines in each file are unique, as shown in your sample input, and sorted output is acceptable (you are already sorting the input files in your attempted solutions), then this is all you need:

$ sort -u test1 test2
www.xyz.com/abc-1
www.xyz.com/abc-2
www.xyz.com/abc-3
www.xyz.com/abc-4
www.xyz.com/abc-5
www.xyz.com/abc-6

If you need something else, then edit your question to clarify your requirements and provide sample input/output that would cause this to break.
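For readers who do care about line order, here is a hedged sketch contrasting sort -u with an order-preserving union. The awk '!seen[$0]++' idiom is swapped in for illustration and is not part of this answer. It prints each distinct line once, in first-seen order, so reading test2 before test1 reproduces the final test2 from the question. Unlike the step-by-step approach, it does not modify test1. The sample files are recreated in a scratch directory (assuming mktemp is available).

```shell
#!/bin/sh
# Recreate the sample data from the question in a scratch directory.
dir=$(mktemp -d)
printf 'www.xyz.com/abc-%s\n' 1 2 3 4 5 > "$dir/test1"
printf 'www.xyz.com/abc-%s\n' 2 3 4 5 6 > "$dir/test2"

# Sorted union: abc-1 ends up first.
sorted=$(sort -u "$dir/test1" "$dir/test2")

# Order-preserving union: all of test2 first, then the lines of test1
# that have not been seen yet, so abc-1 ends up last.
ordered=$(awk '!seen[$0]++' "$dir/test2" "$dir/test1")

printf 'sort -u:\n%s\n\n' "$sorted"
printf 'order-preserving:\n%s\n' "$ordered"

rm -r "$dir"
```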

answered May 28, 2016 at 22:57

2 Comments

I guess you didn't read the question properly. I want to remove the duplicates from the test1 file and then append what remains to the test2 file.
I read it perfectly but many times people ask for A when they actually want B and your question sounds like you are describing what you think are the steps required to solve a problem, not the problem itself. Why do you care where the lines from each file end up as long as the result is the unique set of lines from both files?
