File comparisons using awk: Match columns

Saturday, December 29, 2012


File3
a
c
e

File4
a 1 
b 2 
c 3 
d 4 
e 5

The one-liner that prints the lines of file4 whose first field matches a line of file3 (file3 has a single column, so each line is also its first field) is:

awk 'FNR==NR{a[$0];next}($1 in a)' file3 file4

and the output is:

a 1 
c 3 
e 5

To print the lines that do not match instead, negate the condition in the command above by adding a !:

awk 'FNR==NR{a[$0];next}!($1 in a)' file3 file4
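Both commands can be tried end to end with a small script. This is only a sketch: it recreates the sample files in a scratch directory made with mktemp, and the variable names dir, matched and unmatched are chosen here for illustration.

```shell
# Recreate the sample files from the post in a scratch directory.
dir=$(mktemp -d)
printf '%s\n' a c e > "$dir/file3"
printf '%s\n' 'a 1' 'b 2' 'c 3' 'd 4' 'e 5' > "$dir/file4"

# Lines of file4 whose first field appears as a line of file3.
matched=$(awk 'FNR==NR{a[$0];next}($1 in a)' "$dir/file3" "$dir/file4")

# Lines of file4 whose first field does not appear in file3.
unmatched=$(awk 'FNR==NR{a[$0];next}!($1 in a)' "$dir/file3" "$dir/file4")

printf 'matched:\n%s\n' "$matched"
printf 'unmatched:\n%s\n' "$unmatched"
```

One thing to watch for: if the first file is empty, FNR==NR stays true while awk reads the second file, so the second file is loaded into the array instead of being compared.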

3 comments:

  1. Can you please explain how it works?

  2. awk 'FNR==NR{a[$0];next}($1 in a)' file3 file4

    FNR is the line number within the current file.
    NR is the running line number across all the input files.
    So the first thing is:
    FNR==NR: this condition is true only while awk is reading the first file. As soon as all the lines of file3 have been processed, FNR is reset to 1 for file4 while NR keeps counting.

    While this condition holds, the array a keeps growing, with $0 (the complete line of file3 here) used as the key. So at the end of file3 the array has every line of file3 as a key.
    next is like continue in C: it tells awk to skip the rest of the program and start processing the next line.

    The rest of the code, ($1 in a), is therefore applied only after all the lines of file3 are done, that is, from the first line of file4 onward. $1 is the first field of the current file4 line.
    ($1 in a) checks whether $1 exists as a key in the array a. If it does, the pattern is true and, since no action is given, awk prints the line.
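The behaviour of the two counters can be seen directly by printing them for every input line. The following is a minimal sketch, assuming a shortened file4 and a scratch directory from mktemp; the variable names dir and counters are illustrative only.

```shell
# Small sample inputs in a scratch directory.
dir=$(mktemp -d)
printf '%s\n' a c e > "$dir/file3"
printf '%s\n' 'a 1' 'b 2' > "$dir/file4"

# Print NR, FNR and the line itself for every input line.
# FNR restarts at 1 when awk moves on to the second file; NR does not.
counters=$(awk '{print NR, FNR, $0}' "$dir/file3" "$dir/file4")
printf '%s\n' "$counters"
```

The first three lines print equal values for NR and FNR (1 1, 2 2, 3 3). The file4 lines print 4 1 and 5 2, which is why FNR==NR is false for them.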

  3. I want to compare two files column-wise in Unix using a shell script.
    file1
    datasrid BMStrid Mersionid country curr
    Met_CCD V14121011081 Recent US USD
    Met_CCD V14121011082 Recent US USD
    Met_CCD V14121011083 Recent GB GDB
    Met_CCD V14121011084 Recent IE GDB
    Met_CCD V14121011085 Recent GB GDB
    Met_CCD V14121011086 Recent AU AUD
    Met_CCD V14121011086 Recent HK HKD
    Met_CCD V14121011087 Recent IE GDB


    file2
    datasrid BMStrid Mersionid country curr
    Met_CCD V14121011081 Recent US USD
    Met_CCD V14121011082 Recent US USD
    Met_CCD V14121011083 Recent GB GDB
    Met_CCD V14121011088 Recent IE GDB
    Met_CCD V14121011085 Recent HK GDB
    Met_CCD V14121011086 Recent AU AUD
    Met_CCD V14121011086 Recent HK HKD
    Met_CCD V14121011087 Recent IE GDB

    Outputfile

    I need to compare file2 with respect to file1.
    A change in any cell should be highlighted in the output file.
    For example, the output file should contain:
    Met_CCD 'V14121011088' Recent IE GDB
    Met_CCD V14121011085 Recent 'HK' GDB
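One possible approach to this request reuses the FNR==NR idiom from the post, but stores file1 by line number instead of by content. This is a sketch under assumptions, not a definitive solution: it assumes the two files are paired line by line, that fields are separated by whitespace, and that only cells present in file2 need quoting. The sample data is shortened, and the names dir, diffs, line, old and q are illustrative.

```shell
# Shortened sample data from the comment, in a scratch directory.
dir=$(mktemp -d)
printf '%s\n' \
    'datasrid BMStrid Mersionid country curr' \
    'Met_CCD V14121011081 Recent US USD' \
    'Met_CCD V14121011084 Recent IE GDB' \
    'Met_CCD V14121011085 Recent GB GDB' > "$dir/file1"
printf '%s\n' \
    'datasrid BMStrid Mersionid country curr' \
    'Met_CCD V14121011081 Recent US USD' \
    'Met_CCD V14121011088 Recent IE GDB' \
    'Met_CCD V14121011085 Recent HK GDB' > "$dir/file2"

# q holds a single quote so that none is needed inside the awk program.
diffs=$(awk -v q="'" '
    FNR==NR { line[FNR] = $0; next }       # first file: remember each line by its number
    {
        n = split(line[FNR], old, " ")     # cells of the same-numbered line in file1
        changed = 0
        for (i = 1; i <= NF; i++)
            # Appending "" forces a string comparison of the two cells.
            if (i > n || ($i "") != (old[i] "")) { $i = q $i q; changed = 1 }
        if (changed) print                 # print only lines with a changed cell
    }
' "$dir/file1" "$dir/file2")
printf '%s\n' "$diffs"
```

Because the lines are paired by position, a line inserted or deleted in the middle of file2 would make every later line look changed. In that case the lines would need to be paired by a key column instead.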
