How to use patch and diff to merge two files and automatically resolve conflicts

Question 1

I have read about diff and patch, but I can't figure out how to apply them to what I need. I guess it's pretty simple, so to show my problem, take these two files:

a.xml

<resources>
 <color name="same_in_b">#AAABBB</color>
 <color name="not_in_b">#AAAAAA</color>
 <color name="in_b_but_different_val">#AAAAAA</color>
 <color name="not_in_b_too">#AAAAAA</color>
</resources>

b.xml

<resources>
 <color name="same_in_b">#AAABBB</color>
 <color name="in_b_but_different_val">#BBBBBB</color>
 <color name="not_in_a">#AAAAAA</color>
</resources>

I want to have an output, which looks like this (order doesn't matter):

<resources>
 <color name="same_in_b">#AAABBB</color>
 <color name="not_in_b">#AAAAAA</color>
 <color name="in_b_but_different_val">#BBBBBB</color>
 <color name="not_in_b_too">#AAAAAA</color>
 <color name="not_in_a">#AAAAAA</color>
</resources>

The merge should contain all lines according to these simple rules:

any line which is only in one of the files
if a line has the same name tag but a different value, take the value from the second

I want to do this inside a bash script, so it does not necessarily have to be done with diff and patch if another program is a better fit.

Question 2

diff can tell you which lines are in one file but not the other, but only on the granularity of entire lines. patch is only suitable for making the same changes to a similar file (perhaps a different version of the same file, or an entirely different file where however the line numbers and surrounding lines for each change are identical to your original file). So no, they are not particularly suitable for this task. You might want to have a look at wdiff but the solution probably requires a custom script. Since your data looks like XML, you might want to look for some XSL tool.

Question 3

Why all the answers with custom scripts? Merging is a standard and complex problem, and there are good tools for it. Don't reinvent the wheel.

Question 4

You don't need patch for this; it's for extracting changes and sending them on without the unchanged part of the file.

The tool for merging two versions of a file is merge, but as @vonbrand wrote, you need the "base" file from which your two versions diverged. To do a merge without it, use diff like this:

diff -DVERSION1 file1.xml file2.xml > merged.xml

It will enclose each set of changes in C-style #ifdef/#ifndef "preprocessor" commands, like this:

#ifndef VERSION1
<stuff only in file1.xml>
#endif
...
#ifdef VERSION1
<stuff only in file2.xml>
#endif

If a line or region differs between the two files, you'll get a "conflict", which looks like this:

#ifndef VERSION1
<version 1>
#else /* VERSION1 */
<version 2>
#endif /* VERSION1 */
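
A small experiment makes it easy to check which marker ends up around which file's lines. This is only a sketch, assuming GNU diff; the sample file contents and the scratch directory variable `demo` are made up for illustration.

```shell
# Hypothetical sample files, created in a scratch directory.
demo=$(mktemp -d)
cat > "$demo/file1.xml" <<'EOF'
<resources>
 <color name="only_in_1">#111111</color>
 <color name="shared">#AAABBB</color>
 <color name="changed">#AAAAAA</color>
 <color name="shared_too">#CCCCCC</color>
</resources>
EOF
cat > "$demo/file2.xml" <<'EOF'
<resources>
 <color name="shared">#AAABBB</color>
 <color name="changed">#BBBBBB</color>
 <color name="shared_too">#CCCCCC</color>
 <color name="only_in_2">#222222</color>
</resources>
EOF

# diff exits with status 1 when the files differ; only a higher status is an error.
diff -DVERSION1 "$demo/file1.xml" "$demo/file2.xml" > "$demo/merged.xml" || [ $? -eq 1 ]

# Show just the preprocessor-style marker lines, with their line numbers.
grep -n '^#' "$demo/merged.xml"
```

With GNU diff, the only_in_1 line should come out under #ifndef VERSION1, the only_in_2 line under #ifdef VERSION1, and the changed line as a single #ifndef/#else/#endif block.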

So save the output in a file, and open it in an editor. Search for any places where #else comes up, and resolve them manually. Then save the file and run it through grep -v to get rid of the remaining #if(n)def and #endif lines:

grep -v '^#if' merged.xml | grep -v '^#endif' > clean.xml

In the future, save the original version of the file. merge can give you much better results with the help of the extra information. (But be careful: merge edits one of the files in-place, unless you use -p. Read the manual).
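
For the case where the original is available, here is a rough sketch of a three-way merge. It uses diff3 -m from diffutils rather than merge, since merge ships with RCS and may not be installed; the file names base.xml, mine.xml and yours.xml and their contents are assumptions made up for the example.

```shell
work=$(mktemp -d)

# base.xml: the (hypothetical) common ancestor of the two versions.
cat > "$work/base.xml" <<'EOF'
<resources>
 <color name="one">#111111</color>
 <color name="two">#222222</color>
 <color name="three">#333333</color>
 <color name="four">#444444</color>
</resources>
EOF

# mine.xml changes the first colour, yours.xml changes the last one.
sed 's/#111111/#AAAAAA/' "$work/base.xml" > "$work/mine.xml"
sed 's/#444444/#BBBBBB/' "$work/base.xml" > "$work/yours.xml"

# Three-way merge written to a new file; the argument order is MINE OLDER YOURS.
# None of the input files is modified.
diff3 -m "$work/mine.xml" "$work/base.xml" "$work/yours.xml" > "$work/merged.xml"
cat "$work/merged.xml"
```

Because the two edits touch different, non-adjacent lines, both should land in merged.xml without conflict markers. With RCS installed, merge -p mine.xml base.xml yours.xml should behave much the same way.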

Question 5

I added something for the case where there is a conflict: sed -e "s/^#else.*$/\/\/ conflict/g"

Question 6

I don't think that's a good idea. As I wrote in my answer, you should be removing the #else lines manually, in the editor during conflict resolution.

Question 7

sdiff (1) - side-by-side merge of file differences

Use the --output option; this will interactively merge any two files. You use simple commands to select a change or edit a change.

You should make sure that the EDITOR environment variable is set. The default editor for commands like "eb" is usually ed, a line editor.

EDITOR=nano sdiff -o merged.txt file1.txt file2.txt

Question 8

I find using vim as the EDITOR better. But this is the best solution, and it comes with the diff command too!

Question 9

merge(1) is probably nearer to what you want, but that requires a common ancestor to your two files.

A (dirty!) way of doing it is:

Get rid of the first and last lines, use grep(1) to exclude them
Smash the results together
sort -u leaves a sorted list, eliminates duplicates
Replace first/last line

Humm... something along the lines:

echo '<resources>'; grep -hv resources file1 file2 | sort -u; echo '</resources>'

might do.

Question 10

This does work in this particular example, but NOT in general: if in_b_but_different_val has a value of #00AABB, sort will put that line on top and erase the second value instead of the first one.
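
One way to sidestep the ordering issue is to key on the name attribute instead of sorting whole lines. The following is only a sketch: it assumes one <color name="..."> element per line with the name as the first double-quoted string, and the function name merge_colors is made up here. It is still line-based text processing, not real XML parsing.

```shell
# merge_colors FILE...: later files override earlier ones for the same name.
# (Hypothetical helper; assumes one <color name="..."> element per line.)
merge_colors() {
    awk -F'"' '
        /<color / {
            # $2 is the text between the first pair of double quotes (the name).
            if (!($2 in line)) order[++n] = $2   # remember first-seen order
            line[$2] = $0                        # a later line replaces an earlier one
            next
        }
        END {
            print "<resources>"
            for (i = 1; i <= n; i++) print line[order[i]]
            print "</resources>"
        }
    ' "$@"
}
```

Called as merge_colors a.xml b.xml, values from b.xml win for names present in both files, and entries keep the order in which their names were first seen.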

Question 11

For the optimal solution in this case you'd have to parse the XML with a real XML parser, not the hacks above, and produce a new merged XML output from that. diff, patch, sort, etc. are all just hacks tailored to "particular examples"; for a general solution they're simply the wrong tools.

Question 12

@alzheimer, whip up something simple to show us...

Question 13

Apparently diff3 works the same way, requiring a common ancestor file. Why is there no simple CLI tool that just merges two files together based on what diff shows?

Question 14

Here is a simple solution that can merge up to 10 files:

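#!/bin/bash
# Print the <color> lines of every argument, tagging each line with the
# index of its file (0-9) right after the first '>'.
strip(){
    i=0
    for f; do
        sed -r '
            /<\/?resources>/ d
            s/>/>'$((i++))'/
        ' "$f"
    done
}
# Sort on the text before the first '>' (the name attribute) and keep one
# line per name, then drop the tag and restore the enclosing element.
strip "$@" | sort -u -k1,1 -t'>' | sed '
    1 s|^|<resources>\n|
    s/>[0-9]/>/
    $ a </resources>
'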

Please note that the argument that comes first takes precedence, so you have to call:

script b.xml a.xml

to get common values kept from b.xml rather than a.xml.

script b.xml a.xml outputs:

<resources>
 <color name="in_b_but_different_val">#BBBBBB</color>
 <color name="not_in_a">#AAAAAA</color>
 <color name="not_in_b">#AAAAAA</color>
 <color name="not_in_b_too">#AAAAAA</color>
 <color name="same_in_b">#AAABBB</color>
</resources>

Question 15

Another horrible hack - could be simplified, but :P

#!/bin/bash
# Collect the name and the full line of every <color> element in a.xml.
i=0
while read line
do
    if [ "${line:0:13}" == '<color name="' ]
    then
        a_keys[$i]="${line:13}"
        a_keys[$i]="${a_keys[$i]%%\"*}"
        a_values[$i]="$line"
        i=$((i+1))
    fi
done < a.xml
# Same for b.xml.
i=0
while read line
do
    if [ "${line:0:13}" == '<color name="' ]
    then
        b_keys[$i]="${line:13}"
        b_keys[$i]="${b_keys[$i]%%\"*}"
        b_values[$i]="$line"
        i=$((i+1))
    fi
done < b.xml
echo "<resources>"
# Print the lines of a.xml whose name does not occur in b.xml ...
i=0
for akey in "${a_keys[@]}"
do
    print=1
    for bkey in "${b_keys[@]}"
    do
        if [ "$akey" == "$bkey" ]
        then
            print=0
            break
        fi
    done
    if [ $print == 1 ]
    then
        echo " ${a_values[$i]}"
    fi
    i=$(($i+1))
done
# ... followed by every line of b.xml.
for value in "${b_values[@]}"
do
    echo " $value"
done
echo "</resources>"

Question 16

OK, second try, now in Perl (not production quality, no checking!):

#!/usr/bin/perl
# Collect name => value pairs from a.xml, then from b.xml. b.xml is read
# last, so its value wins when a name appears in both files.
open(A, "a.xml");
while(<A>) {
    next if(m;^\<resources\>$;);
    next if(m;^\<\/resources\>$;);
    ($name, $value) = m;^\s*\<color\s+name\s*\=\s*\"([^"]+)\"\>([^<]+)\<\/color\>$;;
    $nv{$name} = $value if $name;
}
close(A);
open(B, "b.xml");
while(<B>) {
    next if(m;^\<resources\>$;);
    next if(m;^\<\/resources\>$;);
    ($name, $value) = m;^\s*\<color\s+name\s*\=\s*\"([^"]+)\"\>([^<]+)\<\/color\>$;;
    $nv{$name} = $value if $name;
}
close(B);
print "<resources>\n";
# Hash order is arbitrary; that is fine because the output order does not matter.
foreach (keys(%nv)) {
    print " <color name=\"$_\">$nv{$_}</color>\n";
}
print "</resources>\n";

Question 17

Another one, using cut and grep... (takes a.xml b.xml as arguments)

#!/bin/bash
zap='"('"`grep '<color' "$2" | cut -d '"' -f 2 | tr '\n' '|'`"'")'
echo "<resources>"
grep '<color' "$1" | grep -E -v "$zap"
grep '<color' "$2"
echo "</resources>"

Question 18

echo is the default action, so xargs echo is superfluous. Why don't you simply tr '\n' '|' anyway?

Question 19

Good point - it's just a quick hack. I'll edit it.

Question 20

you can also use join:

JOIN(1) User Commands JOIN(1)
NAME
 join - join lines of two files on a common field
SYNOPSIS
 join [OPTION]... FILE1 FILE2
DESCRIPTION
 For each pair of input lines with identical join fields, write a line to
 standard output. The default join field is the first, delimited by blanks.

I found it here: https://stackoverflow.com/questions/10364455/merge-two-files-by-key-if-exists-in-the-first-file-bash-script

Question 21

Could you explain how join would be used in this particular case?

Question 22

So, "using join" may be a correct answer, but it's useless unless one knew how to apply join to this particular issue. The join utility crucially does not read XML, for example.
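
As a rough sketch of how join could be made to fit: first flatten each file into "name value" pairs, then do a full outer join on the name and prefer the second file's value. The helper names kv and join_merge are made up, and the sketch assumes one <color> element per line and no spaces in names or values. The output comes out sorted by name, which is acceptable here since order doesn't matter.

```shell
# kv FILE: print "name value" pairs sorted by name (hypothetical helper).
kv() {
    sed -n 's/.*<color name="\([^"]*\)">\([^<]*\)<\/color>.*/\1 \2/p' "$1" |
        LC_ALL=C sort -k1,1
}

# join_merge A B: full outer join on the name; values from B win.
join_merge() {
    ta=$(mktemp) && tb=$(mktemp) || return 1
    kv "$1" > "$ta"
    kv "$2" > "$tb"
    echo '<resources>'
    # -a1 -a2 keep unpaired lines from both inputs; -e fills a missing value
    # with the placeholder NONE; -o selects name, value from A, value from B.
    LC_ALL=C join -a1 -a2 -e NONE -o 0,1.2,2.2 "$ta" "$tb" |
        while read -r name a b; do
            if [ "$b" = NONE ]; then b=$a; fi   # name only in A: keep its value
            printf ' <color name="%s">%s</color>\n' "$name" "$b"
        done
    echo '</resources>'
    rm -f "$ta" "$tb"
}
```

It would be called as join_merge a.xml b.xml. The placeholder NONE is arbitrary and would clash with a value that is literally NONE.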

Question 23

The problem / requirements in the question you linked to are significantly different from those in this question.

Accepted answer: the one by alexis under Question 4 above (2013-02-02 12:23:50Z).
