I want to diff two sets of mod_rewrite rules. The two sets of lines are about 90% identical, but the order is so different that diff basically reports them as completely different.
How can I see which lines are truly different between two files, regardless of their line number?
sort can be used to get the files into the same order so that diff can compare them and identify the differences. If your shell supports process substitution, you can use that and avoid creating new sorted files.
diff <(sort file1) <(sort file2)
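If your shell lacks process substitution, the same comparison can be done with temporary sorted copies. The following is only a minimal sketch: the function name diff_sorted is made up for illustration, and it assumes mktemp is available (it is not part of POSIX, though most systems ship it).

```shell
#!/bin/sh
# Hypothetical helper: compare two files while ignoring line order,
# using temporary sorted copies instead of process substitution.
# Assumes mktemp exists on the system.
diff_sorted() {
    tmp1=$(mktemp) || return 2
    tmp2=$(mktemp) || { rm -f "$tmp1"; return 2; }

    # Sort each input into its own temporary file.
    sort "$1" > "$tmp1"
    sort "$2" > "$tmp2"

    # diff exits 0 if the sorted files match, 1 if they differ.
    diff "$tmp1" "$tmp2"
    status=$?

    # Clean up and pass diff's exit status back to the caller.
    rm -f "$tmp1" "$tmp2"
    return "$status"
}
```

Calling diff_sorted file1 file2 should then report only the lines that exist in one file but not the other, at the cost of losing the original line order in the output.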
I made a script for this which keeps the line sequence intact. Here's an annotated version of the important lines:
# Strip all context lines
diff_lines="$(grep '^[><+-] ' | sed 's/^+/>/;s/^-/</')" || exit 0
# For each line, count the number of lines with the same content in the
# "left" and "right" diffs. If the numbers are not the same, then the line
# was either not moved or it's not obvious where it was moved, so the line
# is printed.
while IFS= read -r line
do
    # Drop the two-character marker ("< " or "> ") to get the line content.
    contents="${line:2}"
    count_removes="$(grep -cFxe "< $contents" <<< "$diff_lines" || true)"
    count_adds="$(grep -cFxe "> $contents" <<< "$diff_lines" || true)"
    if [[ "$count_removes" -eq "$count_adds" ]]
    then
        # Line has been moved; skip it.
        continue
    fi
    echo "$line"
done <<< "$diff_lines"
# If read left a final partial line in $line, print it as data rather
# than as a printf format string.
if [ "${line+defined}" = defined ]
then
    printf '%s' "$line"
fi
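To make the counting step concrete, here is a small worked sketch on made-up diff output. The sample data and the helper name count_marker are assumptions for illustration only and are not part of the script above.

```shell
#!/bin/sh
# Sample diff output (made up): "foo" was moved, "bar" was removed,
# and "baz" was added.
diff_lines='< foo
< bar
> baz
> foo'

# Hypothetical helper: count the lines in $diff_lines that are exactly
# "<marker> <content>". grep -F treats the pattern as a fixed string,
# -x requires a whole-line match, and -c prints the match count.
count_marker() {
    printf '%s\n' "$diff_lines" | grep -cFxe "$1 $2" || true
}

# "foo" appears once on each side, so it counts as moved and is skipped.
foo_removes=$(count_marker '<' foo)
foo_adds=$(count_marker '>' foo)

# "bar" appears only on the left side, so it counts as a real difference.
bar_removes=$(count_marker '<' bar)
bar_adds=$(count_marker '>' bar)
```

Equal counts on both sides mean the line was only moved. Unequal counts mean it was added or removed, or moved somewhere that cannot be determined.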
My open-source Linux tool 'dif' compares files while ignoring various differences.
It has many options for sorting, ignoring timestamps, whitespace, or comments, doing search/replace, ignoring lines matching a regex, etc.
After preprocessing the input files, it runs the Linux tools meld, gvimdiff, tkdiff, diff, or kompare on these intermediate files.
Installation is not required; just download and run the 'dif' executable from https://github.com/koknat/dif
For your use case, try the "sort" option:
dif file1 file2 -sort
Sort first.