4

One of the features in our project is a comparison algorithm that takes two versions of a text and reports the percentage change between them. While researching, I came across the google java-diff-utils project.

Has anyone used java-diff-utils for comparing text? With this utility I can get a list of "deltas", which I assume I can use to compute the percentage difference between two versions of the text. Is this the correct way of doing it?

If you have implemented a text comparison algorithm in Java, could you give me some pointers?

gnat
asked Apr 13, 2012 at 14:18

1 Answer

1

What does "the % of difference" mean? If you start with a block of text and replace the characters in every other word with "q"s, has it changed by 50%? If every other word is replaced with a single "q", has it changed by more than 50%? How much more?
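To make the ambiguity concrete, here is a minimal sketch in plain Java (it does not use java-diff-utils). The class `ChangeMetrics` and its two metrics are hypothetical names chosen for illustration: one counts changed word positions, the other normalizes the character-level Levenshtein distance by the longer text. Both are just two of many possible definitions, and they give different answers for the same edit.

```java
// Hypothetical helper for illustration only; not part of any library.
class ChangeMetrics {

    // Character-level Levenshtein distance (insert, delete, substitute all cost 1).
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Assumed definition 1: edit distance divided by the longer text's length.
    static double charChange(String a, String b) {
        int longest = Math.max(a.length(), b.length());
        return longest == 0 ? 0.0 : (double) levenshtein(a, b) / longest;
    }

    // Assumed definition 2: fraction of word positions whose word differs.
    // This is a naive positional comparison, not a real diff.
    static double wordChange(String a, String b) {
        String[] wa = a.trim().split("\\s+");
        String[] wb = b.trim().split("\\s+");
        int shortest = Math.min(wa.length, wb.length);
        int longest = Math.max(wa.length, wb.length);
        int changed = longest - shortest; // extra words count as changed
        for (int i = 0; i < shortest; i++) {
            if (!wa[i].equals(wb[i])) {
                changed++;
            }
        }
        return (double) changed / longest;
    }

    public static void main(String[] args) {
        String original = "alpha beta gamma delta";
        String qWords = "alpha qqqq gamma qqqqq"; // every other word overwritten with q's
        String singleQ = "alpha q gamma q";       // every other word replaced by one q

        System.out.println("word-level, q's:      " + wordChange(original, qWords));
        System.out.println("word-level, single q: " + wordChange(original, singleQ));
        System.out.println("char-level, q's:      " + charChange(original, qWords));
        System.out.println("char-level, single q: " + charChange(original, singleQ));
    }
}
```

The word-level metric reports half the words changed for both edits, while the character-level metric reports a smaller fraction, so "the % of difference" depends entirely on which definition you pick.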

I think the problem is too complex to have a single number as the answer.

This is normally handled with three numbers: inserted, deleted, and replaced. But the definition of "replaced" can become problematic.
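As a rough sketch of the counting approach, the following plain-Java example (again not java-diff-utils; `WordDiffCounts` is a hypothetical name) tokenizes both versions on whitespace and uses the longest common subsequence to count inserted and deleted words. It deliberately reports no "replaced" count, since that would need an extra rule for pairing a deletion with a nearby insertion, which is exactly where the definition gets problematic.

```java
// Hypothetical helper for illustration only; not part of any library.
class WordDiffCounts {
    final int inserted;
    final int deleted;
    final int unchanged;

    WordDiffCounts(int inserted, int deleted, int unchanged) {
        this.inserted = inserted;
        this.deleted = deleted;
        this.unchanged = unchanged;
    }

    // Split on whitespace; an empty or blank text yields zero tokens.
    static String[] tokens(String text) {
        String trimmed = text.trim();
        return trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    }

    // Count word-level insertions and deletions via the longest common subsequence.
    static WordDiffCounts compare(String oldText, String newText) {
        String[] a = tokens(oldText);
        String[] b = tokens(newText);
        int[][] lcs = new int[a.length + 1][b.length + 1];
        for (int i = 1; i <= a.length; i++) {
            for (int j = 1; j <= b.length; j++) {
                if (a[i - 1].equals(b[j - 1])) {
                    lcs[i][j] = lcs[i - 1][j - 1] + 1;
                } else {
                    lcs[i][j] = Math.max(lcs[i - 1][j], lcs[i][j - 1]);
                }
            }
        }
        int common = lcs[a.length][b.length];
        // Words of the old text outside the LCS were deleted;
        // words of the new text outside the LCS were inserted.
        return new WordDiffCounts(b.length - common, a.length - common, common);
    }

    public static void main(String[] args) {
        WordDiffCounts c = compare("the quick brown fox", "the slow brown fox jumps");
        System.out.println("inserted=" + c.inserted
                + " deleted=" + c.deleted
                + " unchanged=" + c.unchanged);
    }
}
```

Whether "quick" becoming "slow" should count as one replacement or as one deletion plus one insertion is a policy decision layered on top of these counts.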

answered Apr 19, 2012 at 13:00
1
  • There are ways to define meaningful metrics on this but you’re right that it’s certainly not trivial. Commented Apr 19, 2012 at 14:32
