WordDiff

Diff is generally designed for source code, which is a line-by-line format. HTML, which may put a whole paragraph on one line, makes the diff output less useful when only small changes are made.

I suggest that wdiff (GNU word diff?) be an optional view when it's not obvious what the changes were.

You could even get fancy and try to select the appropriate diff view depending on how much has changed.

Note this only affects presentation; RCS will still store files using line-based diffs.

-- MathewBoorman - 24 Apr 2000
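
One way the "select the appropriate diff view" idea might look, as a rough sketch only. The function name `choose_diff_view`, the view names "word" and "line", and the 0.5 threshold are all assumptions made for illustration; nothing here is taken from TWiki. The heuristic: if a changed line still shares most of its words with the line that replaced it, a word view is probably the more readable one.

```python
import difflib


def choose_diff_view(old_text, new_text, threshold=0.5):
    """Return "word" or "line" (hypothetical view names).

    threshold is an assumed cut-off on word-level similarity,
    not a value taken from TWiki or wdiff.
    """
    old_lines = old_text.splitlines()
    new_lines = new_text.splitlines()
    matcher = difflib.SequenceMatcher(None, old_lines, new_lines)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "replace":
            continue
        # Compare each replaced line with its counterpart, word by word.
        for old_line, new_line in zip(old_lines[i1:i2], new_lines[j1:j2]):
            similarity = difflib.SequenceMatcher(
                None, old_line.split(), new_line.split()
            ).ratio()
            if similarity >= threshold:
                return "word"
    return "line"
```

Pure insertions or deletions of whole lines fall through to "line", since a line-based view already shows those clearly.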

Interesting idea. Got an algorithm to do it?

diff works on a line (\n) basis, so maybe break paragraphs down into sentence units and diff on those. I have a feeling it might require more detailed knowledge of the content structure by the TWiki parser.

-- NicholasLee - 24 Apr 2000

Interesting idea. Nicholas' algorithm would be an easy compromise: do a regular diff after breaking up all paragraphs into single sentences ( [\.\?\:] delimiter ). That is what WinDiff is doing, with acceptable results. (WinDiff is the GUI diff tool that comes with Micro$oft Visual Studio.)

-- PeterThoeny - 24 Apr 2000
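
A minimal sketch of the sentence-splitting approach discussed above, in Python rather than TWiki's Perl. The names `split_sentences` and `sentence_diff` are made up for this example, and it assumes a delimiter only ends a sentence when whitespace follows it, which is a guess at what is wanted rather than anything WinDiff documents.

```python
import difflib
import re

# Split after '.', '?' or ':' when followed by whitespace
# (the delimiter set suggested above; the whitespace rule is an assumption).
SENTENCE_END = re.compile(r"(?<=[.?:])\s+")


def split_sentences(paragraph):
    """Break one paragraph into sentence units, keeping the delimiter."""
    return [s for s in SENTENCE_END.split(paragraph.strip()) if s]


def sentence_diff(old_text, new_text):
    """Run an ordinary line-style diff over sentence units."""
    old_units = split_sentences(old_text)
    new_units = split_sentences(new_text)
    return list(difflib.unified_diff(old_units, new_units, lineterm="", n=1))


if __name__ == "__main__":
    old = "Diff works on lines. HTML puts a paragraph on one line. That hides small changes."
    new = "Diff works on lines. HTML often puts a paragraph on one line. That hides small changes."
    print("\n".join(sentence_diff(old, new)))
```

Only the changed sentence shows up as a -/+ pair; the rest of the paragraph is reported as context. Abbreviations such as "e.g. " would be split too, so this is a compromise, not a parser.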

From the GNU site: http://www.gnu.org/software/wdiff/wdiff.html

=The program wdiff is a front end to diff for comparing files on a word per word basis. A word is anything between whitespace. This is useful for comparing two texts in which a few words have been changed and for which paragraphs have been refilled. It works by creating two temporary files, one word per line, and then executes diff on these files. It collects the diff output and uses it to produce a nicer display of word differences between the original files.=

-- MathewBoorman - 24 Apr 2000
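
The technique described in that quote (one word per line, then an ordinary diff) can be sketched in a few lines of Python. This is not wdiff itself: it swaps in Python's `difflib.SequenceMatcher` for the temporary files and the external diff, and the function name `word_diff` is hypothetical. The `[-...-]` and `{+...+}` markers are the wdiff defaults.

```python
import difflib


def word_diff(old_text, new_text):
    """Compare two texts word by word, marking changes wdiff-style."""
    # "A word is anything between whitespace."
    old_words = old_text.split()
    new_words = new_text.split()
    matcher = difflib.SequenceMatcher(None, old_words, new_words, autojunk=False)
    pieces = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            pieces.append(" ".join(old_words[i1:i2]))
        if tag in ("delete", "replace"):
            pieces.append("[-" + " ".join(old_words[i1:i2]) + "-]")
        if tag in ("insert", "replace"):
            pieces.append("{+" + " ".join(new_words[j1:j2]) + "+}")
    return " ".join(pieces)


if __name__ == "__main__":
    print(word_diff("the quick brown fox", "the quick red fox"))
    # prints: the quick [-brown-] {+red+} fox
```

Unlike the real program, this sketch collapses all whitespace to single spaces, so refilled paragraphs come out on one line.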

I once made a modification to cvsweb that did exactly this. You can also use strike-through and such to show additions/deletions. If I ever get some time, I'll try and add it to TWiki.

-- MattQuail - 20 Jul 2000

I had a go at hacking this in, but didn't get it finished. I suspect the best way to do this is to post-process the output of line-by-line diff, using wdiff to highlight the actual words changed.

See DiffsHardToRecognize for another discussion of this issue.

-- RichardDonkin - 01 Aug 2001

A long time ago I found this somewhere on the web...

It might ease the requirement for the wdiff binary. I don't know if there will be a performance issue, though.

  • HtmlDiff.pl: A pure perl solution not requiring wdiff

-- JornH@personNOSPAMPLEASENOSPAM.dk aka TWikiGuest - 20 Sep 2001

The GNU wdiff supports:

 --start-delete argument
        Has the same effect as -w.
 -w argument
        Use argument as the "start delete" string. This
        string will be output prior to every sequence of
        deleted text, to mark where it starts. By default,
        no start delete string is used unless there is no
        other means of distinguishing where such text
        starts; in this case the default start delete
        string is [-.
 --end-delete argument
        Has the same effect as -x.
 -x argument
        Use argument as the "end delete" string. This
        string will be output after every sequence of
        deleted text, to mark where it ends. By default,
        no end delete string is used unless there is no
        other means of distinguishing where such text ends;
        in this case the default end delete string is -].

and

 --start-insert argument
        Has the same effect as -y.
 -y argument
        Use argument as the "start insert" string. This
        string will be output prior to any sequence of
        inserted text, to mark where it starts. By
        default, no start insert string is used unless
        there is no other means of distinguishing where
        such text starts; in this case the default start
        insert string is {+.
 --end-insert argument
        Has the same effect as -z.
 -z argument
        Use argument as the "end insert" string. This
        string will be output after any sequence of
        inserted text, to mark where it ends. By default,
        no end insert string is used unless there is no
        other means of distinguishing where such text ends;
        in this case the default end insert string is +}.

so couldn't you extract the file versions and simply call wdiff with -w <s> and -x </s> to get struck-through deleted text, and use some other font/color for inserted text? A change would be a deletion followed by an insertion. I tried it on the edited output of a man page and it worked fine.

-- JohnRouillard - 08 Dec 2001
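
A sketch of how that wdiff call might be assembled from a script, using only the four options quoted above. The function names are hypothetical, `<ins>...</ins>` is an assumed stand-in for "some other font/color", and the exit-status handling assumes wdiff follows diff's convention (0 = same, 1 = differences), which is worth checking against the local man page.

```python
import shutil
import subprocess


def wdiff_html_command(old_path, new_path):
    """Build the argument list for wdiff with HTML-ish markers."""
    return [
        "wdiff",
        "-w", "<s>", "-x", "</s>",      # start/end delete strings
        "-y", "<ins>", "-z", "</ins>",  # start/end insert strings (assumed tag)
        old_path, new_path,
    ]


def run_wdiff_html(old_path, new_path):
    """Run wdiff if it is installed; return its output, or None if missing."""
    if shutil.which("wdiff") is None:
        return None
    result = subprocess.run(
        wdiff_html_command(old_path, new_path),
        capture_output=True,
        text=True,
    )
    # Assumption: like diff, exit status 1 only means "differences found".
    if result.returncode not in (0, 1):
        raise RuntimeError(result.stderr)
    return result.stdout
```

Passing the arguments as a list avoids shell quoting problems with `<` and `>`. The topic text itself is not HTML-escaped here, so any markup already in the two versions would pass straight through.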

Unless you are willing to make a fairly significant change to the TWiki code, it's simplest to run a line-based diff first, then run wdiff on every pair of changed lines (the only effect of wdiff is to change the highlighting, i.e. the output will still show some unchanged lines for context). Of course, this would be rather slow, so it might be better to do just a word-based diff instead, or to build a Perl function that does the word diff, which might be more efficient since it avoids running wdiff for every set of changes.

-- RichardDonkin - 12 Dec 2001
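
The two-stage idea above (line-based diff first, word highlighting only inside changed line pairs) could be prototyped as follows. This is an illustrative Python sketch, not TWiki code and not the proposed Perl function; `mark_words` and `line_then_word_diff` are made-up names, and the "  ", "! ", "- ", "+ " prefixes are an arbitrary choice for the example. Doing the word step in-process avoids starting wdiff once per changed block.

```python
import difflib


def mark_words(old_line, new_line):
    """Highlight changed words inside one pair of lines, wdiff-style."""
    old_words, new_words = old_line.split(), new_line.split()
    out = []
    matcher = difflib.SequenceMatcher(None, old_words, new_words, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            out.extend(old_words[i1:i2])
        if tag in ("delete", "replace"):
            out.append("[-" + " ".join(old_words[i1:i2]) + "-]")
        if tag in ("insert", "replace"):
            out.append("{+" + " ".join(new_words[j1:j2]) + "+}")
    return " ".join(out)


def line_then_word_diff(old_text, new_text):
    """Line diff first; word-level marks only where lines were replaced."""
    old_lines, new_lines = old_text.splitlines(), new_text.splitlines()
    result = []
    matcher = difflib.SequenceMatcher(None, old_lines, new_lines, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            result.extend("  " + line for line in old_lines[i1:i2])
        elif tag == "replace" and (i2 - i1) == (j2 - j1):
            # Same number of lines on both sides: pair them up one to one.
            pairs = zip(old_lines[i1:i2], new_lines[j1:j2])
            result.extend("! " + mark_words(o, n) for o, n in pairs)
        else:
            # Pure insert/delete, or an uneven replace: plain line output.
            result.extend("- " + line for line in old_lines[i1:i2])
            result.extend("+ " + line for line in new_lines[j1:j2])
    return result
```

For example, changing "brown" to "red" in the middle line of a three-line text yields one "! the quick [-brown-] {+red+} fox" line between two unchanged context lines.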

See also DiffsHardToRecognize

-- WolfgangSlany - 31 Dec 2003

I'm going to defer this until Dakar - unless someone wants to play

-- SvenDowideit - 09 May 2004

Topic attachments
Attachment | Size | Date | Who | Comment
HtmlDiff.pl (Perl source code, r1) | 5.6 K | 20 Sep 2001 - 22:28 | TWikiGuest | A pure perl solution not requiring wdiff
Topic revision: r17 - 29 Apr 2006 - SamHasler
Copyright © 1999-2026 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.