I'm using SVNKit to get diff information between two revisions. The diff utility generates a diff file, but I still need to parse it into numbers.
I implemented a solution, but it is rather slow. JGit does something similar, but it parses the values itself and returns an object rather than an output stream, and it is much faster. I was unable to determine how to leverage that for SVNKit, so I attempted the following solution:
    private Diff compareRevisions(final SVNRevision rev1, final SVNRevision rev2) throws SVNException {
        final Diff diff = new Diff();
        try (final ByteArrayOutputStream baos = new ByteArrayOutputStream()) {
            doDiff(rev1, rev2, baos);
            int filesChanged = 0;
            int additions = 0;
            int deletions = 0;
            final String[] lines = baos.toString().split("\n");
            for (final String line : lines) {
                if (line.startsWith("---")) {
                    filesChanged++;
                } else if (line.startsWith("+++")) {
                    // No action needed
                } else if (line.startsWith("+")) {
                    additions++;
                } else if (line.startsWith("-")) {
                    deletions++;
                }
            }
            diff.additions = additions;
            diff.deletions = deletions;
            diff.changedFiles = filesChanged;
            return diff;
        } catch (final IOException e) {
            LOGGER.trace("Could not close stream", e);
            return diff;
        }
    }
I've taken to caching the values in files to save time, but ideally I'd like to speed up the parsing itself. Perhaps I could use external programs?
1 Answer
You need to parse the patch file format correctly. Otherwise the next patch that deletes an SQL comment will confuse your program, as it looks like this:
--- old_file.sql
+++ new_file.sql
@@ -1,1 +1,1 @@
--- SQL comment
+SELECT * FROM table;
Your current code interprets the removed line as a removed file.
The file format is explained here: http://www.gnu.org/software/diffutils/manual/html_node/Detailed-Unified.html
Since there are other people who had the same problem, you could just build on their work instead of writing your own, e.g. https://github.com/thombergs/diffparser.
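If you would rather keep a hand-rolled counter, a minimal sketch of a hunk-aware one could look like the following. It assumes the input is plain unified diff text; the class name `UnifiedDiffStats` and its fields are hypothetical and not part of SVNKit or diffparser. The idea is to read the line counts from each `@@` hunk header and, while inside a hunk, look only at the first character of each line, so a removed `-- SQL comment` line is counted as a deletion rather than as a file header.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not an SVNKit class): counts changes in unified diff text. */
final class UnifiedDiffStats {
    int changedFiles;
    int additions;
    int deletions;

    // Hunk header such as "@@ -1,3 +1,4 @@"; an omitted length means 1.
    private static final Pattern HUNK =
            Pattern.compile("^@@ -\\d+(?:,(\\d+))? \\+\\d+(?:,(\\d+))? @@.*");

    static UnifiedDiffStats parse(final Reader in) throws IOException {
        final UnifiedDiffStats stats = new UnifiedDiffStats();
        final BufferedReader reader = new BufferedReader(in);
        int oldLeft = 0; // old-file lines still expected in the current hunk
        int newLeft = 0; // new-file lines still expected in the current hunk
        String line;
        while ((line = reader.readLine()) != null) {
            if (oldLeft > 0 || newLeft > 0) {
                // Inside a hunk only the first character is significant.
                if (line.startsWith("+")) {
                    stats.additions++;
                    newLeft--;
                } else if (line.startsWith("-")) {
                    stats.deletions++;
                    oldLeft--;
                } else if (!line.startsWith("\\")) {
                    // Context line; "\ No newline at end of file" is skipped.
                    oldLeft--;
                    newLeft--;
                }
                continue;
            }
            final Matcher m = HUNK.matcher(line);
            if (m.matches()) {
                oldLeft = m.group(1) == null ? 1 : Integer.parseInt(m.group(1));
                newLeft = m.group(2) == null ? 1 : Integer.parseInt(m.group(2));
            } else if (line.startsWith("--- ")) {
                stats.changedFiles++; // file header seen outside any hunk
            }
        }
        return stats;
    }

    /** Convenience overload for text that is already in memory. */
    static UnifiedDiffStats parse(final String patch) {
        try {
            return parse(new StringReader(patch));
        } catch (final IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In `compareRevisions` this could replace the `split("\n")` loop, for example by passing `baos.toString()` to `UnifiedDiffStats.parse`. Reading line by line also avoids building the intermediate array of lines. Note that this sketch only counts files that have a `---` header, so binary files and property-only changes would need separate handling.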
That does take comments into account, although it still seems a bit slower than my implementation. I'm guessing that since it is Java, it's hard to make it as fast as Perl would be. – Himself12794, Jul 25, 2016 at 14:54