I've been tasked with generating statistics about the history of a Git project, and I need to produce specific numbers and representations for various metrics - things like commits per author, commits-over-time/date histograms, that sort of thing.
The trouble is that I need all this data generated in a format that can be dealt with via a script or similar - the output has to be text, and if I can get the numbers into a Python (or similar) script, so much the better.
My question is this: are there any existing frameworks or projects that provide such an interface? I've seen GitStats, and it does a lot of what I want, but it dumps the results into an HTML structure instead of giving me textual or programmatic representations. Are there (for example) Python bindings for a Git log parser, or even a Git statistics generator that returns a big text dump of data?
I realize it's a very specific need, and I'm willing to do some serious coding to get the precise format I want, but I'd like to think there's a starting point out there somewhere. Ideas?
1 Answer
How about using XML logs instead? You can then parse the XML in Python relatively easily and build your stats.
See this answer for how to get an XML log from Git.
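For a rough idea of what the parsing side could look like, here is a minimal sketch. Note that it swaps the XML log for a plain tab-separated format produced with `git log --pretty=format:%an%x09%ad --date=short`, since author names containing `&` or `<` would need escaping in XML. The function names `read_log` and `commit_stats` are made up for illustration, and it assumes Python 3.7+ with `git` on the PATH.

```python
import subprocess
from collections import Counter

# Assumed log format: one line per commit, "author<TAB>YYYY-MM-DD".
LOG_FORMAT = "%an%x09%ad"


def read_log(repo_path="."):
    """Run `git log` in repo_path and return its raw text output."""
    result = subprocess.run(
        ["git", "log", "--pretty=format:" + LOG_FORMAT, "--date=short"],
        cwd=repo_path,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def commit_stats(log_text):
    """Return (commits per author, commits per day) as two Counters."""
    per_author = Counter()
    per_day = Counter()
    for line in log_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        # Split on the last tab, so the date is always the final field.
        author, _, day = line.rpartition("\t")
        per_author[author] += 1
        per_day[day] += 1
    return per_author, per_day
```

Calling `commit_stats(read_log("."))` inside a repository should give you two plain `Counter` objects, which you can print as text or feed into whatever histogram code you like.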
def create(self, data, path). Is there any reason this wouldn't be good for you?
data is a GitDataCollector instance (a custom class internal to the project), not a dictionary or other Python data structure. Still, it's a great start. Thanks for the pointer!