I'm faced with the task of generating statistics about the history of a Git project, and I need to produce specific numbers and representations for various metrics: commits per author, commits-over-time/date histograms, that sort of thing.

The trouble is that I need all this data generated in a format that can be dealt with via a script or similar - the output has to be text, and if I can get the numbers into a Python (or similar) script, so much the better.

My question is this: are there any existing frameworks or projects that provide such an interface? I've seen GitStats, and it does a lot of what I want, but it dumps the results into an HTML structure instead of giving me a textual or programmatic representation. Are there (for example) Python bindings for a Git log parser, or even a Git statistics generator that returns a big text dump of data?

I realize it's a very specific need, and I'm willing to do some serious coding to get the precise format I want, but I'd like to think there's a starting point out there somewhere. Ideas?

asked Jan 12, 2011 at 5:13
  • It seems like the right approach might be to try to make GitStats produce the output format you want. It happens to already be written in Python, too. There's an HTMLReportCreator in there, ~550 lines of code, but you could just drop in a replacement for that, or possibly even just grab the data structure that it's passed: def create(self, data, path). Is there any reason this wouldn't be good for you? Commented Jan 12, 2011 at 18:24
  • Jefromi: it's certainly possible. I looked at it, and it appears that data is a GitDataCollector instance (a custom class internal to the project), not a dictionary or other Python data structure. Still, it's a great start. Thanks for the pointer! Commented Jan 13, 2011 at 17:05
  • Jefromi: after more consideration, I've started developing my own library, but if you'll post your comment as an answer I'll accept it - it's the thing that got me thinking the most about what's the best solution to this issue. Commented Jan 21, 2011 at 8:45

1 Answer

How about using XML logs instead? You can then parse the XML in Python relatively easily and build your stats.

See this answer for how to get an XML log from Git.

answered Jan 12, 2011 at 5:23

1 Comment

XML logs would be great, but that answer has no info about how to get an XML log out of Git - I believe the answerer there mistook that format string (from another answer) for XML when in fact it's just a personal preferred format from another user: stackoverflow.com/questions/1441156/…
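
Since that format string is just a custom pretty format, one option is to pick a format that is trivial to split and parse it directly in Python. The following is only a rough sketch under a few assumptions: the helper names `read_log` and `collect_stats` are made up for illustration, the tab separator (`%x09`) and `--date=short` are my choice of format rather than anything from the linked answer, and bucketing by month is just one possible histogram.

```python
import subprocess
from collections import Counter

# Assumed format: author name, a tab, then the author date as YYYY-MM-DD.
LOG_FORMAT = "%an%x09%ad"


def read_log(repo_path="."):
    """Return the raw `git log` text for the repository at repo_path."""
    result = subprocess.run(
        ["git", "-C", repo_path, "log", "--date=short",
         "--pretty=format:" + LOG_FORMAT],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def collect_stats(log_text):
    """Count commits per author and per month from tab-separated log text."""
    by_author = Counter()
    by_month = Counter()
    for line in log_text.splitlines():
        if "\t" not in line:
            continue  # skip blank or unexpected lines
        # Split on the last tab so a tab inside an author name is tolerated.
        author, _, date = line.rpartition("\t")
        by_author[author] += 1
        by_month[date[:7]] += 1  # YYYY-MM bucket
    return by_author, by_month
```

Calling `collect_stats(read_log("path/to/repo"))` gives back two `Counter` objects, which can be printed as plain text or passed on to whatever script needs the numbers. Keeping the parsing separate from the `git` call also means it can be run against a saved log dump.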
