\$\begingroup\$

I just finished writing this Python script, which calculates daily additions and deletions from my git log so I can make pretty graphs. It is a rewrite of something I previously wrote in Perl. I have tried to clean it up, but I feel that some of my list comprehensions are messy, to say the least, and that I'm abusing certain Python features.

#!/usr/bin/env python
# get-pmstats.py
# Henry J Schmale
# November 25, 2017
#
# Calculates the additions and deletions per day within a git repository
# by parsing out the git log. It opens the log itself.
# Produces output as a CSV
#
# This segments out certain file wildcards
import subprocess
from datetime import datetime
from fnmatch import fnmatch

def chomp_int(val):
    try:
        return int(val)
    except ValueError:
        return 0

def make_fn_matcher(args):
    return lambda x: fnmatch(''.join(map(str, args[2:])), x)

def print_results(changes_by_date):
    print('date,ins,del')
    for key,vals in changes_by_date.items():
        print(','.join(map(str, [key, vals[0], vals[1]])))

EXCLUDED_WILDCARDS = ['*.eps', '*.CSV', '*jquery*']
changes_by_date = {}
git_log = subprocess.Popen(
    'git log --numstat --pretty="%at"',
    stdout=subprocess.PIPE,
    shell=True)
date = None
day_changes = [0, 0]
for line in git_log.stdout:
    args = line.decode('utf8').rstrip().split()
    if len(args) == 1:
        old_date = date
        date = datetime.fromtimestamp(int(args[0]))
        if day_changes != [0, 0] and date.date() != old_date.date():
            changes_by_date[str(date.date())] = day_changes
            day_changes = [0, 0]
    elif len(args) >= 3:
        # Don't count changesets for excluded file types
        if True in map(make_fn_matcher(args), EXCLUDED_WILDCARDS):
            continue
        day_changes = [sum(x) for x in zip(day_changes, map(chomp_int, args[0:2]))]
print_results(changes_by_date)
Jamal
asked Nov 26, 2017 at 2:29
\$\endgroup\$

1 Answer

\$\begingroup\$

I would apply the following improvements:

  • use collections.defaultdict to count the added and deleted lines per date separately. This simplifies the counting logic and removes the need to track the previous date at all
  • use any() to check for the excluded wildcards
  • unpack args into added lines, deleted lines and a filename
  • switch from subprocess to gitpython
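
To illustrate the any() suggestion in isolation, here is a minimal sketch. The helper name is_excluded is hypothetical (it does not appear in the original script); the wildcard list is copied from the question.

```python
from fnmatch import fnmatch

EXCLUDED_WILDCARDS = ['*.eps', '*.CSV', '*jquery*']


def is_excluded(filename, wildcards=EXCLUDED_WILDCARDS):
    """Return True if filename matches at least one wildcard.

    is_excluded is a hypothetical helper name used only for this sketch.
    """
    # any() stops at the first wildcard that matches.
    return any(fnmatch(filename, wildcard) for wildcard in wildcards)
```

One thing to watch for: fnmatch() normalizes case according to the operating system, so on a case-sensitive platform '*.CSV' will probably not match a file named data.csv. If that matters, listing both spellings is one option.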

The new version of the code:

from collections import defaultdict
from datetime import datetime
from fnmatch import fnmatch

import git

EXCLUDED_WILDCARDS = ['*.eps', '*.CSV', '*jquery*']

def chomp_int(val):
    try:
        return int(val)
    except ValueError:
        return 0

repo = git.Repo(".")
git_log = repo.git.log(numstat=True, pretty="%at").encode("utf-8")
added, deleted = defaultdict(int), defaultdict(int)
for line in git_log.splitlines():
    args = line.decode('utf8').rstrip().split()
    if len(args) == 1:
        date = datetime.fromtimestamp(int(args[0])).date()
    elif len(args) >= 3:
        added_lines, deleted_lines, filename = args
        # Don't count changesets for excluded file types
        if any(fnmatch(filename, wildcard) for wildcard in EXCLUDED_WILDCARDS):
            continue
        added[date] += chomp_int(added_lines)
        deleted[date] += chomp_int(deleted_lines)
for date, added_lines in added.items():
    print(date, added_lines, deleted[date])
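
The original script printed a CSV with a date,ins,del header, while the loop above prints space-separated values. If the CSV format is still wanted, the output step could be sketched with the standard csv module as follows. The function name write_csv and its parameters are assumptions made for this example, not part of either script.

```python
import csv
import sys
from datetime import date


def write_csv(added, deleted, out=sys.stdout):
    """Write one 'date,ins,del' row per day, oldest day first.

    write_csv is a hypothetical helper; added and deleted are mappings
    from a datetime.date to a line count.
    """
    writer = csv.writer(out, lineterminator='\n')
    writer.writerow(['date', 'ins', 'del'])
    # The union of keys is defensive: a day present in only one mapping
    # still gets a row, with 0 for the missing count.
    for day in sorted(set(added) | set(deleted)):
        writer.writerow([day, added.get(day, 0), deleted.get(day, 0)])
```

Calling write_csv(added, deleted) after the counting loop would replace the final for loop. Sorting by date also makes the row order independent of the order in which git log lists the commits.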
answered Nov 26, 2017 at 4:03
\$\endgroup\$
  • \$\begingroup\$ Looks like you have potential for an error on your elif's tuple unpacking \$\endgroup\$ Commented Nov 26, 2017 at 14:40
  • \$\begingroup\$ @Peilonrayz good point. I was thinking to add a *_ there, but experimented with a couple repositories I had cloned and could not find a case where args has more than 3 - not sure if it is possible to get with the git log --numstat --pretty="%at". But, yes, it should be == 3 if I am assuming that. Thanks! \$\endgroup\$ Commented Nov 26, 2017 at 15:30
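
On the unpacking question raised in these comments: as far as I know, the numstat fields are separated by tabs, so a path containing spaces (or a rename shown as old => new) yields more than three tokens when the line is split on arbitrary whitespace. A possible sketch that splits on tabs instead is shown below. The function name parse_numstat_line is hypothetical; chomp_int is copied from the question.

```python
def chomp_int(val):
    """Convert val to int, treating non-numeric fields such as '-' as 0."""
    try:
        return int(val)
    except ValueError:
        return 0


def parse_numstat_line(line):
    """Split one numstat line into (added, deleted, path), or return None.

    parse_numstat_line is a hypothetical helper used only for this sketch.
    """
    # maxsplit=2 keeps everything after the second tab as the path,
    # including any spaces it contains.
    fields = line.rstrip('\n').split('\t', 2)
    if len(fields) != 3:
        return None  # blank line, or a timestamp line from --pretty="%at"
    added_lines, deleted_lines, path = fields
    return chomp_int(added_lines), chomp_int(deleted_lines), path
```

With this approach the unpacking always receives exactly three fields, so neither the >= 3 check nor a *_ catch-all is needed.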
