Code Review

Grajdeanu Alex

### PEP8 - click, read, apply

PEP8 gives coding conventions for the Python code comprising the standard library in the main Python distribution. Applied to your code:

  • you should use 4 spaces per indentation level
  • function names should be lowercase, with words separated by underscores as necessary to improve readability. mixedCase is allowed only in contexts where that's already the prevailing style (e.g. threading.py), to retain backwards compatibility
  • variable names should also stick to the above rule
  • there should be two blank lines between top-level function definitions (methods inside a class are separated by a single blank line)
  • use augmented assignments where possible (url = url + ('/git/trees/%s?recursive=1' % sha) -> url += '/git/trees/%s?recursive=1' % sha)
  • you're sometimes using double-quotes around strings, and sometimes single-quotes. Choose one and stick to that.
  • use str.format() instead of the old % formatting
  • I'd suggest you use the print() function even if you're using Python 2.7 (add from __future__ import print_function at the top of the module). It will make your code easily portable.
  • you're lacking docstrings. Write docstrings for all public modules, functions, classes, and methods. They are not necessary for non-public methods, but you should have a comment that describes what the method does. This comment should appear after the def line.
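
A few of the points above, sketched on a small helper. The name build_tree_url and the example repository URL are made up for illustration and are not part of your code:

```python
# On Python 2.7, put `from __future__ import print_function` at the very
# top of the module so that print() behaves as a function there too.


def build_tree_url(repo_url, sha):
    """Return the URL of the recursive git tree for the given commit sha."""
    # Augmented assignment plus str.format() instead of the % operator.
    repo_url += '/git/trees/{}?recursive=1'.format(sha)
    return repo_url


# Placeholder repository URL and sha, used only to exercise the helper.
print(build_tree_url('https://api.github.com/repos/owner/name', 'abc123'))
```

The same pattern (snake_case name, docstring right after the def line, one quoting style) carries over to the other functions.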

Code:

import base64
import os
import re

import requests


def get_repos(since=0):
    """
    Left as an exercise for OP
    """
    url = 'http://api.github.com/repositories'
    # Send `since` as a query parameter instead of a request body.
    response = requests.get(url, params={'since': since})
    if response.status_code == 403:
        print('Problem making request! {}'.format(response.status_code))
        print(response.headers)
    # The first <...> entry of the Link header holds the next page URL.
    matches = re.match(r'<.+?>', response.headers['Link'])
    next = matches.group(0)[1:-1]
    return response.json(), next


def get_repo(url):
    """
    Left as an exercise for OP
    """
    return requests.get(url).json()


def get_readme(url):
    """
    Left as an exercise for OP
    """
    url += '/readme'
    return requests.get(url).json()


# todo: return array of all commits so we can examine each one
def get_repo_sha(url):
    """
    Left as an exercise for OP
    """
    commits = requests.get(url + '/commits').json()
    return commits[0]['sha']


def get_file_content(item):
    """
    Left as an exercise for OP
    """
    # os.path.splitext() keeps the leading dot, so the entries need it too.
    ignore_extensions = ['.jpg']
    filename, extension = os.path.splitext(item['path'])
    if extension in ignore_extensions:
        return []
    content = requests.get(item['url']).json()
    # A list (not a lazy map object) so it can be sliced on Python 3 as well.
    lines = [base64.b64decode(line) for line in content['content'].split('\n')]
    print('Path: {}'.format(item['path']))
    print('Lines: {}'.format(b''.join(lines[:5])))
    return b''.join(lines)


def get_repo_contents(url, sha):
    """
    Left as an exercise for OP
    """
    url += '/git/trees/{}?recursive=1'.format(sha)
    return requests.get(url).json()

For the second .py file:

  • the above rules apply
  • don't import modules you're not using (import json)

Code:

import github


def process_repo_contents(repo_contents):
    """
    Left as an exercise for OP
    """
    for tree in repo_contents['tree']:
        content_type = tree['type']
        print('content_type --- {}'.format(content_type))
        if content_type == 'blob':
            github.get_file_content(tree)
            print('***blob***')
        elif content_type == 'tree':
            print('***tree***')


if __name__ == '__main__':
    repos, next = github.get_repos()
    for repo in repos[0:10]:
        sha = github.get_repo_sha(repo['url'])
        repo_json = github.get_repo_contents(repo['url'], sha)
        process_repo_contents(repo_json)

Pay more attention to variable names and data types

  • next is a built-in function, so assigning to it shadows the built-in; I'd recommend you rename the variable to something else (e.g. next_url)
  • ignore_extensions is a list in your case, and it only holds one string. Don't you think it would be more appropriate to make it a string from the beginning?
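
Both points could look roughly like this. The names IGNORED_EXTENSION, should_skip and next_url are illustrative and not taken from your code, and the URL is only a placeholder. Note that os.path.splitext() keeps the leading dot in the extension, so the string being compared has to include it:

```python
import os

# A single string, since only one extension is ignored for now.
IGNORED_EXTENSION = '.jpg'


def should_skip(path):
    """Return True if the file at path has the ignored extension."""
    _, extension = os.path.splitext(path)
    return extension == IGNORED_EXTENSION


# A name such as next_url leaves the built-in next() usable.
next_url = 'https://api.github.com/repositories?since=364'  # placeholder value

print(should_skip('images/logo.jpg'), should_skip('README.md'))
```

If more extensions need to be ignored later, a tuple or set of dotted extensions with an `in` test would be the natural next step.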