edited Jun 10, 2020 at 13:24

Community Bot

answered Jan 2, 2017 at 7:49

Grajdeanu Alex

### PEP8 - click, read, apply

PEP8 gives coding conventions for the Python code comprising the standard library in the main Python distribution:

- you should use 4 spaces per indentation level
- function names should be lowercase, with words separated by underscores as necessary to improve readability. mixedCase is allowed only in contexts where that's already the prevailing style (e.g. `threading.py`), to retain backwards compatibility
- variable names should also stick to the above rule
- there should be two blank lines between top-level function definitions
- use augmented assignments where possible (`url = url + ('/git/trees/%s?recursive=1' % sha)` -> `url += '/git/trees/%s?recursive=1' % sha`)
- you're sometimes using double quotes around strings, and sometimes single quotes. Choose one and stick to it.
- use `format()` instead of the old `%` formatting
- I'd suggest you use the `print()` function even if you're using Python 2.7. It will make your code easier to port.
- you're lacking docstrings. Write docstrings for all public modules, functions, classes, and methods. They are not necessary for non-public methods, but you should have a comment that describes what the method does. This comment should appear after the `def` line.
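
A few of the points above (augmented assignment, `format()`, a docstring, two blank lines around a top-level function) can be sketched together. `build_tree_url` and the sample repository URL are made-up names for illustration only, not part of the original code:

```python
def build_tree_url(repo_url, sha):
    """Return the recursive git-tree URL for a repository.

    `build_tree_url` is a hypothetical helper used only for this example.
    """
    url = repo_url
    # Augmented assignment and str.format() replace
    # `url = url + ('/git/trees/%s?recursive=1' % sha)`.
    url += '/git/trees/{}?recursive=1'.format(sha)
    return url


# The print() function works the same way on Python 3 and, with
# `from __future__ import print_function`, on Python 2.7.
print(build_tree_url('https://api.github.com/repos/some-user/some-repo', 'abc123'))
```

The function body stays the same whichever quote style you pick; what matters is using one style consistently.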

Code:

    from __future__ import print_function

    import base64
    import os
    import re

    import requests


    def get_repos(since=0):
        """
        Left as an exercise for OP
        """
        url = 'https://api.github.com/repositories'
        # `since` is a query-string parameter, so pass it via `params`
        response = requests.get(url, params={'since': since})
        if response.status_code == 403:
            print('Problem making request! {}'.format(response.status_code))
            print(response.headers)
        matches = re.match(r'<.+?>', response.headers['Link'])
        next_url = matches.group(0)[1:-1]
        return response.json(), next_url


    def get_repo(url):
        """
        Left as an exercise for OP
        """
        return requests.get(url).json()


    def get_readme(url):
        """
        Left as an exercise for OP
        """
        url += '/readme'
        return requests.get(url).json()


    # todo: return array of all commits so we can examine each one
    def get_repo_sha(url):
        """
        Left as an exercise for OP
        """
        commits = requests.get(url + '/commits').json()
        return commits[0]['sha']


    def get_file_content(item):
        """
        Left as an exercise for OP
        """
        # os.path.splitext() keeps the leading dot in the extension
        ignore_extensions = ['.jpg']
        filename, extension = os.path.splitext(item['path'])
        if extension in ignore_extensions:
            return []
        content = requests.get(item['url']).json()
        lines = content['content'].split('\n')
        # build a list (not a lazy map object) so it can be sliced on Python 3 too
        lines = [base64.b64decode(line) for line in lines]
        print('Path: {}'.format(item['path']))
        print('Lines: {}'.format(b''.join(lines[:5])))
        # b64decode() returns bytes on Python 3 (plain str on Python 2.7)
        return b''.join(lines)


    def get_repo_contents(url, sha):
        """
        Left as an exercise for OP
        """
        url += '/git/trees/{}?recursive=1'.format(sha)
        return requests.get(url).json()

For the second .py file:

- the above rules apply
- don't import modules you're not using (`import json`)

Code

    from __future__ import print_function

    import github


    def process_repo_contents(repo_contents):
        """
        Left as an exercise for OP
        """
        for tree in repo_contents['tree']:
            content_type = tree['type']
            print('content_type --- {}'.format(content_type))
            if content_type == 'blob':
                github.get_file_content(tree)
                print('***blob***')
            elif content_type == 'tree':
                print('***tree***')


    if __name__ == '__main__':
        repos, next_url = github.get_repos()
        for repo in repos[0:10]:
            sha = github.get_repo_sha(repo['url'])
            repo_json = github.get_repo_contents(repo['url'], sha)
            process_repo_contents(repo_json)

Pay more attention to variable names and data types

- `next` is a built-in function (not a keyword), so I'd recommend renaming that variable to something else, such as `next_url`, to avoid shadowing it
- `ignore_extensions` is a list in your case, and it only holds one string. Don't you think it would be more appropriate to make it a string from the beginning?
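
Both points can be shown in a small sketch. The names `IGNORED_EXTENSION`, `is_ignored`, and `shadowing_demo`, and the example URL, are assumptions made up for this illustration. Note also that `os.path.splitext()` keeps the leading dot, so the value to compare against is `'.jpg'` rather than `'jpg'`:

```python
import os

# Hypothetical constant: a single extension stored as a plain string.
IGNORED_EXTENSION = '.jpg'


def is_ignored(path):
    """Return True if `path` ends with the ignored extension."""
    _, extension = os.path.splitext(path)  # e.g. ('photos/cat', '.jpg')
    return extension == IGNORED_EXTENSION


def shadowing_demo():
    """Show what happens once a local variable named `next` hides the built-in."""
    next = 'https://example.com/page/2'  # shadows the built-in in this scope
    try:
        next(iter([1, 2, 3]))  # now tries to call a str
    except TypeError as error:
        return str(error)


print(is_ignored('photos/cat.jpg'))
print(is_ignored('README.md'))
print(shadowing_demo())
```

With a single string, a simple equality test is enough; a list or tuple only pays off once there are several extensions to skip.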
