26
\$\begingroup\$

Given a Python file containing a Python script written with bad style, this script will output a review adressing its problems.

RESERVED_KEYWORDS=['abs','dict','help','min','setattr','all','dir','hex','next','slice',
'any','divmod','id','object','sorted','ascii','enumerate','input','oct',
'staticmethod','bin','eval','int','open','str','bool','exec','isinstance',
'ord','sum','bytearray','filter','issubclass','pow','super','bytes','float',
'iter','print','tuple','callable','format','len','property','type','chr',
'frozenset','list','range','vars','classmethod','getattr','locals','repr','zip',
'compile','globals','map','reversed',
'__import__','complex','hasattr','max','round','delattr','hash','memoryview','set']
FILENAME = "code_with_bad_style.py"
BULTIN_REASSIGNED_ERROR = """You wrote:
 {} = "something"
That is not good because {} is a built-in in Python
and you should never re-assign new values to the
built-ins, in case you are wondering wheter a word is a builtin or
not go to https://docs.python.org/3/library/functions.html to read the
complete list"""
NAME_NOT_USED_ERROR="""You should use
 if __name__ == "__name__":
 main()
So that your file is going to usable as both
a stand-alone programme and an importable programme.
"""
NO_DOCS_ERROR = """You should consider using some docstrings.
Docstrings are multiline comments that explain what a function does,
they are of great help for the reader. They look like the following:
 def function(a, b):
 \"\"\"Do X and return a list.\"\"\"
"""
USE_LIST_COMPREHENSION_ERROR = """In python there is
a very powerful language feature called [list comprehension][https://docs.python.org/2/tutorial/datastructures.html#list-comprehensions].
The following:
 result = []
 for i in lst:
 if foo(i):
 result.append(bar(i))
should be replaced with:
 result = [bar(i) for i in lst if foo(i)]
"""
USE_WITH_ERROR = """There is a very convenient way of handling files in python:
the with statement. It handles closing automatically so you do not
have to worry about it. It is very simple to use, look at the following example:
 with open("x.txt") as f:
 data = f.read()
 # do something with data
"""
PRINT_BREAKING_PYTHON_3_ERROR = """You should use parenthesis with your print
statement (in Python 3 it is a function) to keep compatibility with python 3"""
IMPORT_STAR_ERROR = """You should avoid using:
 from long_long_long_name import *
because people will not know where you are taking your functions from.
Instead use:
 import long_long_long_name as short
"""
SEPARATOR = """
----------
"""
def nextline(line,lines):
 return lines[lines.index(line) + 1]
def reassign_built_in_error(code):
 for built_in in RESERVED_KEYWORDS:
 if built_in + " =" in code or built_in + "=" in code:
 return BULTIN_REASSIGNED_ERROR.format(built_in,built_in)
def if_name_error(code):
 if "__name__" not in code:
 return NAME_NOT_USED_ERROR
def no_docs_error(code):
 for line in code.splitlines():
 if line.startswith("def") or line.startswith("class"):
 if '"""' not in nextline(line,code):
 return NO_DOCS_ERROR
def use_list_comprehension_error(code):
 if "append" in code:
 return USE_LIST_COMPREHENSION_ERROR
def with_not_used_error(code):
 if ".close()" in code and ".open()" in code:
 return USE_WITH_ERROR
def print_breaking_3_error(code):
 for line in code.splitlines():
 if line.startswith("print") and "(" not in line:
 return PRINT_BREAKING_PYTHON_3_ERROR
def import_star_error(code):
 for line in code.splitlines():
 if line.startswith("import") and "*" not in line:
 return IMPORT_STAR_ERROR
def main():
 ALL_ANALYSIS = [reassign_built_in_error,if_name_error,no_docs_error,
 use_list_comprehension_error,with_not_used_error,
 print_breaking_3_error,import_star_error]
 with open(FILENAME) as f:
 code = f.read()
 for analysis in ALL_ANALYSIS:
 result = analysis(code)
 if result:
 print(result)
 print(SEPARATOR)
if __name__ == "__main__":
 main()
asked Dec 13, 2014 at 15:44
\$\endgroup\$
1

3 Answers 3

22
\$\begingroup\$

This code doesn't pass even basic PEP8 tests. It makes it sound a bit a hypocritical ;-)


With the target filename hardcoded in the file, this script is very inconvenient:

FILENAME = "code_with_bad_style.py"

Take a look at argparse, and make this script work with a filename parameter on the command line.


Several methods split the code to multiple lines, for example:

def no_docs_error(code):
 for line in code.splitlines():
 # ...
def print_breaking_3_error(code):
 for line in code.splitlines():
 # ...

With larger source code, this can be very wasteful. It would be better to split to lines once in the beginning, and pass the list to the methods that need a list instead of the single string version.


This check is completely wrong:

 if line.startswith("import") and "*" not in line:
 return IMPORT_STAR_ERROR

... so this will match this kind of statement:

import re

... and it will NOT match this kind of statement:

from long_long_long_name import *

Dropping the "not" is not enough, because the rule won't match what's you're trying to prevent.

It would be better to use a regex, in this rule and many other rules. For example, define at the top of the file with the other globals:

RE_IMPORT_STAR = re.compile(r'^from .* import \*')

Then do a check like this:

if RE_IMPORT_STAR.match(line):
 return IMPORT_STAR_ERROR

Many other tests could use regexes, for better judgment, and often better performance too.


The rules you have defined are sometimes way too relaxed, for example:

for built_in in RESERVED_KEYWORDS:
 if built_in + " =" in code or built_in + "=" in code:
 return BULTIN_REASSIGNED_ERROR.format(built_in, built_in)

This won't match things like:

abs = 'hello'

At the same time, this same rule makes this perfectly valid code fail:

parser.add_argument('-a', '--alphabet',
 help='override the default alphabet')

There are many more similar examples in this code.

answered Dec 13, 2014 at 16:18
\$\endgroup\$
2
  • \$\begingroup\$ Please make a test before writing: if "import re".startswith("import") and "*" in "import re": print("Error") prints nothing. Anyway you are right that it will not match from long_long_long_name import * \$\endgroup\$ Commented Dec 13, 2014 at 16:39
  • 2
    \$\begingroup\$ My bad. Corrected now. In your code it's if line.startswith("import") and "*" not in line. I dropped the "not" in my version and incorrectly copied that one instead of yours. Corrected the post. \$\endgroup\$ Commented Dec 13, 2014 at 16:44
17
\$\begingroup\$

My Review:

When writing programs they should be able to take command-line arguments specifying the input files(s) to process. Your program hard-codes the name to require code_with_bad_style.py.

Many of your functions immediately split the lines and process them. Since this work is repeated many times, you should probably split the lines once, and then, instead of passing in the raw code, pass in the lines instead.

Additionally, for UNIX-like conventions, you should process STDIN if no file is given on the commandline.

On the other hand, it worked well for me when I figured that out.


Self Review

python code_with_bad_style.py

You should consider using some docstrings.

Docstrings are multiline comments that explain what a function does, they are of great help for the reader. They look like the following:

def function(a, b):
 """Do X and return a list."""

In python there is a very powerful language feature called list comprehension.

The following:

result = []
for i in lst:
 if foo(i):
 result.append(bar(i))

should be replaced with:

result = [bar(i) for i in lst if foo(i)]

There is a very convenient way of handling files in python: the with statement. It handles closing automatically so you do not have to worry about it. It is very simple to use, look at the following example:

with open("x.txt") as f:
 data = f.read()
 # do something with data

answered Dec 13, 2014 at 15:54
\$\endgroup\$
2
  • 1
    \$\begingroup\$ The self-review idea is genius, we humans shalt prepare, self-reviewing and shortly therafter self-improving code is coming! :) \$\endgroup\$ Commented Dec 13, 2014 at 16:07
  • 7
    \$\begingroup\$ You should be aware of two things, FYI: Review Question generator and unanswered Python questions \$\endgroup\$ Commented Dec 13, 2014 at 16:12
8
\$\begingroup\$

Your code could be simplified and corrected by making use of the ast module. The only error that could not be reimplemented at the ast level is "Usage of print without parentheses."

Example code checking "Reassinged builtin", "missing docstring" and __name__ not used:

import ast
import builtins
BUILTIN_NAMES = [name for name in dir(builtins) if not name.startswith("_")]
class ErrorChecker(ast.NodeVisitor):
 def __init__(self):
 self.errors = set()
 self.name_used = False
 def visit_Name(self, node):
 if node.id in BUILTIN_NAMES and isinstance(node.ctx,ast.Store):
 self.errors.add(BUILTIN_REASSIGNED_ERROR)
 elif node.id == "__name__":
 self.name_used = True
 def visit_FunctionDef(self, node):
 if not ast.get_docstring(node):
 self.errors.add(NO_DOCS_ERROR)
 self.generic_visit(node)
 visit_ClassDef = visit_FunctionDef

The other errors checkers could all be similarly implemented at AST level, which would make them not accidentally catch bad statements in string literals or get confused by string literals.

The "missing parentheses around print function" error can also be checked for using the python parser/compiler:

import __future__
try:
 compile(code, FILENAME, "exec", __future__.print_function.compiler_flag)
except SyntaxError:
 #Code uses print without parentheses
answered Jun 1, 2016 at 23:49
\$\endgroup\$

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.