I've been programming with Python 2.7 for about six months now and if possible, I'd like some critique to see what I might have done better or what I might be doing wrong.
My objective was to make a script scraper that finds variables and functions, outputting them to a text file along with their line number. It also organizes the output based on type and string length for readability. It works as intended, but I'd like to know if/how I could have wrote it better, cleaner.
import re
def change_line():
parse_next['line_count'] += 1
def make_readable(list):
fix_order = {}
for item in list:
fix_order[len(item)] = item
return sort_this(fix_order)
def sort_this(dict):
fixed = dict.keys()
fixed.sort()
newly_ordered = []
for number in fixed:
newly_ordered.append(dict[number])
return newly_ordered
def parse_match(match):
if '\n' in match:
parse_next[match]()
elif 'def' in match:
parse_next[match] = parse_next['line_count']
funcs.append(match)
elif match:
parse_next[match] = parse_next['line_count']
varis.append(match)
else:
print 'error'
funcs = []
varis = []
txt_file = open('c:\\code\\test.txt', 'r')
output = open('c:\\code\\output.txt', 'w+')
found = re.findall(r'\n|def\s.+|.+ = .+', txt_file.read())
parse_next = {
'\n': change_line,
'line_count': 1,
}
for match in found:
parse_match(match)
funcs = make_readable(funcs)
varis = make_readable(varis)
output.write('\t:Functions:\n\n')
for item in funcs:
s_fix = item.replace('def ', '',)
to_write = [s_fix, ' Line:', str(parse_next.get(item)), '\n']
output.writelines(to_write)
output.write('\n'*2)
output.write('\t:Variables:\n\n')
for item in varis:
to_write = [item.strip(),' Line:', str(parse_next.get(item)), '\n']
output.writelines(to_write)
output.close()
txt_file.close()
To run it, you'll need to edit this filepath:
txt_file = open('c:\code\test.txt', 'r')
with code from a script of your choice.
Be harsh if necessary, I'd really like to get better.
1 Answer 1
A couple of suggestions:
make_readable
has a very serious problem - what if two inputs have the same length? A dictionary can only have one value per key; I suggest making it a dictionary of lists of items (or a collections.defaultdict(list)
) instead. Also, don't call the argument list
- that's a built-in.
sort_this
could be shortened (and the argument shouldn't be called dict
):
def sort_this(d):
return [d[key] for key in sorted(d)]
Iterating over a dictionary (e.g. in sorted
) automatically iterates over the keys. Also, this will work without modification in Python 3.x (where fixed.sort()
will fail).
The global
variables (e.g. funcs
) should either be explicitly global (i.e. put global funcs
at the start of functions that use them) or, much better, passed as arguments to the functions:
def parse_match(match, funcs, varis, parse_next):
One bug you may want to look into - if a variable is defined over multiple lines:
my_list = [1, 2, 3,
4, 5, 6]
Your code only gets the start of the definition.
Finally, it would be neat to not have to change the hard-coded strings to change the files to analyse and output to! Look into taking command line arguments (sys.argv
) or using e.g. tkinter
to put a GUI in. At the very least, have a higher-level function analyse(input_file, output_file)
then call it at the end of your script:
if __name__ == "__main__":
analyse("C:\\code\\test.txt", "C:\\code\\output.txt")
It is not best practice to put the variables you have to change somewhere in the middle of the file - top or bottom is easier to find!
-
\$\begingroup\$ Thanks man! I'll post an updated version of the script once i fix everything. Might be a day or so before i get back and post it though. \$\endgroup\$Lusidium– Lusidium2014年03月13日 20:13:41 +00:00Commented Mar 13, 2014 at 20:13
-
\$\begingroup\$ Posted an edited version, but I found a major bug. I'll reupdate when it's fixed. \$\endgroup\$Lusidium– Lusidium2014年03月14日 19:22:37 +00:00Commented Mar 14, 2014 at 19:22
-
\$\begingroup\$ Sadly, I don't think I'm going to be able to fix it this time. I fed it a few different inputs and it gave me a different error every time. It would be so much easier to just remake the whole thing with the bug in mind. \$\endgroup\$Lusidium– Lusidium2014年03月14日 19:56:02 +00:00Commented Mar 14, 2014 at 19:56