3
\$\begingroup\$

This is a improved version of the program seen in my older question .


This program first displays a menu which shows some options that the user can select. Each option will the respective functions which do some things.

The user can insert a filename, read data from it, enter a regex pattern, test if each line of the file matches the given regex pattern.

Please provide feedback on how I can improve my program.

import re
def test_data(readData, regex):
 '''Checks if each line of data matches the regex'''
 if not regex:
 print('Empty regex')
 print('')
 return
 try:
 pattern = re.compile(regex)
 except:
 print('Invalid regex')
 print('')
 return
 if not readData:
 print('Read data is empty')
 print('')
 return
 for i, line in enumerate(readData, 1):
 line = line.strip() #Strip off the newline character from each line
 if pattern.match(line):
 print(' Line ', i, ':"', line,'"', ' matches current regex', sep='')
 else:
 print(' Line ', i, ':"', line,'"', ' does not match current regex', sep='')
 print('')
def change_regex(regex):
 '''Sets the regex variable'''
 regex = '"%s"' % regex if regex else 'empty'
 print('Current regex is', regex)
 regex = input('Enter new regex: ')
 print('')
 return regex
def read_file(filename):
 '''Reads data from file if possible'''
 try:
 with open(filename, "r") as file:
 readData = file.readlines()
 print('Data read successfully!')
 return readData
 except TypeError:
 print('filename is empty!')
 except FileNotFoundError:
 print('File not found!')
 print('')
def change_filename(filename):
 '''Sets the filename variable'''
 filename = '"%s"' % filename if filename else 'empty'
 print('Current filename is', filename)
 filename = input('Enter new filename (with extension): ')
 print('')
 return filename
def main():
 filename = regex = readData = None
 while True:
 print('Enter 1 to change filename')
 print('Enter 2 to read file')
 print('Enter 3 to change regex')
 print('Enter 4 to test if read data matches given regex')
 print('Enter 5 to exit')
 try:
 option = int(input('Option: '))
 except:
 print('Invalid input')
 print('Please enter a valid option')
 print('')
 continue
 if option == 1:
 filename = change_filename(filename)
 elif option == 2:
 readData = read_file(filename)
 elif option == 3:
 regex = change_regex(regex)
 elif option == 4:
 test_data(readData, regex)
 elif option == 5:
 break
 else:
 print('Invalid input')
 print('Please enter a valid option')
 print('')
if __name__ == "__main__":
 main()
asked Jun 20, 2015 at 8:03
\$\endgroup\$

1 Answer 1

2
\$\begingroup\$

Pure functions

All of your functions both "do work" and print. You should separate logic and In/Out.

print('') is poor practice

If you want an extra newline just concat it to the previous message:

print('Empty regex\n')
return

Avoid noise

Maybe in a file of 1000 lines, my regex will match 5 lines, why must I watch 995 useless lines of output? Just output lines where it does match.

Prefer a dictionary or indexes to an array to a long if elif chain

if option == 1:
 filename = change_filename(filename)
elif option == 2:
 readData = read_file(filename)
elif option == 3:
 regex = change_regex(regex)
elif option == 4:
 test_data(readData, regex)
elif option == 5:
 break
else:
 print('Invalid input')
 print('Please enter a valid option')
 print('')

In all sincerity I don't know how to do this in this particular case, but iit would also give the benefit of generating the printing message on the fly, instead of repeating the same thing twice.

Command-line?

But maybe all of that effort for the menu is counterproductive and you should allow the user to call the programme like:

python regex_script.py <regex> <filename>

by using sys.argv or argparse.

My take

This is my implementation of this program, it separates Input/Output and logic and can be used concisely from the command line:

import re
import sys
def lines_matched(regex, text):
 for line in text.splitlines():
 if re.search(regex, line):
 yield line
def main():
 if len(sys.argv) != 3:
 print("Usage: python {} <regex> <filename>".format(__file__))
 return 1
 regex, filename = sys.argv[1], sys.argv[2]
 with open(filename) as f:
 print('\n'.join(lines_matched(regex, f.read())))
if __name__ == "__main__":
 main()

argparse would make it more user friendly.

answered Jun 20, 2015 at 11:26
\$\endgroup\$

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.