This is my first program in Python and first time I'm writing in dynamically typed language. I have substituted real detectPeaks
function with a simple placeholer. I'm aware of several problems with this program.
I still don't quite know how to 'live with' Python scoping, hence the qFix
function. I didn't bother to find a better name since I want to rewrite whole program.
Doing anything outside of any function seems unclean to me. It seems that it is ok in Python. Am I right? # It did allow me to have global cmdArgs
object witch makes logic connected to cmd arguments clean.
I'm also concerned with naming functions and variables and structure of the program. The program does not have to be very flexible I would get away with hardcoding most of the things but prefer not unless it would complicate code significantly.
__version__ = dirt3
import sys, os.path, argparse, glob
import numpy as np
def detectPeaks(in_list):
in_list = [s.replace(',', '.') for s in in_list]
in_list.pop() # remove "\r\n"
X = np.array(map(float, in_list[:2]))
Y = np.array(map(float, in_list[2:]))
XminIndex = np.argmin(X)
YminIndex = np.argmin(Y)
val = X[XminIndex] + Y[YminIndex]
return (XminIndex, YminIndex, val)
def processFileLine(line):
No, time, A, B = line.split('\t')[0], line.split('\t')[1], line.split("\t")[2::2], line.split("\t")[3::2]
if cmdArgs.A:
retA = detectPeaks(A)
ansA = No + '\t' + time + '\t' + str(retA[0]) + '\t' + str(retA[1]) + '\t' + str(retA[2]) + '\n'
else:
ansA = "" #to get read of Unbound local var err
if cmdArgs.B:
retB = detectPeaks(B)
ansB = No + '\t' + time + '\t' + str(retB[0]) + '\t' + str(retB[1]) + '\t' + str(retB[2]) + '\n'
else:
ansB = ""
return ansA + ansB
def mkOutFilePath(outDirPath, inFilePath):
inFilename = os.path.basename(inFilePath)
return os.path.join(outDirPath, inFilename + ".peak")
def qFix(inFilePath):
if inFilePath == '-':
return sys.stdin
else:
return open(inFilePath, "r")
def processFile(inFilePath):
'''
if inFilePath == '-':
inFile = sys.stdin
else:
inFIle = open(inFilePath, "r")
'''
inFile = qFix(inFilePath)
if cmdArgs.outFiles == '-':
outFile = sys.stdout
elif os.path.isfile(cmdArgs.outFiles):
outFile = open(cmdArgs.outFiles, "a")
else:
outFile = open(mkOutFilePath(cmdArgs.outFiles, inFilePath), "w")
for line in inFile.readlines():
if line[0] == '#':
outFile.write(line)
elif line[0] == 'N' and line[1] == 'r' :
outFile.write("Nr\tTime\tX\tY\tValue\n")
else:
outFile.write(processFileLine(line))
def main():
if cmdArgs.inFiles == '-':
processFile('-')
else:
if os.path.isfile(cmdArgs.inFiles):
filesList = [cmdArgs.inFiles]
else:
if cmdArgs.recuresive:
filesList = [y for x in os.walk(cmdArgs.inFiles) for y in glob.glob(os.path.join(x[0], '*.dat'))]
else:
filesList = glob.glob(os.path.join(cmdArgs.inFiles, '*.dat'))
for filePath in filesList:
processFile(filePath)
def checkPath(path):
if (path != '-') and not os.path.exists(path):
exit(2)
return path
if __name__ == "__main__":
cmdArgsParser = argparse.ArgumentParser()
cmdArgsParser.add_argument('inFiles', nargs='?', default='-', type=checkPath, help="Path to input file(s).")
cmdArgsParser.add_argument('outFiles', nargs='?', default='-', type=checkPath, help="Path to output file(s).")
cmdArgsParser.add_argument("-d", "--delete", action="store_true", help="Delete input files after procesing.")
cmdArgsParser.add_argument("-r", "--recuresive", action="store_true", help="Operate recuresively.")
cmdArgsParser.add_argument("-A", action="store_true", help="Use data from A set.")
cmdArgsParser.add_argument("-B", action="store_true", help="Use data from B set.")
cmdArgsParser.add_argument("-V", "--version", action="version", version=__version__)
cmdArgs = cmdArgsParser.parse_args()
main()
Files read by the program have comments that are to be copied to the output. Example file(simplified, there is lots of data in real ones):
# Comments Nr: Time 1A 1B 2A 2B 3A 3B 4A 4B 1 2015 0,10 0,10 0,10 0,10 0,10 0,10 0,10 0,10 2 2015 0,10 0,10 0,10 0,10 0,10 0,10 0,10 0,10 3 2015 0,10 0,10 0,10 0,10 0,10 0,10 0,10 0,10 4 2015 0,10 0,10 0,10 -0,10 0,10 0,10 0,10 0,10 5 2015 0,10 0,10 0,10 0,10 0,10 0,10 0,10 0,10 6 2015 0,10 0,10 0,10 0,15 0,10 0,10 0,10 0,10 7 2015 0,10 0,20 0,30 0,20 0,10 0,10 0,10 0,10 8 2015 -0,10 0,10 0,10 0,10 -0,10 0,10 0,10 0,10
Output for sample file:
# Commests Nr Time X Y Value 1 2015 0 0 0.2 2 2015 0 0 0.2 3 2015 0 0 0.2 4 2015 0 0 0.2 5 2015 0 0 0.2 6 2015 0 0 0.2 7 2015 0 0 0.2 8 2015 0 0 -0.2
2 Answers 2
Some suggestions:
- Follow the pep8 style guide.
- In
detectPeaks
, you should slicein_list
initially, rather than popping. Soin_list[:-1]
. - In
detectPeaks
, you should convert to float in the initial list comprehension. - In
detectPeaks
, you should convert to a single numpy array, then slice that to get X and Y. - In
detectPeaks
, you don't need to wrap thereturn
in( )
- In
processFileLine
, don't put multiple operations on a single line like that. - In
processFileLine
, you should do the splitting once, then get the items from that. - In
processFileLine
, you should use string replacement. So, for example,ansA = '{}\t{}\t{}\t{}\t{}\n'.format(No, time, *retA)
. Or better yet you can define a stringansform = '{}\t{}\t{}\t{}\t{}\n'
at the beginning, then apply the format to it in each case, soansA = 'ansform.format(No, time, *retA)
andansB = 'ansform.format(No, time, *retB)
. - Use
with
to automatically open and close files when you are done with them. - When looping over a file object, you don't need
readlines
. Just do, for example,for line in inFile
. That will automatically loop over all the lines in the file. - In
processFile
, you should test for a slice of a string. Soelif line[:2] == 'Nr':
- In the loop of
processFile
, it would be better to define a string in theif...elif...else
block, and then after the block write that string. - Never, ever, under any circumstances call
exit
. If you need to exit, either allow the program to exit normally or throw an exception. In your case, raise an exception. - You should put the argument parsing in a function.
- In
main
, you shouldreturn
at the end of theif cmdArgs.inFiles == '-':
block, which will allow you to avoid theelse
case. Or better yet, doif cmdArgs.inFiles == '-' or os.path.isfile(cmdArgs.inFiles):
, sincefilesList = [cmdArgs.inFiles]
will work in both cases. - In
main
, you should doelif cmdArgs.recuresive
(which is miss-spelled, by the way). - In
main
, the list comprehension should be a generator expression. This will allow you to avoid having to store all the filenames.
Spelling counts
The usability of your programme suffers if a flag is named recuresive
, delete the second e
.
Repeat again
You repeated a function in the docs of another function, why?
Main should take args
Main should take the command line flags as args and those should appear only in main, as it performs IO and other functions logic.
Automatically closing files
Use with to automatically close files. Not the manual open. open may leave the file open if an exception occurs.