
The whole process mainly depends on the ProcessFile() function:

I get all the credentials and details from a JSON file.

  1. Initially I get data from the db: two IDs and a URL.

  2. There is a text file of address states; I read it and join its lines with | to build the matching regex.

  3. The same is done for the ZIP patterns and ZIP keywords.

  4. Iterate over the rows.

  5. Get the content of the URL and strip out the unwanted markup.

  6. Match the address regex and store the result to a variable.

  7. Match the ZIP regex, check whether the match appears in the ZIP keyword list, and store the result in a variable; append the combined result to the outputs list. Write to disk only when the list holds 1000 or more entries.

  8. After completing all the URLs, check whether outputs still contains data. If so, write it to the file; otherwise leave it.

  9. Move the result to another FTP location.
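To make steps 2-3 and 7 concrete, here is a minimal Python 3 sketch of the same two ideas: building an alternation regex from a keyword list and flushing buffered rows in batches. The keyword list, file name, batch size, and the `flush` helper are all placeholders for illustration, not the script's real inputs:

```python
import re

# Steps 2-3: build one alternation pattern from a keyword list,
# escaping each term so regex metacharacters are matched literally.
states = ["New York", "Texas", "Ohio"]  # stand-in for the ADDRESS file's lines
pattern = re.compile("|".join(re.escape(s) for s in states), re.IGNORECASE)

# Step 7: buffer results and flush them to disk in batches.
def flush(buffer, path):
    """Append buffered rows to the output file and clear the buffer."""
    if buffer:
        with open(path, "a") as out:
            out.write("\n".join(buffer) + "\n")
        del buffer[:]

outputs = []
pages = ["Visit our Texas office", "no match here", "ohio branch"]
for text in pages:
    m = pattern.search(text)
    outputs.append(m.group(0) if m else "None")
    if len(outputs) >= 2:  # placeholder batch size (the post says 1000)
        flush(outputs, "results.txt")
flush(outputs, "results.txt")  # write any final partial batch
```

Batching like this trades a little memory for far fewer file opens, which is the point of the size check in step 7.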

#Importing modules
import requests
import re
import sys
import pyodbc
import json
import datetime
import random
import time
import ftplib
import traceback
#loading regex files
start_time_split=datetime.datetime.now()
json_data=open('server_details.json')
server_details = json.load(json_data)
json_data.close() 
'''SQL SERVER CREDENTIALS AND DB NAME'''
SERVER=server_details["SERVER"]
DATABASE=server_details["DATABASE"]
UID=server_details["UID"]
PWD=server_details["PWD"]
ADDRESS=server_details["ADDRESS"]
ZIP=server_details["ZIP"]
ZIP_WORD=server_details["ZIP_WORD"]
FTP_IP=server_details["FTP_IP"]
FTP_USER=server_details["FTP_USER"]
FTP_PWD=server_details["FTP_PWD"]
FTP_LOCATION=server_details["FTP_LOCATION"]
def ProcessFile(fileReadStr):
    try:
        header = {'User-Agent':'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:39.0) Gecko/20100101 Firefox/39.0'}
        with open(ADDRESS,"r") as l:
            address="|".join([a.strip() for a in l])
            #address=address
        with open(ZIP,"r") as ll:
            regex="|".join([a.strip() for a in ll])
            #regex=regex
        with open(ZIP_WORD) as inp:
            zip_word=[a.strip() for a in inp.readlines()]
        print "Started "+str(start_time_split)
        filename=str(random.randint(1,100000))+"_"+str(int(time.time()))+"_us_link_classifier.txt"
        with open(filename,"w") as output:
            output.write("ID\tClientID\tURL\tAddress\tword\tzip\tzipmatch\n")
        outputs=[]
        for line in fileReadStr:
            id1=line[0]
            web=line[1] if line[1].strip().startswith("http") else "http://"+line[1].strip()
            id2=line[2]
            ad="0"
            word="None"
            try:
                objReq = requests.get(web,timeout=30,verify=False,allow_redirects=True,headers=header)
                htmlStr = objReq.content
                htmlStr=re.sub("\s+"," ",htmlStr)
                htmlStr=re.sub("<!--.*?-->", " ", htmlStr)
                htmlStr=re.sub("<head.*?>.*?</head>"," ",htmlStr)
                htmlStr=re.sub("<.*?>", " ", htmlStr)
                htmlStr=re.sub("\s+", " ", htmlStr)
                address_match=re.search(address,htmlStr,re.IGNORECASE)
                if address_match :
                    ad="1"
                    word=address_match.group(0).strip()
                address_zip=re.findall(regex,htmlStr)
                zip_match =[el for sub in address_zip for el in sub if el and el.split()[0] in zip_word]
                if zip_match:
                    zp="1"
                    zip_match=zip_match[0]
                else:
                    zip_match="None"
                outputs.append(str(id1)+"\t"+str(id2)+"\t"+web+"\t"+ad+"\t"+word+"\t"+zp+"\t"+zip_match)
                if len(outputs)>=150:
                    with open(filename,"a") as output:
                        output.write("\n".join(outputs)+"\n")
                    del outputs[:]
            except Exception,e:
                print e
                print traceback.print_exc()
                with open("us_error.txt","a") as ded:
                    ded.write(str(id1)+"\t"+web+"\t"+str(e)+"\n")
        if outputs:
            with open(filename,"a") as output:
                output.write("\n".join(outputs)+"\n")
            del outputs[:]
        infile = open(filename,'r')
        ftp = ftplib.FTP(FTP_IP,FTP_USER ,FTP_PWD )
        ftp.storbinary('STOR '+FTP_LOCATION+str(filename), infile)
        infile.close()
        ftp.quit()
    except Exception,e:
        print e
        print traceback.print_exc()
        with open("us_error_process_error.txt","a") as ded:
            ded.write(str(e))

if __name__ == '__main__':
    try:
        access_string='DRIVER={SQL Server};SERVER='+SERVER+';DATABASE='+DATABASE+';UID='+UID+';PWD='+PWD
        cnxn = pyodbc.connect(access_string,autocommit=True)
        cursor = cnxn.cursor()
        print sys.argv[1]+'\n'+sys.argv[2]
        exec1=server_details["QUERY"]
        cursor.execute(exec1,(sys.argv[1],sys.argv[2]))
        rows = cursor.fetchall()
        cnxn.close()
        ProcessFile(rows)
        end_time_split=datetime.datetime.now()
        elasped_time1=end_time_split-start_time_split
        print "Total time:"+str(elasped_time1)
        print "done"
    except Exception,e:
        print e
        print traceback.print_exc()
        with open("us_error_starting_process_error.txt","a") as ded:
            ded.write(str(e))
asked Aug 20, 2015 at 7:20

2 Answers


I'm going to focus on stylistic and formatting points; you should definitely read through PEP 8.

You can remove the comment at the start; it's pretty clear that you're importing modules. Comments should add information that wouldn't otherwise be clear. It's also recommended that you group imports rather than have one long list, for slightly easier readability. Since you're importing a lot and most modules only see one or two uses, you could use from _ import _ to cut down on some importing. I wouldn't recommend from json import load since load is a generic name, but from ftplib import FTP and from random import randint are unlikely to clash.

import datetime
import json
import re
import sys
import time
import traceback
import requests
import pyodbc
from ftplib import FTP
from random import randint

Even for one line, you should use with, especially because json.load can easily raise errors on invalid syntax. with is almost always better; I personally haven't found a reason not to use it.

#loading regex files
start_time_split=datetime.datetime.now()
with open('server_details.json') as json_data:
 server_details = json.load(json_data)

Why is '''SQL SERVER CREDENTIALS AND DB NAME''' not a comment? It functions essentially as a one-line comment but is formatted differently.

You should separate distinct blocks with whitespace; it's more readable. Like this (using your original code):

import traceback

#loading regex files
start_time_split=datetime.datetime.now()
json_data=open('server_details.json')
server_details = json.load(json_data)
json_data.close()

'''SQL SERVER CREDENTIALS AND DB NAME'''
SERVER=server_details["SERVER"]
DATABASE=server_details["DATABASE"]
UID=server_details["UID"]
PWD=server_details["PWD"]
ADDRESS=server_details["ADDRESS"]
ZIP=server_details["ZIP"]
ZIP_WORD=server_details["ZIP_WORD"]
FTP_IP=server_details["FTP_IP"]
FTP_USER=server_details["FTP_USER"]
FTP_PWD=server_details["FTP_PWD"]
FTP_LOCATION=server_details["FTP_LOCATION"]

def ProcessFile(fileReadStr):

You have double spaces in a few places, which are unnecessary. Also, open defaults to read-only mode, so 'r' isn't needed.

with open(ADDRESS,"r") as l:
with open(ZIP,"r") as ll:

Also, you can reuse the same name in this case, though I'd personally recommend using f for a briefly opened file. At the very least avoid a lone l, which can be mistaken for the number one or a capital I.

Also these comments are either removed pieces of old code or redundant notes. The variable names are clear enough that they don't need to be noted.

#address=address 
#regex=regex

It's good practice to use parentheses with print to make code more Python 3 compatible, and you could use str.format instead of str() and concatenation. Again, I'd put an empty line after this one to separate this block of code from the next.

print ("Started {}".format(start_time_split))

Setting the filename is a long line; try breaking it up over multiple lines for readability. I'd recommend format again here. I'd also recommend the {0:06d} syntax, which pads your random int with zeroes so that filenames have an equal number of characters at the start and display together more neatly. Don't forget that if you did change to from random import randint, you just use randint() here.

filename= "{:06d}_{}_us_link_classifier.txt".format(
 randint(1,100000), int(time.time()))

Your output.write would be neater using a join, like this:

output.write("\t".join(["ID", "ClientID", "URL", "Address", "word", "zip", "zipmatch"]))
output.write("\n")

That allows you to add new values easier and makes the list of headers more readable.

When setting your web variable, you forget to strip line[1] when it does start with 'http' (the stripped value used in the ternary's condition isn't kept in any way). It'd be easier to strip line[1] first and then use a shorter ternary expression.

line[1] = line[1].strip()
web = line[1] if line[1].startswith("http") else "http://"+line[1]

You should also add some whitespace to these variable declarations:

id1 = line[0]
web = line[1] if line[1].strip().startswith("http") else "http://"+line[1].strip()
id2 = line[2] 
ad = "0"
word = "None" 

And add some whitespace in your parameter lists:

objReq = requests.get(web, timeout=30, verify=False, allow_redirects=True, headers=header)
htmlStr = objReq.content
htmlStr=re.sub("\s+", " ", htmlStr)

I'm not sure if this is in your code or just the transcript here, but you forgot to indent after with here:

if outputs:
    with open(filename,"a") as output:
    output.write("\n".join(outputs)+"\n")
    del outputs[:]

Here's the overall formatting changes I'd make to your script:

import datetime
import json
import re
import sys
import time
import traceback

import requests
import pyodbc
from ftplib import FTP
from random import randint

# Loading regex files
start_time_split = datetime.datetime.now()
with open('server_details.json') as json_data:
    server_details = json.load(json_data)

# SQL SERVER CREDENTIALS AND DB NAME
SERVER = server_details["SERVER"]
DATABASE = server_details["DATABASE"]
UID = server_details["UID"]
PWD = server_details["PWD"]
ADDRESS = server_details["ADDRESS"]
ZIP = server_details["ZIP"]
ZIP_WORD = server_details["ZIP_WORD"]
FTP_IP = server_details["FTP_IP"]
FTP_USER = server_details["FTP_USER"]
FTP_PWD = server_details["FTP_PWD"]
FTP_LOCATION = server_details["FTP_LOCATION"]


def ProcessFile(fileReadStr):
    try:
        header = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:39.0) Gecko/20100101 Firefox/39.0'}
        with open(ADDRESS) as f:
            address = "|".join([a.strip() for a in f])
        with open(ZIP) as f:
            regex = "|".join([a.strip() for a in f])
        with open(ZIP_WORD) as f:
            zip_word = [a.strip() for a in f.readlines()]

        print("Started {}".format(start_time_split))
        filename = "{:06d}_{}_us_link_classifier.txt".format(
            randint(1, 100000), int(time.time()))
        with open(filename, "w") as output:
            output.write("ID\tClientID\tURL\tAddress\tword\tzip\tzipmatch\n")

        outputs = []
        for line in fileReadStr:
            id1 = line[0]
            line[1] = line[1].strip()
            web = line[1] if line[1].startswith("http") else "http://" + line[1]
            id2 = line[2]
            ad = "0"
            word = "None"
            try:
                objReq = requests.get(web, timeout=30, verify=False,
                                      allow_redirects=True, headers=header)
                htmlStr = objReq.content
                htmlStr = re.sub("\s+", " ", htmlStr)
                htmlStr = re.sub("<!--.*?-->", " ", htmlStr)
                htmlStr = re.sub("<head.*?>.*?</head>", " ", htmlStr)
                htmlStr = re.sub("<.*?>", " ", htmlStr)
                htmlStr = re.sub("\s+", " ", htmlStr)
                address_match = re.search(address, htmlStr, re.IGNORECASE)
                if address_match:
                    ad = "1"
                    word = address_match.group(0).strip()
                address_zip = re.findall(regex, htmlStr)
                zip_match = [el for sub in address_zip for el in sub
                             if el and el.split()[0] in zip_word]
                if zip_match:
                    zp = "1"
                    zip_match = zip_match[0]
                else:
                    zip_match = "None"
                outputs.append("\t".join([str(id1), str(id2), web, ad, word, zp, zip_match]))
                if len(outputs) >= 150:
                    with open(filename, "a") as output:
                        output.write("\n".join(outputs) + "\n")
                    del outputs[:]
            except Exception, e:
                print e
                print traceback.print_exc()
                with open("us_error.txt", "a") as ded:
                    ded.write("\t".join([str(id1), web, str(e)]) + "\n")

        if outputs:
            with open(filename, "a") as output:
                output.write("\n".join(outputs) + "\n")
            del outputs[:]

        with open(filename) as infile:
            ftp = FTP(FTP_IP, FTP_USER, FTP_PWD)
            ftp.storbinary('STOR ' + FTP_LOCATION + str(filename), infile)
            ftp.quit()
    except Exception, e:
        print e
        print traceback.print_exc()
        with open("us_error_process_error.txt", "a") as ded:
            ded.write(str(e))


if __name__ == '__main__':
    try:
        access_string = 'DRIVER={SQL Server};SERVER=' + SERVER + ';DATABASE=' + DATABASE + ';UID=' + UID + ';PWD=' + PWD
        cnxn = pyodbc.connect(access_string, autocommit=True)
        cursor = cnxn.cursor()
        print(sys.argv[1] + '\n' + sys.argv[2])
        exec1 = server_details["QUERY"]
        cursor.execute(exec1, (sys.argv[1], sys.argv[2]))
        rows = cursor.fetchall()
        cnxn.close()
        ProcessFile(rows)
        end_time_split = datetime.datetime.now()
        elasped_time1 = end_time_split - start_time_split
        print("Total time: {}".format(elasped_time1))
        print("Done")
    except Exception, e:
        print e
        print traceback.print_exc()
        with open("us_error_starting_process_error.txt", "a") as ded:
            ded.write(str(e))
answered Aug 20, 2015 at 9:33
  • Thanks for your time and your points, but will this affect the running time of the script? Commented Aug 20, 2015 at 9:42
  • @VigneshKalai It won't affect the performance; I'm not experienced enough with regex to help there. But it does make the code easier to read and refactor, so you can spot parts for improvement. Commented Aug 20, 2015 at 10:45

I strongly agree with @SuperBiasedMan's comments about formatting. His reformatting makes your code much more readable. Here are a few more comments:

Production vs development

If you are using this code in production, you probably don't want to print the traceback to the user. Printing the error message should be enough. I'd cut the error handling down to this:

except Exception,e:
    print e
    with open("us_error_starting_process_error.txt","a") as ded:
        ded.write(str(e))

You might also consider a flag variable that controls whether to print, depending on whether the code is running in production or not. For example, something like:

if __name__ == '__main__':
    TO_PRINT = True
else:
    TO_PRINT = False

And then:

if TO_PRINT:
    print ...
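As a runnable sketch of that pattern (Python 3 syntax; the log helper is a placeholder name, and in a larger project the standard logging module would be the more conventional route):

```python
# True when the file is run as a script, False when imported as a module.
TO_PRINT = (__name__ == "__main__")

def log(message):
    """Print progress messages only when the flag is on."""
    if TO_PRINT:
        print(message)

log("Processing started")  # silent on import, printed when run directly
```

The flag is computed once at import time, so callers never need to pass a verbosity argument around.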

Formatting

It's good that you wrote this module so that the script can be run or imported, but you should go one step further and declare a main() function (read more about it here). You can move all your try...except code into that.
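A minimal skeleton of that structure (Python 3 syntax; the body is a stand-in for the existing try/except logic, and main/argv are just conventional placeholder names):

```python
import sys

def main(argv):
    """Entry point: the top-level try/except logic moves in here."""
    if len(argv) < 2:
        return 1  # usage error: the script expects two query arguments
    # ... connect to the database, fetch rows, call ProcessFile(rows) ...
    return 0  # success

if __name__ == '__main__':
    exit_code = main(sys.argv[1:])  # pass to sys.exit(exit_code) in real use
```

With this layout, importing the module runs nothing, and tests can call main() directly with fake arguments.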

Similarly, I spotted at least one typo:

elasped_time1

You want to avoid typos because they make it harder for users to remember how to type your variables, as well as being annoying to replicate.

Use 'with' to control resource use/closing

It's better to structure your Python code so that you don't need to remember to release resources; then forgetting is no longer a problem. I'd rewrite this part of your new main() function like this:

with pyodbc.connect(access_string, autocommit=True) as cnxn:
    cursor = cnxn.cursor()
    print sys.argv[1]+'\n'+sys.argv[2]
    exec1 = server_details["QUERY"]
    cursor.execute(exec1, (sys.argv[1], sys.argv[2]))
    rows = cursor.fetchall()
answered Aug 20, 2015 at 12:48