5
\$\begingroup\$

This is now my new code implementing the recommendations from my previous question.

import os
import time
import mysql.connector
import MySQLdb
import ConfigParser
import base64
import logging
from logging.handlers import TimedRotatingFileHandler
import sys
import ast
import smtplib
from sendmail import send_mail
"""
TO DO:
1. Retry failure upload when mysql disconnects
"""
currdir = os.getcwd()
curr_datetime = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime())
"""function that reads the config file"""
def read_Config(section, option):
 conf_file = currdir + "\\Config\\config.ini"
 config = ConfigParser.ConfigParser()
 config.read(conf_file)
 sections = config.sections()
 def ConfigSectionMap(section):
 dict1 = {}
 options = config.options(section)
 for option in options:
 try:
 dict1[option] = config.get(section, option)
 if dict1[option] == -1:
 # print("skip: %s" % option)
 LOGGER.debug("skip: %s" % option)
 except:
 deb = "exception on %s!" % option
 LOGGER.debug(deb)
 write_log(wyko_no,'DEBUG',deb,cur_datetime)
 dict1[option] = None
 return dict1
 res = ConfigSectionMap(section)[option]
 return res
"""read configuration file"""
app_name = read_Config('application','app_name')
FILE_PATH = read_Config('application','file_path')
LOG_PATH = currdir + "\\" + read_Config('application','log_path') + "\\"
USER = read_Config('database','user')
PASSWORD = read_Config('database','password')
PASSWORD = base64.decodestring(PASSWORD) ##decode password, remove D and L from the string
HOST = read_Config('database','host')
DB = read_Config('database','db')
DB_PORT = read_Config('database','port')
RECEIVERS = ast.literal_eval(read_Config('email','receivers'))
SENDER = read_Config('email','sender')
SMTP = read_Config('email','smtp')
PORT = read_Config('email','port')
"""set up logger"""
LOGGER = logging.getLogger(app_name)
LOG_FILE = app_name + '.log'
LOG_PATH = LOG_PATH + LOG_FILE
LOGGER.setLevel(logging.DEBUG)
hdlr = TimedRotatingFileHandler(LOG_PATH, when="d", interval=1, backupCount=1)
formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s: %(message)s')
hdlr.setFormatter(formatter)
LOGGER.addHandler(hdlr) 
sh = logging.StreamHandler(sys.stdout)
sh.setLevel(logging.DEBUG)
sh.setFormatter(formatter)
LOGGER.addHandler(sh)
"""insert csv to database"""
def insert_csv(file,filename):
 try:
 cnx = mysql.connector.connect(user = USER,
 password = PASSWORD,host = HOST,
 database = DB, port = DB_PORT)
 cursor = cnx.cursor() 
 with open(file, 'rb') as f:
 thedata = f.read()
 curr_datetime = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime())
 sql = "INSERT INTO wyko_file(wyko_file,file_name,date_inserted) VALUES (%s,%s,%s)"
 cursor.execute(sql,(thedata,filename,curr_datetime))
 if cursor.rowcount != 0:
 LOGGER.info("Transfer to database succesful!")
 cnx.commit() 
 except (MySQLdb.OperationalError, MySQLdb.ProgrammingError), e:
 LOGGER.error(e)
 send_mail(SENDER,RECEIVERS,e)
 finally:
 cursor.close()
 cnx.close()
"""main loop that watches the directory""" 
def main():
 try:
 for dirpath, dirnames, files in os.walk(FILE_PATH):
 for i in files:
 file = dirpath+i
 try:
 LOGGER.info("Transferring file: " +i)
 insert_csv(file,i)
 except Exception, e:
 LOGGER.error(e)
 send_mail(SENDER,RECEIVERS,e)
 finally:
 os.remove(file)
 LOGGER.info('File successfully removed from '+FILE_PATH+"\n"+"-"*80)
 except Exception, e:
 LOGGER.error(e)
 send_mail(SENDER,RECEIVERS,e)
 # break
if __name__ == "__main__":
 LOGGER.info("Application starts running\n"+"-"*80)
 main()

But upon deploying this application on a client, I have encountered an issue that this parser is eating up the resources on the client computer and causes applications to hang/lag. I think the code needs optimizing.

The parser is currently deployed on a Windows XP OS with 2GB of RAM.

asked Oct 31, 2015 at 3:59
\$\endgroup\$

2 Answers 2

3
\$\begingroup\$

Parsing configuration file

There are a few problems with your read_Config function:

  • ConfigParser objects are akin to dictionaries of dictionaries: there is no need to build dictionaries out of them, they already behave as such with a few extra functionnalities;
  • read_Config is called (and thus the file parsed) for each parameter you want to retrieve out of the file: this is a waste of resources, read the file once and return the parser object to extract parameters out of it;
  • read method of ConfigParser behave the same as open, it accepts relative path, no need to use getcwd for that;
  • read method of ConfigParser actually accept a list of filepaths (a single string is turned into a list of 1 element) and silently handle missing files (rationale in the docs); your code will fail if the file is not handled so I suggest using

    with open(<path>) as config_file:
     config.readfp(config_file)
    

    instead to get a more descriptive error if anything goes wrong.

Too much "constants"

A lot of your constants are dependant of the result of your read_Config function.

At this point you may want to introduce a class in your design to improve readability and reusability. The various "constants" initialized out of read_Config will become the state (attributes) of your object.

Exception handling

You made a good point of handling specific exceptions in the method reviewed in your previous question but forgot to apply the advice in your read_Config function.

Also you should prefer the new style of variable naming in except clause using the as keyword: e.g. except Exception as e. This is the only way to write it in Python 3 and yours is considered deprecated.

Avoid repetitions

Almost all your except performs the same operation. For now it’s only two lines of code but someday you might want to improve them. Maybe that the send_mail function only uses str on your exception before sending the mail, leaving out the whole stacktrace with usefull informations and at some point you’ll want to improve that.

Use a function for that: any change will be applied anywhere for free.

Bugs?

  • What is this write_log function in read_Config? You neither define it nor import it.
  • In main you use finally to remove files whether or not they got into the database. You may want to use else instead to remove only if the sql insert succeded.
  • You removed a while True from your last question, is it expected?

Misc

  • Not sure which way is better but you can get your curr_daytimes using datetime.now().isoformat(' ') instead of using time;
  • Why are you using ast to parse the list of RECEIVERS? Can't you write it as a string of recipients separated by semi-colons and split it in your code?
  • Docstrings are "the string written the line right after a function declaration, if any" (not before).
  • smtplib, SMTP and PORT are never used.

Proposed alternative

import os
from datetime import datetime
from sys import stdout
import logging
import mysql.connector
import MySQLdb
import base64
from ConfigParser import ConfigParser
from sendmail import send_mail
def parse_config(config_filename):
 """function that reads the config file"""
 config = ConfigParser()
 with open(config_filename) as config_file:
 config.readfp(config_file)
 return config
class FolderWatcher(object):
 def __init__(self, config_filename):
 """Read configuration file and store important data out of it"""
 config = parse_config(config_filename)
 app_name = config.get('application', 'app_name')
 self.logger = logging.getLogger(app_name)
 self.logger.setLevel(logging.DEBUG)
 formatter = logging.Formatter(
 '%(asctime)s - %(name)s - '
 '%(levelname)s: %(message)s')
 log_path = config.get('application', 'log_path')
 handler = logging.handlers.TimedRotatingFileHandler(
 os.path.join(log_path, app_name + '.log'),
 when="d", interval=1, backupCount=1)
 handler.setFormatter(formatter)
 self.logger.addHandler(handler)
 handler = logging.StreamHandler(stdout)
 handler.setLevel(logging.DEBUG)
 self.logger.addHandler(handler)
 self.watched_folder = config.get('application','file_path')
 self.mysql_credentials = {
 'user': config.get('database','user'),
 ##decode password, remove D and L from the string
 'password': base64.decodestring(config.get('database','password')),
 'host': config.get('database','host'),
 'database': config.get('database','db'),
 'port': config.get('database','port'),
 }
 self.mail_sender = config.get('email', 'sender')
 # You’ll need to modify your config file to get this line to work
 self.mail_receivers = map(str.strip,
 config.get('email', 'receivers').split(';'))
 self.logger.info("Application {} starts running".format(app_name))
 self.logger.info("-"*80)
 def manage_exception(self, exception):
 self.logger.error(exception)
 send_mail(self.mail_sender, self.mail_receivers, exception)
 def insert_csv(self, file_path, filename):
 """insert csv to database"""
 try:
 cnx = mysql.connector.connect(**self.mysql_credentials)
 cursor = cnx.cursor()
 with open(file_path, 'rb') as f:
 data = f.read()
 curr_datetime = datetime.now().isoformat(' ')
 sql = ("INSERT INTO wyko_file(wyko_file,file_name,date_inserted)"
 " VALUES (%s,%s,%s)")
 cursor.execute(sql, (data, filename, curr_datetime))
 if cursor.rowcount != 0:
 self.logger.info("Transfer to database succesful!")
 cnx.commit()
 except (MySQLdb.OperationalError, MySQLdb.ProgrammingError) as e:
 self.manage_exception(e)
 finally:
 cursor.close()
 cnx.close()
 def watch(self):
 """main loop that watches the directory""" 
 try:
 for root, dirs, files in os.walk(self.watched_folder):
 for filename in files:
 file_path = os.path.join(root, filename)
 try:
 self.logger.info("Transferring file: "+filename)
 self.insert_csv(file_path, filename)
 except Exception as e:
 self.manage_exception(e)
 else:
 os.remove(file)
 self.logger.info(
 'File {} successfully removed'.format(file_path))
 except Exception as e:
 self.manage_exception(e)
if __name__ == "__main__":
 watcher = FolderWatcher(os.path.join('Config', 'config.ini'))
 watcher.watch()
answered Nov 1, 2015 at 9:39
\$\endgroup\$
1
  • \$\begingroup\$ Thank you very much for the detailed explanation. I was also thinking of creating a class for it, but i dont know where to start. The code looks cleaner now. @Mathias \$\endgroup\$ Commented Nov 2, 2015 at 0:36
2
\$\begingroup\$

Use batches

One of the issue I can see is that you are handling CSV file line-by-line. You should consider batching your requests to ensure you are sending queries to MySQL batched for better performance. You may be surprised with performance improvement.

See here - also search how to send queries to MySQL in batches: https://stackoverflow.com/a/5526937/2075157

A few comments

  • Don't hard-code path-separators even if you are planning to run on windows only. See: https://docs.python.org/2/library/os.html#os.pathsep

    conf_file = currdir + "\Config\config.ini"

  • Use format function "exception on {0}!".format(option) instead of % when formatting strings for output - this is industry standard.

Minor

  • formatting is off and not-consistent: like space after comma-separator between method arguments is not consistent - choose one standard and follow it.

Good parts

  • logging is done great
answered Oct 31, 2015 at 23:26
\$\endgroup\$
0

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.