1
\$\begingroup\$

I have written the below code while helping one of the colleague over the SO and this code working as desired, but I feel there may be some improvement specially on the string formatting syntax's which can be better reviewed by experts here.

Code snippet:

#!/python3/bin/python
import pandas as pd
pd.set_option('expand_frame_repr', True)
########################################################################################
## Importing SMTP for the e-mail distribution ###
########################################################################################
from email.mime.text import MIMEText
from email.mime.multipart import MIMEMultipart
import smtplib
from smtplib import SMTP
import requests
import time
import glob
import os
from os import path
import shutil
# Set the timestamp for file based on the timestamp_src and timestamp_dst
timestamp_src = time.strftime("%Y%m%d")
timestamp_dst = time.strftime("%Y%m%d-%H:%M:%S")
# Set the Destination file Variable
src = "/home/user2/.working_scripts/hpe_out/"
dst = "/home/user2/script_logs/"
file_name = "hpeCriticalAlert"
outfile = (f"{src}/{file_name}.{timestamp_src}.csv")
csv_files = [f_extn for f_extn in os.listdir(src) if f_extn.endswith(".csv") and path.isfile(path.join(src, f_extn))]
for file in csv_files:
 shutil.move(path.join(f"{src}/{file}"), f"{dst}/{file}-{timestamp_dst}.log")
session = requests.Session()
url = "https://synergy.example.com/rest/login-sessions"
credentials = {"userName": "administrator", "password": "myPass"}
headers = {"accept": "application/json",
 "content-type": "application/json",
 "x-api-version": "120",
 }
response = session.post(url, headers=headers, json=credentials, verify=False)
session_id = response.json()["sessionID"]
#############################################################################################
url = "https://synergy.example.com/rest/resource-alerts"
headers = {"accept": "application/json",
 "content-type": "text/csv",
 "x-api-version": "2",
 "auth": session_id,
 }
# stream=True on the request avoids reading the content at once into memory for large responses.
response = session.get(url, headers=headers, verify=False, stream=True)
with open(outfile, 'wb') as f:
 for chunk in response.iter_content(chunk_size=1024):
 f.write(chunk)
#################################################################################################
for file in glob.glob(f"{outfile}"):
#for file in glob.glob(f"{src}/{file_name}.*.csv"):
 df = pd.read_csv(file)
##################################################################################################
recipients = ['[email protected]']
mail_receivers = [elem.strip().split(',') for elem in recipients]
msg = MIMEMultipart()
msg['Subject'] = "HTML OneView"
msg['From'] = '[email protected]'
#########################################################################
html = """\
<html>
 <head>
 <style>
 table, th, td {{font-size:9pt; border:1px solid black; border-collapse:collapse; text-align:left; background-color:LightGray;}}
 th, td {{padding: 5px;}}
 </style>
 </head>
 <body>
 Dear Team,<br><br>
 Please Find the Report!<br><br>
 {0} <br><br>
 Kind regards.<br>
 Synergy Global Dashboard.
 </body>
</html>
 """.format(df.to_html(index=False))
part1 = MIMEText(html, 'html')
msg.attach(part1)
server = smtplib.SMTP('smtp.example.com')
server.sendmail(msg['From'], mail_receivers, msg.as_string())
asked Dec 16, 2020 at 19:57
\$\endgroup\$
0

1 Answer 1

2
\$\begingroup\$

Since this looks like a script that might get onto some production server at some point, I'm going to give some recommendations from that point of view and how I'd do it if I were to implement this for my employer.

In this review I'll mostly focus on the project structure, requests, exceptions and credentials management. I'll first paste the code that I came up with and then I'm going to explain in detail why did I do most of the changes.

# config.py
from getpass import getpass
import time
TIMESTAMP_SRC = time.strftime("%Y%m%d")
TIMESTAMP_DST = time.strftime("%Y%m%d-%H:%M:%S")
DST = "/home/user2/script_logs/"
SRC = "/home/user2/.working_scripts/hpe_out/"
FILE_NAME = "hpeCriticalAlert"
OUTFILE = f"{SRC}/{FILE_NAME}.{TIMESTAMP_SRC}.csv"
BASE_URL = "https://synergy.example.com/rest"
USERNAME = "administrator"
PASSWORD = getpass("Input your password: ")
# EMAIL CONFIG
SMTP_SERVER = 'smtp.example.com'
FROM = '[email protected]'
TO = ['[email protected]', '[email protected]']
SUBJECT = 'HTML OneView'
EMAIL_TEMPLATE = """\
<html>
 <head>
 <style>
 table, th, td {{font-size:9pt; border:1px solid black; border-collapse:collapse; text-align:left; background-color:LightGray;}}
 th, td {{padding: 5px;}}
 </style>
 </head>
 <body>
 Dear Team,<br><br>
 Please Find the Report!<br><br>
 {} <br><br>
 Kind regards.<br>
 Synergy Global Dashboard.
 </body>
</html>"""
# main.py
import os
import shutil
import smtplib
from email.message import EmailMessage
import pandas as pd
pd.set_option('expand_frame_repr', True)
import requests
from .config import (
 BASE_URL,
 DST,
 OUTFILE,
 PASSWORD,
 SRC,
 TIMESTAMP_DST,
 USERNAME,
 SUBJECT,
 FROM,
 TO,
 EMAIL_TEMPLATE,
 SMTP_SERVER,
)
class FileMoveFailure(Exception):
 pass
class SynergyRequestFailure(Exception):
 pass
class SessionIdRetrievalFailure(SynergyRequestFailure):
 pass
class ResourceAlertsRetrievalFailure(SynergyRequestFailure):
 pass
def move_csv_files():
 for csv_file in os.listdir(SRC):
 if csv_file.endswith(".csv") and os.path.isfile(os.path.join(SRC, csv_file)):
 try:
 shutil.move(
 os.path.join(f"{SRC}/{csv_file}"), 
 f"{DST}/{csv_file}-{TIMESTAMP_DST}.log"
 )
 except OSError as os_error:
 raise FileMoveFailure(
 f'Moving file {csv_file} has failed: {os_error}'
 )
def get_session_id(session):
 try:
 response = session.post(
 url=f"{BASE_URL}/login-sessions",
 headers={
 "accept": "application/json",
 "content-type": "application/json",
 "x-api-version": "120",
 },
 json={
 "userName": USERNAME,
 "password": PASSWORD
 },
 verify=False
 )
 except requests.exceptions.RequestException as req_exception:
 # you should also get this logged somewhere, or at least
 # printed depending on your use case
 raise SessionIdRetrievalFailure(
 f"Could not get session id: {req_exception}"
 )
 json_response = response.json()
 if not json_response.get("sessionID"):
 # always assume the worse and do sanity checks & validations
 # on fetched data
 raise KeyError("Could not fetch session id")
 return json_response["sessionID"]
def get_resource_alerts_response(session, session_id):
 try:
 return session.get(
 url=f"{BASE_URL}/resource-alerts",
 headers={
 "accept": "application/json",
 "content-type": "text/csv",
 "x-api-version": "2",
 "auth": session_id,
 },
 verify=False,
 stream=True
 )
 except requests.exceptions.RequestException as req_exception:
 # you should also get this logged somewhere, or at least
 # printed depending on your use case
 raise ResourceAlertsRetrievalFailure(
 f"Could not fetch resource alerts: {req_exception}"
 )
def resource_alerts_to_df(resource_alerts_response):
 with open(OUTFILE, 'wb') as f:
 for chunk in resource_alerts_response.iter_content(chunk_size=1024):
 f.write(chunk)
 return pd.read_csv(OUTFILE)
def send_email(df):
 server = smtplib.SMTP(SMTP_SERVER)
 msg = EmailMessage()
 msg['Subject'], msg['From'], msg['To'] = SUBJECT, FROM, TO
 msg.set_content("Text version of your html template")
 msg.add_alternative(
 EMAIL_TEMPLATE.format(df.to_html(index=False)),
 subtype='html'
 )
 server.send_message(msg)
def main():
 move_csv_files()
 session = requests.Session()
 session_id = get_session_id(session)
 resource_alerts_response = get_resource_alerts_response(session, session_id)
 df = resource_alerts_to_df(resource_alerts_response)
 send_email(df)
if __name__ == '__main__':
 main()

config.py

The first major thing that I did was to split apart the configuration data from the main script. This will make your life easier when it comes to modifying the values since they're altogether.

Another very important and sensitive thing to notice is the PASSWORD field. Do NOT, EVER, expose credentials in plain text. There are so many cases where this happened at big companies and resulted in loss of millions of $$ due to hackers exploiting this...

There are multiple ways of avoiding this. None is perfect but it's definitely better than storing passwords in plain-text:

  • Ask for user's input;
  • Store them as ENV VARIABLES;
  • Use some secrets management tool like Ansible Vault

This config file doesn't necessarily have to be a .py file. It can be a yaml, ini, .env, json file. I made it like this for convenience.

main.py

Honestly, when I first saw your script, I remembered how I used to write code when I started programming. It's messy, you understand it at the moment of writing and it kinda does the job you need. But when it comes to maintaining it, it's really, really tedious and frustrating: all those useless comments, unused imports, no spacing, nothing... You wouldn't even know where to start next time you have to add some new feature. Because of this you'd probably need to completely rewrite it.

Improvements brought by my implementation:

  • code is definitely easier to read and understand, even if I haven't added any docstrings.
  • it's a lot easier to modify any part of the code since it's now broken into smaller pieces. You can read more about the SRP principle here. I honestly could've done better on this one but for a first review, it'll do.

What is still to be improved (by you)

  • add some logging
  • add docstrings / type hints
  • chose better names for your variables (SRC, DST and so on aren't to descriptive)
  • create tests to ensure your code works (I haven't really tested my implementation but it should give you an overall idea)

1. Imports

As you can see I don't have any unused imports and it's really easy to recognise stdlib modules, 3rd party modules and local modules because of the grouping:

# stdlib
import os
import shutil
import smtplib
from email.message import EmailMessage
# 3rd party modules
import pandas as pd
pd.set_option('expand_frame_repr', True)
import requests
# local config
from .config import (
 BASE_URL,
 DST,
 OUTFILE,
 PASSWORD,
 SRC,
 TIMESTAMP_DST,
 USERNAME,
 SUBJECT,
 FROM,
 TO,
 EMAIL_TEMPLATE,
 SMTP_SERVER,
)

2. Exceptions

I've seen people creating a new exceptions.py file where they create custom exceptions for different use-cases. I just let them in main.py. Having your own subclasses lets you treat distinct exception cases separately. (such as the difference between not having permissions to run a report, and the report execution failing).

3. CSV files

Since you only want to move the .csv files to a specific location it's not necessary to build a list and iterate against it again to move each file. Instead, do it in one run and also raise an exception if something happens:

4. Requests

It's always a good practice to catch any specific exception that might be raised. I've only handled the generic RequestException which is going to catch any request exception but you might want to treat some exceptions separately: Connection errors, timeout errors, etc.

Also, always make sure the data you receive is what you expect. I've seen cases where certain APIs where changed by some provider without announcing us (the developers) or without stating in their documentation otherwise. This usually breaks everything and takes some time figuring out what the root cause is.

5. Others

I think the rest of it is pretty straight forward. I'm not saying my solution is perfect but it's a better starting point.

Enjoy! :)

answered Dec 16, 2020 at 22:25
\$\endgroup\$
1
  • \$\begingroup\$ Alex, This is a nice one, however i have to keep the password into the code, is there a way if we can use a kind of Ansible vault, i'm not pretty familiar with using class but will get into it. \$\endgroup\$ Commented Dec 17, 2020 at 7:00

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.