Pandas/Tkinter GUI Excel report generator

Question 1

I built a GUI tool that takes excel files and outputs a finished report to help automate a report at work. It was a fantastic learning experienced and I feel much more comfortable with pandas and python, but I am very aware of some bad programming practices I have included.

I am self taught, and a little stuck on what I should focus on how to improve my program, I'm just looking to be pointed in the right direction.

Most of the program is wrapped into one "Try/Except" block. I'm really not experienced enough to know what else I can do to improve it, aside from implementing some DRY and wrapping part of the program into a function so its not repeated for multiple files.

Could a pandas guru or python guru give me a nudge in the right direction? I'm hitting a block on best practices to implement.

import Tkinter
import pandas as pd
import xlsxwriter
class simpleapp_tk(Tkinter.Tk):
 def __init__(self,parent):
 Tkinter.Tk.__init__(self,parent)
 self.parent = parent
 self.initialize()
 def initialize(self):
 self.grid() # Create the grid
 #Label for entering filename
 label_one = Tkinter.Label(self, text="Please Enter a Filename:",anchor="w",fg="black",
 font="Times_New_Roman 10 bold") 
 label_one.grid(column=0,row=1,sticky='W')
 #Label for entering filename
 label_one2 = Tkinter.Label(self, text="Please Enter a 2nd Filename:",anchor="w",fg="black",
 font="Times_New_Roman 10 bold") 
 label_one2.grid(column=0,row=2,sticky='W') 
 #Example label
 label_two = Tkinter.Label(self, text="RXInfoAug1.xlsx",anchor="center",fg="black",
 font="Times_New_Roman 10")
 label_two.grid(column=1,row=3,sticky='EW')
 #Creating label area for user information.
 label_three = Tkinter.Label(self, text="Please select User:",anchor='w',fg='black',
 font="Times_New_Roman 10 bold")
 label_three.grid(column=0,row=4,sticky='EW')
 #Label four is for EXAMPLE text
 label_four = Tkinter.Label(self,text="Example:",anchor='w',fg='black',
 font="Times_New_Roman 10 bold")
 label_four.grid(column=0,row=3,sticky='W')
 #Label five is below Generate RX report, spits out result information.
 label_five = Tkinter.Label(self,text="----------Result Info----------",anchor='center',
 fg='black',font="Times_New_Roman 10 bold")
 label_five.grid(column=0,row=6,sticky='EW',columnspan=2)
 #Label six is return information after clicking the button
 self.labelVariable_six = Tkinter.StringVar() #have to create a variable to return
 label_six = Tkinter.Label(self,text="None",anchor='center',fg='white',bg="blue",
 font='Times_New_Roman 10 bold',
 textvariable=self.labelVariable_six) #notice the return variable
 label_six.grid(column=0,row=8,sticky='EW',columnspan=2)
 #Label seven is more return information, pertaining to after you click the button.
 self.labelVariable_seven = Tkinter.StringVar() #creating the return variable
 label_seven = Tkinter.Label(self,text="None",anchor='center',fg='white',bg='blue',
 font='Times_New_Roman 10 bold',
 textvariable=self.labelVariable_seven)
 label_seven.grid(column=0,row=9,sticky='EW',columnspan=2)
 #Entry Variable creation and location set.
 self.entryVariable = Tkinter.StringVar()
 self.entry = Tkinter.Entry(self,textvariable=self.entryVariable) #NOTICE the entry var name
 self.entry.grid(column=1,row=1,sticky='EW')
 #Entry Variable2 creation and location set.
 self.entryVariable2 = Tkinter.StringVar()
 self.entry2 = Tkinter.Entry(self,textvariable=self.entryVariable2) #NOTICE the entry var name
 self.entry2.grid(column=1,row=2,sticky='EW') 
 #Creating the variable of which user used by option menu
 self.var = Tkinter.StringVar(self)
 self.var.set("User")
 #Creating a drop down menu for user selection
 self.option_menu = Tkinter.OptionMenu(self, self.var, "Corey","Warwick","Jane", "Jacque")
 self.option_menu.grid(column=1,row=4,sticky='EW')
 #Button to generate reports - When clicked runs OnButtonClick
 button = Tkinter.Button(self,text="Generate RX Report",
 font="Times_New_Roman 12 bold",
 command=self.OnButtonClick)
 button.grid(column=0,row=6,sticky="EW",columnspan=2)
 #Layout management area
 self.grid_columnconfigure(0,weight=1)
 self.grid_columnconfigure(1,weight=1)
 #Layout Manager resizes if you adjust window
 self.resizable(True,True)
 self.update()
 #sets canvas size (Length, depth)
 self.geometry("320x170+300+300")
 self.entry.focus_set()
 self.entry.selection_range(0, Tkinter.END)
 self.entry2.selection_range(0, Tkinter.END)
def OnButtonClick(self):
 UserName = self.var.get()
 FileName = self.entryVariable.get() 
 if len(self.entryVariable2.get()) > 0:
 FileName2 = self.entryVariable2.get()
 else:
 FileName2 = self.entryVariable.get()
 if UserName == "User":
 self.labelVariable_six.set("Please select a User before Continuing")
 self.labelVariable_seven.set("") 
 elif UserName == "Corey":
 FilePath = "\cashley\Desktop\\"
 self.labelVariable_six.set("Welcome Corey, enjoy the report.")
 self.labelVariable_seven.set("Output file is located on desktop")
 writer = pd.ExcelWriter('C:\\Users\\cashley\\Desktop\\RX_STATS.xlsx')
 elif UserName == "Warwick":
 FilePath = "\wbarlow.GMSPRIMARY\Desktop\\"
 self.labelVariable_six.set("Welcome Warwick, always great to see you")
 self.labelVariable_seven.set("Output file is RX_STATS located on desktop")
 writer = pd.ExcelWriter('C:\\Users\\wbarlow.GMSPRIMARY\\Desktop\\RX_STATS.xlsx')
 elif UserName == "Jane":
 FilePath = "\janen.GMSPRIMARY\Desktop\\"
 self.labelVariable_six.set("Welcome Jane, enjoy the report")
 self.labelVariable_seven.set("Output file is RX_STATS located on desktop")
 writer = pd.ExcelWriter('C:\\Users\\janen.GMSPRIMARY\\Desktop\\RX_STATS.xlsx')
 elif UserName == "Jacque":
 FilePath = "\jacquea\Desktop\\"
 self.labelVariable_six.set("Welcome Jacquea, enjoy the report")
 self.labelVariable_seven.set("Output file is RX_STATS located on desktop")
 writer = pd.ExcelWriter('C:\\Users\\jacquea\\Desktop\\RX_STATS.xlsx')
 try:
 df = pd.read_excel("C:\Users" + FilePath + FileName)
 #DF_C Was added in so you can bring two files in to compare the two.
 df_C = pd.read_excel("C:\Users" + FilePath + FileName2)
 df.columns = ['GROUP','MEMBER_NAME','DRUG','CLAIM_DATE','SUB_QUANT','DAYS_SUPPLY','COPAY_AMT','TOTAL_COST']
 df_C.columns = ['GROUP','MEMBER_NAME','DRUG','CLAIM_DATE','SUB_QUANT','DAYS_SUPPLY','COPAY_AMT','TOTAL_COST']
 df['GROUP'] = df['GROUP'].astype(str)
 df_C['GROUP'] = df_C['GROUP'].astype(str)
 total_expense = df['TOTAL_COST'].sum()
 Y = total_expense * .80
 df['PERCENT'] = (df['TOTAL_COST'] / total_expense).round(4)
 #Bad DRY here but metrics for analysis for df_C
 total_expense_C = df_C['TOTAL_COST'].sum()
 Y_C = total_expense_C * .80
 df_C['PERCENT'] = (df_C['TOTAL_COST'] / total_expense_C).round(4) 
 emp_dict = {
 'FAKELASTNAME, FAKEFIRSTNAME': 'REDACTED1',
 'BOB, FAKE': 'REDACTED2',
 'SARAH, FAKE': 'REDACTED3',
 }
 df['MEMBER_NAME'].replace(emp_dict, inplace=True)
 df.loc[df.DRUG.isin(['LATUDA TAB 20MG', 'LATUDA TAB 60MG','LATUDA TAB 40MG']), 'DRUG'] = 'LATUDA TAB'
 df.loc[df.DRUG.isin(['HUMIRA START KIT 40MG PEN', 'HUMIRA PEN INJ 40MG/0.8','HUMIRA KIT 40MG SYN']), 'DRUG'] = 'HUMIRA'
 df.loc[df.DRUG.isin(['ENBREL SCLIK SYR 50MG/ML', 'ENBREL SYR 50MG/ML','ENBREL SYR 25/0.5ML']), 'DRUG'] = 'ENBREL'
 df.loc[df.DRUG.isin(['TRULICITY(4) PEN 1.5/0.5', 'TRULICITY(4) PEN 0.75/0.5']), 'DRUG'] = 'TRULICITY'
 df_C['MEMBER_NAME'].replace(emp_dict, inplace=True)
 df_C.loc[df_C.DRUG.isin(['LATUDA TAB 20MG', 'LATUDA TAB 60MG','LATUDA TAB 40MG']), 'DRUG'] = 'LATUDA TAB'
 df_C.loc[df_C.DRUG.isin(['HUMIRA START KIT 40MG PEN', 'HUMIRA PEN INJ 40MG/0.8','HUMIRA KIT 40MG SYN']), 'DRUG'] = 'HUMIRA'
 df_C.loc[df_C.DRUG.isin(['ENBREL SCLIK SYR 50MG/ML', 'ENBREL SYR 50MG/ML','ENBREL SYR 25/0.5ML']), 'DRUG'] = 'ENBREL'
 df_C.loc[df_C.DRUG.isin(['TRULICITY(4) PEN 1.5/0.5', 'TRULICITY(4) PEN 0.75/0.5']), 'DRUG'] = 'TRULICITY' 
 df2 = df['TOTAL_COST'].groupby(df['MEMBER_NAME']).sum().nlargest(10)
 df3 = df['TOTAL_COST'].groupby(df['DRUG']).sum().nlargest(20)
 df4 = df['MEMBER_NAME'].value_counts().nlargest(20)
 df5 = df['PERCENT'].groupby(df['MEMBER_NAME']).sum().nlargest(20)
 df7 = df[df['TOTAL_COST'] > 0]
 df8 = df['TOTAL_COST'].groupby(df['MEMBER_NAME']).sum().nlargest(1000)
 df13 = df['TOTAL_COST'].groupby(df['GROUP']).sum().nlargest(5)
 count = 0
 total = 0
 for row in df8:
 if total < Y:
 total = total + row
 count += 1
 WW = count
 #Count of how many Members are in the file. It combines duplicate individuals with nunique.
 Ind_count = df['MEMBER_NAME'].nunique()
 # Count of non zero individuals in the file.
 Ind_count_nonzero = df7['MEMBER_NAME'].nunique()
 # Finding what 20 percent of the total is
 twenty_percent1 = Ind_count_nonzero * .20
 twenty_percent = round(twenty_percent1,0)
 row_count = df.shape[0]
 df10 = df['PERCENT'].groupby(df['MEMBER_NAME']).sum().nlargest(WW)
 group_dict = {
 '2006':'GROUPNAME1', 
 '2033':'GROUPNAME2', 
 '2041':'GROUPNAME3',
 }
 df['GROUP'].replace(group_dict, inplace=True)
 df_C['GROUP'].replace(group_dict, inplace=True)
 df15 = df_C[~df_C.MEMBER_NAME.isin(df.MEMBER_NAME)] 
 pivot_C = pd.pivot_table(df15, index=['MEMBER_NAME','GROUP'],values=['TOTAL_COST'], aggfunc='sum') 
 pivot_C2 = pivot_C.sort_values(by=['TOTAL_COST'],ascending=False)
 pv_C = pivot_C2.head(20)
 pv_C2 = pv_C.reset_index()
 df13 = df['TOTAL_COST'].groupby(df['GROUP']).sum().nlargest(5)
 df3 = df3.reset_index()
 df2 = df2.reset_index()
 df4 = df4.reset_index()
 df5 = df5.reset_index()
 df10 = df10.reset_index()
 df13 = df13.reset_index()
 pivot2 = pd.pivot_table(df, index=['MEMBER_NAME','GROUP'],values=['TOTAL_COST'], aggfunc='sum')
 pivot3 = pivot2.sort_values(by=['TOTAL_COST'],ascending=False)
 pv3 = pivot3.head(20)
 pv4 = pv3.reset_index()
 writer = pd.ExcelWriter('C:\\Users' + FilePath + 'RX_STATS.xlsx')
 #writer = pd.ExcelWriter('C:\\Users\\wbarlow.GMSPRIMARY\\Desktop\\RX_STATS.xlsx')
 pv4.to_excel(writer, sheet_name='Summary',index=False,header=True,startcol=0,startrow=10)
 pv_C2.to_excel(writer, sheet_name='Summary',index=False,header=True,startcol=0,startrow=35)
 df3.to_excel(writer, sheet_name='Summary',index=False,header=True,startcol=4,startrow=10)
 df4.to_excel(writer, sheet_name='Summary',index=False,header=True,startcol=7,startrow=10)
 df10.to_excel(writer, sheet_name='Summary',index=False,header=True,startcol=10,startrow=10)
 df13.to_excel(writer, sheet_name='Summary',index=False,header=True,startcol=4,startrow=0)
 df.to_excel(writer, sheet_name='Raw Data',index=False,header=True)
 df_C.to_excel(writer, sheet_name='Compare Data',index=False,header=True)
 worksheet1 = writer.sheets['Summary']
 worksheet2 = writer.sheets['Raw Data']
 worksheet3 = writer.sheets['Compare Data']
 workbook = writer.book
 format1 = workbook.add_format({'num_format':'$#,###.##'})
 format6 = workbook.add_format({'num_format':'$#,###.##','bold':True})
 format2 = workbook.add_format({'bold': True})
 format5 = workbook.add_format({'bold': True,'border':1,'align':'center'})
 format3 = workbook.add_format({'bold': True, 'align': 'center'})
 format4 = workbook.add_format({'num_format':'0.00%'})
 merge_format = workbook.add_format({
 'bold':1,
 'border':1,
 'align':'center'})
 worksheet1.write('A1', "Caremark File Statistics", format2)
 worksheet1.write('A6', "Individual Count:", format2)
 worksheet1.write('A3', "Individual Non-Zero Count:", format2)
 worksheet1.write('A4', "Twenty percent of all individuals:", format2)
 worksheet1.write('A5', "Count representing eighty percent:", format2)
 worksheet1.write('A2', "RX Total Expense:", format2)
 worksheet1.write('B6', Ind_count, format2)
 worksheet1.write('B3', Ind_count_nonzero, format2)
 worksheet1.write('B4', twenty_percent, format2)
 worksheet1.write('B5', WW, format2)
 worksheet1.write('B2', total_expense, format6)
 worksheet1.conditional_format('C12:C31',{'type':'cell',
 'criteria':'>=',
 'value':1,
 'format':format1})
 worksheet1.conditional_format('F12:F31',{'type':'cell',
 'criteria':'>=',
 'value':1,
 'format':format1})
 worksheet1.conditional_format(11,11,11+WW,11,{'type':'cell',
 'criteria':'>=',
 'value':0,
 'format':format4})
 worksheet1.conditional_format('F2:F6',{'type':'cell',
 'criteria':'>=',
 'value':0,
 'format':format1})
 worksheet2.conditional_format(1,8,row_count,8,{'type':'cell',
 'criteria':'>=',
 'value':0,
 'format':format4})
 worksheet3.conditional_format(1,8,row_count,8,{'type':'cell',
 'criteria':'>=',
 'value':0,
 'format':format4})
 worksheet1.conditional_format('C37:C56',{'type':'cell',
 'criteria':'>=',
 'value':1,
 'format':format1}) 
 worksheet1.set_column(0,0,32)
 worksheet1.set_column(1,1,35)
 worksheet1.set_column(2,2,12)
 worksheet1.set_column(3,3,5)
 worksheet1.set_column(4,4,36)
 worksheet1.set_column(5,5,12)
 worksheet1.set_column(6,6,5)
 worksheet1.set_column(7,7,30)
 worksheet1.set_column(8,8,12)
 worksheet1.set_column(9,9,5)
 worksheet1.set_column(10,10,30)
 worksheet1.set_column(11,11,12)
 worksheet2.set_column(0,0,35)
 worksheet2.set_column(1,1,30)
 worksheet2.set_column(2,2,35)
 worksheet2.set_column(3,3,20)
 worksheet2.set_column(4,4,13)
 worksheet2.set_column(5,5,13)
 worksheet2.set_column(6,6,13)
 worksheet2.set_column(7,7,13)
 worksheet2.set_column(8,8,9)
 worksheet3.set_column(0,0,35)
 worksheet3.set_column(1,1,30)
 worksheet3.set_column(2,2,35)
 worksheet3.set_column(3,3,20)
 worksheet3.set_column(4,4,13)
 worksheet3.set_column(5,5,13)
 worksheet3.set_column(6,6,13)
 worksheet3.set_column(7,7,13)
 worksheet3.set_column(8,8,9) 
 worksheet1.merge_range('A9:C9','Top Twenty Expense Individuals',merge_format)
 worksheet1.merge_range('E9:F9','Top Twenty RX Prescriptions', merge_format)
 worksheet1.merge_range('H9:I9','Top Twenty Prescription Fills Per Person', merge_format)
 worksheet1.merge_range('K9:L9','Individuals Representing 80% of Total Cost', merge_format)
 worksheet1.merge_range('A34:C34','New Individuals with Largest RX Totals', merge_format)
 worksheet1.write('H11', 'MEMBER_NAME', format5)
 worksheet1.write('I11','COUNT',format5)
 worksheet1.write('E1','LARGEST RX GROUPS',format5)
 worksheet1.write('A36','MEMBER NAME', format5)
 writer.save()
 self.entry.focus_set()
 self.entry.selection_range(0, Tkinter.END)
 except:
 self.labelVariable_six.set("Confirm User is accurate.")
 self.labelVariable_seven.set("Confirm your filename.")
if __name__ == "__main__":
 app = simpleapp_tk(None)
 app.title('WB Caremark Cleaner')
 app.mainloop()

Question 2

Welcome to Code Review! I suspect that emp_dict and group_dict are supposed to be initialized with some data? Could you fill in some fake sample entries?

Question 3

I previously just removed that data for security reasons, I edited some placeholder text there now. Thanks for the welcome also! This is my first post here, never reached out for pier review, I feel like I need to in order to develop into a better developer.

Question 4

Looks like your indentation got lost in the initialize function; can you edit that?

Question 5

fixed sorry about that

Question 6

To get this working for Python 3, I needed to change:

import Tkinter
class simpleapp_tk(Tkinter.Tk):

to

import tkinter
class simpleapp_tk(tkinter.Tk):

So just a lower case for tkinter (a global search/replace). Warwick, you have a lot of code in one big class!
The creation and formatting of the worksheet can be extracted into another function external to the simpleapp_tk class.

For example, if you were going to do that, the code inside the onButtonClick should look like:

 result = make_me_a_spreadsheet(file_name, file_name2)
 if result:
 self.entry.focus_set()
 self.entry.selection_range(0, tkinter.END)
 else:
 self.label_variable_six.set("Confirm User is accurate.")
 self.label_variable_seven.set("Confirm your filename.")

With the start of the excel creation function starting like:

def make_me_a_spreadsheet(filename_a, filename_b):
 try:
 df = pd.read_excel(filename_a) # r"C:\Users" + FilePath + file_name)
 # ... etc

(I just did a cut/paste myself), and of course, the end of the spreadsheet creation function being like this:

 writer.save()
 return True
except:
 return None

the return None means that if there was an error creating the file inside the large try/except block, it will return an empty value to the GUI class. With result being nothing, the GUI will display the correct message (the same as you currently have it).

Please have a go at making this change, and don't forget to make a backup of your code before changing it.

You should encounter a minor bug or two, such as:

writer = pd.ExcelWriter(r'C:\Users' + FilePath + 'RX_STATS.xlsx')

However I'm confident that you will be able to figure out any bugs. If you make this change, your code will be a smaller step towards being better. You will have separated the creation of the xlsx file from the GUI. In the future, if there is a problem when creating the spreadsheet, you know to look only at the make_me_a_spreadsheet function - and not at the GUI class. Much easier to track down bugs!

You can follow this methodology to improve your code, that is, extracting specific steps/actions into a separate function (with passing values in and passing them out). Please see some of the other Python examples people offer on Code Review to get more ideas.

Good luck, hope this helps a little bit!

Question 7

Thank you so much for your input, it is a massive help, ill get started on implementing these changes.

C. Harley C. HarleyC. Harley 1,6537 silver badges8 bronze badges · Accepted Answer · 2018-08-03 09:30:47Z

To get this working for Python 3, I needed to change:

import Tkinter
class simpleapp_tk(Tkinter.Tk):

to

import tkinter
class simpleapp_tk(tkinter.Tk):

So just a lower case for tkinter (a global search/replace). Warwick, you have a lot of code in one big class!
The creation and formatting of the worksheet can be extracted into another function external to the simpleapp_tk class.

For example, if you were going to do that, the code inside the onButtonClick should look like:

 result = make_me_a_spreadsheet(file_name, file_name2)
 if result:
 self.entry.focus_set()
 self.entry.selection_range(0, tkinter.END)
 else:
 self.label_variable_six.set("Confirm User is accurate.")
 self.label_variable_seven.set("Confirm your filename.")

With the start of the excel creation function starting like:

def make_me_a_spreadsheet(filename_a, filename_b):
 try:
 df = pd.read_excel(filename_a) # r"C:\Users" + FilePath + file_name)
 # ... etc

(I just did a cut/paste myself), and of course, the end of the spreadsheet creation function being like this:

 writer.save()
 return True
except:
 return None

the return None means that if there was an error creating the file inside the large try/except block, it will return an empty value to the GUI class. With result being nothing, the GUI will display the correct message (the same as you currently have it).

Please have a go at making this change, and don't forget to make a backup of your code before changing it.

You should encounter a minor bug or two, such as:

writer = pd.ExcelWriter(r'C:\Users' + FilePath + 'RX_STATS.xlsx')

However I'm confident that you will be able to figure out any bugs. If you make this change, your code will be a smaller step towards being better. You will have separated the creation of the xlsx file from the GUI. In the future, if there is a problem when creating the spreadsheet, you know to look only at the make_me_a_spreadsheet function - and not at the GUI class. Much easier to track down bugs!

You can follow this methodology to improve your code, that is, extracting specific steps/actions into a separate function (with passing values in and passing them out). Please see some of the other Python examples people offer on Code Review to get more ideas.

Good luck, hope this helps a little bit!

Thank you so much for your input, it is a massive help, ill get started on implementing these changes.

Stack Exchange Network

Pandas/Tkinter GUI Excel report generator

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Pandas/Tkinter GUI Excel report generator

1 Answer 1

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions