Please review my code and give some improvement suggestions.
I wrote code but not to the best standards. I think this site gives some valuable suggestions.
This code is for extracting .xlsx
or CSV files into a list of dicts and changes the values of the dict objects. After that, it converts that to a JSON file.
excelparser.py
import pandas as pd
class ExcelParser:
def __init__(self):
self.config = []
def extract(self, fileName):
try:
raw_excel=pd.read_excel(fileName,sheet_name=-1,
na_values= None,keep_default_na=False)
return [x for k, v in raw_excel.items()
for x in v[v.columns.intersection(self.config)].to_dict(orient='records')]
except:
raw_excel=pd.read_csv(fileName,na_values= None,keep_default_na=False)
return raw_excel[self.config].to_dict(orient='records')
fc.py
from excel_parser import ExcelParser
from bank_writer_base import BankDataWriterBase
import json
class FreechargeImp(BankDataWriterBase):
def __init__(self, input_path, output_path):
self.bankObj = BankDataWriterBase(output_path)
self.input_path = input_path
def extract_json(self):
conf = ["BNPL Activated", "Mobile Number"]
obj = ExcelParser()
obj.config = conf
new_list = obj.extract(self.input_path)
final_list = []
for i in range(len(new_list)):
val = {}
if new_list[i]["BNPL Activated"] == "No":
val = {"App_id": "", "mobile_No": new_list[i]["Mobile Number"],
"loan_type": 33, "bank_id": 114, "latest_status": "Rejected", "bank_appid": "", "pan_no": "", "cust_downloadedapp": {"date": ""}, "cust_completedapp": {"date": ""}, "rejected": {"date": "", "rejection_reason": ""}, "sanctioned": {"amount": "", "date": "", "tenure": "", "interest_rate": ""}, "offer_accepted": {"date": ""}, "disbursed": {"amount": "", "los_id": "", "date": ""}}
if new_list[i]["BNPL Activated"] == "Yes":
val = {"App_id": "", "mobile_No": new_list[i]["Mobile Number"],
"loan_type": 33, "bank_id": 114, "latest_status": "disbursed", "bank_appid": "", "pan_no": "", "cust_downloadedapp": {"date": ""}, "cust_completedapp": {"date": ""}, "rejected": {"date": "", "rejection_reason": ""}, "sanctioned": {"amount": "", "date": "", "tenure": "", "interest_rate": ""}, "offer_accepted": {"date": ""}, "disbursed": {"amount": "", "los_id": "", "date": ""}}
final_list.append(val)
self.bankObj.write_file(json.dumps(final_list, indent=4))
bankwriterbase.py
import json
class BankDataWriterBase:
def __init__(self, input_path, output_path):
self.input_path = input_path
self.output_path = output_path
def write_file(self, data, file):
with open(self.output_path, 'w') as f:
f.write(data)
main.py
from fc import FreechargeImp
from bank_writer_base import BankDataWriterBase
input_path = input("Enter Input path = ")
output_path = input("Enter output path = ")
obj = FreechargeImp(input_path, output_path)
obj.extract_json()
I have also attached the class diagram here
Class Diagram
1 Answer 1
Documentation
The PEP 8 style guide recommends
adding docstrings for classes and functions. While extract
is a good name for a function, a docstring would be helpful in understanding what is being extracted and what is being returned by the function. These 2 lines look impressive, but I find it hard to understand what the statement is doing:
return [x for k, v in raw_excel.items()
for x in v[v.columns.intersection(self.config)].to_dict(orient='records')]
It's been 3 years since you posted the code, and I would not be surprised if you did not remember what it does. Adding comments may help.
Naming
PEP 8 recommends snake_case for function and variable names.
bankObj
would be bank_obj
. In fact, "obj" is too generic. Perhaps bank_writer
would be more specific.
FreechargeImp
: "Imp" is vague. Choose something more specific.
Layout
There are 2 very long lines in the FreechargeImp
class which detract from readability.
The black program can be used to automatically
reformat the code to make it easier to understand.
Also, those 2 long lines look to have a lot of the same code. You could reduce the repetition by using a variable.
Efficiency
The 2 if
statements in the extract_json
function are mutually exclusive and can be combined into an if/elif
:
if new_list[i]["BNPL Activated"] == "No":
...
elif new_list[i]["BNPL Activated"] == "Yes":
This makes code intent clearer, and it improves efficiency since the 2nd check is not needed if the first is true.
Explore related questions
See similar questions with these tags.
files int list of dicts
,concert that
. Try reversing the direction of translation. Take results with a grain of salt.) \$\endgroup\$