I have a class that imports the FX Rates produced by another department.
This works as intended, which is to return a Pandas DataFrame with all FX Rates by month.
The DataFrame it returns is then used by another 5 classes, that basically import other files and do some formatting and calculations with their own columns, always using the FX Rates from the FxRates class.
I run this code in a Jupyter Notebook.
I want to know if:
- Code is refactored enough or over-refactored
- Is it good practice to update the instance variables as I have done so?
- Is there anything else that stands out as being bad practice?
class FxRates:
def __init__(self, reporting_date):
self.reporting_date = reporting_date
self.path = self.get_path()
def get_path(self):
"""Get path for the Fx Rates."""
content_list = listdir(os.path.join(os.getcwd(), 'Data'))
file_path = os.path.join(
os.getcwd(),
'Data',
list(filter(lambda x: 'FX rates' in x, content_list))[0]
)
return file_path
def reporting_date_check(self):
"""
Check if date input follows the criteria below:
31/12/yyyy, 31/03/yyyy, 30/06/yyyy, 30/09/yyyy
"""
accepted_dates = [
'31/12',
'31/03',
'30/06',
'30/09'
]
# Check if first 5 characters match accepted_dates
if self.reporting_date[:5] in accepted_dates:
reporting_date = pd.to_datetime(self.reporting_date,
dayfirst=True)
self.reporting_date = reporting_date
else:
# If not, raise ValueError
raise ValueError(
"""reporting_date does not match one of the following:
31/12/yyyy, 31/03/yyyy, 30/06/yyyy, 30/09/yyyy"""
)
def import_excel_rates(self):
"""Import FX Rates in Excel file from Group."""
rates = pd.read_excel(self.path,
sheet_name='historic rates',
skiprows=2,
header=None,
usecols='B:O',
skipfooter=1)
return rates
def EWI_check(self, rates):
"""
Check if the reporting month already has FX Rates defined.
If not, copy FX Rates from previous month.
"""
# For EWI we need to use FX Rates from 1 month before
if pd.isnull(rates.iloc[0, self.reporting_date.month]):
print("""
\n########## Warning ##########:
\nThere are no FX Rates for {0}/{1}.
\nFX Rates being copied from {2}/{3}.\n""".format(
rates.columns[self.reporting_date.month],
self.reporting_date.year,
rates.columns[self.reporting_date.month - 1],
self.reporting_date.year
))
# Copy FX Rates from previous month
rates.iloc[:, self.reporting_date.month] = \
rates.iloc[:, self.reporting_date.month - 1]
else:
pass
return rates
def import_rates(self):
"""
Import Group Fx rates into a Pandas Dataframe
"""
# Check if reporting date is correct
self.reporting_date_check()
# Import FX Rates in Excel file
rates = self.import_excel_rates()
# Set column headers manually
rates.columns = ['ISO Code',
'December ' + str(self.reporting_date.year - 1),
'January',
'February',
'March',
'April',
'May',
'June',
'July',
'August',
'September',
'October',
'November',
'December']
# Set ISO Code as Index
rates.index = rates['ISO Code'].values
rates.drop('ISO Code', axis=1, inplace=True)
# Check if we have FX Rates for the reporting month
# If not, copy from last month
return self.EWI_check(rates)
1 Answer 1
A note on terminology
"Refactored enough" is only meaningful relative to the code's current state and how it was before; but "over-refactored" is kind of meaningless. I guess the only time that idea could even be applicable is if refactoring occupied too much time or too many corporate resources. Maybe you mean over-abstracted, but that's conjecture. Anyway.
Type hints
reporting_date
could stand to get a type hint, likely : str
given your later usage of this variable.
Pathlib
Consider replacing listdir
, os.path.join
and os.getcwd
with pathlib.Path
equivalents, which are typically better-structured and have nice object representations for paths.
Parsing
Don't store the string representation of reporting_date
. Do something in the constructor like self.reporting_date = self.parse_date(reporting_date)
, where the latter is a static method to replace your current reporting_date_check
. This method would not mutate member variables and would simply return the date once it's figured that out.
Sets
accepted_dates = [
'31/12',
'31/03',
'30/06',
'30/09'
]
should be a class static, initialized via set literal - something like
class FxRates:
ACCEPTED_DATES = {
'31/12',
'31/03',
'30/06',
'30/09',
}
That said, the approach is a little backwards. You should not do string comparison on the string-formatted date. Parse it first, then do validation on its integer parts after. The accepted dates above can turn into a set of 2-tuples, (day, month)
.
Heredocs
This:
print("""
\n########## Warning ##########:
\nThere are no FX Rates for {0}/{1}.
...
is problematic. You're packing a bunch of whitespace in there that you shouldn't. One solution is to move the string to a global constant to avoid the indentation; you should also replace the explicit \n
with actual newlines in the string. Another solution is to keep the text where it is but replace it with a series of implicit-concatenated strings, one per line, i.e.
print(
"########## Warning ##########:\n"
"There are no FX Rates for {0}/{1}.\n"
# ...
-
\$\begingroup\$ Thanks for the answer! And about the refactoring, I definitely meant "over-abstracted". I've read so much about functions that "do one thing, and one thing only" that now I'm paranoid about putting 2 actions in the same function. So, in your opinion, does the code above have a good balance between what each function does? \$\endgroup\$Victor Correra– Victor Correra2020年06月18日 07:06:56 +00:00Commented Jun 18, 2020 at 7:06
-
\$\begingroup\$ Seems pretty good to me \$\endgroup\$Reinderien– Reinderien2020年06月18日 12:11:32 +00:00Commented Jun 18, 2020 at 12:11