Read CSV, with date filtering and resampling
I have written the following function to read multiple CSV files into pandas dataframes. Depending on the use case, the user can pass optional resampling (frequency and method) and/or a date range (start/end). For both options I'd like to check that both keywords were given and raise an error if not.
My issue is that reading the CSV files can take quite a bit of time, and it's frustrating to get the ValueError 5 minutes after calling the function. I could duplicate the if statements at the top of the function. However, I'd like to know if there is a more readable way or a best practice that avoids having the same if/else statements multiple times.
import pandas as pd

def file_loader(filep, freq: str = None, method: str = None, d_debut: str = None, d_fin: str = None):
    df = pd.read_csv(filep,
                     index_col=0, infer_datetime_format=True, parse_dates=[0],
                     header=0, names=['date', filep.split('.')[0]]).sort_index()
    if d_debut is not None:
        if d_fin is None:
            raise ValueError("Please provide an end timestamp!")
        else:
            df = df.loc[(df.index >= d_debut) & (df.index <= d_fin)]
    if freq is not None:
        if method is None:
            raise ValueError("Please provide a resampling method for the given frequency, e.g. 'last', 'mean'")
        else:
            # getattr is used to call ...resample(freq).last() etc. with the kwargs given as strings, e.g. freq='1D' and method='mean'
            df = getattr(df.resample(freq), method)()
    return df
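To make the "check first, read afterwards" idea concrete, here is a minimal sketch of one possible shape for it, not a definitive answer. The helper name `check_paired` is hypothetical (it is not part of pandas), and the sketch assumes that each option pair should be given either completely or not at all, which is slightly stricter than the function above since it also rejects `d_fin` without `d_debut`.

```python
import pandas as pd


def check_paired(first, second, first_name, second_name):
    """Hypothetical helper: raise if exactly one of two paired options is given."""
    if (first is None) != (second is None):
        raise ValueError(f"Provide both {first_name} and {second_name}, or neither.")


def file_loader(filep, freq=None, method=None, d_debut=None, d_fin=None):
    # Cheap argument checks run before the potentially slow read.
    check_paired(d_debut, d_fin, "d_debut", "d_fin")
    check_paired(freq, method, "freq", "method")

    df = pd.read_csv(
        filep,
        index_col=0,
        parse_dates=[0],
        header=0,
        names=["date", filep.split(".")[0]],
    ).sort_index()

    # After validation, one None check per pair is enough.
    if d_debut is not None:
        df = df.loc[(df.index >= d_debut) & (df.index <= d_fin)]
    if freq is not None:
        # Look up the aggregation by name, e.g. method="mean" -> .resample(freq).mean()
        df = getattr(df.resample(freq), method)()
    return df
```

With this layout, a call such as `file_loader("data.csv", freq="1D")` fails immediately with a `ValueError`, before any file is opened, and the body of the function no longer needs nested if/else blocks.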