How to extend Pandas Dataframe to create custom class where it is initialized with a csv filename?

Question 1

I want to create a special type of object, based on the pandas.DataFrame object, which will always be created from a particular type of input file.

I have been able to design a class that can be created the same way as a normal DataFrame, i.e.:

class CustomDF(pd.DataFrame):
    ...

Obj = CustomDF({'a': [1, 2], 'b': [3, 4]})

But I want to change the initialization behaviour to accept a csv filename and import it. I know Pandas allows this using:

df = pd.read_csv(filename)

But I can't get it to work within my new class when I do:

class CustomDF(pd.DataFrame):
    def __init__(self, filename):
        self = pd.read_csv(filename)

There is no error when I create an object with this class, but when I try to access it I get the error 'CustomDF' object has no attribute '_data'.

I have tried changing self = pd.read_csv(filename) to self._data = pd.read_csv(filename) or self.data = pd.read_csv(filename) but this doesn't have any effect.

What is the proper way to accomplish this? Is there a better approach to doing this same thing?

Clock Slave · Accepted Answer · 2018-11-29 06:42:00Z

I tried your code and the error I got was

__main__:4: UserWarning: Pandas doesn't allow columns to be created via a new attribute name - see https://pandas.pydata.org/pandas-docs/stable/indexing.html#attribute-access

This basically means that pandas has an issue when you do the following step:

self.data = pd.read_csv(filename)

The reason is that your CustomDF is a DataFrame. Think of a DataFrame df. When you write df.someColumnName, you get the values of that column. When you write df.someColumnName = something with a name that is not yet a column, you are trying to create a new column, and pandas does not allow columns to be created that way.
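The behaviour described here can be reproduced on a plain DataFrame. The following is a minimal sketch, assuming a reasonably recent pandas version; the names c and d are made up for illustration and do not come from the question.

```python
import warnings

import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# Attribute assignment with a new name and a list-like value: pandas warns
# and stores a plain Python attribute instead of adding a column.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    df.c = [5, 6]

warned = any(issubclass(w.category, UserWarning) for w in caught)
attribute_became_column = "c" in df.columns

# Item assignment is the supported way to add a column.
df["d"] = [7, 8]
```

Here the list assigned to df.c ends up as an ordinary attribute on the object, while df["d"] adds a real column.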

I removed the inheritance from pd.DataFrame in CustomDF and it works fine:

import pandas as pd

class CustomDF:
    # Plain wrapper class: it holds the DataFrame instead of inheriting from it
    def __init__(self, filename):
        self.data = pd.read_csv(filename)

csdf = CustomDF("breast_cancer_wisconsin.csv")
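One consequence of this wrapper is that every DataFrame attribute has to be reached through csdf.data. If that is inconvenient, attribute lookups can be forwarded to the wrapped DataFrame. The following is only a sketch of that idea and not part of the original answer: the __getattr__, __getitem__ and __len__ forwarding is an added assumption, and an in-memory io.StringIO buffer stands in for a real CSV file.

```python
import io

import pandas as pd


class CustomDF:
    """Wrapper that owns a DataFrame and forwards unknown attributes to it."""

    def __init__(self, filename):
        self.data = pd.read_csv(filename)

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        if name == "data":
            # Guard against infinite recursion if data was never set.
            raise AttributeError(name)
        return getattr(self.data, name)

    def __getitem__(self, key):
        # Special methods are looked up on the class, so forward explicitly.
        return self.data[key]

    def __len__(self):
        return len(self.data)


# In-memory stand-in for a CSV file (read_csv also accepts file-like objects).
csv_file = io.StringIO("a,b\n1,3\n2,4\n")
csdf = CustomDF(csv_file)

shape = csdf.shape          # forwarded to csdf.data.shape
column_a = list(csdf["a"])  # forwarded through __getitem__
```

The object is still not a DataFrame (isinstance checks against pd.DataFrame fail), so this only helps with attribute and item access, not with code that requires a real DataFrame.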

Thanks for looking into that. I tried the solution and the main problem now would be that if I do df_custom = CustomDF('./myfile.csv'), then df_custom is not my object like I would hope, but I would have to access it using df_custom.data instead, right?

Yes, you are right. All the pd.DataFrame properties will be available in df_custom.data instead.

OK, so I guess I'm wondering, how can I get df_custom to be the object I am initializing? That's why I thought I would need to use inheritance in this case.
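For this last comment, one common pattern is to keep the inheritance, leave __init__ untouched, and add an alternate constructor. Assigning to self inside __init__ only rebinds a local name, and an __init__ that never calls the parent constructor leaves the DataFrame internals unset, which is consistent with the missing '_data' error in the question. The sketch below is not part of the accepted answer: from_csv_file is a hypothetical method name, the io.StringIO buffer stands in for a real file, and _constructor is the hook that the pandas documentation describes for subclasses.

```python
import io

import pandas as pd


class CustomDF(pd.DataFrame):
    @property
    def _constructor(self):
        # Lets pandas operations return CustomDF rather than plain DataFrame.
        return CustomDF

    @classmethod
    def from_csv_file(cls, filename, **read_csv_kwargs):
        # Hypothetical alternate constructor: read the CSV with pandas,
        # then wrap the result in the subclass.
        return cls(pd.read_csv(filename, **read_csv_kwargs))


# In-memory stand-in for a CSV file (read_csv also accepts file-like objects).
csv_file = io.StringIO("a,b\n1,3\n2,4\n")
df_custom = CustomDF.from_csv_file(csv_file)

first_row = df_custom.head(1)
```

Keeping the normal __init__ signature matters because pandas calls the constructor internally with data rather than a filename. With this approach df_custom itself is the DataFrame, so no .data indirection is needed.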
