I want to create a special type of object, based on the pandas.DataFrame object, which will always be created based on a particular input file type.

I have been able to design a class that can be created the same way as a normal DataFrame, i.e.:

class CustomDF(pd.DataFrame):
    ...

Obj = CustomDF({'a': [1, 2], 'b': [3, 4]})

But I want to change the initialization behaviour to accept a csv filename and import it. I know Pandas allows this using:

df = pd.read_csv(filename)

But I can't get it to work within my new class when I do:

class CustomDF(pd.DataFrame):
    def __init__(self, filename):
        self = pd.read_csv(filename)

And although there is no error when I create an object with this class, I do get the error 'CustomDF' object has no attribute '_data' when trying to access it.

I have tried changing self = pd.read_csv(filename) to self._data = pd.read_csv(filename) or self.data = pd.read_csv(filename) but this doesn't have any effect.

What is the proper way to accomplish this? Is there a better approach to doing this same thing?

asked Nov 29, 2018 at 6:31

1 Answer

I tried your code and the error I got was

__main__:4: UserWarning: Pandas doesn't allow columns to be created via a new attribute name - see https://pandas.pydata.org/pandas-docs/stable/indexing.html#attribute-access

This means that pandas objects to the following step:

self.data = pd.read_csv(filename)

The reason is that your CustomDF is a DataFrame. Think of a DataFrame df. When you access df.someColumnName, you get the values of that column. When you assign df.someColumnName = something for a name that is not already a column, it looks as if you are trying to create a new column. Pandas does not create columns that way; it sets a plain attribute instead and emits the warning above.
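To make the attribute-access behaviour concrete, here is a small sketch on a throwaway DataFrame. The column names a, b, c and d are made up for illustration, and the warning is captured with the standard warnings module so it can be inspected.

```python
import warnings

import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# Reading an existing column through attribute access works.
col_a = df.a

# Assigning a list-like to a new attribute name does not add a column;
# pandas emits a UserWarning and stores a plain Python attribute instead.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    df.c = [5, 6]

# Item assignment is the supported way to add a column.
df["d"] = [7, 8]

print(list(df.columns))
print([str(w.message) for w in caught])
```

After this runs, "c" should be missing from df.columns while "d" is present, which is the same situation as self.data = ... inside a DataFrame subclass.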

I removed the inheritance from pd.DataFrame in CustomDF and it works fine:

import pandas as pd

class CustomDF:
    def __init__(self, filename):
        # Keep the loaded DataFrame as a regular attribute of a plain class.
        self.data = pd.read_csv(filename)

csdf = CustomDF("breast_cancer_wisconsin.csv")
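If you would rather have the object itself be the DataFrame, one possible sketch is to keep the inheritance, leave __init__ untouched, and add an alternative constructor. The method name from_csv_file is hypothetical, and the in-memory CSV text only stands in for a real file. Overriding the _constructor property is the hook pandas documents for subclasses, so that operations return CustomDF rather than a plain DataFrame.

```python
import io

import pandas as pd


class CustomDF(pd.DataFrame):
    @property
    def _constructor(self):
        # Tells pandas which class to use when an operation builds a new frame.
        return CustomDF

    @classmethod
    def from_csv_file(cls, filename, **kwargs):
        # Hypothetical helper: read the CSV, then wrap the result in the subclass.
        # The normal DataFrame constructor is left intact for pandas' internal use.
        return cls(pd.read_csv(filename, **kwargs))


# Usage with an in-memory stand-in for a CSV file (assumed sample data).
df_custom = CustomDF.from_csv_file(io.StringIO("a,b\n1,3\n2,4\n"))
print(type(df_custom).__name__)
print(type(df_custom.head(1)).__name__)
```

Rebinding self inside __init__ only changes a local name, which is why the original attempt left the object uninitialised. A classmethod avoids that and also avoids changing the signature that pandas relies on when it constructs new frames internally.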
answered Nov 29, 2018 at 6:42

3 Comments

Thanks for looking into that. I tried the solution and the main problem now would be that if I do df_custom = CustomDF('./myfile.csv'), then df_custom is not my object like I would hope, but I would have to access it using df_custom.data instead, right?
Yes, you are right. All the pd.DataFrame properties will be available in df_custom.data instead.
OK, so I guess I'm wondering, how can I get df_custom to be the object I am initializing? That's why I thought I would need to use inheritance in this case.
