2
\$\begingroup\$

I've got something really simple this time: I'm mapping pandas Series to dataclasses with a one-liner helper function (as I have several models):

import pandas as pd
from typing import Any
from dataclasses import dataclass, fields


def create_dataclass(data: pd.Series, factory: Any) -> Any:
    return factory(**{f.name: data[f.name] for f in fields(factory)})

I call it like this:

from typing import List


@dataclass
class Person:
    first_name: str
    last_name: str


@dataclass
class Employee(Person):
    company: str


def create_employees(data: pd.DataFrame) -> List[Employee]:
    return [create_dataclass(r, Employee) for i, r in data.iterrows()]

Do you think it could still be more Pythonic?

asked Mar 25, 2023 at 20:06
\$\endgroup\$

2 Answers

3
\$\begingroup\$

Looks very pythonic to me.

Thumbs up, LGTM, ship it!


Ok, fine, I have a few minor remarks.

Maybe the Any annotations could be finessed a bit to be more informative? As written, they tell a type checker nothing about what comes back. Or maybe just drop the -> Any.
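One way the annotations might be tightened, as a rough sketch: a TypeVar ties the return type to whichever class is passed in. The name T and the sample row are assumptions made for illustration, not part of the original post, and a strict type checker may still want a bound on T before it accepts the fields() call.

```python
from dataclasses import dataclass, fields
from typing import Type, TypeVar

import pandas as pd

# T is a hypothetical name introduced for this sketch; it stands for
# whichever dataclass type the caller passes in.
T = TypeVar("T")


def create_dataclass(data: pd.Series, factory: Type[T]) -> T:
    # Same body as the original helper; only the annotations changed.
    return factory(**{f.name: data[f.name] for f in fields(factory)})


@dataclass
class Person:
    first_name: str
    last_name: str


# Made-up sample row; the extra "age" entry is ignored because only the
# dataclass field names are looked up.
row = pd.Series({"first_name": "Ada", "last_name": "Lovelace", "age": 36})
person = create_dataclass(row, Person)
```

With this signature a type checker can infer that person is a Person rather than Any, while the runtime behaviour stays the same.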


 return factory(**{f.name: data[f.name] for f in fields(factory)})

The ** double star is as pythonic as it gets. But notice that what we really care about is name. So perhaps

from operator import attrgetter
    ...
    return factory(**{name: data[name] for name in map(attrgetter('name'), fields(factory))})

Hmmm, not sure the longer version works out to a win. Probably better to keep the code as-is.


 ... for _, r in data.iterrows()]

nit: Prefer row over r. Whatever.

Like I said, ship it.

answered Mar 26, 2023 at 1:40
\$\endgroup\$
1
\$\begingroup\$

This looks great, exactly what I need, I'm stealing it. :)

I'm using this in a base class PandasClass, with the functions that create dataclass instances as classmethods. In this way, any dataclass that inherits from PandasClass gets the create_employees (and similar) "for free".

from dataclasses import dataclass, fields
from typing import List, Self  # Self requires Python 3.11+

import pandas as pd


@dataclass
class PandasClass:
    @classmethod
    def create_dataclass(cls, row: pd.Series) -> Self:
        return cls(**{f.name: row[f.name] for f in fields(cls)})

    @classmethod
    def create_dataclass_list(cls, dataframe: pd.DataFrame) -> List[Self]:
        return [cls.create_dataclass(row) for _, row in dataframe.iterrows()]

So in the OP's example, if Person inherits from PandasClass

class Person(PandasClass):
    ...

then we can call Employee.create_dataclass_list(df) in place of the original create_employees(df).
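Put together, the idea might look like the sketch below. It swaps typing.Self for a bound TypeVar (T, a name made up here) so that it also runs on Python versions older than 3.11, and the DataFrame contents are invented sample data.

```python
from dataclasses import dataclass, fields
from typing import List, Type, TypeVar

import pandas as pd

# T is a hypothetical TypeVar used here instead of typing.Self so the
# sketch also runs on Python versions older than 3.11.
T = TypeVar("T", bound="PandasClass")


@dataclass
class PandasClass:
    @classmethod
    def create_dataclass(cls: Type[T], row: pd.Series) -> T:
        # Look up only the columns that match the dataclass fields of cls.
        return cls(**{f.name: row[f.name] for f in fields(cls)})

    @classmethod
    def create_dataclass_list(cls: Type[T], dataframe: pd.DataFrame) -> List[T]:
        return [cls.create_dataclass(row) for _, row in dataframe.iterrows()]


@dataclass
class Person(PandasClass):
    first_name: str
    last_name: str


@dataclass
class Employee(Person):
    company: str


# Made-up sample data, for illustration only.
df = pd.DataFrame(
    {
        "first_name": ["Ada", "Alan"],
        "last_name": ["Lovelace", "Turing"],
        "company": ["Analytical Engines", "Bletchley Park"],
    }
)

employees = Employee.create_dataclass_list(df)
```

Because the classmethods resolve fields(cls) at call time, Employee picks up its own company field in addition to the two inherited from Person.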

answered May 28, 2024 at 17:40
\$\endgroup\$
