only interchange necessary columns #4286

Merged
alexcjohnson merged 5 commits into plotly:master from MarcoGorelli:dont-convert-everything
Jul 21, 2023

Conversation

Contributor

@MarcoGorelli MarcoGorelli commented Jul 19, 2023

Trying to address this comment: #3901 (comment)

If it's not a wide plot, then only interchange the columns which are needed
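A minimal sketch of the idea, assuming the plot arguments arrive as a dict whose values may be a column name, a list of column names, or something unrelated. The helper name `needed_columns` and the example arguments are hypothetical and are not taken from the plotly source.

```python
def needed_columns(args, available_columns):
    """Collect, in first-seen order, every argument value that names a column."""
    available = set(available_columns)
    needed = []
    for value in args.values():
        # An argument may be a single column name or a list of names.
        candidates = value if isinstance(value, (list, tuple)) else [value]
        for candidate in candidates:
            if (
                isinstance(candidate, str)
                and candidate in available
                and candidate not in needed
            ):
                needed.append(candidate)
    return needed


# Hypothetical arguments for a non-wide scatter plot.
args = {
    "x": "sepal_length",
    "y": "sepal_width",
    "color": None,
    "hover_data": ["species"],
    "title": "Iris",
}
columns = ["sepal_length", "sepal_width", "petal_length", "petal_width", "species"]
print(needed_columns(args, columns))
```

Only the columns returned here would then need to be passed through the interchange protocol; the remaining columns are never converted.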

@MarcoGorelli MarcoGorelli marked this pull request as ready for review July 19, 2023 17:13
Contributor Author

🤔 bit confused by the CI failure, it fails on "install chrome driver"?

Collaborator

@LiamConnors would you mind making another "pin chrome" PR in this repo? (Not in this PR but so this and other PRs can update and succeed again!)

LiamConnors reacted with thumbs up emoji

Contributor

Wow, I'm excited that someone is biting the bullet on this one, thank you!

Would be nice for this to be reused for the jankier to_pandas() path as well if possible too, for the shorter term :)

Member

> @LiamConnors would you mind making another "pin chrome" PR in this repo? (Not in this PR but so this and other PRs can update and succeed again!)

Opened a PR here to fix it: #4288 @alexcjohnson

alexcjohnson reacted with rocket emoji

Contributor Author

MarcoGorelli commented Jul 19, 2023

> Would be nice for this to be reused for the jankier to_pandas() path as well if possible too, for the shorter term :)

Sure, but there's no guarantee of what the API to do that would be, right? Before having called to_pandas, the object could in theory be anything, with any API to select columns by name. (I might be missing something though, sorry)

df_pandas = df_not_pandas.to_pandas()
args["data_frame"] = df_pandas
args["data_frame"] = df_not_pandas.__dataframe__()
columns = args["data_frame"].column_names()
Collaborator

@alexcjohnson alexcjohnson Jul 19, 2023

Can we be 100% sure anything returned by __dataframe__() will have column_names and select_columns_by_name methods? If there's any chance an object will come in with either of these missing we should fall back on interchanging the whole thing up front.

Contributor Author

@MarcoGorelli MarcoGorelli Jul 19, 2023

Collaborator

@alexcjohnson alexcjohnson Jul 19, 2023

Oh I know they're in the spec, but I also know not everyone follows a spec to the letter 😉

MarcoGorelli reacted with laugh emoji
Contributor Author

@MarcoGorelli MarcoGorelli Jul 19, 2023

this can help highlight shortcomings in their implementation then 😉 I tried it out with polars and it works fine there

Collaborator

@alexcjohnson alexcjohnson Jul 19, 2023

I guess we'll know how to respond when we see:
AttributeError: 'MyDataFrame' object has no attribute 'select_columns_by_name'
And in principle you're right that it's not our problem, but we'll be the ones responding to the issue and having to tell our users "don't use this dataframe directly until they fix it." Whereas if we caught this case explicitly we could emit a warning like "This dataframe only partially implements the dataframe interchange protocol. Falling back on a slower full-copy algorithm" so it wouldn't affect usage in px, only performance, and it would be clear where the issue needs to be raised.
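A minimal sketch of the explicit fallback described above, assuming the only check needed is whether the interchange object exposes `select_columns_by_name`. The function `narrow_or_fall_back` and the two stand-in classes are hypothetical and exist only to make the example runnable without any dataframe library.

```python
import warnings


def narrow_or_fall_back(df_xchg, needed):
    """Restrict df_xchg to the needed columns, or warn and return it whole."""
    if hasattr(df_xchg, "select_columns_by_name"):
        return df_xchg.select_columns_by_name(list(needed))
    warnings.warn(
        "This dataframe only partially implements the dataframe interchange "
        "protocol. Falling back on a slower full-copy algorithm.",
        UserWarning,
        stacklevel=2,
    )
    return df_xchg


class FullXchg:
    """Stand-in for an interchange object that follows the spec."""

    def __init__(self, data):
        self._data = dict(data)

    def column_names(self):
        return list(self._data)

    def select_columns_by_name(self, names):
        return FullXchg({name: self._data[name] for name in names})


class PartialXchg:
    """Stand-in for an implementation that lacks select_columns_by_name."""

    def __init__(self, data):
        self._data = dict(data)

    def column_names(self):
        return list(self._data)


data = {"a": [1, 2], "b": [3, 4], "c": [5, 6]}
narrowed = narrow_or_fall_back(FullXchg(data), ["a", "c"])
print(narrowed.column_names())
```

With `PartialXchg`, the same call returns the object unchanged and emits the warning, so plotting would still work and only performance would be affected.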

Contributor Author

@MarcoGorelli MarcoGorelli Jul 20, 2023

thanks for explaining - OK I've added a condition so it'll only use select_columns_by_name if that attribute is present

Collaborator

@alexcjohnson alexcjohnson Jul 20, 2023

Nice. I took a look at also adding a fallback for missing column_names and that would be pretty awkward... but if someone has a partial implementation of the protocol presumably column_names is an easy piece so would get included early, whereas select_columns_by_name could be trickier. So let's leave it as you have it now. Thanks!

Contributor Author

@MarcoGorelli MarcoGorelli Jul 21, 2023

yeah if they don't have column_names then from_dataframe wouldn't work either, as it uses that internally

https://github.com/pandas-dev/pandas/blob/92792ec063031ae41443dabeb9d12f8daaac3ef1/pandas/core/interchange/from_dataframe.py#L112
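To illustrate why a missing `column_names` cannot be worked around, here is a toy consumer, assuming a conversion routine that walks the columns by name. It is not the pandas implementation; `toy_consume` and both stand-in classes are hypothetical, while `column_names` and `get_column_by_name` are the method names defined by the interchange protocol.

```python
def toy_consume(df_xchg):
    """Build a dict of columns by iterating over column_names()."""
    columns = {}
    for name in df_xchg.column_names():
        columns[name] = df_xchg.get_column_by_name(name)
    return columns


class WithColumnNames:
    """Stand-in interchange object exposing both methods used above."""

    def __init__(self, data):
        self._data = dict(data)

    def column_names(self):
        return list(self._data)

    def get_column_by_name(self, name):
        return self._data[name]


class WithoutColumnNames:
    """Stand-in for a partial implementation that omits column_names."""

    def __init__(self, data):
        self._data = dict(data)

    def get_column_by_name(self, name):
        return self._data[name]


print(toy_consume(WithColumnNames({"a": [1, 2], "b": [3, 4]})))

try:
    toy_consume(WithoutColumnNames({"a": [1, 2]}))
except AttributeError as exc:
    print("conversion failed:", exc)
```

Any consumer shaped like this fails before a single column is read, so a fallback for a missing `column_names` would have nothing to fall back on.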

Contributor

> Sure, but there's no guarantee of what the API to do that would be, right? Before having called to_pandas, the object could in theory be anything, with any API to select columns by name. (I might be missing something though, sorry)

Heh, no, I think it's me that's forgotten that this is exactly why we have the data-interchange protocol, you're right ;)

alexcjohnson and MarcoGorelli reacted with laugh emoji

Collaborator

@alexcjohnson alexcjohnson left a comment

💃 Great work @MarcoGorelli, lovely tests!

Reviewers

alexcjohnson approved these changes
