TYP: how to annotate DataFrame.__getitem__ #46616

Open
Labels: Needs Discussion (requires discussion from core team before further action), Typing (type annotations, mypy/pyright type checking)
@twoertwein

Description

I was about to create a PR but I realized that annotating DataFrame.__getitem__ might be impossible without making some simplifications.

There are two problems with __getitem__:

  • We allow any Hashable key, and one would expect a Hashable key to always return a Series, but a slice (which is also Hashable) returns a DataFrame.
  • Columns can be a MultiIndex, in which case df["a"] can return a DataFrame.
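
Both cases can be reproduced at runtime. The following is a small sketch; the frame contents and the names `flat` and `multi` are made up for illustration.

```python
import pandas as pd

# Flat columns: a scalar key gives a Series, a slice gives a DataFrame.
flat = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
scalar_result = flat["a"]
slice_result = flat[0:1]  # a slice selects rows, not a column

# MultiIndex columns: the same scalar key now selects a sub-frame.
multi = pd.DataFrame(
    [[1, 2, 3], [4, 5, 6]],
    columns=pd.MultiIndex.from_tuples([("a", "x"), ("a", "y"), ("b", "x")]),
)
multi_result = multi["a"]

print(type(scalar_result).__name__)  # Series
print(type(slice_result).__name__)   # DataFrame
print(type(multi_result).__name__)   # DataFrame
```

The key `"a"` has the same static type in both frames, so the return type depends on the column index, which the annotation cannot currently see.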

The MS stubs seem to make two assumptions: 1) columns can only be of type str (and maybe a few more types, but not Hashable) and 2) MultiIndex doesn't exist. In practice, this covers almost all cases.

I don't think there is a solution for the MultiIndex issue. Even if we make DataFrame generic to carry the type of the column index, there is no Not[MultiIndex] type, so we will always end up with incompatible and overlapping overloads.

The Hashable issue can partly be addressed:

# cover most common cases that return a Series
@overload
def __getitem__(self, key: Scalar) -> Series:
    ...
# cover most common cases that return a DataFrame
@overload
def __getitem__(self, key: list[HashableT] | np.ndarray | slice | Index | Series) -> DataFrame:
    ...
# everything else
@overload
def __getitem__(self, key: Hashable) -> Any:  # or Series | DataFrame (but might create many errors; typeshed also uses Any in some cases to avoid unions)
    ...
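
To try the three-overload layout with a type checker without touching pandas, it can be reduced to a self-contained sketch. `ToySeries` and `ToyFrame` are hypothetical stand-ins, `str | int | float` is a simplified substitute for pandas' `Scalar` alias, and the toy ignores MultiIndex columns entirely.

```python
from __future__ import annotations

from typing import Any, Hashable, Union, overload


class ToySeries:
    """Hypothetical stand-in for pandas.Series."""


class ToyFrame:
    """Hypothetical stand-in for pandas.DataFrame with flat columns only."""

    # Most common keys that return a Series (simplified Scalar).
    @overload
    def __getitem__(self, key: Union[str, int, float]) -> ToySeries: ...

    # Most common keys that return a DataFrame.
    @overload
    def __getitem__(self, key: Union[list, slice]) -> ToyFrame: ...

    # Everything else: give up and return Any.
    @overload
    def __getitem__(self, key: Hashable) -> Any: ...

    def __getitem__(self, key: Any) -> Any:
        # Runtime dispatch for the toy: lists and slices give a frame,
        # any other key is treated as a single column label.
        if isinstance(key, (list, slice)):
            return ToyFrame()
        return ToySeries()


frame = ToyFrame()
column = frame["a"]          # first overload: ToySeries
subset = frame[["a", "b"]]   # second overload: ToyFrame
rows = frame[0:1]            # second overload: ToyFrame
other = frame[("a", "x")]    # third overload: statically Any
```

The order matters: the narrow overloads come first, so the `Hashable` fallback only applies to keys the first two do not match, such as the tuple in the last line.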

Do you see a way to cover all cases of __getitem__, and if not, which assumptions are you willing to make? @simonjayhawkins @Dr-Irv
