BUG (string dtype): comparison of string column to mixed object column fails #60228

Open
Labels: Bug; Numeric Operations (Arithmetic, Comparison, and Logical operations); Strings (String extension data type and string data)
@jorisvandenbossche

Description

At the moment, you can freely compare a string column with a mixed object-dtype column:

>>> import pandas as pd
>>> ser_string = pd.Series(["a", "b"])
>>> ser_mixed = pd.Series([1, "b"])
>>> ser_string == ser_mixed
0    False
1     True
dtype: bool

But with the string dtype enabled (using pyarrow), this now raises an error:

>>> pd.options.future.infer_string = True
>>> ser_string = pd.Series(["a", "b"])
>>> ser_mixed = pd.Series([1, "b"])
>>> ser_string == ser_mixed
...
File ~/scipy/repos/pandas/pandas/core/arrays/arrow/array.py:510, in ArrowExtensionArray._box_pa_array(cls, value, pa_type, copy)
...
--> 510 pa_array = pa.array(value, from_pandas=True)
...
ArrowInvalid: Could not convert 'b' with type str: tried to convert to int64

This happens because ArrowExtensionArray tries to convert the other operand to an Arrow array as well, and that conversion fails for mixed types.

In general, I think our rule is that an == comparison never fails, and instead just gives False where the values are not comparable.
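The rule described above can be sketched in plain Python. This only illustrates the expected semantics and is not how pandas implements (or would fix) the comparison; `elementwise_eq` is a hypothetical name.

```python
def elementwise_eq(left, right):
    """Compare two equal-length sequences element by element.

    Never raises for incomparable values: any pair whose comparison
    fails is reported as False.
    """
    if len(left) != len(right):
        raise ValueError("Lengths must match to compare")
    result = []
    for a, b in zip(left, right):
        try:
            result.append(bool(a == b))
        except (TypeError, ValueError):
            # Values that cannot be compared count as "not equal".
            result.append(False)
    return result


print(elementwise_eq(["a", "b"], [1, "b"]))  # [False, True]
```

This matches the object-dtype result shown at the top of the issue: the `"a"` versus `1` pair gives False instead of raising.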
