BUG: Fix incorrect FutureWarning for logical ops on pyarrow bool Series (#62260) #62290


Closed


@Tarun2605 Tarun2605 commented Sep 7, 2025

This pull request introduces support for Kleene's three-valued logic (handling True, False, and NA) in pandas logical operations on arrays containing missing values. The main changes include new helper functions to safely evaluate boolean logic with missing values, modifications to the logical operation implementation, and updates to tests to reflect the new behavior.

Enhancements to logical operations with missing values:

  • Added is_nullable_bool, safe_is_true, and alignOutputWithKleene helper functions in pandas/core/ops/array_ops.py to enable elementwise logical operations that correctly handle NA values using Kleene logic.
  • Modified logical_op in pandas/core/ops/array_ops.py to use Kleene logic when both operands are boolean arrays (possibly with NA), ensuring correct propagation of unknowns.
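
For readers unfamiliar with Kleene logic, the elementwise rules described above can be sketched in plain Python. This is only an illustration, not the code in this PR: the names `kleene_and`, `kleene_or`, `kleene_xor`, and `kleene_not` are hypothetical, and `None` stands in for NA.

```python
from typing import Optional

# True, False, or None (None is a stand-in for NA in this sketch).
Tri = Optional[bool]


def kleene_and(a: Tri, b: Tri) -> Tri:
    # A known False settles the result even when the other side is unknown.
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True


def kleene_or(a: Tri, b: Tri) -> Tri:
    # A known True settles the result even when the other side is unknown.
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None
    return False


def kleene_xor(a: Tri, b: Tri) -> Tri:
    # XOR needs both operands, so any unknown makes the result unknown.
    if a is None or b is None:
        return None
    return a != b


def kleene_not(a: Tri) -> Tri:
    return None if a is None else not a


values = [False, None, True]
for name, op in [("and", kleene_and), ("or", kleene_or), ("xor", kleene_xor)]:
    print(name)
    for left in values:
        # One row of the truth table: left operand against F, U, T.
        print(left, [op(left, right) for right in values])
```

The key point is that NA only propagates when the known operand cannot decide the result on its own.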

Test updates for new logic:

  • Updated expected results in TestDataFrameLogicalOperators in pandas/tests/frame/test_logical_ops.py to match the new Kleene logic semantics, where logical operations with NA now return NA or propagate True/False according to Kleene's rules.
  • Adjusted the test_logical_with_nas test to expect results consistent with Kleene logic, ensuring that logical operations involving NA and True/False yield the correct outcomes.

  • closes BUG: Incorrect Future warning using a logical operation between two pyarrow boolean series #62260
  • tests added / passed
  • passes pre-commit run --all-files
  • whatsnew entry

What does this PR change?

This pull request updates the Arrow extension array implementation in pandas to improve how missing values are handled during logical operations on bool.

Previously, operations like & or | between two bool Series did not follow Kleene logic.

How was this fixed?

  • Implemented my own alignWithKleene function to align the output of the logical operation with Kleene's principles

Tarun2605 and others added 10 commits September 7, 2025 18:11
the docstring check required me to return a true or false bool only
...manual pre-commit hooks (pull_request) Failing after 16m
@simonjayhawkins simonjayhawkins added the Bug, Numeric Operations (Arithmetic, Comparison, and Logical operations), and Arrow (pyarrow functionality) labels Sep 10, 2025
Author

Kindly review my PR for improving the logical operations on arrays and aligning them with Kleene's principles. Please let me know if there are any issues. Finally heading out to touch some grass :)

Author

Tarun2605 commented Sep 11, 2025

@simonjayhawkins Kindly review my PR and let me know 👍 (If in testing phase or not)

Contributor

This Wikipedia article may be useful to understand the changes in this PR: https://en.wikipedia.org/wiki/Three-valued_logic#Kleene_and_Priest_logics

It's also important to point out that this is a breaking change and should have an entry in doc/source/whatsnew/v3.0.0.rst.

Contributor

Also, bool[pyarrow] already implements 3VL

import pandas as pd

index = list("FUT")
a = pd.Series([False, None, True], index=index, dtype="bool[pyarrow]")
t = pd.Series([True] * 3, index=index, dtype="bool[pyarrow]")
u = pd.Series([None] * 3, index=index, dtype="bool[pyarrow]")
f = pd.Series([False] * 3, index=index, dtype="bool[pyarrow]")

print("negation")
print(~a)

# Build one truth table per operator: rows are the values of `a`,
# columns are the constant right-hand operands F, U, and T.
methods = ["__and__", "__or__", "__xor__"]
for method in methods:
    print(method)
    fn = getattr(a, method)
    observed = pd.DataFrame(dict(F=fn(f), U=fn(u), T=fn(t)), index=index)
    print(observed)

Output

negation
F     True
U     <NA>
T    False
dtype: bool[pyarrow]
__and__
       F      U      T
F  False  False  False
U  False   <NA>   <NA>
T  False   <NA>   True
__or__
       F     U     T
F  False  <NA>  True
U   <NA>  <NA>  True
T   True  True  True
__xor__
       F     U      T
F  False  <NA>   True
U   <NA>  <NA>   <NA>
T   True  <NA>  False
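
For comparison, pandas' own nullable "boolean" extension dtype (distinct from both bool[pyarrow] and plain NumPy bool) is also documented as using Kleene logic. The short sketch below is an addition for illustration, not something posted in this thread; it assumes a pandas installation and checks only the unknown (U) column of the tables above.

```python
import pandas as pd

# Nullable boolean arrays; this dtype does not require pyarrow.
a = pd.array([False, pd.NA, True], dtype="boolean")
u = pd.array([pd.NA, pd.NA, pd.NA], dtype="boolean")

and_u = a & u  # [False, <NA>, <NA>]: a known False decides AND
or_u = a | u   # [<NA>, <NA>, True]: a known True decides OR
xor_u = a ^ u  # [<NA>, <NA>, <NA>]: XOR always needs both operands

print(and_u)
print(or_u)
print(xor_u)
```

These results correspond to the U columns of the `__and__`, `__or__`, and `__xor__` tables printed for bool[pyarrow].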

Author

@Alvaro-Kothe Yes, I agree: the pyarrow implementation follows Kleene's principle, whereas bool does not. Thank you for attaching the wiki article 👍 The core members of the pandas library are the ones who fill in the whatsnew, right?

Contributor

@Tarun2605 Usually, whoever creates the pull request should fill the whatsnew

Author

@Alvaro-Kothe Ohhhh!! Could you please tell me where I should fill it out? Thank you so much btw man

Contributor

Utilize your own discretion or await guidance from a core member.

Author

Okkk

Author

@simonjayhawkins Kindly review my PR or have another member review it please

Member

@simonjayhawkins simonjayhawkins left a comment

Thanks @Tarun2605 for the PR.

This PR is changing tested behavior.

#62260 (comment) states

For the result on main / pandas 3.0, this actually seems correct to me.

So I'm not expecting to see any changed behavior other than with respect to the warning:

FutureWarning: Operation between non boolean Series with different indexes will no longer return a boolean result in a future version. Cast both Series to object type to maintain the prior behavior.

Author

Tarun2605 commented Sep 16, 2025

Thank you so much for your reply.
Yes, I initially set out to change only that, but then I noticed that Kleene's principle was not being followed for bool arrays. So should we let this inconsistency with null values go?

Author

As nicely stated, these bool[pyarrow] examples will break or give wrong values when the array dtype is converted to bool. That's the inconsistency I was trying to solve.
Thank you

Member

Thats the inconsistency i was trying to solve.

If this PR is not addressing the FutureWarning then please remove the link to that issue.

Can you open a specific issue to discuss this proposal, or link to an existing issue, rather than us discussing it on a PR? PRs can get closed or go stale, and the discussion then has less visibility.

Author

I see. OK, sure: I will open an issue, have the discussion there, and then raise my PR.

Reviewers: simonjayhawkins (requested changes)
Assignees: No one assigned
Labels: Arrow (pyarrow functionality), Bug, Numeric Operations (Arithmetic, Comparison, and Logical operations)
Projects: None yet
Milestone: No milestone
Development: Successfully merging this pull request may close these issues: BUG: Incorrect Future warning using a logical operation between two pyarrow boolean series