gh-116738: Use PyMutex for lzma module #140711

Open

yoney wants to merge 1 commit into python:main

from yoney:ft_lzma

+69 −33

Conversation

Contributor

@yoney yoney commented Oct 28, 2025 (edited)

Similar to #140555, the main goal was to review the lzma module for free-threading. The methods already use a lock, which makes them thread-safe in a free-threaded build. I replaced PyThread_acquire_lock with PyMutex. PyMutex releases the GIL when the thread is parked. This change removes some macros and allocation handling code.
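A minimal sketch of what "the methods already use a lock" means from the Python side, assuming one `LZMADecompressor` is shared by several threads. The names `NTHREADS` and `CHUNK_SIZE` are illustrative and not taken from the PR. The main thread feeds the whole stream once, and the workers only drain output with empty input, so each call should return one complete chunk.

```python
import lzma
import threading

NTHREADS = 4      # hypothetical thread count for this sketch
CHUNK_SIZE = 128  # maximum number of bytes each call may return

# Build an input made of NTHREADS distinct, equally sized chunks.
chunks = [bytes([ord("a") + i]) * CHUNK_SIZE for i in range(NTHREADS)]
compressed = lzma.compress(b"".join(chunks))

lzd = lzma.LZMADecompressor()
# Feed the whole stream once; later calls pass b"" and only drain output.
output = [lzd.decompress(compressed, max_length=CHUNK_SIZE)]

def worker():
    # Calls on the shared decompressor are serialized by its internal lock.
    output.append(lzd.decompress(b"", max_length=CHUNK_SIZE))

threads = [threading.Thread(target=worker) for _ in range(NTHREADS - 1)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Threads may finish in any order, so compare the results as sets.
all_chunks_seen = set(output) == set(chunks)
```

If the lock did not protect the decompressor state, one would expect torn or duplicated chunks rather than a clean set match.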

cc: @mpage @colesbury @emmatyping

Issue: Audit all built-in modules for thread safety #116738

@yoney


gh-116738: Use PyMutex for lzma module

ef23321

@bedevere-app bedevere-app bot mentioned this pull request

Oct 28, 2025

Audit all built-in modules for thread safety #116738

Open

@yoney yoney marked this pull request as ready for review

October 28, 2025 16:39

@bedevere-app bedevere-app bot added the awaiting review label

Oct 28, 2025

ashm-dev reviewed

Oct 28, 2025

Lib/test/test_free_threading/test_lzma.py

    def worker():
        # it should return empty bytes as it buffers data internally
        data = lzc.compress(INPUT)
        self.assertEqual(data, b"")

Contributor

@ashm-dev ashm-dev Oct 28, 2025

The assertion self.assertEqual(data, b"") is flaky. In free-threaded mode, compress() may return data chunks non-deterministically due to race conditions in internal buffering.

Contributor Author

@yoney yoney Oct 28, 2025

@ashm-dev Thanks for your comment. I’m trying to verify/test the mutex is protecting the internal state and buffering, so there shouldn’t be a race condition. Could you please explain which race condition you mean? That would help me understand your point better.
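For context on the buffering being discussed, here is a small single-threaded sketch. It assumes nothing beyond the documented `lzma` API, and `INPUT` is a made-up payload. Individual `compress()` calls may return little or no data, so the complete stream is the concatenation of every returned piece plus `flush()`.

```python
import lzma

INPUT = b"free-threading " * 64  # small, hypothetical payload

lzc = lzma.LZMACompressor()
pieces = []
# Collect whatever each call returns; individual results may be empty
# because the compressor buffers input internally.
for _ in range(3):
    pieces.append(lzc.compress(INPUT))
# flush() emits everything still buffered and finishes the stream.
pieces.append(lzc.flush())

roundtrip = lzma.decompress(b"".join(pieces))
```

The sketch deliberately avoids asserting what any single `compress()` call returns and checks only the round trip.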

Member

@ZeroIntensity ZeroIntensity Oct 28, 2025

@ashm-dev Are you using ChatGPT or another LLM to review for you? If so, please don't -- it's not helpful. If not, please try to be clearer in your responses.

ashm-dev reviewed

Oct 28, 2025

Lib/test/test_free_threading/test_lzma.py

    def worker():
        data = lzd.decompress(compressed, chunk_size)
        self.assertEqual(len(data), chunk_size)
        output.append(data)

Contributor

@ashm-dev ashm-dev Oct 28, 2025

output.append(data) without synchronization causes race conditions in free-threaded mode, potentially losing data or corrupting the list.

Contributor Author

@yoney yoney Oct 28, 2025

> output.append(data) without synchronization causes race conditions in free-threaded mode, potentially losing data or corrupting the list.

@ashm-dev list is thread safe in free-threaded build.

Contributor

@ashm-dev ashm-dev Oct 28, 2025

In the free-threaded build, list operations use internal locks to avoid crashes, but thread safety isn’t guaranteed for concurrent mutations — see Python free-threading HOWTO.

Contributor Author

@yoney yoney Oct 30, 2025

I think there may be some misunderstanding. Could you please check the list test code below?

cpython/Lib/test/test_free_threading/test_list.py

Lines 20 to 28 in a3ce2f7

    def test_racing_iter_append(self):
        l = []

        barrier = Barrier(NTHREAD + 1)
        def writer_func(l):
            barrier.wait()
            for i in range(OBJECT_COUNT):
                l.append(C(i + OBJECT_COUNT))

@ashm-dev I’ve tried to address all your comments, but some are still unclear to me. Could you please clarify or resolve them? Thank you!
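The point about `list.append` can be checked with a standalone sketch. It is not the CPython test quoted above, and the names `NTHREADS` and `PER_THREAD` are made up for illustration. Several threads append to one list without any explicit lock, and the final length should match the number of appends, on both the default and the free-threaded build.

```python
import threading

NTHREADS = 8         # hypothetical thread count
PER_THREAD = 10_000  # appends performed by each thread

shared = []
barrier = threading.Barrier(NTHREADS)

def writer(tid):
    # Release all writers at the same moment to maximize contention.
    barrier.wait()
    for i in range(PER_THREAD):
        shared.append((tid, i))

threads = [threading.Thread(target=writer, args=(t,)) for t in range(NTHREADS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

total = len(shared)        # number of elements that ended up in the list
unique = len(set(shared))  # no element should be lost or duplicated
```

Only the interleaving of appends from different threads is unspecified, which is why a test that shares a list this way should compare contents without relying on order.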

ashm-dev reviewed

Oct 28, 2025

Lib/test/test_free_threading/test_lzma.py

    def worker():
        data = lzd.decompress(compressed, chunk_size)
        self.assertEqual(len(data), chunk_size)

Contributor

@ashm-dev ashm-dev Oct 28, 2025 (edited)

self.assertEqual(len(data), chunk_size) is wrong. decompress() may return less than max_length bytes.

Contributor Author

@yoney yoney Oct 28, 2025

@ashm-dev I agree that decompress() can return less than max_length if there isn’t enough input. In this test, I’m providing input that should produce at least max_length bytes. Is there anything else I might be missing? If I give enough valid input, is there any reason why lzma wouldn’t return max_length?

There are other tests making similar assumptions.

cpython/Lib/test/test_lzma.py

Lines 164 to 169 in ce4b0ed

    # Feed first half the input
    len_ = len(COMPRESSED_XZ) // 2
    out.append(lzd.decompress(COMPRESSED_XZ[:len_],
                              max_length=max_length))
    self.assertFalse(lzd.needs_input)
    self.assertEqual(len(out[-1]), max_length)
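The `max_length` behavior under discussion can be sketched in isolation. The helper `read_in_chunks` and the sample payload are hypothetical and not part of the PR. Once the full stream has been supplied, repeated calls with empty input drain the output in pieces of at most `chunk_size` bytes until `eof` is set.

```python
import lzma

def read_in_chunks(compressed, chunk_size):
    """Decompress a complete stream, at most chunk_size bytes per call."""
    lzd = lzma.LZMADecompressor()
    chunks = [lzd.decompress(compressed, max_length=chunk_size)]
    while not lzd.eof:
        if lzd.needs_input:
            # No more output is possible without more input.
            raise ValueError("compressed stream is truncated")
        # Retrieve more data without providing more input.
        chunks.append(lzd.decompress(b"", max_length=chunk_size))
    return chunks

payload = b"".join(b"line %d\n" % i for i in range(20000))
chunks = read_in_chunks(lzma.compress(payload), chunk_size=1024)
```

When enough input is available the first call is expected to fill the limit exactly, which is the assumption the quoted test relies on. Short or empty pieces should only appear at the end of the stream.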

@mpage mpage requested review from colesbury, emmatyping and mpage

October 28, 2025 20:21

@mpage mpage added the skip news label

Oct 28, 2025

@kumaraditya303 kumaraditya303 added the topic-free-threading label

Oct 29, 2025

Labels

awaiting review skip news topic-free-threading

5 participants

@yoney @ZeroIntensity @ashm-dev @mpage @kumaraditya303
