[mono] Fix lock-free mempool chunk under-allocation by 8 bytes #129843

Merged: pavelsavara merged 3 commits into dotnet:main from pavelsavara:mono-lockfree-mempool-chunk-align on Jun 25, 2026
pavelsavara commented Jun 25, 2026


Summary

lock_free_mempool_chunk_new (src/mono/mono/metadata/memory-manager.c) under-sizes each chunk by 8 bytes, which can trip the assert in lock_free_mempool_alloc0 and abort the runtime.

The sizing loop reserves sizeof(LockFreeMempoolChunk) (24 bytes on 64-bit):

size = mono_pagesize ();
while (size - sizeof (LockFreeMempoolChunk) < GINT_TO_UINT(len))
 size += mono_pagesize ();
...
chunk->mem = ALIGN_PTR_TO ((char*)chunk + sizeof (LockFreeMempoolChunk), 16); // -> offset 32
chunk->size = ((char*)chunk + size) - chunk->mem; // = size - 32

But chunk->mem is then aligned up to 16 bytes, so the data starts at offset 32, and the usable chunk->size is size - 32, which is 8 bytes less than the loop guaranteed. For a request whose length is congruent to -24 (mod pagesize), e.g. 4072 on 4 KB pages or 16360 on 16 KB pages, the fresh chunk (pos == 0) has chunk->size < len, so the assert in lock_free_mempool_alloc0:

g_assert (chunk->pos + size <= GINT_TO_UINT(chunk->size));

fires and aborts the runtime with a SIGABRT.

Why it showed up as a flaky eventpipe crash

This pool is used only on the async AOT unwind / exception-info decode path (mono_aot_get_unwind_info / decode_exception_debug_info, the if (async) branch), which is driven by the EventPipe SampleProfiler stack walk. The crash therefore appeared intermittently as a SIGABRT during the tracing/eventpipe/providervalidation and rundownvalidation tests on the Mono LLVM full-AOT (x64) leg — it only triggers when the profiler asynchronously decodes a method whose unwind / EH-info blob happens to be exactly a boundary size. The code is pre-existing (it dates back many years) and is not specific to any recent change.

Fix

-	while (size - sizeof (LockFreeMempoolChunk) < GINT_TO_UINT(len))
+	while (size - ALIGN_TO (sizeof (LockFreeMempoolChunk), 16) < GINT_TO_UINT(len))

Reserve the 16-byte-aligned header size (32) in the sizing loop so the usable chunk->size always covers the request. mono_valloc returns page-aligned memory, so chunk->mem is always exactly chunk + 32; reserving ALIGN_TO(sizeof(LockFreeMempoolChunk), 16) makes the loop's guarantee match the chunk's real capacity.

Evidence

1. Deterministic reproduction of the under-sizing (standalone simulation of the exact chunk_new math):

sizeof(LockFreeMempoolChunk)=24
pagesize=4096 UNDERSIZED len=4072 chunk_size=4064 deficit=8
pagesize=4096 UNDERSIZED len=8168 chunk_size=8160 deficit=8
pagesize=16384 UNDERSIZED len=16360 chunk_size=16352 deficit=8

2. Reproduced the exact assert in a real Mono LLVM full-AOT runtime. An instrumented build plus a one-shot probe that requests pagesize-24 bytes from the real allocator (with the SampleProfiler active so the async path is live) reproduces the CI signature exactly:

lfm_chunk_new UNDERSIZED len=4072 allocsize=4096 chunk_size=4064 deficit=8
* Assertion at .../src/mono/mono/metadata/memory-manager.c, condition 'pos + size <= chunk->size' not met
Got a SIGABRT while executing native code.

3. After the fix (same probe and SampleProfiler workload, instrumentation kept only to observe):

PROBE pagesize=4096 probe_len=4072 chunk_size=8160 assert=OK
...
DONE (corerun exit 0, no assert)

The boundary request now gets the next page-multiple (8192 - 32 = 8160 >= 4072), and the SampleProfiler workload runs clean with all async allocations satisfied — including boundary-crossing ones (e.g. size=8208 -> chunk_size=12256).

Note

This pull request was authored with the assistance of GitHub Copilot.

lock_free_mempool_chunk_new sized each chunk by reserving
sizeof(LockFreeMempoolChunk) (24 bytes on 64-bit), but chunk->mem is then
aligned up to 16 bytes, so the usable region starts at offset 32 and a
single-page chunk only has pagesize-32 usable bytes. A request whose length
is congruent to -24 (mod pagesize) - e.g. 4072 on 4 KB pages, 16360 on 16 KB
pages - produced a freshly allocated chunk whose size (pagesize-32) is smaller
than the requested length, tripping
 g_assert (chunk->pos + size <= GINT_TO_UINT(chunk->size));
in lock_free_mempool_alloc0 and aborting the runtime.
This pool is used on the async AOT unwind/exception-info decode path
(mono_aot_get_unwind_info / decode_exception_debug_info with async==TRUE),
which is driven by the EventPipe SampleProfiler stack walk, so the crash
showed up intermittently as a SIGABRT during eventpipe tracing tests under
Mono LLVM full-AOT on x64.
Fix: reserve the 16-byte-aligned header size in the sizing loop so the chunk
always has room for the request.

Copilot AI left a comment


Pull request overview

This PR adjusts the lock-free mempool chunk sizing logic in Mono so the chunk-capacity guarantee matches the allocator’s actual usable payload after the data pointer following the chunk header is aligned up to 16 bytes. This prevents lock-free allocations from tripping the g_assert in lock_free_mempool_alloc0 when allocations land on specific page-boundary sizes.

Changes:

  • Update the chunk-sizing loop in lock_free_mempool_chunk_new to reserve a 16-byte-aligned header size (ALIGN_TO(sizeof(LockFreeMempoolChunk), 16)) instead of the raw sizeof(...).
  • Ensure the computed chunk->size (derived from chunk->mem after ALIGN_PTR_TO(..., 16)) is always sufficient for the requested allocation size.

pavelsavara enabled auto-merge (squash) June 25, 2026 10:48
pavelsavara merged commit 3b98259 into dotnet:main Jun 25, 2026
77 checks passed
pavelsavara deleted the mono-lockfree-mempool-chunk-align branch June 25, 2026 23:47

Reviewers

Copilot code review: left review comments
lewing: approved these changes
BrzVlad: approved these changes
thaystg (code owner): awaiting requested review
steveisok (code owner): awaiting requested review
vitek-karas (code owner): awaiting requested review
mdh1418: awaiting requested review
davidnguyen-tech: awaiting requested review
