musl - an implementation of the standard library for Linux-based systems
log of musl/src/malloc/malloc.c, branch master

move oldmalloc to its own directory under src/malloc
Rich Felker <dalias@aerifal.cx>, 2020-06-03 23:22:12 +0000 (committed 2020-06-03 23:23:02 +0000)
commit 384c0131ccda2656dec23a0416ad3f14101151a7

this sets the stage for replacement, and makes it practical to keep
oldmalloc around as a build option for a while if that ends up being
useful.
only the files which are actually part of the implementation are
moved. memalign and posix_memalign are entirely generic. in theory
calloc could be pulled out too, but it's useful to have it tied to the
implementation so as to optimize out unnecessary memset when
implementation details make it possible to know the memory is already
clear.
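
as an illustration of that memset elision, a minimal sketch (not musl's code; chunk_is_fresh_zeroed is a hypothetical stand-in for whatever implementation detail proves the memory is already clear):

#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* hypothetical helper: only the allocator itself can know whether the
 * chunk was carved out of freshly mapped, kernel-zeroed pages */
int chunk_is_fresh_zeroed(void *p);

void *calloc_sketch(size_t m, size_t n)
{
	if (n && m > (size_t)-1/n) {
		errno = ENOMEM;
		return 0;
	}
	void *p = malloc(m*n);
	if (!p) return 0;
	if (chunk_is_fresh_zeroed(p)) return p;  /* memset elided */
	return memset(p, 0, m*n);
}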

move __expand_heap into malloc.c
Rich Felker <dalias@aerifal.cx>, 2020-06-03 23:17:19 +0000
commit eaa0f2496700c238e7e3c112d36445f3aee06ff1

this function is no longer used elsewhere, and moving it reduces the
number of source files specific to the malloc implementation.

fix unbounded heap expansion race in malloc
Rich Felker <dalias@aerifal.cx>, 2020-06-02 21:37:14 +0000 (committed 2020-06-02 23:39:37 +0000)
commit 3e16313f8fe2ed143ae0267fd79d63014c24779f

this has been a longstanding issue reported many times over the years,
with it becoming increasingly clear that it could be hit in practice.
under concurrent malloc and free from multiple threads, it's possible
to hit usage patterns where unbounded amounts of new memory are
obtained via brk/mmap despite the total nominal usage being small and
bounded.
the underlying cause is that, as a fundamental consequence of keeping
locking as fine-grained as possible, the state where free has unbinned
an already-free chunk to merge it with a newly-freed one, but has not
yet re-binned the combined chunk, is exposed to other threads. this is
bad even with small chunks, and leads to suboptimal use of memory, but
where it really blows up is where the already-freed chunk in question
is the large free region "at the top of the heap". in this situation,
other threads momentarily see a state of having almost no free memory,
and conclude that they need to obtain more.
as far as I can tell there is no fix for this that does not harm
performance. the fix made here forces all split/merge of free chunks
to take place under a single lock, which also takes the place of the
old free_lock, being held at least momentarily at the time of free to
determine whether there are neighboring free chunks that need merging.
as a consequence, the pretrim, alloc_fwd, and alloc_rev operations no
longer make sense and are deleted. simplified merging now takes place
inline in free (__bin_chunk) and realloc.
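
a minimal sketch of the resulting discipline, using an illustrative chunk/bin model rather than musl's actual code; the point is that the whole unbin/merge/re-bin sequence happens under the one split_merge_lock:

#include <pthread.h>
#include <stddef.h>

/* toy chunk model: one size word with the in-use flag in bit 0 */
struct chunk { size_t csize; };
#define C_INUSE       ((size_t)1)
#define CHUNK_SIZE(c) ((c)->csize & ~C_INUSE)
#define NEXT_CHUNK(c) ((struct chunk *)((char *)(c) + CHUNK_SIZE(c)))

/* stand-ins for the real bin data structure */
static void unbin(struct chunk *c) { (void)c; }
static void bin(struct chunk *c)   { (void)c; }

static pthread_mutex_t split_merge_lock = PTHREAD_MUTEX_INITIALIZER;

/* merge a newly freed chunk with its free neighbors and re-bin it, all
 * under split_merge_lock; with the old fine-grained locks, the window
 * between unbin() and bin() was visible to other threads */
void bin_chunk_sketch(struct chunk *self, struct chunk *prev)
{
	pthread_mutex_lock(&split_merge_lock);
	self->csize &= ~C_INUSE;                 /* chunk is no longer in use */
	if (prev && !(prev->csize & C_INUSE)) {  /* merge backward */
		unbin(prev);
		prev->csize += CHUNK_SIZE(self);
		self = prev;
	}
	struct chunk *next = NEXT_CHUNK(self);
	if (!(next->csize & C_INUSE)) {          /* merge forward */
		unbin(next);
		self->csize += CHUNK_SIZE(next);
	}
	bin(self);                               /* publish the merged chunk */
	pthread_mutex_unlock(&split_merge_lock);
}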
as commented in the source, holding the split_merge_lock precludes any
chunk transition from in-use to free state. for the most part, it also
precludes change to chunk header sizes. however, __memalign may still
modify the sizes of an in-use chunk to split it into two in-use
chunks. arguably this should require holding the split_merge_lock, but
that would necessitate refactoring to expose it externally, which is a
mess. and it turns out not to be necessary, at least assuming the
existing sloppy memory model malloc has been using, because if free
(__bin_chunk) or realloc sees any unsynchronized change to the size,
it will also see the in-use bit being set, and thereby can't do
anything with the neighboring chunk that changed size.
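
for reference, a sketch of the kind of split being discussed (illustrative header layout, not musl's actual code); both halves carry the in-use bit at every point, which is what the argument above relies on:

#include <stdint.h>
#include <stdlib.h>

/* toy header layout: one size word directly before the user memory */
struct chunk { size_t csize; };
#define C_INUSE         ((size_t)1)
#define MEM_TO_CHUNK(p) ((struct chunk *)((char *)(p) - sizeof(struct chunk)))
#define CHUNK_TO_MEM(c) ((void *)((char *)(c) + sizeof(struct chunk)))

void __bin_chunk(struct chunk *);  /* gives a chunk back to the allocator */

/* align must be a power of two; a real implementation also ensures the
 * split-off head is at least a minimum chunk size */
void *memalign_sketch(size_t align, size_t len)
{
	unsigned char *mem = malloc(len + align);   /* over-allocate */
	if (!mem) return 0;

	size_t off = -(uintptr_t)mem & (align - 1);
	if (!off) return mem;

	struct chunk *c = MEM_TO_CHUNK(mem);        /* original chunk */
	struct chunk *n = MEM_TO_CHUNK(mem + off);  /* chunk at aligned boundary */
	size_t size = c->csize & ~C_INUSE;

	/* both halves keep C_INUSE set the whole time, so a concurrent
	 * free/realloc on a neighbor that observes the changed sizes also
	 * observes the in-use bit and leaves these chunks alone */
	n->csize = (size - off) | C_INUSE;
	c->csize = off | C_INUSE;

	__bin_chunk(c);              /* return the split-off head chunk */
	return CHUNK_TO_MEM(n);
}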

restore lock-skipping for processes that return to single-threaded state
Rich Felker <dalias@aerifal.cx>, 2020-05-22 21:45:47 +0000
commit 8d81ba8c0bc6fe31136cb15c9c82ef4c24965040

the design used here relies on the barrier provided by the first lock
operation after the process returns to single-threaded state to
synchronize with actions by the last thread that exited. by storing
the intent to change modes in the same object used to detect whether
locking is needed, it's possible to avoid an extra (possibly costly)
memory load after the lock is taken.
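
a sketch of the idea (the flag name and surrounding code are illustrative): the value loaded to decide whether to lock doubles as the mode-change intent, so no second load is needed once the lock has supplied the barrier:

#include <stdatomic.h>

/* illustrative mode flag:
 *    0  single-threaded, locks may be skipped
 *    1  multi-threaded, locks required
 *   -1  last extra thread just exited; keep locking until one lock
 *       operation has supplied the barrier, then drop back to 0   */
static int need_locks;

static void lock(atomic_int *lk)
{
	int nl = need_locks;
	if (!nl) return;                 /* single-threaded: skip the lock */
	while (atomic_exchange(lk, 1))   /* acquire; this is the barrier that
	                                    synchronizes with the exiting thread */
		;
	if (nl < 0) need_locks = 0;      /* mode change decided from the value
	                                    already loaded above: no extra load */
}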

don't use libc.threads_minus_1 as relaxed atomic for skipping locks
Rich Felker <dalias@aerifal.cx>, 2020-05-22 03:32:45 +0000 (committed 2020-05-22 21:39:57 +0000)
commit e01b5939b38aea5ecbe41670643199825874b26c

after all but the last thread exits, the next thread to observe
libc.threads_minus_1==0 and conclude that it can skip locking fails to
synchronize with any changes to memory that were made by the
last-exiting thread. this can produce data races.
on some archs, at least x86, memory synchronization is unlikely to be
a problem; however, with the inline locks in malloc, skipping the lock
also eliminated the compiler barrier, and caused code that needed to
re-check chunk in-use bits after obtaining the lock to reuse a stale
value, possibly from before the process became single-threaded. this
in turn produced corruption of the heap state.
some uses of libc.threads_minus_1 remain, especially for allocation of
new TLS in the dynamic linker; otherwise, it could be removed
entirely. it's made non-volatile to reflect that the remaining
accesses are only made under lock on the thread list.
instead of libc.threads_minus_1, libc.threaded is now used for
skipping locks. the difference is that libc.threaded is permanently
true once an additional thread has been created. this will produce
some performance regression in processes that are mostly
single-threaded but occasionally creating threads. in the future it
may be possible to bring back the full lock-skipping, but more care
needs to be taken to produce a safe design.
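
the difference, in sketch form (illustrative globals; the real fields live in libc's global state):

#include <stdatomic.h>

static int threads_minus_1;   /* extra-thread count; drops back to 0 */
static int threaded;          /* sticks at 1 once a thread was ever created */

static void spin_lock(atomic_int *lk)
{
	while (atomic_exchange(lk, 1))
		;
}

static void lock_heap(atomic_int *lk)
{
	/* old, broken: if (threads_minus_1) spin_lock(lk);
	 * when the count drops to 0, the lock and the barrier that comes
	 * with it are skipped, so in-use bits read before the process went
	 * single-threaded can be reused stale */

	if (threaded) spin_lock(lk);   /* conservative but safe */
}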

move declarations for malloc internals to malloc_impl.h
Rich Felker <dalias@aerifal.cx>, 2018-09-06 20:32:49 +0000 (committed 2018-09-12 18:34:28 +0000)
commit 55a1c9c89028f8930e5f65fe5484fa7ba0e18853

reintroduce hardening against partially-replaced allocator
Rich Felker <dalias@aerifal.cx>, 2018-04-20 02:19:29 +0000 (committed 2018-04-20 02:22:11 +0000)
commit b4b1e10364c8737a632be61582e05a8d3acf5690

commit 618b18c78e33acfe54a4434e91aa57b8e171df89 removed the previous
detection and hardening since it was incorrect. commit
72141795d4edd17f88da192447395a48444afa10 already handled all that
remained for hardening the static-linked case. in the dynamic-linked
case, have the dynamic linker check whether malloc was replaced and
make that information available.
with these changes, the properties documented in commit
c9f415d7ea2dace5bf77f6518b6afc36bb7a5732 are restored: if calloc is
not provided, it will behave as malloc+memset, and any of the
memalign-family functions not provided will fail with ENOMEM.
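
a sketch of the restored properties; __malloc_replaced is the flag the dynamic linker sets per this commit, while the functions around it are illustrative, not musl's code:

#include <errno.h>
#include <stdlib.h>
#include <string.h>

extern int __malloc_replaced;   /* set by the dynamic linker when the
                                   application interposed malloc */

int chunk_known_zero(void *p);  /* hypothetical implementation detail */

void *hardened_calloc_sketch(size_t m, size_t n)
{
	if (n && m > (size_t)-1/n) {
		errno = ENOMEM;
		return 0;
	}
	void *p = malloc(m*n);
	if (!p) return 0;
	/* implementation shortcuts are only valid with the native malloc;
	 * with a replaced malloc this is exactly malloc + memset */
	if (!__malloc_replaced && chunk_known_zero(p)) return p;
	return memset(p, 0, m*n);
}

void *hardened_memalign_sketch(size_t align, size_t len)
{
	/* splitting a chunk requires this allocator's header layout; refuse
	 * rather than corrupt a replacement allocator's heap */
	if (__malloc_replaced) {
		errno = ENOMEM;
		return 0;
	}
	(void)align; (void)len;
	return 0;   /* normal over-allocate-and-split path omitted */
}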

return chunks split off by memalign using __bin_chunk instead of free
Rich Felker <dalias@aerifal.cx>, 2018-04-20 00:56:26 +0000
commit 72141795d4edd17f88da192447395a48444afa10

this change serves multiple purposes:
1. it ensures that static linking of memalign-family functions will
pull in the system malloc implementation, thereby causing link errors
if an attempt is made to link the system memalign functions with a
replacement malloc (incomplete allocator replacement).
2. it eliminates calls to free that are unpaired with allocations,
which are confusing when setting breakpoints or tracing execution.
as a bonus, making __bin_chunk external may discourage aggressive and
unnecessary inlining of it.

move malloc implementation types and macros to an internal header
Rich Felker <dalias@aerifal.cx>, 2018-04-19 22:43:05 +0000 (committed 2018-04-19 22:44:17 +0000)
commit 23389b1988b061e8487c316893a8a8eb77770a2f

revert detection of partially-replaced allocator
Rich Felker <dalias@aerifal.cx>, 2018-04-19 19:25:48 +0000
commit 618b18c78e33acfe54a4434e91aa57b8e171df89

commit c9f415d7ea2dace5bf77f6518b6afc36bb7a5732 included checks to
make calloc fall back to memset if used with a replaced malloc that
didn't also replace calloc, and the memalign family fail if free has
been replaced. however, the checks gave false positives for
replacement whenever malloc or free resolved to a PLT entry in the
main program.
for now, disable the checks so as not to leave libc in a broken state.
this means that the properties documented in the above commit are no
longer satisfied; failure to replace calloc and the memalign family
along with malloc is unsafe if they are ever called.
the calloc checks were correct but useless for static linking. in both
cases (simple or full malloc), calloc and malloc are in a source file
together, so replacement of one but not the other would give linking
errors. the memalign-family check was useful for static linking, but
broken for dynamic as described above, and can be replaced with a
better link-time check.
