-
Notifications
You must be signed in to change notification settings - Fork 8k
Make some parts of _zend_mm_heap read-only at runtime #14570
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
I've talked to @bwoebi about this before and some profilers like DataDog depend on this.
Unfortunately this will break some useful functionality:
- https://github.com/arnaud-lb/php-memory-profiler uses the custom callbacks to hook into memory allocations, and I would expect most memory profilers to do that. Having ZEND_MM_CUSTOM enabled by default allows profilers to work on standard php builds. Proposing ZendMM Observer API #11758 proposed an alternative API for profilers, but it also relies on function pointers in the zend_mm_heap struct. One other way to hook into the allocator would be to override the _emalloc/_efree functions and their variants, but this seems much more involved, less easy to maintain, and I don't know if this is practicable. Maybe @bwoebi @realFlowControl have some insights on this?
- This will also break
USE_ZEND_ALLOC=0
, which is sometimes useful in non-debug builds for debugging/analysis
Instead of disabling the functionality we could mangle the function pointers with a secret. I believe that the glibc does this to protect atexit()
callbacks, but I don't know how effective this is. This would be defeated by learning the secret, but we could argue that if they have a primitive to leak the secret, they could find the address of an other function pointer as well.
Alternatively, we could store the custom callbacks in a separate readonly memory region. In zend_mm_init(), allocate two pages for the zend_mm_heap + the function pointers. mprotect() the second page. Unprotect when necessary, e.g. in zend_mm_set_custom_handlers(). As an additional benefit this page would double as a guard page.
We might want to protect zend_mm_heap.storage
as well, as it also points to a struct of function pointers.
I thought about having some parts of zend_mm_heap
made read-only, like done in other allocators like hardened_malloc, but it sounded like complexity and maybe performance impact.
But I agree, having function pointers as well as other read-only fields like shadow_key
, limit
, pid
, ... would be interesting.
I don't think disabling ZEND_MM_CUSTOM is a good thing for all the reasons Arnaud outlined. Overriding the _emalloc/_efree needs platform dependent work and is not easy.
The proposal to make the parts of the head readonly sounds more reasonable. After all, these need to only be initializaed once per GINIT, so that sounds pretty okay I think.
I thought about having some parts of
zend_mm_heap
made read-only, like done in other allocators like hardened_malloc, but it sounded like complexity and maybe performance impact.But I agree, having function pointers as well as other read-only fields like
shadow_key
,limit
,pid
, ... would be interesting.
I think if we only store custom_heap
and storage
in the readonly page, there will be no overhead as we never update these fields in the common case, and this will keep the change simple: We need to protect the page in zend_mm_init()
, which is called once per process/thread, and then we never need to unprotect it unless zend_mm_set_custom_handlers()
is used.
We can leave the use_custom_heap
boolean in zend_mm_heap
as changing it doesn't help exploitation, and I think moving it farther from other fields may have performance implications.
[...]
We can leave theuse_custom_heap
boolean inzend_mm_heap
as changing it doesn't help exploitation, and I think moving it farther from other fields may have performance implications.
The Datadog profiler relies on the use_custom_heap
boolean to be the first element in the zend_mm_heap
, in order to bypass side effects when installing a custom handler to observe, but forwarding the calls back to the same heap. I started some work to fix these side effects and/or give APIs that allow extension developer to work around this. As of now, we rely on this field being where it is and being writeable.
I changed the code to go the read-only way :)
d91c592
to
120967f
Compare
120967f
to
96cbe36
Compare
We have various other function pointers not living in .rodata. Examples are zend_execute_ex
and object->handlers
(with another level of indirection). zend_execute_ex
could be addressed in a similar manner, but I don't see how object->handlers
can. Does it make sense to fix some of those but leave others? Are some particularly problematic, compared to others?
@iluuu1994 These pointers in zend_mm_heap are a particularly interesting for attackers because they can learn their address trivially. We can protect them very effectively and without overhead, so it seems beneficial to do so.
An other dangerous one, similar to object handlers, is the array dtor function, as attackers have plenty of ways to place arrays on the heap. It would be great if we could protect those too. I'm not sure how to, either. Mangling/xor would increase difficulty a bit, but I expect a performance regression especially for objects.
96cbe36
to
d25588c
Compare
I suppose, this can't be compatible with USE_ZEND_ALLOC_HUGE_PAGES
.
I suppose, this can't be compatible with USE_ZEND_ALLOC_HUGE_PAGES.
Why not? It'll simply use a bit more memory.
d25588c
to
25a3663
Compare
c1dd1ee
to
44bc6fd
Compare
3aca506
to
9044d3e
Compare
9044d3e
to
7bd296d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This LGTM apart from a few minor comments
As [presented at OffensiveCon 2024](https://youtu.be/dqKFHjcK9hM?t=1622), having trivially callable writeable function pointers at the top of the heap makes it straightforward to turn a limited write into an arbitrary code execution. Disabling ZEND_MM_HEAP by default isn't doable, as it's used by a couple of profilers, so we're making some parts of `_zend_mm_heap` read-only at runtime instead: this will prevent the custom heap functions pointers from being hijacked, as well as the custom storage ones. We don't put the shadow_key there, since it has a performance impact, and an attacker able to precisely overwrite it is likely already able to read it anyway.
7bd296d
to
7d35e89
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
@dstogov do you want to take an other look?
zend_alloc.c was especially optimized for performance. It's especially stored heap
as a part of the first chunk, to improve data locality and CPU cache usage, for an ability to fit into a into a single huge page and to reduce TLB pressure.
This patch changes some key design ideas. At least it should be benchmarked with callgrind (with cache simulation) and on real hardware.
From a quick review, I don't see anything terrible. (I'm going to be on vacation next two weeks).
This patch changes some key design ideas. At least it should be benchmarked with callgrind (with cache simulation) and on real hardware.
Any recommendation on how to properly benchmark php?
For valgrind, check how https://github.com/php/php-src/blob/master/benchmark/benchmark.php does it. The symfony demo and wordpress benchmarks are probably the most relevant in this case.
jvoisin@chernabog 14:58 (gmp_bench *) ~/dev/php-src time php ./benchmark/benchmark.php false sapi/cgi/php-cgi
> 'valgrind' '--tool=callgrind' '--dump-instr=yes' '--callgrind-out-file=/home/jvoisin/dev/php-src/benchmark/profiles/callgrind.out.bench' '--' 'sapi/cgi/php-cgi' '-T1' '-d max_execution_time=0' '-d opcache.enable=1' '-d opcache.jit=disable' '-d opcache.jit_buffer_size=128M' '-d opcache.validate_timestamps=0' '/home/jvoisin/dev/php-src/Zend/bench.php'
> 'valgrind' '--tool=callgrind' '--dump-instr=yes' '--callgrind-out-file=/home/jvoisin/dev/php-src/benchmark/profiles/callgrind.out.bench.jit' '--' 'sapi/cgi/php-cgi' '-T1' '-d max_execution_time=0' '-d opcache.enable=1' '-d opcache.jit=tracing' '-d opcache.jit_buffer_size=128M' '-d opcache.validate_timestamps=0' '/home/jvoisin/dev/php-src/Zend/bench.php'
> '/usr/bin/php' '/home/jvoisin/dev/php-src/benchmark/repos/symfony-demo-2.2.3/bin/console' 'cache:clear'
> '/usr/bin/php' '/home/jvoisin/dev/php-src/benchmark/repos/symfony-demo-2.2.3/bin/console' 'cache:warmup'
> 'valgrind' '--tool=callgrind' '--dump-instr=yes' '--callgrind-out-file=/home/jvoisin/dev/php-src/benchmark/profiles/callgrind.out.symfony-demo' '--' 'sapi/cgi/php-cgi' '-T50,50' '-d max_execution_time=0' '-d opcache.enable=1' '-d opcache.jit=disable' '-d opcache.jit_buffer_size=128M' '-d opcache.validate_timestamps=0' '/home/jvoisin/dev/php-src/benchmark/repos/symfony-demo-2.2.3/public/index.php'
> '/usr/bin/php' '/home/jvoisin/dev/php-src/benchmark/repos/symfony-demo-2.2.3/bin/console' 'cache:clear'
> '/usr/bin/php' '/home/jvoisin/dev/php-src/benchmark/repos/symfony-demo-2.2.3/bin/console' 'cache:warmup'
> 'valgrind' '--tool=callgrind' '--dump-instr=yes' '--callgrind-out-file=/home/jvoisin/dev/php-src/benchmark/profiles/callgrind.out.symfony-demo.jit' '--' 'sapi/cgi/php-cgi' '-T50,50' '-d max_execution_time=0' '-d opcache.enable=1' '-d opcache.jit=tracing' '-d opcache.jit_buffer_size=128M' '-d opcache.validate_timestamps=0' '/home/jvoisin/dev/php-src/benchmark/repos/symfony-demo-2.2.3/public/index.php'
{
"Zend\/bench.php": {
"instructions": "3602853467"
},
"Zend\/bench.php JIT": {
"instructions": "3602853424"
},
"Symfony Demo 2.2.3": {
"instructions": "1049217139"
},
"Symfony Demo 2.2.3 JIT": {
"instructions": "1049216240"
}
}
real 9m35.845s
user 9m28.012s
sys 0m4.266s
jvoisin@chernabog 15:07 (gmp_bench *) ~/dev/php-src
jvoisin@chernabog 15:09 (master *) ~/dev/php-src time php ./benchmark/benchmark.php false sapi/cgi/php-cgi
> 'valgrind' '--tool=callgrind' '--dump-instr=yes' '--callgrind-out-file=/home/jvoisin/dev/php-src/benchmark/profiles/callgrind.out.bench' '--' 'sapi/cgi/php-cgi' '-T1' '-d max_execution_time=0' '-d opcache.enable=1' '-d opcache.jit=disable' '-d opcache.jit_buffer_size=128M' '-d opcache.validate_timestamps=0' '/home/jvoisin/dev/php-src/Zend/bench.php'
> 'valgrind' '--tool=callgrind' '--dump-instr=yes' '--callgrind-out-file=/home/jvoisin/dev/php-src/benchmark/profiles/callgrind.out.bench.jit' '--' 'sapi/cgi/php-cgi' '-T1' '-d max_execution_time=0' '-d opcache.enable=1' '-d opcache.jit=tracing' '-d opcache.jit_buffer_size=128M' '-d opcache.validate_timestamps=0' '/home/jvoisin/dev/php-src/Zend/bench.php'
> '/usr/bin/php' '/home/jvoisin/dev/php-src/benchmark/repos/symfony-demo-2.2.3/bin/console' 'cache:clear'
> '/usr/bin/php' '/home/jvoisin/dev/php-src/benchmark/repos/symfony-demo-2.2.3/bin/console' 'cache:warmup'
> 'valgrind' '--tool=callgrind' '--dump-instr=yes' '--callgrind-out-file=/home/jvoisin/dev/php-src/benchmark/profiles/callgrind.out.symfony-demo' '--' 'sapi/cgi/php-cgi' '-T50,50' '-d max_execution_time=0' '-d opcache.enable=1' '-d opcache.jit=disable' '-d opcache.jit_buffer_size=128M' '-d opcache.validate_timestamps=0' '/home/jvoisin/dev/php-src/benchmark/repos/symfony-demo-2.2.3/public/index.php'
> '/usr/bin/php' '/home/jvoisin/dev/php-src/benchmark/repos/symfony-demo-2.2.3/bin/console' 'cache:clear'
> '/usr/bin/php' '/home/jvoisin/dev/php-src/benchmark/repos/symfony-demo-2.2.3/bin/console' 'cache:warmup'
> 'valgrind' '--tool=callgrind' '--dump-instr=yes' '--callgrind-out-file=/home/jvoisin/dev/php-src/benchmark/profiles/callgrind.out.symfony-demo.jit' '--' 'sapi/cgi/php-cgi' '-T50,50' '-d max_execution_time=0' '-d opcache.enable=1' '-d opcache.jit=tracing' '-d opcache.jit_buffer_size=128M' '-d opcache.validate_timestamps=0' '/home/jvoisin/dev/php-src/benchmark/repos/symfony-demo-2.2.3/public/index.php'
{
"Zend\/bench.php": {
"instructions": "3603167523"
},
"Zend\/bench.php JIT": {
"instructions": "3603167510"
},
"Symfony Demo 2.2.3": {
"instructions": "1051339479"
},
"Symfony Demo 2.2.3 JIT": {
"instructions": "1051339825"
}
}
real 9m38.145s
user 9m30.166s
sys 0m4.276s
jvoisin@chernabog 15:20 (master *) ~/dev/php-src
quick'n'dirty testing doesn't show anything incredible performance-wise
As the person who has created the Valgrind benchmark: It's shown not to be super reliable in the past. Some changes were more or much less profitable/expensive than Valgrind suggested. I would suggest running a benchmark with php-cgi -T{warmup},{repeat} -d opcache.enable=1 benchmark/repos/symfony-demo-2.2.3/public/index.php
instead. You might also want to disable turbo boost or similar mechanisms. I also usually use taskset
and exclude a CPU core from the scheduler.
As the person who has created the Valgrind benchmark: It's shown not to be super reliable in the past. Some changes were more or much less profitable/expensive than Valgrind suggested. I would suggest running a benchmark with
php-cgi -T{warmup},{repeat} -d opcache.enable=1 benchmark/repos/symfony-demo-2.2.3/public/index.php
instead. You might also want to disable turbo boost or similar mechanisms. I also usually usetaskset
and exclude a CPU core from the scheduler.
Can this be documented somewhere? Ideally in a README.md file in the benchmark/
folder :/
More (ghetto) benchmarks, impact in the noise levels:
jvoisin@chernabog 15:56 (master *) ~/dev/php-src time sapi/cgi/php-cgi -T100,3000 -d opcache.enable=1 benchmark/repos/symfony-demo-2.2.3/public/index.php 2>/dev/null >/dev/null
real 2m52.627s
user 2m37.837s
sys 0m13.672s
jvoisin@chernabog 15:59 (master *) ~/dev/php-src
jvoisin@chernabog 16:02 (disable_custom_heap_default *) ~/dev/php-src time sapi/cgi/php-cgi -T100,3000 -d opcache.enable=1 benchmark/repos/symfony-demo-2.2.3/public/index.php 2>/dev/null >/dev/null
real 2m53.594s
user 2m38.454s
sys 0m13.856s
jvoisin@chernabog 16:05 (disable_custom_heap_default *) ~/dev/php-src
@jvoisin Thanks! For the warmup to be effective, you should look at the time printed by php-cgi itself (it goes to stderr). But it doesn't look like there will be a big change either way, which is good!
With a bigger warmup time:
jvoisin@chernabog 16:05 (disable_custom_heap_default *) ~/dev/php-src time sapi/cgi/php-cgi -T1000,3000 -d opcache.enable=1 benchmark/repos/symfony-demo-2.2.3/public/index.php 2>/dev/null >/dev/null
real 3m44.438s
user 3m24.555s
sys 0m18.260s
jvoisin@chernabog 16:15 (disable_custom_heap_default *) ~/dev/php-src
jvoisin@chernabog 16:21 (master) ~/dev/php-src time sapi/cgi/php-cgi -T1000,3000 -d opcache.enable=1 benchmark/repos/symfony-demo-2.2.3/public/index.php 2>/dev/null >/dev/null
real 3m43.036s
user 3m23.905s
sys 0m17.655s
jvoisin@chernabog 16:24 (master) ~/dev/php-src
@jvoisin I think you misunderstood. You should not use time
, but look at the stderr of php-cgi
. Anyway, it's unlikely to make a difference.
the last benchmark shows 0.6% real time slowdown, including initialization and warm-up. Without them the slowdown should be probably even more significant.
@dstogov These slowdowns come to a point where I wonder whether we should make these #if ZEND_ALLOC_HARDENED
and such.
And possibly not enable by default? :-/
@dstogov These slowdowns come to a point where I wonder whether we should make these
#if ZEND_ALLOC_HARDENED
and such.And possibly not enable by default? :-/
I would agree. We already introduced ZEND_MM_HEAP_PROTECTION
, but it's currently enabled by default.
IMO, opt-in security is useless.
@iluuu1994 I do agree, but all we're doing here is some hardening after an exploit was already found, nothing insurmountable for attackers. It's not actually fixing gaping holes in PHPs security or fully preventing exploit classes.
And the price to pay for that is high in terms of performance. So it probably should be actually disabled by default and users/distributions which provide hardened build. Like Gentoo would be probably a good target for these, who also compile with mitigations like stack-clash-protection by default.
@dstogov can you please share how you did your benchmarks so I can them them here as well and try to reduce the performance impact?
@dstogov can you please share how you did your benchmarks so I can them them here as well and try to reduce the performance impact?
I didn't bench this. I used your numbers. 3m44.438s / 3m43.036s = 224.438 / 223.036 = 1.00628 ~= 0.6% slowdown
It's better to use execution time provided by php-cgi. The same command, just see "Execution Time: ..." at stderr tail. It excludes startup/shutdown and warm up time.
Uh oh!
There was an error while loading. Please reload this page.
(削除) PHP's heap implementation is the one that virtually everybody uses: it's fast, it's there by default, it works, ...The only major ever I've found of custom heap implementation is phpdbg but it looks dispensable at best. Some other debuggers and profilers might use it, and that's alright, but I don't think that this feature should be enabled by default. (削除ここまで)
(削除) Disabling ZEND_MM_CUSTOM will allow to save a couple of bytes (yay), but the main goal is to close a low-hanging exploitation vector: as presented at OffensiveCon 2024, having trivially callable writeable function pointers at the top of the heap makes it straightforward to turn a limited write into an arbitrary code execution. (削除ここまで)As presented at OffensiveCon 2024, having trivially
callable writeable function pointers at the top of the heap makes it straightforward to turn a limited write into an arbitrary code execution.
Disabling ZEND_MM_HEAP by default isn't doable, as it's used by a couple of profilers, so we're making some parts of
_zend_mm_heap
read-only at runtime instead: this will prevent the custom heap functions pointers from being hijacked.cc @arnaud-lb