status of gcj's boehm collector?
Boehm, Hans
hans_boehm@hp.com
Wed Dec 5 12:04:00 GMT 2001
> -----Original Message-----
> From: Adam Megacz [mailto:gcj@lists.megacz.com]
> Okay, please double check my reasoning here; if this works, I may have
> time to implement it (as a compile-time option) in late February.
> 1. For GCJ-compiled applications, enable
> generational/incremental GC with
> a page-protecting write barrier.
This shouldn't be on by default. But I'll add a way to turn it on through
an environment variable. There should probably be a way to turn it on
directly from Java code. Currently you need to call GC_enable_incremental()
from C.
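A rough sketch of what the environment-variable switch might look like. The variable name GCJ_INCREMENTAL_GC and the helper incremental_requested() are made up for illustration; only GC_enable_incremental() is an existing entry point.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical variable name; the real name has not been decided. */
#define INCREMENTAL_ENV_VAR "GCJ_INCREMENTAL_GC"

/* Return 1 if the given environment value asks for incremental GC.
 * NULL (variable unset), "" and "0" all mean "leave it off", which
 * keeps incremental collection disabled by default. */
int incremental_requested(const char *value)
{
    if (value == NULL || value[0] == '\0')
        return 0;
    return strcmp(value, "0") != 0;
}

/* Intended use at startup, early in initialization:
 *
 *     if (incremental_requested(getenv(INCREMENTAL_ENV_VAR)))
 *         GC_enable_incremental();   // existing C entry point in gc.h
 */
```

The decision is kept separate from the call into the collector so that it can be checked without linking against the GC.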
The reasons you don't want it as the default are:
- I'd want to be convinced that it improves throughput in most cases. I
suspect it doesn't.
- It doesn't interact terribly well with the parallel collector. (I think
nothing really breaks. But the parallel collector ignores any time limit
and basically runs to completion. Since it's fairly expensive to start and
stop the parallel GC, it's not completely clear that's wrong.) Currently
you probably want to enable one or the other. Thus I conjecture you usually
want incremental GC off on a multiprocessor. Pause times should also scale
reasonably well with the number of processors.
> 2. Since all system calls will be made via natXXX.cc in libgcj, add
> code to switch page protection for the target page off just before
> system calls, and then switch it back on afterwards (if it was
> enabled beforehand, of course).
We've been there. The wrappers to do approximately that are currently in
the GC. Unfortunately, to do this correctly you need to acquire a lock or
provide some other form of synchronization. Currently the allocation lock
is held while the system call is executed. This is very unfortunate if the
call blocks.
There are also CNI issues.
We could try to fix the locking scheme, acquire the lock briefly, and keep a
list of pages currently being written by system calls. But with CNI, this
remains a hack. We'd either have to intercept all system calls (which
doesn't quite work since ioctl is extensible by device driver writers), or
require CNI code to do the wrapping. I'm inclined to proceed along the
lines Jeff suggested instead:
1) Make sure hash synchronization works on interesting platforms. (For
win32, I think much of the effort there will be finding a fast way to
compute a thread id, probably by porting the Linux code. Or does win32 also
use a segment register to hold some sort of thread pointer?)
2) CNI code and the runtime are only allowed to make system calls that write
to static data, stack memory, malloced objects or pointerfree objects in the
heap. The latter is guaranteed to include primitive arrays. (We refuse to
really turn on incremental GC unless hash synchronization is available.)
This still restricts CNI, but in a way that I think is easier for a
programmer to remember, and probably automatically satisfied in most cases.
3) Change the collector so pointerfree objects are not protected. I will
take care of that.
One remaining concern I have with both schemes is that we can't get dirty
bit information on static data or thread stacks. That means we can't get GC
pause times to below the time required to scan those. (On win32 that
usually includes the malloc heap, though that might be fixable.) I suspect
the only practical way around that is dirty bit support in the kernel.
> 3. On every occurrence of incremental GC,
>      if (numIncrementalCollections++ > THRESHOLD) {
>          fullHeapScan();
>          numIncrementalCollections = 0;
>      } else {
>          incrementalScan();
>      }
That's all there. It's a bit more complicated than that, in that the full
collection frequency can be adjusted dynamically.
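As a simplified illustration of the counter scheme with an adjustable interval (the collector's actual policy is more involved, and the names gc_schedule, next_scan and set_full_freq are invented for this sketch):

```c
#include <assert.h>

typedef enum { SCAN_INCREMENTAL, SCAN_FULL } scan_kind;

typedef struct {
    int partial_since_full; /* incremental collections since the last full one */
    int full_freq;          /* incremental collections allowed between full ones */
} gc_schedule;

/* Decide which kind of collection to run next and update the counter. */
scan_kind next_scan(gc_schedule *s)
{
    if (s->partial_since_full >= s->full_freq) {
        s->partial_since_full = 0;
        return SCAN_FULL;
    }
    s->partial_since_full++;
    return SCAN_INCREMENTAL;
}

/* Dynamic adjustment: change the interval while the program runs. */
void set_full_freq(gc_schedule *s, int freq)
{
    assert(freq >= 0);
    s->full_freq = freq;
}
```

With full_freq set to 2 the sequence is incremental, incremental, full, and so on. Setting it to 0 makes every collection a full one.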
> 4. When System.gc() is manually invoked by the application, always
> perform a full heap scan (and reset numIncrementalCollections).
I think that's already what happens. (The other theory is that you should
ignore it completely, since that speeds up the average application. I'm not
sure that's a bad theory either. It's probably best just to discourage
use of System.gc(), so that it doesn't matter.)
Hans