status of gcj's boehm collector?

Tue Dec 4 20:58:00 GMT 2001

Jeff Sturm <jsturm@one-point.com> writes:
> > Hrm, so in the failure case, the GC suspects that a page has not been
> > modified, when in fact it has. Worst case, you get a memory
> > leak.

> Worst case, you prematurely free a live object, because you may miss a
> new pointer stored in the older generation.

Crap. Okay, now I understand. I keep forgetting that mark-and-sweep is
a negative operation... you don't know what's bad until you've scanned
everything that's good.
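
To make that concrete, here is a toy sketch in C. It is not the Boehm
collector's code, and the struct and function names are invented for
illustration. Old objects are rescanned only when their page is believed
dirty, so a missed dirty bit means the young object they point to gets
swept even though it is still reachable:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy object: at most one outgoing pointer, stored as a heap index. */
struct obj {
    int  child;   /* index of the referenced object, or -1 for none */
    bool old;     /* true if the object is in the older generation */
    bool dirty;   /* what the collector believes about the object's page */
    bool marked;  /* set during the mark phase */
    bool live;    /* false once the object has been swept */
};

/* Mark the chain of young objects starting at index i. */
static void mark_young(struct obj *heap, int i)
{
    while (i >= 0 && heap[i].live && !heap[i].old && !heap[i].marked) {
        heap[i].marked = true;
        i = heap[i].child;
    }
}

/* Minor collection with no stack roots: the only roots are old objects
 * on pages believed dirty. Returns the number of young objects freed. */
static int minor_collect(struct obj *heap, int n)
{
    int freed = 0;
    assert(n > 0);
    for (int i = 0; i < n; i++)
        heap[i].marked = false;
    for (int i = 0; i < n; i++)
        if (heap[i].live && heap[i].old && heap[i].dirty)
            mark_young(heap, heap[i].child);
    for (int i = 0; i < n; i++)
        if (heap[i].live && !heap[i].old && !heap[i].marked) {
            heap[i].live = false;   /* swept: nothing proved it was live */
            freed++;
        }
    return freed;
}

/* Old object 0 points at young object 1. Returns whether object 1
 * survives a minor collection, given the dirty bit the collector saw. */
bool young_survives(bool dirty_bit_seen)
{
    struct obj heap[2] = {
        { .child = 1,  .old = true,  .dirty = dirty_bit_seen,
          .marked = false, .live = true },
        { .child = -1, .old = false, .dirty = false,
          .marked = false, .live = true },
    };
    minor_collect(heap, 2);
    return heap[1].live;
}
```

With the dirty bit seen, young_survives(true) keeps object 1. With the
dirty bit missed, young_survives(false) frees it, although the old
object still points at it.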

> One solution discussed before is to modify the collector to allocate
> pointer-free objects on pages that won't be write-protected. That
> should be good enough for libgcj when hash synchronization is enabled.
> (Without hash synchronization even primitive arrays aren't pointer-free.)

Ah, very neat...
Does gcj's boehm gc use the class pointer on heap objects to determine
what parts of them are pointers and what parts aren't?

> There's an easier way if you can tolerate a slight overhead in I/O. Wrap
> calls to read() so they modify a local (stack-allocated) buffer, then copy
> the buffer to heap memory. I think it would be safe even for
> multithreaded use, and not require changes to the collector.
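
The wrapper described above might look roughly like this in C. The name
read_via_stack and the 4096-byte bounce buffer are assumptions made for
this sketch, not anything taken from libgcj or the collector. The kernel
only ever writes into the stack buffer; the heap is then updated by an
ordinary user-space memcpy():

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>

#define BOUNCE_SIZE 4096   /* assumed size of the stack-allocated buffer */

/* Hypothetical wrapper around read(): reads into a local buffer, then
 * copies the bytes to heap_dst. Returns what read() returned, which may
 * be less than count (never more than BOUNCE_SIZE per call). */
ssize_t read_via_stack(int fd, void *heap_dst, size_t count)
{
    char bounce[BOUNCE_SIZE];
    size_t want = count < sizeof bounce ? count : sizeof bounce;
    ssize_t n;

    assert(heap_dst != NULL);

    /* Retry if the call is interrupted by a signal. */
    do {
        n = read(fd, bounce, want);
    } while (n < 0 && errno == EINTR);

    /* The heap is touched only here, by a normal store from user space. */
    if (n > 0)
        memcpy(heap_dst, bounce, (size_t)n);
    return n;
}
```

Callers have to cope with short reads anyway, so capping each call at
the bounce buffer size does not change the contract of read(); the cost
is one extra copy per call.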

Is read() the only system call capable of modifying pointers on the
heap? When would you read() a pointer value? Data acquired through
read() usually comes from another process (or machine), which wouldn't
be capable of generating meaningful pointers into your address
space... unless you sent that data to the other process, but you can't
do that in Java.

Anyway, if read() is the only call that can "silently dirty" a page,
you could just ignore this and then ask the application author to
periodically call fullHeapScan() (or have the collector do it
automatically if the application hasn't done so within $THRESHOLD
incremental collections).
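
A minimal sketch of that fallback policy in C, purely hypothetical:
struct gc_policy, note_full_scan() and should_do_full_scan() are names
made up here, and threshold stands in for $THRESHOLD. The collector
would ask should_do_full_scan() before each collection:

```c
#include <stdbool.h>

/* Hypothetical policy state for forcing periodic full heap scans. */
struct gc_policy {
    unsigned threshold;        /* incremental collections allowed in a row */
    unsigned since_full_scan;  /* incremental collections since last full scan */
};

/* Record that a full heap scan just happened, for example because the
 * application asked for one itself. */
void note_full_scan(struct gc_policy *p)
{
    p->since_full_scan = 0;
}

/* Decide the kind of the next collection. Returns true when it should
 * be a full heap scan, false when an incremental one is acceptable. */
bool should_do_full_scan(struct gc_policy *p)
{
    if (p->since_full_scan >= p->threshold) {
        p->since_full_scan = 0;   /* the full scan resets the count */
        return true;
    }
    p->since_full_scan++;
    return false;
}
```

With threshold set to 3, three incremental collections are allowed and
the fourth is forced to be a full scan, unless note_full_scan() was
called in between.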
 - a