dynamic library cost (was RE: libtool, java woes)
Corey Minyard
minyard@acm.org
Thu Apr 12 18:27:00 GMT 2001
"Boehm, Hans" <hans_boehm@hp.com> writes:
> > -----Original Message-----
> > From: Corey Minyard [ mailto:minyard@acm.org ]
> >
> > The only advantage I can see to a _Jv_ function is you could make it
> > weak and then be able to use different GCs with the same GCJ library.
> > That's convenient for me, but I'm not sure of the general benefit.
> > You would have to handle all the accurate scanning of object and
> > arrays and initialization, as part of the "generic" interface, too.
> > That would be my ideal, but it would probably take some time to get
> > there and I'm not sure if it's worth it.
> I'm not sure how your collector currently scans objects. I'm trying to
> isolate the collector library from libgcj sufficiently that the libgcj code
> doesn't need to include GC private header files and hence no longer depends
> on offsets in GC-private data structures. If this all works, eventually
> different versions of our GC should work with the same libgcj. So there
> will be more of a well-defined interface between the two. I suspect that a
> fundamentally similar collector might be able to use the same interface, but
> you'd be stuck with the same object layout descriptors, etc. And the
> interface probably isn't sufficient for something like a mostly copying
> collector.
>
> I don't have a clear picture how much it would cost to generalize the
> interface. My guess is that separating the current GC from libgcj to even
> this small extent is likely to be feasible only because the GC traversal
> rarely needs to call back into libgcj, and thus we can afford to add a
> little overhead there.
I could see how a copying collector would be a big problem. I haven't
gone beyond theory on those, though, so I don't know a lot about them.
My accurate marking interface is somewhat different. I don't have a
concept of a mark stack. Or, more accurately, it's not passed in to
the marking routine. It's more of a Baker's treadmill with a single
grey list. The grey list is kept on a page basis, so it can
essentially contain all pages being managed. Actually, I haven't
really looked to see what the mark stack is, I'm assuming it's sort of
like a grey list.
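
As a rough illustration of the page-based grey list described above, here is a minimal sketch in C. All names (struct page, grey_page, mark_slice, NUM_PAGES) are hypothetical and are not taken from either collector. The sketch assumes each page carries its own link field and "on the list" flag, so a page appears at most once and the list can never hold more than every managed page.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_PAGES 8   /* hypothetical heap size, in pages */

/* Per-page record; the grey list is threaded through the pages themselves. */
struct page {
    struct page *next_grey;  /* next page awaiting a scan, or NULL */
    int is_grey;             /* nonzero while the page is on the grey list */
    int scanned;             /* scan count, kept only for demonstration */
};

static struct page pages[NUM_PAGES];
static struct page *grey_head = NULL;

/* Put a page on the grey list unless it is already there. */
static void grey_page(struct page *p) {
    assert(p != NULL);
    if (p->is_grey)
        return;
    p->is_grey = 1;
    p->next_grey = grey_head;
    grey_head = p;
}

/* Scan at most `budget` grey pages and return how many were scanned.
   A real collector would trace the pointers on each page here, which
   may in turn grey other pages. */
static int mark_slice(int budget) {
    int done = 0;
    while (grey_head != NULL && done < budget) {
        struct page *p = grey_head;
        grey_head = p->next_grey;
        p->next_grey = NULL;
        p->is_grey = 0;
        p->scanned++;
        done++;
    }
    return done;
}
```

Because no mark stack is passed around, a marking routine in this scheme only needs to call something like grey_page() on the page that holds a newly reached object.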
It also has the ability to scan only parts of the object. So, if you
have a very large object and it gets updated, only the page that was
updated gets re-marked. And large objects get marked a page at a time so
a really large object doesn't blow the collector's real-time slice.
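
The partial scanning above could look roughly like the following C sketch. The names (struct large_object, note_write, mark_dirty_pages) and the 4096-byte page size are assumptions made for illustration, not the actual interface. A write records only the page it touched, and marking handles a bounded number of pages per call so one huge object cannot use up a whole time slice.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE 4096u   /* assumed page size */
#define MAX_PAGES 16u     /* hypothetical limit for this sketch */

struct large_object {
    size_t size;                     /* object size in bytes */
    unsigned char dirty[MAX_PAGES];  /* 1 = this page needs (re)marking */
};

/* Number of pages the object spans, rounded up. */
static size_t page_count(const struct large_object *o) {
    size_t n = (o->size + PAGE_SIZE - 1) / PAGE_SIZE;
    assert(n <= MAX_PAGES);
    return n;
}

/* Write barrier hook: remember only the page containing the write. */
static void note_write(struct large_object *o, size_t offset) {
    assert(offset < o->size);
    o->dirty[offset / PAGE_SIZE] = 1;
}

/* Re-mark at most `max_pages` dirty pages and return how many were
   processed. A real collector would scan the pointer slots of each
   such page at the point where the flag is cleared. */
static size_t mark_dirty_pages(struct large_object *o, size_t max_pages) {
    size_t n = page_count(o);
    size_t done = 0;
    for (size_t i = 0; i < n && done < max_pages; i++) {
        if (o->dirty[i]) {
            o->dirty[i] = 0;
            done++;
        }
    }
    return done;
}
```

Two writes to the same page cost one re-mark, and a call with max_pages set to 1 leaves any remaining dirty pages for the next slice.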
I'm also planning to add the ability to only write barrier pages that
have pointers. One of my users wants to allocate a 512MB array of
bytes to work with! With pointers just at the beginning, it would be
a big waste to write barrier the whole thing. But that's probably
another interface change.
The version on my web page is somewhat dated; if you want to have a
look, I can put a newer version up there.
>
> Should we consider passing just the size? That wouldn't give the allocator
> any opportunity to handle different types differently. It still seems
> marginally cleaner to me to pass the vtable, given that at least in my case,
> the collector eventually looks at the mark descriptor stored there.
To be truly generic, you would probably want the size and an id that
references the marking routine to use (much like your standard
interface). That way, if a compiler wanted to have multiple marking
routines for different types of objects, it could (thinking about
Objective-C and the like). Of course, then you have to have special
ones for bitmap marking, etc.
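
A minimal C sketch of the "size plus marker id" idea follows. Everything in it is hypothetical (register_marker, generic_alloc, mark_object, the header layout, and the use of calloc as a stand-in for the real allocator); it is not the libgcj or Boehm collector interface. The point is only that the allocator sees a size and an opaque id, and the collector later uses the id to pick the marking routine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* A marking routine receives the object and its size. */
typedef void (*marker_fn)(void *obj, size_t size);

#define MAX_MARKERS 8
static marker_fn markers[MAX_MARKERS];
static int num_markers = 0;

/* Register a marking routine; returns its id, or -1 if the table is full. */
static int register_marker(marker_fn fn) {
    if (num_markers >= MAX_MARKERS)
        return -1;
    markers[num_markers] = fn;
    return num_markers++;
}

/* Hidden header placed in front of every object (assumed layout). */
struct header {
    size_t size;
    int marker_id;
};

/* The generic allocation call: only a size and a marker id are passed. */
static void *generic_alloc(size_t size, int marker_id) {
    assert(marker_id >= 0 && marker_id < num_markers);
    struct header *h = calloc(1, sizeof *h + size);
    if (h == NULL)
        return NULL;
    h->size = size;
    h->marker_id = marker_id;
    return h + 1;   /* the object starts right after the header */
}

/* Collector side: dispatch to the routine the object was allocated with. */
static void mark_object(void *obj) {
    struct header *h = (struct header *)obj - 1;
    markers[h->marker_id](obj, h->size);
}

/* Example marker that just counts the bytes it was asked to mark. */
static size_t marked_bytes = 0;
static void count_marker(void *obj, size_t size) {
    (void)obj;
    marked_bytes += size;
}
```

A compiler front end for another language could register its own routines this way, while bitmap-style marking would be just another registered id with its descriptor stored somewhere the routine can find it.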
> This brings up the issue of pointerfree objects. They're very special to
> our collector, in that they don't have to be touched during tracing, which
> saves memory and potentially disk traffic. Assuming hash synchronization, I
> suspect they constitute a significant fraction of objects in some
> applications (e.g. the SPEC raytracing benchmark?) It would be nice to have
> the compiler issue a different allocation call for those. This is the only
> case I can think of in which I would otherwise consider interpreting parts
> of the vtable in the allocator.
I thought all objects at least had to mark their class object and sync
info data structure (or maybe that's what you mean by hash
synchronization?). I agree that pointer-free objects are a big win if
you can get there, but if we want to implement class removal when a
class becomes unused, that will be difficult. Maybe we could do
refcounts? But then every object would have to have a finalizer, and
that would be worse.
-Corey