Interesting paper on Supporting Binary Compatibility with Static Compilation
Bryce McKinlay
bryce@waitaki.otago.ac.nz
Sun Aug 18 22:24:00 GMT 2002
Jeff Sturm wrote:
>On 12 Aug 2002, Andrew Haley wrote:
>> > As a data point, when I build my CMS app with gcj I have 7 DSOs totalling
>> > about 190,000 load-time relocations (not counting libgcj.so). Some of
>> > these are resolved lazily, most are in .data and cannot be. Startup
>> > time is about 2 seconds on sparc-solaris and initial memory footprint
>> > around 28MB. Not too impressive, compared with 1 second and 15MB for the
>> > JRE.
>>
>>That's weird, because IME interpreted Java takes forever to start
>>because of lazy class loading.
>
>Yes, but... there are some 1900 classes in my app, plus another 1326 in
>libgcj.jar.
>
>With the JRE, I see just 296 classes loaded initially, and 395 when it
>reaches steady-state.
>
>With gcj, I have to wait for ld.so to link ~3200 classes before anything
>happens.
>
>Suppose the compiled class metadata were free of pointers instead. No
>relocations, except lazy function calls. The metadata could then be
>constant and loaded into .rodata. Some advantages:
>
When thinking about the layout of the class and binary compatibility
structures I've worked on the assumption that non-symbolic, private
relocations within the same binary object are much cheaper than symbolic
relocations (like the vtable ones) which need to be looked up globally.
If that isn't really the case then I guess we'd need to re-think the design.

However on Linux, libgcj's startup time is still insignificant compared
to the Hotspot VM, especially with the most recent glibc versions. Other
OSs like Mac OS X support prelinking which basically eliminates the need
for runtime relocations (at the cost of waiting around while prebindings
get updated every time you install an update), so it isn't really an
issue there either.
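To make the private versus symbolic distinction concrete, here is a minimal sketch of two data pointers that a typical ELF toolchain would relocate differently when the file is built as a shared object. All names are hypothetical, and the relocation kinds in the comments are what one would usually expect to see with `readelf -r`, not a guarantee for every platform.

```cpp
#include <cassert>

namespace {
// Internal linkage: no other shared object can override this, so its
// address is just "load base + constant".
const char local_name[] = "Foo";
}

// Default visibility: another DSO could interpose this symbol, so data
// references to it have to be resolved by name at load time.
int exported_counter = 0;

// Pointer to a private object: usually a base-relative relocation,
// which needs no symbol lookup.
const char* private_ref = local_name;

// Pointer to an exported symbol: usually a symbolic relocation that
// ld.so must look up in the global symbol scope.
int* symbolic_ref = &exported_counter;

// Both pointers are valid once the loader has processed the relocations.
bool refs_resolved() {
    assert(private_ref != nullptr && symbolic_ref != nullptr);
    return private_ref == local_name && symbolic_ref == &exported_counter;
}
```

Building this with something like `g++ -fPIC -shared` and inspecting the result should show the two entries side by side; the assumption under discussion is that the first kind is much cheaper per entry than the second.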

With binary compatibility we can make the class metadata pretty small.
All that really needs to be there is:

- class name
- super-class name
- access flags
- methods metadata (note: cannot avoid function pointers here unless we
  used dlsym() calls?)
- fields metadata
- possibly, a lock field for use during initialization (could avoid this
  with a hash table or something)

Everything else (i.e. the actual java.lang.Class object) can be
constructed at runtime. In this case references to classes would change
to go through something like a _Jv_GetClass call with a table of
locally-referenced classes. This wouldn't really add any overhead because
classes need to be checked for initialization in these situations
anyway, and the class pointer's existence in the table would guarantee
that it has been initialized. As we discussed a while back this would
require a read barrier on alpha etc in order to be MP-safe, however.
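The scheme above could be sketched roughly as follows. This is not libgcj code: `ClassMeta`, `RuntimeClass`, `Runtime`, `ClassRef` and `get_class` are hypothetical names, the lock field is left out, and the sketch is single-threaded, so it ignores the barrier needed for MP-safety.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <string>
#include <unordered_map>

// Hypothetical compiled metadata: only the fields listed above.
struct MethodMeta { const char* name; const char* signature; void (*code)(); };
struct FieldMeta  { const char* name; const char* signature; std::uint16_t flags; };

struct ClassMeta {
    const char* name;
    const char* super_name;      // symbolic; nullptr for the root class
    std::uint16_t access_flags;
    const MethodMeta* methods;   // function pointers still need relocations
    std::size_t method_count;
    const FieldMeta* fields;
    std::size_t field_count;
};

// Stand-in for the java.lang.Class object, constructed at runtime.
struct RuntimeClass {
    const ClassMeta* meta;
    RuntimeClass* super;
    bool initialized;
};

class Runtime {
public:
    void register_meta(const ClassMeta* m) { metas_[m->name] = m; }

    // Build (and "initialize") the runtime class on first request.
    RuntimeClass* load(const std::string& name) {
        auto found = classes_.find(name);
        if (found != classes_.end()) return found->second.get();
        auto m = metas_.find(name);
        if (m == metas_.end()) return nullptr;
        const ClassMeta* meta = m->second;
        RuntimeClass* super = meta->super_name ? load(meta->super_name) : nullptr;
        auto cls = std::make_unique<RuntimeClass>(RuntimeClass{meta, super, true});
        RuntimeClass* raw = cls.get();
        classes_.emplace(name, std::move(cls));
        ++init_count_;
        return raw;
    }

    int init_count() const { return init_count_; }

private:
    std::unordered_map<std::string, const ClassMeta*> metas_;
    std::unordered_map<std::string, std::unique_ptr<RuntimeClass>> classes_;
    int init_count_ = 0;
};

// One slot in a binary's table of locally-referenced classes.
struct ClassRef { const char* name; RuntimeClass* resolved; };

// Plays the role of the "something like _Jv_GetClass" call.
RuntimeClass* get_class(Runtime& rt, ClassRef& slot) {
    assert(slot.name != nullptr);
    if (slot.resolved == nullptr)        // slow path: first use from this binary
        slot.resolved = rt.load(slot.name);
    return slot.resolved;                // non-null slot implies initialized
}
```

After the first call for a given slot, every later reference is one load and one null test, which is the same check compiled code has to make for class initialization anyway.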

While pointer-free metadata would possibly allow us to make the metadata
completely static (not just the strings), saving some startup time, I'm
not convinced, given the fairly small size of the structure above, that
it would be worthwhile due to:

a) extra metadata size (in the binary) due to loss of merged utf8consts
b) extra code complexity in compiler and libgcj to deal with the
   metadata format not being arranged in nice simple pointers
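For comparison, a pointer-free layout might look something like the sketch below, where every reference is an offset from the start of the blob. The layout and all names (`PackedClassHeader`, `BlobBuilder`, `string_at`, and so on) are assumptions made up for illustration; the blob is built at runtime here only to keep the example self-contained, whereas a compiler would emit it as constant data.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical header: offsets instead of pointers, so no relocations.
struct PackedClassHeader {
    std::uint32_t name_off;        // offset of NUL-terminated class name
    std::uint32_t super_name_off;  // 0 means "no superclass"
    std::uint16_t access_flags;
};

// Tiny helper that lays out the header followed by its strings.
class BlobBuilder {
public:
    BlobBuilder() : bytes_(sizeof(PackedClassHeader), 0) {}

    // Append a string and return its offset from the start of the blob.
    std::uint32_t add_string(const char* s) {
        std::uint32_t off = static_cast<std::uint32_t>(bytes_.size());
        bytes_.insert(bytes_.end(), s, s + std::strlen(s) + 1);
        return off;
    }

    // Write the header into the space reserved at offset 0.
    std::vector<std::uint8_t> finish(const PackedClassHeader& h) {
        std::memcpy(bytes_.data(), &h, sizeof h);
        return bytes_;
    }

private:
    std::vector<std::uint8_t> bytes_;
};

// Readers must translate offsets back into addresses on every access.
PackedClassHeader read_header(const std::uint8_t* blob) {
    PackedClassHeader h;
    std::memcpy(&h, blob, sizeof h);
    return h;
}

const char* string_at(const std::uint8_t* blob, std::uint32_t off) {
    assert(blob != nullptr);
    return off == 0 ? nullptr : reinterpret_cast<const char*>(blob + off);
}

std::vector<std::uint8_t> make_example_blob() {
    BlobBuilder b;
    PackedClassHeader h{};
    h.name_off = b.add_string("Foo");
    h.super_name_off = b.add_string("java.lang.Object");
    h.access_flags = 0x0001;  // ACC_PUBLIC
    return b.finish(h);
}
```

Both objections show up even at this size: each blob carries its own copy of "java.lang.Object" rather than sharing one merged constant, and every consumer has to go through helpers like `string_at` instead of following a plain pointer.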
>c) GC would have a far smaller root set.
>
>This is a major contributor to collection times. The GC must scan ~6MB of
>static data per collection, almost none of which contains any heap
>pointers.
>
With this scheme, and with class fields being part of the class objects,
static data wouldn't need to be scanned at all. Class objects would be
on the heap so everything would be reachable from the stacks.
regards
Bryce.