Java: indirect virtual dispatch for binary compatibility

Bryce McKinlay bryce@waitaki.otago.ac.nz
Mon Dec 10 22:04:00 GMT 2001


One current problem with GCJ is that it is constrained by C++ ABI for 
field access and method calls. The C++ ABI does not support java's rules 
for binary compatibility, so it is impossible to create a 
dynamically-linked application binary with GCJ and expect it to continue 
working without recompilation if any underlying classes (such as those 
in libgcj itself) are modified such that the virtual methods table and 
field layouts are different.
As an example, consider the following classes:
class A
{
 void a() { System.out.println ("a"); }
}
public class B
{
 public static void main(String[] args)
 {
 new A().a();
 }
}
We could compile these separately but link them together into the same 
executable:
$ gcj -c A.java
$ gcj -c B.java
$ gcj A.o B.o --main=B
$ ./a.out
a
Now imagine that we insert a new method, b, into class A, and recompile 
A.java but not B.java. We get the wrong result:
class A
{
 void b() { System.out.println ("b"); }
 void a() { System.out.println ("a"); }
}
$ ./a.out
b
This is because the layout of the virtual table for A has changed, but B 
was compiled with a reference to a fixed offset within A and is not 
aware of the change. B will call the wrong method and crash or 
misbehave. Naturally, this makes it very difficult to evolve shared 
class libraries while retaining compatibility with existing compiled 
applications.
A similar problem exists for fields, which are loaded using fixed 
offsets from an object pointer. If the location of these fields changes, 
any code referencing them will break.
The Java language specification defines a set of "binary compatibility 
rules" which determine what changes to classes should and should not 
cause incompatibility between their binary representations: 
http://java.sun.com/docs/books/jls/second_edition/html/binaryComp.doc.html
Previously, GCJ has simply ignored these rules, but it is desirable, and 
not entirely unreasonable, to implement them.
So...
This patch introduces an "--indirect-dispatch" option which does two things:
1. defers vtable layout until runtime
2. changes the way virtual and interface calls are compiled so that 
vtable offsets are resolved symbolically at runtime
It does this by introducing "offset tables" which are filled out by the 
runtime during class preparation/linking to contain the actual vtable 
offsets for any virtual methods called by code in the compilation unit. 
Rather than compiling calls to reference a fixed offset within the 
vtable, they reference a fixed offset within the offsets table, and load 
from that location the offset within the vtable to find the code pointer 
for the called method.
Basically, "call obj->vtable[X]" becomes "call obj->vtable[otable[Y]]".
Where X and Y are compile-time constants. To give some idea of the extra 
overhead involved, the Java code "System.out.println("Hello")", which 
without indirect dispatch compiles on x86 with -O2 to:
 movl _ZN4java4lang6System3outE, %eax
 movl (%eax), %edx
 movl %eax, (%esp)
 movl _CD_Hello+4, %eax
 movl %eax, 4(%esp)
 call *116(%edx)
With --indirect-dispatch, it becomes:
 movl _ZN4java4lang6System3outE, %eax
 movl (%eax), %edx
 movl %eax, (%esp)
 movl _CD_Hello+4, %eax
 movl %eax, 4(%esp)
 movl otable+4, %eax
 call *(%eax,%edx)
When the same call occurs in succession or in a loop, the compiler moves 
the otable load out of the loop, so the overhead is reduced.
What does this cost? Running the attached "factorial" benchmark, which 
calls an inner recursive factorial function either defined as a static, 
virtual, or interface call, gives these results on a 650Mhz mobile P3:
with -O2:
static factorial: 10.933s
virtual factorial: 6.565s
interface factorial: 13.774s
with -O2 --indirect-dispatch:
static factorial: 10.928s
virtual factorial: 6.368s
interface factorial: 13.023s
So, strangely, this benchmark actually speeds up on x86 using indirect 
dispatch! On PowerPC (7410), there was a small slowdown for the virtual 
and interface calls, but in both cases they still ran substantially 
faster than non-virtuals, even without -fPIC. (Q: Why are non-virtual 
calls so slow?)
For real applications (ie not just micro-benchmarks), my suspicion is 
that there would be some slowdown, but the difference is certainly not 
large.
As for space, binarys typically turn out to be *smaller* with indirect 
dispatch than without it since getting rid of the vtables saves more 
space than is added by the symbols for the offset table. Two new pointer 
fields were added to java.lang.Class (otable and otable_syms), and 3 
utf8 constants are emitted for each of the methods in the offset table. 
A new "index" field was added to the _Jv_Method structure, however its 
size was not increased (apart from extra debugging symbols) since it 
just makes use of space that was previously lost to alignment. This new 
field will also be useful in simplifying/speeding up some existing 
interpreter and reflection code.
This technique retains complete compatibility with the existing ABI. 
That is, a single libgcj.so compiled without indirect dispatch can 
support binaries compiled either with or without indirect dispatch, and 
those compiled with it get all the binary compatibility benefits. And an 
object compiled without indirect dispatch can still call into 
indirect-dispatch compiled object without problems. I would like to make 
binary compatibility the default behaviour eventually, but 
--no-indirect-dispatch etc can be used for maximum performance when 
compatibility is not an issue.
The patch currently only implements indirect virtual and interface 
dispatch. Things that still need to be worked on include:
1. Fields. Should be able to be implemented using a similar offset table 
technique, with the offset being the offset from object pointer rather 
than vtable offsets. For field access within classes in the compilation 
unit, we can make a potential optimization: Since we know that field 
order cannot change without recompiling the unit, we just need to know 
the position within the object that the fields of the class in question 
begin at, which can be stored in the metaclass object or some other 
convenient location. Then, we add it to the statically-known offset from 
that position to access a field. This has the advantage of being able to 
re-use the same value for multiple field accesses, and not having to 
load values from a potentially large table, however it does require 
three adds (this + class->offset + field_offset) rather than two (this + 
otable[X]), so some testing would be required to determine if it is 
definatly a win.
2. Interface calls. The patch already implements a partial solution for 
interface calls, however it is not completely compliant with the binary 
compatibility rules. While methods can be added/deleted/reordered within 
an interface without problems, a method moved to a superinterface will 
not be found by the dispatch mechanism. This is because an interface 
call in GCJ requires the object reference, method offset, and a 
reference to the target interface to complete. A layout like:
interface A { void a(); }
interface B extends A {void b(); }
generates an itable like {A.class, *a, B.class, *b}. If b() were moved 
into A, the itable would now look like {A.class, *a, *b, B.class}. The 
call site has B.class statically compiled into it, and there is no fixed 
offset from B.class to get to b() due to multiple inheritance, so the 
call cannot be completed. One solution would be to generate itables 
which contain all inherited methods for each interface, ie {A.class, *a, 
B.class, *a, *b}. This would add no additional overhead, but I can 
imagine the itables growing unreasonably large with some inheritance 
heirarchies, and this is probibly not what JITs etc are expecting. The 
second solution is to have the otable, for interface calls, contain an 
offset and target interface pair so that both of these are set 
dynamically. This should be a net size win over the first option, and, 
hey, people expect interface calls to be slower anyway ;-)
3. Throwing exceptions at the right time. In order to support 
NoSuchMethodError as per the spec, we can add a special entry in each 
vtable pointing to a throw function. When a method required for an 
otable is found not to exist, the offset to this entry is put in the 
table. The patch doesn't do this yet, a call to a missing method has 
undefined behaviour. For fields, luckily we only have to throw an error 
at link-time so there is no issue there.
4. CNI. The C++ compiler will need to understand the new mechanisms when 
dealing with extern "Java" classes (in order to get the benefits of 
binary compatibility). One issue is how the C++ compiler will register 
any offset tables it needs to be filled out with the runtime, since it 
can't use fields in class objects as the Java compiler does. One 
solution would be to use global constructors much like the class 
registration mechanism already used by java, though we may lose the 
"lazy linking" benefit that Java gets, since otables only need to be 
resolved when a class using them is initialized.
5. Static methods and fields. These are already resolved symbolically, 
but by the dynamic linker not the gcj runtime. Thus they are not really 
binary compatible per the spec since link time errors will occur if they 
are missing rather than Java exceptions. So do we want to implement our 
own static method tables etc? Again, C++ will have to play too.
My thanks to Dachuan Yu who provided much of the initial inspiration for 
this patch as well as an early test implementation. I know that Dachuan 
was looking at running the SpecJVM benchmarks against GCJ, so perhaps he 
could provide some more comprehensive performance figures.
regards
Bryce.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: appeal-bench.tar.gz
Type: application/x-gzip
Size: 3122 bytes
Desc: not available
URL: <http://gcc.gnu.org/pipermail/java/attachments/20011210/80586613/attachment.bin>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: gcj-indirect.patch
URL: <http://gcc.gnu.org/pipermail/java/attachments/20011210/80586613/attachment.ksh>


More information about the Java mailing list

AltStyle によって変換されたページ (->オリジナル) /