Java: indirect virtual dispatch for binary compatibility
Bryce McKinlay
bryce@waitaki.otago.ac.nz
Mon Dec 10 22:04:00 GMT 2001
One current problem with GCJ is that it is constrained by C++ ABI for
field access and method calls. The C++ ABI does not support java's rules
for binary compatibility, so it is impossible to create a
dynamically-linked application binary with GCJ and expect it to continue
working without recompilation if any underlying classes (such as those
in libgcj itself) are modified such that the virtual methods table and
field layouts are different.
As an example, consider the following classes:
class A
{
void a() { System.out.println ("a"); }
}
public class B
{
public static void main(String[] args)
{
new A().a();
}
}
We could compile these separately but link them together into the same
executable:
$ gcj -c A.java
$ gcj -c B.java
$ gcj A.o B.o --main=B
$ ./a.out
a
Now imagine that we insert a new method, b, into class A, and recompile
A.java but not B.java. We get the wrong result:
class A
{
void b() { System.out.println ("b"); }
void a() { System.out.println ("a"); }
}
$ ./a.out
b
This is because the layout of the virtual table for A has changed, but B
was compiled with a reference to a fixed offset within A and is not
aware of the change. B will call the wrong method and crash or
misbehave. Naturally, this makes it very difficult to evolve shared
class libraries while retaining compatibility with existing compiled
applications.
A similar problem exists for fields, which are loaded using fixed
offsets from an object pointer. If the location of these fields changes,
any code referencing them will break.
The Java language specification defines a set of "binary compatibility
rules" which determine what changes to classes should and should not
cause incompatibility between their binary representations:
http://java.sun.com/docs/books/jls/second_edition/html/binaryComp.doc.html
Previously, GCJ has simply ignored these rules, but it is desirable, and
not entirely unreasonable, to implement them.
So...
This patch introduces an "--indirect-dispatch" option which does two things:
1. defers vtable layout until runtime
2. changes the way virtual and interface calls are compiled so that
vtable offsets are resolved symbolically at runtime
It does this by introducing "offset tables" which are filled out by the
runtime during class preparation/linking to contain the actual vtable
offsets for any virtual methods called by code in the compilation unit.
Rather than compiling calls to reference a fixed offset within the
vtable, they reference a fixed offset within the offsets table, and load
from that location the offset within the vtable to find the code pointer
for the called method.
Basically, "call obj->vtable[X]" becomes "call obj->vtable[otable[Y]]".
Where X and Y are compile-time constants. To give some idea of the extra
overhead involved, the Java code "System.out.println("Hello")", which
without indirect dispatch compiles on x86 with -O2 to:
movl _ZN4java4lang6System3outE, %eax
movl (%eax), %edx
movl %eax, (%esp)
movl _CD_Hello+4, %eax
movl %eax, 4(%esp)
call *116(%edx)
With --indirect-dispatch, it becomes:
movl _ZN4java4lang6System3outE, %eax
movl (%eax), %edx
movl %eax, (%esp)
movl _CD_Hello+4, %eax
movl %eax, 4(%esp)
movl otable+4, %eax
call *(%eax,%edx)
When the same call occurs in succession or in a loop, the compiler moves
the otable load out of the loop, so the overhead is reduced.
What does this cost? Running the attached "factorial" benchmark, which
calls an inner recursive factorial function either defined as a static,
virtual, or interface call, gives these results on a 650Mhz mobile P3:
with -O2:
static factorial: 10.933s
virtual factorial: 6.565s
interface factorial: 13.774s
with -O2 --indirect-dispatch:
static factorial: 10.928s
virtual factorial: 6.368s
interface factorial: 13.023s
So, strangely, this benchmark actually speeds up on x86 using indirect
dispatch! On PowerPC (7410), there was a small slowdown for the virtual
and interface calls, but in both cases they still ran substantially
faster than non-virtuals, even without -fPIC. (Q: Why are non-virtual
calls so slow?)
For real applications (ie not just micro-benchmarks), my suspicion is
that there would be some slowdown, but the difference is certainly not
large.
As for space, binarys typically turn out to be *smaller* with indirect
dispatch than without it since getting rid of the vtables saves more
space than is added by the symbols for the offset table. Two new pointer
fields were added to java.lang.Class (otable and otable_syms), and 3
utf8 constants are emitted for each of the methods in the offset table.
A new "index" field was added to the _Jv_Method structure, however its
size was not increased (apart from extra debugging symbols) since it
just makes use of space that was previously lost to alignment. This new
field will also be useful in simplifying/speeding up some existing
interpreter and reflection code.
This technique retains complete compatibility with the existing ABI.
That is, a single libgcj.so compiled without indirect dispatch can
support binaries compiled either with or without indirect dispatch, and
those compiled with it get all the binary compatibility benefits. And an
object compiled without indirect dispatch can still call into
indirect-dispatch compiled object without problems. I would like to make
binary compatibility the default behaviour eventually, but
--no-indirect-dispatch etc can be used for maximum performance when
compatibility is not an issue.
The patch currently only implements indirect virtual and interface
dispatch. Things that still need to be worked on include:
1. Fields. Should be able to be implemented using a similar offset table
technique, with the offset being the offset from object pointer rather
than vtable offsets. For field access within classes in the compilation
unit, we can make a potential optimization: Since we know that field
order cannot change without recompiling the unit, we just need to know
the position within the object that the fields of the class in question
begin at, which can be stored in the metaclass object or some other
convenient location. Then, we add it to the statically-known offset from
that position to access a field. This has the advantage of being able to
re-use the same value for multiple field accesses, and not having to
load values from a potentially large table, however it does require
three adds (this + class->offset + field_offset) rather than two (this +
otable[X]), so some testing would be required to determine if it is
definatly a win.
2. Interface calls. The patch already implements a partial solution for
interface calls, however it is not completely compliant with the binary
compatibility rules. While methods can be added/deleted/reordered within
an interface without problems, a method moved to a superinterface will
not be found by the dispatch mechanism. This is because an interface
call in GCJ requires the object reference, method offset, and a
reference to the target interface to complete. A layout like:
interface A { void a(); }
interface B extends A {void b(); }
generates an itable like {A.class, *a, B.class, *b}. If b() were moved
into A, the itable would now look like {A.class, *a, *b, B.class}. The
call site has B.class statically compiled into it, and there is no fixed
offset from B.class to get to b() due to multiple inheritance, so the
call cannot be completed. One solution would be to generate itables
which contain all inherited methods for each interface, ie {A.class, *a,
B.class, *a, *b}. This would add no additional overhead, but I can
imagine the itables growing unreasonably large with some inheritance
heirarchies, and this is probibly not what JITs etc are expecting. The
second solution is to have the otable, for interface calls, contain an
offset and target interface pair so that both of these are set
dynamically. This should be a net size win over the first option, and,
hey, people expect interface calls to be slower anyway ;-)
3. Throwing exceptions at the right time. In order to support
NoSuchMethodError as per the spec, we can add a special entry in each
vtable pointing to a throw function. When a method required for an
otable is found not to exist, the offset to this entry is put in the
table. The patch doesn't do this yet, a call to a missing method has
undefined behaviour. For fields, luckily we only have to throw an error
at link-time so there is no issue there.
4. CNI. The C++ compiler will need to understand the new mechanisms when
dealing with extern "Java" classes (in order to get the benefits of
binary compatibility). One issue is how the C++ compiler will register
any offset tables it needs to be filled out with the runtime, since it
can't use fields in class objects as the Java compiler does. One
solution would be to use global constructors much like the class
registration mechanism already used by java, though we may lose the
"lazy linking" benefit that Java gets, since otables only need to be
resolved when a class using them is initialized.
5. Static methods and fields. These are already resolved symbolically,
but by the dynamic linker not the gcj runtime. Thus they are not really
binary compatible per the spec since link time errors will occur if they
are missing rather than Java exceptions. So do we want to implement our
own static method tables etc? Again, C++ will have to play too.
My thanks to Dachuan Yu who provided much of the initial inspiration for
this patch as well as an early test implementation. I know that Dachuan
was looking at running the SpecJVM benchmarks against GCJ, so perhaps he
could provide some more comprehensive performance figures.
regards
Bryce.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: appeal-bench.tar.gz
Type: application/x-gzip
Size: 3122 bytes
Desc: not available
URL: <http://gcc.gnu.org/pipermail/java/attachments/20011210/80586613/attachment.bin>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: gcj-indirect.patch
URL: <http://gcc.gnu.org/pipermail/java/attachments/20011210/80586613/attachment.ksh>
More information about the Java
mailing list