URLClassloader and native objects

Sal gcj@svf.dreamhost.com
Mon Jun 14 15:52:00 GMT 2004


Andrew Haley wrote:
> Sal writes:
> > Andrew Haley wrote:
> > 
> > > There's no code to allow a jarfile to be separately compiled to a .so
> > > and loaded automagically, though. It's quite tricky to figure out how
> > > to do it.
> > 
> > This is one of the things I would like to get to work, once an elegant 
> > solution is found. In a way it's a simpler issue in that the code is
> > already compiled, we just need a way to tell gcj to grab a hold of it 
> > from within a custom classloader.
>
> Tom Tromey suggested that we should create a database of mappings from
> checksum->shared object. So, ClassLoader.defineClass() generates a
> checksum and then finds the appropriate shared library and loads it.
> Most times, the shared library will already be loaded, so it's only
> necessary to return a pointer to the class. The other tool we need is
> one that compiles the jar file and creates the database of mappings.
>
> Another approach that bypasses the checksumming is to attach an
> attribute to a jarfile that points to a shared object file that is the
> compiled jarfile. I'm not quite sure what form that attribute would
> take, but JFFS2 has xattr().
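
A minimal sketch of the checksum database idea described above, assuming an in-memory map and MD5 as the checksum (both are assumptions, as is the hypothetical class name NativeLibraryIndex; nothing here is existing GCJ code). A compile-time tool would call register() for each class in the jar file, and defineClass() would call lookup() with the bytes it was handed:

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: maps a checksum of class-file bytes to a shared object path.
class NativeLibraryIndex {
    private final Map<String, String> checksumToLibrary = new HashMap<>();

    // Hex-encoded MD5 of the class bytes. The choice of MD5 is an assumption.
    static String checksum(byte[] classBytes) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest(classBytes)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException(e);
        }
    }

    // Would be called by the tool that compiles a jar file to a shared object.
    void register(byte[] classBytes, String libraryPath) {
        checksumToLibrary.put(checksum(classBytes), libraryPath);
    }

    // Would be called from defineClass(); returns null if no native version is known.
    String lookup(byte[] classBytes) {
        return checksumToLibrary.get(checksum(classBytes));
    }
}
```

A null result from lookup() would mean falling back to ordinary bytecode handling.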

This sounds agreeable. The only caveat I can think of is the
situation where the .class file, or the .jar containing the .class
files, is not available... let's say you want to use GCJ to build the
entire application natively. Then there isn't a bytecoded .class file
to load and compute the checksum from.

In these cases we'll need to find a way to have a custom class loader
still reference the native objects. We could 'force' users to package
JVM bytecode .class files along with their executable so that the
checksumming will work, but I think there may be a more elegant solution.

For the majority of cases, where we are trying to get existing Java apps
to run in a GCJ environment, the checksum would work great... as you
have the .class / .jar files on hand already. Just drop GCJ in place of
the Sun JVM and the app would run using all native objects. But in the
situation where the user wants to use GCJ to build a self-contained
executable, there are these unique problems.

I think one solution may be to modify the GCJ bytecode verification
system to accept native code, or references to statically linked code. It
may be a radical idea since most typical Java platforms to date will
*only* accept Java bytecode (from defineClass). But I think if we allow
this, it opens the door to a native platform without breaking any
compatibility with typical Java apps. With a system such as this we
could gain access to native objects without having code duplicated as
JVM bytecode.

Basically the end result would be that you could (theoretically
speaking; in reality you would probably use a different naming scheme)
rename a .so to .class and run the application with GCJ... and the
bytecode verifier would identify the native code and allow it. Or, if
the .so wasn't present (the object is statically linked), the .class
file could contain a symbol that references a statically linked object.
On the other hand, if you ran the app under Sun's VM with 'real' .class
files it would still work, even in the case of custom classloaders.

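As a rough illustration of how a verifier could "identify the native code", here is a sketch that tells the two kinds of data apart by their leading magic bytes. It relies on two facts: Java class files begin with 0xCAFEBABE, and ELF objects begin with 0x7F 'E' 'L' 'F'. The class name ImageKind and the returned strings are hypothetical, and real verifier integration would need far more than this check:

```java
// Hypothetical sketch: tell Java bytecode from an ELF shared object by magic number.
class ImageKind {
    static final String BYTECODE = "bytecode";
    static final String NATIVE_ELF = "native-elf";
    static final String UNKNOWN = "unknown";

    // Inspects only the first four bytes of the data passed to defineClass().
    static String classify(byte[] data) {
        if (data == null || data.length < 4) {
            return UNKNOWN;
        }
        // Java class files start with 0xCAFEBABE.
        if ((data[0] & 0xFF) == 0xCA && (data[1] & 0xFF) == 0xFE
                && (data[2] & 0xFF) == 0xBA && (data[3] & 0xFF) == 0xBE) {
            return BYTECODE;
        }
        // ELF files start with 0x7F followed by the ASCII letters "ELF".
        if (data[0] == 0x7F && data[1] == 'E' && data[2] == 'L' && data[3] == 'F') {
            return NATIVE_ELF;
        }
        return UNKNOWN;
    }
}
```
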
Would a modification like this to GCJ be acceptable? Or maybe as a
configurable option?

>
> > A variation of the issue I'm having is when the compiled code already
> > exists but is statically linked into my executable. Everything works
> > great, and I can even use Class.forName() to grab the object. But my
> > CCLs are unable to pull this object out, nor load it from the disk
> > because it is combined into the executable. Do you have any insight on
> > how I may be able to get around this?
>
> I don't really know where the problem lies. If your custom class
> loader inherits from VMClassLoader, when it calls loadClass it will do
> the right thing.

Your previous suggestion would address the problem... but let me explain
my issue just in case there is a way around it that I missed.

The class will get loaded, but the classloader instance that loads an
object gets associated with that object. Whenever the object itself
instantiates another object, the request goes to the classloader
associated with it; for an object loaded by delegation, that is the
system classloader instead of your custom classloader.

For example, given a custom classloader 'MyLoader' and classes A, X (in
a .class file), Y (in a .class file) and Z (a statically linked shared
object), this is the situation when trying to load each via a custom
classloader:

  MyLoader.loadClass("X");
    // The custom class loader handles the request for X.

  class X { void someMtd() { ... new Y(); } }
    // 'MyLoader' gets the request for Y as well.

  class X { ... void someMtd2() { ... new Z(); } }
    // 'MyLoader' gets the request for Z, but delegates it to the
    // system classloader because it's a native object.

  class Z { void someMtd() { ... new A(); } }
    // The request for 'A' bypasses the custom classloader completely,
    // because the calling class Z is 'tagged' as having been loaded by
    // the system. The Java security model says that future requests
    // from Z go to Z's classloader.

The result is that A and Z both run from the parent/system classloader,
and any requests made by them (via 'new') are handled by the system
classloader. The problem here is that any security
checking/validation/filtering done by the CCL is bypassed.

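The delegation effect can be observed on any Java VM with a small sketch like the following. RecordingLoader and DelegationDemo are hypothetical names used only for illustration. The loader sees the request, but because it hands the work to its parent, the resulting class reports a different loader from getClassLoader(), and that loader is the one consulted for the class's own later references:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical loader that records every request it sees, then delegates to its parent.
class RecordingLoader extends ClassLoader {
    final List<String> requests = new ArrayList<>();

    RecordingLoader(ClassLoader parent) {
        super(parent);
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        requests.add(name);
        // Standard parent-first delegation: the parent ends up defining the class.
        return super.loadClass(name, resolve);
    }
}

class DelegationDemo {
    // Returns true if a class requested through the custom loader is
    // nevertheless owned by some other loader (the parent or the bootstrap loader).
    static boolean ownedByOtherLoader(String className) {
        RecordingLoader loader = new RecordingLoader(DelegationDemo.class.getClassLoader());
        try {
            Class<?> c = loader.loadClass(className);
            return loader.requests.contains(className) && c.getClassLoader() != loader;
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

For a class that only the parent can supply, ownedByOtherLoader() returns true: the custom loader was asked, but it does not own the class.
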
This is proper behavior because you are delegating to the system
classloader... the issue is that delegating to the system classloader is
the only way to access those objects from within your CCL. So assuming
that objects A and Z were native objects, and you had no .class file to
load them from, your CCL becomes inoperative. If the entire application
depends on the functionality of the CCL, then you cannot use GCJ to
compile the objects natively; you'll have to run the app in
non-native/JIT mode (I can't think of the proper term).

I hope it makes sense; I know it's all a bit messy.

Hence the need to be able to loadClass("X") and, from within a CCL,
return the native equivalent without delegating the request. The
checksum matching/attribute bypass solutions you suggest should make
this possible in situations where the .class/.jar is available, and
where it isn't, the other solution I proposed could work.

> > An ugly hack I can think of would be to have defineClass return some 
> > custom data, such that it doesn't define a class with VM bytecode, but 
> > returns some string that references a native object 
> > (ugly_native_hack://foo.bar.classname). We'd have to bypass any
> > standard bytecode verification in these special cases and pull that 
> > statically linked image out of storage instead. (Of course this is just 
> > speculation, I don't know enough about GCJ architecture to say it would 
> > be possible.) Has this already been accounted for in some design work 
> > previously?
>
> This doesn't sound like the right thing to do.

If you could consider some of the previous points... while I agree
embedding a text URL like so is probably a bad idea, being able to
defineClass using some form of native data seems like the only way to
work around some issues. I am open to any ideas for alternative
solutions, of course.

> > A third solution might be to have some external override. Maybe a
> > directory with SOs or a configuration file that will list objects. Any
> > objects in this list, when referenced via defineClass, will 'thunk' down
> > to use a statically compiled version, or a native .so, regardless of what
> > data the application is trying to define the class with.
>
> Yeah, that's more or less Tromey's idea. That's what we'll go with, I
> expect.
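
A minimal sketch of the configuration-file variant, assuming a plain "ClassName=library path" format read with java.util.Properties. The format, the class name NativeOverrides, and the method libraryFor() are all assumptions for illustration, not an existing GCJ interface:

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical override list: class name -> shared object to use instead of bytecode.
class NativeOverrides {
    private final Properties table = new Properties();

    // Parses lines of the form "fully.qualified.ClassName=path/to/library.so".
    NativeOverrides(String config) {
        try {
            table.load(new StringReader(config));
        } catch (IOException e) {
            // Reading from an in-memory string should not fail.
            throw new IllegalStateException(e);
        }
    }

    // Returns the library listed for this class, or null if the class is
    // not listed and should be defined from the supplied data as usual.
    String libraryFor(String className) {
        return table.getProperty(className);
    }
}
```

A defineClass() implementation could consult libraryFor() first and only fall back to the bytes it was given when the result is null.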

Sounds good.

> > I'll start digging through code. GCC is still a bit daunting as I'm new 
> > to it, so it may take a little while before I'll be able to 
> > contribute... if you know of some reading offhand (online or off) to 
> > bring me up to speed that would be great.
>
> Don't worry about the compiler. The library (gcc/libjava) is mostly
> Java code.
>
> BTW, there are some legal niceties that we'll need to talk about
> before you contribute anything.

Basically, the FSF owns everything, right? :) If so, this isn't an issue.
If you need to talk specifics, feel free to drop me a line (gcj at svf
dot dreamhost dot com).

- Sal

