VolanoMark findings

Boehm, Hans hans_boehm@hp.com
Wed Jan 29 17:09:00 GMT 2003


That's very roughly consistent with my (slightly dated) SPECjbb experiments. An Itanium profile suggested the top culprits there were:
1. _Jv_MonitorEnter (about 7%)
2. GC_mark_from (probably unavoidable and no worse than on other JVMs with the same heap size)
3. The B-tree access routine in the benchmark
4. _Jv_MonitorExit (about 5%)
5. _Jv_CheckCast
out of line division (> 5% total, in various routines)
memory allocation (not GC)
a few poorly tuned java.lang.String routines
__gettimeofday (about 2% !?)
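To make the out-of-line division entry concrete, here is a minimal Java sketch of the semantics involved. It is not gcj's actual runtime code; the class and method names (`DivisionSketch`, `divChecked`, `divPow2`) are hypothetical, and the power-of-two shortcut is only an assumed example of a case a compiler could expand inline. `divChecked` spells out what any general helper has to honor (the zero check and the `Integer.MIN_VALUE / -1` case), while `divPow2` shows a divisor class that needs no call at all.

```java
/** Illustrative only: names and structure are assumptions, not libgcj code. */
final class DivisionSketch {
    /** General path: the checks a helper must make to match Java's int division. */
    static int divChecked(int x, int y) {
        if (y == 0) {
            throw new ArithmeticException("/ by zero");
        }
        if (x == Integer.MIN_VALUE && y == -1) {
            return Integer.MIN_VALUE; // overflow wraps rather than trapping
        }
        return x / y;
    }

    /** Inline-friendly path: x / 2^k, rounding toward zero, for 1 <= k <= 30. */
    static int divPow2(int x, int k) {
        // bias is 2^k - 1 when x is negative and 0 otherwise, so the
        // arithmetic shift rounds toward zero instead of toward -infinity.
        int bias = (x >> 31) >>> (32 - k);
        return (x + bias) >> k;
    }
}
```

Whether a fast path like this pays off depends on how often the divisor is a compile-time constant, which the profile alone does not show.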
The most likely sources of possible generally applicable improvement seemed to be:
1. (Selective or partial?) inlining of MonitorEnter/Exit. Being able to remember some state between the two would help appreciably. (But there's a limit, since the compare-and-swap instructions themselves use up a significant fraction of the time on most platforms. It would be great to avoid that in the MonitorExit case, but I think it's quite hard, given our other constraints.)
2. (Selective?) inlining of division.
3. Improvements to some of the String routines. (I may still have a partial patch or two hanging around, though it may be obsolete. I got sidetracked ...)
4. Further shortening of the allocation path.
5. (Almost certainly the most important, though the hardest) GCC optimizer improvements.
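For item 1, a rough sketch of a thin-lock fast path may help show where the compare-and-swap cost comes from. This is written in Java for illustration and is not how `_Jv_MonitorEnter`/`_Jv_MonitorExit` are actually implemented; the class `ThinLockSketch` and its fields are hypothetical, and it assumes no waiter queue or lock inflation at all.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Illustrative thin lock; an assumption-laden sketch, not libgcj's monitor code. */
final class ThinLockSketch {
    // 0 means unlocked; otherwise the id of the owning thread (ids are positive).
    private final AtomicLong owner = new AtomicLong(0);
    // Nesting depth beyond the first acquisition; touched only by the owner.
    private int recursion = 0;

    /** Fast path of enter: a single compare-and-swap when the lock is free. */
    boolean tryEnter() {
        long me = Thread.currentThread().getId();
        if (owner.get() == me) {   // re-entry by the owner needs no CAS
            recursion++;
            return true;
        }
        return owner.compareAndSet(0, me);
    }

    /** Exit: looks up the current thread again, then releases with a second CAS. */
    void exit() {
        long me = Thread.currentThread().getId();
        if (owner.get() != me) {
            throw new IllegalMonitorStateException();
        }
        if (recursion > 0) {
            recursion--;
            return;
        }
        // In this sketch a plain volatile store would also work. A lock word
        // that other threads may update concurrently (for example to record
        // contention) is the usual reason the release is done atomically.
        owner.compareAndSet(me, 0);
    }
}
```

Note that `exit` recomputes the thread id and re-reads the lock word even though `tryEnter` just had both in hand; that is the kind of state an inlined enter/exit pair could carry across the critical section.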
In general, I think these benchmark results are too pessimistic, since other JVMs tend to be tuned for them to a much larger extent than gcj. But they still provide useful information.
Hans
> -----Original Message-----
> From: Andrew Haley [mailto:aph@redhat.com]
> Sent: Wednesday, January 29, 2003 7:34 AM
> To: Anthony Green
> Cc: java@gcc.gnu.org
> Subject: VolanoMark findings
>
> Anthony Green writes:
> > I recently retried building the VolanoMark benchmark found here:
> > http://www.volano.com/brenchmarks.html .
> > 
> > The good news is that it finally builds, and I closed the case against
> > this problem. I have no idea what the magic fix was. IIRC the compiler
> > couldn't handle the exception regions in the obfuscated class files.
> > 
> > The bad news is that IBM's JDK is twice as fast on this benchmark as
> > an optimized gcj build.
>
> That's the same as I measured with Embedded CaffeineMark.
>
> > My 2.3 GHz P4 gives IBM's 1.4 JDK a score of 12058, while we come
> > in at half that: 6040.
> > 
> > I'm hoping that this may be mostly accounted for by bugs. Unfortunately,
> > the VolanoMark is only distributed in .class form, so figuring this out
> > may take some doing.
>
> We already know what IBM do to get this performance:
>
> http://www.research.ibm.com/journal/sj/391/suganuma.html
>
> * Method inlining. We do that, but only in special cases.
> * Exception check elimination. We don't do that.
> * Common subexpression elimination. We do that.
> * Removal of initialization checks.
> * Removal of synchronization.
>
> Andrew.
>

