Thread local allocation
Boehm, Hans
hans_boehm@hp.com
Wed Feb 13 09:28:00 GMT 2002
One problem is that GC_memory_write_barrier() is currently defined only for
X86 and IA64, and it doesn't need to do much on either of those. It would
need to be defined more generally. Unfortunately, I'm not sure what the
right code is for Alpha and PowerPC. There should be code for it in
linuxthreads, though.
There probably also should be a feature test macro along the lines of
VOLATILE_IS_BARRIER. On platforms like IA64, a volatile store acts as a
release barrier, forcing preceding memory operations to complete before the
store, and removing the need for explicit barriers in most cases. (I think
that's also effectively true on X86, though on IA64 there's both a hardware
and compiler barrier involved. This behavior is consistent with the
proposed Java semantics for volatile. See Bill Pugh's Java memory model web
site for more discussion on a lot of this.)
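
The feature test idea above can be sketched roughly as follows. This is only an illustration, not the collector's code: SKETCH_WRITE_BARRIER, sketch_entry, sketch_slot and sketch_publish are hypothetical names, and a C11 release fence stands in for GC_memory_write_barrier(). Strictly speaking, C11 only specifies fences with respect to atomic objects, so a real port would either make the slot _Atomic or rely on the per-architecture barrier.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical feature test macro named in the message. A port would define
   it only where a volatile store is known to act as a release barrier. */
/* #define VOLATILE_IS_BARRIER */

#ifdef VOLATILE_IS_BARRIER
   /* The volatile store below already orders the preceding writes. */
#  define SKETCH_WRITE_BARRIER() ((void)0)
#else
   /* Assumption: a C11 release fence stands in for
      GC_memory_write_barrier() on other targets. */
#  define SKETCH_WRITE_BARRIER() atomic_thread_fence(memory_order_release)
#endif

/* Hypothetical stand-in for a thread-specific entry. */
typedef struct sketch_entry {
    int key;
    void *value;
} sketch_entry;

/* The pointer itself is volatile, as in the store done by setspecific. */
static sketch_entry *volatile sketch_slot = NULL;

/* Fill in the entry first, then make it visible to other threads. */
static void sketch_publish(sketch_entry *e, int key, void *value)
{
    e->key = key;
    e->value = value;
    SKETCH_WRITE_BARRIER();   /* fields must be written before the pointer */
    sketch_slot = e;          /* volatile store publishes the entry */
}
```

With VOLATILE_IS_BARRIER defined, the explicit barrier compiles away and only the volatile store remains, which is the saving the macro is meant to capture.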
On Alpha, you also need a read barrier between the load of entry and the
access to its fields in getspecific. Just because the second load is data
dependent on the first doesn't mean they're ordered (unlike on IA64, where
it does).
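
A minimal sketch of the reader side, again with hypothetical names (sketch_entry, sketch_cache, sketch_set, sketch_get) rather than the actual code in specific.h. It uses a C11 acquire load in place of an explicit read barrier. That is stronger than what the Alpha strictly needs, but it does keep the field accesses ordered after the load of the entry pointer.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a thread-specific entry. */
typedef struct sketch_entry {
    int key;
    void *value;
} sketch_entry;

/* Static storage, so the pointer starts out as NULL. */
static _Atomic(sketch_entry *) sketch_cache;

/* Writer: the release store keeps the entry's fields ahead of the pointer. */
static void sketch_set(sketch_entry *e)
{
    atomic_store_explicit(&sketch_cache, e, memory_order_release);
}

/* Reader: the acquire load plays the role of the read barrier between the
   load of the entry and the accesses to its fields. */
static void *sketch_get(int key)
{
    sketch_entry *e =
        atomic_load_explicit(&sketch_cache, memory_order_acquire);
    if (e == NULL || e->key != key)
        return NULL;
    return e->value;
}
```

On machines that already order dependent loads, the acquire load should cost little or nothing extra. On Alpha the compiler has to emit a real barrier after the pointer load.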
Hans
> -----Original Message-----
> From: Bryce McKinlay [mailto:bryce@waitaki.otago.ac.nz]
> Sent: Tuesday, February 12, 2002 6:52 PM
> To: Boehm, Hans
> Cc: java@gcc.gnu.org
> Subject: Re: Thread local allocation
>
> Boehm, Hans wrote:
>
> >There are two reasons it isn't enabled for *-*-linux*:
> >
> >1) I can only test X86 and Itanium.
> >
> >2) Some of the code, particularly in specific.c, assumes an Itanium-like or
> >stronger memory model. In particular, if you look at the code for
> >setspecific(), it would need a memory barrier on Alpha, as would the
> >corresponding inlined getspecific routine.
> >
>
> Right. It seems to me that the inlined getspecific (in specific.h) would
> not need any barrier because it does not do any writes? So how about a
> GC_memory_write_barrier() before:
>
> *(volatile tse **)(key -> hash + hash_val) = entry;
>
> in setspecific, and:
>
> *cache_ptr = entry;
>
> in slow_getspecific? I'm not sure about remove_specific.
>
> >I would vote for turning this on on a case-by-case basis, after some
> >(multiprocessor!) testing. I would guess that for a SPARC running in TSO
> >mode, it's safe to turn it on as it is. The same is probably true for MIPS.
> >For Alpha, it may be safe to turn on if you define USE_PTHREAD_SPECIFIC, so
> >that it uses the official pthread-defined thread local storage primitives
> >instead of my (faster on X86) hack. Or you would have to add the memory
> >barriers. I don't claim to know anything about the PowerPC memory model,
> >but I suspect that if it works for Alpha, it probably works on PowerPC, too.
> >
>
> Hmm. I can't easily test MP PowerPC or Alpha, but I'd like to turn it on
> anyway in order to get the performance increase. Perhaps we could leave
> it disabled (USE_PTHREAD_SPECIFIC) for alpha since it seems like that
> would be the most problematic one.
>
> I might do a test with and without USE_PTHREAD_SPECIFIC to make sure
> that all this is worthwhile, but I suspect it will be on PowerPC. The
> linuxthreads implementation of pthread_self() on PPC looks far far worse
> than x86.
>
> regards
>
> Bryce.
>