Unicode mangling (was Re: [PATCH] Java: New C++ ABI compatibility changes.)
Alexandre Petit-Bianco
apbianco@cygnus.com
Mon Jan 15 11:10:00 GMT 2001
Jason Merrill writes:
> 1) It doesn't allow for C-like symbols, which have no length specifier.
> This could be fixed by defining some encoding starting with, say, '_U'.
> 2) It doesn't accommodate 32-bit extended characters in C++/C99
> (\UNNNNNNNN). This could be fixed by escaping them with, say, '_L'.
> 3) _NNNN is a valid component of an identifier, complicating the
> demangler intelligence. This could be fixed by also escaping the '_'
> character in affected names. Hmm...it looks like you intend to do
> so in unicode_mangling_length, but don't actually do so in
> append_unicode_mangled_name. We could also just use '__'.
So you are basically suggesting that __UNNNN be emitted for every Unicode
character we encounter, and that __LNNNNNNNN be emitted for 32-bit
extended characters (Java doesn't have to worry about those).
Java would also drop the `U' at the end of the symbol.
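To make the proposed escapes concrete, here is a rough sketch in Python
(for illustration only, not the code in append_unicode_mangled_name).
The function name mangle_identifier is hypothetical. I am assuming that
NNNN and NNNNNNNN are zero-padded lowercase hex digits and that ASCII
characters pass through unchanged. It does not deal with a source
identifier that already contains a literal "__U" or "__L", nor with the
length prefix.

```python
def mangle_identifier(name: str) -> str:
    """Escape non-ASCII code points with '__U' / '__L' prefixes.

    Assumptions (not settled in this thread): hex digits are lowercase
    and zero-padded; ASCII characters are copied through as-is.
    """
    out = []
    for ch in name:
        cp = ord(ch)
        if cp < 0x80:
            # Plain ASCII: copy unchanged.
            out.append(ch)
        elif cp <= 0xFFFF:
            # 16-bit character: __U followed by 4 hex digits.
            out.append("__U%04x" % cp)
        else:
            # 32-bit extended character: __L followed by 8 hex digits.
            out.append("__L%08x" % cp)
    return "".join(out)


if __name__ == "__main__":
    print(mangle_identifier("caf\u00e9"))
    print(mangle_identifier("a\U0001F600"))
```

Because the two escapes have fixed widths (4 and 8 digits), a demangler
can tell them apart from the letter that follows the '__'.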
> With these fixes, I think the current scheme is OK. But for targets
> with 8-bit clean binutils, I think it makes a lot of sense to just
> use the UTF8 encoding in the symbol.
That's fine too, but requires coordinated changes in binutils.
./A