Universal Character Names, v2
Martin v. Löwis
martin@v.loewis.de
Fri Nov 29 02:18:00 GMT 2002
Zack Weinberg <zack@codesourcery.com> writes:
> ... which I disagree with. I am rejecting this patch until you
> implement support for Unicode as she is spoke, which means UAX#15
> including normalization, not whatever nonsense is in the C and C++
> standards.
Can you elaborate on why you consider this approach technically
superior?
I have just implemented normalization for Python, and I can tell you
that it needs a significant database, and that completing such an
implementation would take me several weeks.
Apart from the implementation difficulties, I see the following
problems with this requirement:
1. It is underspecified, as UAX#15 leaves a number of alternatives for
   language designers:
   a) which Unicode version?
   b) which normalization form?
2. It extends the languages, by allowing identifiers which must be
rejected in a conforming implementation. Can you propose an
implementation strategy that allows proper implementation of the
-pedantic option in this case?
3. It restricts the languages, by disallowing identifiers that are
allowed in the language definition.
4. It modifies the languages, by treating identifiers as equal which
are not to be treated equal in the language definition.
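
To make points 1 and 4 concrete, here is a small sketch using Python's
standard unicodedata module. It assumes identifiers are otherwise
compared as plain code point sequences; the variable names are only
for illustration, and this is not a proposal for how GCC should do it.

```python
import unicodedata

# Two spellings of the same visible character:
# precomposed U+00E9 vs. "e" followed by combining acute U+0301.
precomposed = "\u00e9"
decomposed = "e\u0301"

def nfc(s):
    """Return s in Normalization Form C."""
    return unicodedata.normalize("NFC", s)

# As raw code point sequences the two spellings differ ...
print(precomposed == decomposed)            # False
# ... but after NFC normalization they compare equal (point 4).
print(nfc(precomposed) == nfc(decomposed))  # True

# The choice of form matters (point 1b): U+FB01 LATIN SMALL LIGATURE FI
# is left alone by NFC but becomes the two letters "fi" under NFKC.
ligature = "\ufb01"
print(unicodedata.normalize("NFC", ligature) == "fi")   # False
print(unicodedata.normalize("NFKC", ligature) == "fi")  # True

# The Unicode version matters as well (point 1a): the result depends
# on the database the implementation ships with.
print(unicodedata.unidata_version)
```

So whether two spellings name the same identifier depends on both the
normalization form and the Unicode database version that is chosen.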
Regards,
Martin