Universal Character Names, v2

Fri Nov 29 02:18:00 GMT 2002

Zack Weinberg <zack@codesourcery.com> writes:
> ... which I disagree with. I am rejecting this patch until you
> implement support for Unicode as she is spoke, which means UAX#15
> including normalization, not whatever nonsense is in the C and C++
> standards.

Can you elaborate on why you consider this approach technically
superior?

I have just implemented normalization for Python, and I can tell you
that you will need a significant database, and completion of such an
implementation will take me several weeks.

Apart from the implementation difficulties, I see the following
problems with this requirement:

1. It is underspecified, as UAX#15 leaves a number of alternatives for
 language designers:
 a) which Unicode version?
 b) which normalization form?
2. It extends the languages, by allowing identifiers which must be
 rejected in a conforming implementation. Can you propose an
 implementation strategy that allows proper implementation of the
 -pedantic option in this case?
3. It restricts the languages, by disallowing identifiers that are
 allowed in the language definition.
4. It modifies the languages, by treating as equal identifiers that
 the language definition requires to be treated as distinct.
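
To make points 1b and 4 concrete, here is a small Python sketch using
the standard unicodedata module. The helper same_identifier is made up
for illustration; it is not part of the patch or of any compiler, and
it only shows how the answer to "are these two spellings the same
identifier?" depends on whether, and how, you normalize.

```python
import unicodedata

# Two spellings of the same visible text, "e with acute accent":
composed = "\u00e9"      # U+00E9 LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # U+0065 followed by U+0301 COMBINING ACUTE ACCENT


def same_identifier(a, b, form=None):
    """Hypothetical helper: compare two identifier spellings.

    form=None compares the code points exactly as written;
    otherwise both spellings are normalized to the given UAX#15
    form ("NFC", "NFD", "NFKC" or "NFKD") before comparing.
    """
    if form is not None:
        a = unicodedata.normalize(form, a)
        b = unicodedata.normalize(form, b)
    return a == b


# Without normalization the two spellings are different code point
# sequences, so they compare unequal.
print(same_identifier(composed, decomposed))

# With normalization they become equal (point 4).
print(same_identifier(composed, decomposed, "NFC"))

# The choice of form matters (point 1b): the "fi" ligature U+FB01
# only folds to plain "fi" under the compatibility forms.
print(same_identifier("\ufb01", "fi", "NFC"))
print(same_identifier("\ufb01", "fi", "NFKC"))

# The result also depends on the Unicode version of the database
# that the implementation ships with (point 1a).
print(unicodedata.unidata_version)
```

The first and third comparisons come out unequal and the other two
equal, so a compiler would have to state exactly which form and which
Unicode version it uses before "the same identifier" is well defined.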

Regards,
Martin