Solaris -vs- iconv

Per Bothner per@bothner.com
Tue Apr 3 12:01:00 GMT 2001


Tom Tromey <tromey@redhat.com> writes:
> I just mean that a UCS-2 value fits in an int. It is easy to
> manipulate in C. A UTF-8 encoded character requires buffer
> manipulation and is a pain.

Of course characters should be UCS-2 ints. (One could make
an argument for UCS-4, which is what glibc does, but for Java
UCS-2 currently makes more sense. The difference is whether
a surrogate pair is treated as two characters or as one.
I don't think it matters much.)
But input buffers should, I think, be UTF-8.
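
To make the distinction concrete, here is a minimal sketch in present-day Java (not code from libgcj or from this thread). It decodes a UTF-8 input buffer into 16-bit units and then counts the same text as code points, which is roughly the UCS-2 versus UCS-4 view. The class name Utf8BufferDemo and the helpers decode and codePoints are hypothetical names chosen for illustration; the library calls assume a JDK with java.nio.charset.StandardCharsets.

```java
import java.nio.charset.StandardCharsets;

public class Utf8BufferDemo {
    // Decode a UTF-8 byte buffer into 16-bit units (Java chars).
    // A character outside the BMP comes out as a surrogate pair.
    static char[] decode(byte[] utf8) {
        return new String(utf8, StandardCharsets.UTF_8).toCharArray();
    }

    // Count code points, treating each surrogate pair as one character
    // (the UCS-4 view of the same text).
    static int codePoints(char[] units) {
        return Character.codePointCount(units, 0, units.length);
    }

    public static void main(String[] args) {
        // 'A' (1 byte), U+00E9 (2 bytes), U+1D11E (4 bytes) in UTF-8.
        byte[] input = {
            0x41,
            (byte) 0xC3, (byte) 0xA9,
            (byte) 0xF0, (byte) 0x9D, (byte) 0x84, (byte) 0x9E
        };
        char[] units = decode(input);
        System.out.println(units.length);      // prints 4: the pair counts as two
        System.out.println(codePoints(units)); // prints 3: the pair counts as one
    }
}
```

The only place the two counts differ is the last character, which is why the choice matters little for text that stays inside the BMP.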
-- 
	--Per Bothner
per@bothner.com http://www.bothner.com/~per/


