UTF-16 not supported?

Tom Tromey tromey@redhat.com
Fri Aug 16 14:40:00 GMT 2002


>>>>> "Suresh" == Suresh Raman <sugansha@yahoo.com> writes:

Suresh> The output of the program should be "hello world", which it is with
Suresh> UTF-8. But with UTF-16 or UTF-16BE, the output is a truncated string
Suresh> "hell" or "hello".

The appended patch fixes your test case for me. It also doesn't cause
any regressions on our test suite (including Mauve).

Does anybody out there have a box with glibc 2.1.3? I'd like to know
if you could run a test to see how this behaves there.

How common is 2.1.3? Is there a distribution still using it? (Even a
somewhat old distribution, if it is still in common use.) If it is
really obsolete then I can just remove all pretense at a workaround...

Tom

Index: ChangeLog
from Tom Tromey <tromey@redhat.com>
	* gnu/gcj/convert/natIconv.cc (write): Handle case where
	output buffer is too small.
Index: gnu/gcj/convert/natIconv.cc
===================================================================
RCS file: /cvs/gcc/gcc/libjava/gnu/gcj/convert/natIconv.cc,v
retrieving revision 1.13
diff -u -r1.13 natIconv.cc
--- gnu/gcj/convert/natIconv.cc 18 Feb 2002 02:52:44 -0000 1.13
+++ gnu/gcj/convert/natIconv.cc 16 Aug 2002 21:37:39 -0000
@@ -1,6 +1,6 @@
-// Input_iconv.java -- Java side of iconv() reader.
+// natIconv.cc -- Java side of iconv() reader.
 
-/* Copyright (C) 2000, 2001 Free Software Foundation
+/* Copyright (C) 2000, 2001, 2002 Free Software Foundation
 
 This file is part of libgcj.
 
@@ -201,25 +201,39 @@
 inbuf = (char *) temp_buffer;
 }
 
- // If the conversion fails on the very first character, then we
- // assume that the character can't be represented in the output
- // encoding. There's nothing useful we can do here, so we simply
- // omit that character. Note that we can't check `errno' because
- // glibc 2.1.3 doesn't set it correctly. We could check it if we
- // really needed to, but we'd have to disable support for 2.1.3.
 size_t loop_old_in = old_in;
 while (1)
 {
 size_t r = iconv_adapter (iconv, (iconv_t) handle,
 				&inbuf, &inavail,
 				&outbuf, &outavail);
- if (r == (size_t) -1 && inavail == loop_old_in)
+ if (r == (size_t) -1)
 	{
-	 inavail -= 2;
-	 if (inavail == 0)
-	 break;
-	 loop_old_in -= 2;
-	 inbuf += 2;
+	 if (errno == EINVAL)
+	 {
+	 // Incomplete byte sequence at the end of the input
+	 // buffer. This shouldn't be able to happen here.
+	 break;
+	 }
+	 else if (errno == E2BIG)
+	 {
+	 // Output buffer is too small.
+	 break;
+	 }
+	 else if (errno == EILSEQ || inavail == loop_old_in)
+	 {
+	 // Untranslatable sequence. Since glibc 2.1.3 doesn't
+	 // properly set errno, we also assume that this is what
+	 // is happening if no conversions took place. (This can
+	 // be a bogus assumption if in fact the output buffer is
+	 // too small.) We skip the first character and try
+	 // again.
+	 inavail -= 2;
+	 if (inavail == 0)
+		break;
+	 loop_old_in -= 2;
+	 inbuf += 2;
+	 }
 	}
 else
 	break;
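
For readers who want to try the errno handling outside libgcj, here is a
minimal standalone sketch of the same idea. It is not the libgcj code:
convert_all is a hypothetical helper, it calls glibc's iconv() directly
rather than going through iconv_adapter, and it assumes the glibc prototype
that takes char ** for the input buffer. Because the input here is a plain
byte string rather than Java chars, it skips one byte on EILSEQ instead of
two. The output chunk is deliberately tiny so that the E2BIG path is
exercised.

```cpp
#include <cerrno>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <iconv.h>

// Hypothetical helper (not part of libgcj): convert `input` from encoding
// `from` to encoding `to`, draining a small output chunk whenever iconv()
// reports E2BIG instead of treating that as an untranslatable character.
std::string convert_all (const char *to, const char *from,
                         const std::string &input)
{
  iconv_t cd = iconv_open (to, from);
  if (cd == (iconv_t) -1)
    throw std::runtime_error ("iconv_open failed");

  std::string in_copy = input;   // iconv() wants a mutable char *
  char *inbuf = &in_copy[0];
  size_t inavail = in_copy.size ();

  std::string out;
  char chunk[8];                 // tiny on purpose, to force E2BIG

  while (inavail > 0)
    {
      char *outbuf = chunk;
      size_t outavail = sizeof chunk;
      size_t r = iconv (cd, &inbuf, &inavail, &outbuf, &outavail);
      int err = errno;           // save before anything else can change it

      // Keep whatever was converted before the call stopped.
      out.append (chunk, sizeof chunk - outavail);

      if (r != (size_t) -1)
        break;                   // all input consumed
      if (err == E2BIG)
        continue;                // output chunk was full: drain and retry
      if (err == EILSEQ)
        {
          // Untranslatable sequence: skip one input byte and retry.
          ++inbuf;
          --inavail;
          continue;
        }
      break;                     // EINVAL (incomplete input) or other error
    }

  iconv_close (cd);
  return out;
}
```

With this loop, convert_all ("UTF-16BE", "UTF-8", "hello world") should
produce all 22 bytes even though each iconv() call can emit at most 8. A
real converter would also flush any shift state for stateful encodings,
which this sketch omits.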

