gnu.gcj.convert.UnicodeToBytes and gnu.gcj.convert.Output_UnicodeLittle
Sven de Marothy
sven@physto.se
Wed Apr 20 15:51:00 GMT 2005
Hello Andreas,
First off, I need to address some of your previous statements.
"in my understanding if Charset.available has UTF-16LE in it the String
constructor should work"
This is a misinterpretation. Charset.availableCharsets() returns the available
java.nio charsets, which are NOT necessarily the same as those available for
java.io and java.lang. See:
http://java.sun.com/j2se/1.4.2/docs/guide/intl/encoding.doc.html
The String constructor still works for UTF-16LE because that name is available
in both, but the problem here was static linking, as stated earlier. The proper
name to use with the String constructor is "UnicodeLittleUnmarked" and not
"UTF-16LE", though.
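
A quick way to see whether the two sets of names differ on a given runtime
is to probe each API separately. This is only a sketch: the class name
CharsetProbe and both helper methods are made up for illustration, and the
answers depend entirely on the runtime it is run on.

```java
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;

class CharsetProbe {
    // True if the java.nio charset framework recognises the name.
    static boolean nioSupports(String name) {
        try {
            return Charset.isSupported(name);
        } catch (IllegalArgumentException e) {
            // Thrown for syntactically illegal charset names.
            return false;
        }
    }

    // True if the String(byte[], String) constructor accepts the name.
    static boolean stringSupports(String name) {
        try {
            new String(new byte[0], name);
            return true;
        } catch (UnsupportedEncodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] names = { "UTF-16LE", "UnicodeLittleUnmarked" };
        for (String n : names) {
            System.out.println(n + ": nio=" + nioSupports(n)
                    + " string=" + stringSupports(n));
        }
    }
}
```

If the two columns disagree for a name, that name is safe to use with only
one of the two APIs on that runtime.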
>But why does the Charset Class return that UTF-16LE is supported ?
Because it's supported by the GCJ NIO encoders.
If you want to use them to convert bytes to chars and create a string
from them you can do it like this:
CharsetDecoder csd = Charset.forName("UTF-16LE").newDecoder();
csd.onMalformedInput(CodingErrorAction.REPLACE);
csd.onUnmappableCharacter(CodingErrorAction.REPLACE);
CharBuffer cbuf = csd.decode(ByteBuffer.wrap(data, offset, count));
csd.reset();
String s = cbuf.toString(); // only the decoded chars, not the whole backing array
(Resetting the decoder shouldn't be required here; that it is needed is a
bug, but it's fixed in Classpath CVS.)
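
For reference, the same steps wrapped into a self-contained helper might look
roughly like the following. The class name Utf16LeNioDecode and the method
name decode are just placeholders, and the sketch assumes a runtime where
decode() resets the decoder itself, so the explicit reset() is left out.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;

class Utf16LeNioDecode {
    // Decode count bytes of UTF-16LE starting at offset, replacing bad input.
    static String decode(byte[] data, int offset, int count) {
        CharsetDecoder csd = Charset.forName("UTF-16LE").newDecoder();
        csd.onMalformedInput(CodingErrorAction.REPLACE);
        csd.onUnmappableCharacter(CodingErrorAction.REPLACE);
        try {
            CharBuffer cbuf = csd.decode(ByteBuffer.wrap(data, offset, count));
            // toString() covers only the decoded chars between position and limit.
            return cbuf.toString();
        } catch (CharacterCodingException e) {
            // Not expected with REPLACE actions, but decode() declares it.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = { 0x48, 0x00, 0x69, 0x00 }; // "Hi" in UTF-16LE
        System.out.println(decode(data, 0, data.length)); // prints "Hi"
    }
}
```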
But if all you want is UTF-16LE, you could easily just skip all this
encoding framework stuff and just exploit the fact that UTF-16 is native
to Java, e.g. bytes-to-char (little endian):
char c = (char)(((byte2 & 0xFF) << 8) | (byte1 & 0xFF));
and char-to-bytes:
byte byte1 = (byte)(c & 0xFF);
byte byte2 = (byte)((c >> 8) & 0xFF);
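
Applied to whole arrays, the two expressions above could be used along these
lines. Again only a sketch: the class and method names are invented for the
example, and a trailing odd byte is simply ignored, which may or may not be
what you want.

```java
class Utf16LeManual {
    // Combine little-endian byte pairs into chars; a trailing odd byte is dropped.
    static char[] bytesToChars(byte[] data, int offset, int count) {
        char[] out = new char[count / 2];
        for (int i = 0; i < out.length; i++) {
            int lo = data[offset + 2 * i] & 0xFF;     // first byte: low 8 bits
            int hi = data[offset + 2 * i + 1] & 0xFF; // second byte: high 8 bits
            out[i] = (char) ((hi << 8) | lo);
        }
        return out;
    }

    // Split each char into two bytes, low byte first.
    static byte[] charsToBytes(char[] chars) {
        byte[] out = new byte[chars.length * 2];
        for (int i = 0; i < chars.length; i++) {
            out[2 * i] = (byte) (chars[i] & 0xFF);
            out[2 * i + 1] = (byte) ((chars[i] >> 8) & 0xFF);
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] data = { 0x48, 0x00, 0x69, 0x00 }; // "Hi" in UTF-16LE
        System.out.println(new String(bytesToChars(data, 0, data.length))); // prints "Hi"
    }
}
```

Surrogate pairs need no special handling here, since each half of a pair is
an ordinary 16-bit char as far as Java is concerned.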
We're currently in the process of moving the String and java.io stuff
over to using the NIO encoders/decoders internally (in which case
Charset.availableCharsets() _will_ match those supported by String), but
again, this is not behaviour you should rely on. When this transition is
completed, both Java and native converters will be available, too.
/Sven