I need an advice
Martin Kahlert
martin.kahlert@infineon.com
Wed Aug 15 23:59:00 GMT 2001
Hi!
I looked a bit deeper into the problem I reported in
http://gcc.gnu.org/ml/java/2001-05/msg00110.html
and found the following.
(Sorry for the delay, but debugging on that machine is no fun, since
a 'gmake && gmake install' takes more than half an hour.)
My application reads a binary file, transforms the data and builds
String objects from a byte array with 'new String'.
This calls
public String (byte[] byteArray, int offset, int count)
from java/lang/String.java,
which calls init with the default file encoding taken from the system
property "file.encoding".
I did not set anything special here, so my Solaris box responds to
$ locale
with
LANG=
LC_CTYPE="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_COLLATE="C"
LC_MONETARY="C"
LC_MESSAGES="C"
LC_ALL=
The system property "file.encoding" is thus set to "646" in natSystem.cc.
This encoding seems to be either incorrect or unsupported,
so I get an exception each time I call new String(byteArray) from my
application. This occurs 180000 times, and each time
java::lang::String::init (jbyteArray bytes, jint offset, jint count,
jstring encoding)
from natString.cc tries
gnu::gcj::convert::BytesToUnicode::getDecoder(encoding);
with this wrong or unsupported encoding and throws an exception,
because getDecoder's call
decodingClass = Class.forName(className)
does not result in anything usable.
It just creates traffic to my hard disk :-(
Of course the same problem applies in the other direction
(String --> byte array).
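One way to sidestep the default lookup from the application side, without
touching libgcj, might be to name the encoding explicitly in both
directions. This is only a sketch and was not tried on the Solaris box:
the class and method names are made up for illustration, and it assumes
Latin-1 is acceptable for the data. The standard name "ISO-8859-1" is used
here; libgcj's own spelling in natSystem.cc is "8859_1".

```java
import java.io.UnsupportedEncodingException;

// Hypothetical helper, not part of libgcj or of the original application.
final class ExplicitEncodingSketch {
    // Assumption: the binary data can be treated as Latin-1.
    static final String ENCODING = "ISO-8859-1";

    // bytes --> String, bypassing the "file.encoding" default.
    static String decode(byte[] bytes, int offset, int count) {
        try {
            return new String(bytes, offset, count, ENCODING);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(ENCODING + " is not available", e);
        }
    }

    // String --> bytes, the reverse direction.
    static byte[] encode(String s) {
        try {
            return s.getBytes(ENCODING);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(ENCODING + " is not available", e);
        }
    }
}
```

This only helps code I control; any library code that uses the
three-argument constructor would still go through the default encoding.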
So I have some options:
- The quick fix for me was to use
#define DEFAULT_FILE_ENCODING "8859_1" instead of
#define DEFAULT_FILE_ENCODING file_encoding ()
in natSystem.cc.
That's not nice, but it works.
- Set LC_??? to something useful. I tried a lot of possibilities, but nothing
worked (neither UTF-8 nor 8859_1 in LC_ALL or LC_CTYPE).
- Check inside natSystem.cc whether the file.encoding actually works, and
replace it with "8859_1" inside file_encoding() if it does not.
That way we always have a working fallback. Does this conform to the Java docs?
- Cache the last working encoding in
public String (byte[] byteArray, int offset, int count)
inside String.java.
We would still have to make sure that changes to "file.encoding" at
runtime are recognized.
- Do the same caching in natString.cc, in
java::lang::String::init (jbyteArray bytes, jint offset, jint count,
jstring encoding)
for its encoding argument.
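To make the fallback and caching options more concrete, here is a rough
sketch in plain Java of what I have in mind. It is not libgcj code: the
class and method names are invented, java.nio.charset.Charset stands in
for the getDecoder lookup, and "ISO-8859-1" stands in for libgcj's
"8859_1". The expensive check runs only when the value of "file.encoding"
differs from the one seen last time, so runtime changes are still noticed.

```java
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;

// Hypothetical sketch: cache the last working default encoding and fall
// back to Latin-1 when the configured one cannot be used.
final class DefaultEncodingCache {
    static final String FALLBACK = "ISO-8859-1";

    private static String lastProperty; // "file.encoding" value seen at last check
    private static String resolved;     // encoding name that was found to work

    // Returns a usable encoding name; re-checks only if the property changed.
    static synchronized String resolve() {
        String current = System.getProperty("file.encoding", FALLBACK);
        if (resolved == null || !current.equals(lastProperty)) {
            lastProperty = current;
            resolved = isUsable(current) ? current : FALLBACK;
        }
        return resolved;
    }

    // Stand-in for "does the decoder lookup find anything usable?".
    static boolean isUsable(String name) {
        try {
            return Charset.isSupported(name);
        } catch (IllegalArgumentException e) {
            return false; // syntactically illegal encoding name
        }
    }

    // What the three-argument String constructor could do with the cache.
    static String decode(byte[] bytes, int offset, int count) {
        try {
            return new String(bytes, offset, count, resolve());
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With something like this the failing lookup would happen once per change
of the property instead of once per String. The property is still read on
every call, which I assume is cheap compared to a failed Class.forName.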
Any words of wisdom here?
Thanks
Martin.
PS: This is all tested using the gcc-20010813 snapshot on a Solaris 5.7 box.
--
The early bird catches the worm. If you want something else for
breakfast, get up later.