Many "Bad file descriptor"s and Socket's break my communication
Martin Egholm Nielsen
martin@egholm-nielsen.dk
Mon Nov 7 19:24:00 GMT 2005
Hi David,
>> I've spent the last four days narrowing down a strange problem in
>> my webserver.
>> I have a multi-threaded webserver running on an embedded ppc405 module
>> running Linux (2.4.2x) and libgcj (3.4.3 - ancient I know). The
>> webserver is running SSL (thanks to Jessie) with keep-alive support.
>> Fine!
>>
>> However, due to a typo in my connecting SSL client, I somehow forced
>> the client/server to create a new socket connection (and thereby start
>> SSL handshaking) each time the client wanted to talk to the server.
>> (The "somehow" on the client was registering a new HostnameVerifier
>> and a new SSLSocketFactory on HttpsURLConnection.)
>>
>> This consequently made each thread belonging to an old request hang in
>> a "read()" way down in Jessie on a
>> gnu.java.net.PlainSocketImpl$SocketInputStream.
>>
>> However, as the sockets were "closed" (or whatever Sun's
>> HttpsURLConnection implementation does when "abandoning" its cached
>> sockets in my client), I would have expected the stuck invocations of
>> "read()" to return -1 more or less immediately afterwards. (As they do
>> in all the faulty examples I've created to reproduce this!)
>> But no, instead of returning, they just stay stuck there - but
>> not forever!
>>
>> After some 10-20 new connections to my server, _suddenly_ 10-20 stuck
>> read() calls "return" by throwing IOExceptions with "Bad file
>> descriptor"!
>> And now, I really wouldn't care whether I get "-1" returned cleanly or
>> am thrown these IOExceptions - that is, _if_ it didn't "break" the new
>> socket connection that triggers the throwing.
>> And for "break" I have only a vague explanation:
>>
>> 1) Mostly I've seen "huge" amounts of data (~200 chars) disappear from
>> the triggering socket's input stream, resulting in my request thread
>> getting stuck in a read() and never getting any of the expected data.
>> 2) On rare occasions I've seen only a "small" number of characters
>> disappear - one char, and sometimes more.
>>
>> That was a long story, but I have a small hope that somebody can guide
>> me further in solving this.
>> Alternatively, I would really like to know how to provoke read() to
>> throw "Bad file descriptor" instead of returning -1, so that I might
>> have a chance of reproducing this with a much smaller example.
>>
> It is hard to say exactly what is happening. However, it is possible
> that it is caused by the bug fixed by this:
>
> http://gcc.gnu.org/ml/java-patches/2005-q1/msg00753.html
>
> patch. It could of course be unrelated to that bug, in which case I can
> only wish you luck.
Thanks David, but I actually saw that one when comparing my old
natPlainSocketImplPosix.cc with the one from CVS to see if there were
any obvious fixes. So I updated this from CVS along with
PlainSocketImpl.java...
I've been investigating a bit further, and using "netstat" I've noticed
that all the sockets that result in "Bad file descriptor" exceptions
are in the "FIN_WAIT2" state just before.
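
To keep an eye on how many sockets sit in each state between requests, a
small tally over netstat-style lines could look like the sketch below. It is
only an illustration: it assumes the usual Linux "netstat -tn" column layout
(Proto, Recv-Q, Send-Q, local address, foreign address, state), the class
name TcpStateCount is made up, the sample lines are hypothetical and not
taken from the real system, and it targets a current JDK rather than
libgcj 3.4.3.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class TcpStateCount {

    /** Counts TCP states in lines laid out like Linux "netstat -tn" output. */
    static Map<String, Integer> countStates(List<String> netstatLines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : netstatLines) {
            String[] fields = line.trim().split("\\s+");
            // Assumed columns: Proto Recv-Q Send-Q Local Foreign State.
            // Header and non-TCP lines are skipped.
            if (fields.length >= 6 && fields[0].startsWith("tcp")) {
                counts.merge(fields[5], 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical sample input, for illustration only.
        List<String> sample = Arrays.asList(
            "Proto Recv-Q Send-Q Local Address      Foreign Address    State",
            "tcp        0      0 192.168.0.10:443   192.168.0.20:40001 ESTABLISHED",
            "tcp        0      0 192.168.0.10:443   192.168.0.20:40002 FIN_WAIT2",
            "tcp        0      0 192.168.0.10:443   192.168.0.20:40003 FIN_WAIT2");
        System.out.println(countStates(sample));
    }
}
```

Feeding it the captured netstat output before and after a burst of new
connections would show whether the FIN_WAIT2 count drops at the same moment
the exceptions appear.
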
I'm trying to tune the kernel to avoid getting so many of those; that
may help with the problem - maybe it's a kernel bug instead...
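
As a starting point for the much smaller example asked for in the quoted
text, the sketch below contrasts two ways a blocked read() can end: the peer
closing its end, and another thread closing the local socket. It is a
minimal sketch under assumptions: the class and method names are made up, it
uses plain loopback sockets without SSL or Jessie, and it targets a current
JDK (lambdas, try-with-resources), so it would need rewriting for
libgcj 3.4.3. On a standard JVM the first case returns -1 and the second
throws a SocketException; whether libgcj reports the second case as
"Bad file descriptor" is not verified here.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.atomic.AtomicReference;

public class BlockedReadDemo {

    /**
     * Parks a thread in read() on an accepted socket, then closes either the
     * accepted socket itself (closeLocally) or the peer, and reports how the
     * read() ended.
     */
    static String run(boolean closeLocally) throws Exception {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback)) {
            Socket client = new Socket(loopback, server.getLocalPort());
            Socket accepted = server.accept();
            AtomicReference<String> outcome = new AtomicReference<>();

            Thread reader = new Thread(() -> {
                try {
                    InputStream in = accepted.getInputStream();
                    outcome.set("returned " + in.read());
                } catch (IOException e) {
                    outcome.set("threw " + e.getClass().getSimpleName());
                }
            });
            reader.start();
            Thread.sleep(200); // crude: give the reader time to block

            if (closeLocally) {
                accepted.close(); // close the fd the reader is blocked on
            } else {
                client.close();   // orderly shutdown from the peer
            }
            reader.join(5000);

            client.close();
            accepted.close();
            return outcome.get();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("peer close:  " + run(false));
        System.out.println("local close: " + run(true));
    }
}
```

If the stuck reads only ever fail in the second way, that would suggest the
descriptor is being closed (or reused) on the server side rather than the
peer's close being noticed.
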
BR,
Martin