318 – wait does not release thread resources on Linux

Summary: wait does not release thread resources on Linux
Status: RESOLVED FIXED
Alias: None
Product: D
Classification: Unclassified
Component: phobos
Version: D1 (retired)
Hardware: All Linux
Importance: P2 blocker
Assignee: Brad Roberts
URL:
Keywords:
Depends on:
Blocks: 322
Reported: 2006-09-02 13:58 UTC by Mikola Lysenko
Modified: 2014-02-15 13:29 UTC
Attachments
Proposed fix for phobos 1.x - v1 (3.44 KB, patch)
2007-10-19 23:52 UTC, Brad Roberts
patch v2 (12.79 KB, patch)
2007-10-21 05:54 UTC, Brad Roberts

Description Mikola Lysenko 2006-09-02 13:58:49 UTC
While wait is supposed to release a thread's resources, it will fail if the thread has already completed. This makes it impossible to use more than 400 threads reliably. Here is an example which demonstrates the problem:
import std.stdio, std.thread;

void main()
{
    for (int i = 0; i < 80000; i++)
    {
        writefln("Creating thread %d", i);
        Thread t = new Thread({writefln(" Created!"); return 0;});
        t.start;
        // Yield repeatedly so the thread has very likely finished before wait is called.
        for (int x = 0; x < 1000; x++)
            Thread.yield;
        t.wait;
        writefln(" Finished.");
    }
}
Within a few hundred iterations, this code will likely produce a "failed to start" error. From my testing, this issue only affects Linux.
So far, there are no workarounds.
Comment 1 Sean Kelly 2006-09-02 14:25:17 UTC
d-bugmail@puremagic.com wrote:
> http://d.puremagic.com/issues/show_bug.cgi?id=318
> 
> Summary: wait does not release thread resources on Linux
> Product: D
> Version: 0.165
> Platform: All
> OS/Version: Linux
> Status: NEW
> Severity: blocker
> Priority: P2
> Component: Phobos
> AssignedTo: bugzilla@digitalmars.com
> ReportedBy: mclysenk@mtu.edu
> 
> 
> While wait is supposed to release a thread's resources, it will fail if the
> thread has already completed. This makes it impossible to use more than 400
> threads reliably. Here is an example which demonstrates the problem:
> 
> 
> import std.stdio, std.thread;
> 
> void main()
> {
> for(int i=0; i<80000; i++)
> {
> writefln("Creating thread %d", i);
> Thread t = new Thread({writefln(" Created!"); return 0;});
> t.start;
> for(int x=0; x<1000; x++)
> Thread.yield;
> t.wait;
> writefln(" Finished.");
> }
> }
> 
> 
> Within a few hundred iterations, this code will likely produce a "failed to
> start" error. From my testing, this issue only affects Linux.
I think line 667 of thread.d should be changed from:
 if (state == TS.RUNNING)
to:
 if (state != TS.INITIAL)
Because it is not only legal to call pthread_join on a thread that has 
run and finished, but calling pthread_join or pthread_detach is required 
for the thread resources to be released. However, it is illegal to call 
pthread_join more than once, and I believe it is also illegal to detach 
a thread that has already been joined, so 'id' should probably be 
cleared after join/detach is called, and this value tested along with 
'state' before performing thread ops.
As an unrelated issue, I just noticed that CloseHandle is not being 
called on the thread handle for Win32, and pthread_detach is not being 
called for Posix. I think this should be done in a thread dtor or the 
equivalent to ensure resources are properly released.
Sean
Comment 2 Chris Miller 2006-11-19 19:05:14 UTC
> As an unrelated issue, I just noticed that CloseHandle is not being 
> called on the thread handle for Win32, and pthread_detach is not being 
> called for Posix. I think this should be done in a thread dtor or the 
> equivalent to ensure resources are properly released.
> 
Yes, this is very important. This is a huge bug.
Sometimes one uses "throwaway" threads that just do one thing and terminate. Currently, it will cause a huge leak and potential errors.
Comment 3 Brad Roberts 2007-10-19 23:52:25 UTC
Created attachment 197
Proposed fix for phobos 1.x - v1
I've run this diff through a bit of testing, on both 1.x and 2.x, using the provided example test case and a few variations of my own (so far just on Linux, but I'll test on Windows shortly).
I can no longer reproduce the problem. That said, threading problems are notoriously difficult to be sure about. I'd appreciate it if some of you could take a look and hopefully even build your own phobos and do some testing.
I need to think a little more about the running -> terminated -> finished transition steps to make sure they're safe in all cases. I really would prefer not to have to make state management synchronized.
Thanks,
Brad
Comment 4 Brad Roberts 2007-10-21 05:54:04 UTC
Created attachment 198
patch v2
Further testing showed race conditions between the GC and the thread library, so I went ahead with the conservative approach. I'm not happy with this many sync points, but my test cases no longer show any problems.
Comment 5 Walter Bright 2007-11-03 21:41:04 UTC
Fixed in DMD 1.023 and 2.007.

