

Author asksol
Recipients asksol, gdb, jnoller
Date 2010-07-14.07:58:13
SpamBayes Score 0.04943371
Marked as misclassified No
Message-id <1279094296.21.0.17444669581.issue9205@psf.upfronthosting.co.za>
In-reply-to
Content
There's one more thing:

    if exitcode is not None:
        cleaned = True
        if exitcode != 0 and not worker._termination_requested:
            abnormal.append((worker.pid, exitcode))

Instead of restarting crashed worker processes it will simply bring down
the pool, right?
If so, then I think it's important to decide whether we want to keep
the supervisor functionality, and if so decide on a recovery strategy.
Some alternatives are:
A) Any missing worker brings down the pool.
B) Missing workers will be replaced one-by-one. A maximum-restart-frequency decides when the supervisor should give up trying to recover the pool, and crash it.
C) Same as B, except that any process crashing when trying to get() will bring down the pool.
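
To make the maximum-restart-frequency idea in B concrete, here is a minimal sketch of a sliding-window limiter. The class name RestartLimiter, its parameters, and the injectable clock are all hypothetical; nothing like this exists in the patch or in multiprocessing.pool. It only illustrates how a supervisor could decide when to stop replacing workers.

```python
import time
from collections import deque


class RestartLimiter:
    """Hypothetical helper: allow at most `max_restarts` per `window` seconds."""

    def __init__(self, max_restarts, window, clock=time.monotonic):
        self.max_restarts = max_restarts
        self.window = window
        self.clock = clock          # injectable so the logic can be tested
        self._stamps = deque()      # timestamps of recent restarts

    def allow_restart(self):
        now = self.clock()
        # Forget restarts that have fallen out of the sliding window.
        while self._stamps and now - self._stamps[0] > self.window:
            self._stamps.popleft()
        if len(self._stamps) >= self.max_restarts:
            # Too many recent restarts: the supervisor should give up
            # and bring down the pool instead of replacing the worker.
            return False
        self._stamps.append(now)
        return True
```

A supervisor loop would call allow_restart() each time it finds a worker with a non-zero exitcode, replacing the worker on True and crashing the pool on False.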
I think the supervisor is a good addition, so I would very much like to keep it. It's also a step closer to my goal of adding the enhancements added by Celery to multiprocessing.pool.
Using C is only a few changes away from this patch, but B would also be possible in combination with my accept_callback patch. It does pose some overhead, so it depends on the level of recovery we want to support.
accept_callback: this is a callback that is triggered when the job is reserved by a worker process. The acks are sent to an additional Queue, with an additional thread processing the acks (hence the mentioned overhead). This enables us to keep track of what the worker processes are doing, and also to get the PID of the worker processing any given job. Besides recovery, potential uses are monitoring and the ability to terminate a job (ApplyResult.terminate?). See http://github.com/ask/celery/blob/master/celery/concurrency/processes/pool.py
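
A rough sketch of the ack mechanism described above, not the Celery implementation: the names AckTracker and worker_accept and the (job_id, pid) message shape are assumptions made for illustration. To keep it self-contained it uses an in-process queue.Queue; a real pool would use a multiprocessing queue shared with the worker processes.

```python
import os
import queue
import threading


class AckTracker:
    """Hypothetical tracker: records which PID accepted which job."""

    def __init__(self, ack_queue, accept_callback=None):
        self.ack_queue = ack_queue
        self.accept_callback = accept_callback
        self.accepted = {}  # job_id -> pid of the worker that reserved it
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def stop(self):
        # None is used as the stop sentinel because it survives pickling,
        # which matters if the queue is later swapped for a multiprocessing one.
        self.ack_queue.put(None)
        self._thread.join()

    def _run(self):
        # The extra thread that drains acks (the overhead mentioned above).
        while True:
            item = self.ack_queue.get()
            if item is None:
                break
            job_id, pid = item
            self.accepted[job_id] = pid
            if self.accept_callback is not None:
                self.accept_callback(job_id, pid)


def worker_accept(ack_queue, job_id):
    """What a worker would do on reserving a job: send an ack with its PID."""
    ack_queue.put((job_id, os.getpid()))
```

With the job-to-PID mapping in hand, a supervisor that sees a worker die can look up which job that PID had accepted and mark only that job as failed.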
History
Date User Action Args
2010-07-14 07:58:16  asksol  set     recipients: + asksol, jnoller, gdb
2010-07-14 07:58:16  asksol  set     messageid: <1279094296.21.0.17444669581.issue9205@psf.upfronthosting.co.za>
2010-07-14 07:58:14  asksol  link    issue9205 messages
2010-07-14 07:58:13  asksol  create
