This issue tracker has been migrated to GitHub ,
and is currently read-only.
For more information,
see the GitHub FAQs in the Python's Developer Guide.
Created on 2010年11月29日 12:33 by kristjan.jonsson, last changed 2022年04月11日 14:57 by admin. This issue is now closed.
| Files | ||||
|---|---|---|---|---|
| File name | Uploaded | Description | Edit | |
| gccallback1.patch | kristjan.jonsson, 2010年11月29日 12:33 | review | ||
| gccallback2.patch | kristjan.jonsson, 2010年11月30日 13:16 | review | ||
| gccallback3.patch | kristjan.jonsson, 2010年12月05日 08:45 | review | ||
| gccallback4.patch | kristjan.jonsson, 2012年03月20日 20:05 | review | ||
| gccallback4a.patch | Jim.Jewett, 2012年03月21日 01:31 | Based on krisvale's patch 4; see below | ||
| gccallback.patch | kristjan.jonsson, 2012年04月07日 13:00 | review | ||
| gccallback.patch | kristjan.jonsson, 2012年04月08日 13:45 | review | ||
| Messages (31) | |||
|---|---|---|---|
| msg122795 - (view) | Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer) | Date: 2010年11月29日 12:33 | |
As discussed here: http://mail.python.org/pipermail/python-ideas/2010-November/008813.html: Adding the ability to register callbacks to be invoked before and after garbage collection runs. This can be used to gather run-time statistics such as timing information and frequency of garbage collection runs, and to perform application-specific cleanup of uncollecatable objects from gc.garbage. The first patch is the code as currently in use in our codebase at CCP (ported from 2.7). There is only one callback registered and the callback signature is perhaps a bit lame. Also, no error checking. But it is shown here for reference and as a basis for discussion. |
|||
| msg122840 - (view) | Author: Daniel Stutzbach (stutzbach) (Python committer) | Date: 2010年11月29日 18:32 | |
These functions will be very useful for any long-running program. Thank you for the patch. Would you be willing to write tests and documentation? Would it make more sense for the callback to take a boolean instead of an integer as the first argument? |
|||
| msg122844 - (view) | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2010年11月29日 18:40 | |
Well, as you point out, it would make more sense to have two separate callbacks. Also, PyErr_WriteUnraisable() is better than PyErr_Clear(). Finally, you accidentally recoded the file; it should be kept utf-8, not latin-whatever. |
|||
| msg122869 - (view) | Author: Jim Jewett (jimjjewett) | Date: 2010年11月29日 20:47 | |
I like the idea, but I do quibble with the signature. As nearly as I can tell, you're assuming (a) Only one callback. I would prefer a sequence of callbacks, to make cooperation easier. (This does mean you would need a callback removal, instead of just setting it to None.) (b) The callback will be called once before collecting generations, and once after (with the number of objects that weren't collected). Should these be separate callbacks, rather than the same one with a boolean? And why does it need the number of uncollected objects? (This might be a case where Practicality Beats Purity, but it is worth documenting.) |
|||
| msg122901 - (view) | Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer) | Date: 2010年11月30日 13:16 | |
Hi, as I stated, the original patch was simply our original implementation. Here is a new patch. It is simpler: 1) it exposes a gc.callbacks list where users can register themselves, in the spirit of sys.meta_path 2) One can have multiple callbacks 3) Improve error handling 4) The callback is called with a "phase" argument, currently 0 for start, and 1 for the end. Let's start bikeshedding the calling signature. I like having a single callback, since multiple callables are a nuisance to manage. Once we agree, I'll post a patch for the documentation, and unittests. |
|||
| msg122902 - (view) | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2010年11月30日 13:24 | |
> Let's start bikeshedding the calling signature. I like having a > single callback, since multiple callables are a nuisance to manage. IMO the callback should have a second argument as a dict containing various statistics that we can expand over time. The generation number, for example, should be present. As for the phase number, if 0 means start and 1 means end, you can't decently add another phase anyway (having 2 mean "somewhere between 0 and 1" would be completely confusing). PS : please don't use C++-style comments in your patch |
|||
| msg122903 - (view) | Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer) | Date: 2010年11月30日 13:27 | |
You are right, Antoine. How about a string and a dict? the string can be "start" and "stop" and we can add interesting information to the dict as you suggest. |
|||
| msg122913 - (view) | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2010年11月30日 16:10 | |
> You are right, Antoine. > How about a string and a dict? the string can be "start" and "stop" > and we can add interesting information to the dict as you suggest. Looks good to me. |
|||
| msg122914 - (view) | Author: Daniel Stutzbach (stutzbach) (Python committer) | Date: 2010年11月30日 16:32 | |
> How about a string and a dict? the string can be "start" and "stop" > and we can add interesting information to the dict as you suggest. I like where this is headed. How about putting the string in the dict, too? d['phase'] = 'start' |
|||
| msg123224 - (view) | Author: Robert Lehmann (lehmannro) * | Date: 2010年12月03日 10:06 | |
A few issues I'd like to raise: (1) Multiple callback chains. Is there any code in your existing use case of GC callbacks where you don't check for the phase argument and follow different code paths depending on it? If not, having two callback chains should be fine and takes the burden from the programmer to the implementors. (This is feasible if we *only ever* have two values for the phase.) (2) Single collections. Currently, neither PyGC_Collect nor gc.collect() invoke the callbacks (since they do not call collect_generations). Is this an oversight or intentional? (3) Error checking. What about callbacks which are bound to fail on each and every invocation, ie. because of wrong signatures. Should these be flat-out rejected in some way *on registration*, automagically removed when first encountered, or are we okay with errors slammed into the user's face every so often because he should REALLY fix them? (4) Interop. Can this be supported as easily on other VMs? (That's perhaps a good reason for the statistics to be a dict, for GCs providing vastly different amounts of information.) |
|||
| msg123225 - (view) | Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer) | Date: 2010年12月03日 10:13 | |
1) I'm not sure what you are asking. Does anyone think that it is simpler to register two different callbacks than one? IMHO it complicates the interface and creates so many oppertunities to do things incorrectly. 2)No, it is an oversight, let me verify the code. 3)I don't think we ought to be smart and try to remove callbacks. I don't think that is common practice, the developer will know soon enough if things don't work. Just keep it simple. 4)Interop, I think, is not an issue. the gc module is a C-Python only implementation detail. Every implementation has the freedom to collect garbage in its own way. I'll post an updated version soon. |
|||
| msg123253 - (view) | Author: Jim Jewett (jimjjewett) | Date: 2010年12月03日 15:29 | |
<blockquote>Does anyone think that it is simpler to register two different callbacks than one? </blockquote> Moderately, yes. Functions that actually help with cleanup should normally be run only in one phase; it is just stats-gathering and logging functions that might run both times, and I don't mind registering those twice. For functions that are run only once (which I personally think is the more normal case), the choices are between @register_gc def my_callback(actually_run_flag, mydict): if not actually_run_flag: return ... vs @register_gc_before def my_callback(mydict): ... |
|||
| msg123415 - (view) | Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer) | Date: 2010年12月05日 08:45 | |
Here is a third patch. The callback now gets two argument, phase and info. I've added documentation and unittests. |
|||
| msg124508 - (view) | Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer) | Date: 2010年12月22日 13:56 | |
Uh oh. I forgot about this and there now we have passed beta 2. Didn't anyone want to review the patch? |
|||
| msg124570 - (view) | Author: Lukas Lueg (ebfe) | Date: 2010年12月23日 21:26 | |
Why not make the start-callback be able to return a boolean value to the gcmodule that indicates if garbage collection should take place or not. For example, any value returned from the callback that evaluates to False (like null) will cause the module to evaluate any other callback and possibly collect garbage objects. Any value that evaluates to True (like True) returned from any callback causes all further callbacks to not be called and garbage collection not to take place now. |
|||
| msg124631 - (view) | Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer) | Date: 2010年12月25日 06:52 | |
Well, the idea is good and it did cross my mind. Particularly it could be useful for performance sensitive applications. However it complicates things. 1) If a GC is rejected, when do we make the next attempt? 2) If a callback cancels GC, then what about the other callbacks that have already been called? They would need to get some "canceled" call. But this is ok, I suppose. Since we have already passed the beta 2, I'll see if I can come up with a simple change, particularly if we don't descend into some infinite loop wrt. 1) above. |
|||
| msg124666 - (view) | Author: Lukas Lueg (ebfe) | Date: 2010年12月26日 17:05 | |
Collection may re-occur at any time, there is no promise to the callback code. However, the callback can disable the gc, preventing further collection. I don't think we need the other callbacks to be informed. As the callbacks are worked down in the order they registered, whoever comes first is served first. Returning True from the callback is mereley a "I dont mind if gc happens now..." |
|||
| msg124698 - (view) | Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer) | Date: 2010年12月27日 02:35 | |
1) what I mean is that if a callback rejects GC, the GC algorithm may find its condition for invoking GC in the first place to be still valid immediately afterwards, so doing a GC will be immediately retried. I have to check, but it could mean that more changes would be required. 2) Of course callbacks have to know, e.g. those that intend to gather statisctic or measure the time of GC. They have started a timer on the "start" opcode, and expect a "stop" code to follow. They have to get some "canceled" code for their bookkeeping to work. Then additionally we have the question: Should you be able to cancel a direct gc request (like calling gc.collect()) or just the automatic one? This then starts to be a much more complicated change, perhaps one that requires a PEP so I don't think we should do all of that in one gulp. Once the callback mechanism is in, there is every oppertunity to extend it. |
|||
| msg124705 - (view) | Author: Lukas Lueg (ebfe) | Date: 2010年12月27日 07:25 | |
Agreed, let's have the simple callback first. To solve 2) later on, we could have the callback proposed here be the 'execution'-callback. It neither has nor will have the capability to prevent garbage-collection. We can introduce another 'prepare'-callback later which is called when the gc-modules decides that it is time for collection. Callbacks may react with a negative value so execution does not happen and the execution-callbacks are also never called. |
|||
| msg156450 - (view) | Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer) | Date: 2012年03月20日 19:26 | |
Hi! Michael Foord reminded me of this little gem. I'm getting this started again, hopefully for inclusion in 3.3 |
|||
| msg156453 - (view) | Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer) | Date: 2012年03月20日 20:05 | |
Hm, actually it wasn't Michael, but Martin. No matter! Here is a proposed patch, as promised without all the bells and whistles, in particular, there is no feature to cancel the garbage collection. Can we get this into 3.3? |
|||
| msg156471 - (view) | Author: Jim Jewett (Jim.Jewett) * (Python triager) | Date: 2012年03月21日 01:31 | |
gccallback4a.patch is a few changes to gccallback4.patch. Most are just spelling or grammar in comments, but I did modify the test case so that the Uncollectable loop had *multiple* objects with __del__ methods. |
|||
| msg157172 - (view) | Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer) | Date: 2012年03月31日 11:36 | |
Thanks, Jim. Unless anyone objects, I'll commit this then. |
|||
| msg157176 - (view) | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2012年03月31日 11:57 | |
Comments: - the tests look fragile. How can you know a garbage collection will only collect your own objects? So you should call gc.collect() first at the beginning of each test and then initialize the self.visit list. We don't want weird failures because of the unittest machinery or anything else. - I also don't understand the logic in testCollect. Why can't you directly check the contents of self.visit instead of that convoluted code? - In invoke_gc_callback(), "i" should be a Py_ssize_t, not an int - In invoke_gc_callback(), in which situation can callbacks be something else than a list? I think the PyList_Check() should be an assert (and probably use PyList_CheckExact()). - Finally, *please* try to follow PEP 8. Comments should have a space after the "#". Otherwise they look unreadable. |
|||
| msg157187 - (view) | Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer) | Date: 2012年03月31日 13:27 | |
Thank you Antoine. Your points: - Yes, I can robustify this. - I think I worried that the actual contents might be too complex to test for it. I'll see if I can't just simplify it as you suggest. - right, thanks. - gc.callbacks is a simple module attribute. Anyone can set it to anything else, e.g. gc.callbacks=None. We have to accomodate this possibility or else introduce annoying api functions to edit the list. I thought it best to do things similarly to sys.import_hooks etc, simply expose a list object and trust the user to treat this list object with care, but check it regardless to avoid crashing. - Pep8, yes sorry. I do a lot of programming in other projects that are more lenient in comment styles, so it doesn't come automatically. I personally have no problem whatsoever #to #read #such #comments :) |
|||
| msg157188 - (view) | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2012年03月31日 13:35 | |
> - gc.callbacks is a simple module attribute. Anyone can set it to > anything else, e.g. gc.callbacks=None. We have to accomodate this > possibility or else introduce annoying api functions to edit the list. > I thought it best to do things similarly to sys.import_hooks etc, > simply expose a list object and trust the user to treat this list > object with care, but check it regardless to avoid crashing. The way I read it, you don't fetch it from the module dictionary, though, you just use the static C variable, which shouldn't change when the dict is mutated. |
|||
| msg157660 - (view) | Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer) | Date: 2012年04月06日 11:46 | |
You are right, I was thinking more of pyobject attributes. I'll fix this then. |
|||
| msg157733 - (view) | Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer) | Date: 2012年04月07日 13:00 | |
Here is an updated patch, taking Jim's and Antoine's comments into account. Jim, I ́d like to comment that I think the reason __del__ objects are uncollectable is more subtle than there being no defined order of calling the __del__ functions. More significantly, no python code may be executed during an implicit garbage collection. Now, it is possible that one could clean up cycles containing only one __del__ method during _expcicit_ collections (calling gc.collect()) but it hardly seems worth the effort. |
|||
| msg157734 - (view) | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2012年04月07日 13:48 | |
Uploaded another review. I also notice you didn't really address my point, since self.visit is still initialized too early. IMO it should be initialized after the first gc.collect() at the beginning of each test (not in setUp()). |
|||
| msg157788 - (view) | Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer) | Date: 2012年04月08日 13:45 | |
A new patch, taking Antoine's review and comments into account. |
|||
| msg158320 - (view) | Author: Roundup Robot (python-dev) (Python triager) | Date: 2012年04月15日 11:42 | |
New changeset 88f8ef5785d7 by Kristján Valur Jónsson in branch 'default': Issue #10576: Add a progress callback to gcmodule http://hg.python.org/cpython/rev/88f8ef5785d7 |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022年04月11日 14:57:09 | admin | set | github: 54785 |
| 2012年04月15日 11:43:36 | kristjan.jonsson | set | status: open -> closed resolution: fixed |
| 2012年04月15日 11:42:22 | python-dev | set | nosy:
+ python-dev messages: + msg158320 |
| 2012年04月08日 13:45:19 | kristjan.jonsson | set | files:
+ gccallback.patch messages: + msg157788 |
| 2012年04月07日 13:48:35 | pitrou | set | messages: + msg157734 |
| 2012年04月07日 13:00:09 | kristjan.jonsson | set | files:
+ gccallback.patch messages: + msg157733 |
| 2012年04月06日 11:46:29 | kristjan.jonsson | set | messages: + msg157660 |
| 2012年03月31日 13:35:27 | pitrou | set | messages: + msg157188 |
| 2012年03月31日 13:27:42 | kristjan.jonsson | set | messages: + msg157187 |
| 2012年03月31日 11:57:56 | pitrou | set | messages: + msg157176 |
| 2012年03月31日 11:36:55 | kristjan.jonsson | set | messages: + msg157172 |
| 2012年03月21日 01:31:45 | Jim.Jewett | set | files:
+ gccallback4a.patch nosy: + Jim.Jewett messages: + msg156471 |
| 2012年03月20日 20:05:01 | kristjan.jonsson | set | files:
+ gccallback4.patch messages: + msg156453 |
| 2012年03月20日 19:26:57 | kristjan.jonsson | set | nosy:
+ michael.foord messages: + msg156450 |
| 2010年12月27日 07:25:26 | ebfe | set | nosy:
jimjjewett, pitrou, kristjan.jonsson, lehmannro, stutzbach, ebfe, asvetlov, Yury.Selivanov messages: + msg124705 |
| 2010年12月27日 02:35:59 | kristjan.jonsson | set | nosy:
jimjjewett, pitrou, kristjan.jonsson, lehmannro, stutzbach, ebfe, asvetlov, Yury.Selivanov messages: + msg124698 |
| 2010年12月26日 17:05:06 | ebfe | set | nosy:
jimjjewett, pitrou, kristjan.jonsson, lehmannro, stutzbach, ebfe, asvetlov, Yury.Selivanov messages: + msg124666 |
| 2010年12月25日 06:52:05 | kristjan.jonsson | set | nosy:
jimjjewett, pitrou, kristjan.jonsson, lehmannro, stutzbach, ebfe, asvetlov, Yury.Selivanov messages: + msg124631 |
| 2010年12月23日 21:26:22 | ebfe | set | nosy:
+ ebfe messages: + msg124570 |
| 2010年12月22日 13:56:14 | kristjan.jonsson | set | nosy:
jimjjewett, pitrou, kristjan.jonsson, lehmannro, stutzbach, asvetlov, Yury.Selivanov messages: + msg124508 |
| 2010年12月05日 14:09:46 | pitrou | set | stage: patch review versions: + Python 3.3, - Python 3.2 |
| 2010年12月05日 08:45:03 | kristjan.jonsson | set | files:
+ gccallback3.patch messages: + msg123415 |
| 2010年12月03日 20:01:02 | Yury.Selivanov | set | nosy:
+ Yury.Selivanov |
| 2010年12月03日 15:29:11 | jimjjewett | set | messages: + msg123253 |
| 2010年12月03日 10:13:52 | kristjan.jonsson | set | messages: + msg123225 |
| 2010年12月03日 10:06:58 | lehmannro | set | nosy:
+ lehmannro messages: + msg123224 |
| 2010年11月30日 16:32:54 | stutzbach | set | messages: + msg122914 |
| 2010年11月30日 16:10:46 | pitrou | set | messages: + msg122913 |
| 2010年11月30日 13:27:40 | kristjan.jonsson | set | messages: + msg122903 |
| 2010年11月30日 13:24:19 | pitrou | set | messages: + msg122902 |
| 2010年11月30日 13:16:22 | kristjan.jonsson | set | files:
+ gccallback2.patch messages: + msg122901 |
| 2010年11月29日 20:47:00 | jimjjewett | set | nosy:
+ jimjjewett messages: + msg122869 |
| 2010年11月29日 18:40:19 | pitrou | set | nosy:
+ pitrou messages: + msg122844 |
| 2010年11月29日 18:32:48 | stutzbach | set | nosy:
+ stutzbach messages: + msg122840 |
| 2010年11月29日 14:06:36 | asvetlov | set | nosy:
+ asvetlov |
| 2010年11月29日 12:33:06 | kristjan.jonsson | create | |