Created on 2009-10-05 01:40 by r.david.murray, last changed 2022-04-11 14:56 by admin. This issue is now closed.
| Files | | |
|---|---|---|
| File name | Uploaded | Description |
| test_multiprocessing.log | r.david.murray, 2009-10-05 01:40 | |
| Messages (30) | |||
|---|---|---|---|
| msg93570 - (view) | Author: R. David Murray (r.david.murray) * (Python committer) | Date: 2009-10-05 01:40 | |
test_multiprocessing is producing tracebacks and hanging on py3k trunk. I think this started happening fairly recently, as it seemed to work in one not-too-old checkout until I did an svn up, and then it started failing. 3.1 seems fine. I've attached the tracebacks, including the ones I get when I press Ctrl-C to break the hang. My system is Gentoo Linux. |
|||
| msg93571 - (view) | Author: R. David Murray (r.david.murray) * (Python committer) | Date: 2009-10-05 02:14 | |
r75241 seems to be the cause. The tests pass on r75240, but give the dict error and hang on r75241. |
|||
| msg93601 - (view) | Author: Jesse Noller (jnoller) * (Python committer) | Date: 2009-10-05 13:18 | |
Adding gps - I haven't had a chance to grok the changes, and me getting to a fix is going to take a few. |
|||
| msg93605 - (view) | Author: Jesse Noller (jnoller) * (Python committer) | Date: 2009-10-05 15:16 | |
It's passing for me on trunk/py3k, at least on OS X. I'm going to need to dredge up a Linux VM. |
|||
| msg93815 - (view) | Author: R. David Murray (r.david.murray) * (Python committer) | Date: 2009-10-10 00:55 | |
Benjamin tried reverting the weakref patch, but that didn't fix it. It turns out the problem is the other patch in that merge, the one that adds saving of the exception to assertRaises. Adding Kristján, as that was his patch. (The tests pass with the weakref patch applied and the unittest patch reverted, by the way.) |
|||
| msg93825 - (view) | Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer) | Date: 2009-10-10 11:04 | |
Interesting. I don't have a Linux machine to debug this. My patch appears harmless enough. We are only storing an exception _object_, not any tracebacks or such. If this were happening on my Windows machine, I would put in breakpoints where the ignored assertion is being raised and find out what's going on. |
|||
| msg93854 - (view) | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2009-10-11 12:03 | |
It was a bad idea to revert the change, because we may now forget about the problem rather than fix it. It is likely that the "dictionary changed size during iteration" occurs because a GC collection gets triggered during the iteration on the weakdict. A GC collection can destroy some objects in that case because keeping an exception around (which the unittest change does) usually creates reference cycles through the frame's locals. It seems that WeakValueDictionary.items() isn't very robust in py3k ;) We probably need to add list-returning variants, e.g. a listitems() method (and, probably, listkeys() and listvalues() too)? |
|||
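A minimal sketch of the defensive pattern discussed above: copy the items of a weak dict into a list before looping, so that entries disappearing mid-loop cannot change the size of what is being iterated. The `Node` class and the `snapshot_names` helper are illustrative names, not part of `weakref`; current Python 3 releases also guard weak-dict iteration internally, so this shows the idea rather than reproducing the original failure.

```python
import gc
import weakref


class Node:
    """Plain object that supports weak references (illustrative only)."""

    def __init__(self, name):
        self.name = name


def snapshot_names(weak_dict):
    # list() copies the (key, value) pairs up front, so an entry that is
    # collected during the loop cannot resize the thing we iterate over.
    return sorted(value.name for _key, value in list(weak_dict.items()))


registry = weakref.WeakValueDictionary()
nodes = [Node("a"), Node("b"), Node("c")]
for node in nodes:
    registry[node.name] = node

names_before = snapshot_names(registry)

# Drop the only strong reference to Node("b"); its entry then vanishes.
del nodes[1]
gc.collect()  # not needed on CPython, but harmless on other implementations

names_after = snapshot_names(registry)
```

The snapshot holds strong references to the values only for the duration of the loop, which is what keeps the iteration stable.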
| msg93855 - (view) | Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer) | Date: 2009-10-11 12:06 | |
The unittest only keeps an exception _object_ around, not the associated traceback. There should be no frames and thus reference cycles associated with this. I agree that the problem should be tackled, rather than swept under the carpet like a revert does. |
|||
| msg93856 - (view) | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2009-10-11 12:14 | |
> The unittest only keeps an exception _object_ around, not the associated traceback.

In py3k, the traceback is attached to the exception (on the __traceback__ attribute) ;) |
|||
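To illustrate the point about `__traceback__`, here is a small sketch showing that a caught exception in Python 3 carries its traceback, and that the traceback in turn references the frame where the exception was handled. The function name `capture` and the local `saved` are made up for the example.

```python
def capture():
    try:
        raise ValueError("boom")
    except ValueError as exc:
        # Keep a second local name: "exc" itself is deleted when the
        # except block ends, but "saved" stays in the frame's locals.
        saved = exc
    return saved


error = capture()
tb = error.__traceback__

# The traceback is reachable from the exception object itself...
has_traceback = tb is not None
# ...and it points back at the frame in which the exception was caught.
frame_name = tb.tb_frame.f_code.co_name
# If that frame still holds the exception in a local variable, the objects
# form a cycle: exception -> traceback -> frame -> locals -> exception.
frame_locals = tb.tb_frame.f_locals
forms_cycle = "saved" in frame_locals and frame_locals["saved"] is error
```

A cycle like this is only reclaimed by the cyclic garbage collector, which is how storing an exception can shift when collections run.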
| msg93857 - (view) | Author: Benjamin Peterson (benjamin.peterson) * (Python committer) | Date: 2009-10-11 12:32 | |
I disagree about reverting it. We have this bug report, and that change is sitting on the merge queue again... |
|||
| msg93859 - (view) | Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer) | Date: 2009-10-11 14:08 | |
I didn't realize that the traceback was attached to the exception object in py3k. This makes the use of such objects more dangerous because of the circular references. The recommended practice of exc_type, exc_obj = sys.exc_info()[:2] is precisely to avoid any "accidents" with the tracebacks. See also http://mail.python.org/pipermail/python-dev/2005-August/055251.html So, can one just clear the __traceback__ member? That would make sense for the unittests, but it would make this manifestation of a GC bug go into hiding again. |
|||
| msg94413 - (view) | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2009-10-24 12:59 | |
Kristján:
> So, can one just clear the __traceback__ member?

Yes, or use `exc_value.with_traceback(None)`. |
|||
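A short sketch of the suggestion above, assuming the goal is to keep an exception object around without its traceback. `with_traceback(None)` sets `__traceback__` to `None` and returns the same exception; the function name `capture_without_traceback` is hypothetical.

```python
def capture_without_traceback():
    try:
        raise KeyError("missing")
    except KeyError as exc:
        # Detach the traceback before the exception outlives this frame,
        # so the stored object no longer references any frames.
        return exc.with_traceback(None)


stored = capture_without_traceback()
traceback_cleared = stored.__traceback__ is None
```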
| msg95173 - (view) | Author: Benjamin Peterson (benjamin.peterson) * (Python committer) | Date: 2009-11-13 02:37 | |
If this is not ported by the time we get to a 2.7 or 3.2 release, I'd like to back it out. |
|||
| msg95175 - (view) | Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer) | Date: 2009-11-13 09:16 | |
I can easily port it again, and remove the traceback from the exception. But won't we be sweeping that GC problem under the carpet then? |
|||
| msg95178 - (view) | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2009-11-13 15:04 | |
Please note there's a patch in issue7105 to allow safer iteration of weak dicts. |
|||
| msg95182 - (view) | Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer) | Date: 2009-11-13 16:10 | |
Ok, I ported the change again in revision 76238, removing the traceback from the stored exception object. |
|||
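For context, this is roughly how the stored exception is used from a test: `assertRaises` as a context manager exposes the caught exception on its `exception` attribute. The test case below is an invented example, not one from test_multiprocessing.

```python
import unittest


class ParseTest(unittest.TestCase):
    def test_bad_int(self):
        with self.assertRaises(ValueError) as cm:
            int("not a number")
        # The context manager keeps the caught exception for inspection.
        self.assertIn("invalid literal", str(cm.exception))


# Run the single test programmatically and collect the outcome.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(ParseTest)
result = unittest.TestResult()
suite.run(result)
```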
| msg95183 - (view) | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2009-11-13 16:12 | |
> Ok, I ported the change again in revision 76238, removing the traceback from the stored exception object.

"versionchanged" should be 3.2, not 3.1. Thanks! |
|||
| msg95184 - (view) | Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer) | Date: 2009-11-13 16:16 | |
Thanks. Fixed in revision 76239. |
|||
| msg95188 - (view) | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2009-11-13 17:04 | |
I don't know if it's your last change, but test_multiprocessing is now hanging on all buildbots. |
|||
| msg95194 - (view) | Author: R. David Murray (r.david.murray) * (Python committer) | Date: 2009-11-13 17:35 | |
It is. TestZZZNumberOfObjects hangs at r76238, and does not hang at r76237 on my Gentoo Linux box. |
|||
| msg95195 - (view) | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2009-11-13 18:04 | |
It seems the error is different, though. This is what I get after Ctrl-C (I omit the tracebacks from the other processes): Traceback (most recent call last): File "/home/antoine/py3k/__svn__/Lib/test/test_multiprocessing.py", line 1071, in test_number_of_objects refs = self.manager._number_of_objects() File "Z:\py3k\__svn__\lib\multiprocessing\managers.py", line 569, in _number_of_objects conn = self._Client(self._address, authkey=self._authkey) File "Z:\py3k\__svn__\lib\multiprocessing\connection.py", line 134, in Client c = SocketClient(address) File "Z:\py3k\__svn__\lib\multiprocessing\connection.py", line 253, in SocketClient s.connect(address) socket.error: [Errno 2] No such file or directory |
|||
| msg95196 - (view) | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2009-11-13 19:04 | |
Ok, the problem is that a finalizer calls _decref, which tries to open an XMLRPC connection (!) to an address on which nobody is listening anymore. Then stupid SocketClient (in multiprocessing.connection) loops endlessly because it is designed to do so when it gets ECONNREFUSED. (When you press Ctrl-C, you don't see all this because it happens in a finalizer, and only a terse message is displayed.) So there are two problems:
- why does test_multiprocessing think it should open an XMLRPC connection to a closed endpoint?
- why doesn't SocketClient have some kind of timeout rather than looping stupidly? |
|||
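The second question can be sketched as a retry loop bounded by a deadline. This is an illustration of the general idea only, not the code in `multiprocessing.connection`; `connect_with_deadline`, its parameters, and the fake `flaky` and `dead` connectors are hypothetical names used so that the example runs without a network.

```python
import time


def connect_with_deadline(connect, timeout=5.0, delay=0.01):
    """Call connect() until it succeeds or `timeout` seconds have passed."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            return connect()
        except ConnectionRefusedError:
            # Give up once the deadline has passed instead of looping forever.
            if time.monotonic() >= deadline:
                raise
            time.sleep(delay)


attempts = []


def flaky():
    # Stand-in for a listener that only starts accepting on the third try.
    attempts.append(1)
    if len(attempts) < 3:
        raise ConnectionRefusedError("not listening yet")
    return "connected"


def dead():
    # Stand-in for an endpoint that nobody listens on anymore.
    raise ConnectionRefusedError("nobody is listening")


outcome = connect_with_deadline(flaky, timeout=1.0)

try:
    connect_with_deadline(dead, timeout=0.05)
    gave_up = False
except ConnectionRefusedError:
    gave_up = True
```

Passing the connect callable in keeps the retry policy separate from the socket code, which makes the timeout behaviour easy to exercise in isolation.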
| msg95203 - (view) | Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer) | Date: 2009-11-13 21:39 | |
Yes, this is quite different. After seeing this, I created issue 7314. I didn't realize that it had anything to do with my patch, but rather blamed it on my brand new Windows 7. |
|||
| msg95204 - (view) | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2009-11-13 21:41 | |
> Yes, this is quite different. After seeing this, I created issue 7314. I didn't realize that it had anything to do with my patch but rather blamed it on my brand new windows 7.

I don't think it has anything to do with your patch. While diagnosing the problem, I noticed that adding some random print() lines would sometimes make the problem disappear. So it is really timing-dependent and your patch happens to trigger the right (or wrong ;-)) timing. |
|||
| msg95205 - (view) | Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer) | Date: 2009-11-13 21:42 | |
See my analysis from issue 7314: The other processes are all waiting in ReadFile(). Maybe they have yet to run to the point where they open the socket to which the master process is trying to connect. The mystery is also: why does the harmless act of setting an instance attribute on a context manager (and without any traceback) trigger this behaviour? |
|||
| msg95206 - (view) | Author: Kristján Valur Jónsson (kristjan.jonsson) * (Python committer) | Date: 2009-11-13 21:44 | |
Ah, okay then. Let's focus on solving 7314 then. |
|||
| msg95372 - (view) | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2009-11-17 10:07 | |
I don't think this is a blocker anymore. Kristján's patch has been ported and the test_multiprocessing freezes have been "fixed" by adding a timeout. As for the original purpose of this ticket, issue7105 is still (part of?) the answer. |
|||
| msg203249 - (view) | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2013-11-18 08:13 | |
I have never seen this issue. Can we close it? #7105 has been fixed in Python 3, but not in Python 2.7. |
|||
| msg205226 - (view) | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2013-12-04 15:09 | |
Okay, let's say it is fixed. Adding Richard to nosy so that he can review the issue if he's interested. |
|||
| msg320627 - (view) | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2018-06-27 21:30 | |
> RuntimeError: dictionary changed size during iteration

I haven't seen this bug in the last 5 years, so I consider the issue to have been fixed in the meantime :-) |
|||
| History | | | |
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:56:53 | admin | set | github: 51309 |
| 2018-06-27 21:30:55 | vstinner | set | status: open -> closed; resolution: out of date -> fixed; messages: + msg320627 |
| 2013-12-04 15:09:49 | pitrou | set | resolution: out of date; messages: + msg205226; stage: needs patch -> resolved |
| 2013-11-18 08:13:54 | vstinner | set | nosy: + vstinner; messages: + msg203249 |
| 2010-08-27 14:13:22 | asksol | set | nosy: + asksol |
| 2009-11-17 10:07:19 | pitrou | set | priority: release blocker -> normal; messages: + msg95372 |
| 2009-11-13 21:44:04 | kristjan.jonsson | set | messages: + msg95206 |
| 2009-11-13 21:42:47 | kristjan.jonsson | set | messages: + msg95205 |
| 2009-11-13 21:41:24 | pitrou | set | messages: + msg95204 |
| 2009-11-13 21:39:10 | kristjan.jonsson | set | messages: + msg95203 |
| 2009-11-13 19:29:14 | pitrou | set | dependencies: + multiprocessing should not wait endlessly |
| 2009-11-13 19:04:46 | pitrou | set | messages: + msg95196 |
| 2009-11-13 18:04:33 | pitrou | set | messages: + msg95195 |
| 2009-11-13 17:35:39 | r.david.murray | set | messages: + msg95194 |
| 2009-11-13 17:35:22 | r.david.murray | set | messages: - msg95193 |
| 2009-11-13 17:34:57 | r.david.murray | set | messages: + msg95193 |
| 2009-11-13 17:34:35 | r.david.murray | set | messages: - msg95190 |
| 2009-11-13 17:26:16 | r.david.murray | set | messages: + msg95190 |
| 2009-11-13 17:04:40 | pitrou | set | messages: + msg95188 |
| 2009-11-13 16:16:30 | kristjan.jonsson | set | messages: + msg95184 |
| 2009-11-13 16:12:57 | pitrou | set | messages: + msg95183 |
| 2009-11-13 16:10:58 | kristjan.jonsson | set | messages: + msg95182 |
| 2009-11-13 15:04:44 | pitrou | set | messages: + msg95178 |
| 2009-11-13 09:16:22 | kristjan.jonsson | set | messages: + msg95175 |
| 2009-11-13 02:37:21 | benjamin.peterson | set | priority: normal -> release blocker; messages: + msg95173 |
| 2009-10-24 12:59:50 | pitrou | set | messages: + msg94413 |
| 2009-10-11 17:32:42 | jon | set | nosy: + jon |
| 2009-10-11 17:29:50 | pitrou | set | dependencies: + weak dict iterators are fragile because of unpredictable GC runs |
| 2009-10-11 14:08:33 | kristjan.jonsson | set | messages: + msg93859 |
| 2009-10-11 12:32:10 | benjamin.peterson | set | messages: + msg93857 |
| 2009-10-11 12:14:41 | pitrou | set | messages: + msg93856 |
| 2009-10-11 12:06:47 | kristjan.jonsson | set | messages: + msg93855 |
| 2009-10-11 12:03:10 | pitrou | set | nosy: + pitrou, benjamin.peterson; messages: + msg93854 |
| 2009-10-10 11:04:36 | kristjan.jonsson | set | messages: + msg93825 |
| 2009-10-10 00:55:10 | r.david.murray | set | nosy: + kristjan.jonsson; messages: + msg93815 |
| 2009-10-05 15:16:21 | jnoller | set | messages: + msg93605 |
| 2009-10-05 13:18:11 | jnoller | set | nosy: + gregory.p.smith; messages: + msg93601 |
| 2009-10-05 02:14:53 | r.david.murray | set | messages: + msg93571 |
| 2009-10-05 01:41:17 | r.david.murray | set | nosy: jnoller, r.david.murray; components: + Library (Lib) |
| 2009-10-05 01:40:55 | r.david.murray | create | |