This issue tracker has been migrated to GitHub ,
and is currently read-only.
For more information,
see the GitHub FAQs in the Python's Developer Guide.
Created on 2010年05月31日 22:20 by apexo, last changed 2022年04月11日 14:57 by admin. This issue is now closed.
| Files | ||||
|---|---|---|---|---|
| File name | Uploaded | Description | Edit | |
| select_threads.py | apexo, 2010年06月01日 06:35 | python2 script that demonstrates the problem | ||
| concurrent_poll_and_register.py | apexo, 2010年06月01日 07:09 | python2 script to demonstrate memory corruption issue | ||
| select.patch | apexo, 2010年06月15日 10:21 | a patch | ||
| select_concurrent_poll.patch | serhiy.storchaka, 2012年11月11日 19:48 | Updated to 2.7 for review patch | review | |
| select_concurrent_poll.patch | serhiy.storchaka, 2012年11月11日 21:50 | review | ||
| issue8865-poll-multithread-gps01.diff | gregory.p.smith, 2012年11月11日 21:51 | updated, against 3.2 | review | |
| issue8865_v2.diff | apexo, 2012年11月12日 09:22 | forbid concurrent poll() invocation | review | |
| issue8865_v3.diff | apexo, 2012年11月12日 10:02 | RuntimeError instead of TypeError | review | |
| test.py | neologix, 2013年01月11日 09:00 | crasher | ||
| issue8865_test.patch | serhiy.storchaka, 2013年01月26日 18:15 | review | ||
| Messages (31) | |||
|---|---|---|---|
| msg106815 - (view) | Author: Christian Schubert (apexo) * | Date: 2010年05月31日 22:20 | |
invoking select.poll.poll() concurrently from multiple threads frequently yields garbage in one thread: while poll_poll in thread 1 is parsing its result, another thread 2 calling poll may overwrite revents; assuming poll_result was 1 in thread 1 and thread 2 managed to clear all revents before thread 1 started scanning ufds, thread 1 would iterate straight through all ufds, past its bounds (no bound checks there), and return the first out-of-bounds entry that happens to have revents != 0 this issue needs at least documentation (although bounds-checking to prevent garbage in the result wouldn't hurt) also, since there doesn't seem to be any locking w/ regards to ufds, it might be possible to corrupt python's heap, by concurrently invoking poll_register and poll_poll. poll_register could move the ufds array to another location while resizing it and poll_poll would subsequently overwrite memory that is not allocated anymore or allocated by someone else (did not test that) python 2.5.5 |
|||
| msg106816 - (view) | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2010年05月31日 22:24 | |
Do you have a script that reproduces it? |
|||
| msg106827 - (view) | Author: Christian Schubert (apexo) * | Date: 2010年06月01日 06:35 | |
okay, I've managed to put together a script that fairly consistently reproduces the problem on my (core2 duo 2,something GHz) machine some parameters (sleep times) are randomized in each iteration, the script runs for 30 iterations, each for 1 second, and counts the number of events where the fd looks bogus (max 1 per iteration); some other events (fd correct, revents bogus are also reported, but not counted) example output: poll returned [(-3, 65535)], we were looking for fd 3 poll returned [(-3, 65535)], we were looking for fd 3 poll returned [(-3, 65535)], we were looking for fd 3 poll returned [(-3, 65535)], we were looking for fd 3 poll returned [(-3, 65535)], we were looking for fd 3 poll returned [(-1, 65535)], we were looking for fd 3 poll returned [(-1, 65535)], we were looking for fd 3 poll returned [(-1627358347, 60497)], we were looking for fd 3 should not get here; poll returned [(3, 0)] poll returned [(-3, 65535)], we were looking for fd 3 9 events in 30 iterations |
|||
| msg106829 - (view) | Author: Christian Schubert (apexo) * | Date: 2010年06月01日 07:09 | |
the other issue (poll_register freeing data structures that poll_poll currently uses) can be reproduced by the attached script (at least I can) depending on the parameters/circumstances it either segfaults or prints garbage results from poll, e.g. [(1684955474, 28192)]; the expected result would be [(3, 1)] |
|||
| msg107863 - (view) | Author: Christian Schubert (apexo) * | Date: 2010年06月15日 10:21 | |
added a patch which fixes both issues before releasing the GIL we take a copy of the ufds pointer and its len, erasing the ufds pointer in the poll object (to make sure nobody else fiddles with it); when we're done we either but it back into the object (when it's still NULL) or we free it the update logic is modified as well, since the uptodate flag is not sufficient anymore, we now count up a tag for each modification of the poll dictionary and remember that tag together with the ufds pointer the result is: - for the single-threaded case we do little extra work (moving ufds from/to poll object) - correct for the multi-threaded case, with slightly higher overhead (one additional call to each of malloc, update_ufd_array, free), probably worse than having one poll object per thread, but not worse than allocating a new poll object each time we want to poll there still is potential for incorrect poll results (but not for memory corruption): when poll_register/poll_unregister are called exactly 2**32 times (or multiples thereof) while poll_poll is running, it will happily put back its outdated ufds pointer into the poll object when its done, this could be alleviated by changing tag to long long ... which is unlikely to wrap around anytime soon. |
|||
| msg175387 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2012年11月11日 19:48 | |
I was unable to apply the patch automatically, so I had to do it manually. Here is an updated patch for review. I did not consider it in detail yet, but it seems to correct these errors. |
|||
| msg175389 - (view) | Author: Gregory P. Smith (gregory.p.smith) * (Python committer) | Date: 2012年11月11日 19:55 | |
i'm looking at getting this in. |
|||
| msg175390 - (view) | Author: Gregory P. Smith (gregory.p.smith) * (Python committer) | Date: 2012年11月11日 21:21 | |
Christian Schubert (apexo) - Would you please submit a PSF contributor agreement form? http://www.python.org/psf/contrib/ http://www.python.org/psf/contrib/contrib-form-python/ thanks! |
|||
| msg175394 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2012年11月11日 21:50 | |
In general the patch looks good to me. I only get rid of non-needed macros. |
|||
| msg175395 - (view) | Author: Gregory P. Smith (gregory.p.smith) * (Python committer) | Date: 2012年11月11日 21:51 | |
here's an updated patch. it strikes me that this should not be a very common problem. how many applications are going to share the same poll object _across_ multiple threads? if they do and the file descriptor they'll be spending a lot of time mallocing and freeing new pollfd ufds arrays in each thread as a result as they all fight for ownership of the pollObject's ufds. |
|||
| msg175397 - (view) | Author: Gregory P. Smith (gregory.p.smith) * (Python committer) | Date: 2012年11月11日 21:54 | |
our patches are similar. i updated it to use long long and Py_ssize_t and Py_CLEAR and Py_RETURN_NONE in a few places and added comments. getting rid of the CLEAR_UFDS macro as you did is a good idea. |
|||
| msg175400 - (view) | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2012年11月11日 22:10 | |
> it strikes me that this should not be a very common problem. how many > applications are going to share the same poll object _across_ multiple > threads? Indeed that doesn't sound very likely. How about raising an error on concurrent modification, instead of trying to make it thread-safe? |
|||
| msg175409 - (view) | Author: Christian Schubert (apexo) * | Date: 2012年11月11日 23:44 | |
> How about raising an error on concurrent modification, instead of trying to make it thread-safe? That's totally fine with me. |
|||
| msg175429 - (view) | Author: Christian Schubert (apexo) * | Date: 2012年11月12日 09:22 | |
new proposed fix: forbid concurrent poll() invocation |
|||
| msg175431 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2012年11月12日 09:57 | |
The patch LGTM. I doubt about the exception type. May be RuntimeError is more appropriate? |
|||
| msg175432 - (view) | Author: Christian Schubert (apexo) * | Date: 2012年11月12日 10:02 | |
> I doubt about the exception type. May be RuntimeError is more appropriate? mea culpa, just copy&pasted without actually looking; fixed in v3 |
|||
| msg175501 - (view) | Author: Christian Schubert (apexo) * | Date: 2012年11月13日 15:59 | |
> Would you please submit a PSF contributor agreement form? FYI: did that yesterday 9:43 UTC, no reaction (yet) |
|||
| msg175525 - (view) | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2012年11月13日 20:45 | |
> > Would you please submit a PSF contributor agreement form? > > FYI: did that yesterday 9:43 UTC, no reaction (yet) Don't worry, it can take some time to process. We can still apply your patch, though, since you said you sent a contributor agreement. |
|||
| msg176894 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2012年12月04日 10:44 | |
As I see, Christian's contributor agreement already confirmed. Let's go ahead. |
|||
| msg176895 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2012年12月04日 10:48 | |
Christian, it would be nice if you write the tests. |
|||
| msg176969 - (view) | Author: Christian Schubert (apexo) * | Date: 2012年12月05日 07:52 | |
What's a proper test for this? Testing, that the (now expected) exception occurs when invoking poll concurrently? Or rather, that the race condition does not occur? Last time I checked, pypy handled the concurrent poll invocation well, so it would fail both tests. |
|||
| msg176998 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2012年12月05日 19:19 | |
Test should check changed behavior. I.e. it should test that calling poll concurrently raises an exception. |
|||
| msg179657 - (view) | Author: Charles-François Natali (neologix) * (Python committer) | Date: 2013年01月11日 09:00 | |
This patch should be updated to also fix epoll(). Also, is it right to raise an exception in case of concurrent invocation? Here's a simple script that crashes systematically on my Linux box. |
|||
| msg179659 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2013年01月11日 09:19 | |
> Also, is it right to raise an exception in case of concurrent invocation? It is right for poll() because it was not concurrently usable in previous versions in any case. For epoll() it is an another issue. |
|||
| msg180696 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2013年01月26日 18:15 | |
Here is a test made from Charles-François's crasher. Let's go. |
|||
| msg181822 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2013年02月10日 17:37 | |
Ping. |
|||
| msg181823 - (view) | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2013年02月10日 17:44 | |
Patches look good to me. |
|||
| msg186736 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2013年04月13日 15:59 | |
Gregory? |
|||
| msg194334 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2013年08月04日 08:37 | |
Gregory, could you commit the patch or allow me to do this? |
|||
| msg195705 - (view) | Author: Roundup Robot (python-dev) (Python triager) | Date: 2013年08月20日 17:59 | |
New changeset 072ba5df77e4 by Serhiy Storchaka in branch '3.3': Issue #8865: Concurrent invocation of select.poll.poll() now raises a http://hg.python.org/cpython/rev/072ba5df77e4 New changeset 4543408e2ba6 by Serhiy Storchaka in branch 'default': Issue #8865: Concurrent invocation of select.poll.poll() now raises a http://hg.python.org/cpython/rev/4543408e2ba6 New changeset a4091c1de27a by Serhiy Storchaka in branch '2.7': Issue #8865: Concurrent invocation of select.poll.poll() now raises a http://hg.python.org/cpython/rev/a4091c1de27a |
|||
| msg195707 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2013年08月20日 18:23 | |
Thank you for the report and the patch Christian. |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022年04月11日 14:57:01 | admin | set | github: 53111 |
| 2013年08月20日 18:23:29 | serhiy.storchaka | set | status: open -> closed resolution: fixed messages: + msg195707 stage: commit review -> resolved |
| 2013年08月20日 17:59:34 | python-dev | set | nosy:
+ python-dev messages: + msg195705 |
| 2013年08月20日 14:47:22 | serhiy.storchaka | set | assignee: gregory.p.smith -> serhiy.storchaka versions: - Python 3.2 |
| 2013年08月04日 17:25:37 | vstinner | set | nosy:
+ vstinner |
| 2013年08月04日 08:37:10 | serhiy.storchaka | set | messages: + msg194334 |
| 2013年04月13日 15:59:02 | serhiy.storchaka | set | messages: + msg186736 |
| 2013年02月10日 17:44:54 | pitrou | set | messages: + msg181823 |
| 2013年02月10日 17:37:40 | serhiy.storchaka | set | messages: + msg181822 |
| 2013年01月26日 18:17:05 | serhiy.storchaka | set | stage: test needed -> commit review |
| 2013年01月26日 18:15:19 | serhiy.storchaka | set | files:
+ issue8865_test.patch messages: + msg180696 |
| 2013年01月11日 09:19:16 | serhiy.storchaka | set | messages: + msg179659 |
| 2013年01月11日 09:00:32 | neologix | set | files:
+ test.py nosy: + neologix messages: + msg179657 |
| 2013年01月11日 08:58:36 | neologix | unlink | issue16929 dependencies |
| 2013年01月11日 08:58:36 | neologix | link | issue16929 superseder |
| 2013年01月11日 08:52:14 | serhiy.storchaka | link | issue16929 dependencies |
| 2012年12月05日 19:19:43 | serhiy.storchaka | set | messages: + msg176998 |
| 2012年12月05日 10:53:52 | asvetlov | set | nosy:
+ asvetlov |
| 2012年12月05日 07:52:14 | apexo | set | messages: + msg176969 |
| 2012年12月04日 10:48:04 | serhiy.storchaka | set | messages: + msg176895 |
| 2012年12月04日 10:44:30 | serhiy.storchaka | set | messages: + msg176894 |
| 2012年11月13日 20:45:30 | pitrou | set | messages: + msg175525 |
| 2012年11月13日 15:59:15 | apexo | set | messages: + msg175501 |
| 2012年11月12日 10:02:00 | apexo | set | files:
+ issue8865_v3.diff messages: + msg175432 |
| 2012年11月12日 09:57:11 | serhiy.storchaka | set | messages: + msg175431 |
| 2012年11月12日 09:22:10 | apexo | set | files:
+ issue8865_v2.diff messages: + msg175429 |
| 2012年11月11日 23:44:34 | apexo | set | messages: + msg175409 |
| 2012年11月11日 22:10:08 | pitrou | set | messages: + msg175400 |
| 2012年11月11日 21:54:08 | gregory.p.smith | set | messages: + msg175397 |
| 2012年11月11日 21:51:05 | gregory.p.smith | set | files:
+ issue8865-poll-multithread-gps01.diff messages: + msg175395 |
| 2012年11月11日 21:50:31 | serhiy.storchaka | set | files:
+ select_concurrent_poll.patch messages: + msg175394 stage: test needed |
| 2012年11月11日 21:21:22 | gregory.p.smith | set | messages: + msg175390 |
| 2012年11月11日 19:55:29 | gregory.p.smith | set | assignee: gregory.p.smith messages: + msg175389 |
| 2012年11月11日 19:48:10 | serhiy.storchaka | set | files:
+ select_concurrent_poll.patch type: behavior -> crash messages: + msg175387 versions: + Python 3.3, Python 3.4, - Python 2.6, Python 3.1 |
| 2012年11月11日 16:52:34 | pitrou | set | nosy:
+ serhiy.storchaka |
| 2012年11月11日 04:08:53 | gregory.p.smith | set | assignee: gregory.p.smith -> (no value) |
| 2012年06月07日 20:38:40 | gregory.p.smith | set | assignee: gregory.p.smith nosy: + gregory.p.smith |
| 2010年06月15日 10:31:07 | eric.araujo | set | nosy:
- docs@python |
| 2010年06月15日 10:22:26 | pitrou | set | nosy:
+ exarkun |
| 2010年06月15日 10:21:28 | apexo | set | files:
+ select.patch keywords: + patch messages: + msg107863 |
| 2010年06月01日 07:09:26 | apexo | set | files:
+ concurrent_poll_and_register.py messages: + msg106829 |
| 2010年06月01日 06:35:20 | apexo | set | files:
+ select_threads.py messages: + msg106827 |
| 2010年05月31日 22:24:08 | pitrou | set | versions:
+ Python 2.6, Python 3.1, Python 2.7, Python 3.2, - Python 2.5 nosy: + pitrou messages: + msg106816 assignee: docs@python -> (no value) components: - Documentation |
| 2010年05月31日 22:20:13 | apexo | create | |