This issue tracker has been migrated to GitHub ,
and is currently read-only.
For more information,
see the GitHub FAQs in the Python's Developer Guide.
Created on 2011年11月28日 22:34 by Voo, last changed 2022年04月11日 14:57 by admin. This issue is now closed.
| Files | ||||
|---|---|---|---|---|
| File name | Uploaded | Description | Edit | |
| issue13496.patch | mark.dickinson, 2012年01月28日 10:03 | review | ||
| issue13496_v2.patch | mark.dickinson, 2012年03月10日 15:29 | review | ||
| Messages (18) | |||
|---|---|---|---|
| msg148517 - (view) | Author: Daniel Sturm (Voo) | Date: 2011年11月28日 22:34 | |
The mid index computation in _bisectmodule.c in both internal_bisect_right and internal_bisect_left is done with: mid = (lo + hi) / 2; // all three variables Py_ssize_t which is susceptible to overflows for large arrays, which would lead to undefined behavior (and in practice almost certainly a crash with a negative index) The fix is trivial - mid = lo + (hi - lo) / 2; - but since I'm just starting to look into the code base I may be missing some undocumented assertions that guarantee this can't happen. |
|||
| msg148520 - (view) | Author: Raymond Hettinger (rhettinger) * (Python committer) | Date: 2011年11月28日 23:04 | |
This looks like a reasonable suggestion. |
|||
| msg148524 - (view) | Author: Tim Peters (tim.peters) * (Python committer) | Date: 2011年11月28日 23:45 | |
FWIW, I doubt there's a real issue here. Objects in Python consume a lot more than a byte or two of memory, so the index range of a Python list is generally a lot less than ssize_t allows for. In other words, quantify "large" in "large arrays". How large can a Python list actually be, relative to ssize_t? Similar reasoning accounts for why we never worry about overflow when mucking with refcounts: the size of a refcount member exceeds the maximum number of references that could exist. |
|||
| msg148531 - (view) | Author: Daniel Sturm (Voo) | Date: 2011年11月29日 00:25 | |
TBH I saw this more as an opportunity to get used to the whole system, how to create a patch, etc. :) Should've made it clearer at the start that this is unlikely to ever be a problem, sorry (didn't see a way to set priority to low myself). If my math isn't completely off, the highest possible value here is hi + (hi - 1) so this only occurs if hi > (MAX_SSIZE_T + 1) / 2. So this would only work out if we had such an array containing only a handful different objects. And then that's me assuming that a list is actually a thin wrapper around an array of PyObject*. |
|||
| msg148567 - (view) | Author: Mark Dickinson (mark.dickinson) * (Python committer) | Date: 2011年11月29日 14:03 | |
Given that we typically need at least 4 bytes just for the PyObject * pointer for each item in a list, I guess real lists are safe. But how about list-like objects, implementing __len__ and __getitem__? The following appears to run forever on my machine: class SquaresList(object): def __init__(self, length): self._length = length def __len__(self): return self._length def __getitem__(self, index): if not 0 <= index <= self._length: raise IndexError return index**2 import bisect, sys squareslist = SquaresList(sys.maxsize) print bisect.bisect(squareslist, (sys.maxsize - 3)**2) |
|||
| msg148569 - (view) | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2011年11月29日 14:05 | |
Very good point, Mark. |
|||
| msg148574 - (view) | Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * (Python committer) | Date: 2011年11月29日 14:56 | |
[x]range is enough to trigger the bug:: bisect.bisect(range(sys.maxsize), sys.maxsize-3) |
|||
| msg148607 - (view) | Author: Raymond Hettinger (rhettinger) * (Python committer) | Date: 2011年11月29日 20:55 | |
The fix is trivial. I'll do it soonish. |
|||
| msg149239 - (view) | Author: Akira Li (akira) * | Date: 2011年12月11日 19:10 | |
Related bug in Java: http://googleresearch.blogspot.com/2006/06/extra-extra-read-all-about-it-nearly.html Do not consider any change as trivial: http://www.solipsys.co.uk/new/BinarySearchReconsidered.html (the author ran "binary search" coding challenge about 10 years ago http://www.solipsys.co.uk/b_search/ ) |
|||
| msg152152 - (view) | Author: Mark Dickinson (mark.dickinson) * (Python committer) | Date: 2012年01月28日 10:03 | |
Here's a patch. |
|||
| msg152392 - (view) | Author: Jesús Cea Avión (jcea) * (Python committer) | Date: 2012年01月31日 16:01 | |
I think the patch is *NOT* correct. Does it work with http://bugs.python.org/msg148567, in a 32 bits python? |
|||
| msg152400 - (view) | Author: Mark Dickinson (mark.dickinson) * (Python committer) | Date: 2012年01月31日 17:50 | |
> I think the patch is *NOT* correct. Can you elaborate? It works fine for that example on a 64-bit build; not sure why 32-bit would be any different. The idea is just that the single cast forces the addition to be done as an addition of integers of type size_t. Since the integers being added are (a) nonnegative and (b) both fit into a Py_ssize_t, the sum fits into a size_t without overflow, and the result of dividing the sum by 2 fits into a size_t. (This is all assuming that 2*Py_SSIZE_T_MAX fits into a size_t, but that's a fairly safe assumption in practice.) |
|||
| msg152401 - (view) | Author: Mark Dickinson (mark.dickinson) * (Python committer) | Date: 2012年01月31日 17:59 | |
> Does it work with http://bugs.python.org/msg148567, in a 32 bits python? Yes. |
|||
| msg152402 - (view) | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2012年01月31日 18:11 | |
Perhaps adding a comment before those two lines would make the reason for the cast clearer. |
|||
| msg155318 - (view) | Author: Mark Dickinson (mark.dickinson) * (Python committer) | Date: 2012年03月10日 15:29 | |
Updated patch, with comment. |
|||
| msg155328 - (view) | Author: Raymond Hettinger (rhettinger) * (Python committer) | Date: 2012年03月10日 17:53 | |
Thanks for the updated patch. |
|||
| msg155436 - (view) | Author: Benjamin Peterson (benjamin.peterson) * (Python committer) | Date: 2012年03月12日 03:44 | |
LGTM |
|||
| msg158340 - (view) | Author: Roundup Robot (python-dev) (Python triager) | Date: 2012年04月15日 15:43 | |
New changeset 35a3a7e0d66d by Mark Dickinson in branch '3.2': Issue 13496: Fix bisect.bisect overflow bug for large collections. http://hg.python.org/cpython/rev/35a3a7e0d66d New changeset 1a9252280f07 by Mark Dickinson in branch 'default': Issue #13496: Merge from 3.2 http://hg.python.org/cpython/rev/1a9252280f07 New changeset 709af2d0e862 by Mark Dickinson in branch '2.7': Issue 13496: Fix bisect.bisect overflow bug for large collections. http://hg.python.org/cpython/rev/709af2d0e862 |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022年04月11日 14:57:24 | admin | set | github: 57705 |
| 2012年04月15日 15:44:15 | mark.dickinson | set | status: open -> closed resolution: fixed |
| 2012年04月15日 15:43:46 | python-dev | set | nosy:
+ python-dev messages: + msg158340 |
| 2012年03月12日 03:44:06 | benjamin.peterson | set | nosy:
+ benjamin.peterson messages: + msg155436 |
| 2012年03月10日 17:53:52 | rhettinger | set | messages: + msg155328 |
| 2012年03月10日 15:29:29 | mark.dickinson | set | files:
+ issue13496_v2.patch messages: + msg155318 |
| 2012年01月31日 18:11:49 | pitrou | set | messages: + msg152402 |
| 2012年01月31日 17:59:14 | mark.dickinson | set | messages: + msg152401 |
| 2012年01月31日 17:50:57 | mark.dickinson | set | messages: + msg152400 |
| 2012年01月31日 16:01:07 | jcea | set | messages: + msg152392 |
| 2012年01月28日 20:26:50 | mark.dickinson | set | stage: patch review |
| 2012年01月28日 10:03:49 | mark.dickinson | set | files:
+ issue13496.patch keywords: + patch messages: + msg152152 |
| 2011年12月11日 19:10:15 | akira | set | nosy:
+ akira messages: + msg149239 |
| 2011年12月02日 18:51:07 | jcea | set | nosy:
+ jcea |
| 2011年11月29日 20:55:07 | rhettinger | set | priority: low -> normal messages: + msg148607 |
| 2011年11月29日 14:56:21 | amaury.forgeotdarc | set | nosy:
+ amaury.forgeotdarc messages: + msg148574 |
| 2011年11月29日 14:05:44 | pitrou | set | nosy:
+ pitrou messages: + msg148569 |
| 2011年11月29日 14:03:40 | mark.dickinson | set | messages: + msg148567 |
| 2011年11月29日 00:25:36 | Voo | set | messages: + msg148531 |
| 2011年11月28日 23:45:41 | tim.peters | set | nosy:
+ tim.peters messages: + msg148524 |
| 2011年11月28日 23:04:30 | rhettinger | set | priority: normal -> low assignee: rhettinger messages: + msg148520 |
| 2011年11月28日 22:37:01 | pitrou | set | nosy:
+ rhettinger, mark.dickinson versions: + Python 2.7, Python 3.2, Python 3.3, - Python 3.4 |
| 2011年11月28日 22:34:31 | Voo | create | |