This issue tracker has been migrated to GitHub ,
and is currently read-only.
For more information,
see the GitHub FAQs in the Python's Developer Guide.
Created on 2014年11月25日 10:56 by hakril, last changed 2022年04月11日 14:58 by admin. This issue is now closed.
| Files | ||||
|---|---|---|---|---|
| File name | Uploaded | Description | Edit | |
| issue22939.patch | hakril, 2014年12月06日 20:08 | First try for the patch | review | |
| issue22939v2.patch | hakril, 2014年12月07日 21:26 | Second try for the patch | review | |
| Messages (24) | |||
|---|---|---|---|
| msg231651 - (view) | Author: Clement Rouault (hakril) * | Date: 2014年11月25日 10:56 | |
I found an interger overflow in the standard iterator object (Objects/iterobject.c) The `it_index` attribute used by the iterator is a `Py_ssize_t` but overflow is never checked. So after the index `PY_SSIZE_T_MAX`, the iterator object will ask for the index `PY_SSIZE_T_MIN`. Here is an example: import sys class Seq: def __getitem__(self, item): print("[-] Asked for item at <{0}>".format(item)) return 0 s = Seq() i = iter(s) # Manually set `it_index` to PY_SSIZE_T_MAX without a loop i.__setstate__(sys.maxsize) next(i) [-] Asked for item at <9223372036854775807> next(i) [-] Asked for item at <-9223372036854775808> I would be really interested in writing a patch but first I wanted to discuss the expected behaviour and fix. The iterator could stop after `PY_SSIZE_T_MAX` but it seems strange as other iterator (like `enumobject`) handle values bigger than `PY_SSIZE_T_MAX`. Or the same technique used in `enumobject` could be used: adding a field `PyObject* en_longindex` (a python long) which would be used for values bigger than `PY_SSIZE_T_MAX` |
|||
| msg231659 - (view) | Author: R. David Murray (r.david.murray) * (Python committer) | Date: 2014年11月25日 13:51 | |
For possibly relevant background information, see issue 21444 and the issues linked from it, and issue 14794. |
|||
| msg231660 - (view) | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2014年11月25日 14:23 | |
> The `it_index` attribute used by the iterator is a `Py_ssize_t` but overflow is never checked. Yes it is a bug. iter_iternext() must raises an OverflowError if it->it_index is equal to PY_SSIZE_T_MAX. |
|||
| msg231664 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2014年11月25日 15:22 | |
I think OverflowError is good for maintained releases, but for 3.5 Clement's idea with long index looks attractive to me. In any case an exception should be raised for negative argument in __setstate__(). Let split this issue on two parts. First fix the bug by raising exceptions, and then add long index (if anyone will need it). |
|||
| msg231669 - (view) | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2014年11月25日 16:40 | |
> I think OverflowError is good for maintained releases, but for 3.5 Clement's idea with long index looks attractive to me. What is your use case? |
|||
| msg231676 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2014年11月25日 18:21 | |
> What is your use case? Something like range(). I agree that it is very unlike to encounter this problem in real work, and we can live with implementation-related limitation (for which OverflowError exists). |
|||
| msg232250 - (view) | Author: Clement Rouault (hakril) * | Date: 2014年12月06日 20:08 | |
Here is a first try for a patch. There are two points I am not sure about: 1) The message for the OverflowError: is that explicit enough ? 2) The behaviour of the iterator after the raise of OverflowError. With this patch every call to `next(it)` where `it` have overflowed will raise `OverflowError` again. Does this behaviour seems correct our should it raise StopIteration after the first OverflowError ? |
|||
| msg232267 - (view) | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2014年12月07日 08:44 | |
You should not rely on undefined behaviour: check if the index is greater or equal than PY_SSIZET_MAX - 1. __setstate__ must implement the same check. You must always raise an error, not hide it in a second call to next (raise StopIteration). Your unit test must check this behaviour. |
|||
| msg232269 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2014年12月07日 09:20 | |
> check if the index is greater or equal than PY_SSIZET_MAX - 1. Just PY_SSIZET_MAX. Added other comments on Rietveld (look the email in the Spam folder). |
|||
| msg232282 - (view) | Author: Clement Rouault (hakril) * | Date: 2014年12月07日 21:26 | |
Thanks for the comments. I corrected the patch and updated the test. I also added a test that check the behavior of setstate with negative indice. Just one question: > __setstate__ must implement the same check. Why should __setstate__ check for PY_SSIZE_T_MAX (throwing OverflowError when unpickling) if the check will be done when calling next on the resulting iterator ? |
|||
| msg232284 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2014年12月07日 21:31 | |
__setstate__ should check that an index is not negative. All values from 0 to PY_SSIZE_T_MAX are acceptable. |
|||
| msg232285 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2014年12月07日 21:39 | |
Ah, __setstate__ already handles negative indices. Then the patch LGTM. |
|||
| msg232362 - (view) | Author: Raymond Hettinger (rhettinger) * (Python committer) | Date: 2014年12月09日 10:32 | |
I would think that the PY_SSIZE_T_MAX check belongs inside the:
if (result != NULL) {
it->it_index++;
return result;
}
just before the increment which could cause the overflow. Also, PY_SSIZE_T_MAX is a valid value to pass to PySequence_GetItem(), so it shouldn't be blocked unless necessary.
|
|||
| msg232365 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2014年12月09日 10:41 | |
This doesn't matter because next() will raise an exception if it_index == PY_SSIZE_T_MAX in any case. The code should be changed much more to allow yielding an item with index PY_SSIZE_T_MAX, use other (negative) signal value and change the behavior of __setstate__ for negative values. |
|||
| msg232367 - (view) | Author: Clement Rouault (hakril) * | Date: 2014年12月09日 10:46 | |
> Also, PY_SSIZE_T_MAX is a valid value to pass to PySequence_GetItem(), so it shouldn't be blocked unless necessary.
I agree with you, that's why my first path was checking at the next call if it->it_index had overflowed. But then it relies on undefined behaviour.
> I would think that the PY_SSIZE_T_MAX check belongs inside the:
>
> if (result != NULL) {
> it->it_index++;
> return result;
> }
If we raise the OverflowError when it->it_index really overflow (just after the getitem PY_SSIZE_T_MAX).
Is it really necessary to do the overflow check after the GetItem ? because the value returned by `PySequence_GetItem(seq, PY_SSIZE_T_MAX);` will be never used.
|
|||
| msg232368 - (view) | Author: STINNER Victor (vstinner) * (Python committer) | Date: 2014年12月09日 11:22 | |
I prefer to raise an OverflowError *before* calling PySequence_GetItem(). The call to PySequence_GetItem() may be expensive, and we have to drop the result if an OverflowError is raised after the call. At the end, the behaviour is the same: an OverflowError is raised. |
|||
| msg232385 - (view) | Author: Raymond Hettinger (rhettinger) * (Python committer) | Date: 2014年12月09日 16:03 | |
> The call to PySequence_GetItem() may be expensive, and we have to drop > the result if an OverflowError is raised after the call. You do realize that this error will be very rare and therefore inconsequential. |
|||
| msg233021 - (view) | Author: Clement Rouault (hakril) * | Date: 2014年12月22日 15:36 | |
> > The call to PySequence_GetItem() may be expensive, and we have to drop > > the result if an OverflowError is raised after the call. > You do realize that this error will be very rare and therefore inconsequential. The real question is: why would you call the iterator for a new value if it will be discarded anyway ? I think it could be very misleading to see _getitem__ being called and have an OverflowError being raised afterward. |
|||
| msg233203 - (view) | Author: Ronald Oussoren (ronaldoussoren) * (Python committer) | Date: 2014年12月30日 11:50 | |
Is it necessary to raise when it_index is PY_SSIZE_T_MAX? An alternative is to set it_index to -1 when there would be overflow and raise an exception on the next call to next(). That way a virtual sequence with PY_SSIZE_T_MAX-1 items would still work (instead of failing unexpectedly). |
|||
| msg234233 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2015年01月18日 09:56 | |
> That way a virtual sequence with PY_SSIZE_T_MAX-1 items would still work (instead of failing unexpectedly). Actually with PY_SSIZE_T_MAX+1 items (indices from 0 to PY_SSIZE_T_MAX inclusive). If Raymond insists I can write more complicated patch, but I don't think that we should complicate the code for this pretty hypotetical case. I'm for committing issue22939v2.patch. |
|||
| msg242960 - (view) | Author: Clement Rouault (hakril) * | Date: 2015年05月12日 11:02 | |
After a few months, I am back to you on this issue. What should be the next step of the process ? |
|||
| msg243568 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2015年05月19日 09:17 | |
If Raymond will not stop me, I'll commit the patch tomorrow. |
|||
| msg243769 - (view) | Author: Roundup Robot (python-dev) (Python triager) | Date: 2015年05月21日 17:55 | |
New changeset d6179accca20 by Serhiy Storchaka in branch '2.7': Fixed issue number for issue #22939. https://hg.python.org/cpython/rev/d6179accca20 New changeset 7fa2f4afcf5a by Serhiy Storchaka in branch '3.4': Fixed issue number for issue #22939. https://hg.python.org/cpython/rev/7fa2f4afcf5a New changeset f23533fa6afa by Serhiy Storchaka in branch 'default': Fixed issue number for issue #22939. https://hg.python.org/cpython/rev/f23533fa6afa |
|||
| msg243771 - (view) | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2015年05月21日 18:00 | |
Thank you for your contribution Clement. |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022年04月11日 14:58:10 | admin | set | github: 67128 |
| 2015年05月21日 18:00:05 | serhiy.storchaka | set | status: open -> closed resolution: fixed messages: + msg243771 stage: commit review -> resolved |
| 2015年05月21日 17:55:28 | python-dev | set | nosy:
+ python-dev messages: + msg243769 |
| 2015年05月19日 09:17:32 | serhiy.storchaka | set | assignee: serhiy.storchaka messages: + msg243568 |
| 2015年05月16日 15:28:30 | serhiy.storchaka | set | versions: + Python 2.7, Python 3.4 |
| 2015年05月12日 11:02:38 | hakril | set | status: pending -> open messages: + msg242960 |
| 2015年01月18日 09:56:24 | serhiy.storchaka | set | status: open -> pending messages: + msg234233 |
| 2014年12月30日 11:50:39 | ronaldoussoren | set | nosy:
+ ronaldoussoren messages: + msg233203 |
| 2014年12月22日 15:36:13 | hakril | set | messages: + msg233021 |
| 2014年12月09日 16:03:24 | rhettinger | set | messages: + msg232385 |
| 2014年12月09日 11:22:02 | vstinner | set | messages: + msg232368 |
| 2014年12月09日 10:46:39 | hakril | set | messages: + msg232367 |
| 2014年12月09日 10:41:19 | serhiy.storchaka | set | messages: + msg232365 |
| 2014年12月09日 10:32:32 | rhettinger | set | messages: + msg232362 |
| 2014年12月07日 21:39:01 | serhiy.storchaka | set | messages:
+ msg232285 stage: needs patch -> commit review |
| 2014年12月07日 21:31:54 | serhiy.storchaka | set | messages: + msg232284 |
| 2014年12月07日 21:26:59 | hakril | set | files:
+ issue22939v2.patch messages: + msg232282 |
| 2014年12月07日 09:20:21 | serhiy.storchaka | set | messages: + msg232269 |
| 2014年12月07日 08:44:12 | vstinner | set | messages: + msg232267 |
| 2014年12月06日 20:08:16 | hakril | set | files:
+ issue22939.patch keywords: + patch messages: + msg232250 |
| 2014年11月25日 18:21:29 | serhiy.storchaka | set | messages: + msg231676 |
| 2014年11月25日 16:40:32 | vstinner | set | messages: + msg231669 |
| 2014年11月25日 15:22:08 | serhiy.storchaka | set | nosy:
+ mark.dickinson messages: + msg231664 |
| 2014年11月25日 14:23:33 | vstinner | set | nosy:
+ vstinner messages: + msg231660 |
| 2014年11月25日 13:51:23 | r.david.murray | set | nosy:
+ r.david.murray messages: + msg231659 |
| 2014年11月25日 13:14:40 | serhiy.storchaka | set | nosy:
+ rhettinger, serhiy.storchaka stage: needs patch |
| 2014年11月25日 10:56:33 | hakril | create | |