Created on 2014-04-21 12:13 by Anton.Afanasyev, last changed 2022-04-11 14:58 by admin. This issue is now closed.
| Files | | |
|---|---|---|
| File name | Uploaded by | Date |
| issue21321_3.4_8c8315bac6a8.diff | Anton.Afanasyev | 2014-04-21 12:13 |
| issue21321_2.7_e3217efa6edd.diff | Anton.Afanasyev | 2014-04-21 12:41 |
| issue21321_3.4_8c8315bac6a8_2.diff | Anton.Afanasyev | 2014-04-22 16:52 |
| issue21321_2.7_e3217efa6edd_3.diff | Anton.Afanasyev | 2014-04-28 19:47 |
| issue21321_3.4_8c8315bac6a8_3.diff | Anton.Afanasyev | 2014-04-28 19:48 |
| issue21321_2.7_e3217efa6edd_4.diff | Anton.Afanasyev | 2014-04-29 06:27 |
| issue21321_3.4_8c8315bac6a8_4.diff | Anton.Afanasyev | 2014-04-29 06:27 |
| issue21321_3.4_8c8315bac6a8_5.diff | Anton.Afanasyev | 2014-04-29 09:55 |
| Messages (17) | |||
|---|---|---|---|
| msg216939 | Author: Anton Afanasyev (Anton.Afanasyev) * | Date: 2014-04-21 12:13 |
This issue results in redundant memory consumption, e.g. in this case:
================================================
from itertools import islice, repeat, tee

def test_islice():
    items, lookahead = tee(repeat(1, int(1e9)))
    lookahead = islice(lookahead, 10)
    for item in lookahead:
        pass
    for item in items:
        pass

if __name__ == "__main__":
    test_islice()
================================================
This demo is taken from a real case where one needs to look ahead in an input stream before processing it. On my PC this demo stops with a 'Segmentation fault' message after exhausting all memory, while running it with the patched Python consumes only 0.1% of memory until the end.
When one uses the pure-Python implementation of itertools.islice() (taken from the docs), the issue goes away as well, since that implementation doesn't store a redundant reference to the source iterator.
================================================
import sys  # needed for sys.maxint (Python 2)

def islice(iterable, *args):
    s = slice(*args)
    it = iter(xrange(s.start or 0, s.stop or sys.maxint, s.step or 1))
    nexti = next(it)
    for i, element in enumerate(iterable):
        if i == nexti:
            yield element
            nexti = next(it)
================================================
Attaching a patch for this issue. I had to change the '__reduce__()' method, since unpickling an exhausted 'islice()' object can no longer return the old source iterator. |
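The look-ahead use case described in this message can be sketched in Python 3 roughly as follows. This is only an illustration: the helper name `peek` and the small input are assumptions, not part of the attached patches.

```python
from itertools import islice, tee


def peek(stream, n):
    """Return (first n items, iterator over the whole stream).

    Hypothetical helper, for illustration only.
    """
    # tee() gives two independent branches over the same source.
    items, lookahead = tee(stream)
    # Consume at most n items from the look-ahead branch only.
    head = list(islice(lookahead, n))
    # The temporary islice and the local `lookahead` both go away when this
    # function returns, so only the `items` branch remains alive.
    return head, items


head, items = peek(iter(range(5)), 3)
print(head)         # the first three items
print(list(items))  # the full stream is still available
```

In the demo from the report, the exhausted islice object stays alive in the `lookahead` variable, so whether it still references its tee branch decides whether tee has to keep buffering for that branch.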
|||
| msg216940 | Author: Anton Afanasyev (Anton.Afanasyev) * | Date: 2014-04-21 12:41 |
Added patch for 2.7 version (no need to change '__reduce__()' method since it's not implemented). |
|||
| msg216992 | Author: Raymond Hettinger (rhettinger) * (Python committer) | Date: 2014-04-22 06:58 |
The ref-counts in the islice_reduce code don't look to be correct at first glance. |
|||
| msg217014 | Author: Anton Afanasyev (Anton.Afanasyev) * | Date: 2014-04-22 16:52 |
Hi Raymond, do you mean that the handling of allocation errors should be more careful? Attaching a fixed version for the 3.4 branch. |
|||
| msg217180 | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2014-04-26 01:55 |
Haven't reviewed the patch, but you should definitely add a unit test for the bugfix. |
|||
| msg217407 | Author: Anton Afanasyev (Anton.Afanasyev) * | Date: 2014-04-28 19:47 |
Hi Antoine, I have not found a way to check resource usage in the test infrastructure, and I don't think it could be done reliably. The only method I found to test the issue is straightforward: just check that the source iterator is no longer referenced from itertools.islice() after the latter has been exhausted:
================================================
a = [random.random() for i in range(10)]
before = sys.getrefcount(a)
b = islice(a, 5)
for i in b:
    pass
after = sys.getrefcount(a)
self.assertEqual(before, after)
================================================
Attaching the "issue21321_2.7_e3217efa6edd_3.diff" and "issue21321_3.4_8c8315bac6a8_3.diff" patches with this test included in "Lib/test/test_itertools.py". |
|||
| msg217410 | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2014-04-28 19:54 |
Anton, the test is wrong: it is taking a reference to the iterable object (the list), not the iterator. To check that the reference to the original iterator is released, something like this would work:
>>> import itertools, weakref
>>> it = (x for x in (1, 2))
>>> wr = weakref.ref(it)
>>> it = itertools.islice(it, 1)
>>> wr() is None
False
>>> list(it)
[1]
>>> wr() is None  # returns True with the patch, False without
True
|||
| msg217411 | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2014-04-28 19:55 |
(note I haven't looked at the C part of the patch) |
|||
| msg217474 | Author: Anton Afanasyev (Anton.Afanasyev) * | Date: 2014-04-29 06:27 |
Hi Antoine, my test works for me. It does not matter whether one uses "a = [1, 2, 3]" or "a = iter([1, 2, 3])": either object gains one extra reference after "b = islice(a, 1)". My test failed without the patch and passed with it. But your test is more straightforward, thanks. Attaching patches with your test. |
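One way to see why the list-based check also reacts to the patch: islice() calls iter() on the list, and the resulting list iterator is what holds the extra reference to the list. The sketch below assumes a CPython build that includes the fix; the function name is hypothetical, and exact sys.getrefcount() values are an implementation detail, so only differences are compared.

```python
import sys
from itertools import islice


def refcount_deltas_around_islice():
    """Return (delta while the slice is live, delta after exhaustion).

    Hypothetical helper; relies on CPython reference counting.
    """
    data = list(range(10))
    before = sys.getrefcount(data)
    # islice() stores iter(data); that list iterator references `data`.
    sliced = islice(data, 5)
    during = sys.getrefcount(data)
    # Exhaust the slice (the underlying list iterator is NOT exhausted,
    # since only 5 of the 10 items are requested).
    for _ in sliced:
        pass
    after = sys.getrefcount(data)
    return during - before, after - before


print(refcount_deltas_around_islice())
```

With the fix, the second value is expected to be zero even though `sliced` is still alive, because the exhausted islice has dropped its list iterator.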
|||
| msg217495 | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2014-04-29 08:39 |
Thanks. Could you also add a test for the islice_reduce additions? Or is it already tested? I suspect there's a reference leak there: after calling PyObject_GetIter, you should always Py_DECREF(empty_list). Also, with the "O" code, Py_BuildValue will take a new reference to empty_it, so you should use the "N" code instead. |
|||
| msg217503 | Author: Anton Afanasyev (Anton.Afanasyev) * | Date: 2014-04-29 09:55 |
Hi Antoine, oops you are right about leaks: fixed them in new attached patch. As for testing changes in "reduce()": they are already covered by "self.pickletest(islice(range(100), *args))". Function "pickletest()" covers case for pickle dumping/loading of exhausted iterator. |
|||
| msg217504 | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2014-04-29 10:12 |
For the record, checks such as "self.assertEqual(wr() is None, False)" are better written as "self.assertIsNotNone(wr())". No need to upload a new patch, I'm gonna make the change while committing :-) |
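Put together, the weakref check from this thread with the suggested assertion style might look like the sketch below. The class and method names are assumptions, and this is not the test that was committed to Lib/test/test_itertools.py; it also relies on CPython releasing objects as soon as their last reference goes away.

```python
import unittest
import weakref
from itertools import islice


class IsliceReleaseTest(unittest.TestCase):
    """Hypothetical test case, for illustration only."""

    def test_source_released_when_exhausted(self):
        # A generator is used because it supports weak references.
        source = (x for x in (1, 2))
        ref = weakref.ref(source)
        sliced = islice(source, 1)
        del source  # from here on, only `sliced` keeps the generator alive
        self.assertIsNotNone(ref())
        # Exhaust the slice; with the fix it drops its source iterator.
        self.assertEqual(list(sliced), [1])
        self.assertIsNone(ref())
```

Saved in a module, this can be run with `python -m unittest`.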
|||
| msg217505 | Author: Roundup Robot (python-dev) (Python triager) | Date: 2014-04-29 10:14 |
New changeset b795105db23a by Antoine Pitrou in branch '3.4': Issue #21321: itertools.islice() now releases the reference to the source iterator when the slice is exhausted. http://hg.python.org/cpython/rev/b795105db23a
New changeset a627b3e3c9c8 by Antoine Pitrou in branch 'default': Issue #21321: itertools.islice() now releases the reference to the source iterator when the slice is exhausted. http://hg.python.org/cpython/rev/a627b3e3c9c8 |
|||
| msg217506 | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2014-04-29 10:15 |
Patch committed, thank you! If you want to provide a patch for 2.7, please say so, otherwise I'll close the issue. |
|||
| msg217508 | Author: Anton Afanasyev (Anton.Afanasyev) * | Date: 2014-04-29 10:21 |
Antoine, not sure about 2.7. The issue first arose for me on Python 2.7, so I would prefer that the "issue21321_2.7_e3217efa6edd_4.diff" patch be applied. |
|||
| msg217509 | Author: Roundup Robot (python-dev) (Python triager) | Date: 2014-04-29 10:26 |
New changeset 8ee76e1b5aa6 by Antoine Pitrou in branch '2.7': Issue #21321: itertools.islice() now releases the reference to the source iterator when the slice is exhausted. http://hg.python.org/cpython/rev/8ee76e1b5aa6 |
|||
| msg217510 | Author: Antoine Pitrou (pitrou) * (Python committer) | Date: 2014-04-29 10:27 |
Ok, then I've committed to 2.7 too. Thank you very much for contributing! |
|||
| History | | | |
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:58:02 | admin | set | github: 65520 |
| 2014-04-29 10:27:27 | pitrou | set | status: open -> closed; messages: + msg217510 |
| 2014-04-29 10:26:59 | python-dev | set | messages: + msg217509 |
| 2014-04-29 10:21:38 | Anton.Afanasyev | set | messages: + msg217508 |
| 2014-04-29 10:15:48 | pitrou | set | resolution: fixed; messages: + msg217506; stage: resolved |
| 2014-04-29 10:14:59 | python-dev | set | nosy: + python-dev; messages: + msg217505 |
| 2014-04-29 10:12:04 | pitrou | set | messages: + msg217504 |
| 2014-04-29 09:55:51 | Anton.Afanasyev | set | files: + issue21321_3.4_8c8315bac6a8_5.diff; messages: + msg217503 |
| 2014-04-29 08:39:22 | pitrou | set | messages: + msg217495 |
| 2014-04-29 06:27:51 | Anton.Afanasyev | set | files: + issue21321_3.4_8c8315bac6a8_4.diff |
| 2014-04-29 06:27:22 | Anton.Afanasyev | set | files: + issue21321_2.7_e3217efa6edd_4.diff |
| 2014-04-29 06:27:03 | Anton.Afanasyev | set | messages: + msg217474 |
| 2014-04-28 19:55:20 | pitrou | set | messages: + msg217411 |
| 2014-04-28 19:54:56 | pitrou | set | messages: + msg217410 |
| 2014-04-28 19:48:34 | Anton.Afanasyev | set | files: + issue21321_3.4_8c8315bac6a8_3.diff |
| 2014-04-28 19:47:57 | Anton.Afanasyev | set | files: + issue21321_2.7_e3217efa6edd_3.diff; messages: + msg217407 |
| 2014-04-26 01:55:17 | pitrou | set | messages: + msg217180 |
| 2014-04-25 22:25:51 | terry.reedy | set | versions: + Python 3.5, - Python 3.1, Python 3.2, Python 3.3 |
| 2014-04-23 00:04:36 | pitrou | set | nosy: + pitrou |
| 2014-04-22 16:52:50 | Anton.Afanasyev | set | files: + issue21321_3.4_8c8315bac6a8_2.diff; messages: + msg217014 |
| 2014-04-22 06:58:12 | rhettinger | set | assignee: rhettinger; messages: + msg216992 |
| 2014-04-21 12:41:36 | Anton.Afanasyev | set | files: + issue21321_2.7_e3217efa6edd.diff; messages: + msg216940 |
| 2014-04-21 12:13:50 | Anton.Afanasyev | create | |