This issue tracker has been migrated to GitHub ,
and is currently read-only.
For more information,
see the GitHub FAQs in the Python's Developer Guide.
Created on 2020年03月02日 15:36 by kimiguel, last changed 2022年04月11日 14:59 by admin. This issue is now closed.
| Pull Requests | |||
|---|---|---|---|
| URL | Status | Linked | Edit |
| PR 31816 | merged | Crowthebird, 2022年03月11日 09:37 | |
| Messages (13) | |||
|---|---|---|---|
| msg363186 - (view) | Author: Kim-Adeline Miguel (kimiguel) | Date: 2020年03月02日 15:36 | |
(See #33234) Recently we added Python 3.8 to our CI test matrix, and we noticed a possible backward incompatibility with the list() constructor. We found that __len__ is getting called twice, while before 3.8 it was only called once. Here's an example: class Foo: def __iter__(self): print("iter") return iter([3, 5, 42, 69]) def __len__(self): print("len") return 4 Calling list(Foo()) using Python 3.7 prints: iter len But calling list(Foo()) using Python 3.8 prints: len iter len It looks like this behaviour was introduced for #33234 with PR GH-9846. We realize that this was merged a while back, but at least we wanted to make the team aware of this change in behaviour. |
|||
| msg363188 - (view) | Author: Pablo Galindo Salgado (pablogsal) * (Python committer) | Date: 2020年03月02日 15:44 | |
Why should that be backwards incompatible? The number of times we can `__len__` on the constructor is an implementation detail. The reason is called now twice is because there is an extra check for the preallocation logic, which is detached from the logic of the subsequent list_extend(self, iterable). On the other hand, there may be a chance for optimization here, but on a very rough first plan, that may require coupling some logic (passing down the calculated length to list_extend() or some helper, which I am not very fond of. |
|||
| msg363588 - (view) | Author: Terry J. Reedy (terry.reedy) * (Python committer) | Date: 2020年03月07日 10:23 | |
The only specification is that len(ob) calls ob.__len__ and that ob.__len__ should return an 'integer >= 0'. (Adding side effects goes beyond that spec.) I agree that a detectable internal in list is not a bug. Unless there is a realistic performance enhancement in caching the result of the first call, this issue should be closed. |
|||
| msg363745 - (view) | Author: Eric Snow (eric.snow) * (Python committer) | Date: 2020年03月09日 16:09 | |
FWIW, I encouraged Kim to file this. Thanks Kim! While it isn't part of any specification, it is an unexpected change in behavior that led to some test failures. So I figured it would be worth bringing up. :) I did find it surprising that we were not caching the result, but don't think that's necessarily a problem. All that said, the change did not actually break anything other than some tests (not the code they were testing). So I don't have a problem with closing this. |
|||
| msg363746 - (view) | Author: Pablo Galindo Salgado (pablogsal) * (Python committer) | Date: 2020年03月09日 16:12 | |
Thanks Kim and Eric! I think it still makes sense to do some quick benchmarking and research on passing down the calculated length. I can try to produce a draft PR so we can discuss with something more tangible. |
|||
| msg363747 - (view) | Author: Eric Snow (eric.snow) * (Python committer) | Date: 2020年03月09日 16:20 | |
I'm not opposed. :) I just don't want to impose on your time. |
|||
| msg414724 - (view) | Author: Matt Wozniski (godlygeek) * | Date: 2022年03月08日 05:17 | |
Pardon me for necroing an old issue, but someone pointed out the surprising behavior of `__len__` being called twice by `list(iterable)`, and it caught my curiosity. https://github.com/python/cpython/commit/372d705d958964289d762953d0a61622755f5386 made it so that `list.__init__(iterable)` calls `iterable.__len__()` before calling `list.extend()`, to preallocate exactly the right amount of space, rather than allowing `list.extend()` to grow the array. That's because `list.extend()` can over-allocate. What if instead, we made it so that `list.extend(iterable)` doesn't over-allocate when called on an empty list? In the two places where `list_extend` calls `list_resize` to grow the array, we'd check if `self->ob_item == NULL` and if so call `list_preallocate_exact` instead, and we'd remove the call to `list_preallocate_exact` from `list___init___impl`. It seems like that ought to achieve the same goal as making `__init__` call preallocate exactly, without requiring the extra call to `__len__`. |
|||
| msg414727 - (view) | Author: Dennis Sweeney (Dennis Sweeney) * (Python committer) | Date: 2022年03月08日 06:02 | |
Related to Matt's idea is https://bugs.python.org/issue43574 |
|||
| msg414840 - (view) | Author: Jeremiah Gabriel Pascual (Crowthebird) * | Date: 2022年03月10日 11:05 | |
Matt's idea leads to some speedups when implemented correctly (pardon me but I have no idea how to use pyperf):
list({}): Mean +- std dev: [orig] 109 ns +- 1 ns -> [modif] 103 ns +- 1 ns: 1.06x faster
list({1: 2}): Mean +- std dev: [orig] 125 ns +- 1 ns -> [modif] 118 ns +- 1 ns: 1.05x faster
list({(1, 2, 3): 4}): Mean +- std dev: [orig] 125 ns +- 1 ns -> [modif] 118 ns +- 1 ns: 1.05x faster
list((3, 3, 4)): Mean +- std dev: [orig] 89.2 ns +- 4.5 ns -> [modif] 82.9 ns +- 4.6 ns: 1.08x faster
list(()): Mean +- std dev: [orig] 70.1 ns +- 0.8 ns -> [modif] 65.5 ns +- 0.8 ns: 1.07x faster
list({0, 1, 2, ...}): Mean +- std dev: [orig] 74.7 us +- 3.6 us -> [modif] 67.6 us +- 1.6 us: 1.11x faster
list({9, 3}): Mean +- std dev: [orig] 131 ns +- 2 ns -> [modif] 126 ns +- 4 ns: 1.04x faster
list(set()): Mean +- std dev: [orig] 115 ns +- 6 ns -> [modif] 110 ns +- 2 ns: 1.05x faster
list([]): Mean +- std dev: [orig] 73.2 ns +- 5.5 ns -> [modif] 67.8 ns +- 3.4 ns: 1.08x faster
list([1, 2, 1, 1]): Mean +- std dev: [orig] 93.5 ns +- 9.8 ns -> [modif] 87.9 ns +- 8.6 ns: 1.06x faster
list([1, 2, 1, 2, 1, 2]): Mean +- std dev: [orig] 93.0 ns +- 3.1 ns -> [modif] 87.0 ns +- 2.7 ns: 1.07x faster
Benchmark hidden because not significant (3): list({0: 0, 1: ...}), list((4, 5, 1, ...)), list([4, 1, 3, ...])
Geometric mean: 1.05x faster
Changes compared here: https://github.com/python/cpython/compare/main...thatbirdguythatuknownot:patch-17
|
|||
| msg414889 - (view) | Author: Inada Naoki (methane) * (Python committer) | Date: 2022年03月11日 05:54 | |
> Changes compared here: https://github.com/python/cpython/compare/main...thatbirdguythatuknownot:patch-17 Looks good to me. Would you create a pull request? |
|||
| msg415026 - (view) | Author: Jeremiah Gabriel Pascual (Crowthebird) * | Date: 2022年03月13日 05:11 | |
> Looks good to me. Would you create a pull request? Created a pull request (31816). |
|||
| msg415111 - (view) | Author: Inada Naoki (methane) * (Python committer) | Date: 2022年03月14日 01:24 | |
New changeset 2153daf0a02a598ed5df93f2f224c1ab2a2cca0d by Crowthebird in branch 'main': bpo-39829: Fix `__len__()` is called twice in list() constructor (GH-31816) https://github.com/python/cpython/commit/2153daf0a02a598ed5df93f2f224c1ab2a2cca0d |
|||
| msg415112 - (view) | Author: Inada Naoki (methane) * (Python committer) | Date: 2022年03月14日 01:24 | |
Thanks. |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022年04月11日 14:59:27 | admin | set | github: 84010 |
| 2022年03月14日 01:24:38 | methane | set | messages: + msg415112 |
| 2022年03月14日 01:24:29 | methane | set | status: open -> closed resolution: fixed stage: patch review -> resolved |
| 2022年03月14日 01:24:06 | methane | set | messages: + msg415111 |
| 2022年03月13日 23:42:35 | iritkatriel | set | versions: - Python 3.8 |
| 2022年03月13日 23:35:48 | Crowthebird | set | versions: + Python 3.10, Python 3.11 |
| 2022年03月13日 05:11:02 | Crowthebird | set | messages: + msg415026 |
| 2022年03月11日 09:37:10 | Crowthebird | set | keywords:
+ patch stage: patch review pull_requests: + pull_request29914 |
| 2022年03月11日 05:54:15 | methane | set | messages: + msg414889 |
| 2022年03月10日 11:05:28 | Crowthebird | set | nosy:
+ Crowthebird messages: + msg414840 |
| 2022年03月08日 06:36:11 | methane | set | nosy:
+ methane |
| 2022年03月08日 06:02:46 | Dennis Sweeney | set | nosy:
+ Dennis Sweeney messages: + msg414727 |
| 2022年03月08日 05:17:04 | godlygeek | set | nosy:
+ godlygeek messages: + msg414724 |
| 2020年03月09日 16:20:31 | eric.snow | set | status: closed -> open messages: + msg363747 assignee: pablogsal resolution: not a bug -> (no value) stage: resolved -> (no value) |
| 2020年03月09日 16:12:29 | pablogsal | set | messages: + msg363746 |
| 2020年03月09日 16:09:26 | eric.snow | set | status: open -> closed resolution: not a bug messages: + msg363745 stage: resolved |
| 2020年03月07日 10:23:44 | terry.reedy | set | type: behavior -> performance messages: + msg363588 nosy: + terry.reedy |
| 2020年03月02日 15:44:00 | pablogsal | set | messages: + msg363188 |
| 2020年03月02日 15:36:16 | kimiguel | create | |