Created on 2011-12-16 00:02 by ron_adam, last changed 2022-04-11 14:57 by admin. This issue is now closed.
| Files | | |
|---|---|---|
| File name | Uploaded | Description |
| f_why2.diff | ron_adam, 2011-12-24 06:40 | Clean up generator.c and ceval.c with better use of why codes. |
| exit_code_example.c | Mark.Shannon, 2012-02-13 14:42 | Example code for returning exit code directly |
| Messages (10) | |||
|---|---|---|---|
| msg149583 | Author: Ron Adam (ron_adam) | Date: 2011-12-16 00:02 | |
The following changes clean up the eval loop and result in a pretty solid 2 to 3% improvement in pybench for me. And it is about 5% faster for long generators.
* Changed the why enum type to int and #defines, and moved the why defines to opcode.h so that they can be seen by the generator objects after a yield, return, or exception.
* Added an "int f_why" to frames so the generator can see why it returned from a send.
* Refactored the generator object so it can use "f->f_why" to determine what to do without having to do several checks first.
* Moved the generator-specific exception save/swap/clear out of the ceval main loop. No need to check for those on every function call.
The only test that fails is the frame size check in test_sys. I left that in for now so someone could check that, and tell me if it's ok to fix it, or if I need to do something else.
I also considered putting the why on the tstate object. It might save some memory as there wouldn't be as many of those.
|
|||
| msg149591 | Author: Ron Adam (ron_adam) | Date: 2011-12-16 03:48 | |
A simple test to show the difference.
BEFORE:
$ python3 -mtimeit "def y(n):" "  for x in range(n):" "    yield x" "sum(y(10))"
100000 loops, best of 3: 3.87 usec per loop
$ python3 -mtimeit "def y(n):" "  for x in range(n):" "    yield x" "sum(y(1000000))"
10 loops, best of 3: 186 msec per loop
AFTER:
$ ./python -mtimeit "def y(n):" "  for x in range(n):" "    yield x" "sum(y(10))"
100000 loops, best of 3: 3.81 usec per loop
$ ./python -mtimeit "def y(n):" "  for x in range(n):" "    yield x" "sum(y(1000000))"
10 loops, best of 3: 168 msec per loop
Summary (before vs. after):
y(10): 3.87 usec vs. 3.81 usec (-1.55%)
y(1000000): 186 msec vs. 168 msec (-9.67%)
|
|||
| msg149650 | Author: Benjamin Peterson (benjamin.peterson) (Python committer) | Date: 2011-12-17 00:31 | |
Seems mostly fine to me. |
|||
| msg149821 | Author: Ron Adam (ron_adam) | Date: 2011-12-19 06:04 | |
New diff file. The main difference is I moved the saved why value to the tstate object instead of the frame object as why_exit. I'm not seeing the time savings now for some reason. Maybe the previous increase was a case of coincidental noise. (?) Still looking for other reasons though. ;-) Moving the generator code from the eval loop to the generator object may still be a good thing to do. |
|||
| msg149953 | Author: Meador Inge (meador.inge) (Python committer) | Date: 2011-12-21 02:10 | |
With the new patch I see no benefits on the same micro-benchmarks you posted (it is even slower for the smaller case) on a quad-core 64-bit F15 box:
BEFORE:
$ ./python -mtimeit "def y(n):" "  for x in range(n):" "    yield x" "sum(y(10))"
1000000 loops, best of 3: 1.33 usec per loop
$ ./python -mtimeit "def y(n):" "  for x in range(n):" "    yield x" "sum(y(1000000))"
10 loops, best of 3: 66 msec per loop
AFTER:
$ ./python -mtimeit "def y(n):" "  for x in range(n):" "    yield x" "sum(y(10))"
1000000 loops, best of 3: 1.45 usec per loop
$ ./python -mtimeit "def y(n):" "  for x in range(n):" "    yield x" "sum(y(1000000))"
10 loops, best of 3: 65.8 msec per loop
|
|||
| msg149954 | Author: Alyssa Coghlan (ncoghlan) (Python committer) | Date: 2011-12-21 03:10 | |
The thing that most appeals to me with this concept is moving closer to making it possible to experiment with generator-style functionality in *extension* modules (albeit extension modules that are coupled to private CPython APIs). So, for me, "not worse than the status quo" would be the main thing I'd be looking for out of any micro-benchmarks. However, that also makes me question the movement of the "why_exit" from the frame to the tstate - having it on the thread state is significantly less flexible when it comes to experimenting with execution models. The move from orthogonal bit flags in a dedicated enum to int fields and macro definitions also seems like a completely unnecessary pessimisation. |
|||
| msg150026 | Author: Ron Adam (ron_adam) | Date: 2011-12-21 17:59 | |
I think the time benefits I saw are dependent on how the C code is compiled. So it may be different on different compilers or the same compiler with only a very minor change.
Some of the things I've noticed...
A switch is sometimes slower if it has a "default:" block.
Moving "why = tstate->why_exit;" to just before the if helps a tiny bit:
    why = tstate->why_exit;
    if (why != WHY_EXCEPTION) {
Returning directly out of the WHY_YIELD case:
    case WHY_YIELD:
        return result;
These will be mostly compiler dependent, but they won't slow things down any either.
What I was trying to do is clean up things a bit in ceval.c. Having the why available on the frame can help by allowing some things to be moved out of ceval.c where it makes sense to do that.
I'll post a new diff tonight with the why_exit moved back to the frame object and the why's back to enums. Yes, I think the frame object is a good place for it.
One of the odd issues is the opcode and why values sometimes need to be pushed on the value stack. Which means it needs to be converted to a pyobject first, then converted back after it's pulled off the stack. I'm looking for a way to avoid that.
I was also able to make the fast_block_end section simpler and more understandable without making it slower. I think it was trying to be too clever in order to save some lines of code. That makes it very hard for someone else to figure out.
But first I need to finish my Christmas shopping, I'll post a new patch tonight when I get a chance. :-)
|
|||
| msg150216 | Author: Ron Adam (ron_adam) | Date: 2011-12-24 06:40 | |
Updated patch with suggested changes. It also has a cleaned up fast_block_end section. Concerning speed: what happens is (as Tim and Raymond have pointed out) that we can make some things a little faster, in exchange for other things being a little slower. You can play around with the order of the why cases in the fast_block_end section and see that effect. Using a switch instead of if-else chains should result in a more consistent balance between the block exit cases. The order I currently have gives a little more priority to exceptions, and that seems to help a tiny bit with the ccbench scores. I think that is a better benchmark than the small micro tests that pybench runs. The problem with pybench is that it doesn't test deeper nesting, where these particular changes will have a greater effect. |
|||
| msg153275 | Author: Mark Shannon (Mark.Shannon) (Python committer) | Date: 2012-02-13 14:42 | |
Why do the codes used internally by ceval have to be the same as those made public? Have you considered returning the exit code directly, rather than adding another field to frameobject? yieldfrom can be handled in a similar way to yield by returning the iterator (instead of the yielded value). See attached file for example code. |
|||
| msg312817 | Author: Serhiy Storchaka (serhiy.storchaka) (Python committer) | Date: 2018-02-25 15:08 | |
The WHY_* codes are gone in 3.8 (see issue17611). Exception state was moved from frame to generator in 3.7 (see issue25612). Can some parts of the patch be applied to the current code? Or is it completely outdated? |
|||
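For readers following the discussion above, here is a minimal C sketch of the idea from the first patch: the evaluation step records a why code on the frame, and the generator's send path switches on it, returning directly from the yield case. Everything in it is a hypothetical stand-in (`toy_frame`, `toy_eval`, `toy_gen_send`, `toy_sum`, `gen_status`, and the enum values are illustrative only); it is not the code in f_why2.diff or exit_code_example.c, and it makes no claim about performance.

```c
/* Toy model only: none of these types or functions are CPython's real ones. */

/* Illustrative why codes; the values are assumptions for this sketch. */
enum why_code {
    WHY_NOT       = 0x0001,  /* nothing special happened      */
    WHY_EXCEPTION = 0x0002,  /* an error occurred             */
    WHY_RETURN    = 0x0008,  /* the function body finished    */
    WHY_YIELD     = 0x0040   /* a value was yielded           */
};

/* Hypothetical frame: f_why records why the last evaluation step stopped. */
typedef struct {
    enum why_code f_why;
    int next;   /* next value to yield (toy generator state) */
    int limit;  /* stop once next reaches limit; negative means "raise" */
} toy_frame;

typedef enum { GEN_YIELDED, GEN_FINISHED, GEN_ERROR } gen_status;

/* Toy "eval loop": runs one step and stores the reason on the frame. */
int toy_eval(toy_frame *f)
{
    if (f->limit < 0) {
        f->f_why = WHY_EXCEPTION;
        return 0;
    }
    if (f->next < f->limit) {
        f->f_why = WHY_YIELD;
        return f->next++;
    }
    f->f_why = WHY_RETURN;
    return 0;
}

/* Toy send: dispatches on f->f_why instead of re-deriving the reason
 * from several separate checks. */
gen_status toy_gen_send(toy_frame *f, int *out)
{
    int result = toy_eval(f);

    switch (f->f_why) {
    case WHY_YIELD:
        *out = result;
        return GEN_YIELDED;   /* return directly out of the yield case */
    case WHY_RETURN:
        return GEN_FINISHED;
    case WHY_EXCEPTION:
        return GEN_ERROR;
    case WHY_NOT:
        break;
    }
    return GEN_ERROR;         /* unexpected code: treat as an error */
}

/* Sums everything the toy generator yields; returns -1 on error. */
long toy_sum(int limit)
{
    toy_frame f = { WHY_NOT, 0, limit };
    long total = 0;
    int value = 0;

    for (;;) {
        gen_status s = toy_gen_send(&f, &value);
        if (s == GEN_YIELDED)
            total += value;
        else if (s == GEN_FINISHED)
            return total;
        else
            return -1;
    }
}
```

In this toy, `toy_sum(10)` plays the role of `sum(y(10))` from the benchmark messages. Keeping the why code on the frame rather than on a shared thread-state structure means each toy generator carries its own exit reason, which is the flexibility argument raised in msg149954.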
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:57:24 | admin | set | github: 57816 |
| 2018-09-24 16:56:26 | serhiy.storchaka | set | status: pending -> closed; resolution: out of date; stage: patch review -> resolved |
| 2018-02-25 15:08:25 | serhiy.storchaka | set | status: open -> pending |
| 2018-02-25 15:08:13 | serhiy.storchaka | set | nosy: + serhiy.storchaka; messages: + msg312817 |
| 2012-10-15 19:24:22 | terry.reedy | set | versions: + Python 3.4, - Python 3.3 |
| 2012-02-13 14:42:15 | Mark.Shannon | set | files: + exit_code_example.c; nosy: + Mark.Shannon; messages: + msg153275 |
| 2011-12-24 06:41:02 | ron_adam | set | files: + f_why2.diff; messages: + msg150216 |
| 2011-12-24 06:05:28 | ron_adam | set | files: - f_why1.diff |
| 2011-12-21 18:26:14 | eric.snow | set | nosy: + eric.snow |
| 2011-12-21 17:59:48 | ron_adam | set | messages: + msg150026 |
| 2011-12-21 03:10:59 | ncoghlan | set | messages: + msg149954 |
| 2011-12-21 02:10:37 | meador.inge | set | messages: + msg149953 |
| 2011-12-21 02:03:32 | meador.inge | set | nosy: + meador.inge |
| 2011-12-19 06:04:28 | ron_adam | set | files: + f_why1.diff; messages: + msg149821 |
| 2011-12-19 05:28:50 | ron_adam | set | files: - f_why.diff |
| 2011-12-17 00:31:16 | benjamin.peterson | set | assignee: benjamin.peterson; messages: + msg149650 |
| 2011-12-17 00:29:00 | terry.reedy | set | nosy: + terry.reedy |
| 2011-12-16 17:04:04 | jcea | set | nosy: + jcea |
| 2011-12-16 06:51:15 | pitrou | set | nosy: + ncoghlan, benjamin.peterson; stage: patch review |
| 2011-12-16 03:48:32 | ron_adam | set | messages: + msg149591 |
| 2011-12-16 00:02:53 | ron_adam | create | |