This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.
Created on 2015-10-26 15:44 by eric.smith, last changed 2022-04-11 14:58 by admin. This issue is now closed.
| Files | | |
|---|---|---|
| File name | Uploaded | Description |
| format-opcode.diff | eric.smith, 2015-10-26 15:44 | |
| format-opcode-1.diff | eric.smith, 2015-10-26 17:35 | |
| format-opcode-2.diff | eric.smith, 2015-10-28 20:59 | |
| format-opcode-3.diff | eric.smith, 2015-11-02 13:30 | |
| Messages (18) | | | |
|---|---|---|---|
| msg253476 | Author: Eric V. Smith (eric.smith) * (Python committer) | Date: 2015-10-26 15:44 | |
Currently, the f-string f'a{3!r:10}' evaluates to bytecode that does the same thing as:
''.join(['a', format(repr(3), '10')])
That is, it literally calls the functions format() and repr(). The same holds true for str() and ascii() with !s and !a, respectively.
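The equivalence described above can be checked directly. This is a minimal sketch assuming Python 3.6 or later; the variable names are illustrative and not taken from the patch.

```python
# The f-string from the message, and the explicit spelling it is
# described as being equivalent to.
fstring_result = f'a{3!r:10}'
explicit_result = ''.join(['a', format(repr(3), '10')])

# Both spellings apply repr() first, then pad the result to width 10.
print(repr(fstring_result))
print(fstring_result == explicit_result)
```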
By redefining format, str, repr, and ascii, you can break or pervert the computation of the f-string's value:
>>> def format(v, fmt=None): return '42'
...
>>> f'{3}'
'42'
It's always been my intention to fix this. This patch adds an opcode FORMAT_VALUE, which instead of looking up format, etc., directly calls PyObject_Format, PyObject_Str, PyObject_Repr, and PyObject_ASCII. Thus, you can no longer modify what an f-string produces merely by overriding the named functions.
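The intended behavior after the patch can be sketched as follows. This assumes an interpreter that includes the change (Python 3.6 or later); `always_42` and `shadowed` are hypothetical names used only for illustration.

```python
# A replacement that would hijack f-string output if f-strings
# looked these names up at run time.
def always_42(value, fmt=None):
    return '42'

# Namespace in which format, str, repr, and ascii are all shadowed.
shadowed = {name: always_42 for name in ('format', 'str', 'repr', 'ascii')}

# f-strings evaluated in that namespace ignore the shadowed names...
plain = eval("f'{3}'", shadowed)
converted = eval("f'{3!r:>4}'", shadowed)

# ...while an explicit call still resolves to the shadowing function.
called_directly = eval("format(3)", shadowed)

print(plain, repr(converted), called_directly)
```

Only the explicit `format(3)` call sees the override, which is the robustness property the message is after.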
In addition, because I'm now saving the name lookups and function calls, performance is improved.
Here are the times without this patch:
$ ./python -m timeit -s 'x="test"' 'f"{x}"'
1000000 loops, best of 3: 0.3 usec per loop
$ ./python -m timeit -s 'x="test"' 'f"{x!s}"'
1000000 loops, best of 3: 0.511 usec per loop
$ ./python -m timeit -s 'x="test"' 'f"{x!r}"'
1000000 loops, best of 3: 0.497 usec per loop
$ ./python -m timeit -s 'x="test"' 'f"{x!a}"'
1000000 loops, best of 3: 0.461 usec per loop
And with this patch:
$ ./python -m timeit -s 'x="test"' 'f"{x}"'
10000000 loops, best of 3: 0.02 usec per loop
$ ./python -m timeit -s 'x="test"' 'f"{x!s}"'
100000000 loops, best of 3: 0.02 usec per loop
$ ./python -m timeit -s 'x="test"' 'f"{x!r}"'
10000000 loops, best of 3: 0.0896 usec per loop
$ ./python -m timeit -s 'x="test"' 'f"{x!a}"'
10000000 loops, best of 3: 0.0923 usec per loop
So an 80%+ speedup for these simple cases.
Also, now f-strings are faster than %-formatting, at least for some types:
$ ./python -m timeit -s 'x="test"' '"%s"%x'
10000000 loops, best of 3: 0.0755 usec per loop
$ ./python -m timeit -s 'x="test"' 'f"{x}"'
10000000 loops, best of 3: 0.02 usec per loop
Note that people often "benchmark" %-formatting with code like the following. But the optimizer converts this to a constant string, so it's not a fair comparison:
$ ./python -m timeit '"%s"%"test"'
100000000 loops, best of 3: 0.0161 usec per loop
These microbenchmarks aren't the end of the story, since the string concatenation also takes some time. That's another optimization I might implement in the future.
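The same measurements can be repeated from Python with the standard `timeit` module. This is only a sketch: the loop count is an arbitrary choice, and the absolute numbers will differ from those quoted above depending on the interpreter version and machine.

```python
import timeit

# Same setup as the command-line runs: x is a variable, so the
# statements cannot be folded into a constant.
setup = 'x = "test"'
statements = ['f"{x}"', 'f"{x!s}"', 'f"{x!r}"', 'f"{x!a}"', '"%s" % x']

loops = 100_000  # arbitrary loop count chosen for a quick run
timings = {}
for stmt in statements:
    # Best of 3 repeats, converted to microseconds per loop.
    best = min(timeit.repeat(stmt, setup=setup, repeat=3, number=loops))
    timings[stmt] = best / loops * 1e6

for stmt, usec in timings.items():
    print(f'{stmt:12} {usec:.4f} usec per loop')
```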
Thanks to Mark and Larry for some advice on this.
|
|||
| msg253484 | Author: Eric V. Smith (eric.smith) * (Python committer) | Date: 2015-10-26 17:35 | |
Small cleanups. Fixed a bug in PyCompile_OpcodeStackEffect. |
|||
| msg253505 | Author: Eric V. Smith (eric.smith) * (Python committer) | Date: 2015-10-26 22:50 | |
This patch addresses Larry's review, plus bumps the bytecode magic number. |
|||
| msg253610 | Author: Eric V. Smith (eric.smith) * (Python committer) | Date: 2015-10-28 17:22 | |
Oops. Forgot to include the diff with that last message. But it turns out it doesn't work, anyway, because I put the #define's in opcode.h, which is generated (so my code got deleted!). I'll try to find some reasonable .h file to use and submit a new patch soon. |
|||
| msg253611 | Author: Brett Cannon (brett.cannon) * (Python committer) | Date: 2015-10-28 17:25 | |
I know this issue is slowly turning into "make Eric update outdated docs", but if you find that https://docs.python.org/devguide/compiler.html#introducing-new-bytecode is outdated, could you update that doc? |
|||
| msg253614 | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2015-10-28 18:35 | |
> I'll try to find some reasonable .h file to use and submit a new patch soon.
It's Lib/opcode.py.
|
|||
| msg253615 | Author: Eric V. Smith (eric.smith) * (Python committer) | Date: 2015-10-28 18:46 | |
Brett: I'll take a look.
Serhiy: I'm looking for a place to put some #defines related to the bit masks and bit values that my FORMAT_VALUE opcode is using for opargs. One option is to just put them in Tools/scripts/generate_opcode_h.py, so that they end up in the generated opcode.h, but that seems a little sleazy. I can't find a better place they'd belong, though.
Specifically, I want to put these lines into a .h file for use by ceval.c and compile.c:
/* Masks and values for FORMAT_VALUE opcode. */
#define FVC_MASK 0x3
#define FVS_MASK 0x4
#define FVC_NONE 0x0
#define FVC_STR 0x1
#define FVC_REPR 0x2
#define FVC_ASCII 0x3
#define FVS_HAVE_SPEC 0x4
|
|||
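To illustrate how the masks and values in the message above might combine into a single oparg, here is a small Python sketch. The constant values are copied from the quoted #defines; `CONVERSIONS` and `encode_oparg` are hypothetical helpers for illustration and are not part of CPython.

```python
# Values copied from the #defines quoted in the message above.
FVC_MASK = 0x3
FVS_MASK = 0x4
FVC_NONE = 0x0
FVC_STR = 0x1
FVC_REPR = 0x2
FVC_ASCII = 0x3
FVS_HAVE_SPEC = 0x4

# Hypothetical mapping from the f-string conversion character to its value.
CONVERSIONS = {None: FVC_NONE, 's': FVC_STR, 'r': FVC_REPR, 'a': FVC_ASCII}


def encode_oparg(conversion=None, has_spec=False):
    """Hypothetical helper: 'or' the conversion value with the spec flag."""
    oparg = CONVERSIONS[conversion]
    if has_spec:
        oparg |= FVS_HAVE_SPEC
    return oparg


# f'{x!r:10}' would need a repr conversion plus a format spec.
print(hex(encode_oparg('r', has_spec=True)))
```

The conversion occupies the low two bits and the format-spec flag occupies the third, so the two fields never overlap.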
| msg253618 | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2015-10-28 19:27 | |
Does the dis module need these constants? If not, you can use either ceval.h or compile.h. |
|||
| msg253630 | Author: Eric V. Smith (eric.smith) * (Python committer) | Date: 2015-10-28 20:59 | |
Thanks, Serhiy. I looked at those, and neither one is a great fit. But not having a better option, I went with ceval.h. Here's the patch. |
|||
| msg253910 | Author: Eric V. Smith (eric.smith) * (Python committer) | Date: 2015-11-02 13:30 | |
Some formatting improvements. I removed one of the optimizations I was doing, because it's also done in PyObject_Format(). I plan on moving other optimizations into PyObject_Format(), but I'll open a separate issue for that. I swapped the order of the parameters on the stack, so that I could use the micro-optimization of TOP() and SET_TOP(). I'll commit this shortly. |
|||
| msg253912 | Author: Serhiy Storchaka (serhiy.storchaka) * (Python committer) | Date: 2015-11-02 13:49 | |
It looks to me that FVS_MASK and FVS_HAVE_SPEC are duplicates. FVS_HAVE_SPEC is set but FVS_MASK is tested. |
|||
| msg253915 | Author: Eric V. Smith (eric.smith) * (Python committer) | Date: 2015-11-02 14:10 | |
Right, they're the same because it's a single bit. You 'and' with a mask to get the bits you want, and you 'or' together the values. It's an old habit left over from my bit-twiddling days.
I guess the test could really be:
have_fmt_spec = (oparg & FVS_MASK) == FVS_HAVE_SPEC;
to make it clearer what I'm doing. It's easier to see the same thing with the FVC_MASK and FVC_* values, since that field is multiple bits.
|
|||
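The mask idiom from the message above, 'and' to extract a field and compare against a value, can be sketched in Python. `format_value` here is a hypothetical pure-Python stand-in for what the opcode is described as doing; the real implementation is in C, and later CPython releases may structure it differently.

```python
# Values copied from the #defines quoted earlier in this issue.
FVC_MASK = 0x3
FVS_MASK = 0x4
FVC_NONE = 0x0
FVC_STR = 0x1
FVC_REPR = 0x2
FVC_ASCII = 0x3
FVS_HAVE_SPEC = 0x4


def format_value(value, oparg, fmt_spec=''):
    """Hypothetical emulation: decode oparg, convert, then format."""
    # 'and' with each mask to isolate its field.
    conversion = oparg & FVC_MASK
    have_fmt_spec = (oparg & FVS_MASK) == FVS_HAVE_SPEC

    # The multi-bit conversion field selects at most one conversion.
    if conversion == FVC_STR:
        value = str(value)
    elif conversion == FVC_REPR:
        value = repr(value)
    elif conversion == FVC_ASCII:
        value = ascii(value)

    # Without the spec flag, formatting uses the empty format spec.
    return format(value, fmt_spec if have_fmt_spec else '')


# Emulates f'{3!r:10}'.
print(repr(format_value(3, FVC_REPR | FVS_HAVE_SPEC, '10')))
```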
| msg253928 | Author: Stefan Krah (skrah) * (Python committer) | Date: 2015-11-02 15:50 | |
The MASK idiom is nice and I think it's good to be exposed to it from time to time. |
|||
| msg254006 | Author: Roundup Robot (python-dev) (Python triager) | Date: 2015-11-03 17:45 | |
New changeset 1ddeb2e175df by Eric V. Smith in branch 'default': Issue 25483: Add an opcode to make f-string formatting more robust. https://hg.python.org/cpython/rev/1ddeb2e175df |
|||
| msg254008 | Author: Roundup Robot (python-dev) (Python triager) | Date: 2015-11-03 18:09 | |
New changeset 4734713a31ed by Eric V. Smith in branch 'default': Issue 25483: Update dis.rst with FORMAT_VALUE opcode description. https://hg.python.org/cpython/rev/4734713a31ed |
|||
| msg254009 | Author: Eric V. Smith (eric.smith) * (Python committer) | Date: 2015-11-03 18:13 | |
Brett: https://docs.python.org/devguide/compiler.html#introducing-new-bytecode looks correct (and reminded me to update dis.rst!). |
|||
| msg254015 | Author: Berker Peksag (berker.peksag) * (Python committer) | Date: 2015-11-03 20:48 | |
+ * ``(flags & 0x03) == 0x00``: *value* is formattedd as-is.
Just noticed a small typo: formattedd
Also, needs ``.. versionadded:: 3.6``
|
|||
| msg254019 | Author: Roundup Robot (python-dev) (Python triager) | Date: 2015-11-03 21:30 | |
New changeset 93fd7adbc7dd by Eric V. Smith in branch 'default': Issue 25483: Fix doc typo and added versionadded. Thanks, Berker Peksag. https://hg.python.org/cpython/rev/93fd7adbc7dd |
|||
| History | | | |
|---|---|---|---|
| Date | User | Action | Args |
| 2022-04-11 14:58:23 | admin | set | github: 69669 |
| 2015-11-03 21:30:52 | python-dev | set | messages: + msg254019 |
| 2015-11-03 20:48:29 | berker.peksag | set | nosy: + berker.peksag; messages: + msg254015 |
| 2015-11-03 18:13:36 | eric.smith | set | status: open -> closed; resolution: fixed; messages: + msg254009; stage: patch review -> resolved |
| 2015-11-03 18:09:29 | python-dev | set | messages: + msg254008 |
| 2015-11-03 17:45:35 | python-dev | set | nosy: + python-dev; messages: + msg254006 |
| 2015-11-02 15:50:25 | skrah | set | nosy: + skrah; messages: + msg253928 |
| 2015-11-02 14:10:06 | eric.smith | set | messages: + msg253915 |
| 2015-11-02 13:49:23 | serhiy.storchaka | set | messages: + msg253912 |
| 2015-11-02 13:30:15 | eric.smith | set | files: + format-opcode-3.diff; messages: + msg253910 |
| 2015-10-28 20:59:49 | eric.smith | set | files: + format-opcode-2.diff; messages: + msg253630 |
| 2015-10-28 19:27:42 | serhiy.storchaka | set | messages: + msg253618 |
| 2015-10-28 18:46:09 | eric.smith | set | messages: + msg253615 |
| 2015-10-28 18:35:27 | serhiy.storchaka | set | nosy: + serhiy.storchaka; messages: + msg253614 |
| 2015-10-28 17:25:52 | brett.cannon | set | messages: + msg253611 |
| 2015-10-28 17:22:46 | eric.smith | set | messages: + msg253610 |
| 2015-10-27 17:06:58 | brett.cannon | set | nosy: + brett.cannon |
| 2015-10-26 22:50:48 | eric.smith | set | messages: + msg253505 |
| 2015-10-26 17:35:21 | eric.smith | set | files: + format-opcode-1.diff; messages: + msg253484 |
| 2015-10-26 15:44:59 | eric.smith | create | |