On 22 August 2021, 10:47 am, Guido van Rossum <[email protected]> wrote:
Hopefully someone is still reading python-dev.
I'm going to try to summarize the differences between the two
proposals, even though Mark already did so in his PEP. But I'd like
to start by calling out the key point of contention.
Everything here is about locals() and f_locals in *function scope*.
(I use f_locals to refer to the f_locals field of frame objects as
seen from Python code.) And in particular, it is about what I'll
call "extra variables": the current CPython feature that you can add
*new* variables to f_locals that don't exist in the frame, for example:
    def foo():
        x = 1
        locals()["y"] = 2  # or sys._getframe().f_locals["y"] = 2
My first reaction was to propose to drop this feature, but I realize
it's kind of important for debuggers to be able to execute arbitrary
code in function scope -- assignments to locals should affect the
frame, but it should also be possible to create new variables (e.g.
temporaries). So I agree we should keep this.
I tried taking this feature out in one of the PEP 558 drafts, but doing
so breaks the pdb test suite.
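For anyone who wants to poke at the "extra variables" behaviour, here is a minimal sketch. It goes through the frame's f_locals rather than locals(), on the assumption that this spelling is the one both PEPs intend to keep working; the variable names are only illustrative.

```python
import sys


def foo():
    x = 1
    frame = sys._getframe()
    # "y" is not a local variable of foo(), so this stores an extra variable
    frame.f_locals["y"] = 2
    # Re-read f_locals: the proper variable and the extra are both visible
    view = frame.f_locals
    return view["x"], view["y"]


result = foo()
```

A debugger evaluating an assignment to a temporary inside a paused function relies on the same mechanism.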
So apparently the key difference of opinion between Mark and Nick is
about f_locals, and what to do with extras. In Nick's proposal when
you reference f.f_locals twice in a row (for the same frame object
f), you get the same proxy object, whereas in Mark's proposal you
get a different object each time, but it doesn't matter, because the
proxy has no state other than a reference to the frame.
If PEP 558 is still giving that impression, I need to fix the wording -
the proxy objects are ephemeral in both PEPs (the 558 text is slightly
behind the implementation on that point, as the fast refs mapping is now
stored on the frame object, so it only needs to be built once)
In Mark's proposal, if you assign a value to an extra variable, it
gets stored in a hidden dict field on the frame, and when you read
the proxy, the contents of that hidden dict field get included.
This hidden dict is lazily created on the first store to an extra
variable. (Mark shows pseudo-code to clarify this; the hidden dict
is stored as _extra_locals on the frame.)
PEP 558 works essentially the same way; the difference is that it uses
the existing locals dict storage rather than adding new storage just for
optimised frames.
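The stateless-proxy-plus-hidden-dict idea described above can be sketched in pure Python. This is a toy model under stated assumptions, not the pseudo-code from either PEP: ToyFrame and ToyLocalsProxy are hypothetical names, and only _extra_locals comes from the description above.

```python
from collections.abc import MutableMapping

_UNBOUND = object()  # sentinel for a local that has no value yet


class ToyFrame:
    """Hypothetical stand-in for a frame object (not CPython's real layout)."""

    def __init__(self, names):
        self.names = list(names)                  # names of the proper variables
        self.fast = [_UNBOUND] * len(self.names)  # their current values
        self._extra_locals = None                 # hidden dict, created lazily


class ToyLocalsProxy(MutableMapping):
    """Stateless view: the only attribute is a reference to the frame."""

    def __init__(self, frame):
        self._frame = frame

    def __getitem__(self, key):
        f = self._frame
        if key in f.names:
            value = f.fast[f.names.index(key)]
            if value is _UNBOUND:
                raise KeyError(key)
            return value
        if f._extra_locals is None:
            raise KeyError(key)
        return f._extra_locals[key]

    def __setitem__(self, key, value):
        f = self._frame
        if key in f.names:
            f.fast[f.names.index(key)] = value    # write through to the frame
        else:
            if f._extra_locals is None:
                f._extra_locals = {}              # first store to an extra
            f._extra_locals[key] = value

    def __delitem__(self, key):
        f = self._frame
        if key in f.names:
            i = f.names.index(key)
            if f.fast[i] is _UNBOUND:
                raise KeyError(key)
            f.fast[i] = _UNBOUND
        elif f._extra_locals is not None:
            del f._extra_locals[key]
        else:
            raise KeyError(key)

    def __iter__(self):
        f = self._frame
        for name, value in zip(f.names, f.fast):
            if value is not _UNBOUND:
                yield name
        if f._extra_locals is not None:
            yield from f._extra_locals

    def __len__(self):
        # No cached count: every call walks all variable slots
        return sum(1 for _ in self)


frame = ToyFrame(["x"])
a = ToyLocalsProxy(frame)
b = ToyLocalsProxy(frame)   # a different proxy object for the same frame
a["x"] = 1                  # proper variable: written to the frame slot
a["y"] = 2                  # extra variable: lands in frame._extra_locals
```

Because the proxies hold nothing but the frame reference, b sees everything written through a, which is why it doesn't matter whether two lookups return the same proxy object.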
In Nick's proposal, there's a cache on the frame that stores both
the extras and the proper variables. This cache can get out of sync
with the contents of the proper variables when some bytecode is
executed (for performance reasons we don't want the bytecode to keep
the cache up to date on every store), so there's an operation to
sync the frame cache: sync_frame_cache(). (The PEP doesn't say which
namespace this lives in -- is it a builtin or in sys?)
It's an extra method on the proxy objects. You only need it if you keep
an old proxy object around - if you always retrieve a new proxy object
after executing Python code, that proxy will refresh the cache when it
needs to.
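A toy model of the stale-cache situation may make this easier to follow. It is only a sketch of the behaviour as described in this thread, not the PEP 558 implementation: ToyFrame and ToyCachedProxy are hypothetical, and routing every read through the cache is a simplifying assumption.

```python
class ToyFrame:
    """Hypothetical stand-in for a frame; not CPython's real layout."""

    def __init__(self, **fast):
        self.fast = dict(fast)   # the proper variables, as the bytecode sees them
        self.cache = {}          # frame-level cache: proper variables plus extras


class ToyCachedProxy:
    """Hypothetical proxy that reads from the frame-level cache."""

    def __init__(self, frame):
        self._frame = frame
        self.sync_frame_cache()  # a freshly retrieved proxy refreshes the cache

    def sync_frame_cache(self):
        # Copy current values of the proper variables into the cache;
        # extras already stored in the cache are left alone.
        self._frame.cache.update(self._frame.fast)

    def __getitem__(self, key):
        return self._frame.cache[key]

    def __setitem__(self, key, value):
        if key in self._frame.fast:
            self._frame.fast[key] = value   # write through to the real variable
        self._frame.cache[key] = value

    def __len__(self):
        return len(self._frame.cache)       # just the size of the cache dict


frame = ToyFrame(x=1)
old = ToyCachedProxy(frame)
old["y"] = 2                 # extra variable, stored only in the cache

frame.fast["x"] = 10         # stands in for bytecode rebinding x
stale = old["x"]             # the old proxy still sees the cached value
old.sync_frame_cache()
fresh = old["x"]             # up to date again after the explicit sync
```

Retrieving a new proxy instead of calling sync_frame_cache() has the same effect in this model, since construction performs the sync.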
Frankly the description in Nick's PEP is hard to follow -- I am not
100% sure what is meant by "the dynamic snapshot", and it's not
quite clear whether proper variables are copied into the cache (and
if so, why).
Aye, Mark was a bit quicker with his PEP than I anticipated, so I've
incorporated the implementation improvements arising from his last round
of comments, but the PEP text hasn't been updated yet.
Personally, I find Mark's proposed semantics for f_locals simpler --
there's no cache, only storage for extras, so there's nothing that
can get out of sync.
The wording in PEP 667 undersells the cost of that simplification:
"Code that uses PyEval_GetLocals() will continue to operate safely, but
will need to be changed to use PyEval_Locals() to restore functionality."
Code that uses PyEval_GetLocals() will NOT continue to operate safely
under PEP 667: all such code will raise an exception at runtime, and
need to be rewritten to use a new API with different refcounting
semantics. That's essentially all code that accesses the frame locals
from C, since we don't offer supported APIs for that other than
PyEval_GetLocals() (directly accessing the f_locals field on the frame
object is only "supported" in a very loose sense of the word, although
PEP 558 mostly keeps that working, too)
This means the real key difference between the two PEPs is that Mark is
proposing a gratuitous compatibility break for PyEval_GetLocals(). The
same design choice also means that the algorithmic complexity
characteristics of the proxy implementation will be completely off from
those of a regular dict (e.g. len(proxy) will be O(n) in the number of
variables defined on the frame, rather than O(1) after the proxy's
initial cache update, as it is in PEP 558).
If Mark's claim that PyEval_GetLocals() could not be fixed were true then
I would be more sympathetic to his proposal, but I know it isn't true,
because it still works fine in the PEP 558 implementation (it even
immediately sees changes made via proxies, and proxies see changes to
extra variables). The only truly unfixable public API is
PyFrame_LocalsToFast().