[Python-Dev] Re: PEP 667: Consistent views of namespaces

Hi Nick,
On 22/08/2021 4:51 am, Nick Coghlan wrote:
On 22 Aug 2021, 10:47 am, Guido van Rossum <[email protected]> wrote:
 Hopefully someone is still reading python-dev.
 I'm going to try to summarize the differences between the two
 proposals, even though Mark already did so in his PEP. But I'd like
 to start by calling out the key point of contention.
 Everything here is about locals() and f_locals in *function scope*.
 (I use f_locals to refer to the f_locals field of frame objects as
 seen from Python code.) And in particular, it is about what I'll
 call "extra variables": the current CPython feature that you can add
 *new* variables to f_locals that don't exist in the frame, for example:
 def foo():
   x = 1
   locals()["y"] = 2 # or sys._getframe().f_locals["y"] = 2
 My first reaction was to propose to drop this feature, but I realize
 it's kind of important for debuggers to be able to execute arbitrary
 code in function scope -- assignments to locals should affect the
 frame, but it should also be possible to create new variables (e.g.
 temporaries). So I agree we should keep this.
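
A minimal sketch of the "extra variables" behaviour described above, assuming CPython (sys._getframe() is an implementation detail). It writes through the frame's f_locals rather than locals(), because whether a write made via locals() is still visible on a later call may differ between CPython versions. The names result and is_extra exist only for illustration.

```python
import sys

def foo():
    x = 1
    frame = sys._getframe()      # CPython-specific access to the current frame
    frame.f_locals["y"] = 2      # "y" is an extra: the compiler never saw it
    # Re-read through f_locals; the proper variable and the extra are both visible.
    return frame.f_locals["x"], frame.f_locals["y"]

result = foo()
# "y" is not one of the function's compiled local variable names.
is_extra = "y" not in foo.__code__.co_varnames
```

Here "y" never becomes a real local of foo(); it is only reachable through the mapping, which is the case both PEPs have to decide how to store.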
I tried taking this feature out in one of the PEP 558 drafts, 
but doing so breaks the pdb test suite.
 So apparently the key difference of opinion between Mark and Nick is
 about f_locals, and what to do with extras. In Nick's proposal when
 you reference f.f_locals twice in a row (for the same frame object
 f), you get the same proxy object, whereas in Mark's proposal you
 get a different object each time, but it doesn't matter, because the
 proxy has no state other than a reference to the frame.
If PEP 558 is still giving that impression, I need to fix the wording - 
the proxy objects are ephemeral in both PEPs (the 558 text is slightly 
behind the implementation on that point, as the fast refs mapping is now 
stored on the frame object, so it only needs to be built once)
 In Mark's proposal, if you assign a value to an extra variable, it
 gets stored in a hidden dict field on the frame, and when you read
 the proxy, the contents of that hidden dict field get included.
 This hidden dict is lazily created on the first store to an extra
 variable. (Mark shows pseudo-code to clarify this; the hidden dict
 is stored as _extra_locals on the frame.)
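
To make the "no state other than a reference to the frame" idea concrete, here is a toy model in pure Python. It is only a sketch under stated assumptions: ToyFrame, ToyLocalsProxy, slots and _UNBOUND are hypothetical names, and the real frame layout in CPython is different. Only the _extra_locals name comes from the description above.

```python
_UNBOUND = object()  # marks a slot that has no value yet

class ToyFrame:
    """Hypothetical stand-in for a frame object (not CPython's real layout)."""
    def __init__(self, names):
        self.names = tuple(names)                  # compiler-known variables
        self.slots = [_UNBOUND] * len(self.names)  # stands in for fast locals
        self._extra_locals = None                  # created on first extra store

class ToyLocalsProxy:
    """Holds only a reference to the frame, so any two proxies are equivalent."""
    def __init__(self, frame):
        self._frame = frame

    def __getitem__(self, name):
        f = self._frame
        if name in f.names:
            value = f.slots[f.names.index(name)]
            if value is _UNBOUND:
                raise KeyError(name)
            return value
        if f._extra_locals is None:
            raise KeyError(name)
        return f._extra_locals[name]

    def __setitem__(self, name, value):
        f = self._frame
        if name in f.names:
            f.slots[f.names.index(name)] = value   # proper variable: write the slot
        else:
            if f._extra_locals is None:
                f._extra_locals = {}               # lazy creation
            f._extra_locals[name] = value

frame = ToyFrame(["x"])
ToyLocalsProxy(frame)["x"] = 1          # goes to the slot, no extras dict yet
extras_before = frame._extra_locals
ToyLocalsProxy(frame)["y"] = 2          # first extra store creates the dict
```

Every read and write goes straight to the frame, so a fresh proxy sees exactly what an older one would, and there is nothing to synchronise.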
PEP 558 works essentially the same way; the difference is that it uses 
the existing locals dict storage rather than adding new storage just for 
optimised frames.
 In Nick's proposal, there's a cache on the frame that stores both
 the extras and the proper variables. This cache can get out of sync
 with the contents of the proper variables when some bytecode is
 executed (for performance reasons we don't want the bytecode to keep
 the cache up to date on every store), so there's an operation to
 sync the frame cache (sync_frame_cache(), it's not defined in which
 namespace this exists -- is it a builtin or in sys?).
It's an extra method on the proxy objects. You only need it if you keep 
an old proxy object around - if you always retrieve a new proxy object 
after executing Python code, that proxy will refresh the cache when it 
needs to.
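
A toy illustration of why an old proxy can go stale while a freshly requested one does not. This is not the PEP 558 implementation: ToyFrame, CachingProxy, slots and cache are hypothetical names, and refreshing the cache in the constructor is a simplification of "refresh when needed". Only the sync_frame_cache() method name is taken from the discussion above.

```python
class ToyFrame:
    """Hypothetical frame: named slots that bytecode writes to directly."""
    def __init__(self, **slots):
        self.slots = dict(slots)   # stands in for the fast locals array
        self.cache = {}            # stands in for the frame-level cache

class CachingProxy:
    """Toy proxy that answers reads from the frame's cache."""
    def __init__(self, frame):
        self._frame = frame
        self.sync_frame_cache()    # simplification: refresh on creation

    def sync_frame_cache(self):
        # Copy current slot values over the cache; extras already there stay.
        self._frame.cache.update(self._frame.slots)

    def __getitem__(self, name):
        return self._frame.cache[name]

    def __setitem__(self, name, value):
        if name in self._frame.slots:
            self._frame.slots[name] = value   # write through to the slot
        self._frame.cache[name] = value       # extras live only in the cache

frame = ToyFrame(x=1)
old = CachingProxy(frame)
frame.slots["x"] = 2            # simulates a bytecode store that bypasses the cache
stale = old["x"]                # the cache has not been refreshed yet
old.sync_frame_cache()
synced = old["x"]
frame.slots["x"] = 3
fresh = CachingProxy(frame)["x"]  # requesting a new proxy refreshes as well
```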
 Frankly the description in Nick's PEP is hard to follow -- I am not
 100% sure what is meant by "the dynamic snapshot", and it's not
 quite clear whether proper variables are copied into the cache (and
 if so, why).
Aye, Mark was a bit quicker with his PEP than I anticipated, so I've 
incorporated the implementation improvements arising from his last round 
of comments, but the PEP text hasn't been updated yet.
 Personally, I find Mark's proposed semantics for f_locals simpler --
 there's no cache, only storage for extras, so there's nothing that
 can get out of sync.
The wording in PEP 667 undersells the cost of that simplification:
"Code that uses PyEval_GetLocals() will continue to operate safely, but 
will need to be changed to use PyEval_Locals() to restore functionality."
Code that uses PyEval_GetLocals() will NOT continue to operate safely 
under PEP 667: all such code will raise an exception at runtime, and 
need to be rewritten to use a new API with different refcounting 
semantics. That's essentially all code that accesses the frame locals 
from C, since we don't offer supported APIs for that other than 
PyEval_GetLocals() (directly accessing the f_locals field on the frame 
object is only "supported" in a very loose sense of the word, although 
PEP 558 mostly keeps that working, too)
This means the real key difference between the two PEPs is that Mark is 
proposing a gratuitous compatibility break for PyEval_GetLocals() that 
also means that the algorithmic complexity characteristics of the proxy 
implementation will be completely off from those of a regular dict (e.g. 
len(proxy) will be O(n) in the number of variables defined on the frame 
rather than being O(1) after the proxy's initial cache update the way it 
is in PEP 558)
If Mark's claim that PyEval_GetLocals() could not be fixed were true then 
I would be more sympathetic to his proposal, but I know it isn't true, 
because it still works fine in the PEP 558 implementation (it even 
immediately sees changes made via proxies, and proxies see changes to 
extra variables). The only truly unfixable public API is 
PyFrame_LocalsToFast().
You are making claims that seem inconsistent with each other.
Namely, you are claiming that:
1. The result of locals() is ephemeral.
2. PyEval_GetLocals() returns a borrowed reference.
This seems impossible, as you can't return a borrowed reference to
an ephemeral object. That's just a pointer to freed memory.
Do `locals()` and `PyEval_GetLocals()` behave differently?
Is the result of `PyEval_GetLocals()` cached, but `locals()` not?
If that were the case, then it is a bit confusing, but could work.
Would PyEval_GetLocals() be defined as something like this?
(This adds a _locals_cache attribute to the frame, initialized to NULL.)
def PyEval_GetLocals():
    # The frame owns the reference, so the borrowed one stays valid.
    frame._locals_cache = locals()
    return borrow(frame._locals_cache)
None of this is clear (at least not to me) from PEP 558.
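
The borrowed-reference concern can be mimicked in pure Python, using a weak reference as a rough stand-in for a borrowed C reference. This is only an analogy, and it relies on CPython's reference counting freeing an unowned object immediately. Snapshot, ToyFrame, ephemeral_locals and cached_locals are hypothetical names; _locals_cache follows the pseudo-code above.

```python
import weakref

class Snapshot(dict):
    """dict subclass, so that instances can be weakly referenced."""

class ToyFrame:
    def __init__(self):
        self._locals_cache = None   # mirrors the attribute proposed above

def ephemeral_locals(frame):
    # Nothing owns the snapshot, so it is freed as soon as this call returns.
    return weakref.ref(Snapshot(x=1))

def cached_locals(frame):
    # The frame owns the snapshot, so a non-owning reference stays valid.
    frame._locals_cache = Snapshot(x=1)
    return weakref.ref(frame._locals_cache)

frame = ToyFrame()
dangling = ephemeral_locals(frame)()   # the referent is already gone
alive = cached_locals(frame)()         # kept alive by frame._locals_cache
```

In C there is no such safety net: the dangling case would be a pointer to freed memory rather than None.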
Cheers,
Mark.
On the code complexity front, while the cache management in PEP 558 does 
incur a bit of extra complexity, it also offers a lot of code 
simplification as many mutable mapping API operations can be delegated 
to the cache instead of needing to be implemented directly against the 
fast locals array (e.g. the keys(), values() and items() views all 
interact with the cache rather than the underlying frame storage, so the 
implementation doesn't need proxy-specific types for those). For O(n) 
operations, the cache is refreshed every time, while for less than O(n) 
operations, the cache is refreshed if it is the first time that 
particular proxy instance has needed it.
While API clients *can* delve into the details of exactly when and how 
the cache gets refreshed, they can also adopt the simple principle of 
"if in doubt, request a new locals reference" and let the interpreter 
worry about the details.
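
The "if in doubt, request a new locals reference" principle can be sketched from Python code as follows, assuming CPython's sys._getframe(). The function name demo and the variables before and after are illustrative only.

```python
import sys

def demo():
    x = 1
    frame = sys._getframe()
    before = dict(frame.f_locals)   # explicit copy: a point-in-time snapshot
    x = 2                           # ordinary bytecode store to a fast local
    after = frame.f_locals          # re-request rather than reuse an old reference
    return before["x"], after["x"]

values = demo()
```

Taking an explicit copy when a snapshot is wanted, and re-requesting f_locals when current values are wanted, avoids depending on exactly when any cache is refreshed.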
Cheers,
Nick.
