[Python-ideas] Draft PEP on protecting finally clauses
Andrew Svetlov
andrew.svetlov at gmail.com
Sat Apr 7 23:09:37 CEST 2012
What about a reference implementation?
On Sun, Apr 8, 2012 at 12:08 AM, Andrew Svetlov
<andrew.svetlov at gmail.com> wrote:
> I've published this PEP as PEP-419: http://www.python.org/dev/peps/pep-0419/
> Thank you, Paul.
>> On Sat, Apr 7, 2012 at 12:04 AM, Paul Colomiets <paul at colomiets.name> wrote:
>> Hi,
>>>> I've finally made a PEP. Any feedback is appreciated.
>>>> --
>> Paul
>>>>>> PEP: XXX
>> Title: Protecting cleanup statements from interruptions
>> Version: $Revision$
>> Last-Modified: $Date$
>> Author: Paul Colomiets <paul at colomiets.name>
>> Status: Draft
>> Type: Standards Track
>> Content-Type: text/x-rst
>> Created: 06-Apr-2012
>> Python-Version: 3.3
>>>>>> Abstract
>> ========
>>>> This PEP proposes a way to protect Python code from being interrupted
>> inside a ``finally`` clause or a context manager.
>>>>>> Rationale
>> =========
>>>> Python has two nice ways to do cleanup. One is the ``finally`` clause
>> and the other is a context manager (the ``with`` statement). However,
>> neither of them is protected from ``KeyboardInterrupt`` or
>> ``generator.throw()``. For example::
>>>>     lock.acquire()
>>     try:
>>         print('starting')
>>         do_something()
>>     finally:
>>         print('finished')
>>         lock.release()
>>>> If ``KeyboardInterrupt`` occurs just after the second ``print`` call
>> has executed, the lock will not be released. Similarly, the following
>> code using the ``with`` statement is affected::
>>>>     from threading import Lock
>>>>     class MyLock:
>>>>         def __init__(self):
>>             self._lock_impl = Lock()
>>>>         def __enter__(self):
>>             self._lock_impl.acquire()
>>             print("LOCKED")
>>>>         def __exit__(self, exc_type, exc_value, traceback):
>>             print("UNLOCKING")
>>             self._lock_impl.release()
>>>>     lock = MyLock()
>>     with lock:
>>         do_something()
>>>> If ``KeyboardInterrupt`` occurs near any of the ``print`` calls, the
>> lock will never be released.
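The failure mode described above can be reproduced without a real signal. The following is a minimal sketch, assuming the interruption arrives exactly at the ``print`` call inside ``finally``; the ``interrupted_print`` helper is hypothetical and only stands in for an asynchronous ``KeyboardInterrupt``:

```python
import threading

lock = threading.Lock()

def interrupted_print(message):
    # Hypothetical stand-in: pretend Ctrl-C arrives during this call.
    raise KeyboardInterrupt()

def critical_section():
    lock.acquire()
    try:
        pass  # the protected work would go here
    finally:
        interrupted_print('finished')  # the interruption hits here ...
        lock.release()                 # ... so this line never runs

try:
    critical_section()
except KeyboardInterrupt:
    pass

# The lock is still held even though the finally suite was entered.
still_locked = lock.locked()
print(still_locked)
```

Any later ``lock.acquire()`` in the same program would now block forever, which is the problem the PEP sets out to address.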
>>>>>> Coroutine Use Case
>> ------------------
>>>> A similar case occurs with coroutines. Coroutine libraries usually want
>> to interrupt a coroutine with a timeout. The ``generator.throw()``
>> method exists for this use case, but there is no way to know whether
>> the generator is currently suspended inside a ``finally`` clause.
>>>> An example that uses yield-based coroutines follows. The code looks
>> similar with any of the popular coroutine libraries: Monocle [1]_,
>> Bluelet [2]_, or Twisted [3]_. ::
>>>>     def run_locked():
>>         yield connection.sendall('LOCK')
>>         try:
>>             yield do_something()
>>             yield do_something_else()
>>         finally:
>>             yield connection.sendall('UNLOCK')
>>>>     with timeout(5):
>>         yield run_locked()
>>>> In the example above, ``yield something`` means: pause the current
>> coroutine and run the coroutine ``something`` until it finishes. The
>> library therefore keeps the stack of generators itself. The
>> ``connection.sendall`` call waits until the socket is writable and then
>> does something similar to what ``socket.sendall`` does.
>>>> The ``with`` statement ensures that all of that code is executed within
>> a 5-second timeout. It does so by registering a callback in the main
>> loop, which calls ``generator.throw()`` on the top-most frame in the
>> coroutine stack when the timeout expires.
>>>> The ``greenlets`` extension works in a similar way, except that it
>> doesn't need ``yield`` to enter a new stack frame. Otherwise the
>> considerations are the same.
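The coroutine problem can be shown with a plain generator and no library at all. This is a minimal sketch; the ``Timeout`` exception and the ``log`` list are hypothetical stand-ins for a library's timeout exception and for the messages sent over the connection:

```python
class Timeout(Exception):
    """Hypothetical exception a coroutine library might throw on timeout."""

def run_locked(log):
    log.append('LOCK')
    try:
        yield 'working'
    finally:
        # Cleanup that itself has to yield (e.g. waiting for a socket).
        yield 'sending UNLOCK'
        log.append('UNLOCK')

log = []
gen = run_locked(log)
first = next(gen)    # suspended inside the try block
second = next(gen)   # suspended inside the finally block
try:
    gen.throw(Timeout())  # the timeout fires while cleanup is in progress
except Timeout:
    pass
print(log)
```

Because the exception is raised at the ``yield`` inside ``finally``, the rest of the cleanup is skipped and ``'UNLOCK'`` is never recorded. The caller has no way of knowing in advance that the generator was in the middle of its cleanup.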
>>>>>> Specification
>> =============
>>>> Frame Flag 'f_in_cleanup'
>> -------------------------
>>>> A new flag on the frame object is proposed. It is set to ``True`` if
>> this frame is currently executing a ``finally`` suite. Internally it
>> must be implemented as a counter of the nested ``finally`` suites
>> currently being executed.
>>>> The internal counter is also incremented when entering the
>> ``SETUP_WITH`` and ``WITH_CLEANUP`` bytecodes, and is decremented when
>> leaving them. This makes it possible to protect the ``__enter__`` and
>> ``__exit__`` methods too.
>>>>>> Function 'sys.setcleanuphook'
>> -----------------------------
>>>> A new function for the ``sys`` module is proposed. This function sets
>> a callback which is executed every time ``f_in_cleanup`` becomes
>> ``False``. The callback gets ``frame`` as its sole argument, so it can
>> get some evidence of where it is called from.
>>>> The setting is thread-local and is stored inside the ``PyThreadState``
>> structure.
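The interaction between the nested counter and the hook can be modelled in pure Python. This is only a sketch under stated assumptions: ``FakeFrame``, ``enter_cleanup`` and ``leave_cleanup`` are hypothetical names that mimic what the interpreter would do internally, and no such API exists in CPython:

```python
class FakeFrame:
    """Hypothetical stand-in for a frame object; not a real CPython frame."""

    def __init__(self, hook=None):
        self._cleanup_depth = 0   # counter of nested cleanup suites
        self._hook = hook         # models the hook set via sys.setcleanuphook

    @property
    def f_in_cleanup(self):
        # The flag is just "counter is non-zero".
        return self._cleanup_depth > 0

    def enter_cleanup(self):
        self._cleanup_depth += 1

    def leave_cleanup(self):
        self._cleanup_depth -= 1
        # The hook fires only when the flag becomes False again.
        if self._cleanup_depth == 0 and self._hook is not None:
            self._hook(self)

calls = []
frame = FakeFrame(hook=calls.append)
frame.enter_cleanup()        # outer finally
frame.enter_cleanup()        # nested finally
frame.leave_cleanup()        # still in cleanup: no hook call
flag_while_nested = frame.f_in_cleanup
frame.leave_cleanup()        # flag becomes False: hook is called once
```

In this model the hook receives the frame as its only argument and runs exactly once, after the outermost cleanup suite has finished.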
>>>>>> Inspect Module Enhancements
>> ---------------------------
>>>> Two new functions are proposed for the ``inspect`` module:
>> ``isframeincleanup`` and ``getcleanupframe``.
>>>> ``isframeincleanup``, given a ``frame`` object or a ``generator`` object
>> as its sole argument, returns the value of the ``f_in_cleanup``
>> attribute of the frame itself or of the generator's ``gi_frame``.
>>>> ``getcleanupframe``, given a ``frame`` object as its sole argument,
>> returns the innermost frame which has a true value of ``f_in_cleanup``,
>> or ``None`` if no frame in the stack has the attribute set. It starts
>> inspecting from the specified frame and walks to outer frames using
>> ``f_back`` pointers, just like ``getouterframes`` does.
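The two proposed functions can be sketched in a few lines. Real frames have no ``f_in_cleanup`` attribute, so this example assumes frame-like objects built with a hypothetical ``make_frame`` helper; the function bodies are one possible reading of the description above, not the reference implementation:

```python
from types import SimpleNamespace

def make_frame(name, in_cleanup, back=None):
    # Hypothetical frame-like object with only the attributes used here.
    return SimpleNamespace(name=name, f_in_cleanup=in_cleanup, f_back=back)

def isframeincleanup(obj):
    # Accept either a frame-like object or a generator-like object.
    frame = getattr(obj, 'gi_frame', obj)
    return bool(frame.f_in_cleanup)

def getcleanupframe(frame):
    # Walk from the given frame outwards via f_back; the first hit is
    # the innermost frame that is in cleanup.
    while frame is not None:
        if frame.f_in_cleanup:
            return frame
        frame = frame.f_back
    return None

outer = make_frame('outer', in_cleanup=True)
middle = make_frame('middle', in_cleanup=False, back=outer)
inner = make_frame('inner', in_cleanup=False, back=middle)

found = getcleanupframe(inner)
print(found.name)
```

Here ``inner`` is not in cleanup itself, but interrupting it would still abort the ``finally`` suite running in ``outer``, which is why the walk over ``f_back`` is needed.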
>>>>>> Example
>> =======
>>>> An example implementation of a ``SIGINT`` handler that interrupts
>> safely might look like::
>>>>     import inspect, sys, functools
>>>>     def sigint_handler(sig, frame):
>>         if inspect.getcleanupframe(frame) is None:
>>             raise KeyboardInterrupt()
>>         sys.setcleanuphook(functools.partial(sigint_handler, 0))
>>>> A coroutine example is out of the scope of this document, because its
>> implementation depends very much on the trampoline (or main loop) used
>> by the coroutine library.
>>>>>> Unresolved Issues
>> =================
>>>> Interruption Inside With Statement Expression
>> ---------------------------------------------
>>>> Given the statement::
>>>>     with open(filename):
>>         do_something()
>>>> Python can be interrupted after ``open`` is called, but before the
>> ``SETUP_WITH`` bytecode is executed. There are two possible decisions:
>>>> * Protect the expression inside the ``with`` statement. This would need
>>   another bytecode, since currently there is no delimiter at the start
>>   of the ``with`` expression.
>>>> * Let users write a wrapper if they consider it important for their
>>   use case. Safe wrapper code might look like the following::
>>>>       class FileWrapper(object):
>>>>           def __init__(self, filename, mode):
>>               self.filename = filename
>>               self.mode = mode
>>>>           def __enter__(self):
>>               self.file = open(self.filename, self.mode)
>>               return self.file
>>>>           def __exit__(self, exc_type, exc_value, traceback):
>>               self.file.close()
>>>> Alternatively it can be written using the ``contextmanager`` decorator::
>>>>     from contextlib import contextmanager
>>>>     @contextmanager
>>     def open_wrapper(filename, mode):
>>         file = open(filename, mode)
>>         try:
>>             yield file
>>         finally:
>>             file.close()
>>>> This code is safe, as the first part of the generator (before the
>> ``yield``) is executed inside the ``SETUP_WITH`` bytecode of the caller.
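The claim that the first half of the generator runs as part of entering the ``with`` block can be checked with today's ``contextlib``. A small sketch, in which the ``events`` list and the ``tracked_resource`` name are illustrative only:

```python
from contextlib import contextmanager

events = []

@contextmanager
def tracked_resource():
    events.append('acquire')      # runs inside __enter__
    try:
        yield 'resource'
    finally:
        events.append('release')  # runs inside __exit__

manager = tracked_resource()      # creating the manager runs nothing yet
events_before_enter = list(events)

with manager as value:
    events_inside = list(events)

print(events)
```

Evaluating the ``with`` expression only builds the manager object. The acquisition itself happens later, inside ``__enter__``, which is the part the proposal would protect.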
>>>>>> Exception Propagation
>> ---------------------
>>>> Sometimes a ``finally`` block or an ``__enter__``/``__exit__`` method
>> can be exited with an exception. Usually this is not a problem, since a
>> more important exception like ``KeyboardInterrupt`` or ``SystemExit``
>> should be raised instead. But it may be nice to be able to keep the
>> original exception in the ``__context__`` attribute. So the cleanup
>> hook signature may grow an exception argument::
>>>>     def sigint_handler(sig, frame):
>>         if inspect.getcleanupframe(frame) is None:
>>             raise KeyboardInterrupt()
>>         sys.setcleanuphook(retry_sigint)
>>>>     def retry_sigint(frame, exception=None):
>>         if inspect.getcleanupframe(frame) is None:
>>             raise KeyboardInterrupt() from exception
>>>> .. note::
>>>>    There is no need for three arguments as in the ``__exit__`` method,
>>    since exceptions have a ``__traceback__`` attribute in Python 3.x.
>>>> However, this will set ``__cause__`` on the exception, which is not
>> exactly what is intended. So some hidden interpreter logic may be used
>> to put a ``__context__`` attribute on every exception raised in the
>> cleanup hook.
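The difference between ``__cause__`` and ``__context__`` mentioned here is standard Python 3 behaviour and can be observed directly. A short sketch, in which the ``ValueError`` merely plays the role of an exception that escaped a cleanup suite:

```python
def explicit_chain():
    try:
        raise ValueError('cleanup failed')
    except ValueError as exc:
        # "raise ... from ..." sets __cause__ explicitly.
        raise KeyboardInterrupt() from exc

def implicit_chain():
    try:
        raise ValueError('cleanup failed')
    except ValueError:
        # Raising while another exception is being handled only sets
        # __context__.
        raise KeyboardInterrupt()

def capture(func):
    try:
        func()
    except KeyboardInterrupt as exc:
        return exc

explicit = capture(explicit_chain)
implicit = capture(implicit_chain)
```

With ``raise ... from ...`` the traceback reports the original exception as the direct cause. With implicit chaining it is reported as something that happened during handling, which is closer to what the text above says is intended.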
>>>>>> Interruption Between Acquiring Resource and Try Block
>> -----------------------------------------------------
>>>> The example from the first section is not entirely safe. Let's look closer::
>>>>     lock.acquire()
>>     try:
>>         do_something()
>>     finally:
>>         lock.release()
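>>>> The problem occurs if the code is interrupted just after
>> ``lock.acquire()`` returns but before the ``try`` block is entered.
>> There is no way this can be fixed without modifying the code. The
>> actual fix depends very much on the use case.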
>>>> Usually the code can be fixed using a ``with`` statement::
>>>>     with lock:
>>         do_something()
>>>> However, for coroutines you usually can't use the ``with`` statement,
>> because you need to ``yield`` for both the acquire and release
>> operations. So the code might be rewritten as follows::
>>>>     try:
>>         yield lock.acquire()
>>         do_something()
>>     finally:
>>         yield lock.release()
>>>> The actual lock code might need more code to support this use case,
>> but the implementation is usually trivial: check whether the lock has
>> been acquired, and unlock it if it has.
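A lock that tolerates being released without having been acquired might look like the sketch below. ``TolerantLock`` is a hypothetical name, and the single ``_held`` flag is a simplification that assumes one owner at a time. The sketch illustrates only the "unlock if acquired" check and does not by itself close the interruption window inside ``acquire``:

```python
import threading

class TolerantLock:
    """Hypothetical wrapper: release() is a no-op if the lock is not held."""

    def __init__(self):
        self._lock = threading.Lock()
        self._held = False

    def acquire(self):
        self._lock.acquire()
        self._held = True

    def release(self):
        # Only unlock if acquire() actually completed.
        if self._held:
            self._held = False
            self._lock.release()

    def locked(self):
        return self._lock.locked()

lock = TolerantLock()

def run(fail_before_acquire):
    try:
        if fail_before_acquire:
            raise RuntimeError('interrupted before acquire')
        lock.acquire()
    finally:
        lock.release()  # safe whether or not acquire() ran
```

Because ``release`` checks the flag first, the ``finally`` suite can run unconditionally even when the acquisition was moved inside the ``try`` block and never happened.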
>>>>>> Setting Interruption Context Inside Finally Itself
>> --------------------------------------------------
>>>> Some coroutine libraries may need to set a timeout for the finally
>> clause itself. For example::
>>>>     try:
>>         do_something()
>>     finally:
>>         with timeout(0.5):
>>             try:
>>                 yield do_slow_cleanup()
>>             finally:
>>                 yield do_fast_cleanup()
>>>> With the current semantics, the timeout will either protect the whole
>> ``with`` block or nothing at all, depending on the implementation of
>> the library. What the author intended is to treat ``do_slow_cleanup``
>> as ordinary code, and ``do_fast_cleanup`` as cleanup (non-interruptible)
>> code.
>>>> A similar case might occur when using greenlets or tasklets.
>>>> This case can be fixed by exposing ``f_in_cleanup`` as a counter, and
>> by calling the cleanup hook on each decrement. A coroutine library may
>> then remember the value at timeout start, and compare it on each hook
>> execution.
>>>> But in practice this example is considered too obscure to take into
>> account.
>>>>>> Alternative Python Implementations Support
>> ==========================================
>>>> We consider ``f_in_cleanup`` an implementation detail. The actual
>> implementation may have some fake frame-like object passed to the
>> signal handler and the cleanup hook, and returned from
>> ``getcleanupframe``. The only requirement is that the ``inspect``
>> module functions work as expected on these objects. For this reason we
>> also allow passing a ``generator`` object to the ``isframeincleanup``
>> function, which removes the need to use the ``gi_frame`` attribute.
>>>> It may need to be specified that ``getcleanupframe`` must return the
>> same object that will be passed to the cleanup hook at its next
>> invocation.
>>>>>> Alternative Names
>> =================
>>>> The original proposal had an ``f_in_finally`` flag. The original
>> intention was to protect ``finally`` clauses. But as it grew to protect
>> the ``__enter__`` and ``__exit__`` methods too, the name
>> ``f_in_cleanup`` seems better. Although the ``__enter__`` method is not
>> a cleanup routine, it at least relates to cleanup done by context
>> managers.
>>>> ``setcleanuphook``, ``isframeincleanup`` and ``getcleanupframe`` could
>> be spelled more readably as ``set_cleanup_hook``,
>> ``is_frame_in_cleanup`` and ``get_cleanup_frame``, although the current
>> names follow the conventions of their respective modules.
>>>>>> Alternative Proposals
>> =====================
>>>> Propagating 'f_in_cleanup' Flag Automatically
>> -----------------------------------------------
>>>> This can make ``getcleanupframe`` unnecessary. But for yield-based
>> coroutines you need to propagate it yourself. Making it writable leads
>> to somewhat unpredictable behavior of ``setcleanuphook``.
>>>>>> Add Bytecodes 'INCR_CLEANUP', 'DECR_CLEANUP'
>> --------------------------------------------
>>>> These bytecodes can be used to protect the expression inside the
>> ``with`` statement, as well as making counter increments more explicit
>> and easy to debug (visible in a disassembly). Some middle ground might
>> be chosen, such as ``END_FINALLY`` and ``SETUP_WITH`` implicitly
>> decrementing the counter (``END_FINALLY`` is present at the end of
>> every ``with`` suite).
>>>> However, adding new bytecodes must be considered very carefully.
>>>>>> Expose 'f_in_cleanup' as a Counter
>> ----------------------------------
>>>> The original intention was to expose the minimum functionality needed.
>> However, as we consider the frame flag ``f_in_cleanup`` an
>> implementation detail, we may expose it as a counter.
>>>> Similarly, if we have a counter we may need to have the cleanup hook
>> called on every counter decrement. It is unlikely to have much
>> performance impact, as nested ``finally`` clauses are an uncommon case.
>>>>>> Add Code Object Flag 'CO_CLEANUP'
>> ---------------------------------
>>>> As an alternative to setting the flag inside the ``SETUP_WITH`` and
>> ``WITH_CLEANUP`` bytecodes, we can introduce a flag ``CO_CLEANUP``.
>> When the interpreter starts to execute code with ``CO_CLEANUP`` set, it
>> sets ``f_in_cleanup`` for the whole function body. This flag is set
>> for the code objects of the ``__enter__`` and ``__exit__`` special
>> methods. Technically it might be set on functions called ``__enter__``
>> and ``__exit__``.
>>>> This seems to be a less clear solution. It also covers the case where
>> ``__enter__`` and ``__exit__`` are called manually. This may be
>> accepted either as a feature or as an unnecessary side effect (it is
>> unlikely to be a bug).
>>>> It may also pose a problem when the ``__enter__`` or ``__exit__``
>> functions are implemented in C, as there is usually no frame to check
>> for the ``f_in_cleanup`` flag.
>>>>>> Have Cleanup Callback on Frame Object Itself
>> ----------------------------------------------
>>>> The frame may be extended to have an ``f_cleanup_callback`` which is
>> called when ``f_in_cleanup`` is reset to 0. It would help to register
>> different callbacks for different coroutines.
>>>> Despite its apparent beauty, this solution doesn't add anything, as
>> there are two primary use cases:
>>>> * Set the callback in a signal handler. The callback is inherently a
>>   single one in this case.
>>>> * Use a single callback per loop for the coroutine use case. In almost
>>   all cases there is only one loop per thread.
>>>>>> No Cleanup Hook
>> ---------------
>>>> The original proposal included no cleanup hook specification, as there
>> are a few ways to achieve the same using current tools:
>>>> * Use ``sys.settrace`` and the ``f_trace`` callback. It may pose some
>>   problems for debugging, and has a big performance impact (although
>>   interrupting doesn't happen very often).
>>>> * Sleep a bit more and try again. For a coroutine library it's easy. For
>>   signals it may be achieved using ``signal.alarm``.
>>>> Both methods are considered too impractical, so a way to catch the
>> exit from a ``finally`` clause is proposed.
>>>>>> References
>> ==========
>>>> .. [1] Monocle
>> https://github.com/saucelabs/monocle
>>>> .. [2] Bluelet
>> https://github.com/sampsyo/bluelet
>>>> .. [3] Twisted: inlineCallbacks
>> http://twistedmatrix.com/documents/8.1.0/api/twisted.internet.defer.html
>>>> .. [4] Original discussion
>> http://mail.python.org/pipermail/python-ideas/2012-April/014705.html
>>>>>> Copyright
>> =========
>>>> This document has been placed in the public domain.
>>>>>>>> ..
>> Local Variables:
>> mode: indented-text
>> indent-tabs-mode: nil
>> sentence-end-double-space: t
>> fill-column: 70
>> coding: utf-8
>> End:
>>>> --
> Thanks,
> Andrew Svetlov
--
Thanks,
Andrew Svetlov