I want to implement a function (or several) that must run exactly once when the program terminates, no matter how this termination came about¹.
Below is my best attempt at doing this. Specifically, the code between the ####...#### dividers is the most succinct² implementation of this functionality I have managed to come up with.
I post it here primarily because even this "most succinct" implementation seems to me like far too much code for what I regard as an extremely common use case. Can the same be achieved more simply?
from __future__ import print_function

##############################################################################

import sys
import atexit
import signal


@atexit.register
def cleanup_1():
    print2('running cleanup_1')


@atexit.register
def cleanup_0():
    print2('running cleanup_0')


def excepthook(exception_type, exception_value, traceback):
    # better: log the unhandled exception
    print2('unhandled exception: {}'.format(exception_type.__name__))

sys.excepthook = excepthook


def _set_handlers():

    _numbers_to_names = {
        int(getattr(signal, name)): name
        for name in dir(signal)
        if name.startswith('SIG') and '_' not in name
    }

    def signal_handler(signal_number, stack):
        signal_name = _numbers_to_names[signal_number]
        print2('received signal: {0} ({1})'.format(signal_number,
                                                   signal_name))
        sys.exit(signal_number)

    # The signals included in the array below are the ones that cause the
    # process to terminate when I run it on my system.  This may need
    # fine-tuning for portability.
    to_handle = ['SIGHUP', 'SIGINT', 'SIGQUIT', 'SIGILL', 'SIGTRAP', 'SIGIOT',
                 'SIGBUS', 'SIGFPE', 'SIGUSR1', 'SIGSEGV', 'SIGUSR2',
                 'SIGALRM', 'SIGTERM', 'SIGXCPU', 'SIGVTALRM', 'SIGPROF',
                 'SIGPOLL', 'SIGPWR', 'SIGSYS']

    for signal_name in to_handle:
        signal_number = getattr(signal, signal_name)
        handler = signal.getsignal(signal_number)
        if handler is signal.SIG_DFL:
            signal.signal(signal_number, signal_handler)

_set_handlers()
del _set_handlers

##############################################################################

def print2(*args, **kwargs):
    if 'file' in kwargs:
        raise TypeError("'file' is an invalid argument for print2()")
    print(*args, file=sys.stderr, **kwargs)


def run(*args):
    # No args[0] lookup here: the script must also work with no argument.
    print2('running...')
    if '0' in args[:1]:
        return
    if '1' in args[:1]:
        raise RuntimeError()
    while True:
        pass


def bye():
    print2('program terminates normally')
    sys.exit(0)


def main(*args):
    run(*args)
    bye()


if __name__ == '__main__':
    main(*sys.argv[1:])

Notes on the implementation
- The code before and after the ####...####-delimited section is there just to let me try out the implemented cleanup functionality informally on the command line. (In other words, this code is not really the primary focus of this post; feel free to comment on it if you wish, but please do not dwell on it at the expense of the code between the ####...#### dividers.)
- I wrote the code so that it is compatible with both Python 2 and Python 3.
- One problem with a short example like this one is that it does not capture the complexity of production software, where multiple libraries may independently want to register their own cleanup functions and signal handlers. As I was writing this code, I tried to keep these more complex scenarios in mind, though I do not know if I succeeded. It is quite possible that, if I tried to use the code above in production, I would find it fundamentally unsuitable for such situations. Feedback on the suitability of this code for a production setting is particularly welcome.
- If you want to run the code on the command line, an argument of 0 causes the script to terminate normally (and right away); an argument of 1 causes the script to fail with a RuntimeError exception; any other argument (or no argument) causes the script to enter an infinite loop (so that one can comfortably send signals to it).
EDITS: I moved the definitions of the actual cleanup functions (cleanup_1 and cleanup_0) nearer to the top, removed some comments from the _set_handlers function, and rewrote the initialization of its to_handle variable in fewer lines.
1 I realize that, as stated, this goal is not achievable (i.e., no matter what I do, my cleanup function(s) will not run if the computer crashes, for example, while the program is running). Therefore, please interpret the stated goal as "an ideal to strive for".
2 I want to stress that, although I am looking for something "shorter", I do not intend this post as a "code golf" exercise. When I write "most succinct", I take it for granted that readability and clarity remain non-negotiable requirements. Rather, I am looking for standard Python facilities or idioms that achieve the same aims as my code does in far fewer lines, without sacrificing readability and clarity.
1 Answer
sys.exit(signal_number)
This changes the behaviour visible to the parent process: instead of seeing a child that was killed by the signal, it sees one that exited normally with the signal number as its exit status. It might be better to restore the default signal handler and then re-raise the signal (using signal.raise_signal() on ourselves), so that we don't perturb the rest of the behaviour. That would also stop us inhibiting core dumps for the signals where that's the default response.
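A minimal sketch of the restore-and-re-raise idea, not a drop-in replacement for the code under review. The names make_reraising_handler and cleanup are hypothetical, and os.kill() on our own pid stands in for signal.raise_signal(), which only exists from Python 3.8 onwards:

```python
import os
import signal


def make_reraising_handler(cleanup):
    """Build a handler that runs cleanup() and then dies from the same signal.

    `make_reraising_handler` and `cleanup` are illustrative names only.
    """
    def handler(signal_number, frame):
        try:
            cleanup()
        finally:
            # Restore the default disposition so that the re-sent signal
            # terminates the process (with a core dump where that is the
            # default) instead of re-entering this handler.
            signal.signal(signal_number, signal.SIG_DFL)
            # signal.raise_signal(signal_number) is the Python 3.8+ spelling;
            # sending the signal to our own pid also works on older versions.
            os.kill(os.getpid(), signal_number)
    return handler
```

Because the process then dies from the signal, callbacks registered with atexit do not run afterwards, so whatever is passed as cleanup has to do all of the required work itself.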
I see that we use signal.getsignal() so that we install our handler only for signals that still have their default disposition, which is pretty good. There's a tiny weakness when it returns None (meaning the existing handler was not installed from Python): we don't know whether a non-Python signal handler is going to unilaterally exit the process without returning to the interpreter, so it's impossible to know whether or not to perform the clean-up in that case. I think we'd benefit from having that in a comment.
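To make the possible return values of signal.getsignal() concrete, here is a small sketch that classifies them. The function name describe_disposition and the returned labels are made up for illustration:

```python
import signal


def describe_disposition(signal_number):
    """Classify what signal.getsignal() reports for one signal.

    `describe_disposition` and its labels are illustrative names only.
    """
    handler = signal.getsignal(signal_number)
    if handler is None:
        # A handler exists but was not installed from Python, so we cannot
        # tell whether it will ever return control to the interpreter.
        return 'unknown (installed outside Python)'
    if handler == signal.SIG_DFL:
        return 'default'
    if handler == signal.SIG_IGN:
        return 'ignored'
    return 'python handler'
```

Only the 'default' case is one where installing our own handler is clearly safe; the 'unknown' case is the one that deserves the comment suggested above.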
It's a shame we have to explicitly list the signals which normally cause program termination. It means we have to consider all platforms we could possibly run on. On Linux, we also need to catch:
- SIGABRT
- SIGEMT
- SIGLOST
- SIGPIPE
- SIGSTKFLT
- SIGXFSZ
- The full set of real-time signals.
It would be nice if there were a way to enumerate signals whose default action is to terminate the process (with or without core dump), but I'm not aware that's even possible.
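The default action itself cannot be queried, but the portability half of the problem can at least be softened by skipping names the current platform does not define. The following is a sketch under that assumption; available_signals and realtime_signals are hypothetical helper names:

```python
import signal


def available_signals(names):
    """Map each name in `names` that this platform defines to its number.

    Names the signal module does not provide here are silently skipped
    rather than raising AttributeError.
    """
    found = {}
    for name in names:
        number = getattr(signal, name, None)
        if number is not None:
            found[name] = int(number)
    return found


def realtime_signals():
    """Return the real-time signal numbers, or [] where there are none."""
    rtmin = getattr(signal, 'SIGRTMIN', None)
    rtmax = getattr(signal, 'SIGRTMAX', None)
    if rtmin is None or rtmax is None:
        return []
    return list(range(int(rtmin), int(rtmax) + 1))
```

The list of candidate names still has to be maintained by hand; this only keeps one platform's missing signal from breaking start-up on another.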
- Thank you for your feedback! I agree with you that having the signal handler run sys.exit(signal_number) is a questionable design choice. Unfortunately, the alternative design of just having the handler run signal.raise_signal(signal_number), by itself, would mean that the callbacks registered with atexit would not run. (This is the reason for trapping the signals in the first place.) Therefore one would have to "manually" run those callbacks. One can do this with atexit._run_exitfuncs(). As one could guess from the leading underscore in its name, however, ... – kjo, Oct 25, 2021 at 11:54
- ...this function is not part of the official atexit API. Therefore, this design choice boils down to which one considers the "lesser evil": running sys.exit(...) upon receiving a terminating signal, or "reaching behind" atexit's API. That said, on further thought, I am now inclined to opt for the latter as the lesser evil. – kjo, Oct 25, 2021 at 11:54
- Oh yes, that's a dilemma! I'm glad the review helped you come to a decision on what best to do here. – Toby Speight, Oct 25, 2021 at 11:56
- [...] try/finally instead - effectively re-writing with?