Implementing a clean-up function that must run at program's termination no-matter-what

Question 1

I want to implement a function (or several) that must run exactly once when the program terminates, no matter how this termination came about¹.

Below is my best attempt at doing this. Specifically, the code between the ####...#### dividers is the most succinct² implementation of this functionality I have managed to come up with.

I post it here primarily because even this, my "most succinct" implementation, seems to me like far too much code for what I regard as an extremely common use-case. Can the same be achieved more simply?

from __future__ import print_function

##############################################################################
import sys
import atexit
import signal

@atexit.register
def cleanup_1():
    print2('running cleanup_1')

@atexit.register
def cleanup_0():
    print2('running cleanup_0')

def excepthook(exception_type, exception_value, traceback):
    # better: log the unhandled exception
    print2('unhandled exception: {}'.format(exception_type.__name__))

sys.excepthook = excepthook

def _set_handlers():
    _numbers_to_names = {
        int(getattr(signal, name)): name
        for name in dir(signal)
        if name.startswith('SIG') and '_' not in name
    }

    def signal_handler(signal_number, stack):
        signal_name = _numbers_to_names[signal_number]
        print2('received signal: {0} ({1})'.format(signal_number,
                                                   signal_name))
        sys.exit(signal_number)

    # The signals included in the array below are the ones that cause the
    # process to terminate when I run it on my system. This may need
    # fine-tuning for portability.
    to_handle = ['SIGHUP', 'SIGINT', 'SIGQUIT', 'SIGILL', 'SIGTRAP', 'SIGIOT',
                 'SIGBUS', 'SIGFPE', 'SIGUSR1', 'SIGSEGV', 'SIGUSR2',
                 'SIGALRM', 'SIGTERM', 'SIGXCPU', 'SIGVTALRM', 'SIGPROF',
                 'SIGPOLL', 'SIGPWR', 'SIGSYS']

    for signal_name in to_handle:
        signal_number = getattr(signal, signal_name)
        handler = signal.getsignal(signal_number)
        # Only replace the default disposition; leave existing handlers alone.
        if handler is signal.SIG_DFL:
            signal.signal(signal_number, signal_handler)

_set_handlers()
del _set_handlers
##############################################################################

def print2(*args, **kwargs):
    if 'file' in kwargs:
        raise TypeError("'file' is an invalid argument for print2()")
    print(*args, file=sys.stderr, **kwargs)

def run(*args):
    # No args[0] lookup here: the script must also work with no arguments.
    print2('running...')
    if '0' in args[:1]:
        return
    if '1' in args[:1]:
        raise RuntimeError()
    while True:
        pass

def bye():
    print2('program terminates normally')
    sys.exit(0)

def main(*args):
    run(*args)
    bye()

if __name__ == '__main__':
    main(*sys.argv[1:])

Notes on the implementation

The code before and after the ####...####-delimited section is there just to let me try out the implemented cleanup functionality informally on the command line. (In other words, this code is not really the primary focus of this post; feel free to comment on it if you wish, but please do not dwell on it at the expense of the code between the ####...#### dividers.)
I wrote the code so that it is compatible with both Python 2 and Python 3.
One problem with a short example like this one is that it does not capture the complexity of production software, where multiple libraries may independently want to register their own clean-up functions and signal handlers. As I was writing this code, I tried to keep these more complex scenarios in mind, though I do not know if I succeeded. It is quite possible that, if I tried to use the code above in production, I would find it fundamentally unsuitable for such situations. Feedback on the suitability of this code for a production setting is particularly welcome.
If you want to run the code on the command line, an argument of 0 causes the script to terminate normally (and right away); an argument of 1 causes the script to fail with a RuntimeError exception; any other argument (or no argument) causes the script to enter an infinite loop (so that one can comfortably send signals to it).

EDITS: I moved the definitions of the actual cleanup functions (cleanup_1 and cleanup_0) nearer to the top, removed some comments from the _set_handlers function, and re-wrote the initialization of its to_handle variable in fewer lines.

¹ I realize that, as stated, this goal is not achievable (i.e., no matter what I do, my cleanup function(s) will not run if the computer crashes, for example, while the program is running). Therefore, please interpret the stated goal as "an ideal to strive for".

² I want to stress that, although I am looking for something "shorter", I do not intend this post as a "code golf" exercise. When I write "most succinct", I take it for granted that readability and clarity remain non-negotiable requirements. Rather, I am looking for standard Python facilities or idioms that achieve the same aims as my code does in far fewer lines, without sacrificing readability and clarity.

Question 2

Long time no see - welcome back! It's great to see code presented with such a helpful program to exercise it. I hope you get some good reviews for this.

Question 3

@TobySpeight: Thank you, I appreciate the encouragement!

Question 4

Have you thought of using a context manager? What are the drawbacks to using one? If the only issue is Python 2.x compatibility, have you thought about using a try/finally instead, effectively re-writing the with statement by hand?

Question 5

@Peilonrayz: sorry for not asking you this earlier, but it is not clear to me how context managers address my post's question. Could you post a code sketch of what you have in mind?
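
For readers wondering what the context-manager suggestion might look like, here is a minimal sketch. It is an assumption about what was meant, not the commenter's own code, and the name cleanup_on_exit is hypothetical. It relies only on contextlib.contextmanager and try/finally, both of which are available in Python 2 and Python 3.

```python
from __future__ import print_function

import sys
from contextlib import contextmanager


@contextmanager
def cleanup_on_exit(*cleanups):
    """Run every callable in cleanups when the with-block is left."""
    try:
        yield
    finally:
        # Reached on normal completion, on unhandled exceptions and on
        # sys.exit(), because all of these unwind the Python stack.
        for cleanup in cleanups:
            cleanup()


if __name__ == '__main__':
    with cleanup_on_exit(lambda: print('running cleanup', file=sys.stderr)):
        print('working...', file=sys.stderr)
```

Note the limitation: a signal whose default action kills the process does not unwind the Python stack, so the finally clause never runs in that case. Covering signals still needs handlers like the ones in the question.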

Question 6

Please do not incorporate answers into your question. If you have new code that you still would like to have reviewed further, please post a follow-up question instead.

Toby Speight · Accepted Answer · 2021-10-24 14:26:16Z

 sys.exit(signal_number)

This changes the behaviour visible to the parent process: it now sees an ordinary exit with a status code rather than a child killed by a signal. It might be better to restore the default handler and then re-raise the signal (sending it to ourselves with signal.raise_signal()), so that we don't perturb the rest of the behaviour. That would also stop us inhibiting core dumps for the signals where that's the default response.
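
A minimal sketch of that idea follows. The function name install_reraising_handler and its cleanup parameter are hypothetical. It uses os.kill(os.getpid(), ...) rather than signal.raise_signal(), on the assumption that the code must still run on Python 2, since signal.raise_signal() only exists from Python 3.8 onwards.

```python
import os
import signal


def install_reraising_handler(signal_number, cleanup):
    """Run cleanup() on signal_number, then die from that same signal."""
    def handler(received_number, frame):
        cleanup()
        # Restore the default disposition so that the re-sent signal ends
        # the process exactly as it would have without our handler
        # (including a core dump, where that is the default action).
        signal.signal(received_number, signal.SIG_DFL)
        os.kill(os.getpid(), received_number)

    # As in the reviewed code, only replace the default disposition.
    if signal.getsignal(signal_number) is signal.SIG_DFL:
        signal.signal(signal_number, handler)
```

With this in place the parent process sees the child as terminated by the signal, not as having exited with a status equal to the signal number.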

I see that we use signal.getsignal() so that our handler only replaces the default disposition and leaves any existing handler alone - that's pretty good. There's a tiny weakness when it returns None, which means the current handler was not installed from Python: we don't know whether that non-Python handler is going to unilaterally exit the process without returning to the interpreter, so it's impossible to know whether or not to perform the clean-up in that case. I think we'd benefit from having that in a comment.
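
To make the four possible outcomes of signal.getsignal() explicit, here is a small illustrative helper. The name describe_disposition and the labels it returns are hypothetical and not part of the reviewed code.

```python
import signal


def describe_disposition(signal_number):
    """Classify what signal.getsignal() reports for signal_number."""
    handler = signal.getsignal(signal_number)
    if handler is None:
        # Installed by non-Python code: we cannot tell whether it will
        # exit the process without returning to the interpreter.
        return 'foreign'
    if handler == signal.SIG_DFL:
        return 'default'
    if handler == signal.SIG_IGN:
        return 'ignored'
    # Any other value is a Python callable installed via signal.signal().
    return 'python'
```

Only the 'default' case is replaced by the reviewed code; the 'foreign' case is the one that deserves the comment.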

It's a shame we have to explicitly list the signals which normally cause program termination. It means we have to consider all platforms we could possibly run on. On Linux, we also need to catch

SIGABRT
SIGEMT
SIGLOST
SIGPIPE
SIGSTKFLT
SIGXFSZ
The full set of real-time signals.

It would be nice if there were a way to enumerate signals whose default action is to terminate the process (with or without core dump), but I'm not aware that's even possible.
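
One partial workaround, offered only as a heuristic sketch: Python 3.8+ provides signal.valid_signals(), which enumerates the signal numbers valid on the current platform (real-time signals included) but says nothing about their default actions. The exclusion list below is an assumption based on common Unix documentation and would need checking against each target platform; the function name is hypothetical.

```python
import signal

# Assumption: signals whose default action is usually to ignore, stop or
# continue rather than terminate, plus SIGKILL and SIGSTOP, which can
# never be caught. This varies by platform and must be verified there.
_EXCLUDED_NAMES = (
    'SIGCHLD', 'SIGCLD', 'SIGCONT', 'SIGURG', 'SIGWINCH', 'SIGINFO',
    'SIGSTOP', 'SIGTSTP', 'SIGTTIN', 'SIGTTOU',
    'SIGKILL',
)


def candidate_terminating_signals():
    """Return catchable signal numbers assumed to terminate by default."""
    excluded = {
        int(getattr(signal, name))
        for name in _EXCLUDED_NAMES
        if hasattr(signal, name)
    }
    return sorted(int(number) for number in signal.valid_signals()
                  if int(number) not in excluded)
```

This inverts the problem: a short list of exceptions is maintained by hand in place of a long list of terminating signals. It does not query the default action, which remains unavailable as far as this sketch knows.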

Thank you for your feedback! I agree with you that having the signal handler run sys.exit(signal_number) is a questionable design choice. Unfortunately, the alternative design of just having the handler run signal.raise_signal(signal_number), by itself, would mean that the callbacks registered with atexit would not run. (This is the reason for trapping the signals in the first place.) Therefore one would have to "manually" run those callbacks. One can do this with atexit._run_exitfuncs(). As one could guess from the leading underscore in its name, however, ...

...this function is not part of the official atexit API. Therefore, this design choice boils down to which one considers the "lesser evil": running sys.exit(...) upon receiving a terminating signal, or "reaching behind" atexit's API. That said, on further thought, I am now inclined to opt for the latter as the lesser evil.

Oh yes, that's a dilemma! I'm glad the review helped you come to a decision on what best to do here.
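
A possible third option, sketched here as an assumption and not as something proposed in the thread, is to keep a private registry of clean-up functions and register only its runner with atexit. A signal handler can then call the runner directly without touching atexit._run_exitfuncs(), and a guard keeps the functions from running twice. The class name CleanupRegistry is hypothetical.

```python
import atexit


class CleanupRegistry(object):
    """Hold clean-up callables and run them at most once."""

    def __init__(self):
        self._functions = []
        self._done = False
        # Normal interpreter exit goes through the public atexit API.
        atexit.register(self.run_once)

    def register(self, function):
        """Add a clean-up callable; usable as a decorator."""
        self._functions.append(function)
        return function

    def run_once(self):
        """Run the callables in reverse registration order, only once."""
        if self._done:
            return
        self._done = True
        for function in reversed(self._functions):
            function()
```

A signal handler would call registry.run_once() before re-raising the signal; the later atexit call, if it happens at all, then does nothing. This covers only functions registered through the registry, not callbacks that other libraries register with atexit themselves.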
