What is the best practice for defining a main function and entry point for a script/module that may sometimes, but not always, be started as main?
Here's how I've been doing it in the past, similar to realpython:
def main():
    result = do_stuff()
    return result

if __name__ == "__main__":
    main()
This is fine for most purposes and can give a return code if the main function is invoked from another module. However, I've recently encountered an interesting twist on it, returning exit codes:
def main():
    result = do_stuff()
    return result

if __name__ == "__main__":
    import sys
    sys.exit(main())
This has the benefit that the script is easily usable as part of any other script/program, provided the entire routine is wanted and has a useful return value.
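To illustrate that reuse, an importing program can call main() directly and inspect the result instead of spawning a new process (a small sketch; do_stuff is a stand-in for the real work):

```python
# Sketch: reusing a script's main() from another module.
# `do_stuff` stands in for the actual work the script does.
def do_stuff():
    return 0  # 0 conventionally signals success

def main():
    result = do_stuff()
    return result

# An importing program calls main() and uses the result directly:
rc = main()
print("worker finished with code", rc)  # worker finished with code 0
```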
Then, if we add argparse to the mix, it gets interesting. If we try to import our module, we'll have a problem with how we use the arguments. Sure, we could start our script as its own process, but that is needlessly inefficient.
So in a scenario with arguments and returncodes in the mix I do it like this:
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('arg1', ...)

def main(args=parser.parse_args('')):
    result = do_stuff(args.arg1)
    return result

if __name__ == "__main__":
    import sys
    sys.exit(main(args=parser.parse_args()))
This way the module can be imported by other programs, and if the main is to be used, you just have to pass the arguments in a namespace that you create yourself with the wanted arguments. The default value for main is a parse_args('') call with an empty string, which initializes the namespace with all values as None instead of reading sys.argv by itself. To make that difference more explicit, and since sys is imported for passing the exit code anyway, one could pass the args manually and have the last line look like this:
sys.exit(main(args=parser.parse_args(sys.argv[1:])))
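Calling such a main from other Python code then just means constructing the argparse.Namespace by hand (a sketch; do_stuff and the arg1 value are made up for illustration):

```python
import argparse

# Stand-ins for the real functions above.
def do_stuff(arg1):
    return f"processed {arg1}"

def main(args):
    result = do_stuff(args.arg1)
    return result

# The caller builds the namespace itself; no command line is parsed:
ns = argparse.Namespace(arg1="value-from-caller")
print(main(ns))  # processed value-from-caller
```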
So the last one is the one I'd consider going with for future Python programs/scripts. Is this good practice? Is there a better way to do this?
2 Answers
Your "doing it in the past" way is probably fine for quick & dirty scripts or for mini tools used by other developers. There's often no need to be fancy. On the contrary, Python's default traceback output can be an appropriate or even desired form of error reporting. A script falling off its end exits with code 0, and an unhandled exception exits the script with a non-zero code, so you're covered there, too:
def main() -> None:
    # here would be code raising exceptions on error
    # note the `None` return type
    ...

if __name__ == '__main__':
    main()
When you need something more fancy, working with sys.exit() explicitly is a good idea. That's basically your "future" approach. But don't only think of scripts called directly; also think of installed packages. Setuptools has a cross-platform mechanism to define functions as entry points for scripts. If you have this in your setup.py:
setuptools.setup(
    ...
    entry_points={
        'console_scripts': ['your_script=your_package.your_module:main'],
    },
)
and install that package, you can run your_script from the command line.
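For newer projects, the same mapping can also be declared in pyproject.toml instead of setup.py (a sketch, assuming a PEP 621-compliant build backend such as recent setuptools; your_script, your_package, and your_module are placeholders as above):

```toml
[project.scripts]
your_script = "your_package.your_module:main"
```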
A console_scripts function must be callable without any arguments. So as long as all parameters have default values, you're fine. For longer-running scripts, consider explicitly handling KeyboardInterrupt to print a nice message when the user aborts with Ctrl+C. In the end, the relevant part of your your_package/your_module.py might look something like this:
import sys
from typing import List, Optional

def main(cli_args: Optional[List[str]] = None) -> int:
    """
    `cli_args` makes it possible to call this function command-line-style
    from other Python code without touching sys.argv.
    """
    try:
        # Parsing with `argparse` and additional processing
        # is usually lengthy enough to extract into separate functions.
        raw_config = _parse_cli(
            sys.argv[1:] if cli_args is None else cli_args)
        config = _validate_and_sanitize(raw_config)
        # Same exception raising idea as in the simple approach
        do_real_work(config.foo, config.bar)
    except KeyboardInterrupt:
        print('Aborted manually.', file=sys.stderr)
        return 1
    except Exception as err:
        # (in real code the `except` would probably be less broad)
        # Turn exceptions into appropriate logs and/or console output.
        # Non-zero return code to signal an error. Can of course be more
        # fine-grained than this general "something went wrong" code.
        return 1
    return 0  # success

# __main__ support is still here to make this file executable without
# installing the package first.
if __name__ == '__main__':
    sys.exit(main())
- I really like that main setup. Yes, catching and printing errors before returning an error code seems nice, even though some might argue that one could just raise it again. – jaaq, Nov 3, 2020
- But why not do _parse_cli(cli_args or sys.argv[1:]) instead of the inline if? – jaaq, Nov 3, 2020
- @jaaq Why not _parse_cli(cli_args or sys.argv[1:])? For two reasons: I simply didn't think of that possibility when writing the answer. And I'm not a big fan anyway. It reads so much like a boolean expression that I tend to trip over it. No such problem with the inline if, even if it's somewhat more verbose. – besc, Nov 3, 2020
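Beyond readability, there is a behavioral difference worth knowing: an explicitly passed empty list is falsy, so the `or` version silently falls back to sys.argv, while the `is None` check respects it. A small sketch (pick_or and pick_if are made-up helper names):

```python
import sys

def pick_or(cli_args):
    # Falls back to sys.argv whenever cli_args is falsy, including [].
    return cli_args or sys.argv[1:]

def pick_if(cli_args):
    # Falls back to sys.argv only when cli_args is literally None.
    return sys.argv[1:] if cli_args is None else cli_args

sys.argv = ["prog", "--verbose"]
print(pick_or([]))  # ['--verbose'] -- the empty list is ignored
print(pick_if([]))  # [] -- the empty list is respected
```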
I have worked on this template and use-case a bit more and here's the more refined structure for files containing a main function that I currently use:
#!/usr/bin/env python3
import sys
from argparse import ArgumentParser, Namespace
from typing import Dict, List, Optional

import yaml  # just used as an example here for loading more configs, optional

def parse_arguments(cli_args: Optional[List[str]] = None) -> Namespace:
    parser = ArgumentParser()
    # parser.add_argument()
    # ...
    return parser.parse_args(args=cli_args)  # None defaults to sys.argv[1:]

def load_configs(args: Namespace) -> Dict:
    try:
        with open(args.config_path, 'r') as file_pointer:
            config = yaml.safe_load(file_pointer)
        # arrange and check configs here
        return config
    except Exception as err:
        # log errors
        print(err)
        if str(err) == "Really Bad":
            raise
        # potentially return some sane fallback defaults if desired/reasonable
        sane_defaults = {}
        return sane_defaults

def main(args: Namespace = parse_arguments()) -> int:
    try:
        # maybe load some additional config files here or in a function called
        # here, e.g. args contains a path to a config folder; or use sane
        # defaults if the config files are missing (if that is your desired
        # behavior)
        config = load_configs(args)
        do_real_work(args, config)
    except KeyboardInterrupt:
        print("Aborted manually.", file=sys.stderr)
        return 1
    except Exception as err:
        # (in real code the `except` would probably be less broad)
        # Turn exceptions into appropriate logs and/or console output.
        print("An unhandled exception crashed the application!", err)
        # Non-zero return code to signal an error. Can of course be more
        # fine-grained than this general "something went wrong" code.
        return 1
    return 0  # success

# __main__ support is still here to make this file executable without
# installing the package first.
if __name__ == "__main__":
    sys.exit(main(parse_arguments()))
Having the parse_arguments function makes integration tests much more readable, as you can then just call that function to generate the desired namespace object for you, using the same string you'd use on the CLI. Then, as suggested in the accepted answer, handle errors to give the output you'd want, and pass the arguments to the function(s) doing the work. I also load and arrange configs in this context, as necessary.
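As an illustration of that testing benefit (a sketch with a hypothetical --config-path option, not the exact parser above):

```python
from argparse import ArgumentParser, Namespace
from typing import List, Optional

# Sketch of the parse_arguments pattern; `--config-path` is a made-up option.
def parse_arguments(cli_args: Optional[List[str]] = None) -> Namespace:
    parser = ArgumentParser()
    parser.add_argument("--config-path", default="config.yaml")
    return parser.parse_args(args=cli_args)  # None defaults to sys.argv[1:]

# In an integration test, the same strings you'd type on the CLI
# build the namespace, without any subprocess:
args = parse_arguments(["--config-path", "test_config.yaml"])
print(args.config_path)  # test_config.yaml
```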
- "... sys.argv." See also e.g. stackoverflow.com/a/18161115/3001761. I would also suggest the explicit arguments to main would be better than a single argument that's the whole namespace, as the former would support default values, type annotations, etc.
- main actually does return an exit code in that way. I would consider that coupling to the CLI interface; main should return or throw a business-related value, which the code under if __name__ == "__main__": turns into an appropriate exit code. For reuse elsewhere the exit code may not make sense (unless main is that CLI wrapper and the business logic sits in some other function it invokes).