3
\$\begingroup\$

I have a configuration (YAML) file used across related modules. I also have a few additional variables to be accessed globally in my modules. I want to have them at one place (an instance of a class in my implementation below but I am willing to change it).

This implementation (based on recommendations from the web and ChatGPT) is working as intended. I like that I can define the basic dictionary needed across all modules and then add additional items to the dictionary based on the particular module I'm calling by their relevant methods.

But there must be a better implementation. In particular, I don't like storing the database engine in a dictionary (as almost all other items have strings). And storing it as a separate instance variable seems to have one outlier. Another thing I don't like is the long name when using dictionary notation config['parameter1']['parameter2'].

What is a better way to handle this? Any help is appreciated.

"""Module to setup configuration parameters (read from YAML + additional)"""
import logging
import os
from pathlib import Path
import dotenv
import sqlalchemy
import yaml
from . import utils
# project directory as basepath irrespective of Python script or Jupyter notebook
BASEPATH = utils.agnostic_path(Path().resolve())
dotenv.load_dotenv()
codify_env = os.getenv("CODIFY_ENV")
codify_client = os.getenv("CODIFY_CLIENT")
class CodifyConfig:
 """
 CodifyConfig
 """
 def __init__(self, path_config: Path = Path(BASEPATH, "config", "config.yml")):
 self.path_config = path_config
 self.config = self.load_config()
 self.setup_additional_params()
 def load_config(self):
 """Loads project wide YAML configuration file"""
 with open(self.path_config, "r", encoding="utf-8") as file:
 config = yaml.safe_load(file)["client"][codify_client][codify_env]
 return config
 def setup_additional_params(self):
 """
 Adds parameters to config not in the YAML config file because
 initial YAML parameters are chosen based on some of these parameters
 """
 self.config["basepath"] = BASEPATH
 self.config["client"] = codify_client
 self.config["env"] = codify_env
 def setup_database(self, db: str = ""):
 """Create database engine"""
 db = db or self.config.get("database_url")
 self.config["engine"] = sqlalchemy.create_engine(db)
 def setup_logging(self, fn_log: str = "default.log") -> None:
 """Defaults and file name for logging"""
 path_log = Path(self.config["basepath"], "logs", fn_log)
 logging.basicConfig(
 filename=path_log,
 filemode="a",
 format="%(asctime)s,%(msecs)d %(name)s %(levelname)s %(message)s",
 datefmt="%H:%M:%S",
 level=logging.INFO,
 )
asked Nov 22, 2023 at 17:56
\$\endgroup\$
1

1 Answer 1

3
\$\begingroup\$

Another thing I don't like is the long name when using dictionary notation config['parameter1']['parameter2'].

Consider using the glom library, which was designed to address exactly that concern.


informative identifier

# project directory as basepath irrespective of Python script or Jupyter notebook
BASEPATH = utils.agnostic_path(Path().resolve())

Thank you for the comment, I find it helpful.

But I'm sad that you felt you needed a comment. That is, agnostic_path seems to be an unhelpful identifier. Ideally we wouldn't need to write a comment at all.

Maybe utils.script_folder() would be more self explanatory? I often will add a get_repo_top() helper to my projects, which does the equivalent of $ git rev-parse --show-toplevel.

I feel your utility function should definitely return Path rather than str. It is more descriptive. And it is easier / less visually intrusive to convert to str, with f"{folder}", than the other way 'round.


clean import

dotenv.load_dotenv()

This is not terrible, but I'd rather see it protected by if __name__ == '__main__': or run within some function.

I assume it will complain if the calling user lacks dot files. A $ python -m unittest run should import cleanly, without error, even if dot files are missing. An import on a CI/CD host or container should work smoothly, even if dot files are missing.

From its section you're telling me pretty clearly that $ pip install dotenv is the way to obtain this library. But I don't see that method in the source.


globals

codify_env = os.getenv("CODIFY_ENV")
codify_client = os.getenv("CODIFY_CLIENT")

These are kind of OK as module level symbols. But maybe there's no need to clutter up that Public API namespace, maybe we could defer the lookup until we need it? Or do it within an __init__ ctor?

As a followup to the "clean import" remarks above, I do thank you for accepting None when env var is undefined. Flip side of that is, when we actually use the value, maybe os.eviron["CODIFY_ENV"] would offer a clearer diagnostic in the event someone forgot to set it?


unhelpful docstring

 """
 CodifyConfig
 """

Please delete.

Or write an English sentence describing the responsibility of this class. That will be an aid to future maintainers, who will pass up the temptation to add kitchen_sink to the class, based on your advice about what does and does not belong here.

In contrast, you've written a bunch of nice method and module docstrings, thank you.

If a linter "made" you insert a docstring, then consider using a linter that is less picky by default, or change your current linter's config.


local variable vs object attribute

 self.path_config = path_config

Consider deleting that, preferring a call to self.load_config(path_config).

Consider renaming the parameter to config_path.


extra helper

Usually I encourage folks to break up functions using Extract Helper, and then break them up some more. But you already have nice bite-sized pieces here. Maybe setup_additional_params should be merged with load_config? Then the codify client and env wouldn't go out of scope, so we could create them as local variables, use them twice, and be done with them.

Consider renaming setup_additional_params to setup_runtime_params, as an explanation of why those params can't come from a static text file.


dict vs object attribute

 self.config["engine"] = sqlalchemy.create_engine(db)

Yeah, I agree with you, that is not a "config" item, and it looks weird. Prefer:

 self.engine = sqlalchemy.create_engine(db)

BTW, I like that db or ... defaulting idiom. However, I don't like the default empty string value, as it is useless, it won't give us e.g. a sqlite in-memory database. So I recommend eliding the default value.


abbrev

 def setup_logging(self, fn_log: str = "default.log") -> None:

It wouldn't hurt to spell out filename_log (or log_filename).

 path_log = Path(self.config["basepath"], "logs", fn_log)

This is perfectly fine as-is. Usual way to phrase this would be with / slashes:

 path_log = self.config["basepath"] / "logs" / fn_log)

This code achieves its design goals.

I would be willing to delegate or accept maintenance tasks on it.

answered Nov 22, 2023 at 21:12
\$\endgroup\$
3
  • \$\begingroup\$ Thank you for the help. I have applied, almost, all of the improvements mentioned. The one thing I am unsure of is the last sentence of "clean import" section. Which method don't you see in the source? \$\endgroup\$ Commented Nov 23, 2023 at 7:26
  • \$\begingroup\$ By "clean import" I mean: no fatal error on missing env var, no chatty print() side effects, that sort of thing. The OP code calls load_dotenv(). It was my belief that pip would install a certain package. But when I visited the linked package source code I was unable to locate load_dotenv(), so I couldn't tell what failure modes it might suffer from. \$\endgroup\$ Commented Nov 23, 2023 at 7:38
  • 1
    \$\begingroup\$ Got it. Thanks. The package is: pypi.org/project/python-dotenv \$\endgroup\$ Commented Nov 23, 2023 at 8:01

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.