UPDATE: Second revision on separate post. Runtime function overloading / dynamic dispatch for Python (2nd revision)
When I first started using Python I had a rough time dealing with some of its dynamically typed nature. In particular I was pretty used to leveraging the type system of other languages to be able to define polymorphic (or "overloaded") functions and methods. At first I implemented this pattern in Python using stubs and defining functions that do runtime type checks to switch between different behaviour, but that involved a lot of boilerplate, made the resulting functions and methods brittle and had a lot of downsides when working with member functions/methods. A couple of years ago I made this library [1], which allows for a succinct way of defining polymorphic functions using a decorator. I've used it in most of my Python projects ever since, but no one outside my team has ever seen it or reviewed it. So here it is.
I'd like to ask:
- Does the interface seem ergonomic?;
- is the code remotely readable/understandable?
- am I missing something, i.e., are there any glaring issues with the code?
- could the candidate selection strategy be improved?
- although performance is not a main issue (it is a library for an interpreted language after all), are there any obvious areas where runtime overhead could be reduced? and
- how would one approach a rewrite of __call__ so as to allow the overloads to return proper types and not Any, to leverage type checking tools like MyPy.
Of course, I'd also love to read any thorough review or critique in terms of runtime overhead, general style (is the code "pythonic"?), or any other comment/note about the library.
"""
===============
sobrecargar.py
===============
Method and function overloading for Python 3.
* Project Repository: https://github.com/Hernanatn/sobrecargar.py
* Documentation: https://github.com/Hernanatn/sobrecargar.py/blob/master/README.MD
Hernan ATN | [email protected]
"""
__author__ = "Hernan ATN"
__license__ = "MIT"
__version__ = "1.0"
__email__ = "[email protected]"
__all__ = ['overload']
from inspect import signature, Signature, Parameter, ismethod
from types import MappingProxyType
from typing import Callable, TypeVar, Iterator, ItemsView, OrderedDict, Self, Any, List, Tuple, Iterable, Generic
from collections.abc import Sequence, Mapping
from collections import namedtuple
from functools import partial
from sys import modules, version_info
from itertools import zip_longest
import __main__
if version_info < (3, 9):
    raise ImportError("Module 'sobrecargar' requires Python 3.9 or higher.")
# Public Interface
class overload():
    """
    Class that acts as a type-function decorator, allowing the definition of multiple
    versions of a function or method with different sets of parameters and types.
    This enables function overloading similar to that found in statically typed
    programming languages like C++.

    Class Attributes:
        _overloaded (dict): A dictionary that maintains a record of 'overload'
        instances created for each decorated function or method. Keys are function
        or method names, and values are 'overload' instances.

    Instance Attributes:
        overloads (dict): A dictionary storing the defined overloads for the
        decorated function or method. Keys are Signature objects representing
        overload signatures, and values are corresponding functions or methods.
    """
    _overloaded : dict[str, 'overload'] = {}

    def __new__(cls, function : Callable)-> 'overload':
        """
        Constructor. Creates a single instance per function name.
        Args:
            function (Callable): The function or method to be decorated.
        Returns:
            overload: The 'overload' class instance associated with the provided function name.
        """
        full_name : str = cls.__full_name(function)
        if full_name not in cls._overloaded.keys():
            cls._overloaded[full_name] = super().__new__(overload)
        return cls._overloaded[full_name]

    def __init__(self, function : Callable) -> None:
        """
        Initializer. Responsible for initializing the overloads dictionary
        (if not already present) and registering the current version of the
        decorated function or method.
        Args:
            function (Callable): The decorated function or method.
        """
        if not hasattr(self, 'overloads'):
            self.overloads : dict[Signature, Callable] = {}

        signature : Signature
        underlying_function : Callable
        signature, underlying_function = overload.__unwrap(function)

        if type(self).__is_method(function):
            cls : type = type(self).__get_class(function)
            for ancestor in cls.__mro__:
                for base in ancestor.__bases__:
                    if base is object : break
                    full_method_name : str = f"{base.__module__}.{base.__name__}.{function.__name__}"
                    if full_method_name in type(self)._overloaded.keys():
                        base_overload : 'overload' = type(self)._overloaded[full_method_name]
                        self.overloads.update(base_overload.overloads)

        self.overloads[signature] = underlying_function
        if not self.__doc__: self.__doc__ = ""
        self.__doc__ += f"\n{function.__doc__ or ''}"
    def __call__(self, *args, **kwargs) -> Any:
        """
        Method that allows the decorator instance to be called as a function.
        The module's core engine. Validates the provided parameters and builds
        a tuple of 'candidates' from functions that match the provided parameters.
        Prioritizes the overload that best fits the types and number of arguments.
        If multiple candidates match, propagates the result of the most specific one.
        Args:
            *args: Positional arguments passed to the function or method.
            **kwargs: Nominal arguments passed to the function or method.
        Returns:
            Any: The result of the selected version of the decorated function or method.
        Raises:
            TypeError: If no compatible overload exists for the provided parameters.
        """
        _C = TypeVar("_C", bound=Sequence)
        _T = TypeVar("_T", bound=Any)
        Candidate : namedtuple = namedtuple('Candidate', ['score', 'function_object', "function_signature"])
        candidates : List[Candidate] = []

        def validate_container(value : _C, container_parameter : Parameter) -> int | bool:
            type_score : int = 0
            container_annotation = container_parameter.annotation
            if not hasattr(container_annotation, "__origin__") or not hasattr(container_annotation, "__args__"):
                type_score += 1
                return type_score
            if not issubclass(type(value), container_annotation.__origin__):
                return False
            container_arguments : Tuple[type[_C]] = container_annotation.__args__
            has_ellipsis : bool = Ellipsis in container_arguments
            has_single_type : bool = len(container_arguments) == 1 or has_ellipsis

            if has_ellipsis:
                aux_container_list : list = list(container_arguments)
                aux_container_list[1] = aux_container_list[0]
                container_arguments = tuple(aux_container_list)

            type_iterator : Iterator
            if has_single_type:
                type_iterator = zip_longest((type(t) for t in value), container_arguments, fillvalue=container_arguments[0])
            else:
                type_iterator = zip_longest((type(t) for t in value), container_arguments)

            if not issubclass(type(value[0]), container_arguments[0]):
                return False
            for received_type, expected_type in type_iterator:
                if expected_type == None :
                    return False
                if received_type == expected_type:
                    type_score += 2
                elif issubclass(received_type, expected_type):
                    type_score += 1
                else:
                    return False
            return type_score

        def validate_parameter_type(value : _T, function_parameter : Parameter) -> int | bool:
            type_score : int = 0
            expected_type = function_parameter.annotation
            received_type : type[_T] = type(value)
            is_untyped : bool = (expected_type == Any)
            default_value : _T = function_parameter.default
            is_null : bool = value is None and default_value is None
            is_default : bool = value is None and default_value is not function_parameter.empty
            param_is_self : bool = function_parameter.name=='self' or function_parameter.name=='cls'
            param_is_variable : bool = function_parameter.kind == function_parameter.VAR_POSITIONAL or function_parameter.kind == function_parameter.VAR_KEYWORD
            param_is_container : bool = hasattr(expected_type, "__origin__") or (issubclass(expected_type, Sequence) and not issubclass(expected_type, str)) or issubclass(expected_type, Mapping)

            is_different_type : bool
            if param_is_variable and param_is_container:
                is_different_type = not issubclass(received_type, expected_type.__args__[0])
            elif param_is_container:
                is_different_type = not validate_container(value, function_parameter)
            else:
                is_different_type = not issubclass(received_type, expected_type)

            if not is_untyped and not is_null and not param_is_self and not is_default and is_different_type:
                return False
            elif param_is_variable and not param_is_container:
                type_score += 1
            else:
                if param_is_variable and param_is_container:
                    if received_type == expected_type.__args__[0]:
                        type_score +=2
                    elif issubclass(received_type, expected_type.__args__[0]):
                        type_score +=1
                elif param_is_container:
                    type_score += validate_container(value, function_parameter)
                elif received_type == expected_type:
                    type_score += 4
                elif issubclass(received_type, expected_type):
                    type_score += 3
                elif is_default:
                    type_score += 2
                elif is_null or param_is_self or is_untyped:
                    type_score += 1

            return type_score

        def validate_signature(function_parameters : MappingProxyType[str,Parameter], positional_count : int, positional_iterator : Iterator[tuple], nominal_view : ItemsView) -> int |bool:
            signature_score : int = 0
            this_score : int | bool
            for positional_value, positional_name in positional_iterator:
                this_score = validate_parameter_type(positional_value, function_parameters[positional_name])
                if this_score:
                    signature_score += this_score
                else:
                    return False
            for nominal_name, nominal_value in nominal_view:
                if nominal_name not in function_parameters: return False
                this_score = validate_parameter_type(nominal_value, function_parameters[nominal_name])
                if this_score:
                    signature_score += this_score
                else:
                    return False
            return signature_score

        for signature, function in self.overloads.items():
            length_score : int = 0
            function_parameters : MappingProxyType[str,Parameter] = signature.parameters
            positional_count : int = len(function_parameters) if type(self).__has_var_args(function_parameters) else len(args)
            nominal_count : int = len({nom : kwargs[nom] for nom in function_parameters if nom in kwargs}) if (type(self).__has_var_kwargs(function_parameters) or type(self).__has_only_nom(function_parameters)) else len(kwargs)
            default_count : int = type(self).__has_default(function_parameters) if type(self).__has_default(function_parameters) else 0
            positional_iterator : Iterator[tuple[Any,str]] = zip(args, list(function_parameters)[:positional_count])
            nominal_view : ItemsView[str,Any] = kwargs.items()

            if (len(function_parameters) == 0 or not (type(self).__has_variables(function_parameters) or type(self).__has_default(function_parameters))) and len(function_parameters) != (len(args) + len(kwargs)): continue

            if len(function_parameters) - (positional_count + nominal_count) == 0 and not(type(self).__has_variables(function_parameters) or type(self).__has_default(function_parameters)):
                length_score += 3
            elif len(function_parameters) - (positional_count + nominal_count) == 0:
                length_score += 2
            elif (0 <= len(function_parameters) - (positional_count + nominal_count) <= default_count) or (type(self).__has_variables(function_parameters)):
                length_score += 1
            else:
                continue

            signature_validation_score : int | bool = validate_signature(function_parameters, positional_count, positional_iterator, nominal_view)
            if signature_validation_score:
                this_candidate : Candidate = Candidate(score=(length_score+2*signature_validation_score), function_object=function, function_signature=signature)
                candidates.append(this_candidate)
            else:
                continue

        if candidates:
            if len(candidates)>1:
                candidates.sort(key= lambda c: c.score, reverse=True)
            best_function = candidates[0].function_object
            return best_function(*args, **kwargs)
        else:
            raise TypeError(f"[ERROR] No overloads of {function.__name__} exist for the provided parameters:\n {[type(pos) for pos in args]} {[(k,type(nom)) for k,nom in kwargs.items()]}\n Supported overloads: {[dict(sig.parameters) for sig in self.overloads.keys()]}")
    def __get__(self, obj, obj_type):
        #
        class OverloadedMethod:
            __doc__ = self.__doc__
            __call__ = partial(self.__call__, obj) if obj is not None else partial(self.__call__, obj_type)
        return OverloadedMethod()

    # Private Interface
    @staticmethod
    def __unwrap(function : Callable) -> tuple[Signature, Callable]:
        while hasattr(function, '__func__'):
            function = function.__func__
        while hasattr(function, '__wrapped__'):
            function = function.__wrapped__
        signature : Signature = signature(function)
        return (signature, function)

    @staticmethod
    def __full_name(function : Callable) -> str :
        return f"{function.__module__}.{function.__qualname__}"

    @staticmethod
    def __is_method(function : Callable) -> bool :
        return function.__name__ != function.__qualname__ and "<locals>" not in function.__qualname__.split(".")

    @staticmethod
    def __is_nested(function : Callable) -> bool:
        return function.__name__ != function.__qualname__ and "<locals>" in function.__qualname__.split(".")

    @staticmethod
    def __get_class(method : Callable) -> type:
        return getattr(modules[method.__module__], method.__qualname__.split(".")[0])

    @staticmethod
    def __has_variables(function_parameters : MappingProxyType[str,Parameter]) -> bool:
        for parameter in function_parameters.values():
            if overload.__has_var_kwargs(function_parameters) or overload.__has_var_args(function_parameters): return True
        return False

    @staticmethod
    def __has_var_args(function_parameters : MappingProxyType[str,Parameter]) -> bool:
        for parameter in function_parameters.values():
            if parameter.kind == Parameter.VAR_POSITIONAL: return True
        return False

    @staticmethod
    def __has_var_kwargs(function_parameters : MappingProxyType[str,Parameter]) -> bool:
        for parameter in function_parameters.values():
            if parameter.kind == Parameter.VAR_KEYWORD: return True
        return False

    @staticmethod
    def __has_default(function_parameters : MappingProxyType[str,Parameter]) -> int | bool:
        default_count : int = 0
        for parameter in function_parameters.values():
            if parameter.default != parameter.empty: default_count+=1
        return default_count if default_count else False

    @staticmethod
    def __has_only_nom(function_parameters : MappingProxyType[str,Parameter]) -> bool:
        for parameter in function_parameters.values():
            if parameter.kind == Parameter.KEYWORD_ONLY: return True
        return False
English documentation
Description
sobrecargar is a Python module that includes a single homonymous class, which provides the implementation of a universal @decorator that allows defining multiple versions of a function or method with different sets of parameters and types. This enables function overloading similar to that found in other programming languages like C++.
Basic Usage
Decorating a Function:
You can use @overload [2] as the decorator for functions or methods.
from sobrecargar import overload

@overload
def my_function(parameter1: int, parameter2: str):
    # Code for the first version of the function
    ...

@overload
def my_function(parameter1: float):
    # Code for the second version of the function
    ...
Decorating a method / member function:
Since sobrecargar interferes with the normal compilation flow of function code, and methods (member functions) are typically defined when defining the class, decorating methods requires special syntax. Attempting to use overload like this:
from sobrecargar import overload

class MyClass:
    @overload
    def my_method(self, parameter1: int, parameter2: str):
        # Code for the first version of the method
        ...

    @overload
    def my_method(self, parameter1: float):
        # Code for the second version of the method
        ...
Will produce an error like:
[ERROR] AttributeError: module __main__ does not have a 'MyClass' attribute.
This happens because when overload tries to create the dispatch dictionary for the different overloads of my_method, the class named MyClass has not yet finished being defined, and therefore the compiler doesn't know it exists.
The solution is to provide a signature for the class before attempting to overload any of its methods. The signature only requires the class name and inheritance scheme.
from sobrecargar import overload

class MyClass: pass # By providing a signature for the class, you ensure that `sobrecargar` can reference it at compile time

class MyClass:
    @overload
    def my_method(self, parameter1: int, parameter2: str):
        # Code for the first version of the method
        ...

    @overload
    def my_method(self, parameter1: float):
        # Code for the second version of the method
        ...
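The timing issue described above can be reproduced without the library. The following is a minimal sketch, not sobrecargar's code: record_class_lookup is a hypothetical decorator, and it checks the function's global namespace as a stand-in for the module attribute lookup. It assumes the code runs at module level.

```python
seen = []

def record_class_lookup(function):
    # Hypothetical decorator (not part of sobrecargar): record whether the
    # owning class name is already bound when the method gets decorated.
    class_name = function.__qualname__.split(".")[0]
    seen.append(class_name in function.__globals__)
    return function

class Example:
    @record_class_lookup
    def method(self):
        ...

# Once the class statement has finished, the name is bound.
seen.append("Example" in globals())
print(seen)  # [False, True]
```

The decorator runs while the class body is still executing, so the name Example is only bound afterwards. Declaring an empty class with the same name first makes the lookup succeed, although it then finds the placeholder class rather than the final one.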
Edit: added an example
A more complete example that shows a plausible use case, as requested in this comment
By far the most frequent use case I personally have for function overloading is class constructor overloading. For example, consider some rudimentary database record model.
Given a table Products:
| id | SKU | Title | Artist (FK) | Description | Format | Price | ... |
|---|---|---|---|---|---|---|---|
| 1 | A-123-C-77 | Jazz in Ba | 899 | ... | CD | 5.99 | ... |
| 2 | A-705-V-5 | We'll be togheter at last | 7566 | ... | Vynil | 8.99 | ... |
| 3 | B-905-C-5 | Ad Cordis | 123 | ... | CD | 3.99 | ... |
| 4 | B-101-C-77 | Brain Damage | 1222 | ... | CD | 3.99 | ... |
| ... | ... | ... | ... | ... | ... | ... | ... |
One could define a class Product that represents a row of that table:
class SomeDbAbstraction:
    ...
    def run_query(self, query : str, ...) -> dict[str,Any]: ...
    def get_insert_id(self) -> int: ...
    ...

class Format(Enum):
    _invalid = 0
    CD = 1
    Vynil = 2

class Artist: ...

class Product:
    __slots__ = (
        "__id",
        "sku",
        "title",
        "artist",
        "description",
        "format",
        "price",
        ...
    )

    def __init__(
        self,
        id : int,
        sku : str,
        title : str,
        artist : Artist,
        description : str,
        format : Format,
        price : float,
        ...
    ) -> None:
        self.__id = id
        self.sku = sku
        self.title = title
        self.artist = artist
        self.description = description
        self.format = format
        self.price = price
        ...
Let's say that the values for each column can come from varied sources, e.g., a read from the database, a JSON API endpoint, an HTTP form, some other Python code, &c. Then we would need to define utility functions / classmethods that correctly handle each case. One possible implementation would be:
    @classmethod
    def fromId(cls, db : SomeDbAbstraction, id : int) -> 'Product':
        data : dict[str,Any] = db.run_query(f"SELECT * FROM Product WHERE id = {id};")
        return cls(
            data.get("id"),
            data.get("sku"),
            data.get("title"),
            data.get("artist"),
            data.get("description"),
            data.get("format"),
            data.get("price"),
            ...
        )

    @classmethod
    def fromSku(cls, db : SomeDbAbstraction, sku : str) -> 'Product':
        data : dict[str,Any] = db.run_query(f"SELECT * FROM Product WHERE sku = {sku};")
        return cls(
            data.get("id"),
            data.get("sku"),
            data.get("title"),
            data.get("artist"),
            data.get("description"),
            data.get("format"),
            data.get("price"),
            ...
        )

    @classmethod
    def newProduct(
        cls,
        db : SomeDbAbstraction,
        sku : str,
        title : str,
        artist : Artist,
        description : str,
        format : Format,
        price : float,
    ) -> 'Product':
        new_id = db.run_query(f"""
            INSERT INTO
                Product
            SET
                sku = {sku},
                title = {title},
                artist = {artist.id},
                description = {description},
                format = {format.name},
                price = {price}
            ;
        """).get_insert_id()
        return cls(
            new_id,
            sku,
            title,
            artist,  # the Artist object is already available here
            description,
            format,
            price
        )

    @classmethod
    def newProduct_w_artistId(
        cls,
        db : SomeDbAbstraction,
        sku : str,
        title : str,
        artistId : int,
        description : str,
        format : Format,
        price : float,
    ) -> 'Product':
        new_id = db.run_query(f"""
            INSERT INTO
                Product
            SET
                sku = {sku},
                title = {title},
                artist = {artistId},
                description = {description},
                format = {format.name},
                price = {price}
            ;
        """).get_insert_id()
        return cls(
            new_id,
            sku,
            title,
            Artist.fromId(db, artistId),
            description,
            format,
            price
        )

    @classmethod
    def newProduct_from_dict(
        cls,
        db : SomeDbAbstraction,
        data : dict
    ) -> 'Product':
        new_id = db.run_query(f"""
            INSERT INTO
                Product
            SET
                sku = {data.get("sku")},
                title = {data.get("title")},
                artist = {data.get("artistId")},
                description = {data.get("description")},
                format = {data.get("format")},
                price = {data.get("price")}
            ;
        """).get_insert_id()
        return cls(
            new_id,
            data.get("sku"),
            data.get("title"),
            Artist.fromId(db, data.get("artistId")),
            data.get("description"),
            data.get("format"),
            data.get("price")
        )
In each of those cases the user of the model needs to explicitly choose the function for that case.
With sobrecargar one can, instead, provide overloaded methods:
class Product:
    __slots__ = (
        "__id",
        "sku",
        "title",
        "artist",
        "description",
        "format",
        "price",
        ...
    )

    @overload
    def __init__(
        self,
        db : SomeDbAbstraction,
        sku : str,
        title : str,
        artist : Artist,
        description : str,
        format : Format,
        price : float,
        ...
    ) -> None:
        new_id = db.run_query(f"""
            INSERT INTO
                Product
            SET
                sku = {sku},
                title = {title},
                artist = {artist.id},
                description = {description},
                format = {format.name},
                price = {price}
            ;
        """).get_insert_id()
        self.__id = new_id
        self.sku = sku
        self.title = title
        self.artist = artist
        self.description = description
        self.format = format
        self.price = price
        ...

    @overload
    def __init__(self, db : SomeDbAbstraction, id : int):
        data : dict[str,Any] = db.run_query(f"SELECT * FROM Product WHERE id = {id};")
        self.__id = data.get("id")
        self.sku = data.get("sku")
        self.title = data.get("title")
        self.artist = Artist.fromId(db, data.get("artist"))
        self.description = data.get("description")
        self.format = Format(data.get("format"))
        self.price = data.get("price")

    @overload
    def __init__(self, db : SomeDbAbstraction, sku : str):
        data : dict[str,Any] = db.run_query(f"SELECT * FROM Product WHERE sku = {sku};")
        self.__id = data.get("id")
        self.sku = data.get("sku")
        self.title = data.get("title")
        self.artist = Artist.fromId(db, data.get("artist"))
        self.description = data.get("description")
        self.format = Format(data.get("format"))
        self.price = data.get("price")

    @overload
    def __init__(
        self,
        db : SomeDbAbstraction,
        sku : str,
        title : str,
        artistId : int,
        description : str,
        format : Format,
        price : float,
        ...
    ) -> None:
        new_id = db.run_query(f"""
            INSERT INTO
                Product
            SET
                sku = {sku},
                title = {title},
                artist = {artistId},
                description = {description},
                format = {format.name},
                price = {price}
            ;
        """).get_insert_id()
        self.__id = new_id
        self.sku = sku
        self.title = title
        self.artist = Artist.fromId(db, artistId)
        self.description = description
        self.format = format
        self.price = price
        ...
Then, when using the model, one simply calls Product() with the relevant parameters, without having to explicitly choose the appropriate overload each time.
Note that the two implementations are not strictly equivalent, as the overloaded one disallows instantiation of a Product with both an id and record data. That difference is intended to highlight a feature of function overloading: it allows for the implicit (or emerging, if you'd like) definition of constraints. In this case that constraint serves as a guarantee that every Product object instantiated by id is an up-to-date representation of the record in the database. The first implementation allows construction of a Product object, say, of id 5 that refers to a record with id=5 in the db, but that can have arbitrary data.
Please keep in mind that this is an overly simplified approximation of a real use case. It's littered with issues and bad practices (for instance there are no checks, nor sanitization, template strings are used for raw queries, &c.)
Candidate selection strategy
Overloaded function signatures are evaluated and scored based on the match between provided arguments and expected parameters.
The process iterates over all registered overloads in self.overloads, where each overload is represented by a signature and its corresponding function.
1. Length Score
- Evaluate argument length match between function signature and provided arguments.
- If the signature has no parameters, or has neither variable parameters nor default arguments, and the number of signature parameters doesn't match the sum of positional and nominal arguments, the signature is ignored.
- If the number of positional and nominal arguments exactly matches the signature parameters, and the signature has no variable parameters or default arguments, assign a high score of 3. This indicates a perfect length match.
- If the argument count exactly matches signature parameters, but the signature has default arguments or variable parameters, assign a moderate score of 2.
- If provided arguments are equal to or less than signature parameters, and the signature has default arguments or variable parameters, assign a score of 1. This indicates a partial length match.
- In any other case, ignore the signature.
2. Signature Score
- Evaluate type match based on function signature and argument types.
- Use the validate_signature function to determine if argument types match expected signature types.
- Assign a score based on type matching. If signature validation succeeds, obtain a positive score based on type compatibility.
- If signature validation fails (returns False), ignore the overload.
3. Candidate List Construction
- For each valid overload, create a Candidate object storing the overloaded function, corresponding signature, and calculated score. Type scoring takes precedence over length scoring.
- Add candidates to the candidates list.
4. Best Candidate Selection
- Check if candidates exist. If no valid candidates, raise a TypeError.
- If multiple candidates exist, sort by scores, prioritizing highest scores. Select the candidate with the highest score as the preferred overloaded function.
5. Result
- Call the preferred candidate with provided arguments and return its result.
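The scoring idea in the steps above can be sketched in a few lines. This is a deliberately simplified illustration, not the library's implementation: type_score and pick are hypothetical helpers that handle positional arguments only, require every parameter to carry a plain class annotation, and ignore defaults, variadics, containers and the length score. The weights 4 (exact type) and 3 (subclass) mirror the ones used in validate_parameter_type.

```python
from inspect import signature

def type_score(function, args):
    """Score a call with positional args, or return None if incompatible."""
    parameters = list(signature(function).parameters.values())
    if len(parameters) != len(args):
        return None
    total = 0
    for value, parameter in zip(args, parameters):
        expected = parameter.annotation  # assumed to be a plain class
        if type(value) is expected:
            total += 4  # exact type match
        elif isinstance(value, expected):
            total += 3  # subclass match
        else:
            return None
    return total

def pick(overloads, *args):
    """Call the highest-scoring compatible overload (first one wins ties)."""
    scored = [(type_score(f, args), f) for f in overloads]
    candidates = [(score, f) for score, f in scored if score is not None]
    if not candidates:
        raise TypeError("no compatible overload")
    _, best = max(candidates, key=lambda candidate: candidate[0])
    return best(*args)

def describe_int(x: int) -> str:
    return "int"

def describe_object(x: object) -> str:
    return "object"

def describe_pair(x: int, y: int) -> str:
    return "pair"

overloads = [describe_int, describe_object, describe_pair]
print(pick(overloads, 5))     # int
print(pick(overloads, "a"))   # object
print(pick(overloads, 1, 2))  # pair
```

Note that a value such as True scores 3 against both int and object in this sketch, so the result depends on registration order. Ties of this kind are worth keeping in mind when evaluating any score-based selection strategy.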
Github repo: https://github.com/Hernanatn/sobrecargar.py
Available on PyPI: pip install sobrecargar
[1] Note: both the library and its documentation are written in Spanish; this post presents a translation
[2] Note: overload is an alias for sobrecargar baked into the library
2 Answers
-
Does the interface seem ergonomic?;
Yes. Except having to define MyClass twice for methods. Which you may be able to solve by changing the algorithm to be lazy, which is also far more complicated.
-
is the code remotely readable/understandable?
For typing metaprogramming yes. But every time I've done typing metaprogramming I'm left with horrible code and the question "why do I do this to myself?"
-
am I missing something, i.e., are there any glaring issues with the code?
Your typing introspection seems somewhat basic. And doesn't seem to take into account annoying things like how typed Python has diverged from Python.
I wrote a typing introspection library for Python 3.5.0+ a while ago. Not something I'd do again. I'd just use an existing typing introspection library which solved all the annoying parts for you. (Not mine, mine is dead.)
-
how would one approach a rewrite to __call__ as to allow for the overloads to return proper types and not Any, to leverage type checking tools like MyPy.
Welcome to typed Python, good luck. You probably won't have fun.
My biggest problem with typed Python is you just can't do some things untyped Python can.
However, when I typically write code I'll write a couple hundred lines then run the code. I'll tend to make a handful of mistakes. Type hints basically solve most of my issues; think wrong argument name or other typos. Now I'm left with annoying ones like off by one.
As such if you can't fix the typing here I wouldn't find your library useful. I have to use an annoying subset of Python with no typing benefits.
-
although performance is not a main issue (it is a library for an interpreted language after all), are there any obvious areas where runtime overhead could be reduced? and
Let's say you have:
    floats: Iterable[float]
    for f in floats:
        my_function(f)
Then you'll be calculating the function to use in my_function multiple times. We could pass the expected type to get the underlying function once.
    floats: Iterable[float]
    fn = my_function.get(float)
    for f in floats:
        fn(f)
The simple question I'm now met with is why would I want my_function.get(float) over just skipping the library and giving each function a different name?
As such the solution here is either: accept the code will always be slow, or you need to write a bespoke Python compiler like -O/-OO or numba.jit. I don't think Python has an easy way to do the latter, especially with the typed syntax tree you'd need to interact with.
-
Of course, I'd also love to read any thorough review or critique in terms of ... any other comment/note about the library.
If you solve the MyClass and typing issues mentioned then your interface will probably be a bit degraded from the nice interface native languages provide.
I really like metaprogramming, so I have written a couple of dynamic dispatch implementations over the years. I wouldn't again because I couldn't find a nice interface, I hated maintaining the code (typing introspection) and disliked the code being slow. Each time I was trying to solve the problems with my implementation I got fed up and just gave the functions different names. I've been defeated.
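One middle ground for the repeated-dispatch cost raised above is to remember, per tuple of argument types, which implementation was chosen. The following is a minimal sketch under stated assumptions: TypeCachedDispatcher and resolve are hypothetical names, only positional arguments are handled, and selection is assumed to depend on the outer types alone. That assumption does not hold for the library as posted, where container element types and None defaults also influence the score.

```python
from typing import Any, Callable

class TypeCachedDispatcher:
    """Hypothetical wrapper: resolve once per tuple of argument types."""

    def __init__(self, resolve: Callable[[tuple], Callable[..., Any]]) -> None:
        self._resolve = resolve
        self._cache: dict = {}
        self.resolutions = 0  # counts how often the slow path ran

    def __call__(self, *args: Any) -> Any:
        key = tuple(type(arg) for arg in args)
        target = self._cache.get(key)
        if target is None:
            target = self._resolve(key)  # slow path: pick an implementation
            self._cache[key] = target
            self.resolutions += 1
        return target(*args)

def resolve(types: tuple) -> Callable[..., Any]:
    # Stand-in for the expensive candidate selection.
    table = {
        (int,): lambda x: x + 1,
        (float,): lambda x: x * 2.0,
    }
    return table[types]

dispatch = TypeCachedDispatcher(resolve)
results = [dispatch(f) for f in (1.0, 2.0, 3.0)]
print(results, dispatch.resolutions)  # [2.0, 4.0, 6.0] 1
```

The loop resolves the float implementation once and reuses it afterwards, which keeps the call site unchanged at the price of an extra dictionary lookup per call.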
I will offer some subjective opinions and objective observations.
Benefits
The main benefit of the overload pattern as I see it (including, but not exclusively, as implemented in your code) is that it can logically group a set of closely-related functions together. In a very large API this approach can reduce the number of function names that a programmer needs to remember. Finally, in a context where the function is supposed to logically act in a similar manner despite different-typed inputs, Python's duck-typing philosophy is that the function should attempt to plow ahead regardless of what it's been given, and if that's only possible by splitting the function out into different implementations based on the input types, then overloading can attempt to present a unified interface.
Drawbacks
Subjectively: if I were to see this code at work, I would recoil in horror. I would fear for what it implies in terms of added complexity and difficulty of debugging, both of this meta-library and the modules that use it.
This code does not come without cost. It has to be tested and maintained, and anyone coming from a pure-Python background will need to learn how it works, so there's training cost. There is runtime performance cost. There is debugging cost: if something goes wrong in one of the overloads... which overload failed, and why was that one selected?
To the logical-group aspect: there are better ways to logically group methods. Classes can do this when the functions are best-represented as methods; if they aren't, then the closest thing that Python has to a namespace is a module. I would be happy to see a small, purpose-written module with different but related functions - each having a different name and signature - that act as convenience wrappers for a core logic function.
To the function memorisation aspect: all modern IDEs have some form of auto-complete, and so long as you choose your function names well, having different function names with similar prefixes will still allow for reasonable symbol traversal. I will offer that only having one function name acting as an interface to multiple frontend implementations hampers ergonomics rather than helping: if you are reading code and you see a function call out-of-context, which overload will it call? Are you sure?
To the duck-typing aspect: the more you vary a single function's types, the more complexity balloons and static analysis and testability are hindered. If a function is only supposed to accept an int, testing and documenting it is easier than if it might accept an int or float or string or a JSON dictionary or a Numpy array or a Pandas dataframe or a Pandas series (this seems like an exaggeration but there are functions in the wild written to behave this way).
In a world where API documentation matters, it's easier to write and read reference documentation for which each (dedicated) function signature has only one way that it can be used. Overloaded functions require that you sift through sub-sections of the function documentation to see which one applies.
The capacity of dynamically-typed languages to clearly and structurally represent overloads is handicapped when compared to that of statically-typed languages (@Booboo correctly alludes to this in the comments). In statically-typed languages, the compiler can make a guarantee that signature inference is being performed in a predictable, well-defined, efficient manner, and is supported by a well-defined and well-documented system of warnings and errors in case of ambiguity or resolution failure. For these reasons, using overloads in e.g. C# is much safer than it is in Python, and I take less issue with using it there.
In dynamically-typed languages, this inference is much more difficult and burdensome to perform. Python was not built with this in mind, and it's fairly square-peg-round-hole. PEP 443 describing `functools.singledispatch` does exist, and potentially could be used in place of your library under certain circumstances; but (a) it's really just a convenience around in-body type-checking; and (b) the new `match` syntax makes this type checking more convenient - so again, if the overload pattern is to be used (which I generally advise against), there's simple syntactic sugar that obviates a meta-library.
As an excellent example of how an API can be muddied with overloading, the library I always love to pick on is matplotlib, for which signatures are so hopelessly overloaded that it's often impossible to determine what they do and don't support without guessing.
In what you've intended as a counterexample, you say
Then we would need to define utility functions / classmethods that correctly [handle] each case [...] In each of those cases the user of the model needs to explicitly choose the function for that case.
Yes. Correct. That's a good thing, not a bad thing. You're already quite familiar with Python, so you've probably already read PEP 20 (the Zen of Python), which states as one of Python's guiding principles:
Explicit is better than implicit.
Alternatives
Objectively: This is a lot of code that doesn't need to exist, and everything that this library offers can be reduced in stages:
- Without using this library, plenty of existing code still uses signature overloading via `*args, **kwargs`. This can be annotated with `typing.overload`. However, this still lends itself to more complexity than is necessary.
- Just... write different functions with different names. No run-time type checking, no metaclasses, no decorators, no overloads. Add type hints for each signature, and if appropriate make those thin wrapper functions around a core logic function.
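For reference, the first alternative might look roughly like the sketch below. The `area` function is a made-up example; `typing.overload` itself is standard library and affects only static checkers, not run-time behaviour.

```python
from typing import Any, overload


@overload
def area(side: float, /) -> float: ...
@overload
def area(width: float, height: float, /) -> float: ...


def area(*args: Any, **kwargs: Any) -> float:
    # The @overload stubs above only inform static checkers; the
    # branching below is still ordinary run-time argument inspection.
    if kwargs:
        raise TypeError("area() takes no keyword arguments")
    if len(args) == 1:
        return args[0] ** 2
    if len(args) == 2:
        return args[0] * args[1]
    raise TypeError(
        f"area() takes 1 or 2 positional arguments, got {len(args)}"
    )
```

The stubs and the hand-written branching have to be kept in sync manually, which is the extra complexity the first bullet refers to. The second alternative would replace all of this with two separately named, separately annotated functions.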
In short, this is a whole lot of work for what I consider to be anywhere from a non-feature to an anti-feature. It's cool that you got it to work, and it's good to see that you do care about static analysis, but truly the best path to reliable, easy static analysis is to simply separate your signatures.
- @Reinderien I don't expect to convince you that, at least in some cases, the benefits outweigh the drawbacks for this approach. Regardless, thank you for taking the time to give an honest opinion on the pattern; you've pointed out some interesting things to consider. I do notice that your answer refers only to the pattern in general and is not a review of the specific implementation, so some of the points made simply do not apply to this case (particularly, the bits about debugging seem to not take into account the measures the library takes to that effect, etc.), so I won't address those. – HernanATN, Jan 26, 2025 at 22:36
- @Reinderien You praise the benefits of `duck typing`, and in particular the philosophy that "the function should attempt to plow ahead regardless (...)", but then you note that "if a function is only supposed to accept an `int`, testing and documenting it is easier than if it might accept an `int` or `float` or string or (...)", and then suggest "Just... write different functions with different names.". Besides being contradictory, it kneecaps the primary feature of dynamic typing: the caller shouldn't be required to always know what types he's passing (but should be warned if it's nonsense). – HernanATN, Jan 26, 2025 at 22:43
- @Reinderien `typing.overload` is just a hint for LSPs and external linters / static checkers. 1. It doesn't make any guarantees about what the actual implementation of the function accepts or returns. 2. It requires extra boilerplate. 3. As it's only a hint, when the exact same set of instructions can't handle the diverse inputs, branches (read: the dreaded runtime overhead) need to be introduced for the function to "plow ahead regardless". The only difference is that those branches are inlined, instead of being checked at the moment the function is called. – HernanATN, Jan 26, 2025 at 22:52
- @Reinderien I understand and respect your opinion regarding function overloading in Python, but Alternative 1 just seems like a worse version of this pattern (that suffers from all the drawbacks you've mentioned, some that the library submitted for review tries to eliminate/mitigate) and Alternative 2 is "just don't do it", which I feel exists somewhere outside the implementation of said it. – HernanATN, Jan 26, 2025 at 22:55
- @HernanATN, thank you very much for adding that `Products` example. It makes a good argument. But, alas, you can't win them all, I am unconvinced. I would rather use standard tooling with `def product_from_id` and `def product_from_sku`, it just seems like the natural approach to the business problem. Reinderien did an admirable job of keeping an open mind, and I agree with all of his analysis. – J_H, Jan 27, 2025 at 4:32
- `abc` is a utility for defining abstract classes; that is, it enables defining virtual base classes that need further subtyping and concrete "children" to be used, allowing polymorphism for types... It's a module that improves OOP inheritance patterns in Python. It's a different problem domain altogether. `abc` does not provide dynamic dispatch for functions (nor methods), i.e., same-named funcs that take diverse parameters and produce diverse outputs, as it's not intended to; please see PEP 3119.
- [...] `abc`? P.S. It's ok if you don't find a use case for function overloading. I do, that's why I made it.
- [...] `def display_parameters(verbose: bool = False):` to default it.