Goal: extract methods/functions defined in module. This excludes:
- Imports
- Lambda methods
- Magic methods
- Builtin methods
- Class methods
- Classes
- Non-original definitions (i.e. function assignments, alt_name = orig_fn)
My approach + test below. Any room for improvement, or false positives/negatives? (In particular I wonder if we can get away without typechecks, e.g. isinstance(method, types.LambdaType))
Code: live demo
import utils

def get_module_methods(module):
    def is_module_function(obj):
        return ('<function' in str(obj) and
                module.__name__ in getattr(obj, '__module__', ''))

    def not_lambda(obj):
        return not ('<lambda>' in getattr(obj, '__qualname__', ''))

    def not_magic(obj):
        s = getattr(obj, '__name__', '').split('__')
        return (len(s) < 2) or not (s[0] == s[-1] == '')

    def not_duplicate(name, obj):
        return name == getattr(obj, '__name__', '')

    def cond(name, obj):
        return (is_module_function(obj) and not_lambda(obj) and
                not_magic(obj) and not_duplicate(name, obj))

    objects = {name: getattr(module, name) for name in dir(module)}

    m_methods = {}
    for name, obj in objects.items():
        if cond(name, obj):
            m_methods[name] = obj
    return m_methods

mm = get_module_methods(utils)
_ = [print(k, '--', v) for k, v in mm.items()]
Test:
# utils.py
import random
import numpy as np
from inspect import getsource

def fn1(a, b=5):
    print("wrong animal", a, b)

class Dog():
    meow = fn1

    def __init__(self):
        pass

    def bark(self):
        print("WOOF")

d = Dog()
barker = d.bark
mewoer = d.meow

A = 5
arr = np.random.randn(100, 100)

def __getattr__(name):
    return getattr(random, name, None)

magic = __getattr__
duplicate = fn1
_builtin = str
lambd = lambda: 1

# main.py
import utils
# def get_module_methods(): ...

mm = get_module_methods(utils)
_ = [print(k, '--', v) for k, v in mm.items()]

fn1 -- <function fn1 at 0x7f00a7eb2820>
Edit: additional test case for a lambda surrogate:
import functools

def not_lambda():
    return 1

functools.wraps(lambda i: i)(not_lambda).__name__

The code in the question and the code in the answer both treat not_lambda as a lambda, which is a false positive.
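To see why name-based checks misfire on this test case, here is a minimal sketch (not part of the original post). It relies on functools.wraps copying __name__ and __qualname__ from the lambda while leaving the function's code object untouched, so __code__.co_name still carries the name from the def statement:

```python
import functools

def not_lambda():
    return 1

# wraps() copies metadata (__name__, __qualname__, ...) from the lambda onto not_lambda
wrapped = functools.wraps(lambda i: i)(not_lambda)

print(wrapped is not_lambda)     # same function object, only its metadata changed
print(wrapped.__name__)          # name copied from the lambda
print(wrapped.__qualname__)      # qualified name copied from the lambda
print(wrapped.__code__.co_name)  # name recorded by the def statement at compile time
```

Any check that looks only at __name__ or __qualname__ therefore cannot tell this function apart from a real lambda.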
3 Answers
pyclbr

The standard library includes the pyclbr module, which provides functions for building a module browser. pyclbr.readmodule_ex() returns a tree of nested dicts with the functions (def statements) and classes (class statements) in a module. So you just need to check the top level dict for items of class pyclbr.Function.

Note that readmodule_ex takes the name of the module (a str), not the actual module.

Just tested this, and it prints out imported functions and dunder functions as well, so those need to be filtered out.
import pyclbr

module = 'tutils'

for k, v in pyclbr.readmodule_ex(module).items():
    if (isinstance(v, pyclbr.Function)
            and v.module == module
            and not v.name.startswith('__')):
        print(f"{v.name} -- {v.file} at {v.lineno}")
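As a self-contained way to try the approach above, the following sketch writes a small module to a temporary directory and passes that directory through the path argument of readmodule_ex. The module name pyclbr_demo_utils, its contents, and the helper top_level_functions are made up for illustration and only loosely follow the question's test case:

```python
import pyclbr
import tempfile
from pathlib import Path

# Hypothetical module source, loosely based on the test case in the question
SOURCE = '''
def fn1(a, b=5):
    return a + b

class Dog:
    def bark(self):
        return "WOOF"

def __getattr__(name):
    return None

duplicate = fn1
lambd = lambda: 1
'''

def top_level_functions(module_name, search_dir):
    """Names of non-dunder functions defined with def at the module's top level."""
    tree = pyclbr.readmodule_ex(module_name, path=[search_dir])
    return sorted(
        item.name
        for item in tree.values()
        if isinstance(item, pyclbr.Function)   # skip classes
        and item.module == module_name         # skip anything pulled in by imports
        and not item.name.startswith('__')     # skip dunder functions
    )

with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, 'pyclbr_demo_utils.py').write_text(SOURCE)
    found = top_level_functions('pyclbr_demo_utils', tmp)

print(found)
```

Because pyclbr works on the source text, the alias and the lambda assignment never show up as functions; only def statements do.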
- Looks plausible, but the code given as-is fails the test case in the question (nothing is printed). (OverLordGoldDragon, May 12, 2020 at 15:58)
- @OverLordGoldDragon, it printed out too much when I ran it, so I had to add tests for imported functions and dunder functions. Note, the source file for the module has to be on the search path so pyclbr can find it. (RootTwo, May 12, 2020 at 19:04)
- This does work, but it's more of an 'alternative' than 'code review'. Also worth mentioning is this note from the docs: "information is extracted from the Python source code rather than by importing the module, so this module is safe to use with untrusted code", which is an advantage over existing answers. (OverLordGoldDragon, May 13, 2020 at 0:18)

- Not a fan of all these redundant functions.
- Use equality for equality and in for in.
  '<function' in str(obj) -> str(obj).startswith('<function')
  module.__name__ in ... -> module.__name__ == ...
  not ('<lambda>' in ...) -> '<lambda>' not in ... -> '<lambda>' != ...
  Let's play a game of what if:
  - What if foo.fn is copied to foo.bar.fn?
  - What if someone built a function from a lambda, so they changed the name to fn_from_<lambda>?
  Yeah, let's stick to using == for equals.
- Why is there a dictionary comprehension when the for after it consumes it? This is just a waste of memory.
- Like the dictionary comprehension, this list comprehension is a waste of memory. Additionally, it's normally easier to work from a plain for loop, as you can use assignments.
  _ = [print(k, '--', v) for k, v in mm.items()]
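The difference between substring and equality checks from the second bullet can be seen in a short sketch. The names foo, foo.bar and fn_from_<lambda> are the hypothetical ones used in the "what if" list above:

```python
module_name = 'foo'        # hypothetical module being inspected
copied_module = 'foo.bar'  # __module__ of a function that lives in a submodule

print(module_name in copied_module)  # substring match also accepts 'foo.bar'
print(module_name == copied_module)  # equality accepts only 'foo' itself

renamed = 'fn_from_<lambda>'         # a regular function with an unusual name
print('<lambda>' in renamed)         # substring check misreads it as a lambda
print('<lambda>' == renamed)         # equality check does not
```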
import utils

def get_module_methods(module):
    output = {}
    for name in dir(module):
        obj = getattr(module, name)
        obj_name = getattr(obj, '__name__', '')
        if (str(obj).startswith('<function')  # Is function
            and '<lambda>' != obj_name  # Not a lambda
            and module.__name__ == getattr(obj, '__module__', '')  # Same module
            and name == obj_name
            and not (  # Dunder
                obj_name.startswith('__')
                and obj_name.endswith('__')
                and len(obj_name) >= 5
            )
        ):
            output[name] = obj
    return output

mm = get_module_methods(utils)
for k, v in mm.items():
    print(k, '--', v)
Let's play another game of what if:
- What if I have a class whose __str__ returns '<function...'?
- What if someone built a function from a lambda, so changed the __name__?
  import functools
  functools.wraps(lambda i: i)(...).__name__

"In particular I wonder if we can get away without typechecks"

No.
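One possible way to close both loopholes, offered as a sketch under stated assumptions and not as the answer's own method: use a real type check against types.FunctionType, and compare names against __code__.co_name, which functools.wraps does not overwrite. The function name get_module_functions and the demo_utils module built with exec are hypothetical stand-ins for the question's utils.py:

```python
import types

def get_module_functions(module):
    """Return {name: function} for plain functions defined in `module` (sketch)."""
    output = {}
    for name, obj in vars(module).items():
        if not isinstance(obj, types.FunctionType):  # real type check, not str()
            continue
        if obj.__module__ != module.__name__:        # defined in another module
            continue
        code_name = obj.__code__.co_name             # name given where it was defined
        if code_name == '<lambda>' or code_name != name:  # lambda, or an alias
            continue
        if name.startswith('__') and name.endswith('__'):  # dunder
            continue
        output[name] = obj
    return output

# Hypothetical stand-in for utils.py, including the functools.wraps test case
SOURCE = '''
import functools

def fn1(a, b=5):
    return a + b

class Dog:
    meow = fn1
    def bark(self):
        return "WOOF"

d = Dog()
barker = d.bark

def __getattr__(name):
    return None

magic = __getattr__
duplicate = fn1
_builtin = str
lambd = lambda: 1

def not_lambda():
    return 1
functools.wraps(lambda i: i)(not_lambda)
'''

demo = types.ModuleType('demo_utils')
exec(SOURCE, demo.__dict__)
found = sorted(get_module_functions(demo))
print(found)
```

The co_name comparison has its own blind spots: a function returned by a decorator carries the inner wrapper's code name and would be skipped, so this is a starting point, not a drop-in replacement.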
- Thanks for the analysis. Valid loopholes in last two bullets, but the answer doesn't address them; I suppose a typecheck addresses the first, but how would one reliably identify a lambda? (OverLordGoldDragon, May 12, 2020 at 14:27)
- To address some points: (1) I don't see the problem with "foo.fn is copied to foo.bar.fn"; (2) '<lambda>' != ...: I can't tell how this is better than '<lambda>' not in; (3) "waste of memory" - it's for readability, and their memory use is negligible (and garbage-collected); (4) # Same module: not necessarily; if in PYTHONPATH, a package level can be bypassed, e.g. from a import b -> import b, but this has its own problems, and I agree with your preference for == here; still worth a mention. (OverLordGoldDragon, May 12, 2020 at 14:33)
- No, you haven't - and your answer stands as incomplete by your own assessment. Maybe I'll still 'accept' it as a net-positive, but the shortcomings should be supposedly easy to correct. (OverLordGoldDragon, May 12, 2020 at 16:30)
- To clarify, I see how '<lambda>' != works - my question concerns not the running point "(2)", but "how to not filter a trojan horse" (e.g. functools's). Dedicated a question here. (OverLordGoldDragon, May 12, 2020 at 17:23)
- I've done nothing to harass you. (OverLordGoldDragon, May 12, 2020 at 18:13)

Thanks to @Peilonrayz for the original answer, but I figure it's worth including a more complete version of the working function:
from types import LambdaType

def get_module_methods(module):
    output = {}
    for name in dir(module):
        obj = getattr(module, name)
        obj_name = getattr(obj, '__name__', '')

        if ((str(obj).startswith('<function')
             and isinstance(obj, LambdaType))  # is a function
            and module.__name__ == getattr(obj, '__module__', '')  # same module
            and name in str(getattr(obj, '__code__', ''))  # not a duplicate
            and "__%s__" % obj_name.strip('__') != obj_name  # not a magic method
            and '<lambda>' not in str(getattr(obj, '__code__'))  # not a lambda
        ):
            output[name] = obj
    return output
Lambda detection taken from this Q&A. Note that we can omit the isinstance(obj, LambdaType) typecheck, but with caveats:

str(obj).startswith('<function') is insufficient to tell whether obj is a function, since we can define a class whose __str__ returns '<function' (as pointed out by Peilonrayz). However, __str__ is effective on a class instance, which fails the # not a duplicate check, so we can "luck out". This is unlikely in practice, but here's where a typecheck may be necessary:
class Dog():
    def __init__(self):
        self.__name__ = 'd'

d = Dog()
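To make the caveat concrete, here is a small sketch with a hypothetical FakeFunction class (not from the original answer) whose __str__ imitates a function's string form. It also shows that types.LambdaType and types.FunctionType are the same type, so the isinstance check accepts every plain function and cannot by itself single out lambdas:

```python
import types

class FakeFunction:
    """Hypothetical non-function object that imitates a function's string form."""
    def __init__(self):
        self.__name__ = 'fake'

    def __str__(self):
        return '<function fake at 0x0>'

fake = FakeFunction()

def real():
    return 1

print(str(fake).startswith('<function'))       # the string check is fooled
print(isinstance(fake, types.FunctionType))    # the type check is not
print(isinstance(real, types.LambdaType))      # a def function passes the LambdaType check
print(types.LambdaType is types.FunctionType)  # both names refer to one type
```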
Edit: changed the # not a duplicate check, which would falsely exclude not_lambda.