11

I'm developing a documentation testing framework -- basically unit tests for PDFs. Tests are (decorated) methods of instances of classes defined by the framework, and these are located and instantiated at runtime and the methods are invoked to execute the tests.

My goal is to cut down on the amount of quirky Python syntax that the people who will write tests need to be concerned about, as these people may or may not be Python programmers, or even very much programmers at all. So I would like them to be able to write "def foo():" instead of "def foo(self):" for methods, but still be able to use "self" to access members.

In an ordinary program I would consider this a horrible idea, but in a domain-specific-languagey kind of program like this one, it seems worth a try.

I have successfully eliminated the self from the method signature by using a decorator (actually, since I am using a decorator already for the test cases, I would just roll it into that), but "self" does not then refer to anything in the test case method.

I have considered using a global for self, and even come up with an implementation that more or less works, but I'd rather pollute the smallest namespace possible, which is why I would prefer to inject the variable directly into the test case method's local namespace. Any thoughts?

asked Aug 10, 2010 at 22:28
3
  • 11
    Just teach them that def foo(self): is part of the boilerplate that needs to go on every function. Don't focus on the why, just emphasize that it MUST be there, and you'll probably be fine. Commented Aug 10, 2010 at 22:36
  • You are probably right, but I'm still interested to see what people come up with! Commented Aug 10, 2010 at 22:46
  • 1
    Why don't you just make your classes modules and your methods functions in the module? Less boilerplate and stuff to do. py.test does this very effectively. Commented Dec 23, 2010 at 17:24

5 Answers

6

My accepted answer to this question was pretty dumb, but I was just starting out. Here's a much better way. It is only lightly tested, but it is good enough to demonstrate the proper way to do this improper thing. It works on 2.6.5 for sure. I haven't tested any other versions, but no opcodes are hardcoded into it, so it should be about as portable as most other 2.x code.

add_self can be applied as a decorator, but that would defeat the purpose (why not just type 'self'?). It would be easy to adapt the metaclass from my other answer to apply this function instead.

import opcode
import types
def instructions(code):
    """Iterates over a code string yielding integer [op, arg] pairs
    If the opcode does not take an argument, just put None in the second part
    """
    code = map(ord, code)
    i, L = 0, len(code)
    extended_arg = 0
    while i < L:
        op = code[i]
        i += 1
        if op < opcode.HAVE_ARGUMENT:
            yield [op, None]
            continue
        oparg = code[i] + (code[i+1] << 8) + extended_arg
        extended_arg = 0
        i += 2
        if op == opcode.EXTENDED_ARG:
            extended_arg = oparg << 16
            continue
        yield [op, oparg]
def write_instruction(inst):
    """Takes an integer [op, arg] pair and returns a list of character bytecodes"""
    op, oparg = inst
    if oparg is None:
        return [chr(op)]
    elif oparg < 65536L:
        # Fits in two bytes (65536 itself does not, hence the strict comparison)
        return [chr(op), chr(oparg & 255), chr((oparg >> 8) & 255)]
    elif oparg < 4294967296L:
        # The argument is large enough to need 4 bytes and the EXTENDED_ARG opcode
        return [chr(opcode.EXTENDED_ARG),
                chr((oparg >> 16) & 255),
                chr((oparg >> 24) & 255),
                chr(op),
                chr(oparg & 255),
                chr((oparg >> 8) & 255)]
    else:
        raise ValueError("Invalid oparg: {0} is too large".format(oparg))
def add_self(f):
    """Add self to a method
    Creates a new function by prepending the name 'self' to co_varnames, and
    incrementing co_argcount and co_nlocals. Increase the index of all other locals
    by 1 to compensate. Also removes 'self' from co_names and decrease the index of
    all names that occur after it by 1. Finally, replace all occurrences of
    `LOAD_GLOBAL i,j` that make reference to the old 'self' with 'LOAD_FAST 0,0'.
    Essentially, just create a code object that is exactly the same but has one more
    argument.
    """
    code_obj = f.func_code
    try:
        self_index = code_obj.co_names.index('self')
    except ValueError:
        raise NotImplementedError("self is not a global")
    # The arguments are just the first co_argcount co_varnames
    varnames = ('self', ) + code_obj.co_varnames
    names = tuple(name for name in code_obj.co_names if name != 'self')
    code = []
    for inst in instructions(code_obj.co_code):
        op = inst[0]
        if op in opcode.haslocal:
            # The index is now one greater because we added 'self' at the head of
            # the tuple
            inst[1] += 1
        elif op in opcode.hasname:
            arg = inst[1]
            if arg == self_index:
                # This refers to the old global 'self'
                if op == opcode.opmap['LOAD_GLOBAL']:
                    inst[0] = opcode.opmap['LOAD_FAST']
                    inst[1] = 0
                else:
                    # If `self` is used as an attribute, real global, module
                    # name, module attribute, or gets looked at funny, bail out.
                    raise NotImplementedError("Abnormal use of self")
            elif arg > self_index:
                # This rewrites the index to account for the old global 'self'
                # having been removed.
                inst[1] -= 1
        code += write_instruction(inst)
    code = ''.join(code)
    # type help(types.CodeType) at the interpreter prompt for this one
    new_code_obj = types.CodeType(code_obj.co_argcount + 1,
                                  code_obj.co_nlocals + 1,
                                  code_obj.co_stacksize,
                                  code_obj.co_flags,
                                  code,
                                  code_obj.co_consts,
                                  names,
                                  varnames,
                                  '<OpcodeCity>',
                                  code_obj.co_name,
                                  code_obj.co_firstlineno,
                                  code_obj.co_lnotab,
                                  code_obj.co_freevars,
                                  code_obj.co_cellvars)
    # help(types.FunctionType)
    return types.FunctionType(new_code_obj, f.func_globals)
class Test(object):
    msg = 'Foo'
    @add_self
    def show(msg):
        print self.msg + msg
t = Test()
t.show('Bar')
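
For readers wondering why the rewrite above targets LOAD_GLOBAL: a name that is never assigned and is not a parameter gets compiled as a global lookup. The following is only a small sketch of that observation, written in Python 3 spelling (an assumption; the answer's code targets 2.x) and using the standard dis module. The function `show` here is a stand-alone illustration, not part of the answer's code.

```python
import dis

def show(msg):
    # 'self' is never assigned and is not a parameter,
    # so the compiler treats it as a global name.
    return self.msg + msg

code = show.__code__
# 'self' is listed among the global/attribute names, not among the locals.
print('self' in code.co_names)      # True
print('self' in code.co_varnames)   # False

# Collect the opcode names of every instruction whose argument is 'self'.
self_ops = [ins.opname for ins in dis.get_instructions(show)
            if ins.argval == 'self']
print(self_ops)
```

This is the lookup that add_self turns into a local (argument) load after prepending 'self' to the argument list.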
answered Aug 10, 2010 at 22:42

5 Comments

Aaron, your original suggestion works fine for my use case (there will only ever be one instance of a given class at a time), but it's always good to see a way to do it better. More to the point, I'm certain I'll learn a lot figuring out what the heck you've done here. I'm pretty shocked and impressed that both you and martineau have been continuing to gnaw away at this problem. :-)
@Kindall Doing it this way occurred to me when I solved this problem. This is a pretty obvious application. I'd been too lazy, and then martineau started giving the thread attention.
@Kindall, updated the comments. I hadn't realized how paltry they were. Should be easier to understand now.
Very impressive -- I'm learning a lot, so thanks, esp for the updated comments.
FWIW, I just stumbled upon an entry on Michael Foord's Voidspace blog titled Selfless Python which uses a decorator to do something similar. Equally interesting is how it can be applied to all the methods in a class definition via his The Selfless Metaclass.
5

Here's a one line method decorator that seems to do the job without modifying any Special attributes of Callable types* marked Read-only:

# method decorator -- makes undeclared 'self' argument available to method
injectself = lambda f: lambda self: eval(f.func_code, dict(self=self))
class TestClass:
    def __init__(self, thing):
        self.attr = thing
    @injectself
    def method():
        print 'in TestClass::method(): self.attr = %r' % self.attr
        return 42
test = TestClass("attribute's value")
ret = test.method()
print 'return value:', ret
# output:
# in TestClass::method(): self.attr = "attribute's value"
# return value: 42

Note that unless you take precautions to prevent it, a side-effect of the eval() function may be it adding a few entries -- such as a reference to the __builtin__ module under the key __builtins__ -- automatically to the dict passed to it.
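
The side effect mentioned above can be observed directly. This is a minimal sketch in Python 3 spelling (an assumption: `__code__` instead of `func_code`, and the key refers to the `builtins` module rather than `__builtin__`); the names `Thing`, `namespace` and `clean` are made up for the illustration.

```python
class Thing:
    attr = 'value'

def method():
    # No parameters: 'self' is resolved from the globals dict given to eval().
    return self.attr

# eval() runs the function's code object with our dict as its globals.
namespace = {'self': Thing()}
result = eval(method.__code__, namespace)
print(result)             # value
print(sorted(namespace))  # ['__builtins__', 'self']

# One possible precaution: hand eval() a throwaway copy,
# so the dict you keep is left untouched.
clean = {'self': Thing()}
eval(method.__code__, dict(clean))
print(sorted(clean))      # ['self']
```

Whether the extra key matters depends on what the dict is used for afterwards; if it is discarded after each call, as in the one-line decorator, it is harmless.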

@kendall: Per your comment about how you're using this with methods being in container classes (but ignoring the injection of additional variables for the moment) -- is the following something like what you're doing? It's difficult for me to understand how things are split up between the framework and what the users write. It sounds like an interesting design pattern to me.

# method decorator -- makes undeclared 'self' argument available to method
injectself = lambda f: lambda self: eval(f.func_code, dict(self=self))
class methodclass:
    def __call__():
        print 'in methodclass::__call__(): self.attr = %r' % self.attr
        return 42
class TestClass:
    def __init__(self, thing):
        self.attr = thing
    method = injectself(methodclass.__call__)
test = TestClass("attribute's value")
ret = test.method()
print 'return value:', ret
# output
# in methodclass::__call__(): self.attr = "attribute's value"
# return value: 42
answered Nov 11, 2010 at 0:22

8 Comments

unfortunately I don't think that this will extend to methods with arguments (at least not cleanly). The solution that I have posted (as crappy as it is) will. The problem seems to be a fundamental limitation of eval.
also, func_globals is read only in that it can not be assigned to, i.e., made to point to a different dict. It can clearly be modified.
This is an interesting approach, but if you pass in your own dict of globals, you can't access any other globals. I tried passing in locals, but that didn't work at all. I could create a copy of the function's func_globals and update it with my new variable(s), I guess, and pass that in.
... and that's exactly what I ended up doing. I'm not passing anything in, so that issue is not a limitation for my use, but it is convenient for me to be able to "inject" more than one variable for my test methods, so Aaron's bytecode hack (clever though it is) is not quite as good a fit.
BTW, I recently figured out how to create a new function from an existing function, replacing just the globals dict. With that method you can still pass arguments. Here is an answer I posted on another question that shows the technique.
5

A small upgrade to aaronasterling's solution (I don't have enough reputation to comment on it):

import functools
def wrap(f):
    @functools.wraps(f)
    def wrapper(self, *arg, **kw):
        f.func_globals['self'] = self
        return f(*arg, **kw)
    return wrapper

But both of these solutions will behave unpredictably if f is called recursively for a different instance, so you have to clone it like this:

import types
class wrap(object):
    def __init__(self, func):
        self.func = func
    def __get__(self, obj, type):
        new_globals = self.func.func_globals.copy()
        new_globals['self'] = obj
        return types.FunctionType(self.func.func_code, new_globals)
class C(object):
    def __init__(self, word):
        self.greeting = word
    @wrap
    def greet(name):
        print(self.greeting + ' , ' + name + '!')
C('Hello').greet('kindall')
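
As a rough check that the cloning approach really isolates instances, here is a sketch of the same descriptor in Python 3 spelling (an assumption: `__globals__` and `__code__` replace `func_globals` and `func_code`). The `friend` attribute and the tuple return value are hypothetical additions, there only to exercise a nested call on a second instance.

```python
import types

class wrap:
    """Descriptor that rebuilds the function on each access,
    with 'self' placed in a private copy of its globals."""
    def __init__(self, func):
        self.func = func

    def __get__(self, obj, objtype=None):
        new_globals = self.func.__globals__.copy()
        new_globals['self'] = obj
        return types.FunctionType(self.func.__code__, new_globals,
                                  self.func.__name__,
                                  self.func.__defaults__)

class C:
    def __init__(self, word, friend=None):
        self.greeting = word
        self.friend = friend

    @wrap
    def greet(name):
        first = self.greeting
        # Call the same method on another instance in the middle of this one.
        nested = self.friend.greet(name) if self.friend else None
        # 'self' still refers to the original instance afterwards.
        return first, nested, self.greeting

result = C('Hello', C('Hi')).greet('kindall')
print(result)  # ('Hello', ('Hi', None, 'Hi'), 'Hello')
```

The trade-off is that the globals dict is copied on every attribute access, and anything the method writes with a `global` statement lands in the copy rather than in the module.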
answered Aug 11, 2010 at 1:28

2 Comments

Nice embellishment, thanks. Too bad you can only mark one best answer.
I think both this version and @aaronasterling's original could have problems if a similarly wrapped method of another class instance from the same module was ever called from the current one -- because the global self binding would be changed and not restored before it returns.
3

The trick is to add 'self' to f.func_globals. This works in Python 2.6. I really should get around to installing other versions to test stuff like this on. Sorry for the wall of code, but I cover two cases: doing it with a metaclass and doing it with a decorator. For your use case, I think the metaclass is better, since the whole point of this exercise is to shield users from syntax.

import functools
class TestMeta(type):
    def __new__(meta, classname, bases, classdict):
        for item in classdict:
            if hasattr(classdict[item], '__call__'):
                classdict[item] = wrap(classdict[item])
        return type.__new__(meta, classname, bases, classdict)
def wrap(f):
    @functools.wraps(f)
    def wrapper(self):
        f.func_globals['self'] = self
        return f()
    return wrapper
def testdec(f):
    @functools.wraps(f)
    def wrapper():
        return f()
    return wrapper
class Test(object):
    __metaclass__ = TestMeta
    message = 'You can do anything in python'
    def test():
        print self.message
    @testdec
    def test2():
        print self.message + ' but the wrapper function can\'t take a self argument either or you get a TypeError'
class Test2(object):
    message = 'It also works as a decorator but (to me at least) feels better as a metaclass'
    @wrap
    def test():
        print self.message
t = Test()
t2 = Test2()
t.test()
t.test2()
t2.test()
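
One caveat of writing into func_globals, raised in the comments on this page, is that the binding is shared by the whole module and is not restored when a call returns. The sketch below tries to reproduce that in Python 3 spelling (an assumption: `__globals__` replaces `func_globals`); the `Node` class is hypothetical and exists only to trigger a nested call on a second instance.

```python
import functools

def wrap(f):
    @functools.wraps(f)
    def wrapper(self, *args, **kw):
        # Writes 'self' into the module's real globals; never restored.
        f.__globals__['self'] = self
        return f(*args, **kw)
    return wrapper

class Node:
    def __init__(self, name, child=None):
        self.name = name
        self.child = child

    @wrap
    def describe():
        before = self.name
        if self.child is not None:
            # The nested call rebinds the global 'self' to the child...
            self.child.describe()
        # ...so this lookup no longer sees the original instance.
        after = self.name
        return before, after

result = Node('parent', Node('child')).describe()
print(result)  # ('parent', 'child')
```

With only one instance alive at a time, as in the asker's use case, this never shows up; it matters once wrapped methods call each other across instances.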
answered Aug 11, 2010 at 0:32

5 Comments

Thanks, that looks like pretty much exactly what I needed to know!
What about the fact that this seems to ignore that func_globals is a Read-only Callable Type Special Attribute according to the docs? Or does that mean the attribute itself is read-only but not the contents of what it refers to?
@martineau, this was an early python project for me and I was a rank amateur. I've actually been meaning to revisit it with a proper bytecode hack. As per the question, the attribute itself is readonly but you can mess with it pretty freely. The main problem is that it refers to the actual global environment of the module that the function was defined in and is not sequestered as I thought it was at the time that I did this. I am going to get around to doing this up proper pretty soon.
@aaronasterling: A bytecode hack might not be necessary -- see the alternative I just added.
@martineau, added my new solution.
2

This might be a use case for decorators - you give them a small set of lego bricks to build functions with, and the complicated framework stuff is piped in via @testcase or somesuch.

Edit: You didn't post any code, so this is going to be sketchy, but they don't need to write methods. They can write ordinary functions without "self", and you could use decorators like in this example from the article I linked:

class myDecorator(object):
    def __init__(self, f):
        print "inside myDecorator.__init__()"
        f() # Prove that function definition has completed
    def __call__(self):
        print "inside myDecorator.__call__()"
@myDecorator
def aFunction():
    print "inside aFunction()"
answered Aug 10, 2010 at 22:39

6 Comments

Yes, I already do use decorators to mark methods as test cases, because I need to know 1) which methods to run and 2) what order to run them in. The bulk of the framework is working; I just want to get rid of that pesky "self."
Edited, gave example how to use decorators without the need to use self in the function.
but self needs to refer to something within the defined functions. this doesn't accomplish that
That works great for eliminating the self in the method signature, and I came up with something similar (using a function-style decorator, though) but of course there is no access to self inside the function. What I'd like is to still be able to access the instance in the method without passing it in explicitly. I will edit the question to make this clearer.
I fully see that this doesn't solve the entire problem, but without seeing how the OP constructed the framework, and what exactly the users are supposed to supply, it's hard to be more specific. Maybe looking into @contextmanager and something along the lines of this: code.activestate.com/recipes/534150 would be a good idea, too -- again a different approach.
