Tuesday, June 6, 2017

Python Test Driven Development Basics with PyTest

Introduction

Not long ago I was chatting with someone about some code he was working on that did something some might consider 'obscure', and how to be sure it worked correctly. You can of course put in print statements to trace what is going on, then remove them later, but this also sounds like a case for writing a test that confirms the behavior, then coding up the function and making adjustments until the test passes. This, roughly, is what is called test-driven development (TDD).

When we chatted about how to do that, he said he had trouble working with the Python unittest module, and I pointed out I don't much care for it either. One reason is that it forces you to use classes even when it may not feel all that natural to write a class for a particular problem. I wrote up some notes for him on using PyTest, and then decided to modify those for somewhat wider sharing.

So here's a vaguely practical example of applying the pattern that led to our discussion: how to effectively run a test function several different ways, rather than writing a separate function for each permutation of your test.

First, and immediately violating the principles of TDD which say to write the test first, let's write the function to test. Our candidate function tries to return the reverse of its argument. To keep it simple, we will assume the argument is a sequence that supports slicing, so we can use fancy list slicing (thus we can't reverse a dictionary this way, since a dictionary is not a sequence and can't be sliced). To show it's working there is also code to try it out if it is called as a program (as opposed to being imported as a module). This is the way people used to write tests for a Python module, before test harnesses became widely used: just code up a few checks and stuff them in the "main" part.

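def slicerev(collection):
    # A slice with step -1 walks the sequence backwards and returns a new object.
    return collection[::-1]

if __name__ == "__main__":
    # Quick spot checks, run only when the file is executed as a program.
    print(slicerev([1,2,3,4]))
    print(slicerev((1,2,3,4)))
    print(slicerev('abcd'))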

If we run that, we see that all of list, tuple and string did indeed get reversed as we expected:

[4, 3, 2, 1]
(4, 3, 2, 1)
dcba
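
As a side note, here is a small sketch of what happens when the argument does not support slicing. The helper try_reverse is hypothetical, made up just for this illustration; it reports the name of the exception instead of letting it propagate. Which exception a dictionary raises depends on the Python version, so the sketch catches both candidates.

```python
def slicerev(collection):
    return collection[::-1]

def try_reverse(value):
    """Return the reversed value, or the name of the exception raised."""
    try:
        return slicerev(value)
    except (TypeError, KeyError) as exc:
        return type(exc).__name__

print(try_reverse([1, 2, 3]))   # a list supports slicing
print(try_reverse({1, 2, 3}))   # a set is not subscriptable at all
print(try_reverse({"a": 1}))    # a dict treats the slice as a lookup key
```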

Using PyTest

Let's take this basic test code and turn it into a separate test using PyTest. By convention, unit tests for a particular function often go in a file named test_{funcname}.py. If it's named this way pytest can find it automatically - running py.test without arguments lets it hunt for files whose names match test_*.py or *_test.py. This naming is not mandatory: you can give the name of the test file as an argument, or use other methods to describe exactly where the tests should be picked up from.

The code can be really simple since this is a contrived example - we're not really systematically "unit testing", we're spot checking. All we have to do is import the function we are going to test (even this is not needed if the test is in the same file as the code being tested, as opposed to a separate file), and then write out our three test cases, which do nothing but call the function with a known argument, then compare the return value with what we expect the result to be.

from reverser import slicerev

def test_slicerev_list():
    output = slicerev([1,2,3,4])
    assert output == [4,3,2,1]

def test_slicerev_tuple():
    output = slicerev((1,2,3,4))
    assert output == (4,3,2,1)

def test_slicerev_string():
    output = slicerev('abcd')
    assert output == 'edcba'

That's really all there is to it.

To make things a little more interesting, I have introduced an error in the test itself: the function checking the reversed string claims it expects 'edcba' instead of 'dcba'. This is done to show what it looks like when PyTest reports a failure.

Let's run it:

$ py.test test_slicerev.py
============================= test session starts ==============================
platform linux2 -- Python 2.7.13, pytest-2.9.2, py-1.4.33, pluggy-0.3.1
rootdir: /home/mats/PyBlog/pytester.d, inifile:
collected 3 items
test_slicerev.py ..F
=================================== FAILURES ===================================
_____________________________ test_slicerev_string _____________________________
    def test_slicerev_string():
        output = slicerev('abcd')
>       assert output == 'edcba'
E       assert 'dcba' == 'edcba'
E         - dcba
E         + edcba
E         ? +
test_slicerev.py:13: AssertionError
====================== 1 failed, 2 passed in 0.01 seconds ======================

A run with that problem fixed is considerably quieter:

$ py.test test_slicerev.py
============================= test session starts ==============================
platform linux2 -- Python 2.7.13, pytest-2.9.2, py-1.4.33, pluggy-0.3.1
rootdir: /home/mats/PyBlog/pytester.d, inifile:
collected 3 items
test_slicerev.py ...
=========================== 3 passed in 0.00 seconds ==========================

PyTest Fixtures

If you think about this for a bit, you notice that the same code is run three times; only the data in the three test functions differs. As mentioned above, this is a very common situation in testing, where you want to try different cases to see how a unit behaves - test the boundary conditions, test invalid data or data types, etc.
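
To make the repetition concrete, here is a sketch of a few more spot checks in the same one-function-per-case style, covering some boundary conditions and an invalid data type. These particular cases are my own picks for illustration, and slicerev is repeated so the sketch stands alone. Note that pytest also provides pytest.raises for the exception check; plain try/except is used here only so the file needs no import.

```python
def slicerev(collection):
    # Same function as before, repeated so the sketch is self-contained.
    return collection[::-1]

def test_slicerev_empty():
    assert slicerev([]) == []

def test_slicerev_single():
    assert slicerev('a') == 'a'

def test_slicerev_does_not_modify_input():
    original = [1, 2, 3]
    slicerev(original)
    assert original == [1, 2, 3]

def test_slicerev_rejects_int():
    # An int cannot be sliced, so a TypeError is expected.
    try:
        slicerev(42)
    except TypeError:
        return
    raise AssertionError("expected TypeError")
```

Every function has the same shape: call, then compare. Only the data changes.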

PyTest provides a mechanism called a "fixture" - a fixed baseline that can be executed repeatedly, which helps with this situation.

In the first iteration of our tests, we did not need to import "pytest" for it to work when the test is run by PyTest - PyTest wraps the code and the code itself never uses anything from PyTest. However, in our second iteration, we do want something from the pytest namespace - the definition of the decorator we need to turn something into a PyTest fixture - so the import is needed.

Since what we're factoring here is supplying different sets of data, the fixture function 'slicedata' itself is extremely simple: all it does is return the data. The test function has the same two functional statements that each of the test functions had before - call the function under test, then use an assertion to check the result was as expected. In addition to that, the test function takes the fixture function as an argument, which would not make much sense by itself, but once it is turned into a fixture it does.

We use a decorator to turn 'slicedata' into a fixture - remember Python decorators are a piece of special syntax that helps alter the behavior of a function. The PyTest fixture decorator can take a "params" parameter, which should be something that can be iterated over; the fixture function then receives the entries one at a time, through the param attribute of its request argument. In this case we are going to pass a list of tuples, the first element of each tuple being the data we are going to apply to the test, the second element being the expected value.

We now know the other change we need to make to the test function: the "fixture object" returned by the fixture will be a tuple, so we should unpack the tuple into the pieces we want.
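
Before looking at the real thing, it may help to picture what the "params" mechanism amounts to. The following is only a conceptual sketch in plain Python, not how pytest is actually implemented; the names PARAMS and check_slicerev are made up for the illustration. The idea is simply that the test body is run once per entry and each run is reported separately.

```python
def slicerev(collection):
    return collection[::-1]

# Hypothetical stand-in for the fixture's params list.
PARAMS = [
    ([1, 2, 3, 4], [4, 3, 2, 1]),
    ((1, 2, 3, 4), (4, 3, 2, 1)),
    ('abcd', 'dcba'),
]

def check_slicerev(pair):
    # Unpack the tuple, exactly as the real test function will have to.
    data, expected = pair
    assert slicerev(data) == expected

results = []
for pair in PARAMS:
    try:
        check_slicerev(pair)
        results.append('passed')
    except AssertionError:
        results.append('failed')

print(results)
```

PyTest takes care of the looping, collecting and reporting itself, so none of that loop shows up in the actual test file.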

The new code looks like this:

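import pytest
from reverser import slicerev

# Each params entry is an (input data, expected result) pair;
# pytest runs every test that uses the fixture once per entry.
@pytest.fixture(params=[
    ([1,2,3,4], [4,3,2,1]),
    ((1,2,3,4), (4,3,2,1)),
    ('abcd', 'dcba')
    ])
def slicedata(request):
    # request.param holds the params entry for the current run.
    return request.param

def test_slicerev(slicedata):
    # Named "data" rather than "input" to avoid shadowing the builtin.
    data, expected = slicedata
    output = slicerev(data)
    assert output == expected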

Run these tests and we'll see the results are the same as before:

$ py.test test_slicerev_fix.py
============================= test session starts ==============================
platform linux2 -- Python 2.7.13, pytest-2.9.2, py-1.4.33, pluggy-0.3.1
rootdir: /home/mats/PyBlog/pytester.d, inifile:
collected 3 items
test_slicerev_fix.py ...
=========================== 3 passed in 0.00 seconds ===========================
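
For data-only cases like this one, pytest also offers the pytest.mark.parametrize decorator, which attaches the data directly to the test function and skips the separate fixture. This is an alternative to the approach above, not something the notes I wrote originally used, so take it as a sketch; slicerev is defined inline here only to keep the example self-contained.

```python
import pytest

def slicerev(collection):
    return collection[::-1]

# The first argument names the test function's parameters; each tuple in
# the list supplies one set of values, and the test runs once per tuple.
@pytest.mark.parametrize("data, expected", [
    ([1, 2, 3, 4], [4, 3, 2, 1]),
    ((1, 2, 3, 4), (4, 3, 2, 1)),
    ('abcd', 'dcba'),
])
def test_slicerev(data, expected):
    assert slicerev(data) == expected
```

A fixture is the better fit when several tests share the same data or when setup work is involved; parametrize keeps things shorter when one test just needs a table of inputs.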

Posted by Unknown at 12:12 PM

Properties in Python

Encapsulation is an important concept in object-oriented programming. Encapsulation just means that some information is not available directly from outside the encapsulating object, only through specially provided accessor methods. This sounds like a pretty good idea at first glance: the implementation of something is hidden, and is allowed to change without breaking things, because people using this code aren’t able to poke around inside and count on details they’ve discovered; they have to use the accessor methods.

What is a property?

Here’s a contrived example, a fragmentary class (in C++, perhaps) describing a person, where we’ve only shown one piece of data, the age of the person:

class Person
{
    int age;
public:
    Person() : age(0) { }
    int getAge() { return this->age; }
    void setAge(int x) { this->age = x; }
};

Here the instance variable age is private, but we’ve also provided a pair of public methods which give access to the value of age: getAge to read it and setAge to set it. Now if something a lot more complicated needs to happen later to age, the getter/setter methods can be updated without affecting clients using this code. And some evolution is not actually an unreasonable expectation - the age of a person isn’t a good candidate for data that looks static, since every year on their birthday it changes. So perhaps someday we’ll improve the class by calculating the age from the current date and the person’s birth date only when it is needed?

A lot of developers don’t like this kind of code. It does accomplish a useful function if you need it, but you can’t necessarily guess when you’re going to need it. So when you’re writing in a language like Java, and to a slightly lesser extent C++, you have to set up the getter/setter methods up front for all data members that are to look public, in anticipation that you might need the accessors later. Since the interface has a binary (ABI) nature, you can’t change from a public instance variable to a private one with a getter/setter without breaking programs that depend on the older class signature. This bloats the code in anticipation of something that might not actually be needed, and no new value has been introduced by all those extra lines. Ah, but no problem, right? My IDE just auto-built those for me…

There’s an aesthetic concern regarding the syntax, also. A Person instance includes that person’s age, but we can’t perform natural operations on that age - if person is an instance, we can’t access person.age or set it, we have to use person.getAge() and person.setAge().

The C# language improves on this by providing properties. C# defines properties thus:

A property is a member that provides a flexible mechanism to read, write, or compute the value of a private field. Properties can be used as if they are public data members, but they are actually special methods called accessors. This enables data to be accessed easily and still helps promote the safety and flexibility of methods.

A simple example looks like this:

class Person
{
    private int age = 0;
    public int Age
    {
        get { return this.age; }
        set { this.age = value; }
    }
}

So if person is an instance of Person, person.Age (but not person.age) can be accessed externally as if it were a variable. That leads to the ability to write the much nicer

person.Age += 1

instead of

person.setAge(person.getAge() + 1)

Properties in Python

Python has properties too, but there is another benefit in Python: as a dynamic language, it does not have this limitation of static languages. You can change the implementation without causing problems for clients, because you’re not dealing with a compiled interface. This means you can define an instance variable first, then evolve it to a property later if needed, and it will not break clients.
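
To tie this back to the earlier Person example, here is a rough sketch of such an evolution. The class names PersonV1 and PersonV2 and the helper method age_on are hypothetical, invented for the illustration, and the sketch uses the decorator syntax that is explained in the sections that follow. The point is only that client code reads person.age in both versions.

```python
from datetime import date

class PersonV1(object):
    """First version: age is a plain instance variable."""
    def __init__(self, age):
        self.age = age

class PersonV2(object):
    """Later version: age is computed, but still read as person.age."""
    def __init__(self, birth_date):
        self.birth_date = birth_date

    def age_on(self, day):
        # Subtract one if the birthday has not yet come around in that year.
        birthday = (self.birth_date.month, self.birth_date.day)
        before_birthday = (day.month, day.day) < birthday
        return day.year - self.birth_date.year - int(before_birthday)

    @property
    def age(self):
        return self.age_on(date.today())

old = PersonV1(30)
new = PersonV2(date(1990, 6, 6))
print(old.age, new.age)   # same attribute syntax for both versions
```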

Here is a series of examples showing how properties work in Python.

Consider a Vector class that should be able to provide both an angle in radians and an angle in degrees. This provides an excuse to use a getter method - we don’t actually need to store both angles in the instance, and indeed we don’t really want to, because they’re related, and if someone updates one angle, we have a problem because the other one needs to change in sync with it. It’s nicer to store one, and generate the other one on demand - that solves the sync problem.

Using the `property` function

Python provides the built-in property() function, which sets up a property given arguments that describe the methods implementing the property behavior. The arguments are, in order, the getter, setter, deleter, and docstring; they’re successively optional, so if you pass only one argument to property, only a getter is assigned.

import math

class Vector(object):
    def __init__(self, angle_rad):
        self.set_angle_rad(angle_rad)

    # Only the angle in degrees is stored; radians are converted on demand.
    def get_angle_rad(self):
        return math.radians(self._angle_deg)

    def set_angle_rad(self, angle_rad):
        self._angle_deg = math.degrees(angle_rad)

    angle = property(get_angle_rad, set_angle_rad)

    def get_angle_deg(self):
        return self._angle_deg

    def set_angle_deg(self, angle_deg):
        self._angle_deg = angle_deg

    angle_deg = property(get_angle_deg, set_angle_deg)

We can do some experiments with this class - in the first set of lines below we create an instance with a starting value and print both angles, then change first the angle and then the angle_deg value to show they’re working in unison. In the final chunk, we ask for some information about the objects in question to illustrate how Python has set this up.

v = Vector(2*math.pi)
print("Rad: {}, Deg: {}".format(v.angle, v.angle_deg))
v.angle = math.pi
print("Rad: {}, Deg: {}".format(v.angle, v.angle_deg))
v.angle_deg = 90.0
print("Rad: {}, Deg: {}".format(v.angle, v.angle_deg))
print(Vector.angle, Vector.angle.getter, Vector.angle.setter)
print(Vector.angle_deg, Vector.angle_deg.getter, Vector.angle_deg.setter)

Here’s the output of one run:

Rad: 6.283185307179586, Deg: 360.0
Rad: 3.141592653589793, Deg: 180.0
Rad: 1.5707963267948966, Deg: 90.0
<property object at 0x7fab853b5f48> <built-in method getter of property object at 0x7fab853b5f48> <built-in method setter of property object at 0x7fab853b5f48>
<property object at 0x7fab7d3d9818> <built-in method getter of property object at 0x7fab7d3d9818> <built-in method setter of property object at 0x7fab7d3d9818>
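
The "successively optional" arguments can be seen in a small sketch: when only a getter is passed, the result is effectively a read-only property. The Circle class here is a made-up example, not part of the Vector code above, and the docstring is passed by keyword so the setter and deleter can be left out.

```python
import math

class Circle(object):
    def __init__(self, radius):
        self.radius = radius

    def get_area(self):
        return math.pi * self.radius ** 2

    # Only a getter and a docstring are supplied; no setter or deleter.
    area = property(get_area, doc="Area of the circle (read-only).")

c = Circle(1.0)
print(c.area)
try:
    c.area = 10.0
except AttributeError as exc:
    # Assigning to a property without a setter raises AttributeError.
    print("cannot set area:", exc)
```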

Using the property decorators

Python provides decorators that have the same effect as the call to the property function. @property is used for the getter, @x.setter for the setter and @x.deleter for the deleter method, which would be the third argument to the property function if included (replace x with the name of the property).

import math

class Vector(object):
    def __init__(self, value):
        self.angle = value

    @property
    def angle(self):
        return math.radians(self._angle_deg)

    @angle.setter
    def angle(self, value):
        self._angle_deg = math.degrees(value)

    @property
    def angle_deg(self):
        return self._angle_deg

    @angle_deg.setter
    def angle_deg(self, value):
        self._angle_deg = value

v = Vector(2*math.pi)
print("Rad: {}, Deg: {}".format(v.angle, v.angle_deg))
v.angle = math.pi
print("Rad: {}, Deg: {}".format(v.angle, v.angle_deg))
v.angle_deg = 90.0
print("Rad: {}, Deg: {}".format(v.angle, v.angle_deg))
print(Vector.angle, Vector.angle.getter, Vector.angle.setter)
print(Vector.angle_deg, Vector.angle_deg.getter, Vector.angle_deg.setter)

And the output of our experiments:

Rad: 6.283185307179586, Deg: 360.0
Rad: 3.141592653589793, Deg: 180.0
Rad: 1.5707963267948966, Deg: 90.0
<property object at 0x7f7ba29b5818> <built-in method getter of property object at 0x7f7ba29b5818> <built-in method setter of property object at 0x7f7ba29b5818>
<property object at 0x7f7ba29b5868> <built-in method getter of property object at 0x7f7ba29b5868> <built-in method setter of property object at 0x7f7ba29b5868>

By decorating the angle and angle_deg method pairs, we’ve turned them into properties with getter/setter methods, just like the call to the property function did. But this looks cleaner: you can immediately see what each method is for, rather than going hunting to see whether it later becomes part of a property call. Notice that the method names have to be the same for all the parts of the property; for the setter and deleter, the decorator name is also prefixed with that same name.
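
The Vector example never uses a deleter, so here is a small sketch of all three decorators together. The Reading class and its reset-on-delete behavior are invented for the illustration; what a deleter should actually do depends on the class.

```python
class Reading(object):
    def __init__(self, value):
        self.value = value      # goes through the setter

    @property
    def value(self):
        return self._value

    @value.setter
    def value(self, new_value):
        self._value = new_value

    @value.deleter
    def value(self):
        # Illustrative choice: "deleting" resets the reading to None
        # rather than removing the backing attribute.
        self._value = None

r = Reading(3)
del r.value         # invokes the deleter method
print(r.value)
```

All three methods share the name value, and the setter and deleter decorators are spelled @value.setter and @value.deleter to match.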

Code Simplification

I don’t particularly like this code, though. We are using a sort of hidden instance variable as the backing field which holds the value, and we’ve served up getter/setter pairs for both public variables. Except there is really no hidden data in Python - starting a name with an underscore is a visual hint that we don’t intend something to be public, but that is all it is, a hint (about the only place a leading single underscore "matters" is in imports, where from module import * skips such names by default). That means someone could actually fiddle directly with the backing field _angle_deg, bypassing the getter/setter, if they were so motivated. In the trivial example here, that doesn’t introduce any new problems, but with a setter which does a bunch of validation so you know an invalid value is never stored, it is not ideal. And in fact, that the setter for angle_deg does not do anything special is my other complaint: why implement a getter/setter when there is no need to?
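
Here is a sketch of the validation concern. The class CheckedVector and its range rule are hypothetical, chosen only to have something to validate; they are not part of the Vector class in this article.

```python
class CheckedVector(object):
    def __init__(self, angle_deg):
        self.angle_deg = angle_deg   # goes through the setter below

    @property
    def angle_deg(self):
        return self._angle_deg

    @angle_deg.setter
    def angle_deg(self, value):
        # Hypothetical rule for this sketch: accept only 0 <= value < 360.
        if not 0 <= value < 360:
            raise ValueError("angle_deg must be in [0, 360)")
        self._angle_deg = value

v = CheckedVector(90.0)
try:
    v.angle_deg = 720.0
except ValueError as exc:
    print("rejected:", exc)

v._angle_deg = 720.0     # nothing stops direct access to the backing field
print(v.angle_deg)       # the "impossible" value is now stored
```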

So why not undo the angle_deg property definition, which does not seem needed, and just make angle_deg an instance variable? Then we don’t need _angle_deg at all. If we find we need to do something "special" with angle_deg later we can always turn it back into a property. Notice that in the initializer we are invoking the property setter, because we assign to angle. As a next refactor, I would probably turn this around and use the radians form as the instance variable to make it all feel more natural. This is the Python flexibility I was referring to at the beginning of this article. Here’s the refactored code, which is now quite a bit shorter:

import math

class Vector(object):
    def __init__(self, value):
        self.angle = value

    @property
    def angle(self):
        return math.radians(self.angle_deg)

    @angle.setter
    def angle(self, value):
        self.angle_deg = math.degrees(value)

v = Vector(2 * math.pi)
print("Rad: {}, Deg: {}".format(v.angle, v.angle_deg))
v.angle = math.pi
print("Rad: {}, Deg: {}".format(v.angle, v.angle_deg))
v.angle_deg = 90.0
print("Rad: {}, Deg: {}".format(v.angle, v.angle_deg))

This works just the same, as we see from the output:

Rad: 6.283185307179586, Deg: 360.0
Rad: 3.141592653589793, Deg: 180.0
Rad: 1.5707963267948966, Deg: 90.0
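
For completeness, the "next refactor" mentioned above might look something like the following. This is only a sketch of the idea: angle becomes the plain instance variable (in radians) and angle_deg becomes the property.

```python
import math

class Vector(object):
    def __init__(self, value):
        self.angle = value      # plain instance variable, in radians

    @property
    def angle_deg(self):
        return math.degrees(self.angle)

    @angle_deg.setter
    def angle_deg(self, value):
        self.angle = math.radians(value)

v = Vector(2 * math.pi)
print("Rad: {}, Deg: {}".format(v.angle, v.angle_deg))
v.angle_deg = 90.0
print("Rad: {}, Deg: {}".format(v.angle, v.angle_deg))
```

Client code is unchanged: v.angle and v.angle_deg are read and assigned exactly as before.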

Posted by Unknown at 12:04 PM

Write Better Python

Tuesday, June 6, 2017