another interesting TDD episode
The classic TDD cycle says that you should start with a test for new functionality and see it fail. There is real value in not skipping this step; not jumping straight to writing code to try to make it pass.
- One reason is improving the diagnostic. Without care and attention diagnostics are unlikely to diagnose much.
- A second reason is to be sure the test is actually running! Suppose, for example, you're using JUnit and you forget its @Test annotation? Or the public specifier? (The sketch after this list shows the same failure mode in C.)
- A third reason is because sometimes, as we saw last time, you get an unexpected green! Here's another nice example of exactly this which happened to me during a cyber-dojo demo today.
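To make that second reason concrete in C terms, here is a minimal sketch of my own (the runner and the Buzz test name are hypothetical - the post doesn't show cyber-dojo's actual harness). With hand-rolled test functions, a test only runs if something remembers to call it:

#include <assert.h>

static void numbers_divisible_by_three_are_Fizz(void)
{
    assert(1 + 1 == 2); /* stand-in body for the example */
}

static void numbers_divisible_by_five_are_Buzz(void)
{
    assert(0); /* the new test: written to fail */
}

int main(void)
{
    numbers_divisible_by_three_are_Fizz();
    /* ...but I forgot to add the call to
       numbers_divisible_by_five_are_Buzz(), so it can never
       fail - it never runs. (A -Werror build like the one in
       the post below would at least flag the unused static
       function.) */
    return 0;
}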
I started by writing my first test, like this:
static void assert_fizz_buzz(const char * expected, int n)
{
char actual[16];
fizz_buzz(actual, sizeof actual, n);
if (strcmp(expected, actual) != 0)
{
printf("fizz_buzz(%d)\n", n);
printf("expected: \"%s\"\n", expected);
printf(" actual: \"%s\"\n", actual);
assert(false);
}
}
static void numbers_divisible_by_three_are_Fizz(void)
{
assert_fizz_buzz("Fizz", 3);
}
I made this fail by writing the initial code as follows (the
(void)n is to momentarily avoid the
"n is unused" warning which my makefile promotes to an error
using the -Werror option):
void fizz_buzz(char * result, size_t size, int n)
{
(void)n;
strncpy(result, "Hello", size);
}
which gave me the diagnostic:
...: assert_fizz_buzz: Assertion `0' failed. fizz_buzz(3) expected: "Fizz" actual: "Hello"
I made this pass with the following slime
void fizz_buzz(char * result, size_t size, int n)
{
if (n == 3)
strncpy(result, "Fizz", size);
}
Next, I returned to the test and added a test for 6:
static void numbers_divisible_by_three_are_Fizz(void)
{
assert_fizz_buzz("Fizz", 3);
assert_fizz_buzz("Fizz", 6);
}
I ran the test, fully expecting it to fail, but it passed!
Can you see the problem?
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
The problem is in
assert_fizz_buzz which starts like this:
static void assert_fizz_buzz(const char * expected, int n)
{
char actual[16];
...
}
Here's what's happening:
assert_fizz_buzz("Fizz", 3)is calledchar actual[16]is definedfizz_buzz(actual, sizeof actual, 3)is calledif (n == 3)istrue"Fizz"isstrncpy'd intoactualfizz_buzz(actual, sizeof actual, 3)returnsstrcmpsays thatexpectedequalsactual- ...
assert_fizz_buzz("Fizz", 6)is calledchar actual[16]is definedactualexactly overlays its previous location so its first 5 bytes are still'F','i','z','z','0円'fizz_buzz(actual, sizeof actual, 6)is calledif (n == 3)isfalsefizz_buzz(actual, sizeof actual, 6)returnsstrcmpsays thatexpectedequalsactual
My mistake was in the test; actual has automatic storage duration so does not get initialized. Its initial value is indeterminate.
The first call to assert_fizz_buzz is accidentally interfering
with the second call.
Tests should be isolated from each other.
I tweaked the test as follows:
static void assert_fizz_buzz(const char * expected, int n)
{
char actual[16] = { '\0' };
...
}
I ran the test again and this time it failed :-)
...: assert_fizz_buzz: Assertion `0' failed. fizz_buzz(6) expected: "Fizz" actual: ""
I made the test pass:
void fizz_buzz(char * result, size_t size, int n)
{
if (n % 3 == 0)
strncpy(result, "Fizz", size);
}
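An aside, not from the demo itself: instead of zero-filling, a variation I sometimes use is to pre-fill the buffer with a sentinel string, so a fizz_buzz that writes nothing shows up in the diagnostic as something more telling than an empty string (same includes and context as the helper above):

static void assert_fizz_buzz(const char * expected, int n)
{
    char actual[16];
    strcpy(actual, "unwritten"); /* sentinel: survives only if fizz_buzz writes nothing */
    fizz_buzz(actual, sizeof actual, n);
    if (strcmp(expected, actual) != 0)
    {
        printf("fizz_buzz(%d)\n", n);
        printf("expected: \"%s\"\n", expected);
        printf(" actual: \"%s\"\n", actual);
        assert(false);
    }
}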
Let's hear it for starting with a test for new functionality and seeing it fail.
lessons from testing
I have run hundreds of test-driven coding dojos using cyber-dojo.
I see the same test anti-patterns time after time after time.
Do some of your tests exhibit the same anti-patterns?
what should all tests passing look like?
I was fixing a fault in cyber-dojo the other day
and I happened to be looking at the starting test code for C++.
I'd written it using an array of function pointers.
At the end of a practice the array might look like this (in main I simply iterate through the array and call each of the functions):
typedef void test();
static test * tests[ ] =
{
a_recently_used_list_is_initially_empty,
the_most_recently_added_item_is_always_first,
items_can_be_looked_up_by_index_which_counts_from_zero,
items_in_the_list_are_unique,
};
I imagined it written like this instead:
int main()
{
a_recently_used_list_is_initially_empty();
the_most_recently_added_item_is_always_first();
items_can_be_looked_up_by_index_which_counts_from_zero();
items_in_the_list_are_unique();
}
I wondered why I was putting the function pointers into an array at all.
The most obvious thing I lose by not using an array is that I can no longer print out the number of tests that have passed.
I thought about that a bit.
I started to wonder what benefits I actually get by being told how many tests had run.
It's a 100% quantity and 0% quality measurement.
The most obvious benefit I can think of is that, after writing a new test, I can verify it has run by seeing the number of tests increase by one. The problem is I don't think that's a benefit at all. I think that interaction could easily encourage me to not start with a failing test.
If all tests passing produces any output I will look at the output and start to rely on it. That doesn't feel right for a supposedly automated test.
If each passing test introduces any new output, even if it's just a single dot, then I'm in danger of falling into a composition trap. Those dots don't make a difference individually, but sooner or later they will collectively.
If the code I'm testing doesn't produce any output when it runs (which is very likely for unit-tests) then is it right for my tests to introduce any output?
If I want to time how long the tests take to run shouldn't that be part of the script that runs the tests?
I think there is something to be said for all tests passing producing no output at all. None. Red means a test failed and the output should be as helpful as possible in identifying the failure. No output means green. Green means no output.
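For what it's worth, here is a minimal sketch of what silent-green could look like in the same C/C++ style. This is mine, not cyber-dojo's actual starting code: failures print and exit red, passes print nothing.

#include <stdio.h>
#include <stdlib.h>

static void require(int passed, const char * test_name)
{
    if (!passed)
    {
        /* red: say as much as possible */
        fprintf(stderr, "FAILED: %s\n", test_name);
        exit(EXIT_FAILURE);
    }
    /* green: say nothing */
}

static void a_recently_used_list_is_initially_empty(void)
{
    require(1, "a_recently_used_list_is_initially_empty");
}

int main(void)
{
    a_recently_used_list_is_initially_empty();
    /* ...the other tests, one call each... */
    return 0; /* no output at all means all tests passed */
}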
Thoughts?
testing a random die roll
What tests would I write?
Maybe I'd start by testing it only rolls 1 to 6
def test_doesnt_roll_less_than_1_or_greater_than_6
  1200.times do
    die = roll_die
    assert 1 <= die && die <= 6
  end
end

That's a good start. But it's not nearly enough. One way to think about testing is to imagine someone is deliberately writing the code so that it passes the tests but is nevertheless completely incorrect. Take, for example, sorting an array of N integers. What tests would you write? Now suppose I tell you the incorrect implementation simply returns an array of N 42's. Are you testing the output array is a permutation of the input array?
Back to rolling the die... an implementation of
roll_die() which always returned 3 would pass my first test. I need to test all of 1-6 get returned.
def test_rolls_1_to_6_at_least_once_each_in_600_rolls
  rolls = [-1,0,0,0,0,0,0]
  600.times do
    die = roll_die
    rolls[die] += 1
  end
  assert rolls[1] >= 1
  assert rolls[2] >= 1
  assert rolls[3] >= 1
  assert rolls[4] >= 1
  assert rolls[5] >= 1
  assert rolls[6] >= 1
end

Why 600 rolls? You might be thinking it's too large. That a decent
roll_die() should return at least one each of 1-6 after a lot fewer than 600 rolls. Or, to put it another way, if I have to wait till the 600th roll to get my first 5 then roll_die() is looking a bit suspect. Fair enough.
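As an aside, a back-of-envelope bound (my arithmetic, not part of the original post): with a fair die, the probability that some face is still missing after 600 rolls is at most 6 × (5/6)^600, which is around 10^-47. A false red from this test is never going to happen in practice.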
The tests form a specification. If I want to specify a "tighter" implementation of
roll_die() I can simply change 600 to, 200 say. The 200 is then part of the specification.
Once again, it's easy to imagine an incorrect implementation that passes all the tests. How about one that simply cycles repeatedly through 1-6...
$n = 1

def roll_die
  $n += 1
  $n %= 6
  $n + 1
end

I've tested the "die" part of "random die". Now for the "random" part. My incorrect implementation is not very random. It's very regular. It's very ordered. Suppose I call
roll_die() 1200 times, save the 1200 values in a file, and then compress the file. I should get a lot of compression. Let's try it…
dice = ""
1200.times { dice += roll_die.to_s }
File.open('dice.txt', 'w') { |f| f.write(dice) }
unzipped_size = File.size('dice.txt')
assert_equal 1200, unzipped_size
`zip dice.txt.zip dice.txt`
zipped_size = File.size('dice.txt.zip')
p zipped_size
On my macbook I get a value of 184. As expected, a lot of compression.
Let's compare the compression to an implementation that isn't deliberately incorrect.
def roll_die
  [1,2,3,4,5,6].shuffle[0]
end

This gives a zipped file size of around 680. A lot less compression. I can use that as part of the specification.
def test_roll_is_random_entropically
dice = ""
1200.times { dice += roll_die().to_s }
File.open('dice.txt', 'w') { |f| f.write(dice) }
unzipped_size = File.size('dice.txt')
assert_equal 1200, unzipped_size
`zip dice.txt.zip dice.txt`
zipped_size = File.size('dice.txt.zip')
assert zipped_size > 600
end
Imagine someone is deliberately writing the code so that it passes the tests but is nevertheless completely incorrect…
XP and culture change
Last week, at a client's site, I noticed an old, coffee-stained copy of the Cutter IT Journal.
It was titled "XP and culture change", dated September 2002.
Here are some quotes from it.
From Kent Beck:
Because culture embodies perception and action together, changing culture is difficult and prone to backsliding.
Is it easier to change your perception or go back to designing the old way?
From Laurent Bossavit:
A process change will always involve a cultural change.
We were also a culture of Conviviality, which you could easily mistake (as I did at first) for a culture of Communication... In Conviviality what is valued is the act of sharing information in a group setting - rather than the nature, quantity, or quality of the information thus shared.
Culture is what remains when you have forgotten everything else.
From Mary Poppendieck and Ron Moriscato:
If there were one thing that Ron's team would do differently next time, it would be to do more refactoring.
XP is a process that doesn't feel like a process.
The theory of punctuated equilibrium holds that biological species are not likely to change over a long period of time because mutations are usually swamped by the genes of the existing population. If a mutation occurs in an isolated spot away from the main population, it has a greater chance of surviving.
From Ken Schwaber:
Agile process management represents a profound shift in the development of products and software. Agile is based on an intuitive feel of what is right, springs from a completely different theoretical basis than traditional development processes, and is in sum a wholly different approach to building products in complex situations.
From Matt Simons and Chaitanya Nadkarny:
A fixed-bid contract changes the very nature of the relationship between customer and vendor from collaborative to "contentious". "Embrace change" undergoes a fatal transformation into "outlaw change."
There is no way to pretend everything is fine when you have to deliver software to your customer every few weeks.
From Nancy Van Schooenderwoert and Ron Moriscato:
The advantages of pair programming hit you hard and fast. As you explain an area of code to your partner, you get a deeper understanding of how it fits into the current architecture. You're your own peer reviewer!
After pair programming for a while, we found ourselves in a situation where the entire team had worked somewhere in the module in the recent past. Code reviews became exciting idea-exchanging periods where refactoring tasks were discussed and planned.
With schedule pressure, there is a huge temptation to put off refactoring, and we did too much of that.
It's not enough for the code to work; it also has to serve as a solid base for the next wave of features that will be added.
All through the project, a frequent cause was that unit testing wasn't thorough enough.
testing legacy C/C++ when it resists
In my travels I'm sometimes asked to help a client write some unit-tests for C/C++ code which wasn't built with unit-testing in mind and so, naturally, is resisting being unit-tested.
This is a classic chicken and egg situation;
you want to refactor the code to get the unit-tests in place, but of course that's dangerous and painful and slow until you've got at least some unit tests in place.
There are two big problems:
- Repaying the legacy debt is likely to be a long and arduous road.
There's not a lot I can do to help here except offer encouragement
and to maybe remind them of the Winston Churchill quote
if you're going through hell, keep going!
- It often seems there's no way to get started. Clients might say something like "this can't be unit tested" when of course what they really mean is "I don't know how to unit test this". Sometimes I can suggest tricks and techniques.
One way to get started is to make the problem smaller.
Suppose I have a large legacy C++ class resolutely resisting being unit-tested.
I pick a method and start with that.
For example, given this file, fubar.cpp
1|#include "fubar.hpp"
2|#include ...
3|#include ...
|...
1438|int fubar::f1() const
1439|{
| ...
1452|}
1453|
1457|void fubar::example(widget & w, int x)
1458|{
| ...
1598|}
1599|
1600|int fubar::f2()
1601|{
| ...
4561|}
4562|
I decide to start with fubar::example() which starts at line 1457 of
fubar.cpp and ends 100+ lines later:
I carefully cut lines 1457-1598 into a new file of their own, called fubar-example:
1|
2|void fubar::example(widget & w, int x)
3|{
| ...
141|}
142|
and replace the cut lines from fubar.cpp with a single #include to
the new file:
1|#include "fubar.hpp"
2|#include ...
3|#include ...
|...
1438|int fubar::f1() const
1439|{
| ...
1452|}
1453|
1457|#include "fubar-example" // <----
1458|
1459|int fubar::f2()
1460|{
    | ...
4420|}
4421|
I'm aiming to create a unit-test for fubar::example()
like this:
// here I'll dummy out everything used in fubar::example
#include "fubar-example"
// here I'll write my first unit test

However, as safe as it seems, this could cause a change in behaviour! I can easily check this. If
fubar.cpp is one of the source files that
compiles into something.lib then I can compare the 'before' and 'after'
versions of this lib file to see if they are identical. They should be.
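(A byte-for-byte comparison, for example with cmp, or comparing checksums of the two lib files, does the job.)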
One reason they might not be is because of things like the assert macro
which uses __FILE__ and __LINE__ to report the filename and line-number.
I've changed the line numbers on everything below the new #include, and the line numbers and filename inside the included file.
I can fix that using the #line directive.
In the original
fubar.cpp file example() started at line 1457 so
fubar-example becomes:
1|
2|...
3|
4|#line 1457 "fubar.cpp" // <----
5|void fubar::example(widget & w, int x)
6|{
 | ...
145|}
146|
and the next method f2() started at line 1600 so
fubar.cpp becomes:
1|#include "fubar.hpp"
2|#include ...
3|#include ...
|...
1438|int fubar::f1() const
1439|{
| ...
1452|}
1453|
1457|#include "fubar-example"
1458|
1459|#line 1600 // <----
1460|int fubar::f2()
1461|{
    | ...
4421|}
4422|
Now the before and after versions of the lib file are identical.
Now I try to compile the test file:
// here I'll dummy out everything used in fubar::example
#include "fubar-example"
// here I'll write my first unit test

It fails to compile of course, since I can't define
fubar::example() unless I've previously declared it. So I dummy it out:
class fubar
{
public:
void example(widget & w, int x); // <----
};
#include "fubar-example"
// here I'll write my first unit test
Now it fails because the compiler doesn't know what widget is.
So I forward declare it:
class widget; // <----
class fubar
{
public:
void example(widget & w, int x);
};
#include "fubar-example"
// here I'll write my first unit test
Now it fails because fubar::example() calls a method nudge(int,int) on the widget parameter:
1|
2|...
3|#line 1457 "fubar.cpp"
4|void fubar::example(widget & w, int x)
5|{
| ...
| w.nudge(10,10);
| ...
145|}
146|
So I dummy it out:
class widget
{
public:
void nudge(int,int) // <----
{ }
};
class fubar
{
public:
void example(widget & w, int x);
};
#include "fubar-example"
// here I'll write my first unit test
Now it fails because fubar::example() invokes a macro LOG:
1|
2|...
3|#line 1457 "fubar.cpp"
4|void fubar::example(widget & w, int x)
5|{
| ...
| LOG(... , ...);
| ...
145|}
146|
So I dummy it out:
#define LOG(where,what) /*nothing*/ // <----
class widget ...
class fubar
{
public:
void example(widget & w, int x);
};
#include "fubar-example"
// here I'll write my first unit test
Maybe later I can return to the dummy LOG macro and make it less dumb but for now I'm not even compiling. One thing at a time.

Now it fails because
fubar::example()
declares a local std::string:
1|
2|...
3|#line 1457 "fubar.cpp"
4|void fubar::example(widget & w, int x)
5|{
| ...
| std::string name = "...";
| ...
145|}
146|
This one I don't need to dummy out.
#include <string> // <----
#define LOG(where,what) /*nothing*/
class widget ...
class fubar
{
public:
void example(widget & w, int x);
};
#include "fubar-example"
// here I'll write my first unit test
Now it fails because fubar::example() calls a sibling method:
1|
2|...
3|#line 1457 "fubar.cpp"
4|void fubar::example(widget & w, int x)
5|{
| ...
| if (tweedle_dee(w))
| ...
145|}
146|
So I dummy it out:
#include <string>
#define LOG(where,what) /*nothing*/
class widget ...
class fubar
{
public:
void example(widget & w, int x);
bool tweedle_dee(widget &) // <----
{
return false;
}
};
#include "fubar-example"
// here I'll write my first unit test
Now it fails because fubar::example() makes a call on one of its data members:
1|
2|...
3|#line 1457 "fubar.cpp"
4|void fubar::example(widget & w, int x)
5|{
| ...
| address_->resolve(name.begin(), name.end());
| ...
145|}
146|
So I dummy it out, making no attempt to write the actual types of the parameters (a useful trick):
#include <string>
#define LOG(where,what) /*nothing*/
class widget ...
class address_type
{
public:
template<typename iterator>
void resolve(iterator, iterator) // <----
{ }
};
class fubar
{
public:
void example(widget & w, int x);
bool tweedle_dee(widget &)
{
return false;
}
address_type * address_; // <----
};
#include "fubar-example"
// here I'll write my first unit test
On I go, one step at a time, until finally, it compiles! Hoorah!

I'm reminded of a saying I heard (from Michael Stal). It's when there's a library you pull in but, on pulling it in, you find it has two further dependencies and you have to pull in those as well. And they have their dependencies too. etc etc. The saying is:
you reach for the banana; you get the whole gorilla!
Except that sometimes it's worse than that. Sometimes...
you reach for the banana; you get the whole jungle!
Ok. So now it compiles. But I haven't written my first unit-test yet! So I start that:
#include <string>
#define LOG(where,what) /*nothing*/
class widget ...
class address_type ...
class fubar
{
public:
void example(widget & w, int x);
bool tweedle_dee(widget &)
{
return false;
}
address_type * address_;
};
#include "fubar-example"
int main()
{
fubar f;
widget w;
f.example(w, 42); // <----
}
Now I have a test I can actually run!
Hoorah!
There's no actual assertions yet, but one thing at a time.
I run it.
It crashes of course.
The problem could be the address_ data member. The compiler-generated default constructor doesn't set it so it's a random pointer.
I might be able to fix that by repeating the same extraction of the constructor(s)
into separate #included files. Ultimately that's what I want of course.
But one thing at a time.
I can definitely fix it by writing my own constructor:
#include <string>
#define LOG(where,what) /*nothing*/
class widget ...
class address_type ...
class fubar
{
public:
explicit fubar(address_type * address) // <----
  : address_(address)
{ }
void example(widget & w, int x);
bool tweedle_dee(widget &)
{
return false;
}
address_type * address_;
};
#include "fubar-example"
int main()
{
address_type where; // <----
fubar f(&where); // <----
widget w;
f.example(w, 42);
}
Now it compiles and runs without crashing! Hoorah!
It is a horrible hack. Painful.
But the gorilla is no longer so invisible!
And if nothing else, I've got code reflecting the current understanding of my attempt to hack a way into the jungle!
I've made a start.
I've got something I can build on.
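One possible next step (a sketch of my own - this is beyond where the story above stops) is to grow the dummy widget into a simple spy that records the calls made on it, giving the test its first real assertion. We know from the extracted code that example() calls w.nudge(10,10):

#include <cassert>

class widget
{
public:
    widget() : nudged_dx(-1), nudged_dy(-1) { }
    void nudge(int dx, int dy) // records instead of doing nothing
    {
        nudged_dx = dx;
        nudged_dy = dy;
    }
    int nudged_dx;
    int nudged_dy;
};
...
int main()
{
    address_type where;
    fubar f(&where);
    widget w;
    f.example(w, 42);
    assert(w.nudged_dx == 10 && w.nudged_dy == 10); // first real assertion
}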
And remember, when you say "X is impossible" what you really mean is "I don't know how to X".
P.S.
Here's another horrible testing hack for C/C++.
c++ development slide deck
Here's the slide deck for a presentation I recently did for a large gathering of C++ developers.
Poker hands in Ruby
John Cleary (@TheRealBifter) is doing a nice project -
The 12 TDD's of Xmas.
Day 11 was Poker Hands. I did it in Ruby. Here's the code from traffic-light 116.
Tests first:
require './card'
require './hand'
require 'test/unit'
class TestUntitled < Test::Unit::TestCase
def test_start
hand = Hand.new("2H 4S 4C 2D 4H")
assert_equal Card.new('2',:hearts), hand[0]
assert_equal Card.new('4',:spades), hand[1]
assert_equal Card.new('4',:clubs), hand[2]
assert_equal Card.new('2',:diamonds), hand[3]
assert_equal Card.new('4',:hearts), hand[4]
end
def test_card_has_pips_and_suit_set_on_creation
card = Card.new('2',:hearts)
assert_equal '2', card.pips
assert_equal :hearts, card.suit
card = Card.new('T',:hearts)
assert_equal 'T', card.pips
assert_equal :hearts, card.suit
card = Card.new('J',:hearts)
assert_equal 'J', card.pips
assert_equal :hearts, card.suit
card = Card.new('Q',:hearts)
assert_equal 'Q', card.pips
assert_equal :hearts, card.suit
card = Card.new('K',:hearts)
assert_equal 'K', card.pips
assert_equal :hearts, card.suit
card = Card.new('A',:hearts)
assert_equal 'A', card.pips
assert_equal :hearts, card.suit
end
def test_hand_ranked_three_of_a_kind
assert_equal :three_of_a_kind, Hand.new("2H 4S 4C AD 4H").rank
end
def test_hand_ranked_one_pair
assert_equal :one_pair, Hand.new("2H 4S 5C JD 4H").rank
end
def test_hand_ranked_two_pairs
assert_equal :two_pairs, Hand.new("2H 4S 5C 2D 4H").rank
end
def test_hand_ranked_flush
assert_equal :flush, Hand.new("2H 4H 6H 8H TH").rank
end
def test_hand_ranked_straight
assert_equal :straight, Hand.new("2H 3C 4H 5H 6H").rank
end
def test_hand_ranked_full_house
assert_equal :full_house, Hand.new("2H 4S 4C 2D 4H").rank
end
def test_hand_ranked_four_of_a_kind
assert_equal :four_of_a_kind, Hand.new("2H 4S 4C 4D 4H").rank
end
def test_hand_ranked_straight_flush
assert_equal :straight_flush, Hand.new("2H 4H 3H 5H 6H").rank
end
def test_hand_ranked_high_card
assert_equal :high_card, Hand.new("2C 3H 4S 8C AH").rank
end
def test_full_house_beats_flush
black = Hand.new("2H 4S 4C 2D 4H")
white = Hand.new("2S 8S AS QS 3S")
assert_equal 1, black <=> white
end
def test_higher_card_wins_if_equal_rank
black = Hand.new("2H 3D 5S 9C KD")
assert_equal :high_card, black.rank
white = Hand.new("2C 3H 4S 8C AH")
assert_equal :high_card, white.rank
assert_equal -1, black <=> white
end
def test_equal_hands
black = Hand.new("2H 3D 5S 9C KD")
assert_equal :high_card, black.rank
white = Hand.new("2D 3H 5C 9S KH")
assert_equal :high_card, white.rank
assert_equal 0, black <=> white
end
end
Code second:
class Hand
def initialize(cards)
@cards =
cards.gsub(/\s+/, "")
.scan(/.{2}/)
.map{|ch| Card.new(ch[0],suit(ch[1]))}
end
def [](n)
return @cards[n]
end
def rank
return :straight_flush if straight? && flush?
return :flush if flush?
return :straight if straight?
pip_tallies = pip_counts.sort.reverse
return {
[4,1] => :four_of_a_kind,
[3,2] => :full_house,
[3,1] => :three_of_a_kind,
[2,2] => :two_pairs,
[2,1] => :one_pair,
[1,1] => :high_card
}[pip_tallies[0..1]]
end
def <=>(other)
keys <=> other.keys
end
def keys
[ranking,pip_counts]
end
private
def ranking
ranks.index(rank)
end
def ranks
[
:high_card,
:one_pair,
:two_pairs,
:three_of_a_kind,
:straight,
:flush,
:full_house,
:four_of_a_kind,
:straight_flush
]
end
def pip_counts
"23456789TJQKA"
.chars
.collect {|pips| pip_count(pips)}
end
def pip_count(pips)
@cards.count{|card| card.pips == pips}
end
def pip_flags
pip_counts.map{|n| n > 0 ? 'T' : 'F'}.join
end
def straight?
pip_flags.include? 'TTTTT'
end
def flush?
suit_counts.any?{|n| n == 5}
end
def suit_counts
suits.collect{|suit| suit_count(suit)}
end
def suits
[:clubs,:diamonds,:hearts,:spades]
end
def suit_count(suit)
@cards.count{|card| card.suit == suit}
end
def suit(ch)
return suits["CDHS".index(ch)]
end
end
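(A note on the <=>: keys returns [ranking, pip_counts] and Ruby's Array#<=> is lexicographic, so hands are ordered by their ranking first, with the pip counts acting as the tie-breaker.)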
class Card
def initialize(pips,suit)
@pips,@suit = pips,suit
end
def ==(other)
pips == other.pips && suit == other.suit
end
def pips
@pips
end
def suit
@suit
end
end

You can replay my entire progression (warts and all) on Cyber-Dojo (naturally).
Phone Numbers in Ruby
John Cleary (@TheRealBifter) is doing a nice project -
The 12 TDD's of Xmas.
Day 10 was the Phone number prefix problem. I did it in Ruby. Here's the code from traffic-light 14.
Tests first (very minimal - I'm a bit pressed for time):
require './consistent'
require 'test/unit'
class TestUntitled < Test::Unit::TestCase
def test_consistent_phone_list
list = {
'Bob' => '91125426',
'Alice' => '97625992',
}
assert consistent(list)
end
def test_inconsistent_phone_list
list = {
'Bob' => '91125426',
'Alice' => '97625992',
'Emergency' => '911'
}
assert !consistent(list)
end
end
Code second:
def consistent(list)
list.values.sort.each_cons(2).none? { |pair| prefix(*pair) }
end
def prefix(lhs,rhs)
rhs.start_with? lhs
end
The each_cons from the previous problem proved very handy here. As did none?
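Sorting is what makes checking only adjacent pairs sufficient: in lexicographic order, if one number is a prefix of another then it is also a prefix of every number sorted between them, so any prefix relationship must show up in some consecutive pair.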
I really feel I'm starting to get the hang of ruby.
You can
replay my entire progression (warts and all) on Cyber-Dojo (naturally).
Monty Hall in Ruby
John Cleary (@TheRealBifter) is doing a nice project -
The 12 TDD's of Xmas.
Day 4 was the Monty Hall problem. I did it in Ruby. Here's the code from traffic-light 111.
Tests first:
require './monty_hall'
require 'test/unit'
class TestMontyHall < Test::Unit::TestCase
def test_either_goat_door_is_opened_when_you_choose_the_car_door
check_either_goat([:car, :goat, :goat],
{ :chosen_door => 0,
:goat_doors => [1, 2]
})
check_either_goat([:goat, :car, :goat],
{ :chosen_door => 1,
:goat_doors => [0, 2]
})
check_either_goat([:goat, :goat, :car],
{ :chosen_door => 2,
:goat_doors => [0, 1]
})
end
def test_other_goat_door_is_opened_when_you_choose_a_goat_door
prizes = [:car, :goat, :goat]
check_other_goat(prizes,
{ :chosen_door => 1,
:opened_door => 2,
:offered_door => 0
})
check_other_goat(prizes,
{ :chosen_door => 2,
:opened_door => 1,
:offered_door => 0
})
prizes = [:goat, :car, :goat]
check_other_goat(prizes,
{ :chosen_door => 0,
:opened_door => 2,
:offered_door => 1
})
check_other_goat(prizes,
{ :chosen_door => 2,
:opened_door => 0,
:offered_door => 1
})
prizes = [:goat, :goat, :car]
check_other_goat(prizes,
{ :chosen_door => 0,
:opened_door => 1,
:offered_door => 2
})
check_other_goat(prizes,
{ :chosen_door => 1,
:opened_door => 0,
:offered_door => 2
})
end
def test_strategy_of_sticking_with_chosen_door
wins = Array.new(big) { MontyHall.new() }
.count { |game| game.chosen_door == game.car_door }
puts "Win car(sticking with chosen door):#{wins}/#{big}"
end
def test_strategy_of_switching_to_offered_door
wins = Array.new(big) { MontyHall.new() }
.count { |game| game.offered_door == game.car_door }
puts "Win car(switching to offered door):#{wins}/#{big}"
end
#- - - - - - - - - - - - - - - - - - - - - - - -
def check_either_goat(prizes, expected)
chosen_door = expected[:chosen_door]
goat_doors = expected[:goat_doors]
check_params(prizes, chosen_door, goat_doors[0], goat_doors[1])
goat_counts = [0,0]
100.times do |n|
game = MontyHall.new(prizes,chosen_door)
opened_door = game.opened_door
offered_door = game.offered_door
assert_equal chosen_door, game.chosen_door
assert_equal goat_doors.sort, [opened_door,offered_door].sort
assert_equal doors, [chosen_door,opened_door,offered_door].sort
assert_equal :car , prizes[chosen_door]
assert_equal :goat, prizes[opened_door]
assert_equal :goat, prizes[offered_door]
[0,1].each do |n|
goat_counts[n] += (offered_door == goat_doors[n] ? 1 : 0)
end
end
[0,1].each { |n| assert goat_counts[n] > 25 }
end
def check_other_goat(prizes, expected)
chosen_door = expected[:chosen_door]
opened_door = expected[:opened_door]
offered_door = expected[:offered_door]
check_params(prizes, chosen_door, opened_door, offered_door)
game = MontyHall.new(prizes, chosen_door)
assert_equal chosen_door, game.chosen_door
assert_equal opened_door, game.opened_door
assert_equal offered_door, game.offered_door
assert_equal :goat, prizes[ chosen_door]
assert_equal :goat, prizes[ opened_door]
assert_equal :car , prizes[offered_door]
end
def check_params(prizes, door1, door2, door3)
assert_equal 3, prizes.length
prizes.each { |prize| assert [:goat,:car].include? prize }
assert_equal doors, [door1,door2,door3].sort
end
def doors
[0,1,2]
end
def big
1000
end
end
Code second:
class MontyHall
def initialize(prizes = [:goat,:goat,:car].shuffle,
chosen_door = doors.shuffle[0])
@prizes = prizes
@chosen_door = chosen_door
@car_door = prizes.find_index { |prize| prize == :car }
if prizes[chosen_door] == :car
@opened_door = goat_doors.shuffle[0]
end
if prizes[chosen_door] == :goat
@opened_door = (goat_doors - [chosen_door])[0]
end
@offered_door = (doors - [chosen_door, opened_door])[0]
end
def chosen_door
@chosen_door
end
def car_door
@car_door
end
def opened_door
@opened_door
end
def offered_door
@offered_door
end
private
def doors
[0,1,2]
end
def goat_doors
doors.select { |door| @prizes[door] == :goat }
end
end
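Run the tests and the two strategy counts come out at roughly 333/1000 for sticking and 667/1000 for switching - the familiar Monty Hall 1/3 versus 2/3.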
You can
replay my entire progression (warts and all) on Cyber-Dojo (naturally).
Bowling Game in Ruby
John Cleary (@TheRealBifter) is doing a nice project -
The 12 TDD's of Xmas.
Day 9 was the Bowling Game problem. I did it in Ruby. Here's the code from traffic-light 100.
Tests first:
require './score'
require 'test/unit'

class TestScore < Test::Unit::TestCase
def test_score_uninteresting_game
# 7 7 7 7 7 = 35
# 6 6 6 6 6 = 30
balls = "51|52|51|52|51|52|51|52|51|52"
assert_equal 65, score(balls)
end
def test_score_all_frames_5_spare
# 15 15 15 15 15 = 75
# 15 15 15 15 15 = 75
balls = "5/|5/|5/|5/|5/|5/|5/|5/|5/|5/|5"
assert_equal 150, score(balls)
end
def test_perfect_score
# 30 30 30 30 30 = 150
# 30 30 30 30 30 = 150
balls = "X|X|X|X|X|X|X|X|X|X|XX"
assert_equal 300, score(balls)
end
def test_10_strikes_then_33
# 30 30 30 30 16 = 136
# 30 30 30 30 23 = 143
balls = "X|X|X|X|X|X|X|X|X|X|33"
assert_equal 279, score(balls)
end
def test_game_with_spare_in_middle_of_strikes
# 20 30 30 30 30 = 140
# 25 20 30 30 30 = 135
balls = "X|X|5/|X|X|X|X|X|X|X|XX"
assert_equal 275, score(balls)
end
def test_game_with_strike_in_middle_of_spares
# 15 20 15 13 20 = 83
# 13 11 20 14 12 = 70
balls = "2/|3/|5/|1/|X|3/|5/|4/|3/|2/|X"
assert_equal 153, score(balls)
end
def test_game_with_zero_balls
# 8 7 7 7 7 = 36
# 6 5 6 6 6 = 29
balls = "51|62|50|52|51|52|51|52|51|52"
assert_equal 65, score(balls)
end
def test_game_with_dash_as_zero_balls
# 8 7 7 7 8 = 37
# 6 5 6 6 6 = 29
balls = "51|62|5-|52|51|52|51|52|51|62"
assert_equal 66, score(balls)
end
end

Code second:
def score(balls)
frames = (balls).split("|")
while frames.length != 12
frames << "0"
end
frames.each_cons(3).collect{ |frame| frame_score(frame) }.inject(:+)
end
def frame_score(frames)
if strike? frames[0]
10 + strike_bonus(frames[1..2])
elsif spare? frames[0]
10 + ball_score(frames[1][0])
else
frames[0].chars.collect{ |ball| ball_score(ball) }.inject(:+)
end
end
def strike_bonus(frames)
if frames[0] == "XX"
20
elsif strike? frames[0]
10 + ball_score(frames[1][0])
elsif spare? frames[0]
10
else
ball_score(frames[0][0]) + ball_score(frames[0][1])
end
end
def ball_score(ball)
if strike? ball
10
else
ball.to_i
end
end
def strike?(frame)
frame == "X"
end
def spare?(frame)
frame[-1] == "/"
end
It took me a while to realize that each_slice should have been each_cons.
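The each_cons(3) is needed because a frame's score can depend on the next two frames (a strike's bonus can reach into the frame after next), and padding the frame list out to 12 with "0" guarantees that even the 10th frame has two successors to look at.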
You can
replay my entire progression (warts and all) on Cyber-Dojo (naturally).
the power of pairing
In a previous post I talked about an example where a pair performed better than either individually.
Here's another small example that happened to me today.
My very good friend Syver was running a coding dojo in Erlang. A test for the filter exercise looked like this:
odd(Integer) when Integer rem 2 == 1 -> true;
odd(_) -> false.

filter_test() ->
  ?assertEqual([3, 1], filter(fun(Elem) -> odd(Elem) end, [1, 2, 3, 4])).

I don't know Erlang at all, so Syver helped me out with this first-cut version:
filter(_, []) -> [];
filter(Fun, [Head|Tail]) ->
  filter_helper(Fun(Head), Fun, Head, Tail, []).

filter_helper(true, Fun, TrueElement, [Head|Tail], Result) ->
  filter_helper(Fun(Head), Fun, Head, Tail, [TrueElement|Result]);
filter_helper(true, _, TrueElement, [], Result) ->
  [TrueElement|Result];
filter_helper(false, Fun, _, [Head|Tail], Result) ->
  filter_helper(Fun(Head), Fun, Head, Tail, Result);
filter_helper(false, _, _, [], Result) ->
  Result.

I realized Erlang is a lot like Prolog which I knew in the dim and distant past. I stared at the code and after several minutes something about it started nagging me. It was the splitting of the list into its Head and Tail elements in both filter and filter_helper. I wondered if this duplication could be avoided by making filter_helper call back into filter. After several false attempts I came up with:
filter(_, []) -> [];
filter(Fun, [Head|Tail]) ->
  filter_helper(Fun(Head), Fun, Head, Tail).

filter_helper(true, Fun, TrueElement, List) ->
  X = filter(Fun,List),
  [TrueElement|X];
filter_helper(false, Fun, _FalseElement, List) ->
  filter(Fun, List).

Then Syver showed me how the use of X could be collapsed:
filter(_, []) -> [];
filter(Fun, [Head|Tail]) ->
  filter_helper(Fun(Head), Fun, Head, Tail).

filter_helper(true, Fun, TrueElement, List) ->
  [TrueElement|filter(Fun, List)];
filter_helper(false, Fun, _FalseElement, List) ->
  filter(Fun, List).

We did some argument renaming:
filter(_, []) -> [];
filter(Fun, [Head|Tail]) ->
  filter_helper(Fun(Head), Fun, Head, Tail).

filter_helper(true, Fun, Head, List) ->
  [Head|filter(Fun, List)];
filter_helper(false, Fun, _, List) ->
  filter(Fun, List).

Then Syver noticed that we could pass Head,Tail as a single [Head|Tail] argument:
filter(_, []) -> [];
filter(Fun, [Head|Tail]) ->
  filter_helper(Fun(Head), Fun, [Head|Tail]).

filter_helper(true, Fun, [Head|Tail]) ->
  [Head|filter(Fun, Tail)];
filter_helper(false, Fun, [_|Tail]) ->
  filter(Fun, Tail).

We agreed that filter_helper was better as filter:
filter(_, []) -> [];
filter(Fun, [Head|Tail]) ->
  filter(Fun(Head), Fun, [Head|Tail]).

filter(true, Fun, [Head|Tail]) ->
  [Head|filter(Fun, Tail)];
filter(false, Fun, [_|Tail]) ->
  filter(Fun, Tail).

As a final polish Syver refactored to this:
filter(_, []) -> [];
filter(Fun, [Head|Tail] = List) ->
  filter(Fun(Head), Fun, List).

filter(true, Fun, [Head|Tail]) ->
  [Head|filter(Fun, Tail)];
filter(false, Fun, [_|Tail]) ->
  filter(Fun, Tail).

and then, again, after the dojo, thanks to Syver's comment below, to this:
filter(_, []) -> [];
filter(Fun, [Head|_] = List) ->
  filter(Fun(Head), Fun, List).

filter(true, Fun, [Head|Tail]) ->
  [Head|filter(Fun, Tail)];
filter(false, Fun, [_|Tail]) ->
  filter(Fun, Tail).

We agreed that the final version was definitely better than either of us could have come up with individually.
test code is code but it's different
Here's a bit of code...
def to_roman(n)
  result = ''
  ['V',5,'IV',4,'I',1].each_slice(2) do |ch,unit|
    while n >= unit do
      result += ch
      n -= unit
    end
  end
  result
end
And here's a bit of test code...
def test_arabic_integer_to_roman_string
  assert_equal "I", to_roman(1)
  assert_equal "II", to_roman(2)
  assert_equal "III", to_roman(3)
  assert_equal "IV", to_roman(4)
  assert_equal "V", to_roman(5)
  assert_equal "VI", to_roman(6)
  assert_equal "VII", to_roman(7)
  assert_equal "VIII", to_roman(8)
end
I'm sometimes asked which is more important, the code or the tests? Does it matter if you refactor the code but let debt accumulate in the tests? A question like this tells me that the questioner probably doesn't really understand that test-code is the yin to the yang of the code it tests and vice versa. That they form a co-evolving system. When you feel the tests need refactoring that isn't a sign you did something wrong. That's just part of development! The tests mean that code under test is not as closed a system as it would be without them. And being less closed it can stave off entropy for longer.
That's not to say that code and tests are the same. They're both code, but they're not the same.
For example, suppose I have a metric that calculates a measure of the complexity of code. I use this metric to calculate the complexity of the code under test and also the complexity of its tests. Imagine if the complexity of my tests is greater than the complexity of the code it tests. How do I know the tests aren't wrong? Should I write some tests for the tests? And if the pattern repeats and the complexity of the tests for the tests is higher still should I write some tests for the tests for the tests? No. That way leads nowhere. I don't want complexity in my tests. I want them simple. I want them linear. The complexity of my tests might be greater than one, but it must be smaller than the complexity of what they're testing.
Test code is code but it's different.
Or consider the names of functions. In code under test I want Goldilocks names - names that are not too long, and not too short. On the one hand I want reasonably long names - long enough to express intention. On the other hand, names always occur as sub-expressions of larger expressions. If my names are too long, the full expressions they're part of quickly become unwieldy. It's in the full expressions I really want understandability, so I can turn the name-length-dial down a bit.
But the names of my test functions are different. They are never part of larger expressions. So for them I can turn the name-length-dial up up up - all the way to, deep breath,
no_newline_at_end_of_file_msg_is_gobbled_when_at_end_of_common_section
Test code is code but it's different.
no scaffolding means we're done
Suppose I'm doing the print-diamond kata in cyber-dojo in Java. I start with a test
@Test
public void diamond_A() {
String[] expected = {
"A"
};
String[] actual = new Diamond('A').toLines();
assertArrayEquals(expected, actual);
}
I slime a solution as follows
public class Diamond {
private char widest;
public Diamond(char widest) {
this.widest = widest;
}
public String[] toLines() {
return new String[]{ "A" };
}
}
now I add a second test
@Test
public void diamond_B() {
String[] expected = {
" A ",
"B B",
" A ",
};
String[] actual = new Diamond('B').toLines();
assertArrayEquals(expected, actual);
}
and I slime again as follows
public class Diamond {
private char widest;
public Diamond(char widest) {
this.widest = widest;
}
public String[] toLines() {
if (widest == 'A')
return new String[] {
"A"
};
else
return new String[] {
" A ",
"B B",
" A ",
};
}
}
Like all techniques this approach has a certain style.
In this case there is a small but definite asymmetry between the specificness of the tests
(one for 'A' and another one for 'B') and the slightly less specificness of the code (an explicit if for 'A' but a default everything-else for 'B').
This is a style that is relaxed about the asymmetry, a style that emphasises this step as merely one temporary step on the path of many steps leading towards something more permanent. A style that recognises that code, by its nature, is always going to be more general than tests.
But suppose you get hit by a bus tomorrow. How easily could your colleagues tell that this code was work in progress? Code that was not finished?
As an experiment I thought I would try an alternative style. One that is not so relaxed about the asymmetry. One that tries a bit harder to be more explicit about distinguishing code that's a temporary step still on the path from code that has reached its destination.
First I wrote a test that expresses the fact that nothing is implemented.
@Test(expected = ScaffoldingException.class)
public void scaffolding() {
new Diamond('Z').toLines();
}
I make this pass as follows
public class Diamond {
private char widest;
public Diamond(char widest) {
this.widest = widest;
}
public String[] toLines() {
throw new ScaffoldingException("not done");
}
}
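(ScaffoldingException itself is nothing special - a one-line subclass of RuntimeException is all the throw and the expected attribute need.)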
The scaffolding, as its name suggests, will have to be taken down once the code is done.
While it remains, it indicates that the code is not done.
Now I start.
I add the diamond_A test (as before) and make it pass
public class Diamond {
...
public String[] toLines() {
if (widest == 'A') {
return new String[]{ "A" };
}
throw new ScaffoldingException("not done");
}
}
I add a second diamond_B test (as before) and make it pass
public class Diamond {
...
public String[] toLines() {
if (widest == 'A')
return new String[] {
"A"
};
if (widest == 'B')
return new String[] {
" A ",
"B B",
" A ",
};
throw new ScaffoldingException("not done");
}
}
The scaffolding is still there.
We haven't finished yet.
Now suppose I refactor the slime (by deliberately duplicating) and end up with this
public class Diamond {
...
public String[] toLines() {
if (widest == 'A') {
String[] inner = innerDiamond();
String[] result = new String[inner.length-1];
int mid = inner.length / 2;
for (int dst=0,src=0; src != inner.length; src++)
if (src != mid)
result[dst++] = inner[src];
return result;
}
if (widest == 'B') {
String[] inner = innerDiamond();
String[] result = new String[inner.length-1];
int mid = inner.length / 2;
for (int dst=0,src=0; src != inner.length; src++)
if (src != mid)
result[dst++] = inner[src];
return result;
}
throw new ScaffoldingException("not done");
}
}
The code inside the two if statements is (deliberately) identical so I refactor to this
public class Diamond {
...
public String[] toLines() {
if (widest == 'A' || widest == 'B') {
String[] inner = innerDiamond();
String[] result = new String[inner.length-1];
int mid = inner.length / 2;
for (int dst=0,src=0; src != inner.length; src++)
if (src != mid)
result[dst++] = inner[src];
return result;
}
throw new ScaffoldingException("not done");
}
}
Now I add a new test for 'C'
@Test
public void diamond_C() {
String[] expected = {
" A ",
" B B ",
"C C",
" B B ",
" A ",
};
String[] actual = new Diamond('C').toLines();
assertArrayEquals(expected, actual);
}
This fails. I make it pass by changing the line
if (widest == 'A' || widest == 'B')

to

if (widest == 'A' || widest == 'B' || widest == 'C')

Now I remove the if completely
public class Diamond {
...
public String[] toLines() {
String[] inner = innerDiamond();
String[] result = new String[inner.length-1];
int mid = inner.length / 2;
for (int dst=0,src=0; src != inner.length; src++)
if (src != mid)
result[dst++] = inner[src];
return result;
throw new ScaffoldingException("not done");
}
}
And it no longer compiles. I have unreachable scaffolding.
Time for the scaffolding to come down.
I delete the throw statement.
public class Diamond {
...
public String[] toLines() {
String[] inner = innerDiamond();
String[] result = new String[inner.length-1];
int mid = inner.length / 2;
for (int dst=0,src=0; src != inner.length; src++)
if (src != mid)
result[dst++] = inner[src];
return result;
}
}
Now the scaffolding() test fails. I delete that too.

We're green.
The scaffolding is gone.
We're done.
sliming and refactoring and deliberate duplication
Suppose I'm doing the Print-Diamond kata in Ruby:
Given a letter print a diamond starting with 'A' with the supplied letter at the widest point. For example: print-diamond 'E' prints

    A
   B B
  C   C
 D     D
E       E
 D     D
  C   C
   B B
    A

I start with a test
def test_diamond_A
assert_equal ['A'], diamond('A')
end
which I pass using
def diamond(widest)
  ['A']
end

I add another test
def test_diamond_B
assert_equal [' A',
'B B',
' A'], diamond('B')
end
which I pass using
def diamond(widest)
  if widest == 'A'
    return ['A']
  end
  if widest == 'B'
    return [' A',
            'B B',
            ' A']
  end
end

I add one more test
def test_diamond_C
assert_equal ['  A',
' B B',
'C   C',
' B B',
'  A'], diamond('C')
end
which I pass using
def diamond(widest)
  if widest == 'A'
    return ['A']
  end
  if widest == 'B'
    return [' A',
            'B B',
            ' A']
  end
  if widest == 'C'
    return ['  A',
            ' B B',
            'C   C',
            ' B B',
            '  A']
  end
end

The tests have already proved valuable:
- I've decided I don't want to actually test printing
- I've chosen the result format - an array of strings
- I've chosen not to embed newlines at the end of the strings
- I've something to refactor against
While coding the array of strings for the 'C' case I found myself copying the result for 'B' and modifying that. Specifically, I had to:
- duplicate the 'B B' string
- add a space at the start of the ' A' and 'B B' strings
- add a new middle string 'C C'
def diamond(widest)
  d = inner_diamond(widest)
  mid = d.length / 2
  d[0..mid-1] + d[mid+1..-1]
end

def inner_diamond(widest)
  if widest == 'A'
    return ['A',
            'A']
  end
  if widest == 'B'
    return [' A',
            'B B',
            'B B',
            ' A']
  end
  if widest == 'C'
    return ['  A',
            ' B B',
            'C   C',
            'C   C',
            ' B B',
            '  A']
  end
end

This looks a promising step towards a recursive solution - to make the implementation of 'C' contain the implementation of 'B' and then add strings only for 'C'. So, remembering what I had to do when copying and modifying, I refactored to this:
def inner_diamond(widest)
if widest == 'A'
return ['A',
'A']
end
if widest == 'B'
return [' A',
'B B',
'B B',
' A']
end
if widest == 'C'
b = inner_diamond('B')
upper,lower = split(b.map{ |s| ' ' + s })
c = widest + '   ' + widest
return upper + [c,c] + lower
end
end
def split(array)
mid = array.length / 2
[ array[0..mid-1], array[mid..-1] ]
end
From here I verified the recursive solution works for 'B' as well:
def inner_diamond(widest)
if widest == 'A'
return ['A',
'A']
end
if widest == 'B'
a = inner_diamond('A')
upper,lower = split(a.map{ |s| ' ' + s })
b = widest + ' ' + widest
return upper + [b,b] + lower
end
if widest == 'C'
b = inner_diamond('B')
upper,lower = split(b.map{ |s| ' ' + s })
c = widest + '   ' + widest
return upper + [c,c] + lower
end
end
Now I worked on generalizing the use of the hard-coded argument to inner_diamond() and the hard-coded number of spaces:
def inner_diamond(widest)
if widest == 'A'
return ['A','A']
end
if widest == 'B'
a = inner_diamond(previous(widest))
upper,lower = split(a.map{ |s| ' ' + s })
n = (widest.ord - 'A'.ord) * 2 - 1
b = widest + (' ' * n) + widest
return upper + [b,b] + lower
end
if widest == 'C'
b = inner_diamond(previous(widest))
upper,lower = split(b.map{ |s| ' ' + s })
n = (widest.ord - 'A'.ord) * 2 - 1
c = widest + (' ' * n) + widest
return upper + [c,c] + lower
end
end
def previous(letter)
(letter.ord - 1).chr
end
Now I collapsed the duplicated specific code to its more generic form:
def inner_diamond(widest)
if widest == 'A'
return ['A','A']
else
a = inner_diamond(previous(widest))
upper,lower = split(a.map{ |s| ' ' + s })
n = (widest.ord - 'A'.ord) * 2 - 1
b = widest + (' ' * n) + widest
return upper + [b,b] + lower
end
end
Finally some renaming:
def inner_diamond(widest)
if widest == 'A'
return ['A','A']
else
inner = inner_diamond(previous(widest))
upper,lower = split(inner.map{ |s| ' ' + s })
n = (widest.ord - 'A'.ord) * 2 - 1
middle = widest + (' ' * n) + widest
return upper + [middle,middle] + lower
end
end
To summarise:
- When sliming I try to think ahead and choose tests which allow me to unslime the slime.
- If I have slimed 3 times, my next step should be to unslime rather than adding a 4th gob of slime.
- My first unsliming step is often deliberate duplication, done in a way that allows me to collapse the duplication.
red-green starts with red
Suppose I've written a test and got it passing...
@Test
public void yahtzee_full_house_scores_25() {
assertEquals(25,
new YahtzeeScorer(4,4,3,4,3).fullHouse());
}
After refactoring I write my next test...
@Test
public void yahtzee_not_full_house_scores_0() {
assertEquals(0,
new YahtzeeScorer(4,4,4,4,4).fullHouse());
}
And this passes first time. That is, it passes without failing first.
One of the reasons for failing first is to be sure the test is
actually running. For example, suppose I'd forgotten to write @Test and didn't
notice that the JUnit output wasn't showing one more test passing. Ooops.
A question I've been asked on several occasions is whether, in this situation, you should
change the code or the tests to force an initial red. For example, my
first version of the yahtzee_not_full_house_scores_0 could have been this, (where
I've deliberately used 42 instead of 0 simply because 42 is a good example of a number
that is not 0):
@Test
public void yahtzee_not_full_house_scores_0() {
assertEquals(42,
new YahtzeeScorer(4,4,4,4,4).fullHouse());
}
I see it fail, and then change the 42 to 0 and see it pass.
This works, and I have done this. Perhaps you have too.
If so, do you agree that it doesn't feel right?
I've learned that when something doesn't feel right my subconscious is trying to tell
me something. I've learned that when I've got two choices it's often a good
idea to look for a third. And there is a third way. I could instead
start with this:
@Test
public void yahtzee_not_full_house_scores_0() {
fail("RED-FIRST");
}
And when I've seen it fail, I delete the fail() and write
the actual code I want to write:
@Test
public void yahtzee_not_full_house_scores_0() {
assertEquals(0,
new YahtzeeScorer(4,4,4,4,4).fullHouse());
}
Or, as an alternative, I could start with this:
@Test
public void yahtzee_not_full_house_scores_0() {
fail("RED-FIRST");
assertEquals(0,
new YahtzeeScorer(4,4,4,4,4).fullHouse());
}
and when I've seen it fail I simply delete the fail() line.
Either way I get the mechanics of seeing the fail out of the way and then I write the code as a separate thing. By un-asking the question I avoid having to decide what to temporarily fiddle with - the code or the tests. I get to write the code I actually want to write. All the time.
Bare bones ruby unit testing
This morning I spent a happy hour exploring a little of ruby's Test::Unit::TestCase.
I started with this:
require 'test/unit'

class MyTest < Test::Unit::TestCase
  def test_one_plus_one_equals_two
    assert_equal 2, 1+1.1
  end
end

I wanted to see how little I needed to write my own, super-minimal implementation of Test::Unit::TestCase...
require 'my_test_case'

class MyTest < MyTestCase
  def test_one_plus_one_equals_two
    assert_equal 2, 1+1.1
  end
end

After 204 traffic lights in cyber-dojo I ended up with this...
require 'assertion_failed_error'
class MyTestCase
def self.test_names
public_instance_methods.select{|name| name =~ /^test_/}
end
def assert_equal( expected, actual )
message =
"#{expected.inspect} expected but was\n" +
"#{actual.inspect}\n"
assert_block(message) { expected == actual }
end
def assert_block( message )
if (! yield)
raise AssertionFailedError.new(message.to_s)
end
end
end
at_exit do
  ::ObjectSpace.each_object(Class) do |klass|
    if klass < MyTestCase
      klass.test_names.each do |method_name|
        begin
          klass.new.send method_name
        rescue AssertionFailedError => error
          print "#{klass.name}:#{method_name}:\n" +
                "#{error.message}"
        end
      end
    end
  end
end
class AssertionFailedError < RuntimeError; end
which allowed me to write...
require 'my_test_case'

class MyTest < MyTestCase
  def test_one_plus_one_equals_two
    assert_equal 2, 1+1.1
  end
end
and finally, I added this...
class MyTestCase
...
def self.test( name, &block )
define_method("test_#{name}".to_sym, &block)
end
...
end
which allowed me to rewrite the test as...
require 'my_test_case'

class MyTest < MyTestCase
  test "1+1 == 2" do
    assert_equal 2, 1+1.1
  end
end
Fun :-)
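As a quick sanity check: the test deliberately fails (1+1.1 is 2.1, not 2), so given the print in the at_exit runner above I'd expect output along these lines (the method name is generated from the test description, so the description appears in it verbatim):

MyTest:test_1+1 == 2:
2 expected but was
2.1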
Isolating legacy C code from external dependencies
Code naturally resists being isolated if it isn't designed to be isolatable.
Isolating legacy code from external dependencies can be awkward.
In C and C++ the transitive nature of #includes is the most obvious and direct
reflection of the high-coupling such code exhibits.
However, there is a technique
you can use to isolate a source file by cutting all its #includes.
It relies on a little known third way of writing a #include.
From the C standard:
6.10.2 Source file inclusion
... A preprocessing directive of the form:
#include pp-tokens
(that does not match one of the two previous forms) is permitted. The preprocessing tokens after include in the directive are processed just as in normal text. ... The directive resulting after all replacements shall match one of the two previous forms.
An example. Suppose you have a legacy C source file that you want to write some unit tests for. For example:
/* legacy.c */
#include "wibble.h"
#include <stdio.h>
...
int legacy(int a, int b)
{
FILE * stream = fopen("some_file.txt", "w");
char buffer[256];
int result = sprintf(buffer,
"%d:%d:%d", a, b, a * b);
fwrite(buffer, 1, sizeof buffer, stream);
fclose(stream);
return result;
}
Your first step is to
create a file called nothing.h as follows:
/* nothing! */
nothing.h is a file containing nothing and is an example of the
Null Object Pattern.
Then you refactor legacy.c to this:
/* legacy.c */
#if defined(UNIT_TEST)
# define LOCAL(header) "nothing.h"
# define SYSTEM(header) "nothing.h"
#else
# define LOCAL(header) #header
# define SYSTEM(header) <header>
#endif
#include LOCAL(wibble.h)  /* <--- */
#include SYSTEM(stdio.h)  /* <--- */
...
int legacy(int a, int b)
{
    FILE * stream = fopen("some_file.txt", "w");
    char buffer[256];
    int result = sprintf(buffer,
        "%d:%d:%d", a, b, a*b);
    fwrite(buffer, 1, sizeof buffer, stream);
    fclose(stream);
    return result;
}
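To see what the LOCAL and SYSTEM macros buy you, here is how the two #include lines expand (a sketch; it follows from the #header stringizing operator and the third #include form quoted above):

/* compiled with UNIT_TEST defined:                       */
/*   #include LOCAL(wibble.h)  ->  #include "nothing.h"   */
/*   #include SYSTEM(stdio.h)  ->  #include "nothing.h"   */
/* compiled normally, #header stringizes its argument:    */
/*   #include LOCAL(wibble.h)  ->  #include "wibble.h"    */
/*   #include SYSTEM(stdio.h)  ->  #include <stdio.h>     */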
Now structure your unit-tests for legacy.c as follows.
First you write null implementations of the external dependencies you want to fake (more Null Object Pattern):
/* legacy.test.c: Part 1 */
static FILE * fopen(const char * restrict filename,
const char * restrict mode)
{
return 0;
}
static size_t fwrite(const void * restrict ptr,
size_t size,
size_t nelem,
FILE * restrict stream)
{
return 0;
}
static int fclose(FILE * stream)
{
return 0;
}
Then #include the source file.
Note carefully that you're #including legacy.c here
and not legacy.h and you're #defining UNIT_TEST
so that legacy.c will have no #includes of its own:
/* legacy.test.c: Part 2 */
#define UNIT_TEST
#include "legacy.c"
Then write your tests:
/* legacy.test.c: Part 3 */
#include <assert.h>
void first_unit_test_for_legacy(void)
{
/* writes "2:9:18" which is 6 chars */
    assert(legacy(2,9) == 6);
}
int main(void)
{
first_unit_test_for_legacy();
return 0;
}
When you compile legacy.test.c you will find your first problem -
it does not compile! You have cut away all the #includes,
which cuts away not only the function declarations but also the type definitions,
such as FILE, which is used in the code under test as well as
in the real and the null fopen, fwrite, and fclose functions.
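With gcc or clang, for example, you can expect a diagnostic along these lines (the exact wording varies between compilers and versions):

error: unknown type name 'FILE'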
What you need to do now is introduce a seam only for the functions:
/* stdio.seam.h */
#ifndef STDIO_SEAM_INCLUDED
#define STDIO_SEAM_INCLUDED
#include <stdio.h>
struct stdio_t
{
FILE * (*fopen)(const char * restrict filename,
const char * restrict mode);
size_t (*fwrite)(const void * restrict ptr,
size_t size,
size_t nelem,
FILE * restrict stream);
int (*fclose)(FILE * stream);
};
extern const struct stdio_t stdio;
#endif
Now you refactor legacy.c
to use stdio.seam.h:
/* legacy.c */
#if defined(UNIT_TEST)
# define LOCAL(header) "nothing.h"
# define SYSTEM(header) "nothing.h"
#else
# define LOCAL(header) #header
# define SYSTEM(header) <header>
#endif
#include LOCAL(wibble.h)
#include LOCAL(stdio.seam.h)  /* <--- */
...
int legacy(int a, int b)
{
    FILE * stream = stdio.fopen("some_file.txt", "w");
    char buffer[256];
    int result = sprintf(buffer, "%d:%d:%d", a, b, a*b);
    stdio.fwrite(buffer, 1, sizeof buffer, stream);
    stdio.fclose(stream);
    return result;
}
Now you can structure your null functions as follows:
/* legacy.test.c: Part 1 */
#include "stdio.seam.h"
static FILE * null_fopen(const char * restrict filename,
const char * restrict mode)
{
return 0;
}
static size_t null_fwrite(const void * restrict ptr,
size_t size,
size_t nelem,
FILE * restrict stream)
{
return 0;
}
static int null_fclose(FILE * stream)
{
return 0;
}
const struct stdio_t stdio =
{
.fopen = null_fopen,
.fwrite = null_fwrite,
.fclose = null_fclose,
};
And voilà, you have a unit test.
Now you have your knife in the seam, you can push it in a bit further.
For example, you can do a little spying:
/* legacy.test.c: Part 1 */
#include "stdio.seam.h"
#include <assert.h>
#include <string.h>
static FILE * null_fopen(const char * restrict filename,
const char * restrict mode)
{
return 0;
}
static size_t spy_fwrite(const void * restrict ptr,
size_t size,
size_t nelem,
FILE * restrict stream)
{
    assert(strcmp("2:9:18", ptr) == 0);
return 0;
}
static int null_fclose(FILE * stream)
{
return 0;
}
const struct stdio_t stdio =
{
.fopen = null_fopen,
.fwrite = spy_fwrite,
.fclose = null_fclose,
};
This approach is pretty brutal, but it might just allow you to create an initial seam which you
can then gradually prise open. If nothing else it allows you to create
characterisation tests to familiarize yourself with legacy code.
You'll also need to create a trivial implementation of
stdio.seam.h
that the real code uses:
/* stdio.seam.c */
#include "stdio.seam.h"
#include <stdio.h>
const struct stdio_t stdio =
{
.fopen = fopen,
.fwrite = fwrite,
.fclose = fclose,
};
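For completeness, the two builds might look something like this (a sketch, assuming gcc and that all the files live in one directory):

# unit-test build: legacy.c is #included by legacy.test.c,
# so UNIT_TEST cuts all of legacy.c's own #includes
gcc -D UNIT_TEST -o legacy.test legacy.test.c
./legacy.test

# production build: real headers, real functions behind the seam
gcc -c legacy.c stdio.seam.c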
The -include compiler option might also prove useful.
-include file
Process file as if #include "file" appeared as the first line of the primary source file.
Using this you can create the following file:
/* include.seam.h */
#ifndef INCLUDE_SEAM
#define INCLUDE_SEAM
#if defined(UNIT_TEST)
#  define LOCAL(header) "nothing.h"
#  define SYSTEM(header) "nothing.h"
#else
#  define LOCAL(header) #header
#  define SYSTEM(header) <header>
#endif
#endif
and then compile with the
-include include.seam.h option.
This allows your
legacy.c file to look like this:
#include LOCAL(wibble.h)
#include LOCAL(stdio.seam.h)
...
int legacy(int a, int b)
{
FILE * stream = stdio.fopen("some_file.txt", "w");
char buffer[256];
int result = sprintf(buffer, "%d:%d:%d", a, b, a*b);
stdio.fwrite(buffer, 1, sizeof buffer, stream);
stdio.fclose(stream);
return result;
}
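Putting that together, the unit-test build for this version might be invoked as follows (again a sketch, assuming gcc):

gcc -D UNIT_TEST -include include.seam.h -o legacy.test legacy.test.c
./legacy.test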
every teardrop is a waterfall
I was listening to Coldplay the other day and got to thinking about waterfalls.
The classic waterfall diagram is written something like this:
Analysis
leading down to...
Design
leading down to...
Implementation
leading down to...
Testing.
The Testing phase at the end of the process is perhaps the biggest giveaway that something is very wrong. In waterfall, the testing phase at the end is what's known as a euphemism. Or, more technically, a lie. Testing at the end of waterfall is really Debugging. Debugging at the end of the process is one of the key dynamics that prevents waterfall from working. There are at least two reasons:
The first is that of all the activities performed in software development, debugging is the one that is the least estimable. And that's saying something! You don't know how long it's going to take to find the source of a bug let alone fix it. I recall listening to a speaker at a conference who polled the audience to see who'd spent the most time tracking down a bug (the word bug is another euphemism). It was just like an auction! Someone called out "3 days". Someone else shouted "2 weeks". Up and up it went. The poor "winner" had spent all day, every day, 9am-5pm for 3 months hunting one bug. And it wasn't even a very large audience. This 'debug it into existence' approach is one of the reasons waterfall projects take 90% of the time to get to 90% "done" (done is another euphemism) and then another 90% of the time to get to 100% done.
The second reason is Why do cars have brakes?. In waterfall, even if testing was testing rather than debugging, putting it at the end of the process means you'll have been driving around during analysis, design and implementation with no brakes! You won't be able to stop! And again, this tells you why waterfall projects take 90% of the time to get to 90% done and then another 90% of the time to get to 100% done. Assuming of course that they don't crash.
In Test First Development, the testing really is testing and it really is first. The tests become an executable specification. Specifying is the opposite of debugging. The first 8 letters of specification are S, P, E, C, I, F, I, C.
A test is specific in exactly the same way a debugging session is not.
#include - there is a third way
Isolating legacy code from external dependencies can be awkward. Code naturally resists being isolated if it isn't designed to be isolatable. In C and C++ the transitive nature of #includes is the most obvious and direct reflection of the high-coupling such code exhibits. There is a technique that you can use to isolate a source file by cutting all its #includes.
It relies on a little known third way of writing a #include.
From the C standard:
6.10.2 Source file inclusion
... A preprocessing directive of the form:
#include pp-tokens
(that does not match one of the two previous forms) is permitted. The preprocessing tokens after include in the directive are processed just as in normal text. ... The directive resulting after all replacements shall match one of the two previous forms.
An example. Suppose you have a legacy source file that you want to write some unit tests for. For example:
/* legacy.c */
#include "wibble.h"
#include <stdio.h>
int legacy(void)
{
...
info = external_dependency(stdout);
...
}
First create a file called
nothing.h as follows:
/* nothing! */
nothing.h is a file containing nothing and is an example of the
Null Object Pattern.
Then refactor legacy.c to this:
/* legacy.c */
#if defined(UNIT_TEST)
# define LOCAL(header) "nothing.h"
# define SYSTEM(header) "nothing.h"
#else
# define LOCAL(header) #header
# define SYSTEM(header) <header>
#endif
#include LOCAL(wibble.h)  /* <--- */
#include SYSTEM(stdio.h)  /* <--- */
int legacy(void)
{
    ...
    info = external_dependency(stdout);
    ...
}
Now structure your unit-tests for legacy.c as follows:
First you write the fake implementations of the external dependencies. Note that the type of
stdout is not FILE*.
/* legacy.test.c: Part 1 */
/* stdio.h never enters this translation unit, so stdout is
   free to be redefined with whatever type the test finds
   convenient */
int stdout;
int external_dependency(int stream)
{
    ...
    return 42;
}
Then #include the source file.
Note carefully that we're #including legacy.c here
and not legacy.h
/* legacy.test.c: Part 2 */
#include "legacy.c"
Then write your tests:
/* legacy.test.c: Part 3 */
#include <assert.h>
void first_unit_test_for_legacy(void)
{
...
assert(legacy() == expected);
...
}
int main(void)
{
first_unit_test_for_legacy();
return 0;
}
Then compile
legacy.test.c with the -D UNIT_TEST option.
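Concretely, that might look like this (a sketch, assuming gcc):

gcc -D UNIT_TEST -o legacy.test legacy.test.c
./legacy.test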
This is pretty brutal, but it might just allow you to create an initial seam which you can then gradually prise open. If nothing else it provides a way to create characterisation tests to familiarize yourself with legacy code.
The -include compiler option might also prove useful.
-include file
Process file as if #include "file" appeared as the first line of the primary source file.
Using this you can create the following file:
/* include_seam.h */
#ifndef INCLUDE_SEAM
#define INCLUDE_SEAM
#if defined(UNIT_TEST)
#  define LOCAL(header) "nothing.h"
#  define SYSTEM(header) "nothing.h"
#else
#  define LOCAL(header) #header
#  define SYSTEM(header) <header>
#endif
#endif
and then compile with the
-include include_seam.h option.