Test Driven Development in Python

TDD is a mindset

Test driven development is a way of designing as much as it is a way of developing.  I have been trying to use it on and off for several years now without much success.  I think I understand the process, but not necessarily the mindset.  So, I am taking on a challenge in hopes of learning the mindset as well as the process.  The transition to TDD is going to take practice, and I intend to get that practice using the Project Euler problems that I started on a couple of years ago.  One added benefit of the Euler problems is that they are usually simple enough (or I should say, “so far”) to be solved with a single class or even a single method.  This makes writing tests for the design a little more direct and therefore simpler.

I have decided to use Python, for now.  Python is simple to get started with because it is interpreted: there is no separate compile step, so I can edit a file and run it straight away.  I am also taking this opportunity to learn to use VIM a little better.  I have set up VIM as my Python IDE, and am pretty happy with it so far.  And when I am away from my main machine I can use my Koding virtual machine to work on a problem, but that is another post.  Plus I have always wanted to know more Python, and so far I am really enjoying it.  It was super easy to get up and running with TDD, and writing a unit test in Python is very easy, as you will see.  So, despite the fact that I write C# all day, every day at work, I am using an easier setup to get some practice in with Test Driven Development.

The Problem

I am currently on problem number 6 at Project Euler, which is titled Sum Square Difference.  The statement of the problem is this:

The sum of the squares of the first ten natural numbers is,

$1^2 + 2^2 + \dots + 10^2 = 385$

The square of the sum of the first ten natural numbers is,

$(1 + 2 + \dots + 10)^2 = 55^2 = 3025$

Hence the difference between the sum of the squares of the first ten natural numbers and the square of the sum is 3025 − 385 = 2640.

Find the difference between the sum of the squares of the first one hundred natural numbers and the square of the sum.

This led me to do some research.  The past couple of problems I have tackled with more of a brute-force tack and so I decided that this time I would do some research first.  I knew there were some formulas that could help me with this one.  I basically needed a formula for the following:

  1. The sum of the squares for 1..n
  2. The square of the sum of 1..n

With formulas for these two tasks I could then simply diff the results for my answer.  I have sample values from the problem statement to use in my unit tests of the formulas.  Here are the formulas I will use in solving the problem; both were readily available through Google or Wikipedia searches.

[Image: Sum of Integers]
[Image: Sum of Squares Identity]
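The formula images are not reproduced here.  The standard closed forms for these two sums are written out below, on the assumption that they are what the images showed:

```latex
\begin{align}
\sum_{k=1}^{n} k   &= \frac{n(n+1)}{2} \\
\sum_{k=1}^{n} k^2 &= \frac{n(n+1)(2n+1)}{6}
\end{align}
```

Checking against the sample values for $n = 10$: the first gives $10 \cdot 11 / 2 = 55$, so the square of the sum is $55^2 = 3025$, and the second gives $10 \cdot 11 \cdot 21 / 6 = 385$, matching the problem statement.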

The Test Class (and the first test)

With test driven development, rather than writing some code for the problem first, of course I need to write a test.  So what test?  Well I look at the problem and see that I need to find the difference between the sum of the squares of a list of numbers and the square of the sum of those numbers.  Therefore I will need to be able to get those values.  My first test will be for a correct sum of squares value (the second formula above).

I know that I am supposed to dive right in and write a test for a method that does what I need.  This is where you actually do some designing as well.  I wrote the following test as the first code for this problem:

[Image: Test for a sum of squares method]

And here is the output for that test:

[Image: Output from run of first test.]

As you can see the test failed with an ‘ImportError:  No module named Problem6’.  That is because I need to add a module for Problem6, which is a new file named Problem6.py.  In order to maintain the strict tenets of test driven development I need to do only the minimum required to get past this error.  So I will add a new file/module named Problem6.py, and then re-run the test.  Here is the output from the next run:

[Image: Output from 2nd test run]

Now we failed with an AttributeError: ‘module’ object has no attribute ‘Problem6Solution’ and this is because the new module we just added is empty and has no class.  So, I will add a class to the module and name it ‘Problem6Solution’ and a method named sumOfSquares that accepts an input number.  This should be enough to get through this error.  Here is the first look at the solution class:

[Image: Solution Class]

Now we can run the test again and see what we get this time.  It should be an assertion error because we are not yet performing the formula calculation.  Here is the result:

[Image: Result from pass 3 of test 1]

Ah, there is my assertion error.  Now I am finally ready to add some code to execute the formula and hopefully pass the test.  The formula is simple but as I am a bit of a noobie with python I looked up the math.pow method to help me execute the power calls.  Here is the code for the sumOfSquares method complete:

[Image: Test1 – Passing!]
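The method itself was shown only as a screenshot, so what follows is a guess at its general shape rather than the original code.  It assumes a plain loop that accumulates with += and uses math.pow, since both are mentioned in this post; the variable names are made up.

```python
import math


class Problem6Solution(object):
    def sumOfSquares(self, n):
        # Add up i^2 for i = 1..n
        total = 0
        for i in range(1, n + 1):
            total += math.pow(i, 2)
        # math.pow returns a float, so convert back to an integer
        return int(total)
```

With this in place, Problem6Solution().sumOfSquares(10) gives 385, the sample value from the problem statement.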

Repeat for Remaining Tests

The rest of the process is a little smoother in that now we have our classes and files and beginnings completed.  We have a passing test and are under way!  The next test will be for another piece of our solution’s puzzle, the square of sums formula, where we sum 1..n and then square the result.  We have the formula for summing 1..n so it should be fairly simple.  Let us see what the test looks like:

[Image: Test 2 for a square of sums method]

Of course we will run the test without writing any new code to see it fail.  Then we will write just enough code to make it pass.  I am seeing at this point, though, that in writing a test first I am letting the desired output dictate the design of the class.  I am starting with what I need and only writing the minimum code in order to get it, which results in a fairly concise bit of coding.  I am still only utilizing this at a very simple level, but I can see the benefits nonetheless.  What follows is the test result and the code written to make it pass:

[Image: Output from Test 2]
[Image: Square of Sums method]

And now our test 2 passes.

[Image: Test 2 – Passing!]

For the last test, where we test a solve method that takes the difference of our two sums, I will show the resulting classes in their entirety.  You will notice there will now be a third test and a third method in the solution class, called ‘solve’.

[Image: The complete test class]
[Image: Complete solution class]
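As the screenshots are missing, the following is a reconstruction under assumptions, not the original classes.  Problem6Solution, sumOfSquares and solve are named in this post; squareOfSums, the test method names, the cut attribute and the main function are hypothetical.  The expected values 385, 3025 and 2640 are the samples from the problem statement.

```python
import math
import time
import unittest


class Problem6Solution(object):
    def sumOfSquares(self, n):
        # 1^2 + 2^2 + ... + n^2
        total = 0
        for i in range(1, n + 1):
            total += math.pow(i, 2)
        return int(total)

    def squareOfSums(self, n):
        # (1 + 2 + ... + n)^2
        total = 0
        for i in range(1, n + 1):
            total += i
        return int(math.pow(total, 2))

    def solve(self, n):
        # Difference between the square of the sum and the sum of the squares
        return self.squareOfSums(n) - self.sumOfSquares(n)


class TestProblem6Solution(unittest.TestCase):
    def setUp(self):
        self.cut = Problem6Solution()

    def test_sumOfSquares_ReturnsCorrectValue(self):
        self.assertEqual(385, self.cut.sumOfSquares(10))

    def test_squareOfSums_ReturnsCorrectValue(self):
        self.assertEqual(3025, self.cut.squareOfSums(10))

    def test_solve_ReturnsCorrectValue(self):
        self.assertEqual(2640, self.cut.solve(10))


def main():
    # Print the answer for the first one hundred natural numbers and the time taken
    start = time.time()
    answer = Problem6Solution().solve(100)
    elapsed = time.time() - start
    print("Answer: %d (%.6f seconds)" % (answer, elapsed))
```

In this sketch the tests would be run with python -m unittest against the file, and main() would be called separately to print the answer.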

Summary

I have added a main method at the end of the solution class that actually shows the answer and the elapsed time to calculate it.  That, in my mind, makes this a complete solution that does provide the output desired.  And, thanks to Test Driven Development, without anything extra.  Of course I could further refactor the formula methods to shorten them, specifically by using the built-in sum() function rather than the += operator.  However, I have left the code the way it is because I feel it is more readable this way.  And in my book, readability counts for something.
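For comparison, here is a small sketch of what the shorter versions could look like.  The function names are hypothetical; the first pair uses the built-in sum(), and the second pair uses the standard closed-form identities, which avoid looping altogether.

```python
def sum_of_squares(n):
    # Built-in sum() over a generator instead of a += loop
    return sum(i * i for i in range(1, n + 1))


def square_of_sum(n):
    return sum(range(1, n + 1)) ** 2


def sum_of_squares_closed(n):
    # n(n+1)(2n+1)/6, using integer division since the result is always whole
    return n * (n + 1) * (2 * n + 1) // 6


def square_of_sum_closed(n):
    # (n(n+1)/2)^2
    return (n * (n + 1) // 2) ** 2
```

Both pairs agree on the sample values from the problem statement (385 and 3025 for n = 10), so either could sit behind the same tests.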

Lastly here is the output, both the passing tests and the printed answer!

[Image: All Tests Passing!]
[Image: Results!]

Project Euler – again

I am back working on Project Euler again.  It has been a great long while, maybe 4 or 5 years, not sure, since I started on it originally, using Ruby.  Now I am using a mixture, starting with Python for a bit, and possibly switching back to Ruby.  I say that because, after solving this last problem (#4) and then going through some of the other folks' solutions, the most elegant ones are usually the Ruby ones, in my humble opinion.

There are of course smaller, or rather more concise, ones but they are too cryptic.  Solutions in array languages such as J or K are frightfully illegible.  I have to share, just for kicks.  First, the problem was to find the largest numeric palindrome that is the product of two 3-digit numbers, similar to 9009 being the largest that is the product of two 2-digit numbers, 99 and 91.  So just for giggles, here are a couple of super concise (and unreadable, IMHO) solutions:

1. This is in J (http://www.jsoftware.com)

   >([:{: ]#~ (=|.&.>)) <@":"0 /:~(0:-.~[:,>:/**/)~(i.100)-.~i.1000

and

2. This one is in K (http://www.kx.com/developers/documentationkdb.php)

   |/m@&{x~|x}'10_vs'm:*/',/n,/::n:100+!900

Please realize that I am in awe of the guys that posted these solutions and maybe someday I will be able to learn a language such as one of these, but for now…

I am going to stick to C# and JavaScript, and Ruby and Python.  They are sufficient to accomplish the tasks I perform on a daily basis at work.  I am just learning Python again and also planning to brush up on Ruby.  I spent a year or so messing with Ruby and Ruby on Rails, but only at home in my spare time so naturally I didn’t get super far.

So, to the issue at hand.  How to find the largest numeric palindrome that is a product of two 3-digit numbers.  And, for me, more importantly how do I do this in Python using Test-Driven-Development, only using VIM as my development environment.  Pow!  Throw that one in there at the last minute.  Ha, Ha!  Vim super rocks and I am learning more and more about how to mack it out and make it easier to use.  Death to my mouse!

Vim setup

I decided to use VIM because I have been really wanting an excuse to get better with it.  I use it at work sometimes but when I am in a hurry I end up going back to Notepad++ or SublimeText (which is super nice in its own right, the closest thing you can find to TextMate for Windows).  Anyway, I figured this would be a good chance to spend some time with VIM.  I am using gVIM, by the way, on Windows 7, and I have Python 2.7 installed and in my PATH variable.

I did some research on VIM and found out that I needed to add a _vimrc file in my home directory with some settings in it.  Over the next day or two I slowly acquired a pretty good selection of customizations for VIM that make it much nicer to use: things like syntax highlighting and default font and color scheme presets.  One of the coolest things I figured out from this wonderful article, Turning VIM into a modern Python IDE, which gave me numerous tips, was to use the vertical screen split to show two buffers (files being edited) at once.  This allowed me to work on my class on one side and my unit tests on the other.  Like so:

[Image: vim_split_screen]

Once I had this set up I was rolling pretty well.  It really helped with the TDD workflow of:

  1. Write a test
  2. See the test fail
  3. Write code to pass the test

This was also aided by the VIM settings for executing my scripts using the python command and seeing the output in a new command window.  The following additions to my _vimrc file allow me to simply hit ‘F5’ to execute the code in the current buffer.  And, conveniently, the *.py filter ensures that it is only set up to do so for Python files.  These autocmd lines take advantage of the fact that you can execute any shell command from the VIM command line by prefixing it with the ‘!’ character.  The last setting maps the ‘F5’ key to the command “python ” followed by the current filename (represented by the ‘%’ placeholder).  Super convenient.

[Image: vim_settings]
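The screenshot of the settings is not available, so here is a minimal sketch of _vimrc lines that would behave as described.  These are not the original settings: the exact autocmd events, the buffer-local mapping and the backspace option are assumptions.

```vim
" Syntax highlighting, as mentioned above
syntax on

" Assumed fix for the backspace behavior: allow backspacing over
" indentation, line breaks and the start of an insert
set backspace=indent,eol,start

" Only for *.py files: F5 runs the current file (%) through the shell (!)
autocmd BufRead,BufNewFile *.py nnoremap <buffer> <F5> :!python %<CR>
```

The vertical split mentioned earlier can be opened from the VIM command line with :vsplit followed by a filename.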

The settings file allowed me to correct the weird backspace key behavior that VIM was displaying.  I couldn’t be more pleased, VIM is seemingly endlessly configurable.

The Problem

The goal was to find the correct answer to the 4th problem on Project Euler which is to find the largest numeric palindrome that is a product of two 3-digit numbers.  After a bit of thought I decided to start at 999 and work down with any loop I would use, that way I would get to the solution faster than if I started at 100 and went up.  But the very first thing I needed to do was to write a test.  So I went for an isPalindrome method test, as I knew I would need a test for my products as I iterated through the numbers.  

It turns out that there are numerous ways to solve the problem, of course, including using pen and paper and some algebra.  I must admit that it never occurred to me to use algebra to solve the problem.  Brute force was immediately where my thoughts went and where they stayed.  I guess years of coding have taught me that using the simplest and most direct approach is usually the best.  There were many interesting solutions posted in the forum for the problem though, and I encourage everyone to check out the many different bits of code there, but only after you solve the problem yourself.

TDD

Before I could write a test I had to figure out exactly how unit testing worked with Python, so I did some research and came up with what I would guess is the most direct solution.  Python includes a unit test module called unittest, and so it was there I began.  No configuration was necessary, only the use of the very sweet vertical split screen in VIM, so I could look at my tests and my code as I wrote them both.

My first code that I wrote was a class that derived from the base unittest.TestCase class and I named it TestProblem4Solution.  I then did a little setup, a valid palindrome value and an invalid palindrome value that I figured I would need in my tests of an isPalindrome method.  Then straight to my first test, which was:

   def test_isPalindrome_ReturnsFalse(self):
       # A value that is not a palindrome should be rejected
       result = self.cut.isNumericPalindrome(self.invalid_palindrome)
       self.assertFalse(result)

Naturally this test failed when I ran it, so I had some code to write in my Problem4Solution class, which I needed to create as well.  Here is a shot of what I ended up with for making the first test pass:

[Image: problem4_firstmethod]

This was enough to make my second test pass as well, test_isPalindrome_ReturnsTrue which is just a positive result test to go with the first one which tested the negative result.
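The screenshot of the method is missing, so this is only a sketch of one way to make both tests pass.  Problem4Solution, TestProblem4Solution, isNumericPalindrome, cut, invalid_palindrome and the two test names come from this post; the string-reversal approach, the valid_palindrome attribute name and the two sample values are assumptions.

```python
import unittest


class Problem4Solution(object):
    def isNumericPalindrome(self, number):
        # A number is a palindrome if its digits read the same in reverse
        digits = str(number)
        return digits == digits[::-1]


class TestProblem4Solution(unittest.TestCase):
    def setUp(self):
        self.cut = Problem4Solution()
        self.valid_palindrome = 9009     # hypothetical sample value
        self.invalid_palindrome = 9010   # hypothetical sample value

    def test_isPalindrome_ReturnsFalse(self):
        result = self.cut.isNumericPalindrome(self.invalid_palindrome)
        self.assertFalse(result)

    def test_isPalindrome_ReturnsTrue(self):
        result = self.cut.isNumericPalindrome(self.valid_palindrome)
        self.assertTrue(result)
```

The slice digits[::-1] walks the string backwards, so no explicit loop is needed for the comparison.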

Moving on to some product work, I knew I would want to test products that were produced from a loop, so I wrote a test called test_highestPalindromeProduct_ReturnsCorrect that would test a method that would take a starting number and multiply it by a set of numbers beginning with itself and stepping down by 1 until a palindrome product was found.  Of course there was no such method so the test failed and I was back to my solution class to fix the problem.  The resulting method, after a few tweaks and quick trips to the Python online docs, came out looking like this:

[Image: problem4_secondmethod]

I then added a test for a solve method that I figured would use the two helper methods and find my solution.  But, before I finished that I realized that I might want to solve this problem for multiple length numbers.  The sample given in the problem was for two 2-digit numbers and the challenge was for two 3-digit numbers so right off I needed to handle multiple length start numbers (I was using the two 2-digit numbers as test values because I already had the correct answer).  Therefore I stepped back and wrote a test for a buildStartNumber method that would take a length up to 5 and return the highest number of that length.  I wanted to get back 999 for an input of 3.  Here is that test:

   def test_buildStartNumber_ReturnsCorrectNumber(self):
       # A length of 3 should give the largest 3-digit number
       result = self.cut.buildStartNumber(3)
       self.assertEqual(self.expectedStartNumber, result)

Again, fairly straightforward: I am just testing a result produced by calling the method.  The expected value is hanging off of the class instance (self.expectedStartNumber) and is 999, which is what I wanted back.  The assertEqual test statement just checks that value against the actual result.

The method to make this test pass was the simplest one in the solution, but only because I did some Googling first to learn some more about what python is capable of…

   def buildStartNumber(self, length):
       # Join `length` copies of "9" and convert to an int, e.g. 3 -> 999
       start = ''.join("9" for x in range(length))
       return int(start)

You can see that I am using the join function to concatenate one “9” string for each value in a range of the provided length.  The range function is very handy.
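For what it is worth, the same number can be built in a couple of other ways.  The functions below are hypothetical alternatives, not part of the original solution: one repeats the string with the * operator, the other skips strings entirely and uses arithmetic.

```python
def build_start_number_repeat(length):
    # "9" * 3 gives "999", which is then converted to an int
    return int("9" * length)


def build_start_number_arith(length):
    # The largest number with `length` digits is one less than 10^length
    return 10 ** length - 1
```

Both return 999 for a length of 3, the value the test above expects.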

That brought me to the solve method which would get me my answer.  I first wrote a test that used the known example solution for two 2-digit numbers which is 99 * 91 = 9009.  These values ensured that my solve method produced correct results.  Sort of!  I did get an incorrect answer the first time because I had assumed that if I counted down instead of up the first solution I came to would be the correct one but it wasn’t.  I had to store the results and continue on down until I found them all and then keep only the highest one.  Counting down from 999 and looking for palindrome products you come first to 995 * 583 which is 580085, a palindrome but not the highest palindrome.  You have to keep going to get to 993 * 913 which is 906609 and the correct answer.  Now, if I had limited my attempts with a bottom end of say 900 then I would have come to the correct answer first, but I didn’t.  I let it go all the way down to 100 until it found a result.  Oh well, live and learn.

Back to the TDD, the next test looked like this:

   def test_solve_ReturnsCorrectValue(self):
       # Two 2-digit numbers: the known example from the problem statement
       result = self.cut.solve(2)
       self.assertEqual(self.solution, result)

Another simple test, where the expected value of self.solution (defined in the test setup method) is a list of [99,91,9009], which is the correct answer for two 2-digit numbers, including the product itself.  I knew I would want my solve method to provide me all three values for a complete answer, even though technically all I needed was the palindrome itself to submit to the answer page at Project Euler.

And the code that I added to make the test pass:

[Image: problem4_solvemethod]
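The original solve method was shown only as a screenshot.  The sketch below is one possible brute-force version under assumptions: it keeps the best result seen so far rather than stopping at the first palindrome, it inlines the product loop instead of calling the highestPalindromeProduct helper, and the lower bound and early exits are my own additions.

```python
class Problem4Solution(object):
    def isNumericPalindrome(self, number):
        digits = str(number)
        return digits == digits[::-1]

    def buildStartNumber(self, length):
        return int("9" * length)

    def solve(self, length):
        start = self.buildStartNumber(length)
        lowest = 10 ** (length - 1)  # smallest number with `length` digits
        best = [0, 0, 0]             # [first factor, second factor, product]
        for a in range(start, lowest - 1, -1):
            # a * a is the largest product left, so stop if it cannot win
            if a * a <= best[2]:
                break
            for b in range(a, lowest - 1, -1):
                product = a * b
                if product <= best[2]:
                    break  # smaller values of b only give smaller products
                if self.isNumericPalindrome(product):
                    best = [a, b, product]
        return best
```

Calling solve(2) returns [99, 91, 9009] and solve(3) returns [993, 913, 906609], matching the values discussed in this post.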

This completed the problem.  I submitted the solution, which thanks to my handy ‘F5’ key mapping was just a click away after adding a small main() method to the class that would print out the result of the solve method to the screen.  The final output looked like this:

[Image: problem4_finaloutput]

And the answer you can see printed out as [993, 913, 906609], nice and neat.  The final code is actually shown at the beginning of the post in the screen shot of VIM with the vertical split.  You can see both the Problem4Solution class on the left and the TestProblem4Solution class on the right.  If you have any questions just post a comment!