Friday, 30 October 2009

Keeping a change out of our master repository with Git

I love Git. I can't believe I ever loved another version control system. Recently I've started using Git branches to keep one particular lot of changes out of my master repository. I wanted to upgrade my current VS solution and projects to VS 2010 Beta 2, use VS 2010 to add features to the code base, then only check in the non-VS 2010 related changes. That way my fellow developers don't have to worry about the slightly different .csproj files and we can all work together happily. :)

Now the changes made by the VS2010 upgrade wizard are very minor -- basically just updating the ToolsVersion attribute to 4.0 and providing a fallback OldToolsVersion element set to 3.5.

So here's what I did. The first step was to create a branch off my master using git checkout -b vs_upgrade. I then ran the VS 2010 upgrade wizard which changed the relevant files, then commited those to my vs_upgrade branch. Second step was to create a branch off my vs_upgrade branch to add my feature (not strictly necessary, but keeps things nicely separated). I can now happily make changes to the code and commit the new feature to my feature branch. The structure now looks something like this:

Now I need to get my feature back into my master branch, minus the VS 2010 upgrade. We can just cherry pick the change we want (I use gitk --all rather than the command line) and apply it to the master branch. Now we can push master to our remote server and everyone has the change, minus the VS 2010 bits. Very cool!


We can rebase our vs_upgrade branch whenever some new changes are applied by other developers to master, and we get back to our first picture. We always have just a single commit containing our VS-related changes (so we don't have to repeat the upgrade), and we can always just cherry pick the changes we want made on top of the upgrade and apply them to our master. I get to play with VS 2010 and the nightly builds of ReSharper 5.0, without having any effect whatsoever on my team mates using VS 2008. The ease with which Git handles this branching and any merging is just awesome! :)

Thursday, 22 October 2009

Calculators and a tale of two TDDs - Pt 3: Finishing up our BDD kata

This is the wrap up of my second attempt at Roy Osherove's String Calculator kata. You can read the details at the original site, but the overall gist is a calculator that has an Add() method. This method takes a string input, adds the numbers in the string together, then returns the result.

In Part 1 of this series we looked at using a traditional TDD approach to the calculator kata. In Part 2 we attempted to use a variation of Behavioural Driven Development (BDD) . In this part we'll finish off our BDD example. So far our implementation handles empty strings, single numbers, and numbers delimited by a comma and/or a newline.

To accomplish this, we have a Calculator class that uses an INumberParser to parse numbers from the string, and then an IAdder to calculate the sum of these numbers.

The final part of the exercise (well, as far as we're going for this series) is to be able to parse expressions that specify a custom delimiter. So the input to our Add() method might be in the form //[delimiter]\n[numbers]. We also need to continue handling the case were there is no delimiter specified.

This post is pretty code heavy, so if you're viewing it in an RSS reader you might find it more readable on my site where the code has syntax highlighting.

Where should customer delimiter logic go?

This seems like a different concern to me -- we are no longer just parsing out numbers, we are parsing out a delimiter message, then parsing out numbers from the remaining string based on that. Where should we add this concern? Well our Calculator itself is looking pretty tight already:

public class Calculator {
    private readonly INumberParser _numberParser;
    private readonly IAdder _adder;

    public Calculator(INumberParser numberParser, IAdder adder) {
        _numberParser = numberParser;
        _adder = adder;
    }

    public int Add(string expression) {
        var numbers = _numberParser.GetNumbers(expression);
        return _adder.Add(numbers);
    }
}

This seems a really clear implementation of what we are trying to achieve: we are getting numbers from an expression and adding them. The current specification for this class also looks pretty neat. I don't really fancy mucking it up by shoehorning in this new concern at this level of abstraction.

Maybe we can push this down into the INumberParser? The intention behind the custom delimiter feature is to provide another way to parse numbers, so this sounds pretty reasonable. Let's try creating a CalculatorExpressionParser that implements INumberParser, which can parse out the delimiter and delegate to our original NumberParser. I'm unsure of exactly what this will look like at this stage (will we have a series of handlers selected using the Specification Pattern? Or decorate our current number parser?), but we'll let our tests/specification drive out the details.

Specifying our CalculatorExpressionParser

Let's start our CalculatorExpressionParserSpec and see where that leads us.

public class When_parsing_expression : ConcernFor<CalculatorExpressionParser> {
    
    [Test]
    public void Should_parse_numbers_using_specified_delimiter() {
        Assert.That(result, Is.EquivalentTo(numbers));
    }

    protected override void Because() {
        result = sut.GetNumbers(expression);
    }
}

We still want the same basic INumberParser behaviour, so the spec isn't telling us much here. The context is where it is going to get a little scary. We want to parse out the delimiter from the expression, then remove the delimiter information from the expression and feed that into our standard NumberParser. Let's start with this:

protected override void Context() {
    char[] delimiter = new[]{';'};
    const string numbersExpression = "1;2;3;4;5";
    expression = "//;\n1;2;3;4;5";
    
    delimiterParser = MockRepository.GenerateStub<IDelimiterParser>();
    delimiterParser
        .Stub(x => x.Parse(expression))
        .Return(new DelimitedNumbersExpression(numbersExpression, delimiter));
  
    /*.... need to use this output somehow... */
}

We've now pushed down the responsibility for parsing out the delimiter and updating the expression to an IDelimiterParser. Because C# functions only support single return values, we'll need a new class to hold these two pieces of information which I'll call DelimitedNumbersExpression. This will just be a DTO, so we may as well code this up now:

public class DelimitedNumbersExpression {
    public char Delimiter { get; private set; }
    public string DelimitedNumbers { get; private set; }

    public DelimitedNumbersExpression(char delimiter, string delimitedNumbers) {
        Delimiter = delimiter;
        DelimitedNumbers = delimitedNumbers;
    }
}

Now we need to configure a NumberParser with this required delimiter, and give it our numbersExpression to parse. This is starting to sound like another responsibility, isn't it? Let's use a factory to create a configured NumberParser for us. We'll update the context like this:

protected override void Context() {
    char[] delimiter = new[]{';'};
    const string numbersExpression = "1;2;3;4;5";
    expression = "//;\n1;2;3;4;5";
    numbers = new [] {1, 2, 3, 4, 5};
    
    delimiterParser = MockRepository.GenerateStub<IDelimiterParser>();
    delimiterParser
        .Stub(x => x.Parse(expression))
        .Return(new DelimitedNumbersExpression(numbersExpression, delimiter));
    numberParserFactory = MockRepository.GenerateStub<INumberParserFactory>();
    numberParser = MockRepository.GenerateStub<INumberParser>();
    numberParserFactory.Stub(x => x.CreateWithDelimiters(delimiter)).Return(numberParser);
    numberParser.Stub(x => x.GetNumbers(numbersExpression)).Return(numbers);
}

protected override CalculatorExpressionParser CreateSubjectUnderTest() {
    return new CalculatorExpressionParser(delimiterParser, numberParserFactory);
}

This is a bit scary, particularly if you haven't used a mocking framework before. :) We've given ourselves another dependency, an INumberParserFactory which has a CreateWithDelimiter() method. We'll then return a stubbed INumberParser, which in turn will process our numbersExpression and return our required result.

Notice how we are testing a lot of things implicitly here. We don't have a test called Should_create_number_parser_from_factory() or similar, although this is in fact tested indirectly via the context (i.e. if it were not setup in the context, our actual test would not pass!). This stops our tests from being too brittle -- we can change how the SUT does its job by modifying the context, while leaving most of the specification (the scenario name, test name, assertion, and the interface used in the Because() method) unchanged.

On the positive side, this is trivial for us to implement:

public class CalculatorExpressionParser : INumberParser {
    private readonly IDelimiterParser _delimiterParser;
    private readonly INumberParserFactory _numberParserFactory;

    public CalculatorExpressionParser(IDelimiterParser delimiterParser, INumberParserFactory numberParserFactory) {
        _delimiterParser = delimiterParser;
        _numberParserFactory = numberParserFactory;
    }

    public IEnumerable<int> GetNumbers(string expression) {
        var delimitedNumbersExpression = _delimiterParser.Parse(expression);
        var numberParser = _numberParserFactory.CreateWithDelimiters(delimitedNumbersExpression.Delimiters);
        return numberParser.GetNumbers(delimitedNumbersExpression.DelimitedNumbers);
    }
}

I'm not 100% happy with this: that context is pretty mean and I think I have the names wrong, but the code is still flowing freely and seems fairly clean so let's press on.

Confession: I actually intially coded this with INumberParserFactory only accepting a single character delimiter, then refactored to support multiple delimiters for the default case of ',' and '\n' (this was a concious decision). I've omitted this step in consideration of your bandwidths costs and in the interest of finishing this post this year. :)

A delimiter parser

Our previous spec required the use of an IDelimiterParser. As I see it the DelimiterParser has to cope with two separate scenarios: when the expression contains a delimiter specifier, and when it doesn't. Let's start with the case where it actually needs to do some work.

public class When_parsing_expression_that_starts_with_a_delimiter_specifier : ConcernFor<DelimiterParser> {
    private DelimitedNumbersExpression result;
    private string expression;
    private string delimitedNumbers;
    private char delimiter;

    [Test]
    public void Should_return_specified_delimiter() {
        Assert.That(result.Delimiters, Is.EquivalentTo(new[] {delimiter}));
    }

    [Test]
    public void Should_return_delimited_numbers_without_delimiter_specifier() {
        Assert.That(result.DelimitedNumbers, Is.EqualTo(delimitedNumbers));
    }

    protected override void Because() {
        result = sut.Parse(expression);
    }

    protected override void Context() {
        delimiter = ';';
        delimitedNumbers = "1;2;3";
        expression = "//" + delimiter + "\n" + delimitedNumbers;
    }

    protected override DelimiterParser CreateSubjectUnderTest()
    {
        return new DelimiterParser();
    }
}

//DelimiterParser.cs
 public DelimitedNumbersExpression Parse(string expression) {
     var delimiter = expression[2];
    var delimitedNumbers = expression.Substring(4);
    return new DelimitedNumbersExpression(delimiter, delimitedNumbers);
 }

I actually did this in two steps, one to pass each test. Our second scenario and passing implementation looks like this:

public class When_parsing_expression_without_delimiter_specifier :ConcernFor<DelimiterParser> {
    private DelimitedNumbersExpression result;
    private char[] defaultDelimiters;
    private string expression;

    [Test]
    public void Should_return_default_delimiters() {
        Assert.That(result.Delimiters, Is.EqualTo(defaultDelimiters));
    }

    [Test]
    public void Should_return_numbers_expression_unchanged() {
     Assert.That(result.DelimitedNumbers, Is.EqualTo(expression));   
    }

    protected override void Because() {
        result = sut.Parse(expression);
    }

    protected override void Context() {
        defaultDelimiters = new[]{',', '\n'};
        expression = "1,2,3,4";
    }

    protected override DelimiterParser CreateSubjectUnderTest() {
        return new DelimiterParser();
    }
}

//DelimiterParser.cs
public class DelimiterParser : IDelimiterParser {
    private readonly char[] defaultDelimiters = new[] {',', '\n'};
    private const string delimiterSpecifier = "//";

    public DelimitedNumbersExpression Parse(string expression) {
        if (!expression.StartsWith(delimiterSpecifier)) { return new DelimitedNumbersExpression(expression, defaultDelimiters);}
        var delimiter = expression[2];
        var delimitedNumbers = expression.Substring(4);
        return new DelimitedNumbersExpression(delimitedNumbers, delimiter);
    }
}

Our tests are now all green, and it looks like we have a good case for some refactoring. Let's use Extract Method to make our DelimiterParser a bit more readable:

public class DelimiterParser : IDelimiterParser {
    private readonly char[] defaultDelimiters = new[] {',', '\n'};
    private const string delimiterSpecifier = "//";

    public DelimitedNumbersExpression Parse(string expression) {
        if (!StartsWithDelimiterSpecifier(expression)) { return ExpressionWithDefaultDelimiters(expression);}
        return ExpressionWithDelimitedNumbersAndCustomDelimiter(expression);
    }

    private DelimitedNumbersExpression ExpressionWithDelimitedNumbersAndCustomDelimiter(string expression) {
        const int positionOfDelimiter = 2;
        const int positionOfFirstNewLine = positionOfDelimiter + 1;
        const int startOfDelimiterNumbers = positionOfFirstNewLine + 1;

        var delimiter = expression[positionOfDelimiter];
        var delimitedNumbers = expression.Substring(startOfDelimiterNumbers);
        return new DelimitedNumbersExpression(delimitedNumbers, delimiter);
    }

    private DelimitedNumbersExpression ExpressionWithDefaultDelimiters(string expression) {
        return new DelimitedNumbersExpression(expression, defaultDelimiters);
    }

    private bool StartsWithDelimiterSpecifier(string expression) {
        return expression.StartsWith(delimiterSpecifier);
    }
}

This is a longer class now, but I feel the Parse() method is much more understandable now. The ugliest bits are kept private.

A trivial factory implementation

Our CalculatorExpresionParser also needed an INumberParserFactory. As this will just be creating a NumberParser via a constructor, I don't see much benefit in unit testing it. You could write tests for it if you like, but I'm happy to assume that .NET constructors work ok. :)

public class NumberParserFactory : INumberParserFactory {
    public INumberParser CreateWithDelimiters(char[] delimiters) {
        return new NumberParser(delimiters);
    }
}

This will require us to refactor our NumberParser to take constructor arguments. Let's update our spec for it:

//NumberParserSepcs.cs, in abstract Concern class:
protected override void Context() {
    delimiters = new[] {',', '\n'};
}

protected override NumberParser CreateSubjectUnderTest() {
    return new NumberParser(delimiters);
}

This also lets us simplify our NumberParser specifications too, as we no longer need to test specific delimiters (',' and '\n'), but just test that it splits on whatever delimiters it has been created with. Let's make this pass:

public class NumberParser : INumberParser {
    public readonly char[] Delimiters;

    public NumberParser(char[] delimiters) { Delimiters = delimiters; }

    public IEnumerable<int> GetNumbers(string expression) {
        if (expression == "") return new int[0];
        return expression.Split(Delimiters).Select(x => int.Parse(x));
    }
}

Please, not another spec! Have mercy!

Guess what? We're done. We don't have any unimplemented interfaces defined in specs. We should probably check this works now though. I'll add a default constructor to our Calculator class to wire up our dependencies (although for a real app we'd use an DI container).

//In Calculator.cs
public Calculator() {
    IDelimiterParser delimiterParser = new DelimiterParser();
    INumberParserFactory numberParserFactory = new NumberParserFactory();
    _numberParser = new CalculatorExpressionParser(delimiterParser, numberParserFactory);
    _adder = new Adder();
}

Just for good measure, I've added a file called IntegrationTests.cs and put in all the tests from our first attempt. And it all passed first go. Isn't that nice? :)

Post-mortem

So how'd this go? We ended up with a much more abstract design than our previous attempt. Because of this abstraction I felt code was easier to change, especially when it came to adding the custom delimiter behaviour. In that case we just picked the level of abstraction that seemed to work, then let the tests guide us. I also feel this level of abstractions helps prevent brittle tests. Because the majority of the test code focuses on the SUT's single responsibility, we only need to adjust the context to add or remove collaborators and change the behaviour.

I found that the BDD-style approach tends to force me to think more about design issues up front, and the tests give me guidance as to how to think through these issues. It makes it very easy to adhere to SOLID principles, and makes it clear when I am violating them.

I also found that when I got closer to the lowest levels of abstraction and I began hitting reality and the .NET framework code, I ended up reverting to a more TDD-style of testing. For example, the NumberParserSpecs test specific permutations of data, and it ends up just checking the use of the .NET String.Split() method and int.Parse(). I've had lots of trouble caused by focussing of testing data like this at the wrong levels of abstraction, so using the BDD-style was great for avoiding this until necessary. The majority of the time I could focus on telling dependencies to do something, not asking them for data and then acting on it.

On the negative side, we have what could reasonably be described as an over-engineered design . With 7 classes and 4 interfaces, perhaps "over-engineered" is an understatement. ;) That said, each class has a trivial implementation. With the possible exception of the DelimiterParser, you can pretty much just inspect each class for correctness. I didn't find that coding additional classes slowed me down at all. Because I got so much guidance from my tests the implementation was fairly quick. If it's just as fast and cheap to over-design, but you get the ability to more easily change your code, is that still over-design?

Of course, these posts have been more about looking at the how each approach influences the design direction, not if that level of design is a good idea. I'm quite sure you could easily end up with an identical design using either approach, but it is how each TDD style forces design decisions at different points that really interests me. I think the most illuminating part of this exercise from my perspective is that I need to work harder on my basic TDD skills -- at applying SOLID and being more aggressive in the refactoring step of the red-green-refactor cycle. Once I learn how to refactor both production and test code better, then I'll hopefully be able to move back and forward between styles as appropriate.

Hope you enjoyed this series. Would love to get your thoughts, so feel free to leave a comment, email me or tweet me @davetchepak.

Wednesday, 21 October 2009

Calculators and a tale of two TDDs - Pt 2: BDD-style

This is my second attempt at Roy Osherove's String Calculator kata. You can read the details at the original site, but the overall gist is a calculator that has an Add() method. This method takes a string input, adds the numbers in the string together, then returns the result.

My first attempt at the problem was using a traditional TDD approach. For this attempt I'll be using a variation of Behavioural Driven Development (BDD) as described to me by JP Boodhoo at NBDN. We'll pick up from the point where I have created a new solution containing a single C# project called CalculatorKata.

As this is likely to be quite different to what you've seen before (at least, it was for me), I'll take a bit of time to describe the details of how it works, so this instalment will be longer than the last.

Test infrastructure

To support writing tests in the style I want I'll create a little bit of test infrastructure:

[TestFixture]
public abstract class ConcernFor<T> {
    protected T sut;

    [SetUp]
    public void SetUp() {
        Context();
        sut = CreateSubjectUnderTest();
        Because();
    }

    protected virtual void Context() {}
    protected abstract T CreateSubjectUnderTest();
    protected virtual void Because() {}
}

This will be a base class for our tests, or specifications if you prefer. Before each test runs our unit testing framework will setup a context, create the subject under test (SUT), then execute the Because() method. This means we'll create our dependencies in Context(), create a subject under test that uses those depedencies, then call the sut in the Because() method. Our test itself will then assert that the right thing happened.

This is a fairly clumsy and simplistic (although reasonably understandable) version of the approach used in JP's developwithpassion.bdd library.

Designing the Add() method

Let's start top-down. How's our Calculator.Add() method going to work? We'll start writing a specification in CalculatorSpecs.cs:

public class CalculatorSpecs {
    public class When_given_a_string_to_add : ConcernFor<Calculator> {
        [Test]
        public void Should_return_the_result_of_adding_together_all_numbers_in_the_string() {
            Assert.That(result, Is.EqualTo(sum));
        }
    }
}

At an abstract level this states what I want to happen. When the calculator is given a string to add it should return the sum of all the numbers in that string. The implementation of that assertion is that the result we get is equal to the correct sum. We'll setup the specific instances of these variables later.

Why does this scenario occur? Because the Add() method was called with a given expression. Let's add that to our specification.

protected override void Because() {
 result = sut.Add(expression);
}

Building up a context for this to work

How are we going to get from a string expression to our sum integer? We've written the assertion and the action triggering off this scenario (the Because()), so now we've got to design a context that allows this to happen. Our test title gives us a bit of a hint as to what we need to do: we need to add together all the numbers in the string. Which implies we need to get the numbers out of the string, then sum them. Two responsibilities. So let's push each responsibility out into another object so we don't have to worry about it yet.

First, let's get something that will get some numbers out of our expression:

INumberParser numberParser;
/* ... snip ... */
protected override void Context() {
    numberParser = MockRepository.GenerateStub<INumberParser>();
    numberParser.Stub(x => x.GetNumbers(expression)).Return(numbers);
}

This has required our design to adopt an INumberParser with a GetNumbers() method on it. It is going to return some numbers from an expression. What expression? What numbers? Let's make some up:

INumberParser numberParser;
/* ... snip ... */
protected override void Context() {
    numberParser = MockRepository.GenerateStub<INumberParser>();
    expression = "(some numbers)";
    var numbers = new[] {1, 2, 3, 4};

    numberParser.Stub(x => x.GetNumbers(expression)).Return(numbers);
}
You may prefer to have our expression set to something like "1,2,3,4" so it matches the content of our numbers variable. Comes down to personal preference, but I prefer not to be prejudging exactly how our GetNumbers method is going to work. By putting in garbage then anyone reading this test will know that they have to look at the INumberParser implementation to see exactly how the numbers are getting parsed. Otherwise the parsing requirements could change and then this test would be giving misleading information.

Then we'll need something to add these numbers, right?

INumberParser numberParser;
IAdder adder;
/* ... snip ... */
protected override void Context() {
    numberParser = MockRepository.GenerateStub<INumberParser>();
    adder = MockRepository.GenerateStub<IAdder>();
    expression = "(some numbers)";
    var numbers = new[] {1, 2, 3, 4};

    numberParser.Stub(x => x.GetNumbers(expression)).Return(numbers);
    adder.Stub(x => x.Add(numbers)).Return(sum);
}

This has given us an IAdder interface with an Add() method on it that can sum numbers, as well as a variable name that sounds like a type of snake. When our adder is told to add the numbers we want to return a sum. It doesn't really matter what sum, we're just going to assume that it does its job properly. We haven't initialised our sum variable yet, so let's put in any old thing.

protected override void Context() {
 numberParser = MockRepository.GenerateStub<INumberParser>();
 adder = MockRepository.GenerateStub<IAdder>();
    expression = "(some numbers)";
 var numbers = new[] {1, 2, 3, 4};
 sum = 42;

 numberParser.Stub(x => x.GetNumbers(expression)).Return(numbers);
 adder.Stub(x => x.Add(numbers)).Return(sum);
}

All that's left now is to create our subject under test.

protected override Calculator CreateSubjectUnderTest() {
 return new Calculator(numberParser, adder);
}

Huh?

So what are we doing here? First, we've stated what our class does: when given a string it should add together all the numbers in that string. If we drill down into our context we can see how our class will accomplish this monumental feat. It will use a INumberParser to get numbers from our expression, and then use a IAdder to add those numbers.

So what is Calculator's single responsibility? It is just coordinating its two dependencies. This level of abstraction has just flowed fairly naturally from this way of thinking about and decomposing the problem while writing our test. If you're anything like me this will all look a little foreign, but I've found this style of TDD very addictive.

Those of you that have done mocking before may notice my favourite little side-effect of writing contexts like this: we have references to all the little pieces of data flowing through our class under test. We won't need to resort to any fancy Arg<int>.Matches(...) stuff, we just tell or mock/stub/substitute/test double exactly what to expect and what to return.

In case you are worried that this seems like a lot of work, it took me about 2 minutes to get this out.

Passing our specification

To pass this we switch to Calculator.cs. Our implementation will mirror the context we set up while writing the test:

public class Calculator {
    private readonly INumberParser _numberParser;
    private readonly IAdder _adder;

    public Calculator(INumberParser numberParser, IAdder adder) {
        _numberParser = numberParser;
        _adder = adder;
    }

    public int Add(string expression) {
        var numbers = _numberParser.GetNumbers(expression);
        return _adder.Add(numbers);
    }
}

This passes, and, as I can't see any refactoring to do, I think we're done. Because our class only has a single responsibility (coordinating a couple of dependencies), there aren't a whole lot of scenarios to set up, nor a whole lot of tests to write.

Driving out our IAdder implementation

We now have to continuing driving out the bits and pieces that the calculator needs to work. We have an INumberParser and IAdder that need implementations. IAdder sounds pretty straight forward, so let's knock that out of the way.

//AdderSpecs.cs
public class AdderSpecs {
    public class When_adding_numbers : ConcernFor<Adder> {
        private int result;
        private int sum;
        private IEnumerable<int> numbers;

        [Test]
        public void Should_return_the_sum_of_the_numbers() {
            Assert.That(result, Is.EqualTo(sum));
        }

        protected override void Because() {
            result = sut.Add(numbers);
        }

        protected override void Context() {
            numbers = new[] {1, 2, 3, 4};
            sum = 10;
        }

        protected override Adder CreateSubjectUnderTest() {
            return new Adder();    
        }
    }
}

//Adder.cs
public class Adder : IAdder {
    public int Add(IEnumerable<int> numbers) {
        return numbers.Sum();
    }
}

Creating a number parser

Finally, we need to tackle the INumberParser implementation. This one was driven out a requirement at a time, similar to the approach taken for my first attempt at the problem. First I added handling for the empty string case, then a string containing a single number, then multiple numbers separated by commas etc.

public class NumberParserSpecs {
    public abstract class Concern : ConcernFor<NumberParser> {
        protected string numberString;
        protected IEnumerable<int> result;

        protected override void Because() {
            result = sut.GetNumbers(numberString);
        }

        protected override NumberParser CreateSubjectUnderTest() {
            return new NumberParser();
        }
    }

    public class When_parsing_an_empty_string : Concern {

        [Test]
        public void Should_return_no_numbers() {
            Assert.That(result, Is.Empty);
        }

        protected override void Context() {
            numberString = "";
        }
    }

    public class When_parsing_a_string_containing_a_single_number : Concern {
        [Test]
        public void Should_return_the_number() {
            Assert.That(result, Is.EquivalentTo(new[] {1}));
        }

        protected override void Context() {
            numberString = "1";
        }
    }

    public class When_parsing_a_string_with_multiple_numbers_separated_by_commas : Concern {
        [Test]
        public void Should_return_all_the_numbers() {
            Assert.That(result, Is.EquivalentTo(new[] {1,2,3,4,5}));
        }

        protected override void Context() {
            numberString = "1,2,3,4,5";
        }
    }

    public class When_parsing_a_string_with_multiple_numbers_separated_by_newlines : Concern {
        [Test]
        public void Should_return_all_the_numbers() {
            Assert.That(result, Is.EquivalentTo(new[] { 1, 2, 3, 4, 5 }));
        }

        protected override void Context() {
            numberString = "1\n2\n3\n4\n5";
        }
    }
}

The main difference between this and the previous approach is that we are only testing the number parsing here, in isolation of the addition. This should hopefully give us more flexibility in changing the parsing later.

public class NumberParser : INumberParser {
    private static readonly char[] Delimiters = new[] {',', '\n'};

    public IEnumerable<int> GetNumbers(string expression) {
        if (expression == "") return new int[0];
        return expression.Split(Delimiters).Select(x => int.Parse(x));
    }
}

We still have two responsibilities here: splitting the expression and converting each part into an integer. That's a potential refactoring for us, but I'm happy enough with this for now.

What's left?

The remaining feature we have to implement is allowing a custom delimiter to be specified by beginning the expression with "//". This caused me a bit of trouble in the first instalment, so we'll tackle this in the next part of this series so we can look at it in a bit more depth.

Calculators and a tale of two TDDs - Pt 1: a traditional approach

I've seen TDD practiced a number of different ways. All of them use the basic "red, green, refactor" approach (i.e. write a failing test, pass it, refactor to clean up the code), but differ in the way the tests are written and in the focus of each test. Each way seems to lead the design differently, pushing design decisions at different times, requiring different amounts of refactoring and also focussing on quite different elements of design and implementation. I have a feeling that when practiced by masters each approach would converge to similar levels of awesomeness, but how far the rest of us get with each approach seems to vary greatly by how we naturally tend to approach problems.

To investigate this I thought I'd have a go at a coding exercise, and attempt it several different ways. You can't draw too many conclusions from this as the knowledge I get each time I go through the exercise will impact the later attempts, but I thought it could be an interesting exercise regardless.

The exercise I've picked is Roy Osherove's String Calculator kata. You can read the details at his site, but as it helps to avoid reading ahead I'll just describe it as we go. The overall gist is a calculator that has an Add() method. This method takes a string input, adds the numbers in the string together, then returns the result.

For this first attempt I'll be using a traditional Test Driven Development approach (or at least, traditional TDD as I understand it from reading some early-ish work on the topic by Kent Beck et al.). We'll pick up from the point where I have created a new solution containing a single C# project called CalculatorKata. I have two files: Calculator and CalculatorTests.

First test: empty strings

The first requirement for Add() is that it should return zero when given an empty string. Let's do that:

[TestFixture]
public class CalculatorTests {
    private Calculator calculator;

    [SetUp]
    public void SetUp() {
        calculator = new Calculator();
    }
    
    [Test]
    public void Takes_empty_string_and_returns_zero() {
        Assert.That(calculator.Add(""), Is.EqualTo(0));
    }
}

I've cheated a bit here by extracting a SetUp method in advance, but I'm pretty sure I'm going to need a new calculator for each test. Maybe this is evil. Feel free to point out the problems with this in the comments. This test doesn't pass, so let's pass it:

public class Calculator {
    public int Add(string expression) {
        return 0;
    }
}

It passes. Stay with me, things should start getting more interesting later (I hope...).

Test 2: Adding a single number

The next requirement is for when the string contains a single number. The test and passing implementation is below:

//CalculatorTests.cs
[Test]
public void Takes_a_single_number_and_returns_it() {
    Assert.That(calculator.Add("2"), Is.EqualTo(2));
}

//Calculator.cs
public int Add(string expression) {
    if (expression == "") return 0;
    return int.Parse(expression);
}

Summing multiple numbers

The next requirement is to actually sum multiple numbers separated by commas (','). We'll start of with the easy case: 2 numbers.

//CalculatorTests.cs
[Test]
public void Takes_two_numbers_delimited_by_commas_and_returns_the_sum() {
    Assert.That(calculator.Add("1,2"), Is.EqualTo(3));
}

//Calculator.cs
public int Add(string expression) {
    if (expression == "") return 0;
    return expression.Split(',').Sum(x => int.Parse(x));
}

Now you're quite welcome to debate me on whether this is actually the simplest thing that could possibly work. I could assume from my test that I'll only be dealing with two single digit numbers separated by a comma, get each digit via array access (expression[0] and expression[2]), parse each as an int and return the sum, but that hardly sounds any simpler to me. The current implementation will also pass our next test:

[Test]
public void Takes_multiple_numbers_separated_by_commas_and_returns_the_sum() {
    Assert.That(calculator.Add("1,2,3,4,5,6,7,8,9"), Is.EqualTo(45));
}

The only refactoring I can see is extracting the delimiter (',') into a constant. Once that's done we can go to the next step.

Allow commas and newlines as delimiters

//CalculatorTests.cs
[Test]
public void Takes_numbers_delimited_by_a_newline_and_returns_the_sum() {
    Assert.That(calculator.Add("1\n2\n3"), Is.EqualTo(6));
}

//Calculator.cs
public class Calculator {
    private string[] delimiters = new[] {",", "\n"};
    public int Add(string expression) {
        if (expression == "") return 0;
        return expression.Split(delimiters, StringSplitOptions.None).Sum(x => int.Parse(x));
    }
}

Allow a custom delimiter

The next requirement in the kata is more interesting. As input we can optionally specify a delimiter by using the following format: //[delimiter]\n[numbers]. The test is fairly simple, but the code got very messy:

//CalculatorTest.cs
[Test]
public void Can_set_the_delimiter_at_the_start_of_the_expression() {
 Assert.That(calculator.Add("//;\n1;2;3;4"), Is.EqualTo(10));
}

//Calculator.cs
public class Calculator {
    private const string CustomDelimiterToken = "//";
    private string[] DefaultDelimiters = new[] {",", "\n"};

    public int Add(string expression) {
        if (expression == "") return 0;
        var delimiters = DefaultDelimiters;
        if (expression.StartsWith(CustomDelimiterToken)) {
            var indexOfStartOfCustomDelimiter = CustomDelimiterToken.Length;
            var indexAfterCustomDelimiter = expression.IndexOf("\n");
            var customDelimiter = expression.Substring(indexOfStartOfCustomDelimiter, indexAfterCustomDelimiter - indexOfStartOfCustomDelimiter);
            delimiters = new[] {customDelimiter};
            expression = expression.Substring(indexAfterCustomDelimiter + 1);
            //Console.WriteLine(expression);
        }
        return expression.Split(delimiters, StringSplitOptions.None).Sum(x => int.Parse(x));
    }
}

That's beginning to look pretty intolerable. Our test passes, but this is in desperate need of refactoring. Now I'm pretty sure I've stuffed up the traditional TDD process here: this seems like much too big a step to get this test to pass. This could be an indication that I should have refactored first. Another indication is, as you may have noticed, I left a commented out Console.WriteLine() in there. That was because I had initially stuffed up the string escaping of the newline character, and couldn't find a good way to get inside my implementation to test what the value was at that point. This just feels plain dirty when you resort to that, but pragmatism won that battle and it helped me fix my error. All of this is really screaming out for a nice big refactoring, and I did refactor, but it was mainly tinkering around the edges using Extract Method to make the mess more understandable, rather than cutting through the mess itself.

public class Calculator {
    private const string CustomDelimiterToken = "//";
    private const string NewLine = "\n";
    private string[] DefaultDelimiters = new[] {",", NewLine};

    public int Add(string expression) {
        if (expression == "") return 0;
        var delimiters = DefaultDelimiters;
        if (HasCustomDelimiterSpecified(expression)) {
            delimiters = new[] {GetCustomDelimiter(expression)};
            expression = GetExpressionWithoutCustomDelimiterSpecification(expression);
        }
        return expression.Split(delimiters, StringSplitOptions.None).Sum(x => int.Parse(x));
    }

    private string GetExpressionWithoutCustomDelimiterSpecification(string expression) {
        return expression.Substring(IndexOfFirstNewLine(expression) + 1);
    }

    private string GetCustomDelimiter(string expression) {
        var indexOfStartOfCustomDelimiter = CustomDelimiterToken.Length;
        var indexAfterCustomDelimiter = IndexOfFirstNewLine(expression);
        var lengthOfCustomDelimiter = indexAfterCustomDelimiter - indexOfStartOfCustomDelimiter;
        var customDelimiter = expression.Substring(indexOfStartOfCustomDelimiter, lengthOfCustomDelimiter);
        return customDelimiter;
    }

    private int IndexOfFirstNewLine(string expression) {
        return expression.IndexOf(NewLine);
    }

    private bool HasCustomDelimiterSpecified(string expression) {
        return expression.StartsWith(CustomDelimiterToken);
    }
}

Not horribly impressive, is it? Perhaps a little easier to understand the Add() method itself, but this implementation hardly fills me with joy. The kata is not fully finished yet, but I think this is a decent place to stop and take stock.

Where did I go wrong?

Well, first up I missed the cue for a bigger refactoring. My tests weren't really screaming out a nice direction for me to go though. If you know your SOLID principles you'll have noticed right away that I am violating the Single Responsibility Principle (SRP). My class is doing lots of things: interpretting custom delimiters specified in a string, parsing numbers from the string, and adding the numbers.

Later on I went through and factored out a NumberParser class which took care of all the delimiter stuff and returning the numbers in the string, but that class still needed to be broken down further as well. Then I also needed to update the tests, so the calculator tests only test that it adds the numbers that come back from a mock NumberParser, and then adjust most of the current tests to test that the number parser is doing the parsing properly. Sure, I could leave the current tests as integration tests and forget about each unit, but I've this approach has given me troubles in the past.

Now I'm sure that if I were a better TDDer, refactorer, OOer etc that this would have turned out better. But my main complaint is that my tests aren't really helping me drive out my design. They help me to drive out an implementation that works (for which I am very grateful -- so much better than old school hack and slash :)), but by the time I am getting design feedback and test smells I've already dug myself in a bit of a hole. Luckily I've got tests to haul myself out, but is there a better way? Or do I need to just resign myself to the fact that I'm going to have to run through the gauntlet of SOLID principles and GRASP patterns etc. after each test to see when I need to refactor? But even then, refactor to what? I can start writing tests for the NumberParser, then join the bits back together, but this seems a bit hit-and-miss to me.

I'd love to get your thoughts on this. Anyone that wants to give the first part of the kata a try I'd love to read a blog post or have a chat with you and see how things ended up for you.

Next up I'll try a different flavour of TDD on the same problem.

Friday, 2 October 2009

Using Privoxy to block net nasties and ads in Chrome [Windows]

I love Google Chrome. It is so sleek, so responsive, and so fast. Problem is I've been using Firefox with AdBlock forever (well, rounded to the nearest eternity), and when I tried to switch to Chrome all the tracking cookies and ads turned my nice neat intraweb into a garish and frightening place. I finally decided to give Privoxy, a web proxy featuring junk filtering and privacy protections, a try.

Getting started with Privoxy

  1. Download and install Privoxy
  2. Set your HTTP proxy in Chrome (which uses the IE/Windows settings) to 127.0.0.1:8118. You can obviously use it from Firefox too if you like.
  3. Enjoy!

Wow, that was tough. :) The out of the box configuration seems to do pretty well. You can also delve into the config files and disable certain filters, allow ads for sites that aren't obnoxious, or switch to only blocking ads on sites you blacklist instead of using Privoxy's pattern matching. Privoxy is meant to be faster and easier than blocking this stuff in your hosts file as it can use patterns and cache lookups rather than running through each line in your host file every time you need a DNS lookup.

But I'm already behind a proxy at work! What do I do?

Glad you asked! You can configure Privoxy to forward to your corporate proxy. Go to your install directory (C:\Program Files\Privoxy for me) and edit the config.txt file (er, maybe back it up first :)). Search for the section called "forward" (section 5.1 on my install), uncomment the forward command and add your proxy:

forward / proxy.example.com:8080

This will forward all URLs to the given proxy and port number. Next search for "accept-intercepted-requests", read the explanation and warnings about the setting, uncomment the command and set it to 1. This will allow any traffic coming back via the corporate proxy to work with Privoxy.

accept-intercepted-requests 1

And with any luck you're now surfing in Chrome-filled bliss! :)

But I'm not always behind another proxy! How can I switch settings?

Oh, good question! I couldn't find any autoproxy.pac style approach for this, so I opted for a fairly basic approach. I created two config files, config.txt.work (with forwarding and accept-intercepted-requests on) and config.txt.nofoward (which is just the default config file for me). I then wrote a little PowerShell script to switch between these (use at your own risk! :)):

$location=$args[0]
$target=$env:ProgramFiles + "\Privoxy\config.txt"
if ($location -eq "work") {
    $source=$target + ".work"
} else {
    $source=$target + ".noforward"
}
cp -force "$source" "$target"

Based on the argument given to the script, this will just copy the required configuration file over Privoxy's config.txt file. I then setup two shortcuts pointing to this script, one to turn on work mode and the other to turn it off. The shortcut target looks something like this:

C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe -noexit "C:\Users\davesquared\Documents\Applications\PrivoxySwitch\PrivoxySwitch.ps1" work

You don't need the -noexit option, I just had it in for debugging. Note the command ends with the "work" argument, so the other shortcut should have "noforward" or something similar. Finally I set both shortcuts to run as Administrator so it has permission to copy the files. I've got the shortcuts in a location indexed by Launchy, so now I can just activate Launchy, type "switch work", ok the admin prompt, and I'm in work mode.

Is it worth it?

Yes, yes it is. Chrome is so nice. And you can do lots of fancy stuff with Privoxy if you delve into it. Have a quick glance through the Privoxy Quickstart and give it a try! :)