Tuesday, 28 October 2008

Unusual formatting with test-first extension method abuse

My pair and I were looking at performing some unusual string formatting today. We kept finding that the extension methods in System.Linq.Enumerable were pretty helpful, but they often seemed to fall just short of what we needed to make the code really readable. Once I got home I thought I'd see how far I'd get by dumping some functionality into extension methods with blatant disregard for the potential consequences. (Unfortunately I had to miss the Sydney ALT.NET meeting tonight, so I had a bit of time to play around.)

Formatting arrays for acceptance tests

Here is the basic behaviour we're after. Given an array or other enumerable of integers (or of any type with a sensible ToString() method really), we want to return the items as a single, comma-separated string. The strange part is that if every value in the enumeration is the same, we just want to return that one value as a single string. The reason for this unusual behaviour is to get some easily usable output for the acceptance test framework we are using.

As this is just a helper for acceptance tests (i.e. we won't be polluting namespaces in production code) I'll dump this functionality onto any IEnumerable<T> using an extension method.

Starting test first

Let's start with an easy case: what should happen when we have an empty enumerable?

[TestFixture]
public class FixtureFormatterTests {
    [Test]
    public void Empty_array_should_format_as_empty_string() {
        var emptyArray = new int[0];
        Assert.That(emptyArray.ToFixtureString(), Is.EqualTo(string.Empty));
    }
} 

public static class HelperExtensions {
 public static string ToFixtureString<T>(this IEnumerable<T> enumerable) {
   return string.Empty;
 }
} 

After that monumentally brilliant piece of code, let's add the comma-separated string part of the requirement.

[Test]
public void Array_with_different_values_should_give_comma_separated_string() {
 var ints = new[] {1, 2, 3, 4};
 Assert.That(ints.ToFixtureString(), Is.EqualTo("1,2,3,4"));            
}

Now we'll get it to pass. We'll lean heavily on the built-in String.Join(String, String[]) method to do the work for us.

public static string ToFixtureString<T>(this IEnumerable<T> enumerable) {
 if (enumerable.Count() == 0) return string.Empty;
 return string.Join(",", enumerable.Select(item => item.ToString()).ToArray());
}

This passes, but it's a bit ugly. Let's look at refactoring.

First refactoring

First, I've got a feeling that if our enumerable is empty, String.Join(...) won't concatenate anything, and so will just return an empty string. This would render our first line redundant.

public static string ToFixtureString<T>(this IEnumerable<T> enumerable) {
 return string.Join(",", enumerable.Select(item => item.ToString()).ToArray());
}

It still passes both our tests so we are safe (I love unit tests :)). We also have that ugly bit of code where we are translating our IEnumerable<T> into an array of strings, using the Select() extension method. As I'm keen to start overusing extension methods, let's hide all that away in a Python-like join() method. Python's join() works like this:

>>> ints = [1,2,3,4]
>>> ",".join(str(i) for i in ints)
'1,2,3,4'

I'd like to do that, but abstract away the sequence/enumerable to string conversion. Let's do this using an extension method to char:

public static string Join<T>(this char separator, IEnumerable<T> enumerable) {
 return string.Join(separator.ToString(), enumerable.Select(item => item.ToString()).ToArray());
}

public static string ToFixtureString<T>(this IEnumerable<T> enumerable) {
 return ','.Join(enumerable);
}

Assuming you know the whole "Join" concept, our ToFixtureString() method is now pretty darned clean :). The original ugliness is now moved to the Join() method, but at least it is all directly related to the purpose of that method. In its original spot I think it obscured the intention behind the ToFixtureString() method.
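
Because Join() is generic, it should work for more than arrays of ints. Here is a small usage sketch; the JoinSketchExtensions and JoinUsageDemo class names are invented for this illustration, and Join() is repeated so the snippet stands on its own.

```csharp
using System.Collections.Generic;
using System.Linq;

// Same Join() as above, repeated so this snippet compiles on its own.
public static class JoinSketchExtensions {
    public static string Join<T>(this char separator, IEnumerable<T> enumerable) {
        return string.Join(separator.ToString(), enumerable.Select(item => item.ToString()).ToArray());
    }
}

// Hypothetical demo class, purely for illustration.
public static class JoinUsageDemo {
    // Joins strings with a pipe: "red|green|blue"
    public static string Words() {
        return '|'.Join(new[] { "red", "green", "blue" });
    }

    // Joins a lazily generated range of ints: "1,2,3,4"
    public static string Numbers() {
        return ','.Join(Enumerable.Range(1, 4));
    }
}
```

Any separator char will do, and the items only need a usable ToString().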

Completing our ToFixtureString() requirements

The last requirement we have for this is to only show one value if all the items in the enumerable are the same.

[Test]
public void Array_with_the_same_values_should_return_that_value_as_a_single_string() {
 const int value = 2;
 var ints = new[] {value, value, value};
 Assert.That(ints.ToFixtureString(), Is.EqualTo(value.ToString()));
}

Here's an attempt at getting this to pass.

public static string ToFixtureString<T>(this IEnumerable<T> enumerable) {
 var firstItem = enumerable.First();
 if (enumerable.All(item => item.Equals(firstItem)))
 {
  return firstItem.ToString();
 }
 return ','.Join(enumerable);
}

This fails our Empty_array_should_format_as_empty_string test because the enumerable.First() call throws with InvalidOperationException: Sequence contains no elements. So we're back to that enumerable.Count() == 0 line, which gets all our tests passing again.
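
To see that failure in isolation, here is a minimal sketch showing First() throwing on an empty sequence, next to FirstOrDefault(), which returns default(T) instead. The FirstOnEmptyDemo class and its method names are made up for the example.

```csharp
using System;
using System.Linq;

// Hypothetical demo class, not part of the helper code in this post.
public static class FirstOnEmptyDemo {
    // Returns true if First() throws InvalidOperationException for an empty array.
    public static bool FirstThrowsOnEmpty() {
        var empty = new int[0];
        try {
            empty.First();
            return false;
        } catch (InvalidOperationException) {
            return true;
        }
    }

    // FirstOrDefault() returns default(int), which is 0, rather than throwing.
    public static int FirstOrDefaultOnEmpty() {
        return new int[0].FirstOrDefault();
    }
}
```

FirstOrDefault() doesn't really help ToFixtureString() though: 0 is also a perfectly valid first item, so we couldn't tell an empty array from an array of zeros. Hence the explicit emptiness check.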

public static string ToFixtureString<T>(this IEnumerable<T> enumerable) {
 if (enumerable.Count() == 0) return string.Empty;
 var firstItem = enumerable.First();
 if (enumerable.All(item => item.Equals(firstItem)))
 {
  return firstItem.ToString();
 }
 return ','.Join(enumerable);
}

Refactoring out the empty enumerable check

I don't like enumerable.Count(). For anything that isn't an ICollection<T>, it needs to go through the entire enumerator to get the count, when we really only care whether the enumerable is empty. Sounds like time for some more extension method abuse. Here are some tests that require adding an IsEmpty() extension method to IEnumerable<T>:

[TestFixture]
public class IsEmptyEnumerableTests {
 [Test]
 public void Empty_enumerable() {
  Assert.That(new int[0].IsEmpty());
 }

 [Test]
 public void Non_empty_enumerable() {
  Assert.That(new[]{1,2,3}.IsEmpty(), Is.False);
 }
}

public static class HelperExtensions {
 //...  
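 public static bool IsEmpty<T>(this IEnumerable<T> enumerable) {
  // Dispose the enumerator once we've peeked at the first item.
  using (var enumerator = enumerable.GetEnumerator()) {
   return !enumerator.MoveNext();
  }
 }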
}    

This is a bit hacky, but it means we only need to see if our enumerator has a single item to determine whether it is empty, and we can make our ToFixtureString() method a bit more expressive as a result:

public static string ToFixtureString<T>(this IEnumerable<T> enumerable) {
 if (enumerable.IsEmpty()) return string.Empty;
 var firstItem = enumerable.First();
 if (enumerable.All(item => item.Equals(firstItem))) {
  return firstItem.ToString();
 }
 return ','.Join(enumerable);
}
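
As an aside, System.Linq.Enumerable already has an Any() method that stops after the first item, so the emptiness check could arguably be a one-line wrapper over it. The rough sketch below uses names invented for the example (AnyBasedExtensions, IsEmptyViaAny, LazySequenceDemo) and counts how many items a lazy sequence actually produces for each approach.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical alternative to IsEmpty(), named differently to avoid a clash.
public static class AnyBasedExtensions {
    public static bool IsEmptyViaAny<T>(this IEnumerable<T> enumerable) {
        return !enumerable.Any();
    }
}

// Hypothetical demo: a lazy sequence that counts how many items it hands out.
public static class LazySequenceDemo {
    public static int ItemsProduced;

    // Lazily yields 1..count, bumping the counter for each item produced.
    public static IEnumerable<int> Numbers(int count) {
        for (var i = 1; i <= count; i++) {
            ItemsProduced++;
            yield return i;
        }
    }

    // How many items does the Any()-based check pull from the sequence?
    public static int ItemsPulledByEmptinessCheck() {
        ItemsProduced = 0;
        Numbers(1000).IsEmptyViaAny();
        return ItemsProduced;
    }

    // How many items does Count() == 0 pull from the same sequence?
    public static int ItemsPulledByCount() {
        ItemsProduced = 0;
        var isEmpty = Numbers(1000).Count() == 0;
        return ItemsProduced;
    }
}
```

For this iterator the Any()-based check should pull a single item, while Count() walks all 1000. For arrays the difference is much less dramatic, since Count() can just ask the collection for its size.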

Vague semblance of a conclusion

We now have our unusual formatting covered, and IsEmpty() and Join() extension methods to help make our code a bit cleaner. I'm not advocating this kind of thing for every day use, but I think it shows how useful extension methods can be to make your code more expressive. It comes at the cost of changing classes that most .NET developers are familiar with, so it's definitely something to be careful with.



Wednesday, 22 October 2008

Learning C# lambda syntax from delegates

As a sweeping generalisation, I've found that developers who are really proficient with delegates / anonymous delegates seem to have a bit of an adverse reaction to the lambda syntax introduced in C# 3.0 (.NET 3.5). Luckily I'm proficient with very little, so the transition was easy for me :). This is a post to try and make the transition easier for people more competent than me :).

From delegates to lambdas

Say I have a Widget class, which just contains a Name and a WeightInGrams.

public class Widget {
    public Widget(string name, int weightInGrams) {
        Name = name;
        WeightInGrams = weightInGrams;
    }
    public string Name { get; set; }
    public int WeightInGrams { get; set; }
}

Now we want to search through an array of these and find how many are under 300 grams. Why? Er, why not? We'll use Array.FindAll to do this old skool (as opposed to fancy LINQy stuff like using Where()). FindAll takes an array of type T and a Predicate<T>, which is a delegate that takes a T and returns a bool indicating whether the predicate has been matched.

[TestFixture]
public class LambdaTests {
 [Test]
 public void SearchArrayUsingDelegate() {
  var widgets = SixWidgetsFrom100GramsTo600Grams();
  var widgets300GramsOrLess = Array.FindAll(widgets, Weighs300GramsOrLess);
  Assert.That(widgets300GramsOrLess.Length, Is.EqualTo(NumberOfWidgets300GramsOrLess));
 }

 private bool Weighs300GramsOrLess(Widget widget) {
  return widget.WeightInGrams <= 300;
 }

 private const int NumberOfWidgets300GramsOrLess = 3;
 static Widget[] SixWidgetsFrom100GramsTo600Grams() {
  return new[] {
    new Widget("W1", 100), new Widget("W2", 200), new Widget("W3", 300),
    new Widget("W4", 400), new Widget("W5", 500), new Widget("W6", 600)
     };
 }
} 

As of .NET 2.0 we can use an anonymous delegate to do this inline:

[Test]
public void SearchArrayUsingAnonymousDelegate() {
 var widgets = SixWidgetsFrom100GramsTo600Grams();
 var widgets300GramsOrLess = 
  Array.FindAll(widgets, delegate(Widget widget) { return widget.WeightInGrams <= 300; });
 Assert.That(widgets300GramsOrLess.Length, Is.EqualTo(NumberOfWidgets300GramsOrLess));
}

As of .NET 3.5 we have lambda syntax, which provides a terser way of expressing our predicate function:

[Test]
public void SearchArrayUsingLambda() {
 var widgets = SixWidgetsFrom100GramsTo600Grams();
 var widgets300GramsOrLess = 
  Array.FindAll(widgets, widget => widget.WeightInGrams <= 300);
 Assert.That(widgets300GramsOrLess.Length, Is.EqualTo(NumberOfWidgets300GramsOrLess));
}
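
The same lambda also slots straight into the "fancy LINQy" Where() mentioned earlier. Here is a sketch of what that might look like; the WidgetQueries class and CountAtOrUnder method are names invented for the illustration, and Widget is repeated so the snippet stands on its own.

```csharp
using System.Linq;

// Same Widget as above, repeated so this snippet compiles on its own.
public class Widget {
    public Widget(string name, int weightInGrams) {
        Name = name;
        WeightInGrams = weightInGrams;
    }
    public string Name { get; set; }
    public int WeightInGrams { get; set; }
}

// Hypothetical helper class, purely for illustration.
public static class WidgetQueries {
    // Counts widgets weighing maxGrams or less using LINQ's Where().
    public static int CountAtOrUnder(Widget[] widgets, int maxGrams) {
        return widgets.Where(widget => widget.WeightInGrams <= maxGrams).Count();
    }
}
```

One difference worth knowing: Array.FindAll builds and returns a new array straight away, whereas Where() returns a lazy IEnumerable<Widget> that only does the filtering once something (here Count()) enumerates it.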

Clear as mud? Let's have a closer look at how we convert from delegate to a lambda expression.

//Original delegate:
delegate(Widget widget) { return widget.WeightInGrams <= 300; }

//Drop the "delegate" keyword, and add a funky "=>" operator, which goes by all sorts of creative names :)
(Widget widget) => { return widget.WeightInGrams <= 300; }

//The C# 3.0 compiler has type inference, so we can also drop the argument type and let the compiler figure it out.
//If we have a single statement to the right of the "=>" operator, this will be returned from the function,
//so we can also drop the braces, end-of-statement semicolon and the explicit return.
widget => widget.WeightInGrams <= 300

So what's the difference between our anonymous delegate and our lambda expression? In this example, absolutely nothing other than a terser (and somewhat addictive IMO) syntax. Let's compare the generated code for both just to prove this:

[CompilerGenerated]
private static bool <SearchArrayUsingAnonymousDelegate>b__0(Widget widget) {
    return (widget.WeightInGrams <= 300);
}
[CompilerGenerated]
private static bool <SearchArrayUsingLambda>b__2(Widget widget) {
    return (widget.WeightInGrams <= 300);
}

So based on this example, anonymous delegates and lambdas are exactly the same. It's just a matter of getting used to writing a bit less noise code. :)

Exactly the same, except when they're different...

Of course there's a catch. Actually, I can think of two, and they both relate to expression trees. To support a lot of LINQ magic, lambda expressions can be converted to expression trees at compile time. An expression tree is basically just a bunch of objects representing each part of the lambda expression. A query provider, like the one provided by LINQ to SQL, can then process the expression tree and execute the expression in a different way, say, by converting it to T-SQL and running it against a database.

To get the compiler to generate an expression tree from a lambda expression we just need to specify the type differently:

Predicate<Widget> lambda = widget => widget.WeightInGrams <= 300;
Expression<Predicate<Widget>> expressionTree = widget => widget.WeightInGrams <= 300;
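
To make the "bunch of objects" idea a bit more concrete, here is a small sketch that pokes at the expression tree and then compiles it back into an ordinary delegate. The ExpressionTreeDemo class and its member names are invented for the example, and Widget is repeated so the snippet stands on its own.

```csharp
using System;
using System.Linq.Expressions;

// Same Widget as above, repeated so this snippet compiles on its own.
public class Widget {
    public Widget(string name, int weightInGrams) {
        Name = name;
        WeightInGrams = weightInGrams;
    }
    public string Name { get; set; }
    public int WeightInGrams { get; set; }
}

// Hypothetical demo class, purely for illustration.
public static class ExpressionTreeDemo {
    public static readonly Expression<Predicate<Widget>> LightWidget =
        widget => widget.WeightInGrams <= 300;

    // The tree is data: we can ask what kind of node sits at the root of the body.
    public static ExpressionType BodyNodeType() {
        return LightWidget.Body.NodeType;
    }

    // We can also compile the tree into a normal Predicate<Widget> and call it.
    public static bool Matches(Widget widget) {
        Predicate<Widget> compiled = LightWidget.Compile();
        return compiled(widget);
    }
}
```

Here BodyNodeType() should report ExpressionType.LessThanOrEqual, which is exactly the sort of information a query provider inspects when translating the expression into something else.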

So how does this relate to differentiating anonymous delegates and lambdas?

//Compiles fine:
Expression<Predicate<Widget>> expressionTree = widget => widget.WeightInGrams <= 300;

//WON'T COMPILE:
Expression<Predicate<Widget>> expressionTree = delegate(Widget widget) { return widget.WeightInGrams <= 300; };
/* error CS1946: An anonymous method expression cannot be converted to an expression tree */

As you can see from the code sample above, the compiler will simply refuse to convert the delegate form to an expression tree. So the way the compiler handles the two is quite different as soon as you introduce expression trees. I also mentioned a second catch. Take a look at this modification of the previous example:

//WON'T COMPILE:
Expression<Predicate<Widget>> expressionTree = widget => { return widget.WeightInGrams <= 300; };
/*  error CS0834: A lambda expression with a statement body cannot be converted to an expression tree */

This second catch is that there is actually a difference between lambda expressions and lambda statements (the C# documentation calls them expression lambdas and statement lambdas). A lambda statement contains braces and a function body, and can potentially have multiple lines like a standard delegate. A lambda expression is the single line with an implicit return. So while they behave identically in our original, Array-searching example, the following two forms are actually different if you are trying to assign them to expression trees.

//Lambda statement
widget => { return widget.WeightInGrams <= 300; }

//Lambda expression
widget => widget.WeightInGrams <= 300
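
When the target is a plain delegate rather than an expression tree, both forms compile and behave the same way. A quick sketch to show it; the LambdaFormsDemo class and its member names are invented for the example, and Widget is repeated so the snippet stands on its own.

```csharp
using System;

// Same Widget as above, repeated so this snippet compiles on its own.
public class Widget {
    public Widget(string name, int weightInGrams) {
        Name = name;
        WeightInGrams = weightInGrams;
    }
    public string Name { get; set; }
    public int WeightInGrams { get; set; }
}

// Hypothetical demo class, purely for illustration.
public static class LambdaFormsDemo {
    // Lambda statement: braces, explicit return.
    public static readonly Predicate<Widget> StatementForm =
        widget => { return widget.WeightInGrams <= 300; };

    // Lambda expression: single expression, implicit return.
    public static readonly Predicate<Widget> ExpressionForm =
        widget => widget.WeightInGrams <= 300;

    // True when both forms give the same answer for the given widget.
    public static bool FormsAgreeOn(Widget widget) {
        return StatementForm(widget) == ExpressionForm(widget);
    }
}
```

It is only the conversion to Expression<> that is picky about which form you use.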

Aside: In case you were wondering, here is the expression tree generated by the compiler for the widget => widget.WeightInGrams <= 300 lambda expression, care of Reflector:

ParameterExpression CS$0$0000;
Expression<Predicate<Widget>> expressionTree = 
  Expression.Lambda<Predicate<Widget>>(
 Expression.LessThanOrEqual(
  Expression.Property(
   CS$0$0000 = Expression.Parameter(typeof(Widget), "widget"), 
   (MethodInfo) methodof(Widget.get_WeightInGrams)), 
   Expression.Constant(300, typeof(int))), 
   new ParameterExpression[] { CS$0$0000 }
  );

Conclusion

So in conclusion, lambdas are simply, for most intents and purposes, a neater syntax for defining delegates.

//Delegate:
delegate(Widget widget) { return widget.WeightInGrams <= 300; }

//Drop the delegate and add the "=>" operator to get a lambda statement
(Widget widget) => { return widget.WeightInGrams <= 300; }

//Use type inference and implicit return to get a lambda expression
widget => widget.WeightInGrams <= 300

The only differences that can bite you are when you are dealing with expression trees, either explicitly via the Expression<> type, or implicitly by using the LINQ operators. Hope this helps, or at least has caused no significant damage to your understanding of lambdas :)



Wednesday, 1 October 2008

The (very) basics of AAA with Rhino Mocks 3.5

A small contingent from my work made the trek out to the first Sydney ALT.NET meeting last night. It was great to be in a room full of people all intent on finding better ways to develop software. Afterward I was dragged kicking and screaming (</sarcasm> :)) by my colleagues to a local pub for debriefing over beers and a laptop. One topic discussed was the Arrange, Act, Assert (AAA) style of mocking using Rhino Mocks 3.5.

I thought I'd quickly run through my (admittedly basic) understanding of AAA, as I tend to use a slightly different approach to the one shown in the meeting. If I've got anything hideously wrong please leave a comment and let me know. I'm not going to cover anything about how to use mocking, but will just attempt to outline the difference between record/replay and AAA.

Record / Replay semantics

The traditional way of mocking has been to use record/replay. This means you record a number of expectations against a mock object, then change the mock to replay mode and exercise the subject under test (SUT). In replay mode, the mock will throw an exception if an unexpected method is called (for strict mocks). The final step is to verify the expectations you recorded, which will throw an exception if one of the expected methods was not called.

Let's have a look at one of my dodgy-as-usual examples (probably even worse than normal, as it was written in a pub around 11 pm after a long day :))

public interface IEmailService {
    void Send(MailMessage msg);
}

public class InvoiceSender {
    private readonly IEmailService emailService;
    public InvoiceSender(IEmailService emailService) {
        this.emailService = emailService;
    }

    public void SendInvoice(float amount, string to) {
        var msg = new MailMessage("me@me.com", to, "Invoice", string.Format("Please pay {0}", amount));
        emailService.Send(msg);
    }
}

Our subject under test is InvoiceSender, and we want to verify that the IEmailService.Send(MailMessage) method is being called from SendInvoice(float, string).

[Test]
public void Send_invoice_using_email_service_with_record_replay() {
 var mockRepo = new MockRepository();
 var mockEmailer = mockRepo.DynamicMock<IEmailService>();            
 var invoiceSender = new InvoiceSender(mockEmailer); 
 //Record expectations
 mockEmailer.Expect(service => service.Send(null)).IgnoreArguments() ; 
 mockRepo.ReplayAll();
 //Exercise SUT
 invoiceSender.SendInvoice(1.0F, "me@me.com");  
 //Verify expectations
 mockRepo.VerifyAll();            
}

Here we've used Rhino Mocks to generate a mock implementation of the IEmailService interface. We have recorded a single expectation against it: it is expecting to have its Send(...) method called with any argument (hence the IgnoreArguments() call -- for non-pub code we would probably want to check the argument).

We then use ReplayAll() to switch to replay mode, which tells our mocks that we have finished recording expectations and are ready to see what is really called on our mock. We then exercise the SUT, and verify that our expectations were met (i.e. Send() was called on our mock). The test passes -- victory is ours!

Writing the test using AAA

There's nothing really wrong with the record/replay approach. If you like it that's great! Some people find it confusing (or at least unnatural), probably because it doesn't quite fit the four phase test structure used for state-based testing (Setup, Exercise, Verify, Teardown).

The AAA approach lets us use a more state-based testing approach with our mocks. Let's rewrite our previous test using AAA:

[Test]
public void Send_invoice_using_email_service_with_AAA() {
    //Arrange
    var mockEmail = MockRepository.GenerateMock<IEmailService>();
    var invoiceSender = new InvoiceSender(mockEmail);
    //Act
    invoiceSender.SendInvoice(1.0F, "me@me.com");
    //Assert
    mockEmail.AssertWasCalled(service => service.Send(Arg<MailMessage>.Is.Anything)); 
}

Here we are creating our mock using the new static GenerateMock<T>() method introduced in Rhino Mocks 3.5. We then exercise the SUT with an identical line of code to the one used in the first test. Finally, we assert that the Send() method on our mock was called as we expected. The test passes -- again we are victorious!

The first couple of things to notice are that we have no mention of recording or replaying expectations, and we've used fewer lines of code. Our test also fits in with the four phase test structure: Arrange -> Setup, Act -> Exercise, Assert -> Verify, with optional Teardown. This avoids mixing expectations and assertions throughout the test. Depending on your prior experience with record/replay, you might find this easier to read and understand than our first test.

Under the hood Rhino Mocks is still going off and doing pretty much the same thing as it's always done. The static GenerateMock<T>() method simply creates a dynamic mock already in replay mode. The mock remembers all calls against it, and we can then use AssertWasCalled() and other methods to check these calls and make sure the ones we want are there.
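
To illustrate the "remember the calls, assert afterwards" idea without any mocking library at all, here is a hand-rolled sketch. This is not how Rhino Mocks is implemented; RecordingEmailService and HandRolledAaaDemo are names invented for the example, and IEmailService and InvoiceSender are repeated from above so the snippet stands on its own.

```csharp
using System.Collections.Generic;
using System.Net.Mail;

public interface IEmailService {
    void Send(MailMessage msg);
}

public class InvoiceSender {
    private readonly IEmailService emailService;
    public InvoiceSender(IEmailService emailService) {
        this.emailService = emailService;
    }

    public void SendInvoice(float amount, string to) {
        var msg = new MailMessage("me@me.com", to, "Invoice", string.Format("Please pay {0}", amount));
        emailService.Send(msg);
    }
}

// Hypothetical hand-written test double: it just records every call to Send().
public class RecordingEmailService : IEmailService {
    private readonly List<MailMessage> sent = new List<MailMessage>();

    public void Send(MailMessage msg) {
        sent.Add(msg);
    }

    public int SendCallCount {
        get { return sent.Count; }
    }

    public MailMessage LastMessage {
        get { return sent.Count == 0 ? null : sent[sent.Count - 1]; }
    }
}

// Hypothetical demo class showing the Arrange and Act steps.
public static class HandRolledAaaDemo {
    public static RecordingEmailService SendOneInvoice() {
        //Arrange
        var emailService = new RecordingEmailService();
        var invoiceSender = new InvoiceSender(emailService);
        //Act
        invoiceSender.SendInvoice(1.0F, "me@me.com");
        //Assert is left to the caller, using the recorded calls
        return emailService;
    }
}
```

A test would then assert that SendCallCount is 1 after the Act step, which is roughly the job AssertWasCalled() does for us, minus all the hand-written plumbing.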

Aside: One of the issues raised during the Sydney ALT.NET meeting was mocks vs. stubs. I think AAA makes the distinction a bit more apparent. You'll tend to use a stub during the Arrange part of your test to provide indirect inputs to the SUT, whereas you will use mocks to verify behaviour and indirect outputs of the SUT during the Assert phase. From the Rhino Mocks wiki page on 3.5 (rev. 40), "A stub will never cause a test to fail", whereas mocks will fail if the expectations on them aren't met.

This is only a very simplistic example to illustrate the basic differences for each approach. For more realistic cases the benefits of AAA become more apparent (see Jimmy Bogard's post on AAA with BDD tests for a good example).

These couple of lines of code are available from my Google Code repository if you want to run the tests and have a bit of a play around: DaveSquared.MockSample.zip.

