Tuesday, 16 August 2011

Functional programming newbie and something something monad something

Let me get one thing straight: I know absolutely nothing about monads. I have never intentionally used something I've recognised as a monad. I am dangerously unqualified to enhance your understanding of monads in any way. In fact reading this may damage you and prevent you from ever learning what a monad actually is!!!

The first reason I'm posting anything about monads at all is that I watched one of Robert "Uncle Bob" Martin's entertaining NDC 2011 talks titled "WTF is a monad" (video available from the NDC site). I'm unsure how approximated or mathematically correct he was intending the presentation to be, but I found it really interesting and was able to implement something I can only hope was vaguely monadic based on my interpretation of the information he presented. So I thought I'd share it with you in case you could correct me (it should go without saying, but any mistakes here are mine and have nothing to do with Bob or his presentation). Worst case is it gets you interested enough to look into the topic and find out all the stuff I got wrong. (Was that alright OJ? ;))

The second reason is that I like writing words like monad, monadic, and monoid because for a brief, shining moment it makes me feel like a real computer scientist. This moment generally comes crashing down as soon as I realise I have no idea what any of these terms mean, but it is a good couple of milliseconds. :)

Did I mention I don't know what I'm talking about? For this post especially I mean. Yes? Good, you should be safe to read on then...

Something something monad something

As far as I can gather, a monad is a structure that will let you use functions that take arguments of a certain type, and apply it to values from an another type (I'll call this the monadic type, but I could be misusing the term). We need to be able to map back and forth between these types. Bob roughly approximates a monad to an adapter; a monad is a way of adapting one type to another.

It is the form of this adapter that makes it a monad. A monad can be expressed as two functions: the unit function, normally called return or result, that takes an argument of the original type and returns the monadic type; and a bind function that takes the monadic type and a function that works on original types. (Technically monads should also obey the monad laws. I'm sure I've missed other important points about them too, but let's run with this for now.)

This structure has a few useful properties, mainly to do with being able to chain a sequence of functions that take arguments of the original type, then apply arguments of the monadic type to that chain. I think.

A dot monad?

Uncle Bob's first example was using a monad to manipulate a dots type using functions that normally work with integers. The dots type is simply a representation of an integer using '.' characters, so 5 maps to '.....' and back again. We'd like to be able to be able to use dots with standard integer operations like add, so that '..' + '...' gives '.....'.

Let's look at an example in Python:

class DotMonad:
    def result(self, i):
        return '.' * i
    def bind(self, dots, f):
        return f(len(dots))
Aside: If you haven't used Python before, the self arguments to the functions is required due to how instance methods work in Python. You can safely ignore them for this post, but if you know C# or Java self basically becomes like this in the context of an instance method.

Here our result function just translates integers (i) into dots. The bind function takes some dots and a function f that takes an integer. First it converts dots to integers (using the length of the string of dots) then calls f using the result.

This means that if we have an add function which takes integers, we can use our monad to adapt that function to take dots.

# Integer add function
def add(a, b):
    return a+b

# Monadic add function for dots
def addM(dotsA, dotsB):
    m = DotMonad()
    return m.bind(dotsA, 
        lambda a: m.bind(dotsB, 
        lambda b: m.result(a+b)
        )
    )

I've used a and b as the plain integer types, and dotsA and dotsB to represent our monadic dots type. We can now call addM('..', '...') and get '.....'.

So how's this work? Well remember that bind takes a dot for a first argument, and calls the function provided as a second argument after converting the dot to an int. The function we provide will be called with dotsA converted to integer a, then recursively call bind to convert dotsB in the same way. The last function in the chain is to the monad's result method which will convert the result of a+b back to dots.

Let's expand out and trace through the addM('..', '...') example to make sure we've got a handle on this:

return m.bind(dotsA,    # dotsA is '..', which is converted to int and passed to fn in 2nd arg
    lambda a:           # bind calls function with a = len('..'), which is 2 
        m.bind(dotsB,   # dotsB is '...', which is converted to int and passed to fn in 2nd arg
    lambda b:           # 2nd bind calls function with b = len('...'), which is 3 
        m.result(a+b)   # a+b is 2+3=5. m.result converts this back to '.....'
)

I think this is called lifting the add (+) function to work with our monad.

Lifting functions using monads

So we've now got a version of the basic integer add function that can work with our monadic dots type. But we'd like to be able to apply all integer functions to work with dots. In fact, we can generalise our addM function from before to lift any function which takes two arguments using a monad that can bind to that function's argument type .

Aside: We could also generalise to support functions with any number of arguments, but I'm struggling to keep up as it is. :\ :)

def liftm(m, op):
    return lambda a,b: m.bind(a,
            lambda ax: m.bind(b,
            lambda bx: m.result(op(ax, bx))
            )
    )

This is pretty much identical to our addM function, but we can now do some neat stuff. Let's import some standard Python operators and dot-erise them:

import operator

addM = liftm(DotMonad(), operator.add)
subM = liftm(DotMonad(), operator.sub)
divM = liftm(DotMonad(), operator.div)
mulM = liftm(DotMonad(), operator.mul)

#Interactive python session
>>> addM('..', '.')
'...'
>>> subM('....', '...')
'.'
>>> divM(mulM('..', '...'), subM('...', '.'))
'...'

Should we try again? Maybe...

Let's try another monad (again, from one Bob showed in his talk). This time we're going to try and represent a type that can either have or be missing a value as a monadic type. So something very similar to .NET's nullable types, Nullable<T>. The difference with the monadic form is that, because of the way we chain sequences of bind operations, we can actually perform operations involving missing values without throwing null reference exceptions everywhere.

class MaybeMonad:
    def result(self, x):
        return x
    def bind(self, maybe, f):
        if (maybe is None):
            return None
        else:
            return f(maybe)

Here our result function just returns whatever value it is given. If it has a value it will return that value; otherwise it will return None (Python's null or nil value).

Now we can lift our standard operators to work with our Maybe type:

>>> addm = liftm(MaybeMonad(), operator.add)
>>> mulm = liftm(MaybeMonad(), operator.mul)
>>> addm(2, 3)
5
>>> addm(4, None)
>>> mulm(6, 7)
42
>>> mulm(None, None)

Or we can lift null-safe versions of other functions:

def string_lens(a, b):
    return len(a) + len(b)

#Interactive python session
>>> string_lens("Hello", "World")
10
>>> string_lens("Hello", None)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in string_lens
TypeError: object of type 'NoneType' has no len()

>>> safe = liftm(MaybeMonad(), string_lens)
>>> safe("Hello", "World")
10
>>> safe("Hello", None)

Here string_lens throws when we pass in None, but our safe lifted version takes them in its stride.

Real-world monads

Monads can actually be spotted out in the wild. They particularly enjoy frolicking with pure functional languages, where they can be used for (among other things) getting around the pesky limitation of not allowing side-effects in functions. Mutable state can be simulated by passing a State monad between functions. The I/O monad is used to encapsulate the side-effects of reading and writing from input and output.

Reading through the examples in the Wikipedia entry shows some collections can even be regarded as monads (for example, result can return a list from a single item, bind can map a function to each element in a list). In some instances LINQ statements can also be used as monads. I've even seen JQuery accused on monadishness (yes, I just made up a word).

So where's this leave us? If you're like me: dazed, confused, craving a cup of tea, and also quite eager to resume working through the excellent Learn you a Haskell tutorial. :)

Friday, 12 August 2011

Technical debt

tl;dr: Every design decision creates technical debt. Trying to come up with the purest, best design can in some cases lead to more debt than seemingly less-ideal, lower impact changes. It can help to keep this in mind while making design decisions; today's optimal abstraction may cause much more pain long term than the quicker, less intrusive fix.

I've started listening to the excellent Ruby Rogues podcast recently, and the recent episode on technical debt really got me thinking about the topic.

The technical debt metaphor

The technical debt metaphor has traditionally been used to explain the impact of using hacky approaches in software development. We can go into debt by trading some quality (be it design purity, code cleanliness, testing, documentation etc.) for a temporary boost in speed.

The metaphor points out that, like financial debt, technical debt also accrues interest. By implementing something in a sub-optimal way we are not simply deferring a fixed time period it would have taken to do the work properly in the first place; that time will increase the longer we leave the hack in and become more dependent on it, or have to work around it in the rest of the code. This means if we accrue too much debt then our project becomes bankrupt -- the quality comes so low we can no longer work effectively with the codebase, and the project grinds to a stand still.

The conclusion of the metaphor is that we can take on small amounts of technical debt as required to meet deadlines or other constraints, but we need to keep the debt at a manageable level by regularly paying it back with refactoring and rework to keep the project progressing well longer term.

Inadvertent debt

Martin Fowler points out that technical debt can be accrued both deliberately and inadvertently. Most of the time we just don't know the right approach in advance, and so end up using sub-optimal solutions that accrue interest and slow us down.

"The moment you realize what the design should have been, you also realize that you have an inadvertent debt... My view is this kind of debt is inevitable and thus should be expected. Even the best teams will have debt to deal with as a project goes on - even more reason not to recklessly overload it with crummy code."
-- Martin Fowler, from his post on The Technical Debt Quadrant

After thinking about this some more I've come to the opinion that this doesn't quite go far enough. We may decide another design may have worked better, but until we've done it we don't really know for sure. The new design will have its own share of problems. It might even be considerably better, but there is more than likely another, even better solution that will become apparent once we learn from the mistakes of the second attempt.

One may even make the outrageous assertion that there is no perfect design for any problem, there are only different sets of trade offs.

Would you like a silver bullet with your free lunch?

Wow, so there is no silver bullet, huh? Bet you've never heard that before. ;) While you're finishing up sniggering something about me and "Captain Obvious" into your RSS reader, let me try and explain why I find this is a very helpful realisation in the context of technical debt. :)

If we acknowledge there is no perfect design, then every design decision we make is incurring technical debt that we'll need to pay off. This means that if we try and optimise all our decisions for what appears to be the ideal design, we are quite likely to end up taking longer to strive for perfect when the outcome will be sub-optimal anyway.

That elegant object hierarchy, pattern, or great abstraction added to the application architecture that removes lots of duplication and provides a huge potential for future reuse may seem like the lowest debt solution, but it will not be completely debt free. It will accrue interest and end up costing us as the broad generalisation it encapsulates becomes less applicable as development continues. In my experience the larger the change and investment in that change, the greater the interest rate tends to be.

Rather than looking for the ideal, we may be better off favouring decisions that are quick to implement and reversible/easy to change. Low impact, loosely-coupled changes (i.e. ones that require minimal amounts of changes to existing code) that at first seem less ideal maybe actually give us less debt overall. They may not perfectly abstract away the details of the current problem, but they will be easy to change or remove (along with the debt they contribute) as the needs of the project change or become better understood. Instead of picking the perfect fit for current requirements, we can focus on trying to pick the options that let us iterate and adapt to future requirements.

This kind of thinking can lead us to picking options that at first seem counter-intuitive but end up proving remarkably useful, such as using loosely-grouped classes over strongly-layered architectures, Commands to model database calls via a micro-ORM over persisting large object graphs via full-fledged ORM, using a NoSQL data store rather than an RDBMS, or picking a Front Controller pattern or MVC for a web app over an abstraction like WebForms.

"Right" sometimes isn't

At least for the developers I talk to, there seems to be a lot of self-imposed pressure for us to pick the "right" design when faced with a design choice; the choice that seems to perfectly abstract the current requirement. The right choice seldom seems the quick, easy option, but is generally one that requires a large amount of work, much of it for the benefit that it looks pure and elegant from an OO and a architectural point of view. I hear (and have made) comments like "That's a bit ugly. We should extract a common interface, add a factory and (insert lots more work here)". It's as if taking the easy option makes us lazy, bad people; that we are somehow failing to live up to the ideals of software craftsmanship.

I think our efforts to do the right thing in these cases can mean we inadvertently do a worse job overall; we are optimising for our current needs but going into debt by borrowing from our future needs. Our decision quickly racks up a huge amount of interest that we have to fight every time we need to make a change that does not quite match the previous requirements. The cost quickly outstrips the smaller, more modular change that would have been long gone, quickly refactored away the moment requirements started to push the code in a new direction. It seems like a good example of perfect being the enemy of the good.

Conclusion

This is not intended as an excuse for hacky code. I am talking about choosing between changes that are made with appropriate tests, taking SOLID principles and good programming practices into account, etc. My point is that the common way of thinking about technical debt can lead us to optimise for the wrong things. Rather than aiming for a beautiful OO abstraction that seems a zero-debt choice, we should acknowledge all our design decisions will incur debt and accrue interest, and so also look for low-impact, reversible changes that will add a smaller debt burden. It's not a question of whether to take on debt or not; it's how much to take and where from.