Pondering a prescription for pattern matching prevalence

In which I ramble on about how my thoughts on pattern matching have changed over the years.

Glorified conditional?

At its most basic, pattern matching can be used to represent standard conditionals and switch statements. For example (in F#):

// Pattern matching syntax:
let menu s =
    match s with
    | "X" -> exitCommand
    | "U" -> moveUpCommand
    | "D" -> moveDownCommand
    | _   -> ...

// Conditionals:
let menu2 s =
    if s = "X" then exitCommand
    else if s = "U" then moveUpCommand
    else if s = "D" then moveDownCommand
    else ...

This did not initially seem very exciting to me. There has to be more to it than this, right? (Spoiler: yes :) )

Pattern match all teh things!

Things get more interesting when we are dealing with types whose values can have different shapes. For example, Option<T> (similar to Nullable<T> in C#). In F# Option<T> has an IsSome property (like HasValue for Nullable<T>). If this is true then it is safe to access that value’s Value property. If IsSome is false, then accessing Value will throw a NullReferenceException. So we could (but please don’t) use option types like this:

// Don't do this!
let getKeyAndValue (key : string) (dict : Map<string,string>) =
    let result = dict.TryFind(key)
    if result.IsSome then
        Some (key, result.Value)  // please don't do this
    else
        None

I don’t like this. I’m not fond of null reference exceptions, and I don’t like checking IsSome before accessing values, because I do silly things like messing up the conditional, or forgetting to check and crashing with a NullReferenceException. Even when I don’t forget, there are always those cases that "will never be null" which end up being exactly that, thanks to a misunderstanding or a change somewhere along the call stack. And what about more complicated types, where we may have to check several different preconditions before accessing a number of different values?

Instead, we can use pattern matching to match all the possible shapes of our type:

// Better (but can still be improved)
let getKeyAndValue (key : string) (dict : Map<string,string>) =
    match dict.TryFind(key) with
    | Some value -> Some (key, value)
    | None       -> None

This is great because we don’t need to access the NullReferenceException-throwing .Value property. Instead the value is bound as part of the pattern: Some value. For the None case there is no value we can access within the pattern. If we try to add one, the compiler will stop and tell us we have the wrong pattern. What is extra great is that if we don’t cover all the possible values of the type we are matching against, the compiler will warn us.

So we’ve ruled out a whole bunch of errors, and have very explicit, compiler-checked documentation about valid ways to use values of each type.
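To make the "more complicated types" point a little more concrete, here is a small sketch using a made-up Shape type (it is purely illustrative and not from the examples above). Each case carries different data, and the pattern is the only way to reach that data, so there is no .Value-style accessor to call at the wrong time.

```fsharp
// Hypothetical type, invented for illustration: each case has a different shape.
type Shape =
    | Circle of radius : float
    | Rectangle of width : float * height : float
    | Point

// Every case is handled, and each branch only sees the data its case carries.
let area (shape : Shape) : float =
    match shape with
    | Circle radius             -> System.Math.PI * radius * radius
    | Rectangle (width, height) -> width * height
    | Point                     -> 0.0
```

If the Point branch were deleted from area, the compiler would flag the match as incomplete (warning FS0025) instead of leaving the gap to surface at runtime.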

This is awesome! Pattern match all teh things!

The "meh" of matching

Say we have a collection of key-value pairs, where both keys and values are strings. Maybe we got this from a POST request, or a flattened JSON object or something. We want to get the value for a particular key, and convert it to an integer (or 0 if we cannot do the conversion).

So we have two steps that can produce None: looking up a value for a key that may not be in the collection, and trying to convert that value to an integer.
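The conversion step appears in the code as tryParseInt, whose definition is not shown in this post. One plausible sketch, assuming it simply wraps .NET's Int32.TryParse and returns an option:

```fsharp
open System

// Assumed helper (not defined in the post): parse a string into an int,
// returning None instead of throwing when the text is not a valid integer.
let tryParseInt (s : string) : int option =
    match Int32.TryParse(s) with
    | true, value -> Some value
    | false, _    -> None
```

With this shape, both the lookup and the conversion report failure the same way, as None, which is what makes them easy to chain later on.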

Let’s start out with the conditional version:

let getRows (dict : Map<string, string>) : int =
    let rows = dict.TryFind("numberOfRows")
    if rows.IsSome then
        let result = tryParseInt(rows.Value)
        if result.IsSome then result.Value
        else 0
    else 0

Yuck, look at all those potentially catastrophic .Value calls! Let’s rewrite it with our new-found hammer:

let getRows2 (dict : Map<string, string>) : int =
    match dict.TryFind("numberOfRows") with
    | None -> 0
    | Some rows ->
        match tryParseInt rows with
        | None -> 0
        | Some result -> result

What isn’t so great is that we are still writing very similar code, just with safer pattern-matching instead of free-form conditionals. But we’re still going through the same code branches.

What I also found alarming when first starting out with this is a side-effect of the compiler warning us about unmatched values – we’re now forced to be explicit everywhere about how to handle all the values. Isn’t this going to get horribly verbose? We already have a good idea about when things are going to be null, so why trade concise code for a little safety?

Well, the good thing is we can have our safety and eat… er… code… concisely too!

Combinator all teh things!

Rather than digging into the details of a type by pattern matching all the time, we can define operations for using and combining values of these types. I often see these referred to as "combinators" (although that term seems overloaded). For example, we can rewrite our getRows function using Option.bind and Option.getOrElse [1] without ever digging in to grab a value from an Option<T> type.

let getRows3 (dict : Map<string, string>) : int =
    dict.TryFind("numberOfRows")
    |> Option.bind tryParseInt
    |> Option.getOrElse 0

Under the hood this code is still doing exactly the same thing, but we are now expressing the operation in terms of other distinct operations, instead of via the details of deconstructing values [2]. This allows us to start thinking at a higher level of abstraction. Rather than thinking about things like "if this is Some value return that, or if it is None then return the second option", we start thinking in terms of the higher-level operations like or and map. These operations allow us to more easily and precisely express more complex ideas.
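For anyone wanting to run the combinator version, here is a self-contained sketch. It assumes a tryParseInt built on Int32.TryParse and a hand-rolled getOrElse added to the Option module; neither definition appears in the post, so treat both as one possible implementation rather than the original code. The bindSketch function is there only to show the kind of pattern match that Option.bind hides.

```fsharp
open System

// Assumed helper: parse a string into an int option.
let tryParseInt (s : string) : int option =
    match Int32.TryParse(s) with
    | true, value -> Some value
    | false, _    -> None

// Assumed extension: a module named Option sits alongside the built-in one,
// so Option.bind and Option.getOrElse can both be used below.
module Option =
    let getOrElse (fallback : 'a) (opt : 'a option) : 'a =
        match opt with
        | Some value -> value
        | None       -> fallback

// Illustration only: roughly the pattern match that Option.bind performs for us.
let bindSketch (f : 'a -> 'b option) (opt : 'a option) : 'b option =
    match opt with
    | Some value -> f value
    | None       -> None

let getRows3 (dict : Map<string, string>) : int =
    dict.TryFind("numberOfRows")
    |> Option.bind tryParseInt
    |> Option.getOrElse 0
```

The pattern matching has not gone away; it has moved into two small, reusable functions. Newer versions of FSharp.Core also ship Option.defaultValue, which plays the same role as getOrElse here.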

This was a huge turning point for me. Previously I was worried about things like Option<T> values propagating all over the code, and having to pattern match at each call site. Now we still get propagation (which is completely valid! If we are dealing with a call that can return an empty value, chances are the caller will also need to return an empty value), but there is no cost for this. Combinators make using these values almost as convenient as using the wrapped type [3], with the benefit that we are now safely handling empty values instead of relying on ourselves to remember which calls sometimes return null instead of a T.

An aside for pattern matching-less languages

If we mainly use combinators for combining types of values, this makes pattern matching a less essential part of a language. It is still a very nice feature to have, as it is pretty natural to implement combinators using pattern matching, and pattern matching seems to go hand-in-hand with sum types, which I regard as an essential language feature. But for those who still do a lot of work in C# and similar languages this means that we can implement these combinators in other ways (sometimes messy ways, without as much compiler/type system help) and get a lot out of useful, oft-pattern-matched types like Option and Either.

Conclusion

My experience with pattern matching has gone from not understanding why it was useful, to wanting to use it everywhere, to now favouring combinators and avoiding having to dig in to the details of a type as much as possible. Using these operations defined over types gives me a nice, high-level way of thinking about building up these values.

Pattern-matching is still really useful, particularly for defining operations over a type, but in general I try to use those defined operations instead, only falling back to pattern matching in the cases where it is much simpler (for example: cases like let (a,b) = foo instead of let a = fst foo; let b = snd foo).

If you currently use pattern matching all the time, maybe try to pull out the repeated operations the pattern matches represent and see if you prefer that style. Operations like map, flatMap, apply, reduce/fold, and other combining functions along the lines of +, and, and or are good places to start.
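As a rough illustration of what "pulling out the repeated operation" can look like, here are two made-up functions written both ways (the names are invented for this sketch). The first pattern match is just Option.map in disguise, and the second is a fold.

```fsharp
// Hypothetical example: transform the value inside an option by matching...
let nameLength (name : string option) : int option =
    match name with
    | Some n -> Some (n.Length)
    | None   -> None

// ...or by naming the operation the match represents.
let nameLength2 (name : string option) : int option =
    name |> Option.map (fun n -> n.Length)

// Hypothetical example: sum a list by matching on its shape...
let rec sumByMatching (numbers : int list) : int =
    match numbers with
    | []           -> 0
    | head :: tail -> head + sumByMatching tail

// ...or by handing the combining function to fold.
let sumByFold (numbers : int list) : int =
    List.fold (+) 0 numbers
```

Each pair behaves the same for any input; the second version of each just says which operation is being performed rather than spelling out how the value is taken apart.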


  1. getOrElse is not part of the Option module in F# 3, but thankfully we can add members to modules.

  2. To me using combinators like this is similar to how we tend to use classes. The internal details of the class are stored in private fields, and we define methods to interact with instances of that class without having to know the details of those fields. Combinators give us the same level of abstraction – we can access operations over a type without knowing the patterns / specific constructors of that type.

  3. …and every bit as easy as using an object with methods hanging off it, which is one valid way of implementing these combinator functions
