Brackets, braces, parentheses, and other such creatures

I’ve been fairly busy the last couple of weeks finishing up my current job and preparing for my new one (a really exciting role with a team practising Scrum, TDD etc.), so I haven’t been able to dedicate much time to the three potentially bloggable projects I am working on at the moment. The other day I found Part 13 of a series of coding challenges posted to dev102.com, and thought I’d give it a go as a quick little exercise.

Note: potential spoiler ahead, in the event you were intending to work through this yourself. That’s not to say my solution is even vaguely correct, but if you want to do it completely off your own bat then have a go before you have all my mistakes to help you :)

The problem

This aim of this challenge was to determine whether any given string has a legal bracket structure. Basically, make sure all the ‘(‘, ‘[‘, ‘{‘, and ’<' have matching ')', ']', '}' and '>’. We also need to take nesting into account, so that "({)}" is incorrect, and "({}<{}>)" will pass.

Part of the challenge is also to "Provide the most efficient, elegant and simple solution". I’m not even going to attempt that, but I’ll try and get something vaguely efficient and, much like this author, simple.

Some quick planning

Shahar Yair (who posted the challenge) notes "I think that this is a well known problem". I think (although without a good googling I can’t be sure due to my limited cranial capacity) one of the common implementations is to use a stack, and push brackets on as you find them, and pop them off when you close them. If you have an empty stack at the end your string is ok. If a closing bracket doesn’t match the bracket on the top of the stack then you have a problem. The beautiful image below (crafted in that great image editor, MS Word) shows the basic idea: pushing on open brackets and popping them off once they get closed.

Using a stack to keep track of open brackets.

In terms of efficiency, the worst case for this is when you have a valid string, as you will have had to checked every character at least once (to see if it is a bracket). Using a stack will let us make a single pass through the string.

With this basic plan in mind, let’s start writing some tests. We’re not doing real TDD here I guess, as we have an end implementation in mind, but I still feel more comfortable working with the protection of some tests. It will hopefully cut down the amount of silly mistakes I normally make :).

Starting small

Here are a few easy tests to start off with (I’m using XUnit.NET here. I’m kind of growing attached to the simplicity of it :)):

public class When_matching_brackets {
    BracketMatcher matcher = new BracketMatcher();
    [Fact]
    public void Empty_string_should_pass() {
        should_pass(String.Empty);
    }
    [Fact]
    public void Null_string_should_pass() {
        should_pass(null);
    }    
    private void should_pass(String input) {
        Assert.True(matcher.IsValid(input));
    }
    private void should_fail(String input) {
        Assert.False(matcher.IsValid(input));
    }    
}
public class BracketMatcher {
    public bool IsValid(string input) { return true; }
}

We can pass these by just returning true. Let’s write a test that will make this fail:

[Fact]
public void Single_bracket() {
    should_fail("(");
}

We can cheat again to pass this:

public class BracketMatcher {
    public bool IsValid(string input) { 
        if (input == null) return true;
        return input.Length < 1;
    }
}

To drive the next bit of our algorithm we can add another failing test:

[Fact]
public void Matched_brackets() {
    should_pass("()");
}

We could dodgy this up again, but let’s start writing something a bit more realistic – but only a little.

public class BracketMatcher {
    public bool IsValid(string input) {
        if (input == null) return true;
        int openBrackets = 0;

        foreach (var c in input) {
            if (c == '(') { openBrackets++; }
            else if (c == ')') { openBrackets--; }                                
        }
        return openBrackets == 0;
    }
}

This passes, but it is obviously going to incorrectly return true for an input like ")(". We’ll use this to write our next test. What we are doing here is slowly building up a number of cases that should have a specific result, and writing a minimal (and trivial) implementation to make the test cases pass. The actual time spent doing this is quite short. I doubt anyone would spend any more than a few minutes typing this out. What we gain is momentum, and a test suite for when the time comes to refactor. Maybe it’s just me, but if I were going straight to an implementation, I’d generally make great progress for a while but stuff up a sign or an end case or similar, and then spend and inordinate amount of time manually tracing through the code to find where I had gone wrong. The slow-and-steady approach here means less initial speed, but no speed bumps later, and usually a faster* trip overall.

* Caution: unsubstantiated anecdote! YMMV

Skipping ahead to our first major refactoring

After plugging away for a little while I’ve got the following test cases:

public class When_matching_brackets {
    BracketMatcher matcher = new BracketMatcher();

    [Fact]
    public void Empty_string_should_pass() {
        should_pass(String.Empty);
    }

    [Fact]
    public void Null_string_should_pass() {
        should_pass(null);
    }

    [Fact]
    public void Single_bracket_should_fail() {
        should_fail("(");
    }

    [Fact]
    public void Single_bracket_within_string_should_fail() {
        should_fail("abc(");
    }

    [Fact]
    public void Matched_brackets_should_pass() {
        should_pass("()");
    }

    [Fact]
    public void Matched_brackets_within_string_should_pass() {
        should_pass("This is (a test).");
    }

    [Fact]
    public void Matched_brackets_with_single_brace_should_fail() {
        should_fail("This {is (a test).");
    }

    [Fact]
    public void Matched_brackets_and_braces_should_pass() {
        should_pass("This {is} a (test).");
    }

    [Fact]
    public void Correct_number_but_wrong_nesting_of_brackets_and_braces_should_fail() {
        should_fail("This {is a (test}).");
    }

    private void should_pass(String input) {
        Assert.True(matcher.IsValid(input));
    }

    private void should_fail(String input) {
        Assert.False(matcher.IsValid(input));
    }
}

The last test (emphasised) is failing with the following implementation (which passes all the other tests), because it fails to take bracket nesting into account:

public class BracketMatcher {             
    public bool IsValid(string input) {
        if (input == null) return true;
        
        var bracketCount = 0;
        var braceCount = 0;
        foreach (var c in input) {                
            if (c == '(') {
                bracketCount++;
            } else if (c == ')') {
                bracketCount--;
            } else if (c == '{') {
                braceCount++;
            } else if (c == '}') {
                braceCount--;
            }
        }
        return bracketCount == 0 && braceCount == 0;
    }
}

You can also see I have duplicated the counters for braces and brackets, and we’ll only be adding to this problem when it comes time to add the other bracket types. Let’s refactor to remove this, and get in the stack idea we talked about earlier.

Aside: Strictly speaking this isn’t refactoring – I’m changing both design and behaviour in one big step (currently there is one failing test, afterward they all pass). In this case it works out ok, but I have gotten myself into trouble many, many times by doing this kind of premature refactoring (it’s a bad habit I’m trying to break). In general I’d suggest getting all your tests passing, then refactoring to remove duplication.
public class BracketMatcher {                  
    public bool IsValid(string input) {            
        if (input == null) return true;

        var bracketStack = new Stack<char>();
        foreach (var c in input) {                
            if (c == '(') {
                bracketStack.Push(c);
            } else if (c == ')') {
                if (bracketStack.Pop() != '(') return false;
            } else if (c == '{') {
                bracketStack.Push(c);
            } else if (c == '}') {
                if (bracketStack.Pop() != '{') return false;
            }
        }
        return bracketStack.Count == 0;
    }
}

This passes all our tests, but it’s still ugly. The pushes and pops obscure the algorithm we are using. Sure, you can pick it out by tracing through a few cases, but it would be nicer to get the gist of it by inspection alone. We also have some duplicated if (bracketStack.Pop() != '(') return false; style code happening, but now we are at least passing all our tests, we can continue to work toward removing the ugliness without breaking what we have.

public class BracketMatcher {
    private static Dictionary<char, char> bracketPairs
        = new Dictionary<char, char> { {'(', ')'}, {'{', '}'} };

    public bool IsValid(string input) {
        if (input == null) return true;

        var bracketStack = new Stack<char>();
        foreach (var c in input) {
            if (bracketPairs.ContainsKey(c)) {
                bracketStack.Push(c);
            } else if (bracketPairs.ContainsValue(c)) {
                char lastBracket = bracketStack.Pop();
                if (bracketPairs[lastBracket] != c) return false;
            }
        }
        return bracketStack.Count == 0;
    }
}

This version is a bit of an improvement: we have removed the duplication by adding a dictionary of bracket pairs. It still isn’t exactly intention revealing though with those key/value checks and pushing and popping. I’ll go out on a limb here though, and suggest that it’s much better than the implementation we opened this section with.

Finishing up

A few more tests, and a refactoring later, and here’s what I ended up with:

public class BracketMatcher {
    private static readonly Dictionary<char, char> bracketPairs
        = new Dictionary<char, char> { {'(', ')'}, {'{', '}'}, {'[', ']'}, {'<', '>'} };

    private Stack<char> bracketStack;

    public bool IsValid(string input) {        
        if (input == null) return true;
        clearOpenedBrackets();
        foreach (var c in input) {
            if (isOpenBracket(c)) {
                recordBracketOpening(c);
            } else if (isCloseBracket(c)) {
                if (!canCloseLastOpenBracket(c)) return false;
                closeLastOpenedBracket();                    
            }
        }
        return areAllBracketsClosed();
    }

    private void clearOpenedBrackets() {
      bracketStack = new Stack<char>();
    }

    private bool areAllBracketsClosed() {
      return bracketStack.Count == 0;
    }

    private void closeLastOpenedBracket() {
        bracketStack.Pop();
    }

    private bool canCloseLastOpenBracket(char bracket) {
        if (areAllBracketsClosed()) return false;
        char lastOpenBracket = bracketStack.Peek();
        char closingBracket = bracketPairs[lastOpenBracket];
        return bracket == closingBracket;
    }

    private void recordBracketOpening(char bracket) {
        bracketStack.Push(bracket);
    }

    private bool isOpenBracket(char c) {
        return bracketPairs.ContainsKey(c);
    }

    private bool isCloseBracket(char c) {
        return bracketPairs.ContainsValue(c);
    }
}

I think IsValid(...) is actually reading pretty well here. The intent of the operation is fairly clear from a quick read through, with no implementation details like pops and pushes scattered around the code. The implementation itself is pushed to fairly simple helper methods.

The canCloseLastOpenBracket(...) method is a bit ugly, but it is only four lines, has descriptive variable names, and is tucked away in a private method. It’s also a bit inefficient, as we are doing an unecessary Peek() before Pop()ing the value off the stack. I rewrote the code several ways, but any approach that combined the Pop() with matching the brackets tended to obscure what the code was doing. I’ve had a minor method-count explosion, but they are all small and self-explanatory (less lines that commenting the previous version in any event ;)).

We might be able to simplify this in a number of ways, like putting the required closing brackets on the stack to make the matching more straightforward, but the other approaches I tried all seemed to compromise readability to some extent, so I ended up sticking with the verbosity of this version. I’m sure you can do better (leave a comment!), but for now it’s passing the tests thrown at it and I can live with how it reads. Which means we’re as good as done. :)

Speaking of done, here’s the test suite it passes:

public class When_matching_brackets {
    BracketMatcher matcher = new BracketMatcher();

    [Fact]
    public void Empty_string() {
        should_pass(String.Empty);
    }

    [Fact]
    public void Null_string() {
        should_pass(null);
    }

    [Fact]
    public void Single_bracket() {
        should_fail("(");
    }

    [Fact]
    public void Single_closing_bracket() {
        should_fail(")");
    }

    [Fact]
    public void Single_bracket_within_string() {
        should_fail("abc(");
    }

    [Fact]
    public void Matched_brackets() {
        should_pass("()");
    }

    [Fact]
    public void Matched_brackets_within_string() {
        should_pass("This is (a test).");
    }

    [Fact]
    public void Missing_a_brace() {
        should_fail("This {is (a test).");
    }

    [Fact]
    public void Pairs_of_matched_brackets_and_braces() {
        should_pass("This {is} a (test).");
    }

    [Fact]
    public void Incorrect_nesting_of_brackets_and_braces() {
        should_fail("This {is a (test}).");
    }

    [Fact]
    public void Valid_nesting_with_multiple_types() {
        should_pass("<Hello (World), ({how} are) you today>?");
    }

    [Fact]
    public void Multiple_pairs() {
        should_pass("(Hello) {World}, <how> are (you) today?");
    }

    [Fact]
    public void Multiple_lines_with_valid_brackets() {
        should_pass(@"(Hello {World,
                        How are you}
                        ?)");
    }

    [Fact]
    public void Multiple_lines_with_invalid_brackets() {
        should_fail(@"(Hello {World,
                        How are you)
                        ?)");
    }

    [Fact]
    public void Valid_example_from_problem_statement() {
        should_pass("([](<{}>))");
    }

    [Fact]
    public void Invalid_example_from_problem_statement() {
        should_fail("({<)>}");
    }

    [Fact]
    public void Matching_square_brackets() {
        should_pass("[asdf]");
    }

    [Fact]
    public void Single_square_bracket() {
        should_fail("[");
    }

    [Fact]
    public void Single_angle_bracket() {
        should_fail("<");
    }
    [Fact]
    public void Matching_angle_brackets() {
        should_pass("<>");  
    }

    private void should_pass(String input) {
        Assert.True(matcher.IsValid(input));
    }

    private void should_fail(String input) {
        Assert.False(matcher.IsValid(input));
    }
}

Was a bit of fun for a half hour or so :) Please feel free to leave a comment if you have any suggested improvements (or better yet, some failing tests!).

Comments