Rules of the Source Tree

These are the fundamental rules I use when managing a source tree. To many these are a given, but others don’t know these things.

  1. There must be an automated build.
  2. If you break the build, fix it.
  3. Building the code must be trivial.
  4. The code should run from the source tree, with no installation steps required.
  5. Creating a new build machine be as easy as following the document you wrote saying how to do it.
  6. Things needed to build the code aside from the contents of the source tree must be minimal and well documented. Ideally, this is nothing more than your compiler and your build system.
  7. Source control is for source code. Checking in binaries is allowed only in the pursuit of #4, or when said binaries are a critical part of the software itself. Even then, tread carefully.
  8. If you can choose between a binary and textual representation for a source file, use text. Consider this when choosing your tools. (WiX good, InstallShield BAD)
  9. It must be trivial for developers to run the unit tests in the same way the automated build does.
  10. Be considerate. Think about your fellow developer.

My project is pretty good on this list, I think we’re at 7.5/10. I’ve had major headaches enforcing #7, to my great disappointment. And number 8 is a continuing struggle. We’ll get there, hopefully. The half point is from #10 – turns out it’s hard to create a culture of mindfulness out of nothing.

What’s the score on your project? Are you working any of these things?

Advertisements

On the Abuse of IEnumerable

If it’s worth doing, it’s worth overdoing!

The abuse context in this case is the IEnumerable interface and the new (in C# 2.0) ‘yield’ statement that goes with it. In case you don’t know, it’s a trick hot feature that lets you easily return a sequence of things without having to create a temporary object. I like to think of it as being ‘foreachable’. Typical uses of this simply iterate over a single list. I’m going to show you to traverse a tree.

By way of example, lets start with the tree structure we used in the last article:

class Node {  
    Node[] children;  
}

Let’s throw things up a little this time and look at the client code first. What I want to be able to do is traverse over the tree as transparently as possible. We got pretty close last time, but this method will be better:

void TestTraversal()
{
    Node root = GenerateTestTree();

    int nodeCount = 0;
    foreach(Node n in root.Traverse)
    {
        nodeCount++;
    }
}

This is my favorite kind of source code – it’s absolutely obvious what’s going on, with no indication of what needs to happen inside. Perhaps ‘Traverse’ could be a better name, though: ‘Nodes’, or something. But you get the idea.

To implement this, we’ll use an IEnumerable (foreach-able) property:

class Node {
    public Node[] children;

    public IEnumerable<Node> Traverse
    {
        get 
        {
            foreach(Node child in children)
                foreach(Node n in child.Traverse)
                    yield return n;

            yield return this;
        }
    }
}

This is the meat, and it bears further study. The two nested loops are the recursive step, with a twist: they must re-yield each child’s result up to the caller. This is hard to grasp at first. It may help to note that a more sensible way of saying the same thing would be:

// THIS WON'T BUILD!!!
foreach(Node child in children)
    yield return child.Traverse;

That is, in fact, what I tried to write the first time I did this. But you can’t yield anything to an IEnumerable but its defined type. This is clearly a design flaw in the C# language. The workaround for this is to iterate over the child and re-yield each item.

I have mixed feelings about this technique. Between it and the delegate vistor technique I wrote about previously, it’s really hard to say what’s better. For applications where clarity of implementation is most important, I lean towards the delegate version. It’s easy to explain to anybody who understand function pointers, which is most people.

But for applications where a good API is the most important, I like the IEnumerable version (this one). It’s as clean as an API can possibly be, and it retains all the run-time benefist of the visitor pattern: it uses no temporary storage, and the user’s code is run once at every node instead of stringing all the calls together at the end.

Delegates and Visitors

The GoF visitor pattern is useful but awkward. The usual c++ or Java-y versions typically involve some kind of visitor interface which your visitor must subclass. Java somewhat mitigates this with anonymous classes, but not fully. In C#, if we’re clever with the use of delegates, we can make tasks like tree traversal look about as simple as iteration.

Let’s start with a simple tree class:

class Node {
    Node[] children;
}

We’ll implement a depth-first search using the visitor pattern. Using delegates, we can eliminate the need for an additional class:

class Node {
    public Node[] children;

    public delegate void VisitorDelegate(Node currentNode);
    public void Traverse(VisitorDelegate visitor)
    {
        foreach(Node n in children)
        {
            Traverse(n);
        }
        visitor(this);
    }
}

This is a pretty simple recursion, and really isn’t anything special. The way delegates are typically used, there is still an additional method required for the delegate to be called. But with C# 2.0, we have the opportunity to use anonymous delegates. This is where we can write some really slick code:

void TestTraversal()
{
    Node root = GenerateTestTree();

    int nodeCount = 0;
    root.Traverse(delegate(Node n)
    {
        nodeCount++;
    });
}

The really cool thing about this is that, since anonymous delegates can act as closures, our visitor can pull in the nodeCount from the surrounding environment. This makes for a very succinct version of the visitor pattern.

Technorati Tags: ,