On the Abuse of IEnumerable

If it’s worth doing, it’s worth overdoing!

The abuse context in this case is the IEnumerable interface and the new (in C# 2.0) ‘yield’ statement that goes with it. In case you don’t know, it’s a trick hot feature that lets you easily return a sequence of things without having to create a temporary object. I like to think of it as being ‘foreachable’. Typical uses of this simply iterate over a single list. I’m going to show you to traverse a tree.

By way of example, lets start with the tree structure we used in the last article:

class Node {  
    Node[] children;  

Let’s throw things up a little this time and look at the client code first. What I want to be able to do is traverse over the tree as transparently as possible. We got pretty close last time, but this method will be better:

void TestTraversal()
    Node root = GenerateTestTree();

    int nodeCount = 0;
    foreach(Node n in root.Traverse)

This is my favorite kind of source code – it’s absolutely obvious what’s going on, with no indication of what needs to happen inside. Perhaps ‘Traverse’ could be a better name, though: ‘Nodes’, or something. But you get the idea.

To implement this, we’ll use an IEnumerable (foreach-able) property:

class Node {
    public Node[] children;

    public IEnumerable<Node> Traverse
            foreach(Node child in children)
                foreach(Node n in child.Traverse)
                    yield return n;

            yield return this;

This is the meat, and it bears further study. The two nested loops are the recursive step, with a twist: they must re-yield each child’s result up to the caller. This is hard to grasp at first. It may help to note that a more sensible way of saying the same thing would be:

foreach(Node child in children)
    yield return child.Traverse;

That is, in fact, what I tried to write the first time I did this. But you can’t yield anything to an IEnumerable but its defined type. This is clearly a design flaw in the C# language. The workaround for this is to iterate over the child and re-yield each item.

I have mixed feelings about this technique. Between it and the delegate vistor technique I wrote about previously, it’s really hard to say what’s better. For applications where clarity of implementation is most important, I lean towards the delegate version. It’s easy to explain to anybody who understand function pointers, which is most people.

But for applications where a good API is the most important, I like the IEnumerable version (this one). It’s as clean as an API can possibly be, and it retains all the run-time benefist of the visitor pattern: it uses no temporary storage, and the user’s code is run once at every node instead of stringing all the calls together at the end.


Leave a Reply

Please log in using one of these methods to post your comment:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s