Monday, August 26, 2013

A generic implementation of a finite state machine

A finite state machine (FSM for short) can be informally described as a network of states, like "the elevator door is open" and the connections between them. There is one state that is the start state and there might be one or more stop states (or none). A FSM has a current state, which can change as a reaction to an event.

Let me clarify things with an example: a very simple interface for a music player. This music player has two states: it can play music, or it can be silent. The starting state is the idle one:

  +------+     +---------+
  | idle |     | playing |
  +------+     +---------+

Put more formally, the network of states is a directed graph of nodes (aka states). The connections between the nodes are called edges, and they have a preferred direction - there will be FSMs where you can get from state A to state B, but not back to state A. (If you want to be able to get back to state A, simply add another directed edge from B to A.)

How does the current state change? To continue with the previous example, the state of the music player changes when you press a button. Here is an example:

  +------+ --- press Play --> +---------+
  | idle |                    | playing |
  +------+ <-- press Stop --- +---------+

The above schema is a compact representation of the following rules:

  • if the current state is idle and the Play button is pressed, change the state to playing
  • if the current state is playing and the Stop button is pressed, change the state to idle

Of course, just changing the state of the player doesn't help that much; we need to actually start (or stop) playing the music. A FSM helps by having actions associated with each state change:

             [start playing]
  +------+ --- press Play --> +---------+
  | idle |                    | playing |
  +------+ <-- press Stop --- +---------+
             [stop playing]

This would be the equivalent of

  • if the current state is idle and the Play button is pressed, start playing; change the state to playing
  • if the current state is playing and the Stop button is pressed, stop playing; change the state to idle

Each event can have zero or more actions associated with it; very importantly, an event doesn't have to change the current state. For example, what should happen when the player is idle and you press stop? Nothing; the current state remains idle.

                         [start playing]
              +------+ --- Play --> +---------+ --- Play --+
 +----------> | idle |              | playing | <----------+
 +-- Stop --- +------+ <-- Stop --- +---------+
                         [stop playing]

Note that, in the previous version of the FSM, pressing Play while playing (or Stop while not playing) would have been an error; the implementation I'm going to write would have thrown an exception in that case. (I will make that a boolean property, but I believe I would keep that active, as the alternative is to simply "eat" unexpected events without a trace. I think that would lead to nasty, hard-to-trace bugs; if you truly don't care about unknown events for a given state, just add a default handler that does nothing and leaves the current state unchanged.)

That reminds me: in addition to the outgoing edges associated with an event, a state can also have a default event handler, that is activated if an event occurs but it doesn't match any of the others. (I will implement the event handlers as a list of predicates Predicate<T> associated with each state, each predicate determining whether a given event has occurred. In the case of the music player above, the T will be the button that was pressed, and the predicates will determine whether it was the Play or the Stop button.)

To give an example of a FSM with a genuine need for a default, do-nothing handler, consider a FSM that needs to count the number of times each letter occurs in a given string. That FSM only cares about three cases: the next character is a letter, the string has been completely parsed, and "anything else" (in which case, do nothing). Note also that a default case does not necessarily mean "do nothing": I could change the previous FSM to individually count each letter BUT also count how many non-letter characters occur overall.

Anyway. The above description should be enough to write the code for representing an edge in this graph:

  public class Edge<T>
  {
    public Predicate<T> Predicate { get; private set; }
    public Action<T> Action { get; private set; }
    public Node<T> Node { get; private set; }

    public Edge(Predicate<T> predicate, Action<T> action, Node<T> node)
    {
      Predicate = predicate;
      Action = action;
      Node = node;
    }
  }

(If you have seen my previous article you might have spotted the if possible, move logic out of code and into data mantra. An Edge object is the equivalent of a case in a switch statement; it's another way of moving logic from code to data.)

The Node class is also quite simple:

  public class Node<T>
  {
    public string State { get; private set; }

    public Node(string state)
    {
      State = state;
      edges = new List<Edge<T>>();
    }

    public void Add(Edge<T> edge)
    {
      edges.Add(edge);
    }

    public Node<T> Handle(T input)
    {
      var edge = edges.FirstOrDefault(e => e.Predicate != null && e.Predicate(input)) ??
                 edges.FirstOrDefault(e => e.Predicate == null);
      if (edge == null)
        return null;

      if (edge.Action != null)
        edge.Action(input);

      return edge.Node;
    }

    //

    private readonly List<Edge<T>> edges;
  }

Finally, here's the FSM implementation:

  public class FSM<T>
  {
    public bool IgnoreErrors { get; set; }

    public FSM(Node<T> start)
    {
      current = start;
    }

    public void Handle(T input)
    {
      var next = current.Handle(input);
      if (next != null)
        current = next;
      else if (!IgnoreErrors)
        throw new Exception(string.Format("Unknown input [{0}]", input));
    }

    //

    private Node<T> current;
  }

Here's an example of using this FSM implementation to parse a number:

    private static void Main(string[] args)
    {
      var sign = 1;
      var number = 0.0;
      var divisor = 0.0;

      var start = new Node<char>("start");
      var wholePart = new Node<char>("int part");
      var fractPart = new Node<char>("decimal part");

      start.Add(new Edge<char>(c => c == '+', null, wholePart));
      start.Add(new Edge<char>(c => c == '-', c => sign = -1, wholePart));
      start.Add(new Edge<char>(char.IsDigit, c => number = number * 10 + (c - '0'), wholePart));
      start.Add(new Edge<char>(c => c == '.', c => divisor = 0.1, fractPart));

      wholePart.Add(new Edge<char>(char.IsDigit, c => number = number * 10 + (c - '0'), wholePart));
      wholePart.Add(new Edge<char>(c => c == '.', c => divisor = 0.1, fractPart));

      fractPart.Add(new Edge<char>(char.IsDigit, c =>
      {
        number += divisor * (c - '0');
        divisor /= 10;
      }, fractPart));

      do
      {
        sign = 1;
        number = 0.0;
        divisor = 0.0;

        Console.Write("Enter a number: ");
        var s = Console.ReadLine();
        if (string.IsNullOrEmpty(s))
          break;

        var fsm = new FSM<char>(start);
        foreach (var c in s)
          fsm.Handle(c);

        Console.WriteLine("The number is " + sign * number);
      } while (true);
    }

The result of a run would look like this:

Enter a number: -.1
The number is -0.1
Enter a number: .1
The number is 0.1
Enter a number: 1.23
The number is 1.23
Enter a number: -12.34
The number is -12.34
Enter a number: +56789.0
The number is 56789
Enter a number: 1.2.3

Unhandled Exception: System.Exception: Unknown input [.]
Press any key to continue . . .

Of course, now that I am using the code, I find that it would help to simplify the syntax. One place would be in the Node class; I don't need to explicitly create the Edge instance (the Node.Add method can do that) and I can also simplify the case where the predicate simply checks for equality:

    public void Add(Predicate<T> predicate, Action<T> action, Node<T> node = null)
    {
      edges.Add(new Edge<T>(predicate, action, node ?? this));
    }

    public void Add(T value, Action<T> action, Node<T> node = null)
    {
      Add(it => it.Equals(value), action, node);
    }

This allows me to simplify the creation of the nodes in the Main method to:

      start.Add('+', null, wholePart);
      start.Add('-', c => sign = -1, wholePart);
      start.Add(char.IsDigit, c => number = number * 10 + (c - '0'), wholePart);
      start.Add('.', c => divisor = 0.1, fractPart);

      wholePart.Add(char.IsDigit, c => number = number * 10 + (c - '0'));
      wholePart.Add('.', c => divisor = 0.1, fractPart);

      fractPart.Add(char.IsDigit, c =>
      {
        number += divisor * (c - '0');
        divisor /= 10;
      });

I am thinking of making a couple more changes to this code, like a Reset method in the FSM and possibly a couple of helper methods to avoid creating explicit Node objects, something like

  fsm.Add("start", '+', null, "whole");
  fsm.Add("start", char.IsDigit, c => number * 10 + (c - '0'), "whole");
  fsm.Add("whole", '.', c => divisor = 0.1, "fract");

and so on, but all in all I am pretty happy with how the code looks right now.

No comments: