
Some State Machines


\(\newcommand{\ess}{\epsilon}
\newcommand{\concat}{\mbox{ concat }}\)

 


State maps

Maps from finite sequences of events to “outputs” offer an alternative representation of Moore machines \cite{Hopcroft}, with some advantages particularly for constructing automata products, which can be used to describe computer systems constructed by connecting components. For a set of events \(E\) and a set of outputs \(X\), a \emph{state map} is a map \(f:E^*\to X\), where \(E^*\) is the set of finite sequences over \(E\). The value \(f(s)\) is the output produced in the state reached by following \(s\) from the initial state. A \emph{composite state map} has the form \(f(s) = h(f_1(u_1),\dots ,f_n(u_n))\) where each \(u_i=u_i(s)\) is a dependent variable: the event sequence for factor \(i\) determined by \(s\) and the structure of the composite system. State maps and composite state maps can often be specified using primitive recursion on sequences. Moore machines and direct, cascade \cite{Holcombe}, and general \cite{Gecseg} products of Moore machines are equivalent to state maps under simple transformations.

Moore machines

A Moore machine is a tuple consisting of a set of events (the event alphabet) \(E\), a (not necessarily finite) state set \(S\) with a “start state” \(\sigma_0\in S\), a set of outputs \(X\), a transition map \(\delta:S\times E\to S\), and an output map \(\lambda:S\to X\).

\[ M = (E,\sigma_{0}, S, X, \delta, \lambda)\]

\(M\) can be converted to a state map \(f_M: E^*\to X\) by extending \(\delta\) to sequences. Let \(\ess\) be the null sequence and write \(s.e\) for the sequence obtained by appending event \(e\) to sequence \(s\) on the right. Every \(s\in E^*\) is constructed in a finite number of steps by appending events to the null sequence. Then \(\delta^*:S\times E^*\to S\) is given by:
\[\delta^*(\sigma,\ess) = \sigma, \quad \delta^*(\sigma, s.e) = \delta(\delta^*( \sigma,s), e).\]
So \(\delta^*\) is defined by primitive recursion on sequences.
Finally, let \(f_M(s) = \lambda \delta^* (\sigma_0,s) \). If \(M\) and \(N\) are Moore machines and \(f_M = f_N\) then \(M\) and \(N\) are the same up to differences in names of states or unreachable or duplicate states (minimization).
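As a sketch of the construction (Python, with illustrative names of my own), \(\delta^*\) is just a left fold of \(\delta\) over the event sequence, and \(f_M\) composes it with \(\lambda\):

```python
from functools import reduce

def delta_star(delta, sigma, s):
    """Extend delta to sequences by primitive recursion:
    delta*(sigma, ()) = sigma;  delta*(sigma, s.e) = delta(delta*(sigma, s), e)."""
    return reduce(delta, s, sigma)

def state_map(delta, lam, sigma0):
    """f_M(s) = lambda(delta*(sigma0, s))."""
    return lambda s: lam(delta_star(delta, sigma0, s))

# Example: a two-state toggle machine whose output equals its state.
toggle = state_map(lambda sigma, e: 1 - sigma, lambda sigma: sigma, 0)
```

Here `toggle(())` is 0 and each appended event flips the output, exactly as the recursion for \(\delta^*\) prescribes.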
Constructing Moore machines from state maps is also simple. Let \(s\mbox{ concat }z\) indicate concatenation of sequences. The Myhill-Nerode equivalence\cite{Hopcroft} is:

\[s \sim_f q \mbox{ if and only if } (\forall z\in E^*)(f(s \concat z) = f(q \concat z)).\]

A state set \(S_f \) can be constructed from the set of equivalence classes \(\{s\}_f = \{z: z\in E^*, z\sim_f s\} \). The initial state is \(\sigma_0 = \{\ess\}_f\). Then
\[\delta_f(\{s\}_f, e )= \{s.e\}_f\mbox{ and }\lambda_f(\{s\}_f) = f(s).\]
Starting with \(f\), constructing \(M_f\) via the Myhill-Nerode equivalence, and then converting back preserves equality: \(f \mapsto M_f\mapsto f_{M_f}=f\). Say \(f\) is finite state if and only if \(S_f\) is a finite set.

The next section gives a couple of illustrative examples of state systems and then I’ll show how products work.

Examples

State maps can be constructed using the usual function composition and primitive recursion on sequences. As an example, for a fixed positive integer \(k\) and alphabet \(E = \{1,-1\}\), a mod \(k\) counter is given by

\(
\newcommand{\Counter}{\mbox{COUNTER}}
\newcommand{\nope}{\mathop{\emptyset}}
\newcommand{\STORE}{\mathop{\mbox{STORE}}}
\newcommand{\QUEUE}{\mbox{QUEUE}}
\newcommand{\ARRAY}{\mbox{ARRAY}}
\)

\[ \Counter_k(\ess) = 0; \Counter_k(s.e) = (\Counter_k(s) + e )\bmod k\]

In this case, \(\Counter_k\) is defined on every sequence and so has an output in every state.
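The counter transcribes directly into code (a Python sketch; the function name is mine):

```python
def counter(k):
    """mod-k counter as a state map over E = {1, -1}:
    COUNTER_k(()) = 0;  COUNTER_k(s.e) = (COUNTER_k(s) + e) mod k."""
    def f(s):
        out = 0
        for e in s:              # the primitive recursion unrolled as a loop
            out = (out + e) % k
        return out
    return f
```

For example, a mod-3 counter maps \((1,1,1)\) back to 0 and \((-1)\) to 2.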

A storage cell is even simpler (and we may not want to specify output in the initial state):
\[ \STORE(s.e) = e \]

Consider a state map \(Q\) that models a FIFO queue of maximum length \(k\), so that \(Q(s)(i)\) is the content of cell \(i\) of the queue in the state determined by \(s\). The head of the queue is at position 1, so \(Q(s)(1)\) is the first element in the state determined by \(s\). The queue can hold values from a set \(V\) plus a special value \(\nope\not\in V\), so \(Q(s)(i)=\nope\) when there is no value from \(V\) stored in cell \(i\). The event alphabet can be \(E = \{deq\}\cup V\), where event \(v\in V\) enqueues \(v\) (puts it on the tail of the queue).

\[Q(\ess)(i) = \nope\]
\[Q(s.e)(i) = \begin{cases} e &\mbox{if }e\in V,\ Q(s)(i)=\nope,\mbox{ and }Q(s)(j)\neq\nope\mbox{ for all }0 < j < i;\\ Q(s)(i+1) &\mbox{if }e=deq\mbox{ and }i < k;\\ \nope &\mbox{if }e=deq \mbox{ and }i=k;\\ Q(s)(i)&\mbox{otherwise}\end{cases}\]

Specifying this queue as a conventional state machine would be tedious.
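By contrast, the recursion above transcribes into a few lines (a Python sketch; `NOPE` stands in for \(\nope\) and the names are mine):

```python
NOPE = None  # stands in for the special "empty cell" value

def queue_map(k, s):
    """Q(s): a tuple of k cells; each event is 'deq' or a value to enqueue.
    Direct transcription of the recursion in the text (cells indexed from 1)."""
    q = [NOPE] * k                 # Q(())(i) = empty
    for e in s:
        if e == 'deq':
            q = q[1:] + [NOPE]     # each cell takes cell i+1; empty into cell k
        else:                      # e in V: put e in the least empty cell
            for i in range(k):
                if q[i] is NOPE:
                    q[i] = e
                    break          # if the queue is full, e is dropped
    return tuple(q)
```

For instance, `queue_map(3, ('a', 'b', 'deq'))` leaves `'b'` at the head and the rest empty.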

Products

Standard function composition corresponds to the direct product:

\[ f(s) = h(f_1(s),\dots f_n(s))\]

Anything beyond the simplest direct product involves dependent variables for the sequences of the components.

\[ f(s) = h(f_1(u_1),\dots f_n(u_n))\]

where \(u_i = u_i(s)\) depends on \(s\) and on previous outputs of the components. In this work, these variables are always defined via primitive recursion:

\[ u_i(\ess)=\ess,\quad u_i(s.e) = u_i(s)\concat \phi_i(e,f_1(u_1(s)),\dots, f_n(u_n(s)))\]

This form ensures that the components start in their initial states (if \(s=\ess\) then \(u_i =\ess\)), that past history is never revised (\(u_i(s)\) must be a prefix of \(u_i(s.e)\)), and that only previous/current states are used to compute future states. Each \(\phi_i(e,x_1,\dots x_n)\) calculates the sequence of events that \(e\) generates for component \(i\), given the outputs of the components in the current state (the state determined by \(s\)). When the \(\phi_i\) depend only on \(e\), the components are connected in some variation of a direct (cross) product. If every \(\phi_i\) depends only on \(e\) and the \(x_j\) with \(j < i\), then the product is a cascade product, much studied in Krohn-Rhodes theory. Otherwise, the product is a general product and, as in many interesting computer systems, there is two-way or multi-way communication between components.

The conventional state machine representation of this product is conceptually simple, although awkward at scale (this is all slightly modified from a monograph by Gecseg \cite{Gecseg}, mostly to allow factor machines to change state at different rates). Given \(M_1,\dots M_n\) where \[ M_i = (E_i,\sigma_{i,0}, S_i, X_i, \delta_i, \lambda_i)\] and a new alphabet \(E\), we want to form a product. We also need \(\phi_1,\dots \phi_n\) where each \[ \phi_i: E\times X_1\dots \times X_n\to E_i^*.\] The state set of the new state machine is the cross product of the \(S_i\), so each state is a vector or tuple of \(n\) states \(\vec{\sigma}= (\sigma_1,\dots \sigma_n)\). The new output map is just the componentwise application of the output maps of the factors: \(\vec{x}=(\lambda_1\sigma_1,\dots \lambda_n \sigma_n)\). The new transition map \(\delta\) is the complex one: \(\delta(\vec{\sigma},e)=(\sigma'_1,\dots \sigma'_n)\) where \(\sigma'_i = \delta_i^*(\sigma_i,\phi_i(e,\vec{x}))\). Note that the outputs of the factors, given by \(\vec{x}\), feed back into the input of each factor.
Clearly, if each \(M_i\) is finite state, then \(M\) must be finite state. If each \(f_i\mapsto M_i\) by Myhill equivalence and \(M\) is the general product of the \(M_i\) for some \(\phi_1,\dots \phi_n\) and \(f = (f_1(u_1),\dots f_n(u_n))\) with the same \(\phi_i\), then \(f = f_M\). Therefore if all the \(f_i\) are finite state, so is \(f\).
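The construction can be sketched in a few lines of Python (names are mine; each machine is an (initial state, transition, output) triple and each \(\phi_i\) returns a possibly empty event sequence for factor \(i\)):

```python
from functools import reduce

def general_product(machines, phis):
    """General product sketch: machines[i] = (sigma0_i, delta_i, lambda_i);
    phis[i](e, xs) is the event sequence fed to factor i, given the global
    event e and the tuple xs of current factor outputs."""
    def delta(sigmas, e):
        # current outputs of all factors: the vector x fed back to every phi_i
        xs = tuple(lam(s) for (_, _, lam), s in zip(machines, sigmas))
        # each factor runs delta_i* over its own generated event sequence
        return tuple(reduce(d, phi(e, xs), s)
                     for (_, d, _), s, phi in zip(machines, sigmas, phis))
    sigma0 = tuple(m[0] for m in machines)
    lam = lambda sigmas: tuple(m[2](s) for m, s in zip(machines, sigmas))
    return sigma0, delta, lam
```

With \(\phi_i(e,\vec{x}) = (e)\) for every factor this degenerates to the direct product; making \(\phi_i\) inspect `xs` gives cascade or general products.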

Product examples

A tiny incremental step past the basic cross product adds some filtering based on events (that is essentially what the process algebras do, and it’s quite limited). For example, if \(E = \{a,b\}\) we can connect two counters so that one counts a’s and the other counts b’s. (In this example, as in some of the other simple examples, I will define \(\phi\) implicitly to keep the notation simpler.) The \(u_i\) maps filter the input: \[u_i(\ess) = \ess,\quad u_i(s.e) = \begin{cases}u_i(s).1&\mbox{if }(e=a \mbox{ and } i=1)\mbox{ or } (e=b\mbox{ and } i=2)\\ u_i(s)&\mbox{otherwise}.\end{cases}\]

So we could now define a state map to say whether more a’s or b’s have been encountered. \[ \mbox{More}(s) = \begin{cases} a&\mbox{if } \Counter(u_1(s)) > \Counter(u_2(s))\\ b&\mbox{if } \Counter(u_2(s)) > \Counter(u_1(s))\\ \mbox{neither} &\mbox{otherwise}\end{cases}\]
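A sketch of this filtered product in Python (I use plain integer counters on the filtered subsequences rather than mod-\(k\) counters, so the comparison is exact; names are mine):

```python
def more(s):
    """Which of 'a' or 'b' has occurred more often in s?
    Sketch of the More state map built from two filtered counters."""
    u1 = [1 for e in s if e == 'a']   # filtered input for counter 1
    u2 = [1 for e in s if e == 'b']   # filtered input for counter 2
    n1, n2 = len(u1), len(u2)         # the two counter outputs
    return 'a' if n1 > n2 else 'b' if n2 > n1 else 'neither'
```

The two counters never see each other's events, which is what makes this still essentially a cross product.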

If we let \(E = \{(i,v): 0 < i \leq k, v\in V\}\) where \((i,v)\) means “write \(v\) to cell \(i\)” then:

\[ \ARRAY(s)(i) = \STORE(u_i(s)), \quad (0 < i \leq k)\]

\[u_i(\ess) = \ess,\quad u_i(s.e) = \begin{cases} u_i(s).v&\mbox{if }e=(i,v) \mbox{ for some } v\in V\\ u_i(s)&\mbox{otherwise}.\end{cases}\]

These are both still cross products since there is no interaction between the components. But let’s add “Left” to \(E\) to indicate “rotate left”:

\[ \ARRAY'(s)(i) = \STORE(u_i(s)), \quad (0 < i \leq k)\]

\[u_i(\ess) = \ess,\quad u_i(s.e) = \begin{cases} u_i(s).v&\mbox{if }e=(i,v) \mbox{ for some } v\in V\\ u_i(s).\STORE(u_{i+1}(s))&\mbox{if } i< k\mbox{ and }e=Left\\ u_i(s).\STORE(u_1(s))&\mbox{if }i=k\mbox{ and }e=Left\\ u_i(s)&\mbox{otherwise}.\end{cases}\]

This is a general product because there is a circular information flow: cell 1 depends, indirectly, on cell \(k\) but sometimes cell \(k\) depends on cell 1.
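The rotating array transcribes directly (a Python sketch with my own names; unwritten cells are `None`):

```python
def array_rot(k, s):
    """ARRAY'(s): k storage cells with write events (i, v), 1 <= i <= k,
    plus a 'Left' event that rotates the contents one cell to the left."""
    cells = [None] * k              # no output specified before the first write
    for e in s:
        if e == 'Left':
            # cell i takes cell i+1's value; cell k takes cell 1's value
            cells = cells[1:] + cells[:1]
        else:
            i, v = e                # write event (i, v)
            cells[i - 1] = v
    return tuple(cells)
```

The circular dependency shows up in the single slice `cells[1:] + cells[:1]`: cell \(k\)'s new value comes from cell 1 while every other cell reads its right neighbor.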

Go back to the counters with event alphabet \(\{1,-1\}\) and let:

\[u(\ess)=\ess,\quad u(s.e) = \begin{cases}u(s).e &\mbox{if }(\Counter_k(s) = k-1\mbox{ and }e=1)\\ &\mbox{or }(\Counter_k(s)=0\mbox{ and }e= -1);\\u(s)&\mbox{otherwise}\end{cases}\]

Then \(g(s) = \Counter_k(s) + k*\Counter_k(u(s))\) counts mod \(k^2\). This is an example of a cascade product: the input to the second state machine depends only on the output of the first, and there is no feedback.
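The cascade can be sketched as follows (Python, names mine): the second counter sees an event exactly when the first is about to roll over.

```python
def mod_k2_counter(k, s):
    """Cascade sketch: g(s) = COUNTER_k(s) + k * COUNTER_k(u(s)), where u
    forwards e to the second counter exactly when the first counter is at
    k-1 on a +1, or at 0 on a -1."""
    lo = hi = 0
    for e in s:
        if (lo == k - 1 and e == 1) or (lo == 0 and e == -1):
            hi = (hi + e) % k       # second counter absorbs the carry/borrow
        lo = (lo + e) % k           # first counter always sees e
    return lo + k * hi
```

Note the test on `lo` uses the first counter's value *before* the event, matching \(\Counter_k(s)\) in the definition of \(u\).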

We can turn the array into a queue using a counter to track how many elements are in the queue. Suppose components \(1,\dots k\) are stores and component \(k+1\) is a counter. The counter has to count mod at least \(k+1\) (it must never roll over).

\[\QUEUE(s)(i)= \begin{cases} \nope&\mbox{if } i > \Counter_{k+1}(u_{k+1}(s));\\ \STORE(u_i(s))&\mbox{otherwise}\end{cases} \]

where the counter state change is determined by the counter value and the event:

\[ u_{k+1}(\ess)=\ess,\quad u_{k+1}(s.e) = \begin{cases} u_{k+1}(s).1 &\mbox{if } e = v\mbox{ for some }v\in V\mbox{ and }\Counter_{k+1}(u_{k+1}(s)) < k;\\ u_{k+1}(s).(-1)&\mbox{if }e=deq\mbox{ and } \Counter_{k+1}(u_{k+1}(s)) > 0;\\ u_{k+1}(s)&\mbox{otherwise}\end{cases}\]

and the stores are controlled by the counter and the event:

\[(0< i \leq k)\quad u_i(\ess) =\ess\]
\[ (0 < i \leq k)\quad u_i(s.e) = \begin{cases} u_i(s).v&\mbox{if }e=v\in V\mbox{ and }\Counter_{k+1}(u_{k+1}(s))=i-1; \\ u_i(s).\STORE(u_{i+1}(s))&\mbox{if }e=deq\mbox{ and }i < k\mbox{ and }\Counter_{k+1}(u_{k+1}(s))>0;\\ u_i(s)&\mbox{otherwise}\end{cases}\]
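A sketch of this composite queue in Python, with the counter folded in as an integer holding the number of queued elements (names and the exact indexing conventions are mine):

```python
NOPE = None  # the "empty cell" value

def queue_from_parts(k, s):
    """Queue built as a product of k stores and one counter: the counter
    holds the element count; enqueue writes the store after the last
    occupied cell; dequeue shifts every store left by one."""
    stores = [NOPE] * k
    count = 0
    for e in s:
        if e == 'deq':
            if count > 0:
                stores = stores[1:] + stores[-1:]  # store i (i<k) copies store i+1
                count -= 1
        elif count < k:                            # enqueue into cell count+1
            stores[count] = e
            count += 1
    # cells beyond the count are masked off as empty, as in QUEUE(s)(i)
    return tuple(v if i < count else NOPE for i, v in enumerate(stores))
```

The final masking line is doing the counter's real job: stale store contents past the count never become visible.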

Notes

state machines as maps and primitive recursion on sequences

This used to be fairly common in the literature. For example, Arbib’s text \cite{Arbib} introduces state machines as maps \(E^*\to X^*\) etc. Primitive recursion on sequences is obvious, but it is not developed much anywhere except in Péter’s book \emph{Recursive Functions in Computer Theory} \cite{PeterComputer}, and it is not applied to state machines there.

primitive recursive state machines

Say \(f:E^*\to X\) is primitive recursive if \(f(\ess) = c\) for some \(c\) and \(f(s.e) = h(e,f(s))\). In this case we can further reduce \(f\) to \(f(s) = \lambda(\beta(s))\) where \(\beta(\ess)=\sigma_0\) and \(\beta(s.e) = \delta(\beta(s),e)\). That is, \(\beta = \delta^*\) for \(M_f\) the state machine generated from \(f\) via the Myhill-Nerode equivalence.

Moore machines

Moore machines make an essential distinction between internal state and visible state (the output). For finite Moore machines, the ratio between the size of the output set and the size of the state set (both minimized) seems like it should be a measure of interest.

disambiguation of \(u\)

All the dependent event sequences here are named \(u\) or \(u_i\), which could get confusing when an expression involves multiple product form state maps. One answer is to abuse the dot notation: if \(f(s) = (f_1(u_1),\dots f_n(u_n))\), use \(f.u_i \) to disambiguate. For example, if \(f(s)=(g(u_1(s)),\dots, g(u_n(s)))\) and \(g(s) = (h(u_1(s)),\dots, h(u_m(s)))\) then
\(h(g.u_i(f.u_j(s)))\) should work.

Algebra

Of course \(E^*\) is the free monoid over \(E\) and making the Myhill equivalence two sided produces a monoid under concatenation of representative elements.

modularity

Any \(n\) state Moore machine can be factored into \(k = \lfloor\log_2 n\rfloor +1\) store machines over \(\{0,1\}\) using the general product. Number the states of the original machine \(0\dots n-1\). Let \(\phi_i(e,x_1, \dots x_{k}) \) compute \(j = \Sigma_{m=1}^k 2^{m-1}x_m\), then compute the next global state \(\delta(\sigma_j,e)=\sigma_{j'}\), and then extract the \(i^{th} \) bit of \(j'\) and use that as the input to store \(i\). So at each step, the stores encode the index of the current state of the original state machine. Note that the entire state of every factor is communicated at every step. Parnas \cite{ParnasCriteria} defined modularity in terms of information hiding, and this example shows why brute force factorization is not modularity. On the other hand, the factorization of the queue above into a counter and simple stores is highly modular and reduces complexity relative to the original queue state machine. We could measure the complexity of a factored machine as the sum of the states of the components times some measure of the information carried over the connections.
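A sketch of this brute-force factorization (Python, names mine; I use \(\lceil\log_2 n\rceil\) one-bit stores, which is enough to index \(n\) states):

```python
import math

def factor_into_bits(n, delta, lam):
    """Brute-force factorization sketch: encode the current state index of an
    n-state machine in ceil(log2 n) one-bit stores; each phi_i reads *all*
    the bits, computes the next global state, and feeds store i its new bit
    (so there is no information hiding at all)."""
    k = max(1, math.ceil(math.log2(n)))
    bits = [0] * k                                    # stores encode state 0
    def step(e):
        j = sum(b << i for i, b in enumerate(bits))   # reassemble global state
        m = delta(j, e)                               # next global state index
        for i in range(k):
            bits[i] = (m >> i) & 1                    # store i gets bit i of m
    def output():
        return lam(sum(b << i for i, b in enumerate(bits)))
    return step, output
```

Every `step` broadcasts the whole encoded state to every store, which is exactly the anti-modularity the text is pointing at.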

nondeterminism and partial specifications requiring solutions

For more complex examples, it is often useful to use state map definitions that have a possibly infinite class of solutions. For example, say a state map \(\alpha:E^*\to E\) is a storage cell over \(E\) if and only if \(\alpha(s)\in E\) and \(\alpha(s.e) = e\). Then if \(f\) and \(g\) are both storage cells over \(E\), they can have different initial values. State maps correspond to deterministic state systems (and deterministic state machines), but we can get what most people mean by non-determinism using incomplete specifications that may be satisfied by a class of state maps. As another example, a network device over a set \(G\) of messages might be specified only by its event alphabet and the requirement that it produce messages: \(d(s)\in G\).

temporal logic

This work was initially motivated by an attempt to apply temporal logic (Manna and Pnueli) to understand operating systems. Temporal logic semantics is usually given by a state machine where each state is a map from variable names to values (an “assignment map”). The status of “next” is problematic in such semantics because it essentially requires that all components advance by the same unit of time and only change state on a tick, while “next state” for state maps means “after some event e”. Furthermore, the semantics of the assignment maps is purely axiomatic, while for state maps, the semantics is algorithmic.
In TLA \cite{LamportTLA}, it’s easy to see that the model and the formal methods basis add complexity to deal with elementary properties of arithmetic, logic, and sets; e.g., excluded middle has to be explicitly specified by axiom.

process algebra

The foundational works on process algebra are marked by a dismissive and limited view of state machines. One well-known author asserts that two state machines that accept the same language are necessarily indistinguishable from the point of view of automata theory. Several claim that there is no notion of parallel operation or of communicating components in state machine theory (while, for example, the Hartmanis-Stearns book \cite{HartmanisStearns} on state machine factoring and Krohn-Rhodes theory precede process algebra by many years).

References

[1] Michael A. Arbib. Algebraic Theory of Machines, Languages, and Semi-groups. Academic Press, 1968.
[2] Ferenc Gecseg. Products of Automata. Monographs in Theoretical Computer Science. Springer Verlag, 1986.
[3] J. Hartmanis and R. E. Stearns. Algebraic Structure Theory of Sequential Machines. Prentice-Hall, 1966.
[4] W.M.L. Holcombe. Algebraic Automata Theory. Cambridge University Press, 1983.
[5] John E. Hopcroft and Jeffrey D. Ullman. Introduction to Automata Theory, Languages, and Computation. Reading, MA: Addison-Wesley, 1979.
[6] L. Lamport. “The Temporal Logic of Actions”. In: ACM Transactions on Programming Languages and Systems (TOPLAS) 16.3 (May 1994), pp. 872–923.
[7] D. L. Parnas. “On the Criteria to Be Used in Decomposing Systems into Modules”. In: Commun. ACM 15.12 (Dec. 1972), pp. 1053–1058. ISSN: 0001-0782. DOI: 10.1145/361598.361623. URL: http://doi.acm.org/10.1145/361598.361623.
[8] Rozsa Peter. Recursive Functions in Computer Theory. Ellis Horwood Series in Computers and Their Applications, 1982.

(Preprint, August 15, 2019.)

