+\hspace{-1.7em}
+\begin{minipage}{0.93\linewidth}
+\begin{code}
+fir (State (xs,hs)) x =
+ (State (shiftInto x xs,hs), (x +> xs) *+* hs)
+\end{code}
+\end{minipage}
+\begin{minipage}{0.07\linewidth}
+ \begin{example}
+ \label{code:fir}
+ \end{example}
+\end{minipage}
+
+where the vector \hs{xs} contains the previous input samples, the vector
+\hs{hs} contains the \acro{FIR} coefficients, and \hs{x} is the current input
+sample. The concatenation operator (\hs{+>}) creates a new vector by placing
+the current sample (\hs{x}) in front of the vector of previous samples
+(\hs{xs}). The code for the \hs{shiftInto} function, which adds the new input
+sample (\hs{x}) to the vector of previous input samples (\hs{xs}) and removes
+the oldest sample, is shown below:
+
+\hspace{-1.7em}
+\begin{minipage}{0.93\linewidth}
+\begin{code}
+shiftInto x xs = x +> init xs
+\end{code}
+\end{minipage}
+\begin{minipage}{0.07\linewidth}
+ \begin{example}
+ \label{code:shiftinto}
+ \end{example}
+\end{minipage}
+
+where the \hs{init} function returns all but the last element of a vector.
+The resulting netlist of a 4-tap \acro{FIR} filter, created by specializing
+the vectors of the \acro{FIR} code to a length of 4, is depicted in
+\Cref{img:4tapfir}.
+
+\begin{figure}
+\centerline{\includegraphics{4tapfir.svg}}
+\caption{4-tap \acrotiny{FIR} Filter}
+\label{img:4tapfir}
+\vspace{-1.5em}
+\end{figure}
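+
+The behaviour of the filter can be cross-checked with a small software model.
+The sketch below is plain Haskell on lists rather than \CLaSH\ code on
+fixed-size vectors; the names \hs{shiftIntoL}, \hs{dotL}, \hs{firL}, and
+\hs{simulateL} are hypothetical and introduced only for this illustration. It
+assumes that \hs{*+*} denotes a dot product and that \hs{xs} holds one sample
+fewer than \hs{hs} holds coefficients.
+
+\begin{code}
+-- List model of shiftInto: prepend the new sample, drop the oldest one
+shiftIntoL :: a -> [a] -> [a]
+shiftIntoL x xs = x : init xs
+
+-- Assumed meaning of the dot product operator
+dotL :: Num a => [a] -> [a] -> a
+dotL as bs = sum (zipWith (*) as bs)
+
+-- List model of fir: the state is the pair of samples and coefficients
+firL :: Num a => ([a],[a]) -> a -> (([a],[a]), a)
+firL (xs,hs) x = ((shiftIntoL x xs, hs), dotL (x : xs) hs)
+
+-- Feed a list of inputs through a state transition function
+simulateL :: (s -> i -> (s, o)) -> s -> [i] -> [o]
+simulateL _ _ []      = []
+simulateL f s (i:is)  = o : simulateL f s' is
+  where (s', o) = f s i
+\end{code}
+
+\noindent Under these assumptions, applying a unit impulse with
+\hs{simulateL firL ([0,0,0],[1,2,3,4]) [1,0,0,0,0]} yields
+\hs{[1,2,3,4,0]}, i.e., the coefficients appear at the output one per cycle,
+as expected for a 4-tap filter.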
+
+\subsection{Higher-order CPU}
+%format fun x = "\textit{fu}_" x
+This section discusses a somewhat more elaborate example in which user-defined
+higher-order functions, partial application, lambda expressions, and pattern
+matching are exploited. The example concerns a \acro{CPU} which consists of
+four function units, \hs{fun 0,{-"\ldots"-},fun 3} (see
+\Cref{img:highordcpu}), that each perform some binary operation.
+
+\begin{figure}
+\centerline{\includegraphics{highordcpu.svg}}
+\caption{\acrotiny{CPU} with higher-order Function Units}
+\label{img:highordcpu}
+\vspace{-1.5em}
+\end{figure}
+
+Every function unit has seven data inputs (of type \hs{Signed 16}) and two
+address inputs (of type \hs{Index 6}). The two addresses indicate which of
+the seven data inputs are to be used as the operands of the binary operation
+that the function unit performs.
+
+These seven data inputs consist of one external input \hs{x}, two fixed
+initialization values (0 and 1), and the previous outputs of the four function
+units. The output of the \acro{CPU} as a whole is the previous output of
+\hs{fun 3}.
+
+Function units \hs{fun 1}, \hs{fun 2}, and \hs{fun 3} each perform a fixed
+binary operation, whereas \hs{fun 0} has an additional input for an opcode to
+choose a binary operation out of a few possibilities. Each function unit
+outputs its result into a register; together, these four registers form the
+state of the \acro{CPU}. This state can, for example, be defined as follows:
+
+\begin{code}
+type CpuState = State [Signed 16 | 4]
+\end{code}
+
+Every function unit can now be defined by the following higher-order function,
+\hs{fu}, which takes three arguments: the operation \hs{op} that the function
+unit should perform, the seven \hs{inputs}, and the address pair
+\hs{({-"a_0"-},{-"a_1"-})}. It selects two inputs, based on the
+addresses, and applies the given operation to them, returning the
+result:
+
+\hspace{-1.7em}
+\begin{minipage}{0.93\linewidth}
+\begin{code}
+fu op inputs ({-"a_0"-}, {-"a_1"-}) =
+ op (inputs!{-"a_0"-}) (inputs!{-"a_1"-})
+\end{code}
+\end{minipage}
+\begin{minipage}{0.07\linewidth}
+ \begin{example}
+ \label{code:functionunit}
+ \end{example}
+\end{minipage}
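+
+To make the operand selection concrete, the following sketch models \hs{fu}
+on ordinary Haskell lists. The name \hs{fuList} is hypothetical, and plain
+\hs{Int} indices with the list indexing operator stand in for the
+\hs{Index 6} addresses and the vector indexing operator used above.
+
+\begin{code}
+-- List model of fu: pick two operands by address and combine them
+fuList :: (a -> a -> a) -> [a] -> (Int, Int) -> a
+fuList op inputs (a0, a1) = op (inputs !! a0) (inputs !! a1)
+\end{code}
+
+\noindent For the seven inputs \hs{[5,0,1,10,20,30,40]}, the application
+\hs{fuList (+) [5,0,1,10,20,30,40] (0,3)} adds the inputs at addresses 0 and
+3 and yields 15, whereas \hs{fuList (-) [5,0,1,10,20,30,40] (6,2)} yields 39.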
+
+\noindent Using partial application we now define:
+
+\hspace{-1.7em}
+\begin{minipage}{0.93\linewidth}
+\begin{code}
+fun 1 = fu add
+fun 2 = fu sub
+fun 3 = fu mul
+\end{code}
+\end{minipage}
+\begin{minipage}{0.07\linewidth}
+ \begin{example}
+ \label{code:functionunits1to3}
+ \end{example}
+\end{minipage}
+
+In order to define \hs{fun 0}, the \hs{Opcode} type and the \hs{multiop}
+function, which chooses a specific operation given the opcode, are defined
+first. It is assumed that the binary functions \hs{shift} (where \hs{shift a
+b} shifts \hs{a} by the number of bits indicated by \hs{b}) and \hs{xor} (for
+the bitwise \hs{xor}) exist.
+
+\hspace{-1.7em}
+\begin{minipage}{0.93\linewidth}
+\begin{code}
+data Opcode = Shift | Xor | Equal
+
+multiop Shift = shift
+multiop Xor = xor
+multiop Equal = \a b -> if a == b then 1 else 0
+\end{code}
+\end{minipage}
+\begin{minipage}{0.07\linewidth}
+ \begin{example}
+ \label{code:multiop}
+ \end{example}
+\end{minipage}
+
+Note that the result of \hs{multiop} is a binary function; this is supported
+by \CLaSH. The complete definition of \hs{fun 0}, which takes an opcode as
+additional argument, is:
+
+\hspace{-1.7em}
+\begin{minipage}{0.93\linewidth}
+\begin{code}
+fun 0 c = fu (multiop c)
+\end{code}
+\end{minipage}
+\begin{minipage}{0.07\linewidth}
+ \begin{example}
+ \label{code:functionunit0}
+ \end{example}
+\end{minipage}
+
+\noindent Now comes the definition of the full \acro{CPU}. Its type is:
+
+\begin{code}
+cpu :: CpuState
+ -> (Signed 16, Opcode, [(Index 6, Index 6) | 4])
+ -> (CpuState, Signed 16)
+\end{code}
+
+\noindent Note that this type fits the requirements of the \hs{run}
+function (meaning it can be simulated and synthesized). The actual
+definition of the \hs{cpu} function is:
+
+\hspace{-1.7em}
+\begin{minipage}{0.93\linewidth}
+\begin{code}
+cpu (State s) (x,opc,addrs) = (State s', out)
+ where
+ inputs = x +> (0 +> (1 +> s))
+ s' = [{-"\;"-}fun 0 opc inputs (addrs!0)
+ ,{-"\;"-}fun 1 inputs (addrs!1)
+ ,{-"\;"-}fun 2 inputs (addrs!2)
+ ,{-"\;"-}fun 3 inputs (addrs!3)
+ ]
+ out = last s
+\end{code}
+\end{minipage}
+\begin{minipage}{0.07\linewidth}
+ \begin{example}
+ \label{code:cpu}
+ \end{example}
+\end{minipage}
+
+Due to space restrictions, \Cref{img:highordcpu} does not show the
+internals of each function unit; note, for example, that \hs{multiop} is a
+subcomponent of \hs{fun 0}.
+
+While the \acro{CPU} has a simple (and maybe not very useful) design, it
+illustrates some possibilities that \CLaSH\ offers and suggests how to write
+actual designs.
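+
+The cycle-by-cycle behaviour of the \acro{CPU} can be explored with a
+software model. The sketch below is a standalone plain-Haskell module and not
+\CLaSH\ code: lists replace the fixed-size vectors, \hs{Int} replaces both
+\hs{Signed 16} and \hs{Index 6}, and \hs{shift} and \hs{xor} are taken from
+\hs{Data.Bits}. All names ending in \hs{M} or \hs{Model} are hypothetical, and
+\hs{runModel} is only a stand-in for a simulation function, not the \hs{run}
+function itself.
+
+\begin{code}
+import Data.Bits (shift, xor)
+
+data OpcodeM = ShiftM | XorM | EqualM
+
+-- Select the binary operation for the first function unit
+multiopModel :: OpcodeM -> Int -> Int -> Int
+multiopModel ShiftM  = shift
+multiopModel XorM    = xor
+multiopModel EqualM  = \a b -> if a == b then 1 else 0
+
+-- Pick two operands by address and combine them
+fuModel :: (Int -> Int -> Int) -> [Int] -> (Int, Int) -> Int
+fuModel op inputs (a0, a1) = op (inputs !! a0) (inputs !! a1)
+
+-- One clock cycle: the state holds the four previous results
+cpuModel :: [Int] -> (Int, OpcodeM, [(Int, Int)]) -> ([Int], Int)
+cpuModel s (x, opc, addrs) = (s', out)
+  where
+    inputs  = x : 0 : 1 : s
+    s'      = [ fuModel (multiopModel opc)  inputs (addrs !! 0)
+              , fuModel (+)                 inputs (addrs !! 1)
+              , fuModel (-)                 inputs (addrs !! 2)
+              , fuModel (*)                 inputs (addrs !! 3)
+              ]
+    out     = last s
+
+-- Feed a list of inputs through a state transition function
+runModel :: (s -> i -> (s, o)) -> s -> [i] -> [o]
+runModel _ _ []      = []
+runModel f s (i:is)  = o : runModel f s' is
+  where (s', o) = f s i
+\end{code}
+
+\noindent As a usage example, take the address program
+\hs{[(0,1),(0,2),(0,2),(4,5)]}: the second and third units compute $x+1$ and
+$x-1$, and the fourth unit multiplies their previous results. Starting from
+the all-zero state and applying $x = 3$ with opcode \hs{EqualM} for three
+cycles, \hs{runModel} produces the outputs \hs{[0,0,8]}; the product
+$(x+1)(x-1) = 8$ becomes visible in the third cycle because the
+result passes through two registers.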
+
+% Each of the function units has both its operands connected to all data
+% sources, and can be programmed to select any data source for either
+% operand. In addition, the leftmost function unit has an additional
+% opcode input to select the operation it performs. The previous output of the
+% rightmost function unit is the output of the entire \acro{CPU}.
+%
+% The code of the function unit (\ref{code:functionunit}), which arranges the
+% operand selection for the function unit, is shown below. Note that the actual
+% operation that takes place inside the function unit is supplied as the
+% (higher-order) argument \hs{op}, which is a function that takes two arguments.
+%
+%
+%
+% The \hs{multiop} function (\ref{code:multiop}) defines the operation that takes place in the leftmost function unit. It is essentially a simple three operation \acro{ALU} that makes good use of pattern matching and guards in its description. The \hs{shift} function used here shifts its first operand by the number of bits indicated in the second operand, the \hs{xor} function produces
+% the bitwise xor of its operands.
+%
+%
+% The \acro{CPU} function (\ref{code:cpu}) ties everything together. It applies
+% the function unit (\hs{fu}) to several operations, to create a different
+% function unit each time. The first application is interesting, as it does not
+% just pass a function to \hs{fu}, but a partial application of \hs{multiop}.
+% This demonstrates how one function unit can effectively get extra inputs
+% compared to the others.
+%
+% The vector \hs{inputs} is the set of data sources, which is passed to
+% each function unit as a set of possible operands. The \acro{CPU} also receives
+% a vector of address pairs, which are used by each function unit to select
+% their operand.
+% The application of the function units to the \hs{inputs} and
+% \hs{addrs} arguments seems quite repetitive and could be rewritten to use
+% a combination of the \hs{map} and \hs{zipWith} functions instead.
+% However, the prototype compiler does not currently support working with
+% lists of functions, so a more explicit version of the code is given instead.
+
+% While this is still a simple example, it could form the basis of an actual
+% design, in which the same techniques can be reused.