+ \subsection{Polymorphism}
+ A powerful feature of most (functional) programming languages is
+ polymorphism, which allows a function to handle values of different data
+ types in a uniform way. Haskell supports \emph{parametric
+ polymorphism}~\cite{polymorphism}, meaning functions can be written
+ without mention of any specific type and can be used transparently with
+ any number of new types.
+
+ As an example of a parametrically polymorphic function, consider the type of
+ the following \hs{append} function, which appends an element to a
+ vector:\footnote{The \hs{::} operator is used to annotate a function
+ with its type in \CLaSH.}
+
+ \begin{code}
+ append :: [a|n] -> a -> [a|n + 1]
+ \end{code}
+
+ This type is parameterized by \hs{a}, which can be instantiated to any
+ type at all. This means that \hs{append} can append an element to a vector
+ regardless of the type of the elements in the vector, as long as the value
+ to be added has the same type as the values already in the vector.
+ This kind of polymorphism is extremely useful in hardware designs to make
+ operations work on a vector without knowing exactly what elements are
+ inside, routing signals without knowing exactly what kinds of signals
+ these are, or working with a vector without knowing exactly how long it
+ is. Polymorphism also plays an important role in most higher-order
+ functions, as we will see in the next section.
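+
+ As a minimal sketch in plain Haskell, with ordinary lists standing in for
+ the fixed-size vectors (so the length is not tracked in the type), the
+ same definition can be used at several element types. The name
+ \hs{appendList} is introduced here purely for illustration and is not part
+ of \CLaSH:
+
+ \begin{code}
+ -- Illustrative list-based stand-in for append; not the CLaSH vector version.
+ appendList :: [a] -> a -> [a]
+ appendList xs x = xs ++ [x]
+
+ -- The single definition above is reused at two different element types.
+ bits :: [Bool]
+ bits = appendList [True, False] True     -- [True, False, True]
+
+ numbers :: [Int]
+ numbers = appendList [1, 2] 3            -- [1, 2, 3]
+ \end{code}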
+
+ Another type of polymorphism is \emph{ad-hoc
+ polymorphism}~\cite{polymorphism}, which refers to polymorphic
+ functions which can be applied to arguments of different types, but which
+ behave differently depending on the type of the argument to which they are
+ applied. In Haskell, ad-hoc polymorphism is achieved through the use of
+ type classes, where a class definition provides the general interface of a
+ function, and class instances define the functionality for the specific
+ types. An example of such a type class is the \hs{Num} class, which
+ contains Haskell's basic numerical operations, such as addition and
+ multiplication. A designer can make use of this ad-hoc polymorphism by
+ adding a constraint to a parametrically polymorphic type variable. Such a
+ constraint indicates that the type variable can only be instantiated to a
+ type whose members support the overloaded functions associated with the
+ type class.
+
+ As an example, we will take a look at the type signature of the function
+ \hs{sum}, which sums the values in a vector:
+ \begin{code}
+ sum :: Num a => [a|n] -> a
+ \end{code}
+
+ This type is again parameterized by \hs{a}, but \hs{a} can only be
+ instantiated to types that are \emph{instances} of the \emph{type class}
+ \hs{Num}, so that we know that the addition (+) operator is defined for
+ that type.
+ \CLaSH's built-in numerical types are also instances of the \hs{Num}
+ class, so we can use the addition operator (and thus the \hs{sum}
+ function) with \hs{SizedWords} as well as with \hs{SizedInts}.
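+
+ The effect of the \hs{Num} constraint can be sketched in plain Haskell.
+ The example below is an illustration under assumptions: lists replace
+ vectors, \hs{sumList} is a hypothetical name, and the standard
+ \hs{Word8} and \hs{Int8} types stand in for \CLaSH's sized types:
+
+ \begin{code}
+ import Data.Int  (Int8)
+ import Data.Word (Word8)
+
+ -- Illustrative list-based stand-in for sum; works for any Num instance.
+ sumList :: Num a => [a] -> a
+ sumList = foldl (+) 0
+
+ -- Unsigned 8-bit elements (stand-in for an unsigned sized type).
+ unsignedTotal :: Word8
+ unsignedTotal = sumList [1, 2, 3]        -- 6
+
+ -- Signed 8-bit elements (stand-in for a signed sized type).
+ signedTotal :: Int8
+ signedTotal = sumList [-1, 2, -3]        -- -2
+ \end{code}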
+
+ In \CLaSH, parametric polymorphism is completely supported. Any function
+ defined can have any number of unconstrained type parameters. The \CLaSH\
+ compiler will infer the type of every such argument depending on how the
+ function is applied. There is, however, one constraint: the top-level
+ function that is being translated cannot have any polymorphic arguments.
+ The arguments cannot be polymorphic because the top-level function is never
+ applied, and consequently there is no way to determine the actual types for
+ the type parameters.
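+
+ A small plain-Haskell sketch of this restriction is given below. Both
+ names are hypothetical: \hs{swapPair} is an arbitrary polymorphic helper,
+ and \hs{topLevel} plays the role of the function that would be handed to
+ the compiler, with a signature that fixes every type parameter:
+
+ \begin{code}
+ -- Polymorphic helper: usable at any pair of element types.
+ swapPair :: (a, b) -> (b, a)
+ swapPair (x, y) = (y, x)
+
+ -- Hypothetical top-level function: its signature fixes a and b,
+ -- so every type in the design is known.
+ topLevel :: (Bool, Int) -> (Int, Bool)
+ topLevel = swapPair
+ \end{code}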
+
+ \CLaSH\ does not support user-defined type classes, but does use some
+ of the standard Haskell type classes for its built-in functions, such as
+ \hs{Num} for numerical operations, \hs{Eq} for the equality operators, and
+ \hs{Ord} for the comparison/order operators.
+
+ \subsection{Higher-order functions \& values}
+ Another powerful abstraction mechanism in functional languages is
+ the concept of \emph{higher-order functions}, or \emph{functions as
+ first-class values}. This allows a function to be treated as a
+ value and be passed around, even as the argument of another
+ function. The following example should clarify this concept:
+
+ \begin{code}
+ negateVector xs = map not xs
+ \end{code}
+
+ The code above defines the \hs{negateVector} function, which takes a
+ vector of booleans, \hs{xs}, and returns a vector where all the values are
+ negated. It achieves this by calling the \hs{map} function, and passing it
+ \emph{another function}, boolean negation, and the vector of booleans,
+ \hs{xs}. The \hs{map} function applies the negation function to all the
+ elements in the vector.
+
+ The \hs{map} function is called a higher-order function, since it takes
+ another function as an argument. Also note that \hs{map} is again a
+ parametrically polymorphic function: it does not pose any constraints on the
+ type of the input vector, other than that its elements must have the same type as
+ the first argument of the function passed to \hs{map}. The element type of the
+ resulting vector is equal to the return type of the function passed, which
+ need not necessarily be the same as the element type of the input vector.
+ All of these characteristics can readily be inferred from the type
+ signature belonging to \hs{map}:
+
+ \begin{code}
+ map :: (a -> b) -> [a|n] -> [b|n]
+ \end{code}
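+
+ The following plain-Haskell sketch, which uses lists instead of vectors
+ and the hypothetical names \hs{negateList} and \hs{parities}, illustrates
+ both points: the function argument determines the element types, and the
+ result may have a different element type than the input:
+
+ \begin{code}
+ -- Here a = Bool and b = Bool.
+ negateList :: [Bool] -> [Bool]
+ negateList xs = map not xs
+
+ -- Here a = Int and b = Bool: the element type changes, the length does not.
+ parities :: [Bool]
+ parities = map even [1, 2, 3 :: Int]     -- [False, True, False]
+ \end{code}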
+
+ So far, only functions have been used as higher-order values. In
+ Haskell, there are two more ways to obtain a function-typed value:
+ partial application and lambda abstraction. Partial application
+ means that a function that takes multiple arguments can be applied
+ to a single argument, and the result will again be a function (one
+ that takes one argument fewer). As an example, consider the following
+ expression, which adds one to every element of a vector:
+
+ \begin{code}
+ map (+ 1) xs
+ \end{code}
+
+ Here, the expression \hs{(+ 1)} is the partial application of the
+ plus operator to the value \hs{1}, which is again a function that
+ adds one to its (next) argument. A lambda expression allows one to introduce an
+ anonymous function in any expression. Consider the following expression,
+ which again adds one to every element of a vector:
+
+ \begin{code}
+ map (\x -> x + 1) xs
+ \end{code}
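+
+ As a sketch of how these forms relate, the plain-Haskell definitions
+ below (on lists, with hypothetical names) all denote the same function.
+ The helper \hs{add} is an assumption added to show partial application
+ of a named two-argument function:
+
+ \begin{code}
+ -- A named two-argument function, used below for partial application.
+ add :: Int -> Int -> Int
+ add x y = x + y
+
+ -- Three equivalent ways to add one to every element.
+ viaSection, viaLambda, viaPartial :: [Int] -> [Int]
+ viaSection xs = map (+ 1) xs
+ viaLambda  xs = map (\x -> x + 1) xs
+ viaPartial xs = map (add 1) xs           -- add 1 :: Int -> Int
+ \end{code}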
+
+ Finally, not only built-in functions can have higher-order
+ arguments: any function defined in \CLaSH\ can have function
+ arguments. This allows the hardware designer to use a powerful
+ abstraction mechanism in their designs and achieve a high degree of
+ code reuse. The only exception is again the top-level function: if a
+ function-typed argument is not supplied with an actual function, no
+ hardware can be generated.
+
+ % \comment{TODO: Describe ALU example (no code)}
+
+ \subsection{State}
+ A very important concept in hardware is the concept of state. In a
+ stateful design, the outputs depend on the history of the inputs, or the
+ state. State is usually stored in registers, which retain their value
+ during a clock cycle. As we want to describe more than simple
+ combinational designs, \CLaSH\ needs an abstraction mechanism for state.
+
+ An important property in Haskell, and in most other functional languages,
+ is \emph{purity}. A function is said to be \emph{pure} if it satisfies two
+ conditions:
+ \begin{inparaenum}
+ \item given the same arguments twice, it should return the same value in
+ both cases, and
+ \item when the function is called, it should not have observable
+ side-effects.
+ \end{inparaenum}
+ % This purity property is important for functional languages, since it
+ % enables all kinds of mathematical reasoning that could not be guaranteed
+ % correct for impure functions.
+ Pure functions are as such a perfect match for combinational circuits,
+ where the output depends solely on the inputs. When a circuit has state,
+ however, it can no longer be described simply by a pure function.
+ % Simply removing the purity property is not a valid option, as the
+ % language would then lose many of it mathematical properties.
+ In \CLaSH\ we deal with the concept of state in pure functions by making
+ the current value of the state an additional argument of the function and
+ the updated state part of the result. In this sense the descriptions made
+ in \CLaSH\ are the combinational parts of a Mealy machine.
+
+ A simple example is adding an accumulator register to the earlier
+ multiply-accumulate circuit, of which the resulting netlist can be seen in
+ \Cref{img:mac-state}:
+
+ \begin{code}
+ macS (State c) a b = (State c', c')
+   where
+     c' = mac a b c
+ \end{code}
+
+ \begin{figure}
+ \centerline{\includegraphics{mac-state.svg}}
+ \caption{Stateful Multiply-Accumulate}
+ \label{img:mac-state}
+ \end{figure}
+
+ The \hs{State} keyword indicates which arguments are part of the current
+ state, and what part of the output is part of the updated state. This
+ aspect will also be reflected in the type signature of the function.
+ Abstracting the state of a circuit in this way makes it very explicit:
+ which variables are part of the state is completely determined by the
+ type signature. This approach to state is well suited to be used in
+ combination with the existing code and language features, such as all the
+ choice elements, as state values are just normal values. We can simulate
+ stateful descriptions using the recursive \hs{run} function:
+
+ \begin{code}
+ run f s (i : inps) = o : (run f s' inps)
+   where
+     (s', o) = f s i
+ \end{code}
+
+ The \hs{(:)} operator is the list construction (cons) operator, where the
+ left-hand side is the head of a list and the right-hand side is the
+ remainder of the list. The \hs{run} function applies the function the
+ developer wants to simulate, \hs{f}, to the current state, \hs{s}, and the
+ first input value, \hs{i}. The result is the first output value, \hs{o},
+ and the updated state \hs{s'}. The next iteration of the \hs{run} function
+ is then called with the updated state, \hs{s'}, and the rest of the
+ inputs, \hs{inps}. It is assumed that there is one input per clock cycle.
+ Also note how the order of the input, output, and state in the \hs{run}
+ function corresponds with the order of the input, output and state of the
+ \hs{macS} function described earlier.
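+
+ A complete simulation can be sketched in plain Haskell as follows. Several
+ details are assumptions made for this illustration only: the \hs{State}
+ wrapper is modelled as a simple \hs{newtype}, \hs{mac} is assumed to
+ compute $a \cdot b + c$ on \hs{Int} values, the two inputs of \hs{macS}
+ are tupled so that it fits the shape expected by \hs{run}, and \hs{run}
+ is given a base case so that it terminates on finite input lists:
+
+ \begin{code}
+ -- Assumed model of the State wrapper.
+ newtype State s = State s
+
+ -- Assumed definition of the earlier multiply-accumulate function.
+ mac :: Int -> Int -> Int -> Int
+ mac a b c = a * b + c
+
+ -- Tupled inputs, so that macS fits the (state, input) shape run expects.
+ macS :: State Int -> (Int, Int) -> (State Int, Int)
+ macS (State c) (a, b) = (State c', c')
+   where
+     c' = mac a b c
+
+ -- As run in the text, plus a base case so finite input lists terminate.
+ run :: (s -> i -> (s, o)) -> s -> [i] -> [o]
+ run _ _ []         = []
+ run f s (i : inps) = o : run f s' inps
+   where
+     (s', o) = f s i
+
+ -- Three clock cycles, starting from an accumulator value of 0.
+ outputs :: [Int]
+ outputs = run macS (State 0) [(1, 2), (3, 4), (5, 6)]   -- [2, 14, 44]
+ \end{code}
+
+ Each output is the accumulator value after the corresponding input pair
+ has been multiplied and added, which matches the behaviour of the
+ register in \Cref{img:mac-state}.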
+
+ As the \hs{run} function, the hardware description, and the test
+ inputs are all plain Haskell, the complete simulation can be compiled to an
+ executable binary by an optimizing Haskell compiler, or executed in a
+ Haskell interpreter. Both simulation paths are much faster than first
+ translating the description to \VHDL\ and then running a \VHDL\
+ simulation, and the executable binary has an additional simulation speed
+ advantage when there is a large set of test inputs.
+
+\section{\CLaSH\ compiler}
+An important aspect of this research is the creation of the prototype compiler, which translates descriptions made in the \CLaSH\ language, as described in the previous section, to synthesizable \VHDL, allowing a designer to actually run a \CLaSH\ design on an \acro{FPGA}.
+
+The Glasgow Haskell Compiler (\GHC) is an open-source Haskell compiler that
+also provides a high-level API to most of its internals. The availability of
+this high-level API obviated the need to design many of the tedious parts of
+the prototype compiler, such as the parser, semantic checker, and especially
+the type-checker. The parser, semantic checker, and type-checker together form
+the front-end of the prototype compiler pipeline, as depicted in
+\Cref{img:compilerpipeline}.
+
+\begin{figure}
+\centerline{\includegraphics{compilerpipeline.svg}}
+\caption{\CLaSHtiny\ compiler pipeline}
+\label{img:compilerpipeline}
+\end{figure}
+
+The output of the \GHC\ front-end is the original Haskell description
+translated to \emph{Core}~\cite{Sulzmann2007}, which is a smaller, typed,
+functional language that is easier to process than the larger Haskell
+language. A description in \emph{Core} can still contain properties which have
+no direct translation to hardware, such as polymorphic types and
+function-valued arguments. Such a description needs to be transformed to a
+\emph{normal form}, which only contains properties that have a direct
+translation. The second stage of the compiler, the \emph{normalization} phase,
+exhaustively applies a set of \emph{meaning-preserving} transformations on the
+\emph{Core} description until this description is in a \emph{normal form}.
+This set of transformations includes transformations typically found in
+reduction systems for lambda calculus~\cite{lambdacalculus}, such as
+$\beta$-reduction and $\eta$-expansion, but also includes self-defined
+transformations that are responsible for the reduction of higher-order
+functions to `regular' first-order functions.
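+
+The two lambda-calculus transformations named above can be illustrated at
+the level of Haskell source. The definitions below are textbook examples
+with hypothetical names, written in Haskell rather than \emph{Core}; they
+are not the compiler's actual transformation rules. Each pair shows an
+expression before and after the transformation, with the same meaning:
+
+\begin{code}
+-- beta-reduction: applying a lambda substitutes the argument into its body.
+beforeBeta, afterBeta :: Int
+beforeBeta = (\x -> x + 1) 2
+afterBeta  = 2 + 1
+
+-- eta-expansion: a function-valued definition gets an explicit argument.
+beforeEta, afterEta :: [Bool] -> [Bool]
+beforeEta    = map not
+afterEta  xs = map not xs
+\end{code}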
+
+The final step in the compiler pipeline is the translation to a \VHDL\
+\emph{netlist}, which is a straightforward process due to the resemblance
+between a normalized description and a set of concurrent signal assignments.
+We call the
+end-product of the \CLaSH\ compiler a \VHDL\ \emph{netlist} as the resulting
+\VHDL\ resembles an actual netlist description and not idiomatic \VHDL.
+
+\section{Use cases}
+
+\subsection{FIR Filter}
+\label{sec:usecases}
+An example of a common hardware design where the use of higher-order
+functions leads to a very natural description is a FIR filter, which is
+basically the dot-product of two vectors:
+
+\begin{equation}
+y_t = \sum\nolimits_{i = 0}^{n - 1} {x_{t - i} \cdot h_i }
+\end{equation}
+
+A FIR filter multiplies fixed constants ($h$) with the current
+and a few previous input samples ($x$). The resulting products
+are summed to produce the result at time $t$. The equation of a FIR
+filter is indeed equivalent to the equation of the dot-product, which is
+shown below:
+
+\begin{equation}
+\mathbf{a}\bullet\mathbf{b} = \sum\nolimits_{i = 0}^{n - 1} {a_i \cdot b_i }
+\end{equation}
+
+We can easily and directly implement the equation for the dot-product
+using higher-order functions:
+
+\begin{code}
+as *+* bs = foldl1 (+) (zipWith (*) as bs)
+\end{code}
+
+The \hs{zipWith} function is very similar to the \hs{map} function seen
+earlier: it takes a function and two vectors, and then applies the function to
+each of the elements in the two vectors pairwise (\emph{e.g.}, \hs{zipWith (*)
+[1, 2] [3, 4]} becomes \hs{[1 * 3, 2 * 4]}).
+
+The \hs{foldl1} function takes a binary function, a single vector, and applies
+the function to the first two elements of the vector. It then applies the
+function to the result of the first application and the next element in the
+vector. This continues until the end of the vector is reached. The result of
+the \hs{foldl1} function is the result of the last application. It is obvious
+that the \hs{zipWith (*)} function is pairwise multiplication and that the
+\hs{foldl1 (+)} function is summation.
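+
+The two steps can be traced on a small example. The sketch below is plain
+Haskell on lists rather than vectors, and the names \hs{products} and
+\hs{dotProduct} are introduced only for this illustration. Note that
+\hs{foldl1} requires a non-empty list:
+
+\begin{code}
+-- Dot-product on lists, following the definition in the text.
+(*+*) :: Num a => [a] -> [a] -> a
+as *+* bs = foldl1 (+) (zipWith (*) as bs)
+
+-- Step 1: pairwise multiplication.
+products :: [Int]
+products = zipWith (*) [1, 2, 3] [4, 5, 6]      -- [4, 10, 18]
+
+-- Step 2: summation of the products, (4 + 10) + 18.
+dotProduct :: Int
+dotProduct = [1, 2, 3] *+* [4, 5, 6]            -- 32
+\end{code}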
+
+Returning to the actual FIR filter, we will slightly change the equation
+describing it, so as to make the translation to code more obvious and concise.
+What we do is change the definition of the vector of input samples and delay
+the computation by one sample. Instead of having the input sample received at
+time $t$ stored in $x_t$, $x_0$ now always stores the newest sample, and $x_i$
+stores the $i$th previous sample. This changes the equation to the following
+(note that this is completely equivalent to the original equation, just with a
+different definition of $x$ that will better suit the transformation to code):
+
+\begin{equation}
+y_t = \sum\nolimits_{i = 0}^{n - 1} {x_i \cdot h_i }
+\end{equation}
+
+The complete definition of the FIR filter in code then becomes:
+
+\begin{code}
+fir (State (xs,hs)) x = (State (x >> xs,hs), xs *+* hs)
+\end{code}
+
+Here, the vector \hs{hs} contains the FIR coefficients and the vector \hs{xs}
+contains the latest input sample in front and older samples behind. The code
+for the shift (\hs{>>}) operator that adds the new input sample (\hs{x}) to
+the list of previous input samples (\hs{xs}) and removes the oldest sample is
+shown below:
+
+\begin{code}
+x >> xs = x +> init xs
+\end{code}
+
+The \hs{init} function returns all but the last element of a vector, and the
+concatenate operator ($\succ$) adds a new element to the front of a vector. The
+resulting netlist of a 4-tap FIR filter, created by specializing the vectors
+of the above definition to a length of 4, is depicted in \Cref{img:4tapfir}.
+
+\begin{figure}
+\centerline{\includegraphics{4tapfir.svg}}
+\caption{4-tap \acrotiny{FIR} Filter}
+\label{img:4tapfir}
+\end{figure}
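+
+The behaviour of the filter can be checked with a small simulation. The
+sketch below is plain Haskell under several assumptions: lists replace the
+fixed-size vectors, the \hs{State} wrapper is modelled as a
+\hs{newtype}, the operators are redefined on lists (which requires hiding
+the Prelude's own \hs{>>}), and the \hs{run} function is given a base case
+for finite inputs. The coefficients and input samples are arbitrary
+illustrative values:
+
+\begin{code}
+import Prelude hiding ((>>))
+
+-- Assumed model of the State wrapper.
+newtype State s = State s
+
+-- List-based stand-ins for the vector operators used in the text.
+(+>) :: a -> [a] -> [a]
+x +> xs = x : xs
+
+(>>) :: a -> [a] -> [a]
+x >> xs = x +> init xs
+
+(*+*) :: Num a => [a] -> [a] -> a
+as *+* bs = foldl1 (+) (zipWith (*) as bs)
+
+fir :: State ([Int], [Int]) -> Int -> (State ([Int], [Int]), Int)
+fir (State (xs, hs)) x = (State (x >> xs, hs), xs *+* hs)
+
+run :: (s -> i -> (s, o)) -> s -> [i] -> [o]
+run _ _ []         = []
+run f s (i : inps) = o : run f s' inps
+  where
+    (s', o) = f s i
+
+-- Four taps with all coefficients 1: a moving sum over the last four samples.
+movingSum :: [Int]
+movingSum = run fir (State ([0, 0, 0, 0], [1, 1, 1, 1])) [1, 2, 3, 4, 5]
+-- [0, 1, 3, 6, 10]
+\end{code}
+
+The first output is 0 because each output is computed from the samples
+stored before the current input is shifted in, which is the one-sample
+delay introduced when the equation was rewritten.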
+
+\subsection{Higher-order CPU}
+
+\begin{code}
+-- A functional unit: applies op to two operands selected from inputs.
+-- The result is stored in the state, so the output lags by one cycle.
+type FuState = State Word
+fu :: (a -> a -> a)
+   -> [a|n]
+   -> (RangedWord n, RangedWord n)
+   -> FuState
+   -> (FuState, a)
+fu op inputs (addr1, addr2) (State out) =
+  (State out', out)
+  where
+    in1  = inputs!addr1
+    in2  = inputs!addr2
+    out' = op in1 in2
+
+\begin{code}
+-- Four functional units; each reads its operands from the shared inputs
+-- vector: the constants 0 and 1, the external input, and all unit outputs.
+type CpuState = State [FuState|4]
+cpu :: Word
+    -> [(RangedWord 7, RangedWord 7)|4]
+    -> CpuState
+    -> (CpuState, Word)
+cpu input addrs (State fuss) =
+  (State fuss', out)
+  where
+    fures = [ fu const inputs (addrs!0) (fuss!0)
+            , fu (+)   inputs (addrs!1) (fuss!1)
+            , fu (-)   inputs (addrs!2) (fuss!2)
+            , fu (*)   inputs (addrs!3) (fuss!3)
+            ]
+    (fuss', outputs) = unzip fures
+    inputs = 0 +> 1 +> input +> outputs
+    out = head outputs