- completely new type, for which we provide the \VHDL\ translation
- below. Type synonyms and renamings only define new names for
- existing types (where synonyms are completely interchangeable and
- renamings need explicit conversion). Therefore, these do not need
- any particular \VHDL\ translation, a synonym or renamed type will
- just use the same representation as the original type. The
- distinction between a renaming and a synonym does no longer matter
- in hardware and can be disregarded in the generated \VHDL.
-
- For algebraic types, we can make the following distinction:
-
- \begin{description}
-
- \item[Product types]
- A product type is an algebraic datatype with a single constructor with
- two or more fields, denoted in practice like (a,b), (a,b,c), etc. This
- is essentially a way to pack a few values together in a record-like
- structure. In fact, the built-in tuple types are just algebraic product
- types (and are thus supported in exactly the same way).
-
- The \quote{product} in its name refers to the collection of values
- belonging to this type. The collection for a product type is the
- Cartesian product of the collections for the types of its fields.
-
- These types are translated to \VHDL\ record types, with one field for
- every field in the constructor. This translation applies to all single
- constructor algebraic data-types, including those with just one
- field (which are technically not a product, but generate a VHDL
- record for implementation simplicity).
- \item[Enumerated types]
- An enumerated type is an algebraic datatype with multiple constructors, but
- none of them have fields. This is essentially a way to get an
- enumeration-like type containing alternatives.
-
- Note that Haskell's \hs{Bool} type is also defined as an
- enumeration type, but we have a fixed translation for that.
-
- These types are translated to \VHDL\ enumerations, with one value for
- each constructor. This allows references to these constructors to be
- translated to the corresponding enumeration value.
- \item[Sum types]
- A sum type is an algebraic datatype with multiple constructors, where
- the constructors have one or more fields. Technically, a type with
- more than one field per constructor is a sum of products type, but
- for our purposes this distinction does not really make a
- difference, so this distinction is note made.
-
- The \quote{sum} in its name refers again to the collection of values
- belonging to this type. The collection for a sum type is the
- union of the the collections for each of the constructors.
-
- Sum types are currently not supported by the prototype, since there is
- no obvious \VHDL\ alternative. They can easily be emulated, however, as
- we will see from an example:
-
- \begin{verbatim}
- data Sum = A Bit Word | B Word
- \end{verbatim}
-
- An obvious way to translate this would be to create an enumeration to
- distinguish the constructors and then create a big record that
- contains all the fields of all the constructors. This is the same
- translation that would result from the following enumeration and
- product type (using a tuple for clarity):
-
- \begin{verbatim}
- data SumC = A | B
- type Sum = (SumC, Bit, Word, Word)
- \end{verbatim}
-
- Here, the \hs{SumC} type effectively signals which of the latter three
- fields of the \hs{Sum} type are valid (the first two if \hs{A}, the
- last one if \hs{B}), all the other ones have no useful value.
-
- An obvious problem with this naive approach is the space usage: the
- example above generates a fairly big \VHDL\ type. Since we can be
- sure that the two \hs{Word}s in the \hs{Sum} type will never be valid
- at the same time, this is a waste of space.
-
- Obviously, duplication detection could be used to reuse a
- particular field for another constructor, but this would only
- partially solve the problem. If two fields would be, for
- example, an array of 8 bits and an 8 bit unsigned word, these are
- different types and could not be shared. However, in the final
- hardware, both of these types would simply be 8 bit connections,
- so we have a 100\% size increase by not sharing these.
- \end{description}
-
-
-\section{\CLaSH\ prototype}
-
-foo\par bar
+ completely new type. Type synonyms and type renaming only define new
+ names for existing types, where synonyms are completely interchangeable
+ and type renaming requires explicit conversions. Therefore, these do not
+ need any particular translation: a synonym or renamed type will just use
+ the same representation as the original type. For algebraic types, we can
+ make the following distinctions:
+
+ \begin{xlist}
+ \item[\bf{Single constructor}]
+ Algebraic datatypes with a single constructor with one or more
+ fields are essentially a way to pack a few values together in a
+ record-like structure. Haskell's built-in tuple types are also defined
+ as single-constructor algebraic types. An example of a single-constructor
+ type is the following pair of integers:
+ \begin{code}
+ data IntPair = IntPair Int Int
+ \end{code}
+ % These types are translated to \VHDL\ record types, with one field
+ % for every field in the constructor.
+ \item[\bf{No fields}]
+ Algebraic datatypes with multiple constructors, but without any
+ fields are essentially a way to get an enumeration-like type
+ containing alternatives. Note that Haskell's \hs{Bool} type is also
+ defined as an enumeration type, but that there is a fixed translation for
+ that type within the \CLaSH\ compiler. An example of such an
+ enumeration type is the type that represents the colors in a traffic
+ light:
+ \begin{code}
+ data TrafficLight = Red | Orange | Green
+ \end{code}
+ % These types are translated to \VHDL\ enumerations, with one
+ % value for each constructor. This allows references to these
+ % constructors to be translated to the corresponding enumeration
+ % value.
+ \item[\bf{Multiple constructors with fields}]
+ Algebraic datatypes with multiple constructors, where at least
+ one of these constructors has one or more fields, are currently not
+ supported.
+ \end{xlist}
+
+ \subsection{Polymorphism}
+ A powerful feature of most (functional) programming languages is
+ polymorphism: it allows a function to handle values of different data
+ types in a uniform way. Haskell supports \emph{parametric
+ polymorphism}~\cite{polymorphism}, meaning functions can be written
+ without mention of any specific type and can be used transparently with
+ any number of new types.
+
+ As an example of a parametric polymorphic function, consider the type of
+ the following \hs{append} function, which appends an element to a vector:
+
+ \begin{code}
+ append :: [a|n] -> a -> [a|n + 1]
+ \end{code}
+
+ This type is parameterized by \hs{a}, which can be instantiated to any
+ type at all. This means that \hs{append} can append an element to a vector,
+ regardless of the type of the elements in the vector (as long as the
+ value to be added is of the same type as the values in the vector).
+ This kind of polymorphism is extremely useful in hardware designs to make
+ operations work on a vector without knowing exactly what elements are
+ inside, routing signals without knowing exactly what kinds of signals
+ these are, or working with a vector without knowing exactly how long it
+ is. Polymorphism also plays an important role in most higher order
+ functions, as we will see in the next section.
+
+ Another type of polymorphism is \emph{ad-hoc
+ polymorphism}~\cite{polymorphism}, which refers to polymorphic
+ functions which can be applied to arguments of different types, but which
+ behave differently depending on the type of the argument to which they are
+ applied. In Haskell, ad-hoc polymorphism is achieved through the use of
+ type classes, where a class definition provides the general interface of a
+ function, and class instances define the functionality for the specific
+ types. An example of such a type class is the \hs{Num} class, which
+ contains all of Haskell's numerical operations. A designer can make use
+ of this ad-hoc polymorphism by adding a constraint to a parametrically
+ polymorphic type variable. Such a constraint indicates that the type
+ variable can only be instantiated to a type whose members support the
+ overloaded functions associated with the type class.
+
+ As an example we will take a look at the type signature of the function
+ \hs{sum}, which sums the values in a vector:
+ \begin{code}
+ sum :: Num a => [a|n] -> a
+ \end{code}
+
+ This type is again parameterized by \hs{a}, but it can only contain
+ types that are \emph{instances} of the \emph{type class} \hs{Num}, so that
+ we know that the addition (+) operator is defined for that type.
+ \CLaSH's built-in numerical types are also instances of the \hs{Num}
+ class, so we can use the addition operator on \hs{SizedWords} as
+ well as on \hs{SizedInts}.
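+
+ The effect of such a class constraint can be tried out in plain Haskell.
+ The sketch below is an illustration only: it uses ordinary lists instead
+ of \CLaSH\ vectors, and the name \hs{sumList} is hypothetical, chosen here
+ to avoid a clash with the Prelude's own \hs{sum}.
+
+ \begin{code}
+ -- Works for every element type that is an instance of Num.
+ sumList :: Num a => [a] -> a
+ sumList = foldr (+) 0
+
+ -- The same definition used at two different numerical types.
+ ints :: Int
+ ints = sumList [1, 2, 3]
+
+ doubles :: Double
+ doubles = sumList [0.5, 1.5]
+ \end{code}
+
+ Applying \hs{sumList} to a list of, say, \hs{Bool} values is rejected by
+ the type checker, because \hs{Bool} is not an instance of \hs{Num}.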
+
+ In \CLaSH, parametric polymorphism is completely supported. Any function
+ defined can have any number of unconstrained type parameters. The \CLaSH\
+ compiler will infer the type of every such argument depending on how the
+ function is applied. There is however one constraint: the top level
+ function that is being translated cannot have any polymorphic arguments.
+ The arguments cannot be polymorphic as the top level function is never
+ applied to actual arguments, and consequently there is no way to
+ determine the actual types for the type parameters.
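+
+ As a small illustration in plain Haskell (not specific to \CLaSH; the
+ names \hs{swapPair} and \hs{top} are hypothetical), a polymorphic helper
+ can be used freely as long as the function chosen as top level fixes all
+ types:
+
+ \begin{code}
+ -- Polymorphic helper: works for any types a and b.
+ swapPair :: (a, b) -> (b, a)
+ swapPair (x, y) = (y, x)
+
+ -- Hypothetical top-level function: every type is concrete, so the
+ -- type parameters of swapPair are determined by this application.
+ top :: (Bool, Int) -> (Int, Bool)
+ top = swapPair
+ \end{code}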
+
+ \CLaSH\ does not support user-defined type classes, but does use some
+ of the standard Haskell type classes for its built-in functions, such as:
+ \hs{Num} for numerical operations, \hs{Eq} for the equality operators, and
+ \hs{Ord} for the comparison/order operators.
+
+ \subsection{Higher-order functions \& values}
+ Another powerful abstraction mechanism in functional languages is
+ the concept of \emph{higher-order functions}, or \emph{functions as
+ a first class value}. This allows a function to be treated as a
+ value and be passed around, even as the argument of another
+ function. The following example should clarify this concept:
+
+ \begin{code}
+ negateVector xs = map not xs
+ \end{code}
+
+ The code above defines the \hs{negateVector} function, which takes a
+ vector of booleans, \hs{xs}, and returns a vector where all the values are
+ negated. It achieves this by calling the \hs{map} function, and passing it
+ \emph{another function}, boolean negation, and the vector of booleans,
+ \hs{xs}. The \hs{map} function applies the negation function to all the
+ elements in the vector.
+
+ The \hs{map} function is called a higher-order function, since it takes
+ another function as an argument. Also note that \hs{map} is again a
+ parametric polymorphic function: it does not pose any constraints on the
+ type of the vector elements, other than that it must be the same type as
+ the input type of the function passed to \hs{map}. The element type of the
+ resulting vector is equal to the return type of the function passed, which
+ need not necessarily be the same as the element type of the input vector.
+ All of these characteristics can readily be inferred from the type
+ signature belonging to \hs{map}:
+
+ \begin{code}
+ map :: (a -> b) -> [a|n] -> [b|n]
+ \end{code}
+
+ So far, only functions have been used as higher-order values. In
+ Haskell, there are two more ways to obtain a function-typed value:
+ partial application and lambda abstraction. Partial application
+ means that a function that takes multiple arguments can be applied
+ to a single argument, and the result will again be a function (but
+ that takes one argument less). As an example, consider the following
+ expression, that adds one to every element of a vector:
+
+ \begin{code}
+ map (+ 1) xs
+ \end{code}
+
+ Here, the expression \hs{(+ 1)} is the partial application of the
+ plus operator to the value \hs{1}, which is again a function that
+ adds one to its argument. A lambda expression allows one to introduce an
+ anonymous function in any expression. Consider the following expression,
+ which again adds one to every element of a vector:
+
+ \begin{code}
+ map (\x -> x + 1) xs
+ \end{code}
+
+ Finally, higher order arguments are not limited to just built-in
+ functions, but any function defined by a developer can have function
+ arguments. This allows the hardware designer to use a powerful
+ abstraction mechanism in his designs and have an optimal amount of
+ code reuse. The only exception is again the top-level function: if a
+ function-typed argument is not applied with an actual function, no
+ hardware can be generated.
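+
+ A user-defined higher-order function could look as follows. This is a
+ minimal plain-Haskell sketch on ordinary lists; the names \hs{twice} and
+ \hs{addTwo} are hypothetical and only serve as an illustration.
+
+ \begin{code}
+ -- A developer-defined higher-order function: applies f two times.
+ twice :: (a -> a) -> a -> a
+ twice f = f . f
+
+ -- twice receives the partial application (+ 1); the resulting
+ -- function is in turn passed to map.
+ addTwo :: [Int] -> [Int]
+ addTwo xs = map (twice (+ 1)) xs
+ \end{code}
+
+ Here \hs{addTwo [1, 2, 3]} evaluates to \hs{[3, 4, 5]}.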
+
+ % \comment{TODO: Describe ALU example (no code)}
+
+ \subsection{State}
+ A very important concept in hardware is the concept of state. In a
+ stateful design, the outputs depend on the history of the inputs, or the
+ state. State is usually stored in registers, which retain their value
+ during a clock cycle. As we want to describe more than simple
+ combinatorial designs, \CLaSH\ needs an abstraction mechanism for state.
+
+ An important property in Haskell, and in most other functional languages,
+ is \emph{purity}. A function is said to be \emph{pure} if it satisfies two
+ conditions:
+ \begin{inparaenum}
+ \item given the same arguments twice, it should return the same value in
+ both cases, and
+ \item when the function is called, it should not have observable
+ side-effects.
+ \end{inparaenum}
+ % This purity property is important for functional languages, since it
+ % enables all kinds of mathematical reasoning that could not be guaranteed
+ % correct for impure functions.
+ Pure functions are as such a perfect match for combinatorial circuits,
+ where the output solely depends on the inputs. When a circuit has state
+ however, it can no longer be simply described by a pure function.
+ % Simply removing the purity property is not a valid option, as the
+ % language would then lose many of it mathematical properties.
+ In \CLaSH\ we deal with the concept of state in pure functions by making
+ the current value of the state an additional argument of the function and
+ the updated state part of the result. In this sense the descriptions made
+ in \CLaSH\ are the combinatorial parts of a Mealy machine.
+
+ A simple example is adding an accumulator register to the earlier
+ multiply-accumulate circuit, of which the resulting netlist can be seen in
+ \Cref{img:mac-state}:
+
+ \begin{code}
+ macS (State c) a b = (State c', c')
+ where
+ c' = mac a b c
+ \end{code}
+
+ \begin{figure}
+ \centerline{\includegraphics{mac-state.svg}}
+ \caption{Stateful Multiply-Accumulate}
+ \label{img:mac-state}
+ \end{figure}
+
+ The \hs{State} keyword indicates which arguments are part of the current
+ state, and what part of the output is part of the updated state. This
+ aspect will also be reflected in the type signature of the function.
+ Abstracting the state of a circuit in this way makes it very explicit:
+ which variables are part of the state is completely determined by the
+ type signature. This approach to state is well suited to be used in
+ combination with the existing code and language features, such as all the
+ choice constructs, as state values are just normal values. We can simulate
+ stateful descriptions using the recursive \hs{run} function:
+
+ \begin{code}
+ run _ _ [] = []
+ run f s (i : inps) = o : (run f s' inps)
+ where
+ (s', o) = f s i
+ \end{code}
+
+ The \hs{(:)} operator is the list construction (cons) operator, where the
+ left-hand side is the head of a list and the right-hand side is the
+ remainder of the list. The \hs{run} function applies the function the
+ developer wants to simulate, \hs{f}, to the current state, \hs{s}, and the
+ first input value, \hs{i}. The result is the first output value, \hs{o},
+ and the updated state \hs{s'}. The next iteration of the \hs{run} function
+ is then called with the updated state, \hs{s'}, and the rest of the
+ inputs, \hs{inps}. It is assumed that there is one input per clock cycle.
+ Also note how the order of the input, output, and state in the \hs{run}
+ function corresponds with the order of the input, output and state of the
+ \hs{macS} function described earlier.
+
+ As the \hs{run} function, the hardware description, and the test
+ inputs are all plain Haskell, the complete simulation can be compiled to an
+ executable binary by an optimizing Haskell compiler, or executed in a
+ Haskell interpreter. Both simulation paths are much faster than first
+ translating the description to \VHDL\ and then running a \VHDL\
+ simulation, where the executable binary has an additional simulation speed
+ bonus in case there is a large set of test inputs.
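+
+ Such a simulation can be sketched in plain Haskell as follows. Several
+ parts are assumptions made for this illustration only: \hs{State} is
+ modelled as an ordinary \hs{newtype}, \hs{mac} is assumed to compute
+ \hs{a * b + c} on \hs{Int} values, the two inputs of \hs{macS} are packed
+ into a tuple so that there is one input value per clock cycle, and
+ \hs{run} is given an extra clause so that it stops when the inputs run
+ out.
+
+ \begin{code}
+ -- Assumption: a plain wrapper standing in for the State marker.
+ newtype State s = State s
+
+ -- Assumption: the combinatorial multiply-accumulate on Int values.
+ mac :: Int -> Int -> Int -> Int
+ mac a b c = a * b + c
+
+ -- Stateful multiply-accumulate; both inputs arrive as one tuple.
+ macS :: State Int -> (Int, Int) -> (State Int, Int)
+ macS (State c) (a, b) = (State c', c')
+   where
+     c' = mac a b c
+
+ -- Feed one input per clock cycle and collect the outputs.
+ run :: (s -> i -> (s, o)) -> s -> [i] -> [o]
+ run _ _ []         = []
+ run f s (i : inps) = o : run f s' inps
+   where
+     (s', o) = f s i
+
+ -- Three clock cycles, starting with an accumulator value of 0.
+ outputs :: [Int]
+ outputs = run macS (State 0) [(1, 2), (3, 4), (5, 6)]
+ \end{code}
+
+ With these inputs the accumulator takes the values 2, 14 and 44, so
+ \hs{outputs} evaluates to \hs{[2, 14, 44]}.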
+
+\section{\CLaSH\ compiler}
+An important aspect of this research is the creation of the prototype compiler, which translates descriptions made in the \CLaSH\ language, as described in the previous section, to synthesizable \VHDL, allowing a designer to actually run a \CLaSH\ design on an \acro{FPGA}.
+
+The Glasgow Haskell Compiler (\GHC) is an open-source Haskell compiler that also provides a high-level API to most of its internals. The availability of this high-level API obviated the need to design many of the tedious parts of the prototype compiler, such as the parser, semantic checker, and especially the type-checker. The parser, semantic checker, and type-checker together form the front-end of the prototype compiler pipeline, as depicted in \Cref{img:compilerpipeline}.
+
+\begin{figure}
+\centerline{\includegraphics{compilerpipeline.svg}}
+\caption{\CLaSHtiny\ compiler pipeline}
+\label{img:compilerpipeline}
+\end{figure}
+
+The output of the \GHC\ front-end is the original Haskell description translated to \emph{Core}~\cite{Sulzmann2007}, which is a smaller, functional, typed language that is easier to process than the larger Haskell language. A description in \emph{Core} can still contain properties which have no direct translation to hardware, such as polymorphic types and function-valued arguments. Such a description needs to be transformed to a \emph{normal form}, which only contains properties that have a direct translation. The second stage of the compiler, the \emph{normalization} phase, exhaustively applies a set of \emph{meaning-preserving} transformations to the \emph{Core} description until this description is in a \emph{normal form}. This set of transformations includes transformations typically found in reduction systems for lambda calculus, such as $\beta$-reduction and $\eta$-expansion, but also includes \emph{defunctionalization} transformations which reduce higher-order functions to `regular' first-order functions.
+
+The final step in the compiler pipeline is the translation to a \VHDL\ \emph{netlist}, which is a straightforward process due to the resemblance between a normalized description and a set of concurrent signal assignments. We call the end-product of the \CLaSH\ compiler a \VHDL\ \emph{netlist} as the resulting \VHDL\ resembles an actual netlist description and not idiomatic \VHDL.
+
+\section{Use cases}
+\label{sec:usecases}
+An example of a common hardware design where the use of higher-order
+functions leads to a very natural description is a FIR filter, which is
+basically the dot-product of two vectors:
+
+\begin{equation}
+y_t = \sum\nolimits_{i = 0}^{n - 1} {x_{t - i} \cdot h_i }
+\end{equation}
+
+A FIR filter multiplies fixed constants ($h$) with the current
+and a few previous input samples ($x$). The resulting products
+are summed to produce the result at time $t$. The equation of a FIR
+filter is indeed equivalent to the equation of the dot-product, which is
+shown below:
+
+\begin{equation}
+\mathbf{x}\bullet\mathbf{y} = \sum\nolimits_{i = 0}^{n - 1} {x_i \cdot y_i }
+\end{equation}
+
+We can easily and directly implement the equation for the dot-product
+using higher-order functions:
+
+\begin{code}
+xs *+* ys = foldl1 (+) (zipWith (*) xs ys)
+\end{code}
+
+The \hs{zipWith} function is very similar to the \hs{map} function seen
+earlier: it takes a function and two vectors, and then applies the function to
+each of the elements in the two vectors pairwise (\emph{e.g.}, \hs{zipWith (*)
+[1, 2] [3, 4]} becomes \hs{[1 * 3, 2 * 4]} $\equiv$ \hs{[3,8]}).
+
+The \hs{foldl1} function takes a function, a single vector, and applies
+the function to the first two elements of the vector. It then applies the
+function to the result of the first application and the next element from
+the vector. This continues until the end of the vector is reached. The
+result of the \hs{foldl1} function is the result of the last application.
+As you can see, the \hs{zipWith (*)} function is just pairwise
+multiplication and the \hs{foldl1 (+)} function is just summation.
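+
+The dot-product can be tried out in plain Haskell on ordinary lists. The
+sketch below is an illustration only; the name \hs{dot} is hypothetical
+and stands in for the \hs{*+*} operator.
+
+\begin{code}
+-- Pairwise multiplication followed by summation.
+-- Note: foldl1 is undefined for empty lists, so both arguments
+-- are assumed to be non-empty.
+dot :: Num a => [a] -> [a] -> a
+dot xs ys = foldl1 (+) (zipWith (*) xs ys)
+
+example :: Int
+example = dot [1, 2] [3, 4]
+\end{code}
+
+Here \hs{example} evaluates to \hs{1 * 3 + 2 * 4}, which is \hs{11}.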
+
+Returning to the actual FIR filter, we will slightly change the
+equation belonging to it, so as to make the translation to code more obvious.
+What we will do is change the definition of the vector of input samples.
+So, instead of having the input sample received at time
+$t$ stored in $x_t$, $x_0$ now always stores the current sample, and $x_i$
+stores the $i$th previous sample. This changes the equation to the
+following (note that this is completely equivalent to the original
+equation, just with a different definition of $x$ that will better suit
+the transformation to code):
+
+\begin{equation}
+y_t = \sum\nolimits_{i = 0}^{n - 1} {x_i \cdot h_i }
+\end{equation}
+
+Consider that the vector \hs{hs} contains the FIR coefficients and the
+vector \hs{xs} contains the current input sample in front and older
+samples behind. The function that shifts the input samples is shown below:
+
+\begin{code}
+x >> xs = x +> init xs
+\end{code}
+
+Where the \hs{init} function returns all but the last element of a
+vector, and the concatenate operator ($\succ$) adds a new element to the
+left of a vector. The complete definition of the FIR filter then becomes:
+
+\begin{code}
+fir (State (xs,hs)) x = (State (x >> xs,hs), xs *+* hs)
+\end{code}
+
+The resulting netlist of a 4-tap FIR filter based on the above definition
+is depicted in \Cref{img:4tapfir}.
+
+\begin{figure}
+\centerline{\includegraphics{4tapfir.svg}}
+\caption{4-tap \acrotiny{FIR} Filter}
+\label{img:4tapfir}
+\end{figure}
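+
+The behaviour of this filter can be checked with a plain-Haskell sketch on
+ordinary lists. The following parts are assumptions made for this
+illustration only: \hs{State} is modelled as an ordinary \hs{newtype}, the
+hypothetical names \hs{dot} and \hs{shiftInto} replace the \hs{*+*} and
+\hs{>>} operators (the latter would clash with the Prelude), the shift
+drops the oldest sample at the back of the vector, and \hs{run} has an
+extra clause for the end of the input.
+
+\begin{code}
+-- Assumption: a plain wrapper standing in for the State marker.
+newtype State s = State s
+
+dot :: [Int] -> [Int] -> Int
+dot xs ys = foldl1 (+) (zipWith (*) xs ys)
+
+-- Shift a new sample in at the front, drop the oldest at the back.
+shiftInto :: a -> [a] -> [a]
+shiftInto x xs = x : init xs
+
+fir :: State ([Int], [Int]) -> Int -> (State ([Int], [Int]), Int)
+fir (State (xs, hs)) x = (State (shiftInto x xs, hs), dot xs hs)
+
+run :: (s -> i -> (s, o)) -> s -> [i] -> [o]
+run _ _ []         = []
+run f s (i : inps) = o : run f s' inps
+  where
+    (s', o) = f s i
+
+-- Impulse response of a 4-tap filter with coefficients 1, 2, 3, 4.
+impulseResponse :: [Int]
+impulseResponse =
+  run fir (State ([0, 0, 0, 0], [1, 2, 3, 4])) [1, 0, 0, 0, 0]
+\end{code}
+
+As the output is computed from the stored samples, the coefficients
+appear one clock cycle after the impulse: \hs{impulseResponse} evaluates
+to \hs{[0, 1, 2, 3, 4]}.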
+
+
+\subsection{Higher-order CPU}
+
+
+\begin{code}
+type FuState = State Word
+fu :: (a -> a -> a)
+ -> [a|n]
+ -> (RangedWord n, RangedWord n)
+ -> FuState
+ -> (FuState, a)
+fu op inputs (addr1, addr2) (State out) =
+ (State out', out)
+ where
+ in1 = inputs!addr1
+ in2 = inputs!addr2
+ out' = op in1 in2
+\end{code}
+
+\begin{code}
+type CpuState = State [FuState|4]
+cpu :: Word
+ -> [(RangedWord 7, RangedWord 7)|4]
+ -> CpuState
+ -> (CpuState, Word)
+cpu input addrs (State fuss) =
+ (State fuss', out)
+ where
+ fures = [ fu const inputs (addrs!0) (fuss!0)
+ , fu (+) inputs (addrs!1) (fuss!1)
+ , fu (-) inputs (addrs!2) (fuss!2)
+ , fu (*) inputs (addrs!3) (fuss!3)
+ ]
+ (fuss', outputs) = unzip fures
+ inputs = 0 +> 1 +> input +> outputs
+ out = head outputs
+\end{code}