%
\documentclass[conference,pdf,a4paper,10pt,final,twoside,twocolumn]{IEEEtran}
+\IEEEoverridecommandlockouts
% Add the compsoc option for Computer Society conferences.
%
% If IEEEtran.cls has not been installed into the LaTeX system files,
\IEEEauthorblockA{%Computer Architecture for Embedded Systems (CAES)\\
Department of EEMCS, University of Twente\\
P.O. Box 217, 7500 AE, Enschede, The Netherlands\\
-c.p.r.baaij@@utwente.nl, matthijs@@stdin.nl, j.kuper@@utwente.nl}}
+c.p.r.baaij@@utwente.nl, matthijs@@stdin.nl, j.kuper@@utwente.nl}
+\thanks{Supported through the FP7 project: S(o)OS (248465)}
+}
% \and
% \IEEEauthorblockN{Homer Simpson}
% \IEEEauthorblockA{Twentieth Century Fox\\
\begin{abstract}
%\boldmath
\CLaSH\ is a functional hardware description language that borrows both its
-syntax and semantics from the functional programming language Haskell. Circuit
-descriptions can be translated to synthesizable VHDL using the prototype
-\CLaSH\ compiler. As the circuit descriptions are made in plain Haskell,
-simulations can also be compiled by a Haskell compiler.
-
-The use of polymorphism and higher-order functions allow a circuit designer to
-describe more abstract and general specifications than are possible in the
-traditional hardware description languages.
+syntax and semantics from the functional programming language Haskell. Due to
+the abstraction and generality offered by polymorphism and higher-order
+functions, a circuit designer can describe circuits in a more natural way than
+he could in the traditional hardware description languages.
+
+Circuit descriptions can be translated to synthesizable VHDL using the
+prototype \CLaSH\ compiler. As the circuit descriptions, simulation code, and
+test input are plain Haskell, complete simulations can be compiled to an
+executable binary by a Haskell compiler allowing high-speed simulation and
+analysis.
+
+Stateful descriptions are supported by explicitly making the current state an
+argument of the function, and the updated state part of the result. In this
+sense, the descriptions made in \CLaSH\ are the combinational parts of a mealy
+machine.
\end{abstract}
% IEEEtran.cls defaults to using nonbold math in the Abstract.
% This preserves the distinction between vectors and scalars. However,
be converted to an actual netlist format by an (optimizing) \VHDL\ synthesis
tool.
-Besides trivial circuits such as variants of both the FIR filter and the
-simple CPU shown in \Cref{sec:usecases}, the \CLaSH\ compiler has also been
-shown to work for non-trivial descriptions. \CLaSH\ has been able to
+Besides trivial circuits such as variants of both the \acro{FIR} filter and
+the simple \acro{CPU} shown in \Cref{sec:usecases}, the \CLaSH\ compiler has
+also been shown to work for non-trivial descriptions. \CLaSH\ has been able to
successfully translate the functional description of a streaming reduction
circuit~\cite{reductioncircuit} for floating point numbers.
compiler. The \CLaSH\ compiler has generic translation rules to
translated the user-defined types described below.
- The \CLaSH compiler is able to infer unspecified types,
+ The \CLaSH\ compiler is able to infer unspecified types,
meaning that a developer does not have to annotate every function with a
- type signature (though it is good practice to do so anyway).
+ type signature (even if it is good practice to do so).
% Translation of two most basic functional concepts has been
% discussed: function application and choice. Before looking further
the rest of paper is: \hs{[a|n]}. Where the \hs{a} is the element
type, and \hs{n} is the length of the vector. Note that this is
a notation used in this paper only, vectors are slightly more
- elaborate in real \CLaSH programs.
+ verbose in real \CLaSH\ descriptions.
% The state type of an 8 element register bank would then for example
% be:
which have no direct translation to hardware, such as polymorphic types and
function-valued arguments. Such a description needs to be transformed to a
\emph{normal form}, which only contains properties that have a direct
-translation. The second stage of the compiler, the \emph{normalization} phase
+translation. The second stage of the compiler, the \emph{normalization} phase,
exhaustively applies a set of \emph{meaning-preserving} transformations on the
\emph{Core} description until this description is in a \emph{normal form}.
This set of transformations includes transformations typically found in
\VHDL\ resembles an actual netlist description and not idiomatic \VHDL.
\section{Use cases}
-
-\subsection{FIR Filter}
\label{sec:usecases}
+\subsection{FIR Filter}
As an example of a common hardware design where the use of higher-order
functions leads to a very natural description is a \acro{FIR} filter, which is
basically the dot-product of two vectors:
\end{code}
Where the vector \hs{hs} contains the \acro{FIR} coefficients and the vector
-\hs{xs} contains the latest input sample in front and older samples behind.
-The code for the shift (\hs{>>}) operator that adds the new input sample
+\hs{xs} contains the previous input sample in front and older samples behind.
+The code for the shift (\hs{>>}) operator, that adds the new input sample
(\hs{x}) to the list of previous input samples (\hs{xs}) and removes the
-oldest sample is shown below:
+oldest sample, is shown below:
\begin{code}
x >> xs = x +> init xs
\end{figure}
\subsection{Higher order CPU}
+The following simple CPU is an example of user-defined higher order
+functions and pattern matching. The CPU consists of four function units,
+of which three have a fixed function and one can perform some less
+common operations.
+
+The CPU contains a number of data sources, represented by the horizontal
+lines in figure TODO:REF. These data sources offer the previous outputs
+of each function units, along with the single data input the cpu has and
+two fixed intialization values.
+
+Each of the function units has both its operands connected to all data
+sources, and can be programmed to select any data source for either
+operand. In addition, the leftmost function unit has an additional
+opcode input to select the operation it performs. Its output is also the
+output of the entire cpu.
+
+Looking at the code, the function unit is the most simple. It arranges
+the operand selection for the function unit. Note that it does not
+define the actual operation that takes place inside the function unit,
+but simply accepts the (higher order) argument "op" which is a function
+of two arguments that defines the operation.
\begin{code}
-fu :: (a -> a -> a)
- -> [a | n]
- -> (Index (n - 1), Index (n - 1))
- -> a
- -> (a, a)
-fu op inputs (addr1, addr2) (State out) =
- (State out', out)
+fu op inputs (addr1, addr2) = regIn
where
- in1 = inputs!addr1
- in2 = inputs!addr2
- out' = op in1 in2
+ in1 = inputs!addr1
+ in2 = inputs!addr2
+ regIn = op in1 in2
+\end{code}
+
+The multiop function defines the operation that takes place in the
+leftmost function unit. It is essentially a simple three operation alu
+that makes good use of pattern matching and guards in its description.
+The \hs{shift} function used here shifts its first operand by the number
+of bits indicated in the second operand, the \hs{xor} function produces
+the bitwise xor of its operands.
+
+\begin{code}
+data Opcode = Shift | Xor | Equal
+
+multiop :: Opcode -> Word -> Word -> Word
+multiop opc a b = case opc of
+ Shift -> shift a b
+ Xor -> xor a b
+ Equal | a == b -> 1
+ | otherwise -> 0
\end{code}
+The cpu function ties everything together. It applies the \hs{fu}
+function four times, to create a different function unit each time. The
+first application is interesting, because it does not just pass a
+function to \hs{fu}, but a partial application of \hs{multiop}. This
+shows how the first funcition unit effectively gets an extra input,
+compared to the others.
+
+The vector \hs{inputs} is the set of data sources, which is passed to
+each function unit for operand selection. The cpu also receives a vector
+of address pairs, which are used by each function unit to select their
+operand. The application of the function units to the \hs{inputs} and
+\hs{addrs} arguments seems quite repetive and could be rewritten to use
+a combination of the \hs{map} and \hs{zipwith} functions instead.
+However, the prototype does not currently support working with lists of
+functions, so the more explicit version of the code is given instead).
+
\begin{code}
type CpuState = State [Word | 4]
-cpu :: Word
- -> [(Index 6, Index 6) | 4]
- -> CpuState
- -> (CpuState, Word)
-cpu input addrs (State fuss) = (State fuss', out)
+cpu :: CpuState -> Word -> [(Index 6, Index 6) | 4]
+ -> Opcode -> (CpuState, Word)
+cpu (State s) input addrs opc = (State s', out)
where
- fures = [ fu const inputs (addrs!0) (fuss!0)
- , fu (+) inputs (addrs!1) (fuss!1)
- , fu (-) inputs (addrs!2) (fuss!2)
- , fu (*) inputs (addrs!3) (fuss!3)
+ s' = [ fu (multiop opc) inputs (addrs!0)
+ , fu add inputs (addrs!1)
+ , fu sub inputs (addrs!2)
+ , fu mul inputs (addrs!3)
]
- (fuss', outputs) = unzip fures
- inputs = 0 +> (1 +> (input +> outputs))
- out = head outputs
+ inputs = 0 +> (1 +> (input +> s))
+ out = head s'
\end{code}
+Of course, this is still a simple example, but it could form the basis
+of an actual design, in which the same techniques can be reused.
+
\section{Related work}
+This section describes the features of existing (functional) hardware
+description languages and highlights the advantages that this research has
+over existing work.
+
Many functional hardware description languages have been developed over the
years. Early work includes such languages as $\mu$\acro{FP}~\cite{muFP}, an
extension of Backus' \acro{FP} language to synchronous streams, designed
functional language \acro{ML}, and has support for polymorphic types and
higher-order functions. Published work suggests that there is no direct
simulation support for \acro{HML}, but that a description in \acro{HML} has to
-be translated to \VHDL\ and that the translated description can than be
+be translated to \VHDL\ and that the translated description can then be
simulated in a \VHDL\ simulator. Also not all of the mentioned language
features of \acro{HML} could be translated to hardware. The \CLaSH\ compiler
on the other hand can correctly translate all of the language constructs
The merits of polymorphic typing, combined with higher-order functions, are
now also recognized in the `main-stream' hardware description languages,
-exemplified by the new \VHDL-2008 standard~\cite{VHDL2008}. \VHDL-2008 support for generics has been extended to types, allowing a developer to describe
-polymorphic components. Note that those types still require an explicit
-generic map, whereas types can be automatically inferred in \CLaSH.
+exemplified by the new \VHDL-2008 standard~\cite{VHDL2008}. \VHDL-2008 support
+for generics has been extended to types and subprograms, allowing a developer to describe components with polymorphic ports and function-valued arguments. Note that the types and subprograms still require an explicit generic map, whereas types can be automatically inferred, and function-values can be automatically propagated by the \CLaSH\ compiler. There are also no (generally available) \VHDL\ synthesis tools that currently support the \VHDL-2008 standard, and thus the synthesis of polymorphic types and function-valued arguments.
% Wired~\cite{Wired},, T-Ruby~\cite{T-Ruby}, Hydra~\cite{Hydra}.
%
\section{Conclusion}
-The conclusion goes here.
-
-
-
+This research demonstrates once more that functional languages are well suited
+for hardware descriptions: function applications provide an elegant notation
+for component instantiation. Where this research goes beyond the existing
+(functional) hardware descriptions languages is the inclusion of various
+choice elements, such as patter matching, that are well suited to describe the
+conditional assignments in control-oriented hardware. Besides being able to
+translate these basic constructs to synthesizable \VHDL, the prototype
+compiler can also correctly translate descriptions that contain both
+polymorphic types and function-valued arguments.
+
+Where recent functional hardware description languages have mostly opted to
+embed themselves in an existing functional language, this research features a
+`true' compiler. As a result there is a clear distinction between compile-time
+and run-time, which allows a myriad of choice constructs to be part of the
+actual circuit description; a feature the embedded hardware description
+languages do not offer.
+
+\section{Future Work}
+The choice of describing state explicitly as extra arguments and results can
+be seen as a mixed blessing. Even though the description that use state are
+usually very clear, one finds that dealing with unpacking, passing, receiving
+and repacking can become tedious and even error-prone, especially in the case
+of sub-states. Removing this boilerplate, or finding a more suitable
+abstraction mechanism would make \CLaSH\ easier to use.
+
+The transformations in normalization phase of the prototype compiler were
+developed in an ad-hoc manner, which makes the existence of many desirable
+properties unclear. Such properties include whether the complete set of
+transformations will always lead to a normal form or if the normalization
+process always terminates. Though various use cases suggests that these
+properties usually hold, they have not been formally proven. A systematic
+approach to defining the set of transformations allows one to proof that the
+earlier mentioned properties do indeed exist.
% conference papers do not normally have an appendix
% use section* for acknowledgement
-\section*{Acknowledgment}
-
-
-The authors would like to thank...
-
-
-
-
+% \section*{Acknowledgment}
+%
+% The authors would like to thank...
% trigger a \newpage just before the given reference
% number - used to balance the columns on the last page
% adjust value as needed - may need to be readjusted if
% the document is modified later
-%\IEEEtriggeratref{8}
+\IEEEtriggeratref{14}
% The "triggered" command can be changed if desired:
%\IEEEtriggercmd{\enlargethispage{-5in}}