Use function names instead of operators in the higher order CPU.

[matthijs/master-project/dsd-paper.git] / cλash.lhs
diff --git a/cλash.lhs b/cλash.lhs

index 3ac0a7ceb6a1c80b0b4cc3cd76655a0c5d6dadc4..447e3ac171045a47d42c71f5f61853a4d6f07e18 100644 (file)
--- a/cλash.lhs
+++ b/cλash.lhs
@@ -65,6 +65,7 @@
  %
  
  \documentclass[conference,pdf,a4paper,10pt,final,twoside,twocolumn]{IEEEtran}
+\IEEEoverridecommandlockouts
  % Add the compsoc option for Computer Society conferences.
  %
  % If IEEEtran.cls has not been installed into the LaTeX system files,
@@ -398,7 +399,9 @@
  \IEEEauthorblockA{%Computer Architecture for Embedded Systems (CAES)\\ 
  Department of EEMCS, University of Twente\\
  P.O. Box 217, 7500 AE, Enschede, The Netherlands\\
-c.p.r.baaij@@utwente.nl, matthijs@@stdin.nl, j.kuper@@utwente.nl}}
+c.p.r.baaij@@utwente.nl, matthijs@@stdin.nl, j.kuper@@utwente.nl}
+\thanks{Supported through the FP7 project: S(o)OS (248465)}
+}
  % \and
  % \IEEEauthorblockN{Homer Simpson}
  % \IEEEauthorblockA{Twentieth Century Fox\\
@@ -449,14 +452,21 @@ c.p.r.baaij@@utwente.nl, matthijs@@stdin.nl, j.kuper@@utwente.nl}}
  \begin{abstract}
  %\boldmath
  \CLaSH\ is a functional hardware description language that borrows both its 
-syntax and semantics from the functional programming language Haskell. Circuit 
-descriptions can be translated to synthesizable VHDL using the prototype 
-\CLaSH\ compiler. As the circuit descriptions are made in plain Haskell, 
-simulations can also be compiled by a Haskell compiler.
-
-The use of polymorphism and higher-order functions allow a circuit designer to 
-describe more abstract and general specifications than are possible in the 
-traditional hardware description languages.
+syntax and semantics from the functional programming language Haskell. Due to 
+the abstraction and generality offered by polymorphism and higher-order 
+functions, a circuit designer can describe circuits in a more natural way than 
+he could in the traditional hardware description languages.
+
+Circuit descriptions can be translated to synthesizable VHDL using the 
+prototype \CLaSH\ compiler. As the circuit descriptions, simulation code, and 
+test input are plain Haskell, complete simulations can be compiled to an 
+executable binary by a Haskell compiler allowing high-speed simulation and 
+analysis.
+
+Stateful descriptions are supported by explicitly making the current state an 
+argument of the function, and the updated state part of the result. In this 
+sense, the descriptions made in \CLaSH\ are the combinational parts of a mealy 
+machine.
  \end{abstract}
  % IEEEtran.cls defaults to using nonbold math in the Abstract.
  % This preserves the distinction between vectors and scalars. However,
@@ -548,9 +558,9 @@ the Haskell code to equivalently behaving synthesizable \VHDL\ code, ready to
  be converted to an actual netlist format by an (optimizing) \VHDL\ synthesis 
  tool.
  
-Besides trivial circuits such as variants of both the FIR filter and the 
-simple CPU shown in \Cref{sec:usecases}, the \CLaSH\ compiler has also been 
-shown to work for non-trivial descriptions. \CLaSH\ has been able to 
+Besides trivial circuits such as variants of both the \acro{FIR} filter and 
+the simple \acro{CPU} shown in \Cref{sec:usecases}, the \CLaSH\ compiler has 
+also been shown to work for non-trivial descriptions. \CLaSH\ has been able to 
  successfully translate the functional description of a streaming reduction 
  circuit~\cite{reductioncircuit} for floating point numbers.
  
@@ -702,9 +712,9 @@ circuit~\cite{reductioncircuit} for floating point numbers.
      compiler. The \CLaSH\ compiler has generic translation rules to
      translated the user-defined types described below.
  
-    The \CLaSH compiler is able to infer unspecified types,
+    The \CLaSH\ compiler is able to infer unspecified types,
      meaning that a developer does not have to annotate every function with a 
-    type signature (though it is good practice to do so anyway).
+    type signature (even if it is good practice to do so).
    
      % Translation of two most basic functional concepts has been
      % discussed: function application and choice. Before looking further
@@ -762,7 +772,7 @@ circuit~\cite{reductioncircuit} for floating point numbers.
          the rest of paper is: \hs{[a|n]}. Where the \hs{a} is the element 
          type, and \hs{n} is the length of the vector. Note that this is
          a notation used in this paper only, vectors are slightly more
-        elaborate in real \CLaSH programs.
+        verbose in real \CLaSH\ descriptions.
          % The state type of an 8 element register bank would then for example 
          % be:
  
@@ -1091,7 +1101,7 @@ Haskell language. A description in \emph{Core} can still contain properties
  which have no direct translation to hardware, such as polymorphic types and 
  function-valued arguments. Such a description needs to be transformed to a 
  \emph{normal form}, which only contains properties that have a direct 
-translation. The second stage of the compiler, the \emph{normalization} phase 
+translation. The second stage of the compiler, the \emph{normalization} phase, 
  exhaustively applies a set of \emph{meaning-preserving} transformations on the 
  \emph{Core} description until this description is in a \emph{normal form}. 
  This set of transformations includes transformations typically found in 
@@ -1107,9 +1117,8 @@ end-product of the \CLaSH\ compiler a \VHDL\ \emph{netlist} as the resulting
  \VHDL\ resembles an actual netlist description and not idiomatic \VHDL.
  
  \section{Use cases}
-
-\subsection{FIR Filter}
  \label{sec:usecases}
+\subsection{FIR Filter}
  As an example of a common hardware design where the use of higher-order
  functions leads to a very natural description is a \acro{FIR} filter, which is 
  basically the dot-product of two vectors:
@@ -1169,10 +1178,10 @@ fir (State (xs,hs)) x = (State (x >> xs,hs), xs *+* hs)
  \end{code}
  
  Where the vector \hs{hs} contains the \acro{FIR} coefficients and the vector 
-\hs{xs} contains the latest input sample in front and older samples behind. 
-The code for the shift (\hs{>>}) operator that adds the new input sample 
+\hs{xs} contains the previous input sample in front and older samples behind. 
+The code for the shift (\hs{>>}) operator, that adds the new input sample 
  (\hs{x}) to the list of previous input samples (\hs{xs}) and removes the 
-oldest sample is shown below:
+oldest sample, is shown below:
  
  \begin{code}
  x >> xs = x +> init xs  
@@ -1191,41 +1200,94 @@ the vectors of the \acro{FIR} code to a length of 4, is depicted in
  \end{figure}
  
  \subsection{Higher order CPU}
+The following simple CPU is an example of user-defined higher order
+functions and pattern matching. The CPU consists of four function units,
+of which three have a fixed function and one can perform some less
+common operations.
+
+The CPU contains a number of data sources, represented by the horizontal
+lines in figure TODO:REF. These data sources offer the previous outputs
+of each function units, along with the single data input the cpu has and
+two fixed intialization values.
+
+Each of the function units has both its operands connected to all data
+sources, and can be programmed to select any data source for either
+operand. In addition, the leftmost function unit has an additional
+opcode input to select the operation it performs. Its output is also the
+output of the entire cpu.
+
+Looking at the code, the function unit is the most simple. It arranges
+the operand selection for the function unit. Note that it does not
+define the actual operation that takes place inside the function unit,
+but simply accepts the (higher order) argument "op" which is a function
+of two arguments that defines the operation.
  
  \begin{code}
-fu :: (a -> a -> a)
-      -> [a | n]
-      -> (Index (n - 1), Index (n - 1))
-      -> a
-      -> (a, a)
-fu op inputs (addr1, addr2) (State out) =
-  (State out', out)
+fu op inputs (addr1, addr2) = regIn
    where
-    in1  = inputs!addr1
-    in2  = inputs!addr2
-    out' = op in1 in2
+    in1     = inputs!addr1
+    in2     = inputs!addr2
+    regIn   = op in1 in2
+\end{code}
+
+The multiop function defines the operation that takes place in the
+leftmost function unit. It is essentially a simple three operation alu
+that makes good use of pattern matching and guards in its description.
+The \hs{shift} function used here shifts its first operand by the number
+of bits indicated in the second operand, the \hs{xor} function produces
+the bitwise xor of its operands.
+
+\begin{code}
+data Opcode = Shift | Xor | Equal
+
+multiop :: Opcode -> Word -> Word -> Word
+multiop opc a b = case opc of
+  Shift             -> shift a b
+  Xor               -> xor a b 
+  Equal | a == b    -> 1
+        | otherwise -> 0
  \end{code}
  
+The cpu function ties everything together. It applies the \hs{fu}
+function four times, to create a different function unit each time. The
+first application is interesting, because it does not just pass a
+function to \hs{fu}, but a partial application of \hs{multiop}. This
+shows how the first funcition unit effectively gets an extra input,
+compared to the others.
+
+The vector \hs{inputs} is the set of data sources, which is passed to
+each function unit for operand selection. The cpu also receives a vector
+of address pairs, which are used by each function unit to select their
+operand. The application of the function units to the \hs{inputs} and
+\hs{addrs} arguments seems quite repetive and could be rewritten to use
+a combination of the \hs{map} and \hs{zipwith} functions instead.
+However, the prototype does not currently support working with lists of
+functions, so the more explicit version of the code is given instead).
+
  \begin{code}
  type CpuState = State [Word | 4]
  
-cpu :: Word 
-       -> [(Index 6, Index 6) | 4]
-       -> CpuState
-       -> (CpuState, Word)
-cpu input addrs (State fuss) = (State fuss', out)
+cpu :: CpuState -> Word -> [(Index 6, Index 6) | 4] 
+       -> Opcode -> (CpuState, Word)
+cpu (State s) input addrs opc = (State s', out)
    where
-    fures =   [ fu const  inputs (addrs!0) (fuss!0)
-              , fu (+)    inputs (addrs!1) (fuss!1)
-              , fu (-)    inputs (addrs!2) (fuss!2)
-              , fu (*)    inputs (addrs!3) (fuss!3)
+    s'    =   [ fu (multiop opc)  inputs (addrs!0)
+              , fu add            inputs (addrs!1)
+              , fu sub            inputs (addrs!2)
+              , fu mul            inputs (addrs!3)
                ]
-    (fuss', outputs)  = unzip fures
-    inputs            = 0 +> (1 +> (input +> outputs))
-    out               = head outputs
+    inputs    =   0 +> (1 +> (input +> s))
+    out       =   head s'
  \end{code}
  
+Of course, this is still a simple example, but it could form the basis
+of an actual design, in which the same techniques can be reused.
+
  \section{Related work}
+This section describes the features of existing (functional) hardware 
+description languages and highlights the advantages that this research has 
+over existing work.
+
  Many functional hardware description languages have been developed over the 
  years. Early work includes such languages as $\mu$\acro{FP}~\cite{muFP}, an 
  extension of Backus' \acro{FP} language to synchronous streams, designed 
@@ -1237,7 +1299,7 @@ circuits, and has a particular focus on layout.
  functional language \acro{ML}, and has support for polymorphic types and 
  higher-order functions. Published work suggests that there is no direct 
  simulation support for \acro{HML}, but that a description in \acro{HML} has to 
-be translated to \VHDL\ and that the translated description can than be 
+be translated to \VHDL\ and that the translated description can then be 
  simulated in a \VHDL\ simulator. Also not all of the mentioned language 
  features of \acro{HML} could be translated to hardware. The \CLaSH\ compiler 
  on the other hand can correctly translate all of the language constructs 
@@ -1279,9 +1341,8 @@ mentioned in this section.
  
  The merits of polymorphic typing, combined with higher-order functions, are 
  now also recognized in the `main-stream' hardware description languages, 
-exemplified by the new \VHDL-2008 standard~\cite{VHDL2008}. \VHDL-2008 support for generics has been extended to types, allowing a developer to describe 
-polymorphic components. Note that those types still require an explicit 
-generic map, whereas types can be automatically inferred in \CLaSH.
+exemplified by the new \VHDL-2008 standard~\cite{VHDL2008}. \VHDL-2008 support 
+for generics has been extended to types and subprograms, allowing a developer to describe components with polymorphic ports and function-valued arguments. Note that the types and subprograms still require an explicit generic map, whereas types can be automatically inferred, and function-values can be automatically propagated by the \CLaSH\ compiler. There are also no (generally available) \VHDL\ synthesis tools that currently support the \VHDL-2008 standard, and thus the synthesis of polymorphic types and function-valued arguments.
  
  % Wired~\cite{Wired},, T-Ruby~\cite{T-Ruby}, Hydra~\cite{Hydra}. 
  % 
@@ -1374,29 +1435,53 @@ generic map, whereas types can be automatically inferred in \CLaSH.
  
  
  \section{Conclusion}
-The conclusion goes here.
-
-
-
+This research demonstrates once more that functional languages are well suited 
+for hardware descriptions: function applications provide an elegant notation 
+for component instantiation. Where this research goes beyond the existing 
+(functional) hardware descriptions languages is the inclusion of various 
+choice elements, such as patter matching, that are well suited to describe the 
+conditional assignments in control-oriented hardware. Besides being able to 
+translate these basic constructs to synthesizable \VHDL, the prototype 
+compiler can also correctly translate descriptions that contain both 
+polymorphic types and function-valued arguments.
+
+Where recent functional hardware description languages have mostly opted to 
+embed themselves in an existing functional language, this research features a 
+`true' compiler. As a result there is a clear distinction between compile-time 
+and run-time, which allows a myriad of choice constructs to be part of the 
+actual circuit description; a feature the embedded hardware description 
+languages do not offer.
+
+\section{Future Work}
+The choice of describing state explicitly as extra arguments and results can 
+be seen as a mixed blessing. Even though the description that use state are 
+usually very clear, one finds that dealing with unpacking, passing, receiving 
+and repacking can become tedious and even error-prone, especially in the case 
+of sub-states. Removing this boilerplate, or finding a more suitable 
+abstraction mechanism would make \CLaSH\ easier to use.
+
+The transformations in normalization phase of the prototype compiler were 
+developed in an ad-hoc manner, which makes the existence of many desirable 
+properties unclear. Such properties include whether the complete set of 
+transformations will always lead to a normal form or if the normalization 
+process always terminates. Though various use cases suggests that these 
+properties usually hold, they have not been formally proven. A systematic 
+approach to defining the set of transformations allows one to proof that the 
+earlier mentioned properties do indeed exist.
  
  % conference papers do not normally have an appendix
  
  
  % use section* for acknowledgement
-\section*{Acknowledgment}
-
-
-The authors would like to thank...
-
-
-
-
+% \section*{Acknowledgment}
+% 
+% The authors would like to thank...
  
  % trigger a \newpage just before the given reference
  % number - used to balance the columns on the last page
  % adjust value as needed - may need to be readjusted if
  % the document is modified later
-%\IEEEtriggeratref{8}
+\IEEEtriggeratref{14}
  % The "triggered" command can be changed if desired:
  %\IEEEtriggercmd{\enlargethispage{-5in}}