Fix source indentation.

[matthijs/master-project/report.git] / Chapters / HardwareDescription.tex
diff --git a/Chapters/HardwareDescription.tex b/Chapters/HardwareDescription.tex

index 508c11bef4a4dd3f5028ae08cf7befc99a7c4d56..9e4061c9a4d059ee7fd8182c9d1e04565ecf94df 100644 (file)
--- a/Chapters/HardwareDescription.tex
+++ b/Chapters/HardwareDescription.tex
@@ -76,6 +76,281 @@ and3 a b c = and (and a b) c
  
    TODO: Define top level function and subfunctions/circuits.
  
+  \section{Choice}
+    Since describing components and connections allows us to describe a lot of
+    hardware designs already, there is an obvious thing missing: Choice. We
+    need some way to be able to choose between values based on another value.
+    In Haskell, choice is achieved by case expressions, if expressions or
+    pattern matching (all of which can be reduced to case expressions).
+
+    An obvious way to add choice to our language without having to recognize
+    any of Haskell's syntax, would be to add a primivite \quote{\hs{if}}
+    function. This function would take three arguments: The condition, the
+    value to return when the condition is true and the value to return when
+    the condition is false.
+
+    This \hs{if} function would then essentially describe a multiplexer and
+    allows us to describe any architecture that uses multiplexers. (TODO: Are
+    there other mechanisms of choice in hardware?)
+
+    However, to be able to describe our hardware in a more convenient way, we
+    also need to translate Haskell's choice mechanisms. The easiest of these
+    are of course case expressions (and if expressions, which can be very
+    directly translated to case expressions). A case expression can in turn
+    simply be translated to a conditional assignment, where the conditions use
+    equality comparisons against the constructors in the case expressions.
+
+    In \in{example}[ex:CaseInv] a simple case statement is shown,
+    scrutinizing a boolean value. If we would translate a Boolean to a bit
+    value, we could of course remove the comparator and directly feed 'in'
+    into the multiplex (or even use an inverter instead of a multiplexer).
+    However, we will try to make a general translation, which works for all
+    possible case statements. Optimizations such as these are left for the
+    \VHDL synthesizer, which handles them very well.
+
+    \startbuffer[CaseInv]
+    inv :: Bool -> Bool
+    inv in = case in of
+      True -> False
+      False -> True
+    \stopbuffer
+
+    \startuseMPgraphic{CaseInv}
+      save in, truecmp, falseout, trueout, out, cmp, mux;
+
+      % I/O ports
+      newCircle.in(btex $in$ etex) "framed(false)";
+      newCircle.out(btex $out$ etex) "framed(false)";
+      % Constants
+      newBox.truecmp(btex $True$ etex) "framed(false)";
+      newBox.trueout(btex $True$ etex) "framed(false)";
+      newBox.falseout(btex $False$ etex) "framed(false)";
+
+      % Components
+      newCircle.cmp(btex $==$ etex);
+      newMux.mux;
+
+      in.c       = origin;
+      cmp.c      = in.c + (3cm, 0cm);
+      truecmp.c  = cmp.c + (-1cm, 1cm);
+      mux.sel    = cmp.e + (1cm, -1cm);
+      falseout.c = mux.inpa - (2cm, 0cm);
+      trueout.c  = mux.inpb - (2cm, 0cm);
+      out.c      = mux.out + (2cm, 0cm);
+
+      % Draw objects and lines
+      drawObj(in, out, truecmp, trueout, falseout, cmp, mux);
+
+      ncline(in)(cmp);
+      ncarc(truecmp)(cmp);
+      nccurve(cmp.e)(mux.sel) "angleA(0)", "angleB(-90)";
+      ncline(falseout)(mux) "posB(inpa)";
+      ncline(trueout)(mux) "posB(inpb)";
+      ncline(mux)(out) "posA(out)";
+    \stopuseMPgraphic
+
+    \placeexample[here][ex:CaseInv]{Simple inverter.}
+      \startcombination[2*1]
+        {\typebufferhs{CaseInv}}{Haskell description using a Case expression.}
+        {\boxedgraphic{CaseInv}}{The architecture described by the Haskell description.}
+      \stopcombination
+
+    Slightly more complex (but very powerful) forms of choice are pattern
+    matches. A function can be defined in multiple clauses, where each clause
+    specifies a pattern. When the arguments match the pattern, the
+    corresponding clause will be used.
+
+    \startbuffer[PatternInv]
+    inv :: Bool -> Bool
+    inv True = False
+    inv False = True
+    \stopbuffer
+
+    \placeexample[here][ex:PatternInv]{Simple inverter using pattern matching.}
+        {\typebufferhs{CaseInv}}
+
+    The architecture described in \in{example}[ex:PatternInv] is of course the
+    same one as the one in \in{example}[ex:CaseInv]. The general interpration
+    of pattern matching is also similar to that of case expressions: Generate
+    hardware for each of the clause (like each of the clauses of a case
+    expression) and connect them to the function output through (a number of
+    nested) multiplexers. These multiplexers are driven by comparators and
+    other logic, that check each pattern in turn.
+
+  \section{Types}
+    Apart from giving structure to our hardware, we'll also need to provide
+    some way to translate the values used to hardware equivalents. In
+    particular, this means having to come up with a hardware equivalent for
+    every \emph{type} used in our program.
+
+    Since most functional languages have a lot of standard types that are
+    hard to translate (integers without a fixed size, lists without a static
+    length, etc.), we will start out by defining a number of \quote{builtin}
+    types ourselves. These types are builtin in the sense that our compiler
+    will have a fixed VHDL type for these. User defined types, on the other
+    hand, will have their hardware type derived directly from their Haskell
+    declaration automatically, according to the rules we sketch here.
+
+    \subsection{Builtin types}
+      \startdesc{Bit}
+        This is the most basic type available. It is mapped directly onto
+        the \type{std_logic} \small{VHDL} type. Mapping this to the
+        \type{bit} type might make more sense (since the Haskell version
+        only has two values), but using \type{std_logic} is more standard
+        (and allowed for some experimentation with don't care values)
+      \stopdesc
+      \startdesc{Bool}
+        This is a builtin Haskell type, but is translated exactly like the
+        Bit type. Supporting the Bool type is particularly useful to support
+        \hs{if ... then ... else ...} expressions, which always have a
+        \hs{Bool} value for the condition.
+
+        A \hs{Bool} is translated to a \type{std_logic}, just like \hs{Bit}.
+      \stopdesc
+      \startdesc{SizedWord, SizedInt}
+        These are types to represent integers. \hs{SizedWord} is unsigned,
+        while \hs{SizedInt} is signed. These types are parameterized by a
+        length type, so you can define an unsigned word of 32 bits wide as
+        follows:
+        
+        \starthaskell
+          type Word32 = SizedWord D32
+        \stophaskell
+
+        Here, \hs{D32} is the \emph{type level representation} of the number
+        32. TODO: Introduce dependent types somewhere.
+
+        These types are translated to the \small{VHDL} \type{unsigned} and
+        \type{signed} respectively.
+      \stopdesc
+      \startdesc{RangedWord}
+        This is another type to describe integers, but unlike the previous
+        two it has no specific width, but an upper bound. This means that
+        its range is not limited to powers of two, but can be any number.
+        A \hs{RangedWord} only has an upper bound, its lower bound is
+        implicitly zero. There is a lot of added implementation complexity
+        when adding a lower bound and having just an upper bound was enough
+        for the primary purpose of this type: Typesafely indexing vectors.
+
+        This type is translated to the \type{unsigned} \small{VHDL} type.
+      \stopdesc
+      \startdesc{TFVec}
+        This is a vector type, with a fixed length. It has two type
+        parameters: Its length and the type of the elements contained in it.
+
+        A fixed size vector is translated to a \small{VHDL} array type.
+       
+        The \hs{TF} in its name is a reference to the implementation used in
+        the prototype, which uses \emph{type families}.
+      \stopdesc
+
+      TODO: Reference Christiaan / describe dependent typing
+    \subsection{User-defined types}
+      There are three ways to define new types in Haskell: Algebraic
+      datatypes with the \hs{data} keyword, type synonyms with the \hs{type}
+      keyword and type renamings with the \hs{newtype} keyword. This
+      expclitely excludes more advanced type creation from \GHC extensions
+      such as type families, existential typing, \small{GADT}s, etc.
+
+      The first of these actually introduces a new type, for which we provide
+      the \VHDL translation below. The latter two only define new names for
+      existing types (where synonyms are completely interchangeable and
+      renamings need explicit conversion). Therefore, these don't need any
+      particular \VHDL translation, a synonym or renamed type will just use
+      the same representation as the equivalent type.
+
+      For algebraic types, we can make the following distinction:
+
+      \startdesc{Product types}
+        A product type is an algebraic datatype with a single constructor with
+        two or more fields. This is essentially a way to pack a few values
+        together in a record-like structure. In fact, the builtin tuple types
+        are just product types (and are thus supported in exactly the same
+        way).
+
+        The "product" in its name refers to the collection of values belonging
+        to this type. The collection for a product type is the cartesic
+        product of the collections for the types of its fields.
+
+        These types are translated to \VHDL record types, with one field for
+        every field in the constructor. This translation applies to all single
+        constructor algebraic datatypes, including those with no fields (unit
+        types) and just one field (which are technically not a product).
+      \stopdesc
+      \startdesc{Enumerated types}
+        An enumerated type is an algebraic datatype with multiple constructors, but
+        none of them have fields. This is essentially a way to get an
+        enum-like type containing alternatives.
+
+        Note that Haskell's \hs{Bool} type is also defined as an
+        enumeration type, but we have a fixed translation for that.
+        
+        These types are translated to \VHDL enumerations, with one value for
+        each constructor. This allows references to these constructors to be
+        translated to the corresponding enumeration value.
+      \stopdesc
+      \startdesc{Sum types}
+        A sum type is an algebraic datatype with multiple constructors, where
+        the constructors have one or more fields. Technically, a type with
+        more than one field per constructor is a sum of products type, but
+        for our purposes this distinction does not really make a difference,
+        so we'll leave it out.
+
+        Sum types are currently not supported by the prototype, since its
+        translation is not so obvious as before. The can easily be emulated,
+        as we will see from an example:
+
+        \starthaskell
+        data Sum = A Bit Word | B Word 
+        \stophaskell
+
+        An obvious way to translate this would be to create an enumeration to
+        distinguish the constructors and then create a big record that
+        contains all the fields of all the constructors. This is the same
+        translation that would result from the following enumeration and
+        product type (using a tuple for clarity):
+
+        \starthaskell
+        data SumC = A | B
+        type Sum = (SumC, Bit, Word, Word)
+        \stophaskell
+
+        Here, the \hs{SumC} type effectively signals which of the fields of
+        the \hs{Sum} type are valid, all the other ones have no useful value.
+
+        An obvious problem with this naive approach is the space usage: The
+        example above generates a fairly big \VHDL type. However, we can be
+        sure that the two \hs{Word}s in the \hs{Sum} type will never be valid
+        at the same time, so this is a waste of space.
+
+        Obviously, we could do some duplication detection here to reuse a
+        particular field for another constructor, but this would only
+        partially solve the problem. If I would, for example, have an array of
+        8 bits and a 8 bit unsiged word, these are different types and could
+        not be shared. However, in the final hardware, both of these types
+        would simply be 8 bit connections, so we have a 100\% size increase by
+        not sharing these!
+      \stopdesc
+
+      Another interesting case is that of recursive types. In Haskell, an
+      algebraic datatype can be recursive: Any of its field types can be (or
+      contain) the type being defined. The most well-known recursive type is
+      probably the list type, which is defined is:
+      
+      \starthaskell
+      data List a = Empty | Cons a (List a)
+      \stophaskell
+
+      Where \hs{Empty} is usually written as \hs{[]} and \hs{Cons} as \hs{:}.
+      This immediately shows the problem with recursive types: What hardware
+      type to allocate here? There is no way to statically determine how long
+      the list will be
+      
+      In general, we can say we can never properly translated recursive types:
+      All recursive types have a potentially infinite value (even though in
+      practice they will have a bounded value, there is no way for the
+      compiler to determine an upper bound on its size.
+
    \subsection{Partial application}
    It should be obvious that we cannot generate hardware signals for all
    expressions we can express in Haskell. The most obvious criterium for this
@@ -633,6 +908,3 @@ acc in (State s) = (State s', out)
  
    Due to these complications, we leave other forms of recursion as
    future work as well.
-  
-  \section{Supported types}
-    TODO