X-Git-Url: https://git.stderr.nl/gitweb?p=matthijs%2Fmaster-project%2Freport.git;a=blobdiff_plain;f=Chapters%2FPrototype.tex;h=85e7fb2816ffb04aeb898b4ffebb5e36cd23fac4;hp=1db9c6d73aff1d766dabffeed186d63bd43cebcc;hb=dde172499703d199e396b84bb6d5c13dae8cdd8d;hpb=d957c07d7ba3f6b2bccc6aafb1c8012882556c6e

diff --git a/Chapters/Prototype.tex b/Chapters/Prototype.tex
index 1db9c6d..85e7fb2 100644
--- a/Chapters/Prototype.tex
+++ b/Chapters/Prototype.tex
@@ -1,12 +1,720 @@
-\chapter{Prototype}
- Choice of Haskell
- Core - description of the language (appendix?)
- Stages (-> Core, Normalization, -> VHDL)
- Implementation issues
+\chapter[chap:prototype]{Prototype}
+  An important step in this research is the creation of a prototype compiler.
+  Having this prototype allows us to apply the ideas from the previous chapter
+  to actual hardware descriptions and evaluate their usefulness. Having a
+  prototype also helps to find new techniques and test possible
+  interpretations.
+
+  Obviously, the prototype was not created only after all research ideas
+  were formed; its implementation has been interleaved with the research
+  itself. Also, the prototype described here is the final version: it has
+  gone through a number of design iterations which we will not completely
+  describe here.
+
+  \section{Choice of language}
+  When implementing this prototype, the first question to ask is: what
+  (functional) language will we use to describe our hardware? (Note that
+  this does not concern the \emph{implementation language} of the compiler,
+  just the language \emph{translated by} the compiler.)
+
+  On the highest level, we have two choices:
+
+  \startitemize
+    \item Create a new functional language from scratch. This has the
+    advantage of having a language that contains exactly those elements that
+    are convenient for describing hardware and can contain special
+    constructs that might be needed for hardware descriptions.
+    \item Use an existing language and create a new backend for it.
This has
+    the advantage that existing tools can be reused, which will speed up
+    development.
+  \stopitemize
+
+  Considering that we needed a working prototype quickly, and that
+  implementing parsers, semantic checkers and especially typecheckers isn't
+  exactly the core of this research (but it is lots and lots of work!),
+  using an existing language is the obvious choice. This also has the
+  advantage that a large set of language features is available to
+  experiment with and it is easy to find which features apply well and
+  which don't. A possible second prototype could use a custom language with
+  just the useful features (and possibly extra features that are specific to
+  the domain of hardware description as well).
+
+  The second choice is to pick one of the many existing languages. As
+  mentioned before, this language is Haskell. This choice has not been the
+  result of a thorough comparison of languages, for the simple reason that
+  the requirements on the language were completely unclear at the start of
+  this research. The fact that Haskell is a language with a broad spectrum
+  of features, that it is commonly used in research projects and that the
+  primary compiler, \small{GHC}, provides a high-level \small{API} to its
+  internals, made Haskell an obvious choice.
+
+  TODO: Was Haskell really a good choice? Perhaps say this somewhere else?
+
+  \subsection{Output language or format}
+  VHDL / Verilog / EDIF etc. Why VHDL?
+
+  \section{Prototype design}
+  As stated above, we will use the Glasgow Haskell Compiler (\small{GHC}) to
+  implement our prototype compiler. To understand the design of the
+  compiler, we will first dive into the \small{GHC} compiler a bit.
Its
+  compilation consists of the following steps (slightly simplified):
+
+  \startuseMPgraphic{ghc-pipeline}
+    % Create objects
+    save inp, front, desugar, simpl, back, out;
+    newEmptyBox.inp(0,0);
+    newBox.front(btex Parser etex);
+    newBox.desugar(btex Desugarer etex);
+    newBox.simpl(btex Simplifier etex);
+    newBox.back(btex Backend etex);
+    newEmptyBox.out(0,0);
+
+    % Space the boxes evenly
+    inp.c - front.c = front.c - desugar.c = desugar.c - simpl.c
+      = simpl.c - back.c = back.c - out.c = (0, 1.5cm);
+    out.c = origin;
+
+    % Draw lines between the boxes. We make these lines "deferred" and give
+    % them a name, so we can use ObjLabel to draw a label beside them.
+    ncline.inp(inp)(front) "name(haskell)";
+    ncline.front(front)(desugar) "name(ast)";
+    ncline.desugar(desugar)(simpl) "name(core)";
+    ncline.simpl(simpl)(back) "name(simplcore)";
+    ncline.back(back)(out) "name(native)";
+    ObjLabel.inp(btex Haskell source etex) "labpathname(haskell)", "labdir(rt)";
+    ObjLabel.front(btex Haskell AST etex) "labpathname(ast)", "labdir(rt)";
+    ObjLabel.desugar(btex Core etex) "labpathname(core)", "labdir(rt)";
+    ObjLabel.simpl(btex Simplified core etex) "labpathname(simplcore)", "labdir(rt)";
+    ObjLabel.back(btex Native code etex) "labpathname(native)", "labdir(rt)";
+
+    % Draw the objects (and deferred labels)
+    drawObj (inp, front, desugar, simpl, back, out);
+  \stopuseMPgraphic
+  \placefigure[right]{GHC compiler pipeline}{\useMPgraphic{ghc-pipeline}}
+
+  \startdesc{Frontend}
+    This step takes the Haskell source files and parses them into an
+    abstract syntax tree (\small{AST}). This \small{AST} can express the
+    complete Haskell language and is thus a very complex one (in contrast
+    with the Core \small{AST}, later on). All identifiers in this
+    \small{AST} are resolved by the renamer and all types are checked by the
+    typechecker.
+  \stopdesc
+  \startdesc{Desugaring}
+    This step takes the full \small{AST} and translates it to the
+    \emph{Core} language.
Core is a very small functional language with lazy
+    semantics that can still express everything Haskell can express. Its
+    simplicity makes Core very suitable for further simplification and
+    translation. Core is the language we will be working on as well.
+  \stopdesc
+  \startdesc{Simplification}
+    Through a number of simplification steps (such as inlining, common
+    subexpression elimination, etc.) the Core program is simplified to make
+    it faster or easier to process further.
+  \stopdesc
+  \startdesc{Backend}
+    This step takes the simplified Core program and generates an actual
+    runnable program for it. This is a big and complicated step, which we
+    will not discuss any further, since it is not required for our
+    prototype.
+  \stopdesc
+
+  In this process, there are a number of places where we can start our
+  work. Assuming that we don't want to deal with (or modify) parsing,
+  typechecking and other frontend business and that native code isn't
+  really a useful format anymore, we are left with the choice between the
+  full Haskell \small{AST}, or the smaller (simplified) Core
+  representation.
+
+  The advantage of taking the full \small{AST} is that the exact structure
+  of the source program is preserved. We can see exactly what the hardware
+  description looks like and which syntax constructs were used. However,
+  the full \small{AST} is a very complicated data structure. If we are to
+  handle everything it offers, we will quickly get a big compiler.
+
+  Using the Core representation gives us a much more compact data structure
+  (a Core expression only uses 9 constructors). Note that this does not
+  mean that the Core representation itself is smaller; on the contrary.
+  Since the Core language has fewer constructs, many things take a larger
+  expression to express.
+
+  However, the fact that the Core language is so much smaller means it is a
+  lot easier to analyze and translate into something else.
For the same
+  reason, \small{GHC} runs its simplifications and optimizations on the Core
+  representation as well.
+
+  However, we will use the normal Core representation, not the simplified
+  Core. Reasons for this are detailed below.
+
+  The final prototype roughly consists of three steps:
+
+  \startuseMPgraphic{prototype-pipeline}
+    % Create objects
+    save inp, front, norm, vhdl, out;
+    newEmptyBox.inp(0,0);
+    newBox.front(btex \small{GHC} frontend + desugarer etex);
+    newBox.norm(btex Normalization etex);
+    newBox.vhdl(btex \small{VHDL} generation etex);
+    newEmptyBox.out(0,0);
+
+    % Space the boxes evenly
+    inp.c - front.c = front.c - norm.c = norm.c - vhdl.c
+      = vhdl.c - out.c = (0, 1.5cm);
+    out.c = origin;
+
+    % Draw lines between the boxes. We make these lines "deferred" and give
+    % them a name, so we can use ObjLabel to draw a label beside them.
+    ncline.inp(inp)(front) "name(haskell)";
+    ncline.front(front)(norm) "name(core)";
+    ncline.norm(norm)(vhdl) "name(normal)";
+    ncline.vhdl(vhdl)(out) "name(vhdl)";
+    ObjLabel.inp(btex Haskell source etex) "labpathname(haskell)", "labdir(rt)";
+    ObjLabel.front(btex Core etex) "labpathname(core)", "labdir(rt)";
+    ObjLabel.norm(btex Normalized core etex) "labpathname(normal)", "labdir(rt)";
+    ObjLabel.vhdl(btex \small{VHDL} description etex) "labpathname(vhdl)", "labdir(rt)";
+
+    % Draw the objects (and deferred labels)
+    drawObj (inp, front, norm, vhdl, out);
+  \stopuseMPgraphic
+  \placefigure[right]{Prototype compiler pipeline}{\useMPgraphic{prototype-pipeline}}
+
+  \startdesc{Frontend}
+    This is exactly the frontend and desugarer from the \small{GHC}
+    pipeline, that translates Haskell sources to a Core representation.
+  \stopdesc
+  \startdesc{Normalization}
+    This is a step that transforms the Core representation into a normal
+    form. This normal form is still expressed in the Core language, but has
+    to adhere to an extra set of constraints.
This normal form is less
+    expressive than the full Core language (\eg, it can contain only
+    limited higher-order expressions, has a specific structure, etc.), but
+    is also very close to directly describing hardware.
+  \stopdesc
+  \startdesc{\small{VHDL} generation}
+    The last step takes the normalized Core representation and generates
+    \small{VHDL} for it. Since the normal form has a specific, hardware-like
+    structure, this final step is very straightforward.
+  \stopdesc
+
+  The most interesting step in this process is the normalization step. That
+  is where more complicated functional constructs, which have no direct
+  hardware interpretation, are removed and translated into hardware
+  constructs. This step is described in detail in
+  \in{chapter}[chap:normalization].
+
+  \section{The Core language}
+  Most of the prototype deals with handling the program in the Core
+  language. In this section we will show what this language looks like and
+  how it works.
+
+  The Core language is a functional language that describes
+  \emph{expressions}. Every identifier used in Core is called a
+  \emph{binder}, since it is bound to a value somewhere. On the highest
+  level, a Core program is a collection of functions, each of which binds a
+  binder (the function name) to an expression (the function value, which has
+  a function type).
+
+  The Core language itself does not prescribe any program structure, only
+  expression structure. In the \small{GHC} compiler, the Haskell module
+  structure is used for the resulting Core code as well. Since this is not
+  so relevant for understanding the Core language or the normalization
+  process, we'll only look at the Core expression language here.
+
+  Each Core expression is one of the following constructs.
+
+  \startdesc{Variable reference}
+\startlambda
+a
+\stoplambda
+    This is a simple reference to a binder.
It's written down as the
+    name of the binder that is being referred to, which should of course be
+    bound in a containing scope (including top level scope, so a reference
+    to a top level function is also a variable reference). Additionally,
+    constructors from algebraic datatypes also become variable references.
+
+    The value of this expression is the value bound to the given binder.
+
+    Each binder also carries around its type, but this is usually not shown
+    in the Core expressions. Occasionally, the type of an entire expression
+    or function is shown for clarity, but this is only informational. In
+    practice, the type of an expression is easily determined from the
+    structure of the expression and the types of the binders and occasional
+    cast expressions. This minimizes the amount of bookkeeping needed to
+    keep the typing consistent.
+  \stopdesc
+  \startdesc{Literal}
+\startlambda
+10
+\stoplambda
+    This is a simple literal. Only primitive types are supported, like
+    chars, strings, ints and doubles. The types of these literals are the
+    \quote{primitive} versions, like \lam{Char\#} and \lam{Word\#}, not the
+    normal Haskell versions (but there are builtin conversion functions).
+  \stopdesc
+  \startdesc{Application}
+\startlambda
+func arg
+\stoplambda
+    This is simple function application. Each application consists of two
+    parts: the function part and the argument part. Applications are used
+    for normal function \quote{calls}, but also for applying type
+    abstractions and data constructors.
+
+    The value of an application is the value of the function part, with the
+    first argument binder bound to the argument part.
+  \stopdesc
+  \startdesc{Lambda abstraction}
+\startlambda
+λbndr.body
+\stoplambda
+    This is the basic lambda abstraction, as it occurs in lambda calculus.
+    It consists of a binder part and a body part. A lambda abstraction
+    creates a function that can be applied to an argument.
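+
+    For example, a function of two arguments is written as two nested
+    abstractions (an illustrative sketch, not generated \small{GHC}
+    output):
+
+\startlambda
+λx.λy.x + y
+\stoplambda
+
+    Applying this expression to a single argument yields another function
+    that still expects the second argument, which is how Core (like
+    Haskell) supports partial application.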
+
+    Note that the body of a lambda abstraction extends all the way to the
+    end of the expression, or the closing bracket surrounding the lambda. In
+    other words, the lambda abstraction \quote{operator} has the lowest
+    priority of all.
+
+    The value of an application of a lambda abstraction is the value of the
+    body part, with the binder bound to the argument the entire lambda
+    abstraction is applied to.
+  \stopdesc
+  \startdesc{Non-recursive let expression}
+\startlambda
+let bndr = value in body
+\stoplambda
+    A let expression allows you to bind a binder to some value, while
+    evaluating to some other value (where that binder is in scope). This
+    allows for sharing of subexpressions (you can use a binder twice) and
+    explicit \quote{naming} of arbitrary expressions. Note that the binder
+    is not in scope in the value bound to it, so it's not possible to make
+    recursive definitions with the normal form of the let expression (see
+    the recursive form below).
+
+    Even though this let expression is an extension of the basic lambda
+    calculus, it is easily translated to a lambda abstraction. The let
+    expression above would then become:
+
+\startlambda
+(λbndr.body) value
+\stoplambda
+
+    This correspondence might be useful for verifying certain properties of
+    transformations, since a lot of verification work has been done on
+    lambda calculus already.
+
+    The value of a let expression is the value of the body part, with the
+    binder bound to the value.
+  \stopdesc
+  \startdesc{Recursive let expression}
+\startlambda
+letrec
+  bndr1 = value1
+  \vdots
+  bndrn = valuen
+in
+  body
+\stoplambda
+
+    This is the recursive version of the let expression. In \small{GHC}'s
+    Core implementation, non-recursive and recursive lets are not as
+    distinct as we present them here, but this provides a clearer overview.
+
+    The main difference with the normal let expression is that each of the
+    binders is in scope in each of the values, in addition to the body.
This
+    allows for self-recursive or mutually recursive definitions.
+
+    It should also be possible to express a recursive let using normal
+    lambda calculus, if we use the \emph{least fixed-point operator},
+    \lam{Y}.
+  \stopdesc
+  \startdesc{Case expression}
+\startlambda
+  case scrut of bndr
+    DEFAULT -> defaultbody
+    C0 bndr0,0 ... bndr0,m -> body0
+    \vdots
+    Cn bndrn,0 ... bndrn,m -> bodyn
+\stoplambda
+
+    A case expression is the only way in Core to choose between values. A
+    case expression evaluates its scrutinee, which should have an algebraic
+    datatype, into weak head normal form (\small{WHNF}, \ie, until its top
+    level constructor is known) and (optionally) binds it to \lam{bndr}. It
+    then chooses a body depending on the constructor of its scrutinee. If
+    none of the constructors match, the \lam{DEFAULT} alternative is
+    chosen.
+
+    Since we can only match the top level constructor, there can be no
+    overlap in the alternatives and thus the order of alternatives is not
+    relevant (though the \lam{DEFAULT} alternative must appear first for
+    implementation efficiency).
+
+    Any arguments to the constructor in the scrutinee are bound to each of
+    the binders after the constructor and are in scope only in the
+    corresponding body.
+
+    To support strictness, the scrutinee is always evaluated into
+    \small{WHNF}, even when there is only a \lam{DEFAULT} alternative. This
+    allows a strict function argument to be written like:
+
+\startlambda
+function (case argument of arg
+  DEFAULT -> arg)
+\stoplambda
+
+    This seems to be the only use for the extra binder to which the
+    scrutinee is bound. When not using strictness annotations (which is
+    rather pointless in hardware descriptions), \small{GHC} seems to never
+    generate any code making use of this binder. The current prototype does
+    not handle it either, which probably means that code using it would
+    break.
+
+    Note that these case statements are less powerful than the full Haskell
+    case statements.
In particular, they do not support complex patterns like
+    in Haskell. Only the top level constructor of an expression can be
+    matched; complex patterns are implemented using multiple nested case
+    expressions.
+
+    Case statements are also used for unpacking of algebraic datatypes, even
+    when there is only a single constructor. For example, to add the
+    elements of a tuple, the following Core is generated:
+
+\startlambda
+sum = λtuple.case tuple of
+  (,) a b -> a + b
+\stoplambda
+
+    Here, there is only a single alternative (but no \lam{DEFAULT}
+    alternative, since the single alternative is already exhaustive). When
+    its body is evaluated, the arguments to the tuple constructor \lam{(,)}
+    (\ie, the elements of the tuple) are bound to \lam{a} and \lam{b}.
+  \stopdesc
+  \startdesc{Cast expression}
+\startlambda
+body :: targettype
+\stoplambda
+    A cast expression allows you to change the type of an expression to an
+    equivalent type. Note that this is not meant to do any actual work, like
+    conversion of data from one format to another, or force a complete type
+    change. Instead, it is meant to change between different representations
+    of the same type, \eg, to switch between types that are provably equal
+    (but look different).
+
+    In our hardware descriptions, we typically see casts to change between a
+    Haskell newtype and its contained type, since those are effectively
+    different representations of the same type.
-
- Haskell language coverage / constraints
-   Recursion
-   Builtin types
-   Custom types (Sum types, product types)
-   Function types / higher order expressions
+    More complex are types that are proven to be equal by the typechecker,
+    but look different at first glance. To ensure that, once the typechecker
+    has proven equality, this information sticks around, explicit casts are
+    added. In our notation we only write the target type, but in reality a
+    cast expression carries around a \emph{coercion}, which can be seen as a
+    proof of equality.
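+
+    As an illustrative sketch (not actual \small{GHC} output), unwrapping a
+    value \lam{x} of the newtype \lam{State Word} from
+    \in{section}[sec:prototype:statetype] could be written as a cast:
+
+\startlambda
+x :: Word
+\stoplambda
+
+    No work is done by this cast; only the type changes, since
+    \lam{State Word} and \lam{Word} have the same runtime representation.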
TODO: Example
+
+    The value of a cast is the value of its body, unchanged. The type of
+    this value is equal to the target type, not the type of its body.
+
+    Note that this syntax is also sometimes used to indicate that a
+    particular expression has a particular type, even when no cast
+    expression is involved. This is then purely informational, since the
+    only elements that are explicitly typed in the Core language are the
+    binder references and cast expressions; the types of all other elements
+    are derived from these.
+  \stopdesc
+  \startdesc{Note}
+
+    The Core language in \small{GHC} allows adding \emph{notes}, which serve
+    as hints to the inliner or add custom (string) annotations to a Core
+    expression. These shouldn't be generated normally, so these are not
+    handled in any way in the prototype.
+  \stopdesc
+  \startdesc{Type}
+\startlambda
+@type
+\stoplambda
+    It is possible to use a Core type as a Core expression. This is done to
+    allow type abstractions and applications to be handled as the normal
+    lambda abstractions and applications above. This means that a type
+    expression in Core can only ever occur in the argument position of an
+    application, and only if the type of the function being applied expects
+    a type as its first argument. This happens for all polymorphic
+    functions, for example, the \lam{fst} function:
+
+\startlambda
+fst :: \forall a. \forall b. (a, b) -> a
+fst = λtup.case tup of (,) a b -> a
+
+fstint :: (Int, Int) -> Int
+fstint = λtup.fst @Int @Int tup
+\stoplambda
+
+    The type of \lam{fst} has two universally quantified type variables.
+    When \lam{fst} is applied in \lam{fstint}, it is first applied to two
+    types, which are substituted for \lam{a} and \lam{b} in the type of
+    \lam{fst}. This way, the actual types of \lam{fst}'s arguments and
+    result can be found: \lam{fst @Int @Int :: (Int, Int) -> Int}.
+  \stopdesc
+
+  TODO: Core type system
+
+  \section[sec:prototype:statetype]{State annotations in Haskell}
+  Ideal: Type synonyms, since there is no additional code overhead for
+  packing and unpacking. Downside: there is no explicit conversion in Core
+  either, so type synonyms tend to get lost in expressions (they can be
+  preserved in binders, but this makes implementation harder, since the
+  statefulness of a value must be manually tracked).
+
+  Less ideal: Newtype. Requires explicit packing and unpacking of function
+  arguments. If you don't unpack substates, there is no overhead for
+  (un)packing substates. This will result in many nested State constructors
+  in a nested state type. \eg:
+
+  \starttyping
+  State (State Bit, State (State Word, Bit), Word)
+  \stoptyping
+
+  Alternative: Provide different newtypes for input and output state. This
+  makes the code even more explicit, and typechecking can find even more
+  errors. However, this requires defining two type synonyms for each
+  stateful function instead of just one. \eg:
+  \starttyping
+  type AccumStateIn = StateIn Bit
+  type AccumStateOut = StateOut Bit
+  \stoptyping
+  This also increases the possibility of having different input and output
+  states. Checking for identical input and output state types is also
+  harder, since each element in the state must be unpacked and compared
+  separately.
+
+  Alternative: Provide a type for the entire result type of a stateful
+  function, not just the state part. \eg:
+
+  \starttyping
+  newtype Result state result = Result (state, result)
+  \stoptyping
+
+  This makes it easy to say "Any stateful function must return a
+  \type{Result} type", without having to sort out result from state.
However,
+  this either requires a second type for input state (similar to
+  \type{StateIn} / \type{StateOut} above), or requires the compiler to
+  select the right argument for input state, either by looking at types
+  (which works for complex states, but when that state has the same type as
+  an argument, things get ambiguous) or by selecting a fixed (\eg, the
+  last) argument, which might be limiting.
+
+  \subsubsection{Example}
+  As an example of the approach used, consider a simple averaging circuit
+  that lets the accumulation of the inputs be done by a subcomponent.
+
+  \starttyping
+  newtype State s = State s
+
+  type AccumState = State Word
+  accum :: Word -> AccumState -> (AccumState, Word)
+  accum i (State s) = (State (s + i), s + i)
+
+  type AvgState = State (AccumState, Word)
+  avg :: Word -> AvgState -> (AvgState, Word)
+  avg i (State s) = (State s', o)
+    where
+      (accums, count) = s
+      -- Pass our input through the accumulator, which outputs a sum
+      (accums', sum) = accum i accums
+      -- Increment the count (which will be our new state)
+      count' = count + 1
+      -- Compute the average
+      o = sum / count'
+      s' = (accums', count')
+  \stoptyping
+
+  And the normalized, core-like versions:
+
+  \starttyping
+  accum i spacked = res
+    where
+      s = case spacked of (State s) -> s
+      s' = s + i
+      spacked' = State s'
+      o = s + i
+      res = (spacked', o)
+
+  avg i spacked = res
+    where
+      s = case spacked of (State s) -> s
+      accums = case s of (accums, \_) -> accums
+      count = case s of (\_, count) -> count
+      accumres = accum i accums
+      accums' = case accumres of (accums', \_) -> accums'
+      sum = case accumres of (\_, sum) -> sum
+      count' = count + 1
+      o = sum / count'
+      s' = (accums', count')
+      spacked' = State s'
+      res = (spacked', o)
+  \stoptyping
+
+  As noted above, any component of a function's state that is a substate,
+  \eg passed on as the state of another function, should have no influence
+  on the hardware generated for the calling function.
Any state-specific
+  \small{VHDL} for this component can be generated entirely within the
+  called function. So, we can completely leave out substates from any
+  function.
+
+  From this observation, we might think to remove the substates from a
+  function's state altogether, and leave only the state components which
+  are actual states of the current function. While doing this would not
+  remove any information needed to generate \small{VHDL} from the function,
+  it would cause the function definition to become invalid (since we won't
+  have any substate to pass to the called functions anymore). We could
+  solve the syntactic problems by passing \type{undefined} for state
+  variables, but that would still break the code on the semantic level
+  (\ie, the function would no longer be semantically equivalent to the
+  original input).
+
+  To keep the function definition correct until the very end of the
+  process, we will not deal with (sub)states until we get to the
+  \small{VHDL} generation. There, we are translating from Core to
+  \small{VHDL}, and we can simply generate no \small{VHDL} for substates,
+  effectively removing the substate components altogether.
+
+  There are a few important points when ignoring substates.
+
+  First, we have to have some definition of \quote{substate}. Since any
+  state argument or return value that represents state must be of the
+  \type{State} type, we can simply look at its type. However, we must be
+  careful to ignore only {\em substates}, and not a function's own state.
+
+  In the example above, this means we should remove \type{accums'} from
+  \type{s'}, but not throw away \type{s'} entirely. We should, however,
+  remove \type{s'} from the output port of the function, since the state
+  will be handled by a \small{VHDL} procedure within the function.
+
+  When looking at substates, these can appear in two places: as part of an
+  argument and as part of a return value. As noted above, these substates
+  can only be used in very specific ways.
+
+  \desc{State variables can appear as an argument.} When generating
+  \small{VHDL}, we completely ignore the argument and generate no input
+  port for it.
+
+  \desc{State variables can be extracted from other state variables.} When
+  extracting a state variable from another state variable, this always means
+  we're extracting a substate, which we can ignore. So, we simply generate no
+  \small{VHDL} for any extraction operation that has a state variable as a
+  result.
+
+  \desc{State variables can be passed to functions.} When passing a
+  state variable to a function, this always means we're passing a substate
+  to a subcomponent. The entire argument can simply be ignored in the
+  resulting port map.
+
+  \desc{State variables can be returned from functions.} When returning a
+  state variable from a function (probably as part of an algebraic
+  datatype), this always means we're returning a substate from a
+  subcomponent. The entire state variable should be ignored in the resulting
+  port map. The type of the binder that the function call is bound to
+  should not include the state type either.
+
+  \startdesc{State variables can be inserted into other variables.} When
+  inserting a state variable into another variable (usually by constructing
+  that new variable using its constructor), we can identify two cases:
+
+  \startitemize
+    \item The state is inserted into another state variable. In this case,
+    the inserted state is a substate, and can be safely left out of the
+    constructed variable.
+    \item The state is inserted into a non-state variable. This happens when
+    building up the return value of a function, where you put state and
+    result variables together in an algebraic type (usually a tuple). In
+    this case, we should leave the state variable out as well, since we
+    don't want it to be included as an output port.
+  \stopitemize
+
+  So, in both cases, we can simply leave out the state variable from the
+  resulting value.
In the latter case, however, we should generate a state
+  proc instead, which assigns the state variable to the input state variable
+  at each clock tick.
+  \stopdesc
+
+  \desc{State variables can appear as (part of) a function result.} When
+  generating \small{VHDL}, we can completely ignore any part of a function
+  result that has a state type. If the entire result is a state type, this
+  will mean the entity will not have an output port. Otherwise, the state
+  elements will be removed from the type of the output port.
+
+  Now that we know how to handle each use of a state variable separately,
+  we can look at the whole and conclude the following:
+
+  \startitemize
+    \item A state unpack operation should not generate any \small{VHDL}.
+    The binder to which the unpacked state is bound should still be
+    declared; this signal will become the register and will hold the
+    current state.
+    \item A state pack operation should not generate any \small{VHDL}. The
+    binder to which the packed state is bound should not be declared. The
+    binder that is packed is the signal that will hold the new state.
+    \item Any values of a State type should not be translated to
+    \small{VHDL}. In particular, State elements should be removed from
+    tuples (and other datatypes) and arguments with a state type should not
+    generate ports.
+    \item To make the state actually work, a simple \small{VHDL} proc
+    should be generated. This proc updates the state at every clock cycle,
+    by assigning the new state to the current state. This will be
+    recognized by synthesis tools as a register specification.
+  \stopitemize
+
+  When applying these rules to the example program (in normal form), we will
+  get the following result. All the parts that don't generate any value are
+  crossed out, leaving some very boring assignments here and there.
+
+  \starthaskell
+  avg i --spacked-- = res
+    where
+      s = --case spacked of (State s) -> s--
+      --accums = case s of (accums, \_) -> accums--
+      count = case s of (--\_,-- count) -> count
+      accumres = accum i --accums--
+      --accums' = case accumres of (accums', \_) -> accums'--
+      sum = case accumres of (--\_,-- sum) -> sum
+      count' = count + 1
+      o = sum / count'
+      s' = (--accums',-- count')
+      --spacked' = State s'--
+      res = (--spacked',-- o)
+  \stophaskell
+
+  If we really were to leave out the crossed out parts, we would get a
+  slightly weird program: there is a variable \type{s} which has no value,
+  and there is a variable \type{s'} that is never used. Together, these two
+  will form the state proc of the function. \type{s} contains the
+  \quote{current} state, \type{s'} is assigned the \quote{next} state. So,
+  at the end of each clock cycle, \type{s'} should be assigned to \type{s}.
+
+  Note that the definition of \type{s'} is not removed, even though one
+  might think of it as having a state type. Since the state type has a
+  single argument constructor \type{State}, any value that should become
+  the resulting state must always be explicitly packed with the
+  \type{State} constructor, allowing us to remove the packed version, but
+  still generate \small{VHDL} for the unpacked version (of course with any
+  substates removed).
+
+  As you can see, the definition of \type{s'} is still present, since it
+  does not have a state type (it is not packed with the \type{State}
+  constructor). The \type{accums'} substate has been removed, leaving us
+  with just the state of \type{avg} itself.
+
+  \subsection{Initial state}
+  How to specify the initial state? Cannot be done inside a hardware
+  function, since the initial state is its own state argument for the first
+  call (unless you add an explicit, synchronous reset port).
+
+  External init state is natural for simulation.
+
+  External init state works for hardware generation as well.
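+
+  As a sketch of why an external initial state is natural for simulation,
+  consider a hypothetical simulation driver (the name \type{simulate} and
+  its interface are assumptions for illustration, not part of the
+  prototype):
+
+  \starttyping
+  -- Simulate a stateful function: the caller supplies the initial
+  -- state, and the driver threads the state through every cycle.
+  simulate :: (i -> s -> (s, o)) -> s -> [i] -> [o]
+  simulate f s [] = []
+  simulate f s (i:is) = o : simulate f s' is
+    where (s', o) = f i s
+  \stoptyping
+
+  The same principle could apply to hardware generation: the initial state
+  is supplied from outside the function itself, \eg, as the initialization
+  value of the generated register.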
+
+  Implementation issues: state splitting, linking input to output state,
+  checking usage constraints on state variables.
+
+  \subsection[sec:prototype:separate]{Separate compilation}
+  - Simplified core?
+  \section{Haskell language coverage and constraints}
+  Recursion
+  Builtin types
+  Custom types (Sum types, product types)
+  Function types / higher order expressions
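+
+  As an illustrative sketch of the custom types mentioned above (these
+  example types are hypothetical, not taken from the prototype):
+
+  \starttyping
+  -- A product type: all fields are present at once.
+  data Registers = Registers Word Word
+  -- A sum type: exactly one of the alternatives is present at a time.
+  data Instruction = Add | Sub | Jump Word
+  \stoptyping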