Fix source indentation.

[matthijs/master-project/report.git] / Chapters / Prototype.tex
diff --git a/Chapters/Prototype.tex b/Chapters/Prototype.tex

index 6ef6a9785a2f9f38ed673722cb14522071ffda89..fa0b54a4cdefe669fa3b47798f059acfc84ff16e 100644 (file)
--- a/Chapters/Prototype.tex
+++ b/Chapters/Prototype.tex
@@ -11,7 +11,7 @@
    has gone through a number of design iterations which we will not completely
    describe here.
  
    has gone through a number of design iterations which we will not completely
    describe here.
  
-  \section{Choice of language}
+  \section[sec:prototype:input]{Input language}
      When implementing this prototype, the first question to ask is: What
      (functional) language will we use to describe our hardware? (Note that
      this does not concern the \emph{implementation language} of the compiler,
      When implementing this prototype, the first question to ask is: What
      (functional) language will we use to describe our hardware? (Note that
      this does not concern the \emph{implementation language} of the compiler,
@@ -50,6 +50,50 @@
  
      TODO: Was Haskell really a good choice? Perhaps say this somewhere else?
  
  
      TODO: Was Haskell really a good choice? Perhaps say this somewhere else?
  
+  \section[sec:prototype:output]{Output format}
+    The second important question is: What will be our output format? Since
+    our prototype won't be able to program FPGA's directly, we'll have to have
+    output our hardware in some format that can be later processed and
+    programmed by other tools.
+
+    Looking at other tools in the industry, the Electronic Design Interchange
+    Format (\small{EDIF}) is commonly used for storing intermediate
+    \emph{netlists} (lists of components and connections between these
+    components) and is commonly the target for \small{VHDL} and Verilog
+    compilers.
+
+    However, \small{EDIF} is not completely tool-independent. It specifies a
+    meta-format, but the hardware components that can be used vary between
+    various tool and hardware vendors, as well as the interpretation of the
+    \small{EDIF} standard (TODO Is this still true? Reference:
+    http://delivery.acm.org/10.1145/80000/74534/p803-li.pdf?key1=74534\&key2=8370537521\&coll=GUIDE\&dl=GUIDE\&CFID=61207158\&CFTOKEN=61908473).
+   
+    This means that when working with EDIF, our prototype would become
+    technology dependent (\eg only work with \small{FPGA}s of a specific
+    vendor, or even only with specific chips). This limits the applicability
+    of our prototype. Also, the tools we'd like to use for verifying,
+    simulating and draw pretty pictures of our output (like Precision, or
+    QuestaSim) work on \small{VHDL} or Verilog input (TODO: Is this really
+    true?).
+
+    For these reasons, we will use \small{VHDL} as our output language.
+    Verilog is not used simply because we are familiar with \small{VHDL}
+    already. The differences between \small{VHDL} and Verilog are on the
+    higher level, while we will be using \small{VHDL} mainly to write low
+    level, netlist-like descriptions anyway.
+
+    An added advantage of using VHDL is that we can profit from existing
+    optimizations in VHDL synthesizers. A lot of optimizations are done on the
+    VHDL level by existing tools. These tools have years of experience in this
+    field, so it would not be reasonable to assume we could achieve a similar
+    amount of optimization in our prototype (nor should it be a goal,
+    considering this is just a prototype).
+
+    Note that we will be using \small{VHDL} as our output language, but will
+    not use its full expressive power. Our output will be limited to using
+    simple, structural descriptions, without any behavioural descriptions
+    (which might not be supported by all tools).
+
    \section{Prototype design}
      As stated above, we will use the Glasgow Haskell Compiler (\small{GHC}) to
      implement our prototype compiler. To understand the design of the
    \section{Prototype design}
      As stated above, we will use the Glasgow Haskell Compiler (\small{GHC}) to
      implement our prototype compiler. To understand the design of the
@@ -218,9 +262,9 @@
      Each Core expression consists of one of these possible expressions.
  
      \startdesc{Variable reference}
      Each Core expression consists of one of these possible expressions.
  
      \startdesc{Variable reference}
-\startlambda
-a
-\stoplambda
+      \startlambda
+      a
+      \stoplambda
        This is a simple reference to a binder. It's written down as the
        name of the binder that is being referred to, which should of course be
        bound in a containing scope (including top level scope, so a reference
        This is a simple reference to a binder. It's written down as the
        name of the binder that is being referred to, which should of course be
        bound in a containing scope (including top level scope, so a reference
@@ -237,19 +281,21 @@ a
        cast expressions. This minimize the amount of bookkeeping needed to keep
        the typing consistent.
      \stopdesc
        cast expressions. This minimize the amount of bookkeeping needed to keep
        the typing consistent.
      \stopdesc
+
      \startdesc{Literal}
      \startdesc{Literal}
-\startlambda
-10
-\stoplambda
+      \startlambda
+      10
+      \stoplambda
        This is a simple literal. Only primitive types are supported, like
        chars, strings, ints and doubles. The types of these literals are the
        \quote{primitive} versions, like \lam{Char\#} and \lam{Word\#}, not the
        normal Haskell versions (but there are builtin conversion functions).
      \stopdesc
        This is a simple literal. Only primitive types are supported, like
        chars, strings, ints and doubles. The types of these literals are the
        \quote{primitive} versions, like \lam{Char\#} and \lam{Word\#}, not the
        normal Haskell versions (but there are builtin conversion functions).
      \stopdesc
+
      \startdesc{Application}
      \startdesc{Application}
-\startlambda
-func arg
-\stoplambda
+      \startlambda
+      func arg
+      \stoplambda
        This is simple function application. Each application consists of two
        parts: The function part and the argument part. Applications are used
        for normal function \quote{calls}, but also for applying type
        This is simple function application. Each application consists of two
        parts: The function part and the argument part. Applications are used
        for normal function \quote{calls}, but also for applying type
@@ -258,10 +304,11 @@ func arg
        The value of an application is the value of the function part, with the
        first argument binder bound to the argument part.
      \stopdesc
        The value of an application is the value of the function part, with the
        first argument binder bound to the argument part.
      \stopdesc
+
      \startdesc{Lambda abstraction}
      \startdesc{Lambda abstraction}
-\startlambda
-λbndr.body
-\stoplambda
+      \startlambda
+      λbndr.body
+      \stoplambda
        This is the basic lambda abstraction, as it occurs in labmda calculus.
        It consists of a binder part and a body part.  A lambda abstraction
        creates a function, that can be applied to an argument. 
        This is the basic lambda abstraction, as it occurs in labmda calculus.
        It consists of a binder part and a body part.  A lambda abstraction
        creates a function, that can be applied to an argument. 
@@ -274,10 +321,11 @@ func arg
        The value of an application is the value of the body part, with the
        binder bound to the value the entire lambda abstraction is applied to.
      \stopdesc
        The value of an application is the value of the body part, with the
        binder bound to the value the entire lambda abstraction is applied to.
      \stopdesc
+
      \startdesc{Non-recursive let expression}
      \startdesc{Non-recursive let expression}
-\startlambda
-let bndr = value in body
-\stoplambda
+      \startlambda
+      let bndr = value in body
+      \stoplambda
        A let expression allows you to bind a binder to some value, while
        evaluating to some other value (where that binder is in scope). This
        allows for sharing of subexpressions (you can use a binder twice) and
        A let expression allows you to bind a binder to some value, while
        evaluating to some other value (where that binder is in scope). This
        allows for sharing of subexpressions (you can use a binder twice) and
@@ -290,9 +338,9 @@ let bndr = value in body
        calculus, it is easily translated to a lambda abstraction. The let
        expression above would then become:
  
        calculus, it is easily translated to a lambda abstraction. The let
        expression above would then become:
  
-\startlambda
-(λbndr.body) value
-\stoplambda
+      \startlambda
+      (λbndr.body) value
+      \stoplambda
  
        This notion might be useful for verifying certain properties on
        transformations, since a lot of verification work has been done on
  
        This notion might be useful for verifying certain properties on
        transformations, since a lot of verification work has been done on
@@ -301,16 +349,16 @@ let bndr = value in body
        The value of a let expression is the value of the body part, with the
        binder bound to the value. 
      \stopdesc
        The value of a let expression is the value of the body part, with the
        binder bound to the value. 
      \stopdesc
-    \startdesc{Recursive let expression}
-\startlambda
-let 
-  bndr1 = value1
-  \vdots
-  bndrn = valuen
-in 
-  body
-\stoplambda
  
  
+    \startdesc{Recursive let expression}
+      \startlambda
+      letrec
+        bndr1 = value1
+        \vdots
+        bndrn = valuen
+      in 
+        body
+      \stoplambda
        This is the recursive version of the let expression. In \small{GHC}'s
        Core implementation, non-recursive and recursive lets are not so
        distinct as we present them here, but this provides a clearer overview.
        This is the recursive version of the let expression. In \small{GHC}'s
        Core implementation, non-recursive and recursive lets are not so
        distinct as we present them here, but this provides a clearer overview.
@@ -323,141 +371,401 @@ in
        lambda calculus, if we use the \emph{least fixed-point operator},
        \lam{Y}.
      \stopdesc
        lambda calculus, if we use the \emph{least fixed-point operator},
        \lam{Y}.
      \stopdesc
+
      \startdesc{Case expression}
      \startdesc{Case expression}
-\startlambda
-  case scrut of bndr
-    DEFAULT -> defaultbody
-    C0 bndr0,0 ... bndr0,m -> body0
-    \vdots
-    Cn bndrn,0 ... bndrn,m -> bodyn
-\stoplambda
-
-TODO: Define WHNF
-
-    A case expression is the only way in Core to choose between values. A case
-    expression evaluates its scrutinee, which should have an algebraic
-    datatype, into weak head normal form (\small{WHNF}) and (optionally) binds
-    it to \lam{bndr}. It then chooses a body depending on the constructor of
-    its scrutinee. If none of the constructors match, the \lam{DEFAULT}
-    alternative is chosen. 
-    
-    Since we can only match the top level constructor, there can be no overlap
-    in the alternatives and thus order of alternatives is not relevant (though
-    the \lam{DEFAULT} alternative must appear first for implementation
-    efficiency).
-    
-    Any arguments to the constructor in the scrutinee are bound to each of the
-    binders after the constructor and are in scope only in the corresponding
-    body.
-
-    To support strictness, the scrutinee is always evaluated into WHNF, even
-    when there is only a \lam{DEFAULT} alternative. This allows a strict
-    function argument to be written like:
-
-\startlambda
-function (case argument of arg
-  DEFAULT -> arg)
-\stoplambda
-
-    This seems to be the only use for the extra binder to which the scrutinee
-    is bound. When not using strictness annotations (which is rather pointless
-    in hardware descriptions), \small{GHC} seems to never generate any code
-    making use of this binder. The current prototype does not handle it
-    either, which probably means that code using it would break.
-
-    Note that these case statements are less powerful than the full Haskell
-    case statements. In particular, they do not support complex patterns like
-    in Haskell. Only the constructor of an expression can be matched, complex
-    patterns are implemented using multiple nested case expressions.
-
-    Case statements are also used for unpacking of algebraic datatypes, even
-    when there is only a single constructor. For examples, to add the elements
-    of a tuple, the following Core is generated:
-
-\startlambda
-sum = λtuple.case tuple of
-  (,) a b -> a + b
-\stoplambda
-  
-    Here, there is only a single alternative (but no \lam{DEFAULT}
-    alternative, since the single alternative is already exhaustive). When
-    it's body is evaluated, the arguments to the tuple constructor \lam{(,)}
-    (\eg, the elements of the tuple) are bound to \lam{a} and \lam{b}.
-  \stopdesc
-  \startdesc{Cast expression}
-\startlambda
-body :: targettype
-\stoplambda
-    A cast expression allows you to change the type of an expression to an
-    equivalent type. Note that this is not meant to do any actual work, like
-    conversion of data from one format to another, or force a complete type
-    change. Instead, it is meant to change between different representations
-    of the same type, \eg switch between types that are provably equal (but
-    look different).
+      \startlambda
+        case scrut of bndr
+          DEFAULT -> defaultbody
+          C0 bndr0,0 ... bndr0,m -> body0
+          \vdots
+          Cn bndrn,0 ... bndrn,m -> bodyn
+      \stoplambda
+
+      TODO: Define WHNF
+
+      A case expression is the only way in Core to choose between values. A case
+      expression evaluates its scrutinee, which should have an algebraic
+      datatype, into weak head normal form (\small{WHNF}) and (optionally) binds
+      it to \lam{bndr}. It then chooses a body depending on the constructor of
+      its scrutinee. If none of the constructors match, the \lam{DEFAULT}
+      alternative is chosen. 
+      
+      Since we can only match the top level constructor, there can be no overlap
+      in the alternatives and thus order of alternatives is not relevant (though
+      the \lam{DEFAULT} alternative must appear first for implementation
+      efficiency).
+      
+      Any arguments to the constructor in the scrutinee are bound to each of the
+      binders after the constructor and are in scope only in the corresponding
+      body.
+
+      To support strictness, the scrutinee is always evaluated into WHNF, even
+      when there is only a \lam{DEFAULT} alternative. This allows a strict
+      function argument to be written like:
+
+      \startlambda
+      function (case argument of arg
+        DEFAULT -> arg)
+      \stoplambda
+
+      This seems to be the only use for the extra binder to which the scrutinee
+      is bound. When not using strictness annotations (which is rather pointless
+      in hardware descriptions), \small{GHC} seems to never generate any code
+      making use of this binder. The current prototype does not handle it
+      either, which probably means that code using it would break.
+
+      Note that these case statements are less powerful than the full Haskell
+      case statements. In particular, they do not support complex patterns like
+      in Haskell. Only the constructor of an expression can be matched, complex
+      patterns are implemented using multiple nested case expressions.
+
+      Case statements are also used for unpacking of algebraic datatypes, even
+      when there is only a single constructor. For examples, to add the elements
+      of a tuple, the following Core is generated:
+
+      \startlambda
+      sum = λtuple.case tuple of
+        (,) a b -> a + b
+      \stoplambda
      
      
-    In our hardware descriptions, we typically see casts to change between a
-    Haskell newtype and its contained type, since those are effectively
-    different representations of the same type.
-
-    More complex are types that are proven to be equal by the typechecker,
-    but look different at first glance. To ensure that, once the typechecker
-    has proven equality, this information sticks around, explicit casts are
-    added. In our notation we only write the target type, but in reality a
-    cast expressions carries around a \emph{coercion}, which can be seen as a
-    proof of equality. TODO: Example
-
-    The value of a cast is the value of its body, unchanged. The type of this
-    value is equal to the target type, not the type of its body.
-
-    Note that this syntax is also used sometimes to indicate that a particular
-    expression has a particular type, even when no cast expression is
-    involved. This is then purely informational, since the only elements that
-    are explicitely typed in the Core language are the binder references and
-    cast expressions, the types of all other elements are determined at
-    runtime.
-  \stopdesc
-  \startdesc{Note}
-
-    The Core language in \small{GHC} allows adding \emph{notes}, which serve
-    as hints to the inliner or add custom (string) annotations to a core
-    expression. These shouldn't be generated normally, so these are not
-    handled in any way in the prototype.
-  \stopdesc
-  \startdesc{Type}
-\startlambda
-@type
-\stoplambda
-    It is possibly to use a Core type as a Core expression. This is done to
-    allow for type abstractions and applications to be handled as normal
-    lambda abstractions and applications above. This means that a type
-    expression in Core can only ever occur in the argument position of an
-    application, and only if the type of the function that is applied to
-    expects a type as the first argument. This happens for all polymorphic
-    functions, for example, the \lam{fst} function:
-
-\startlambda
-fst :: \forall a. \forall b. (a, b) -> a
-fst = λtup.case tup of (,) a b -> a
-
-fstint :: (Int, Int) -> Int
-fstint = λa.λb.fst @Int @Int a b
-\stoplambda
+      Here, there is only a single alternative (but no \lam{DEFAULT}
+      alternative, since the single alternative is already exhaustive). When
+      it's body is evaluated, the arguments to the tuple constructor \lam{(,)}
+      (\eg, the elements of the tuple) are bound to \lam{a} and \lam{b}.
+    \stopdesc
+
+    \startdesc{Cast expression}
+      \startlambda
+      body :: targettype
+      \stoplambda
+      A cast expression allows you to change the type of an expression to an
+      equivalent type. Note that this is not meant to do any actual work, like
+      conversion of data from one format to another, or force a complete type
+      change. Instead, it is meant to change between different representations
+      of the same type, \eg switch between types that are provably equal (but
+      look different).
+      
+      In our hardware descriptions, we typically see casts to change between a
+      Haskell newtype and its contained type, since those are effectively
+      different representations of the same type.
+
+      More complex are types that are proven to be equal by the typechecker,
+      but look different at first glance. To ensure that, once the typechecker
+      has proven equality, this information sticks around, explicit casts are
+      added. In our notation we only write the target type, but in reality a
+      cast expressions carries around a \emph{coercion}, which can be seen as a
+      proof of equality. TODO: Example
+
+      The value of a cast is the value of its body, unchanged. The type of this
+      value is equal to the target type, not the type of its body.
+
+      Note that this syntax is also used sometimes to indicate that a particular
+      expression has a particular type, even when no cast expression is
+      involved. This is then purely informational, since the only elements that
+      are explicitely typed in the Core language are the binder references and
+      cast expressions, the types of all other elements are determined at
+      runtime.
+    \stopdesc
+
+    \startdesc{Note}
+
+      The Core language in \small{GHC} allows adding \emph{notes}, which serve
+      as hints to the inliner or add custom (string) annotations to a core
+      expression. These shouldn't be generated normally, so these are not
+      handled in any way in the prototype.
+    \stopdesc
+
+    \startdesc{Type}
+      \startlambda
+      @type
+      \stoplambda
+      It is possibly to use a Core type as a Core expression. This is done to
+      allow for type abstractions and applications to be handled as normal
+      lambda abstractions and applications above. This means that a type
+      expression in Core can only ever occur in the argument position of an
+      application, and only if the type of the function that is applied to
+      expects a type as the first argument. This happens for all polymorphic
+      functions, for example, the \lam{fst} function:
+
+      \startlambda
+      fst :: \forall a. \forall b. (a, b) -> a
+      fst = λtup.case tup of (,) a b -> a
+
+      fstint :: (Int, Int) -> Int
+      fstint = λa.λb.fst @Int @Int a b
+      \stoplambda
+          
+      The type of \lam{fst} has two universally quantified type variables. When
+      \lam{fst} is applied in \lam{fstint}, it is first applied to two types.
+      (which are substitued for \lam{a} and \lam{b} in the type of \lam{fst}, so
+      the type of \lam{fst} actual type of arguments and result can be found:
+      \lam{fst @Int @Int :: (Int, Int) -> Int}).
+    \stopdesc
+
+    TODO: Core type system
+
+  \section[sec:prototype:statetype]{State annotations in Haskell}
+      Ideal: Type synonyms, since there is no additional code overhead for
+      packing and unpacking. Downside: there is no explicit conversion in Core
+      either, so type synonyms tend to get lost in expressions (they can be
+      preserved in binders, but this makes implementation harder, since that
+      statefulness of a value must be manually tracked).
+
+      Less ideal: Newtype. Requires explicit packing and unpacking of function
+      arguments. If you don't unpack substates, there is no overhead for
+      (un)packing substates. This will result in many nested State constructors
+      in a nested state type. \eg: 
+
+      \starttyping
+      State (State Bit, State (State Word, Bit), Word)
+      \stoptyping
+
+      Alternative: Provide different newtypes for input and output state. This
+      makes the code even more explicit, and typechecking can find even more
+      errors. However, this requires defining two type synomyms for each
+      stateful function instead of just one. \eg:
+
+      \starttyping
+      type AccumStateIn = StateIn Bit
+      type AccumStateOut = StateOut Bit
+      \stoptyping
+
+      This also increases the possibility of having different input and output
+      states. Checking for identical input and output state types is also
+      harder, since each element in the state must be unpacked and compared
+      separately.
+
+      Alternative: Provide a type for the entire result type of a stateful
+      function, not just the state part. \eg:
+
+      \starttyping
+      newtype Result state result = Result (state, result)
+      \stoptyping
+      
+      This makes it easy to say "Any stateful function must return a
+      \type{Result} type, without having to sort out result from state. However,
+      this either requires a second type for input state (similar to
+      \type{StateIn} / \type{StateOut} above), or requires the compiler to
+      select the right argument for input state by looking at types (which works
+      for complex states, but when that state has the same type as an argument,
+      things get ambiguous) or by selecting a fixed (\eg, the last) argument,
+      which might be limiting.
+
+      \subsubsection{Example}
+      As an example of the used approach, a simple averaging circuit, that lets
+      the accumulation of the inputs be done by a subcomponent.
+
+      \starttyping
+        newtype State s = State s
+
+        type AccumState = State Bit
+        accum :: Word -> AccumState -> (AccumState, Word)
+        accum i (State s) = (State (s + i), s + i)
+
+        type AvgState = (AccumState, Word)
+        avg :: Word -> AvgState -> (AvgState, Word)
+        avg i (State s) = (State s', o)
+          where
+            (accums, count) = s
+            -- Pass our input through the accumulator, which outputs a sum
+            (accums', sum) = accum i accums
+            -- Increment the count (which will be our new state)
+            count' = count + 1
+            -- Compute the average
+            o = sum / count'
+            s' = (accums', count')
+      \stoptyping
+
+      And the normalized, core-like versions:
+
+      \starttyping
+        accum i spacked = res
+          where
+            s = case spacked of (State s) -> s
+            s' = s + i
+            spacked' = State s'
+            o = s + i
+            res = (spacked', o)
+
+        avg i spacked = res
+          where
+            s = case spacked of (State s) -> s
+            accums = case s of (accums, \_) -> accums
+            count = case s of (\_, count) -> count
+            accumres = accum i accums
+            accums' = case accumres of (accums', \_) -> accums'
+            sum = case accumres of (\_, sum) -> sum
+            count' = count + 1
+            o = sum / count'
+            s' = (accums', count')
+            spacked' = State s'
+            res = (spacked', o)
+      \stoptyping
+
+
+
+      As noted above, any component of a function's state that is a substate,
+      \eg passed on as the state of another function, should have no influence
+      on the hardware generated for the calling function. Any state-specific
+      \small{VHDL} for this component can be generated entirely within the called
+      function. So,we can completely leave out substates from any function.
+      
+      From this observation, we might think to remove the substates from a
+      function's states alltogether, and leave only the state components which
+      are actual states of the current function. While doing this would not
+      remove any information needed to generate \small{VHDL} from the function, it would
+      cause the function definition to become invalid (since we won't have any
+      substate to pass to the functions anymore). We could solve the syntactic
+      problems by passing \type{undefined} for state variables, but that would
+      still break the code on the semantic level (\ie, the function would no
+      longer be semantically equivalent to the original input).
+
+      To keep the function definition correct until the very end of the process,
+      we will not deal with (sub)states until we get to the \small{VHDL} generation.
+      Here, we are translating from Core to \small{VHDL}, and we can simply not generate
+      \small{VHDL} for substates, effectively removing the substate components
+      alltogether.
+
+      There are a few important points when ignore substates.
+
+      First, we have to have some definition of "substate". Since any state
+      argument or return value that represents state must be of the \type{State}
+      type, we can simply look at its type. However, we must be careful to
+      ignore only {\em substates}, and not a function's own state.
+
+      In the example above, this means we should remove \type{accums'} from
+      \type{s'}, but not throw away \type{s'} entirely. We should, however,
+      remove \type{s'} from the output port of the function, since the state
+      will be handled by a \small{VHDL} procedure within the function.
+
+      When looking at substates, these can appear in two places: As part of an
+      argument and as part of a return value. As noted above, these substates
+      can only be used in very specific ways.
+
+      \desc{State variables can appear as an argument.} When generating \small{VHDL}, we
+      completely ignore the argument and generate no input port for it.
+
+      \desc{State variables can be extracted from other state variables.} When
+      extracting a state variable from another state variable, this always means
+      we're extracting a substate, which we can ignore. So, we simply generate no
+      \small{VHDL} for any extraction operation that has a state variable as a result.
+
+      \desc{State variables can be passed to functions.} When passing a
+      state variable to a function, this always means we're passing a substate
+      to a subcomponent. The entire argument can simply be ingored in the
+      resulting port map.
+
+      \desc{State variables can be returned from functions.} When returning a
+      state variable from a function (probably as a part of an algebraic
+      datatype), this always mean we're returning a substate from a
+      subcomponent. The entire state variable should be ignored in the resulting
+      port map. The type binder of the binder that the function call is bound
+      to should not include the state type either.
+
+      \startdesc{State variables can be inserted into other variables.} When inserting
+      a state variable into another variable (usually by constructing that new
+      variable using its constructor), we can identify two cases: 
+
+      \startitemize
+        \item The state is inserted into another state variable. In this case,
+        the inserted state is a substate, and can be safely left out of the
+        constructed variable.
+        \item The state is inserted into a non-state variable. This happens when
+        building up the return value of a function, where you put state and
+        retsult variables together in an algebraic type (usually a tuple). In
+        this case, we should leave the state variable out as well, since we
+        don't want it to be included as an output port.
+      \stopitemize
+
+      So, in both cases, we can simply leave out the state variable from the
+      resulting value. In the latter case, however, we should generate a state
+      proc instead, which assigns the state variable to the input state variable
+      at each clock tick.
+      \stopdesc
+      
+      \desc{State variables can appear as (part of) a function result.} When
+      generating \small{VHDL}, we can completely ignore any part of a function result
+      that has a state type. If the entire result is a state type, this will
+      mean the entity will not have an output port. Otherwise, the state
+      elements will be removed from the type of the output port.
+
+
+      Now, we know how to handle each use of a state variable separately. If we
+      look at the whole, we can conclude the following:
+
+      \startitemize
+        \item A state unpack operation should not generate any \small{VHDL}. The binder
+        to which the unpacked state is bound should still be declared, this signal
+        will become the register and will hold the current state.
+        \item A state pack operation should not generate any \small{VHDL}. The binder th
+        which the packed state is bound should not be declared. The binder that is
+        packed is the signal that will hold the new state.
+        \item Any values of a State type should not be translated to \small{VHDL}. In
+        particular, State elements should be removed from tuples (and other
+        datatypes) and arguments with a state type should not generate ports.
+        \item To make the state actually work, a simple \small{VHDL} proc should be
+        generated. This proc updates the state at every clockcycle, by assigning
+        the new state to the current state. This will be recognized by synthesis
+        tools as a register specification.
+      \stopitemize
+
+
+      When applying these rules to the example program (in normal form), we will
+      get the following result. All the parts that don't generate any value are
+      crossed out, leaving some very boring assignments here and there.
+      
      
      
-    The type of \lam{fst} has two universally quantified type variables. When
-    \lam{fst} is applied in \lam{fstint}, it is first applied to two types.
-    (which are substitued for \lam{a} and \lam{b} in the type of \lam{fst}, so
-    the type of \lam{fst} actual type of arguments and result can be found:
-    \lam{fst @Int @Int :: (Int, Int) -> Int}).
-  \stopdesc
+  \starthaskell
+    avg i --spacked-- = res
+      where
+        s = --case spacked of (State s) -> s--
+        --accums = case s of (accums, \_) -> accums--
+        count = case s of (--\_,-- count) -> count
+        accumres = accum i --accums--
+        --accums' = case accumres of (accums', \_) -> accums'--
+        sum = case accumres of (--\_,-- sum) -> sum
+        count' = count + 1
+        o = sum / count'
+        s' = (--accums',-- count')
+        --spacked' = State s'--
+        res = (--spacked',-- o)
+  \stophaskell
+          
+      When we would really leave out the crossed out parts, we get a slightly
+      weird program: There is a variable \type{s} which has no value, and there
+      is a variable \type{s'} that is never used. Together, these two will form
+      the state proc of the function. \type{s} contains the "current" state,
+      \type{s'} is assigned the "next" state. So, at the end of each clock
+      cycle, \type{s'} should be assigned to \type{s}.
+
+      Note that the definition of \type{s'} is not removed, even though one
+      might think it as having a state type. Since the state type has a single
+      argument constructor \type{State}, some type that should be the resulting
+      state should always be explicitly packed with the State constructor,
+      allowing us to remove the packed version, but still generate \small{VHDL} for the
+      unpacked version (of course with any substates removed).
+      
+      As you can see, the definition of \type{s'} is still present, since it
+      does not have a state type (The State constructor. The \type{accums'} substate has been removed,
+      leaving us just with the state of \type{avg} itself.
+    \subsection{Initial state}
+      How to specify the initial state? Cannot be done inside a hardware
+      function, since the initial state is its own state argument for the first
+      call (unless you add an explicit, synchronous reset port).
  
  
-  TODO: Core type system
+      External init state is natural for simulation.
  
  
-        Implementation issues
-        Simplified core?
+      External init state works for hardware generation as well.
  
  
-        Haskell language coverage / constraints
-                Recursion
-                Builtin types
-                Custom types (Sum types, product types)
-                Function types / higher order expressions
+      Implementation issues: state splitting, linking input to output state,
+      checking usage constraints on state variables.
  
  
+        Implementation issues
+          \subsection[sec:prototype:separate]{Separate compilation}
+          - Simplified core?
+
+  \section{Haskell language coverage and constraints}
+    Recursion
+    Builtin types
+    Custom types (Sum types, product types)
+    Function types / higher order expressions