From: Matthijs Kooijman Date: Sun, 6 Dec 2009 20:08:39 +0000 (+0100) Subject: Fix a lot of things following from Jan's comments. X-Git-Tag: final-thesis~80 X-Git-Url: https://git.stderr.nl/gitweb?p=matthijs%2Fmaster-project%2Freport.git;a=commitdiff_plain;h=124f838008d9e63d36d6626ebeb453d9f83129dc Fix a lot of things following from Jan's comments. --- diff --git a/Chapters/Context.tex b/Chapters/Context.tex index 9444a6e..f07f1d9 100644 --- a/Chapters/Context.tex +++ b/Chapters/Context.tex @@ -1,6 +1,6 @@ \chapter[chap:context]{Context} An obvious question that arises when starting any research is \quote{Has - this not been done before?} Using a functional language for describing hardware + this not been done before?}. Using a functional language for describing hardware is not a new idea at all. In fact, there has been research into functional hardware description even before the conventional hardware description languages were created. \todo{Reference about early FHDLs} However, @@ -86,7 +86,7 @@ variables (\eg, using the same variable twice while only calculating it once) and cycles in circuits are non-trivial to properly and safely translate (though there is some work to fix this, but that has not been - possible in a completely reliable way yet. \cite[gill09] + possible in a completely reliable way yet \cite[gill09]). \item Some things are verbose to express. Especially ForSyDe suffers from a lot of notational overhead due to the Template Haskell approach used. Since conditional expressions are not supported, a lot of Haskell's @@ -99,10 +99,10 @@ programmers), but the described circuits do not have any polymorphism or higher order functions, which can be limiting. \todo{How true or appropriate is this point?} - \todo[left]{Function structure gets lost (in Lava)} + \todo{Function structure gets lost (in Lava)} \stopitemize - \todo[text]{Complete translation in TH is complex: Works with Haskell AST + \todo{Complete translation in TH is complex: Works with Haskell AST instead of Core} % vim: set sw=2 sts=2 expandtab: diff --git a/Chapters/Future.tex b/Chapters/Future.tex index 38d618f..7331e9c 100644 --- a/Chapters/Future.tex +++ b/Chapters/Future.tex @@ -64,10 +64,10 @@ do will be desugared into: \starthaskell -(somefunc a) >>= (\\x -> otherfunc x) +(somefunc a) >>= (\x -> otherfunc x) \stophaskell -The \hs{\\x -> ...} notation creates a lambda abstraction in Haskell, +The \hs{\x -> ...} notation creates a lambda abstraction in Haskell, that binds the \hs{x} variable. Here, the \hs{>>=} operator is supposed to extract whatever result somefunc has and pass it to the lambda expression created. This will probably not make the monadic notation @@ -82,9 +82,9 @@ just moved into these functions), the compiler should still be able to compile these descriptions without any special magic (though perhaps it should always inline the binding operators to reveal the flow of values). -This is highlights an important aspect of using a functional language for our +This highlights an important aspect of using a functional language for our descriptions: We can use the language itself to provide abstractions of common -patterns, making our code smaller. +patterns, making our code smaller and easier to read. \subsection{Breaking out of the Monad} However, simply using the monad notation is not as easy as it sounds. 
The main @@ -105,12 +105,12 @@ type Stateful s r = s -> (s, r) (>>) :: Stateful s1 r1 -> Stateful s2 r2 -> Stateful (s1, s2) r2 \stophaskell -What we see here, is that when we compose two stateful functions (whose inputs -have already been applied, leaving just the state argument to be applied), the -result is again a stateful function whose state is composed of the two -\emph{substates}. The return value is simply the return value of the second -function, discarding the first (to preserve that one, the \hs{>>=} operator -can be used). +What we see here, is that when we compose two stateful functions (that +have already been applied to their inputs, leaving just the state +argument to be applied to), the result is again a stateful function +whose state is composed of the two \emph{substates}. The return value is +simply the return value of the second function, discarding the first (to +preserve that result, the \hs{>>=} operator can be used). There is a trick we can apply to change the signature of the \hs{>>} operator. \small{GHC} does not require the bind operators to be part of the \hs{Monad} @@ -123,11 +123,12 @@ is slightly inconvenient, but since we hardly using anything from the prelude, this is not a big problem. We might even be able to replace some of the Prelude with hardware-translateable versions by doing this. -We can now define the following binding operators. For completeness, we also -supply the return function, which is not called by \small{GHC} implicitely, -but can be called explicitely by a hardware description. +The binding operators can now be defined exactly as they are needed. For +completeness, the \hs{return} function is also defined. It is not called +by \small{GHC} implicitely, but can be called explicitely by a hardware +description. \in{Example}[ex:StateMonad] shows these definitions. -\starthaskell +\startbuffer[StateMonad] (>>) :: Stateful s1 r1 -> Stateful s2 r2 -> Stateful (s1, s2) r2 f1 >> f2 = f1 >>= \_ -> f2 @@ -138,9 +139,11 @@ f1 >>= f2 = \(s1, s2) -> let (s1', r1) = f1 s1 return :: r -> Stateful () r return r = \s -> (s, r) -\stophaskell +\stopbuffer +\placeexample[here][ex:StateMonad]{Binding operators to compose stateful computations} + {\typebufferhs{StateMonad}} -As you can see, this closely resembles the boilerplate of unpacking state, +These definitions closely resemble the boilerplate of unpacking state, passing it to two functions and repacking the new state. With these definitions, we could have writting \in{example}[ex:NestedState] a lot shorter, see \in{example}[ex:DoState]. In this example the type signature of @@ -165,7 +168,7 @@ happening. outb <- funcb outa return outb \stopbuffer -\placeexample[here][ex:DoState]{Simple function composing two stateful +\placeexample[][ex:DoState]{Simple function composing two stateful functions, using do notation.} {\typebufferhs{DoState}} @@ -189,6 +192,13 @@ abstractions, we could hide a lot of the boilerplate code. Extending \small{GHC} with some new syntax sugar similar to the do notation might be a feasible. +\subsection{Arrows} +Another abstraction mechanism offered by Haskell are arrows. Arrows are +a generalisation of monads \cite[hughes98], for which \GHC also supports +some syntax sugar \cite[paterson01]. Their use for hiding away state +boilerplate is not directly evident, but since arrows are a complex +concept further investigation is appropriate. 
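To make the arrow idea slightly more concrete, the sketch below shows one possible (hypothetical) way to package the state threading behind the standard \hs{Arrow} class. This is an assumption for illustration only: the \hs{Circuit} wrapper, its instances and the single state type \hs{s} are not part of the Cλash design, and this formulation does not yet perform the substate composition that the custom \hs{>>=} operator above provides.

\starthaskell
import Prelude hiding (id, (.))
import Control.Category (Category(..))
import Control.Arrow (Arrow(..))

-- A stateful circuit from input i to output o, threading a state of type s.
-- Unlike the Stateful type above, the input has not been applied yet.
newtype Circuit s i o = Circuit { runCircuit :: i -> s -> (s, o) }

instance Category (Circuit s) where
  id = Circuit (\i s -> (s, i))
  Circuit g . Circuit f = Circuit $ \i s ->
    let (s' , x) = f i s
        (s'', o) = g x s'
    in  (s'', o)

instance Arrow (Circuit s) where
  -- Lift a pure (combinatoric) function; the state is passed through.
  arr f = Circuit (\i s -> (s, f i))
  -- Run a circuit on the first element of a pair, leaving the second alone.
  first (Circuit f) = Circuit $ \(i, rest) s ->
    let (s', o) = f i s
    in  (s', (o, rest))
\stophaskell

With such instances in place, \GHC's arrow syntax sugar \cite[paterson01] could be used to write descriptions in which the state argument never appears explicitly, which is exactly the kind of boilerplate reduction this section is after. Whether this can be combined with the per-function substate tuples used elsewhere in this chapter is one of the things further investigation would have to show.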
+ \section[sec:future:pipelining]{Improved notation or abstraction for pipelining} Since pipelining is a very common optimization for hardware systems, it should be easy to specify a pipelined system. Since it introduces quite some registers @@ -296,9 +306,7 @@ events (Like \quote{clock up}, \quote{clock down}, \quote{input A changed}, \quote{reset}, etc.). When working with multiple clock domains, each domain could get its own clock events. -As an example, we would have something like the following: - -\starthaskell +\startbuffer[AsyncDesc] data Event = ClockUp | Reset | ... type MainState = State Word @@ -321,30 +329,36 @@ main inp event (State acc) = (State acc', acc') -- func is some combinatoric expression func :: Word -> Event -> Word func inp _ = inp * 2 + 3 -\stophaskell +\stopbuffer + +\placeexample[][ex:AsyncDesc]{Hardware description using asynchronous events.} + {\typebufferhs{AsyncDesc}} \todo{Picture} -In this example, we see that every function takes an input of type -\hs{Event}. The function \hs{main} that takes the output of -\hs{func} and accumulates it on every clock cycle. On a reset signal, the -accumulator is reset. The function \hs{func} is just a combinatoric -function, with no synchronous elements. We can see this because there is no -state and the event input is completely ignored. If the compiler is smart -enough, we could even leave the event input out for functions that don't use -it, either because they are completely combinatoric (like in this example), or -because they rely on the the caller to select the clock signal. +\in{Example}[ex:AsyncDesc] shows a simple example of this event-based +approach. In this example we see that every function takes an input of +type \hs{Event}. The function \hs{main} that takes the output of +\hs{func} and accumulates it on every clock cycle. On a reset signal, +the accumulator is reset. The function \hs{func} is just a combinatoric +function, with no synchronous elements. We can see this because there +is no state and the event input is completely ignored. If the compiler +is smart enough, we could even leave the event input out for functions +that don't use it, either because they are completely combinatoric (like +in this example), or because they rely on the the caller to select the +clock signal. This structure is similar to the event handling structure used to perform I/O in languages like Amanda. \todo{ref} There is a top level case expression that decides what to do depending on the current input event. -A slightly more complex example that shows a system with two clock domains. -There is no real integration between the clock domains (there is one input and -one output for each clock domain), but it does show how multiple clocks can be -distinguished. +A slightly more complex example is show in +\in{example}[ex:MulticlockDesc]. It shows a system with two clock +domains. There is no real integration between the clock domains in this +example (there is one input and one output for each clock domain), but +it does show how multiple clocks can be distinguished. -\starthaskell +\startbuffer[MulticlockDesc] data Event = ClockUpA | ClockUpB | ... type MainState = State (SubAState, SubBState) @@ -376,21 +390,25 @@ suba = ... type SubBState = State ... subb :: Word -> SubAState -> (SubAState, Word) subb = ... 
-\stophaskell +\stopbuffer + +\placeexample[][ex:MulticlockDesc]{Hardware description with multiple clock domains through events.} + {\typebufferhs{MulticlockDesc}} -Note that in the above example the \hs{suba} and \hs{subb} functions are -\emph{always} called, to get at their combinatoric outputs. The new state -is calculated as well, but only saved when the right clock has an up -transition. +Note that in \in{example}[ex:MulticlockDesc] the \hs{suba} and \hs{subb} +functions are \emph{always} called, to get at their combinatoric +outputs. The new state is calculated as well, but only saved when the +right clock has an up transition. As you can see, there is some code duplication in the case expression that selects the right clock. One of the advantages of an explicit approach like this, is that some of this duplication can be extracted away into helper functions. For example, we could imagine a \hs{select_clock} function, which takes a stateful function that is not aware of events, and changes it into a -function that only updates its state on a specific (clock) event. +function that only updates its state on a specific (clock) event. Such a +function is shown in \in{example}[ex:SelectClock]. -\starthaskell +\startbuffer[SelectClock] select_clock :: Event -> (input -> State s -> (State s, output)) -> (input -> State s -> Event -> (State s, output)) @@ -404,7 +422,9 @@ main inpa inpb event (State (sa, sb)) = (State (sa', sb'), outa, outb) where (sa'', outa) = (select_clock ClockUpA suba) inpa sa event (sb'', outb) = (select_clock ClockUpB subb) inpb sb event -\stophaskell +\stopbuffer +\placeexample[][ex:SelectClock]{A function to filter clock events.} + {\typebufferhs{SelectClock}} As you can see, this can greatly reduce the length of the main function, while increasing the readability. As you might also have noted, the select\_clock @@ -481,9 +501,7 @@ Another interesting idea is putting a higher order value inside a function's state value. Since we can use higher order values anywhere, why not in the state? -As a (contrived) example, consider the following accumulator: - -\starthaskell +\startbuffer[HigherAccum] -- The accumulator function that takes a word and returns a new accumulator -- and the result so far. This is the function we want to put inside the -- state. @@ -493,14 +511,18 @@ acc = \a -> (\b -> acc ( a + b ), a ) main :: Word -> State Acc -> (State Acc, Word) main a s = (State s', out) where (s', out) = s a -\stophaskell - -This code uses a function as its state, which implicitely contains the value -accumulated so far. This is a fairly trivial example, that is more easy to -write with a simple \hs{Word} state value, but for more complex descriptions -this style might pay off. Note that in a way we are using the -\emph{continuation passing style} of writing functions, where we pass a -sort of \emph{continuation} from one cycle to the next. +\stopbuffer +\placeexample[][ex:HigherAccum]{An accumulator using a higher order state.} + {\typebufferhs{HigherAccum}} + +As a (contrived) example, consider the accumulator in +\in{example}[ex:HigherAccum]. This code uses a function as its state, +which implicitely contains the value accumulated so far. This is a +fairly trivial example, that is more easy to write with a simple +\hs{Word} state value, but for more complex descriptions this style +might pay off. Note that in a way we are using the \emph{continuation +passing style} of writing functions, where we pass a sort of +\emph{continuation} from one cycle to the next. 
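To see how such a function-valued state evolves over time, the following plain Haskell sketch simulates the accumulator for a few cycles. This is an illustration only and not Cλash code: the \hs{Acc} newtype (needed because Haskell does not allow a recursive type synonym) and the \hs{simulate} helper are assumptions made just for this example.

\starthaskell
import Data.Word (Word)

-- The state is itself a function from the next input to a new state and an
-- output. A newtype is needed, since a type synonym may not be recursive.
newtype Acc = Acc (Word -> (Acc, Word))

-- The accumulator from the example above: the total accumulated so far is
-- captured in the closure that becomes the next state.
acc :: Word -> (Acc, Word)
acc a = (Acc (\b -> acc (a + b)), a)

-- Simulate a number of clock cycles by threading the function-valued state.
simulate :: Acc -> [Word] -> [Word]
simulate _       []     = []
simulate (Acc s) (i:is) = let (s', o) = s i in o : simulate s' is

main :: IO ()
main = print (simulate (Acc acc) [1, 2, 3, 4])  -- prints [1,3,6,10]
\stophaskell

The simulation produces the running sums \hs{[1,3,6,10]}, the same behaviour a register-based accumulator with a simple \hs{Word} state would give.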
Our normalization process completely removes higher order values inside a function by moving applications and higher order values around so that every diff --git a/Chapters/HardwareDescription.tex b/Chapters/HardwareDescription.tex index e5dab45..ada3817 100644 --- a/Chapters/HardwareDescription.tex +++ b/Chapters/HardwareDescription.tex @@ -6,22 +6,23 @@ \todo{Shortshort introduction to Cλash (Bit, Word, and, not, etc.)} - When translating Haskell to hardware, we need to make choices about what kind - of hardware to generate for which Haskell constructs. When faced with - choices, we've tried to stick with the most obvious choice wherever - possible. In a lot of cases, when you look at a hardware description it is - comletely clear what hardware is described. We want our translator to - generate exactly that hardware whenever possible, to make working with Cλash - as intuitive as possible. + To translate Haskell to hardware, every Haskell construct needs a + translation to \VHDL. There are often multiple valid translations + possible. When faced with choices, the most obvious choice has been + chosen wherever possible. In a lot of cases, when a programmer looks + at a functional hardware description it is completely clear what + hardware is described. We want our translator to generate exactly that + hardware whenever possible, to make working with Cλash as intuitive as + possible. - In this chapter we try to describe how we interpret a Haskell program from a + In this chapter we describe how to interpret a Haskell program from a hardware perspective. We provide a description of each Haskell language element that needs translation, to provide a clear picture of what is supported and how. \section[sec:description:application]{Function application} \todo{Sidenote: Inputs vs arguments} - The basic syntactic element of a functional program are functions and + The basic syntactic elements of a functional program are functions and function application. These have a single obvious \small{VHDL} translation: Each function becomes a hardware component, where each argument is an input port and the result value is the (single) output port. This output port can have @@ -81,15 +82,15 @@ and3 a b c = and (and a b) c {\boxedgraphic{And3}}{The architecture described by the Haskell description.} \stopcombination - \note{This section should also mention hierarchy, top level functions and + \todo{This section should also mention hierarchy, top level functions and subcircuits} \section{Choice} Although describing components and connections allows us to describe a lot of hardware designs already, there is an obvious thing missing: Choice. We need some way to be able to choose between values based on another value. - In Haskell, choice is achieved by \hs{case} expressions, \hs{if} expressions or - pattern matching. + In Haskell, choice is achieved by \hs{case} expressions, \hs{if} + expressions, pattern matching and guards. An obvious way to add choice to our language without having to recognize any of Haskell's syntax, would be to add a primivite \quote{\hs{if}} @@ -395,14 +396,14 @@ and3 a b c = and (and a b) c last one if \hs{B}), all the other ones have no useful value. An obvious problem with this naive approach is the space usage: The - example above generates a fairly big \VHDL type. However, we can be + example above generates a fairly big \VHDL type. Since we can be sure that the two \hs{Word}s in the \hs{Sum} type will never be valid - at the same time, so this is a waste of space. 
+ at the same time, this is a waste of space. Obviously, we could do some duplication detection here to reuse a particular field for another constructor, but this would only partially solve the problem. If I would, for example, have an array of - 8 bits and a 8 bit unsiged word, these are different types and could + 8 bits and an 8 bit unsiged word, these are different types and could not be shared. However, in the final hardware, both of these types would simply be 8 bit connections, so we have a 100\% size increase by not sharing these. @@ -438,7 +439,7 @@ and3 a b c = and (and a b) c In general, we can say we can never properly translate recursive types: All recursive types have a potentially infinite value (even though in practice they will have a bounded value, there is no way for the - compiler to determine an upper bound on its size). + compiler to automatically determine an upper bound on its size). \subsection{Partial application} Now we've seen how to to translate application, choice and types, we will @@ -521,12 +522,15 @@ quadruple n = mul (mul n) calls to the same function share the same machine code. Having more machine code has implications for speed (due to less efficient caching) and memory usage. For normal compilation, it is therefore important to - keep the amount of functions limited and maximize the code sharing. + keep the amount of functions limited and maximize the code sharing + (though there is a tradeoff between speed and memory usage here). When generating hardware, this is hardly an issue. Having more \quote{code sharing} does reduce the amount of \small{VHDL} output (Since different component instantiations still share the same component), but after - synthesis, the amount of hardware generated is not affected. + synthesis, the amount of hardware generated is not affected. This + means there is no tradeoff between speed and memory (or rather, + chip area) usage. In particular, if we would duplicate all functions so that there is a separate function for every application in the program (\eg, each function @@ -542,12 +546,13 @@ quadruple n = mul (mul n) \subsection{Specialization} \defref{specialization} - Given some function that has a \emph{domain} $D$ (\eg, the set of all - possible arguments that could be applied), we create a specialized - function with exactly the same behaviour, but with a domain $D' \subset - D$. This subset can be chosen in all sorts of ways. Any subset is valid - for the general definition of specialization, but in practice only some - of them provide useful optimization opportunities. + Given some function that has a \emph{domain} $D$ (\eg, the set of + all possible arguments that the function could be applied to), we + create a specialized function with exactly the same behaviour, but + with a domain $D' \subset D$. This subset can be chosen in all + sorts of ways. Any subset is valid for the general definition of + specialization, but in practice only some of them provide useful + optimization opportunities. Common subsets include limiting a polymorphic argument to a single type (\ie, removing polymorphism) or limiting an argument to just a single @@ -610,8 +615,8 @@ quadruple n = mul (mul n) many wires will you lay down for a value that could have any type? When type classes are involved, what hardware components will you lay down for a class method (whose behaviour depends on the type of its arguments)? 
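As a small illustration of the problem (and of the specialization that, as described next, resolves it), consider the hypothetical \hs{first} function below. The specialized \hs{first_word_bool} variant is not taken from the Cλash prototype; it is only a sketch of what a per-call-site specialization could look like, with \hs{Bool} standing in for Cλash's \hs{Bit} so that the fragment is plain, compilable Haskell.

\starthaskell
import Data.Word (Word)

-- A polymorphic function: no fixed number of wires can be assigned to the
-- type variables a and b, so this cannot be translated as-is.
first :: (a, b) -> a
first (x, _) = x

-- A specialized variant for one particular call site, where the argument is
-- known to be a (Word, Bool) tuple. Its type is concrete, so the port
-- widths are known and hardware can be generated for it.
first_word_bool :: (Word, Bool) -> Word
first_word_bool (x, _) = x

main :: IO ()
main = print (first_word_bool (42, True))
\stophaskell

The same idea would presumably apply to class methods such as \hs{(+)}: once the argument types at a call site are known, a monomorphic variant can be selected.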
- Note that the language currently does not allow user-defined typeclasses, - but does support partly some of the builtin typeclasses (like \hs{Num}). + Note that Cλash currently does not allow user-defined typeclasses, + but does partly support some of the builtin typeclasses (like \hs{Num}). Fortunately, we can again use the principle of specialization: Since every function application generates a separate piece of hardware, we can know @@ -626,7 +631,7 @@ quadruple n = mul (mul n) specialized to work just for these specific types, removing the polymorphism from the applied functions as well. - Our top level function must not have a polymorphic type (otherwise we + \defref{top level function}Our top level function must not have a polymorphic type (otherwise we wouldn't know the hardware interface to our top level function). By induction, this means that all functions that are (indirectly) called @@ -659,7 +664,7 @@ quadruple n = mul (mul n) \todo{Define pure} This is a perfect match for a combinatoric circuit, where the output - also soley depends on the inputs. However, when state is involved, this + also solely depends on the inputs. However, when state is involved, this no longer holds. Since we're in charge of our own language (or at least let's pretend we aren't using Haskell and we are), we could remove this purity constraint and allow a function to return different values @@ -775,7 +780,7 @@ acc in = out the current state is now an argument. In Haskell, this would look like - \in{example}[ex:ExplicitAcc]\footnote[notfinalsyntax]{Note that this example is not in the final + \in{example}[ex:ExplicitAcc]\footnote[notfinalsyntax]{This example is not in the final Cλash syntax}. \todo{Referencing notfinalsyntax from Introduction.tex doesn't work} @@ -924,16 +929,12 @@ acc in s = (s', out) \section[sec:recursion]{Recursion} An import concept in functional languages is recursion. In it's most basic - form, recursion is a definition that is defined in terms of itself. A + form, recursion is a definition that is described in terms of itself. A recursive function is thus a function that uses itself in its body. This usually requires multiple evaluations of this function, with changing arguments, until eventually an evaluation of the function no longer requires itself. - Recursion in a hardware description is a bit of a funny thing. Usually, - recursion is associated with a lot of nondeterminism, stack overflows, but - also flexibility and expressive power. - Given the notion that each function application will translate to a component instantiation, we are presented with a problem. A recursive function would translate to a component that contains itself. Or, more @@ -946,7 +947,7 @@ acc in s = (s', out) cycle (at best, we could generate hardware that needs an infinite number of cycles to complete). - However, most recursive hardware descriptions will describe finite + However, most recursive definitions will describe finite recursion. This is because the recursive call is done conditionally. There is usually a \hs{case} expression where at least one alternative does not contain the recursive call, which we call the "base case". If, for each call to the @@ -975,12 +976,13 @@ acc in s = (s', out) False -> head xs + sum (tail xs) \stophaskell - However, the Haskell typechecker will now use the following reasoning (element - type of the vector is left out). Below, we write conditions on type - variables before the \hs{=>} operator. 
This is not completely valid Haskell - syntax, but serves to illustrate the typechecker reasoning. Also note that a - vector can never have a negative length, so \hs{Vector n} implicitly means - \hs{(n >= 0) => Vector n}. + However, the Haskell typechecker will now use the following reasoning. + For simplicity, the element type of a vector is left out, all vectors + are assumed to have the same element type. Below, we write conditions + on type variables before the \hs{=>} operator. This is not completely + valid Haskell syntax, but serves to illustrate the typechecker + reasoning. Also note that a vector can never have a negative length, + so \hs{Vector n} implicitly means \hs{(n >= 0) => Vector n}. \todo{This typechecker disregards the type signature} \startitemize @@ -997,28 +999,21 @@ acc in s = (s', out) \stopitemize As you can see, using a simple \hs{case} expression at value level causes - the type checker to always typecheck both alternatives, which can't be done! - This is a fundamental problem, that would seem perfectly suited for a type - class. Considering that we need to switch between to implementations of the - sum function, based on the type of the argument, this sounds like the - perfect problem to solve with a type class. However, this approach has its - own problems (not the least of them that you need to define a new typeclass - for every recursive function you want to define). - - Another approach tried involved using GADTs to be able to do pattern - matching on empty / non empty lists. While this worked partially, it also - created problems with more complex expressions. - - \note{This should reference Christiaan} - - Evaluating all possible (and non-possible) ways to add recursion to our - descriptions, it seems better to leave out list recursion alltogether. This - allows us to focus on other interesting areas instead. By including - (builtin) support for a number of higher order functions like map and fold, - we can still express most of the things we would use list recursion for. - - \todo{Expand on this decision a bit} - + the type checker to always typecheck both alternatives, which can't be + done! The typechecker is unable to distinguish the two case + alternatives (this is partly possible using \small{GADT}s, but that + approach faced other problems \todo{ref christiaan?}). + + This is a fundamental problem, that would seem perfectly suited for a + type class. Considering that we need to switch between to + implementations of the sum function, based on the type of the + argument, this sounds like the perfect problem to solve with a type + class. However, this approach has its own problems (not the least of + them that you need to define a new typeclass for every recursive + function you want to define). + + \todo{This should reference Christiaan} + \subsection{General recursion} Of course there are other forms of recursion, that do not depend on the length (and thus type) of a list. For example, simple recursion using a @@ -1030,7 +1025,12 @@ acc in s = (s', out) might use some obscure notation that results in a corner case of the simplifier that is not caught and thus non-termination. - Due to these complications and limited time available, we leave other forms - of recursion as future work as well. + Evaluating all possible (and non-possible) ways to add recursion to + our descriptions, it seems better to limit the scope of this research + to exclude recursion. This allows for focusing on other interesting + areas instead. 
By including (builtin) support for a number of higher + order functions like \hs{map} and \hs{fold}, we can still express most + of the things we would use (list) recursion for. + % vim: set sw=2 sts=2 expandtab: diff --git a/Chapters/Introduction.tex b/Chapters/Introduction.tex index f88b225..5b11f5d 100644 --- a/Chapters/Introduction.tex +++ b/Chapters/Introduction.tex @@ -1,6 +1,6 @@ \chapter[chap:introduction]{Introduction} This thesis describes the result and process of my work during my -Master's assignment. In these pages, I will try to introduce the world +Master's assignment. In these pages, I will introduce the world of hardware descriptions, the world of functional languages and compilers and introduce the hardware description language Cλash that will connect these worlds and puts a step towards making hardware programming @@ -10,12 +10,12 @@ on the whole easier, more maintainable and generally more pleasant. \subject{Research goals} This research started out with the notion that a functional program is very easy to interpret as a hardware description. A functional program typically - does no assumptions about evaluation order and does not have any side + makes no assumptions about evaluation order and does not have any side effects. This fits hardware nicely, since the evaluation order for hardware is simply everything in parallel. As a motivating example, consider the simple functional program shown in - \in{example}[ex:AndWord]\footnote[notfinalsyntax]{Note that this example is not in the final + \in{example}[ex:AndWord]\footnote[notfinalsyntax]{This example is not in the final Cλash syntax}. This is a very natural way to describe a lot of parallel not ports, that perform a bitwise not on a bitvector. The example also shows an image of the architecture described. @@ -140,7 +140,8 @@ on the whole easier, more maintainable and generally more pleasant. Or... is this the description of a single accumulating adder, that will add - one element of each input each clock cycle and has a reset value of 0? In + one element of each input each clock cycle and has a reset value of + 0\todo{normal 0}? In that case, we would have described the architecture show in \in{example}[ex:RecursiveSumAlt] \startuseMPgraphic{RecursiveSumAlt} @@ -210,7 +211,7 @@ on the whole easier, more maintainable and generally more pleasant. \stopitemize In addition to looking at designing a hardware description language, we - will also implement a prototype to test drive our ideas. This prototype will + will also implement a prototype to test ideas. This prototype will translate hardware descriptions written in the Haskell functional language to simple (netlist-like) hardware descriptions in the \VHDL language. The reasons for choosing these languages are detailed in section diff --git a/Chapters/Normalization.tex b/Chapters/Normalization.tex index 0700b3a..f11788f 100644 --- a/Chapters/Normalization.tex +++ b/Chapters/Normalization.tex @@ -24,19 +24,20 @@ The first step in the core to \small{VHDL} translation process, is normalization. We aim to bring the core description into a simpler form, which we can subsequently translate into \small{VHDL} easily. This normal form is needed because - the full core language is more expressive than \small{VHDL} in some areas and because - core can describe expressions that do not have a direct hardware - interpretation. + the full core language is more expressive than \small{VHDL} in some + areas (higher order expressions, limited polymorphism using type + classes, etc.) 
and because core can describe expressions that do not + have a direct hardware interpretation. \section{Normal form} The transformations described here have a well-defined goal: To bring the - program in a well-defined form that is directly translatable to hardware, - while fully preserving the semantics of the program. We refer to this form as - the \emph{normal form} of the program. The formal definition of this normal - form is quite simple: + program in a well-defined form that is directly translatable to + \VHDL, while fully preserving the semantics of the program. We refer + to this form as the \emph{normal form} of the program. The formal + definition of this normal form is quite simple: - \placedefinition{}{A program is in \emph{normal form} if none of the - transformations from this chapter apply.} + \placedefinition{}{\startboxed A program is in \emph{normal form} if none of the + transformations from this chapter apply.\stopboxed} Of course, this is an \quote{easy} definition of the normal form, since our program will end up in normal form automatically. The more interesting part is @@ -52,11 +53,11 @@ can't generate any signals that can have multiple types. All types must be completely known to generate hardware. - \item Any \emph{higher order} constructions must be removed. We can't + \item All \emph{higher order} constructions must be removed. We can't generate a hardware signal that contains a function, so all values, - arguments and returns values used must be first order. + arguments and return values used must be first order. - \item Any complex \emph{nested scopes} must be removed. In the \small{VHDL} + \item All complex \emph{nested scopes} must be removed. In the \small{VHDL} description, every signal is in a single scope. Also, full expressions are not supported everywhere (in particular port maps can only map signal names and constants, not complete expressions). To make the \small{VHDL} @@ -68,13 +69,13 @@ A very simple example of a program in normal form is given in \in{example}[ex:MulSum]. As you can see, all arguments to the function (which - will become input ports in the final hardware) are at the outer level. + will become input ports in the generated \VHDL) are at the outer level. This means that the body of the inner lambda abstraction is never a function, but always a plain value. As the body of the inner lambda abstraction, we see a single (recursive) let expression, that binds two variables (\lam{mul} and \lam{sum}). These - variables will be signals in the final hardware, bound to the output port + variables will be signals in the generated \VHDL, bound to the output port of the \lam{*} and \lam{+} components. The final line (the \quote{return value} of the function) selects the @@ -101,7 +102,7 @@ newCircle.a(btex $a$ etex) "framed(false)"; newCircle.b(btex $b$ etex) "framed(false)"; newCircle.c(btex $c$ etex) "framed(false)"; - newCircle.sum(btex $res$ etex) "framed(false)"; + newCircle.sum(btex $sum$ etex) "framed(false)"; % Components newCircle.mul(btex * etex); @@ -131,12 +132,13 @@ {\boxedgraphic{MulSum}}{The architecture described by the normal form.} \stopcombination - The previous example described composing an architecture by calling other - functions (operators), resulting in a simple architecture with components and - connections. There is of course also some mechanism for choice in the normal - form. In a normal Core program, the \emph{case} expression can be used in a - few different ways to describe choice. 
In normal form, this is limited to a - very specific form. + \in{Example}[ex:MulSum] showed a function that just applied two + other functions (multiplication and addition), resulting in a simple + architecture with two components and some connections. There is of + course also some mechanism for choice in the normal form. In a + normal Core program, the \emph{case} expression can be used in a few + different ways to describe choice. In normal form, this is limited + to a very specific form. \in{Example}[ex:AddSubAlu] shows an example describing a simple \small{ALU}, which chooses between two operations based on an opcode @@ -145,6 +147,7 @@ expression scrutinizes the variable \lam{opcode} (and scrutinizing more complex expressions is not supported). The case expression can select a different variable based on the constructor of \lam{opcode}. + \refdef{case expression} \startbuffer[AddSubAlu] alu :: Bit -> Word -> Word -> Word @@ -199,15 +202,17 @@ {\boxedgraphic{AddSubAlu}}{The architecture described by the normal form.} \stopcombination - As a more complete example, consider \in{example}[ex:NormalComplete]. This - example contains everything that is supported in normal form, with the - exception of builtin higher order functions. The graphical version of the - architecture contains a slightly simplified version, since the state tuple - packing and unpacking have been left out. Instead, two seperate registers are - drawn. Also note that most synthesis tools will further optimize this - architecture by removing the multiplexers at the register input and - instead put some gates in front of the register's clock input, but we want - to show the architecture as close to the description as possible. + As a more complete example, consider + \in{example}[ex:NormalComplete]. This example shows everything that + is allowed in normal form, except for builtin higher order functions + (like \lam{map}). The graphical version of the architecture contains + a slightly simplified version, since the state tuple packing and + unpacking have been left out. Instead, two separate registers are + drawn. Also note that most synthesis tools will further optimize + this architecture by removing the multiplexers at the register input + and instead put some gates in front of the register's clock input, + but we want to show the architecture as close to the description as + possible. As you can see from the previous examples, the generation of the final architecture from the normal form is straightforward. In each of the @@ -315,58 +320,73 @@ \subsection[sec:normalization:intendednormalform]{Intended normal form definition} Now we have some intuition for the normal form, we can describe how we want the normal form to look like in a slightly more formal manner. The following - EBNF-like description completely captures the intended structure (and - generates a subset of GHC's core format). - - Some clauses have an expression listed in parentheses. These are conditions - that need to apply to the clause. + EBNF-like description captures most of the intended structure (and + generates a subset of GHC's core format). + + There are two things missing: Cast expressions are sometimes + allowed by the prototype, but not specified here and the below + definition allows uses of state that cannot be translated to \VHDL + properly. These two problems are discussed in + \in{section}[sec:normalization:castproblems] and + \in{section}[sec:normalization:stateproblems] respectively. 
+ + Some clauses have an expression listed behind them in parentheses. + These are conditions that need to apply to the clause. The + predicates used there (\lam{lvar()}, \lam{representable()}, + \lam{gvar()}) will be defined in + \in{section}[sec:normalization:predicates]. + + An expression is in normal form if it matches the first + definition, \emph{normal}. - \defref{intended normal form definition} \todo{Fix indentation} - \startlambda + \startbuffer[IntendedNormal] \italic{normal} := \italic{lambda} - \italic{lambda} := λvar.\italic{lambda} (representable(var)) + \italic{lambda} := λvar.\italic{lambda} (representable(var)) | \italic{toplet} - \italic{toplet} := letrec [\italic{binding}...] in var (representable(var)) - \italic{binding} := var = \italic{rhs} (representable(rhs)) + \italic{toplet} := letrec [\italic{binding}...] in var (representable(var)) + \italic{binding} := var = \italic{rhs} (representable(rhs)) -- State packing and unpacking by coercion - | var0 = var1 ▶ State ty (lvar(var1)) - | var0 = var1 ▶ ty (var1 :: State ty ∧ lvar(var1)) - \italic{rhs} := userapp - | builtinapp + | var0 = var1 ▶ State ty (lvar(var1)) + | var0 = var1 ▶ ty (var1 :: State ty ∧ lvar(var1)) + \italic{rhs} := \italic{userapp} + | \italic{builtinapp} -- Extractor case - | case var of C a0 ... an -> ai (lvar(var)) + | case var of C a0 ... an -> ai (lvar(var)) -- Selector case - | case var of (lvar(var)) - [ DEFAULT -> var ] (lvar(var)) + | case var of (lvar(var)) + [ DEFAULT -> var ] (lvar(var)) C0 w0,0 ... w0,n -> var0 \vdots - Cm wm,0 ... wm,n -> varm (\forall{}i \forall{}j, wi,j \neq vari, lvar(vari)) + Cm wm,0 ... wm,n -> varm (\forall{}i \forall{}j, wi,j \neq vari, lvar(vari)) \italic{userapp} := \italic{userfunc} | \italic{userapp} {userarg} - \italic{userfunc} := var (gvar(var)) - \italic{userarg} := var (lvar(var)) + \italic{userfunc} := var (gvar(var)) + \italic{userarg} := var (lvar(var)) \italic{builtinapp} := \italic{builtinfunc} | \italic{builtinapp} \italic{builtinarg} - \italic{builtinfunc} := var (bvar(var)) - \italic{builtinarg} := var (representable(var) ∧ lvar(var)) - | \italic{partapp} (partapp :: a -> b) - | \italic{coreexpr} (¬representable(coreexpr) ∧ ¬(coreexpr :: a -> b)) - \italic{partapp} := \italic{userapp} | \italic{builtinapp} - \stoplambda - - \todo{There can still be other casts around (which the code can handle, - e.g., ignore), which still need to be documented here} + \italic{builtinfunc} := var (bvar(var)) + \italic{builtinarg} := var (representable(var) ∧ lvar(var)) + | \italic{partapp} (partapp :: a -> b) + | \italic{coreexpr} (¬representable(coreexpr) ∧ ¬(coreexpr :: a -> b)) + \italic{partapp} := \italic{userapp} + | \italic{builtinapp} + \stopbuffer - When looking at such a program from a hardware perspective, the top level - lambda's define the input ports. The variable reference in the body of - the recursive let expression is the output port. Most function - applications bound by the let expression define a component - instantiation, where the input and output ports are mapped to local - signals or arguments. Some of the others use a builtin construction (\eg - the \lam{case} expression) or call a builtin function (\eg \lam{+} or - \lam{map}). For these, a hardcoded \small{VHDL} translation is - available. 
+ \placedefinition[][def:IntendedNormal]{Definition of the intended nnormal form using an \small{EBNF}-like syntax.} + {\defref{intended normal form definition} + \typebufferlam{IntendedNormal}} + + When looking at such a program from a hardware perspective, the + top level lambda abstractions define the input ports. Lambda + abstractions cannot appear anywhere else. The variable reference + in the body of the recursive let expression is the output port. + Most function applications bound by the let expression define a + component instantiation, where the input and output ports are + mapped to local signals or arguments. Some of the others use a + builtin construction (\eg the \lam{case} expression) or call a + builtin function (\eg \lam{+} or \lam{map}). For these, a + hardcoded \small{VHDL} translation is available. \section[sec:normalization:transformation]{Transformation notation} To be able to concisely present transformations, we use a specific format @@ -386,7 +406,7 @@ This format desribes a transformation that applies to \lam{} and transforms it into \lam{}, assuming - that all conditions apply. In this format, there are a number of placeholders + that all conditions are satisfied. In this format, there are a number of placeholders in pointy brackets, most of which should be rather obvious in their meaning. Nevertheless, we will more precisely specify their meaning below: @@ -410,7 +430,7 @@ transformation applies, commonly to prevent a transformation from causing a loop with itself or another transformation. - Only if these conditions are \emph{all} true, the transformation + Only if these conditions are \emph{all} satisfied, the transformation applies. \stopdesc @@ -431,7 +451,7 @@ \startdesc{} This is the expression template that is the result of the transformation. If, looking at the above three items, the transformation applies, the \lam{} is completely replaced with the \lam{}. + expression>} is completely replaced by the \lam{}. We call this a template, because it can contain placeholders, referring to any placeholder bound by the \lam{} or the \lam{}. The resulting expression will have those @@ -442,8 +462,9 @@ \stopdesc \startdesc{} - These are templates for new functions to add to the context. This is a way - to have a transformation create new top level functions. + These are templates for new functions to be added to the context. + This is a way to let a transformation create new top level + functions. Each addition has the form \lam{binder = template}. As above, any placeholder in the addition is replaced with the value bound to it, and any @@ -451,7 +472,10 @@ replaced with) a fresh binder. \stopdesc - As an example, we'll look at η-abstraction: + To understand this notation better, the step by step application of + the η-abstraction transformation to a simple \small{ALU} will be + shown. Consider η-abstraction, described using above notation as + follows: \starttrans E \lam{E :: a -> b} @@ -466,12 +490,13 @@ transformation does not apply infinitely (which are not necessarily part of the conventional definition of η-abstraction). - Consider the following function, which is a fairly obvious way to specify a - simple ALU (Note that \in{example}[ex:AddSubAlu] shows the normal form of this - function). The parentheses around the \lam{+} and \lam{-} operators are - commonly used in Haskell to show that the operators are used as normal - functions, instead of \emph{infix} operators (\eg, the operators appear - before their arguments, instead of in between). 
+ Consider the following function, in Core notation, which is a fairly obvious way to specify a + simple \small{ALU} (Note that it is not yet in normal form, but + \in{example}[ex:AddSubAlu] shows the normal form of this function). + The parentheses around the \lam{+} and \lam{-} operators are + commonly used in Haskell to show that the operators are used as + normal functions, instead of \emph{infix} operators (\eg, the + operators appear before their arguments, instead of in between). \startlambda alu :: Bit -> Word -> Word -> Word @@ -541,8 +566,8 @@ \stoplambda Here, the transformation does apply, binding \lam{E} to the entire - expression and \lam{x} to the fresh binder \lam{b}, resulting in the - replacement: + expression (which has type \lam{Word -> Word}) and binding \lam{x} + to the fresh binder \lam{b}, resulting in the replacement: \startlambda λb.(case opcode of @@ -550,7 +575,7 @@ High -> (-)) a b \stoplambda - Again, the transformation does not apply to this lambda abstraction, so we + The transformation does not apply to this lambda abstraction, so we look at its body. For brevity, we'll put the case expression on one line from now on. @@ -636,29 +661,26 @@ keep the transformation descriptions concise and powerful. \subsection{Definitions} - In the following sections, we will be using a number of functions and - notations, which we will define here. - - \subsubsection{Concepts} - A \emph{global variable} is any variable (binder) that is bound at the - top level of a program, or an external module. A \emph{local variable} is any - other variable (\eg, variables local to a function, which can be bound by - lambda abstractions, let expressions and pattern matches of case - alternatives). Note that this is a slightly different notion of global versus - local than what \small{GHC} uses internally. - \defref{global variable} \defref{local variable} - - A \emph{hardware representable} (or just \emph{representable}) type or value - is (a value of) a type that we can generate a signal for in hardware. For - example, a bit, a vector of bits, a 32 bit unsigned word, etc. Values that are - not runtime representable notably include (but are not limited to): Types, - dictionaries, functions. - \defref{representable} - - A \emph{builtin function} is a function supplied by the Cλash framework, whose - implementation is not valid Cλash. The implementation is of course valid - Haskell, for simulation, but it is not expressable in Cλash. - \defref{builtin function} \defref{user-defined function} + A \emph{global variable} is any variable (binder) that is bound at the + top level of a program, or an external module. A \emph{local variable} is any + other variable (\eg, variables local to a function, which can be bound by + lambda abstractions, let expressions and pattern matches of case + alternatives). This is a slightly different notion of global versus + local than what \small{GHC} uses internally, but for our purposes + the distinction \GHC makes is not useful. + \defref{global variable} \defref{local variable} + + A \emph{hardware representable} (or just \emph{representable}) type or value + is (a value of) a type that we can generate a signal for in hardware. For + example, a bit, a vector of bits, a 32 bit unsigned word, etc. Values that are + not runtime representable notably include (but are not limited to): Types, + dictionaries, functions. + \defref{representable} + + A \emph{builtin function} is a function supplied by the Cλash framework, whose + implementation is not valid Cλash. 
The implementation is of course valid + Haskell, for simulation, but it is not expressable in Cλash. + \defref{builtin function} \defref{user-defined function} For these functions, Cλash has a \emph{builtin hardware translation}, so calls to these functions can still be translated. These are functions like @@ -667,18 +689,17 @@ A \emph{user-defined} function is a function for which we do have a Cλash implementation available. - \subsubsection{Predicates} + \subsubsection[sec:normalization:predicates]{Predicates} Here, we define a number of predicates that can be used below to concisely - specify conditions.\refdef{global variable} + specify conditions. \emph{gvar(expr)} is true when \emph{expr} is a variable that references a global variable. It is false when it references a local variable. - \refdef{local variable}\emph{lvar(expr)} is the complement of \emph{gvar}; it is true when \emph{expr} + \emph{lvar(expr)} is the complement of \emph{gvar}; it is true when \emph{expr} references a local variable, false when it references a global variable. - \refdef{representable}\emph{representable(expr)} or \emph{representable(var)} is true when - \emph{expr} or \emph{var} is \emph{representable}. + \emph{representable(expr)} is true when \emph{expr} is \emph{representable}. \subsection[sec:normalization:uniq]{Binder uniqueness} A common problem in transformation systems, is binder uniqueness. When not @@ -709,11 +730,17 @@ \stoplambda This is obviously not what was supposed to happen! The root of this problem is - the reuse of binders: Identical binders can be bound in different scopes, such - that only the inner one is \quote{visible} in the inner expression. In the example - above, the \lam{c} binder was bound outside of the expression and in the inner - lambda expression. Inside that lambda expression, only the inner \lam{c} is - visible. + the reuse of binders: Identical binders can be bound in different, + but overlapping scopes. Any variable reference in those + overlapping scopes then refers to the variable bound in the inner + (smallest) scope. There is not way to refer to the variable in the + outer scope. This effect is usually referred to as + \emph{shadowing}: When a binder is bound in a scope where the + binder already had a value, the inner binding is said to + \emph{shadow} the outer binding. In the example above, the \lam{c} + binder was bound outside of the expression and in the inner lambda + expression. Inside that lambda expression, only the inner \lam{c} + can be accessed. There are a number of ways to solve this. \small{GHC} has isolated this problem to their binder substitution code, which performs \emph{deshadowing} @@ -917,11 +944,15 @@ This transformation removes let bindings that are never used. Occasionally, \GHC's desugarer introduces some unused let bindings. - This normalization pass should really be unneeded to get into intended normal form - (since unused bindings are not forbidden by the normal form), but in practice - the desugarer or simplifier emits some unused bindings that cannot be - normalized (e.g., calls to a \type{PatError}\todo{Check this name}). Also, - this transformation makes the resulting \small{VHDL} a lot shorter. 
+ This normalization pass should really be not be necessary to get + into intended normal form (since the intended normal form + definition \refdef{intended normal form definition} does not + require that every binding is used), but in practice the + desugarer or simplifier emits some bindings that cannot be + normalized (e.g., calls to a + \hs{Control.Exception.Base.patError}) but are not used anywhere + either. To prevent the \VHDL generation from breaking on these + artifacts, this transformation removes them. \todo{Don't use old-style numerals in transformations} \starttrans @@ -948,25 +979,46 @@ \todo{Example} \subsubsection{Cast propagation / simplification} - This transform pushes casts down into the expression as far as possible. - Since its exact role and need is not clear yet, this transformation is - not yet specified. + This transform pushes casts down into the expression as far as + possible. This transformation has been added to make a few + specific corner cases work, but it is not clear yet if this + transformation handles cast expressions completely or in the + right way. See \in{section}[sec:normalization:castproblems]. - \todo{Cast propagation} + \starttrans + (let binds in E) ▶ T + ------------------------- + let binds in (E ▶ T) + \stoptrans + + \starttrans + (case S of + p0 -> E0 + \vdots + pn -> En + ) ▶ T + ------------------------- + case S of + p0 -> E0 ▶ T + \vdots + pn -> En ▶ T + \stoptrans \subsubsection{Top level binding inlining} + \refdef{top level binding} This transform takes simple top level bindings generated by the \small{GHC} compiler. \small{GHC} sometimes generates very simple \quote{wrapper} bindings, which are bound to just a variable - reference, or a partial application to constants or other variable - references. + reference, or contain just a (partial) function appliation with + the type and dictionary arguments filled in (such as the + \lam{(+)} in the example below). Note that this transformation is completely optional. It is not required to get any function into intended normal form, but it does help making the resulting VHDL output easier to read (since it removes a bunch of components that are really boring). - This transform takes any top level binding generated by the compiler, + This transform takes any top level binding generated by \GHC, whose normalized form contains only a single let binding. \starttrans @@ -1021,9 +1073,8 @@ \starttrans E \lam{E :: a -> b} - -------------- \lam{E} is not the first argument of an application. + -------------- \lam{E} does not occur on a function position in an application λx.E x \lam{E} is not a lambda abstraction. - \lam{x} is a variable that does not occur free in \lam{E}. \stoptrans \startbuffer[from] @@ -1080,12 +1131,12 @@ \starttrans (case x of - p1 -> E1 + p0 -> E0 \vdots pn -> En) M ----------------- case x of - p1 -> E1 M + p0 -> E0 M \vdots pn -> En M \stoptrans @@ -1187,21 +1238,22 @@ This transformation ensures that the return value of a function is always a simple local variable reference. - Currently implemented using lambda simplification, let simplification, and - top simplification. Should change into something like the following, which - works only on the result of a function instead of any subexpression. This is - achieved by the contexts, like \lam{x = E}, though this is strictly not - correct (you could read this as "if there is any function \lam{x} that binds - \lam{E}, any \lam{E} can be transformed, while we only mean the \lam{E} that - is bound by \lam{x}. 
This might need some extra notes or something). - - Note that the return value is not simplified if its not representable. - Otherwise, this would cause a direct loop with the inlining of - unrepresentable bindings. If the return value is not - representable because it has a function type, η-abstraction should - make sure that this transformation will eventually apply. If the value - is not representable for other reasons, the function result itself is - not representable, meaning this function is not translatable anyway. + This transformation only applies to the entire body of a + function instead of any subexpression in a function. This is + achieved by the contexts, like \lam{x = E}, though this is + strictly not correct (you could read this as "if there is any + function \lam{x} that binds \lam{E}, any \lam{E} can be + transformed, while we only mean the \lam{E} that is bound by + \lam{x}). + + Note that the return value is not simplified if its not + representable. Otherwise, this would cause a direct loop with + the inlining of unrepresentable bindings. If the return value is + not representable because it has a function type, η-abstraction + should make sure that this transformation will eventually apply. + If the value is not representable for other reasons, the + function result itself is not representable, meaning this + function is not translatable anyway. \starttrans x = E \lam{E} is representable @@ -1372,7 +1424,7 @@ alts ----------------- \lam{E} is not a local variable reference letrec x = E in - case E of + case x of alts \stoptrans @@ -1392,51 +1444,59 @@ \transexample{letflat}{Case normalisation}{from}{to} - \subsubsection{Case simplification} - This transformation ensures that all case expressions become normal form. This - means they will become one of: + \subsubsection{Case normalization} + This transformation ensures that all case expressions get a form + that is allowed by the intended normal form. This means they + will become one of: \refdef{intended normal form definition} \startitemize - \item An extractor case with a single alternative that picks a single field + \item An extractor case with a single alternative that picks a field from a datatype, \eg \lam{case x of (a, b) -> a}. \item A selector case with multiple alternatives and only wild binders, that makes a choice between expressions based on the constructor of another expression, \eg \lam{case x of Low -> a; High -> b}. \stopitemize + + For an arbitrary case, that has \lam{n} alternatives, with + \lam{m} binders in each alternatives, this will result in \lam{m + * n} extractor case expression to get at each variable, \lam{n} + let bindings for each of the alternatives' value and a single + selector case to select the right value out of these. + + Technically, the defintion of this transformation would require + that the constructor for every alternative has exactly the same + amount (\lam{m}) of arguments, but of course this transformation + also applies when this is not the case. - \defref{wild binder} \starttrans case E of C0 v0,0 ... v0,m -> E0 \vdots Cn vn,0 ... vn,m -> En --------------------------------------------------- \forall i \forall j, 0 ≤ i ≤ n, 0 ≤ i < m (\lam{wi,j} is a wild (unused) binder) - letrec - v0,0 = case E of C0 v0,0 .. v0,m -> v0,0 + letrec The case expression is not an extractor case + v0,0 = case E of C0 x0,0 .. x0,m -> x0,0 The case expression is not a selector case \vdots - v0,m = case E of C0 v0,0 .. v0,m -> v0,m + v0,m = case E of C0 x0,0 .. 
x0,m -> x0,m \vdots - vn,m = case E of Cn vn,0 .. vn,m -> vn,m - x0 = E0 + vn,m = case E of Cn xn,0 .. xn,m -> xn,m + y0 = E0 \vdots - xn = En + yn = En in case E of - C0 w0,0 ... w0,m -> x0 + C0 w0,0 ... w0,m -> y0 \vdots - Cn wn,0 ... wn,m -> xn + Cn wn,0 ... wn,m -> yn \stoptrans - \todo{Check the subscripts of this transformation} + \refdef{wild binder} Note that this transformation applies to case expressions with any - scrutinee. If the scrutinee is a complex expression, this might result - in duplicate hardware. An extra condition to only apply this - transformation when the scrutinee is already simple (effectively - causing this transformation to be only applied after the scrutinee - simplification transformation) might be in order. - - \fxnote{This transformation specified like this is complicated and misses - conditions to prevent looping with itself. Perhaps it should be split here for - discussion?} + scrutinee. If the scrutinee is a complex expression, this might + result in duplication of work (hardware). An extra condition to + only apply this transformation when the scrutinee is already + simple (effectively causing this transformation to be only + applied after the scrutinee simplification transformation) might + be in order. \startbuffer[from] case a of @@ -1445,7 +1505,7 @@ \stopbuffer \startbuffer[to] - letnonrec + letrec x0 = add b 1 x1 = add b 2 in @@ -1481,7 +1541,7 @@ \subsubsection[sec:transformation:caseremoval]{Case removal} This transform removes any case expression with a single alternative and - only wild binders. + only wild binders.\refdef{wild binder} These "useless" case expressions are usually leftovers from case simplification on extractor case (see the previous example). @@ -1912,10 +1972,12 @@ ~ x Y0 ... Yi ... Yn \lam{Yi} is not representable --------------------------------------------- \lam{Yi} is not a local variable reference - x' y0 ... yi-1 f0 ... fm Yi+1 ... Yn \lam{f0 ... fm} are all free local vars of \lam{Yi} + x' Y0 ... Yi-1 f0 ... fm Yi+1 ... Yn \lam{f0 ... fm} are all free local vars of \lam{Yi} ~ \lam{T0 ... Tn} are the types of \lam{Y0 ... Yn} - x' = λ(y0 :: T0) ... λ(yi-1 :: Ty-1). λf0 ... λfm. λ(yi+1 :: Ty+1) ... λ(yn :: Tn). - E y0 ... yi-1 Yi yi+1 ... yn + x' = λ(y0 :: T0) ... λ(yi-1 :: Ty-1). + λf0 ... λfm. + λ(yi+1 :: Ty+1) ... λ(yn :: Tn). + E y0 ... yi-1 Yi yi+1 ... yn \stoptrans This is a bit of a complex transformation. It transforms an @@ -1940,7 +2002,7 @@ from the types of the \emph{actual} arguments (\lam{T0 ... Tn}). This means that any polymorphism in the arguments is removed, even when the corresponding explicit type lambda is not removed - yet.\refdef{type lambda} + yet. \todo{Examples. Perhaps reference the previous sections} @@ -2050,7 +2112,7 @@ transformation system as well. However, it is likely that there are other occurences of this problem. - \subsection{Casts} + \subsection[sec:normalization:castproblems]{Casts} We do not fully understand the use of cast expressions in Core, so there are probably expressions involving cast expressions that cannot be brought into intended normal form by this transformation system. @@ -2208,7 +2270,7 @@ \todo{Define β-reduction and η-reduction?} Note that the normal form of such a system consists of the set of nodes - (expressions) without outgoing edges, since those are the expression to which + (expressions) without outgoing edges, since those are the expressions to which no transformation applies anymore. We call this set of nodes the \emph{normal set}. 
The set of nodes containing expressions in intended normal form \refdef{intended normal form} is called the \emph{intended diff --git a/Chapters/Prototype.tex b/Chapters/Prototype.tex index 48c69b9..6a2683e 100644 --- a/Chapters/Prototype.tex +++ b/Chapters/Prototype.tex @@ -176,7 +176,7 @@ discuss it any further, since it is not required for our prototype. \stopdesc - In this process, there a number of places where we can start our work. + In this process, there are a number of places where we can start our work. Assuming that we don't want to deal with (or modify) parsing, typechecking and other frontend business and that native code isn't really a useful format anymore, we are left with the choice between the full Haskell @@ -184,7 +184,7 @@ The advantage of taking the full \small{AST} is that the exact structure of the source program is preserved. We can see exactly what the hardware - descriiption looks like and which syntax constructs were used. However, + description looks like and which syntax constructs were used. However, the full \small{AST} is a very complicated datastructure. If we are to handle everything it offers, we will quickly get a big compiler. @@ -197,7 +197,7 @@ However, the fact that the core language is so much smaller, means it is a lot easier to analyze and translate it into something else. For the same reason, \small{GHC} runs its simplifications and optimizations on the core - representation as well. + representation as well \cite[jones96]. However, we will use the normal core representation, not the simplified core. Reasons for this are detailed below. \todo{Ref} @@ -271,10 +271,11 @@ binder (the function name) to an expression (the function value, which has a function type). - The Core language itself does not prescribe any program structure, only - expression structure. In the \small{GHC} compiler, the Haskell module - structure is used for the resulting Core code as well. Since this is not - so relevant for understanding the Core language or the Normalization + The Core language itself does not prescribe any program structure + (like modules, declarations, imports, etc.), only expression + structure. In the \small{GHC} compiler, the Haskell module structure + is used for the resulting Core code as well. Since this is not so + relevant for understanding the Core language or the Normalization process, we'll only look at the Core expression language here. Each Core expression consists of one of these possible expressions. @@ -282,16 +283,20 @@ \startdesc{Variable reference} \defref{variable reference} \startlambda - x :: T + bndr :: T \stoplambda This is a reference to a binder. It's written down as the name of the binder that is being referred to along with its type. The - binder name should of course be bound in a containing scope (including - top level scope, so a reference to a top level function is also a - variable reference). Additionally, constructors from algebraic datatypes - also become variable references. + binder name should of course be bound in a containing scope + (including top level scope, so a reference to a top level function + is also a variable reference). Additionally, constructors from + algebraic datatypes also become variable references. - The value of this expression is the value bound to the given binder. + In our examples, binders will commonly consist of a single + characters, but they can have any length. + + The value of this expression is the value bound to the given + binder. 
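+
+      As a small illustration (the \lam{and} function and the
+      \lam{True} constructor are just example names here, not part of
+      any particular program), consider the expression:
+
+      \startlambda
+      λx.and x True
+      \stoplambda
+
+      Here, \lam{x} is a variable reference to the binder bound by
+      the lambda abstraction, \lam{and} is a variable reference to a
+      top level function and \lam{True} is a variable reference to a
+      datatype constructor. All three are written down in exactly the
+      same way.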
Each binder also carries around its type (explicitly shown above),
     but this is usually not shown in the Core expressions. Only when the type is
@@ -308,8 +313,11 @@
     \stoplambda
     This is a literal. Only primitive types are supported, like chars,
     strings, ints and doubles. The types of these literals are the
-    \quote{primitive} versions, like \lam{Char\#} and \lam{Word\#}, not the
-    normal Haskell versions (but there are builtin conversion functions).
+    \quote{primitive}, unboxed versions, like \lam{Char\#} and
+    \lam{Word\#}, not the normal Haskell versions. Without going into
+    detail about these types, note that a few builtin conversion
+    functions exist to convert these to the normal (boxed) Haskell
+    equivalents.
   \stopdesc
 
   \startdesc{Application}
@@ -339,10 +347,10 @@
     variable, which can be used in types later on. See
     \in{section}[sec:prototype:coretypes] for details.
 
-    Note that the body of a lambda abstraction extends all the way to the
-    end of the expression, or the closing bracket surrounding the lambda. In
-    other words, the lambda abstraction \quote{operator} has the lowest
-    priority of all.
+    The body of a lambda abstraction extends all the way to the end of
+    the expression, or the closing bracket surrounding the lambda. In
+    other words, the lambda abstraction \quote{operator} has the
+    lowest priority of all.
 
     The value of an application is the value of the body part, with the
     binder bound to the value the entire lambda abstraction is applied to.
@@ -354,12 +362,12 @@
     let bndr = value in body
     \stoplambda
     A let expression allows you to bind a binder to some value, while
-    evaluating to some other value (where that binder is in scope). This
+    evaluating to some other value (for which that binder is in scope). This
     allows for sharing of subexpressions (you can use a binder twice) and
-    explicit \quote{naming} of arbitrary expressions. Note that the binder
-    is not in scope in the value bound to it, so it's not possible to make
-    recursive definitions with the normal form of the let expression (see
-    the recursive form below).
+    explicit \quote{naming} of arbitrary expressions. A binder is not
+    in scope in the value it is bound to, so it's not possible to
+    make recursive definitions with a non-recursive let expression
+    (see the recursive form below).
 
     Even though this let expression is an extension on the basic lambda
     calculus, it is easily translated to a lambda abstraction. The let
@@ -394,12 +402,32 @@
     binders is in scope in each of the values, in addition to the body. This
     allows for self-recursive or mutually recursive definitions.
 
-    It should also be possible to express a recursive let using normal
-    lambda calculus, if we use the \emph{least fixed-point operator},
-    \lam{Y}. This falls beyond the scope of this report, since it is not
-    needed for this research.
+    It is also possible to express a recursive let expression using
+    normal lambda calculus, if we use the \emph{least fixed-point
+    operator}, \lam{Y} (but the details are too complicated to help
+    clarify the let expression, so this will not be explored further).
   \stopdesc
 
+  \placeintermezzo{}{
+    \startframedtext[width=8cm,background=box,frame=no]
+    \startalignment[center]
+      {\tfa Weak head normal form (\small{WHNF})}
+    \stopalignment
+    \blank[medium]
+    An expression is in weak head normal form if it is either a
+    constructor application or a lambda abstraction.
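+
+    As a small illustration (using a hypothetical list constructor
+    \lam{Cons} and a function \lam{add}), the expression
+
+    \startlambda
+    Cons (add 1 2) xs
+    \stoplambda
+
+    is in \small{WHNF}: its outermost part is the constructor
+    \lam{Cons}, even though the arguments of that constructor have
+    not been evaluated yet. The application \lam{add 1 2} by itself
+    is not in \small{WHNF}, since its outermost part is a function
+    application.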
\todo{How about + atoms?} + + Without going into detail about the differences with head + normal form and normal form, note that evaluating the scrutinee + of a case expression to normal form (evaluating any function + applications, variable references and case expressions) is + sufficient to decide which case alternatives should be chosen. + \todo{ref?} + \stopframedtext + + } + \startdesc{Case expression} \defref{case expression} \startlambda @@ -410,30 +438,55 @@ Cn bndrn,0 ... bndrn,m -> bodyn \stoplambda - \todo{Define WHNF} - A case expression is the only way in Core to choose between values. All \hs{if} expressions and pattern matchings from the original Haskell PRogram have been translated to case expressions by the desugarer. A case expression evaluates its scrutinee, which should have an algebraic datatype, into weak head normal form (\small{WHNF}) and - (optionally) binds it to \lam{bndr}. It then chooses a body depending on - the constructor of its scrutinee. If none of the constructors match, the - \lam{DEFAULT} alternative is chosen. A case expression must always be - exhaustive, \ie it must cover all possible constructors that the - scrutinee can have (if all of them are covered explicitly, the - \lam{DEFAULT} alternative can be left out). + (optionally) binds it to \lam{bndr}. Every alternative lists a + single constructor (\lam{C0 ... Cn}). Based on the actual + constructor of the scrutinee, the corresponding alternative is + chosen. The binders in the chosen alternative (\lam{bndr0,0 .... + bndr0,m} are bound to the actual arguments to the constructor in + the scrutinee. + + This is best illustrated with an example. Assume + there is an algebraic datatype declared as follows\footnote{This + datatype is not suported by the current Cλash implementation, but + serves well to illustrate the case expression}: + + \starthaskell + data D = A Word | B Bit + \stophaskell + + This is an algebraic datatype with two constructors, each getting + a single argument. A case expression scrutinizing this datatype + could look like the following: + + \startlambda + case s of + A word -> High + B bit -> bit + \stoplambda + + What this expression does is check the constructor of the + scrutinee \lam{s}. If it is \lam{A}, it always evaluates to + \lam{High}. If the constructor is \lam{B}, the binder \lam{bit} is + bound to the argument passed to \lam{B} and the case expression + evaluates to this bit. + + If none of the alternatives match, the \lam{DEFAULT} alternative + is chosen. A case expression must always be exhaustive, \ie it + must cover all possible constructors that the scrutinee can have + (if all of them are covered explicitly, the \lam{DEFAULT} + alternative can be left out). Since we can only match the top level constructor, there can be no overlap in the alternatives and thus order of alternatives is not relevant (though the \lam{DEFAULT} alternative must appear first for implementation efficiency). - Any arguments to the constructor in the scrutinee are bound to each of the - binders after the constructor and are in scope only in the corresponding - body. - To support strictness, the scrutinee is always evaluated into \small{WHNF}, even when there is only a \lam{DEFAULT} alternative. This allows aplication of the strict function \lam{f} to the argument \lam{a} @@ -517,21 +570,25 @@ \startdesc{Type} \defref{type expression} \startlambda - @type + @T \stoplambda - It is possibly to use a Core type as a Core expression. 
For the actual
-    types supported by Core, see \in{section}[sec:prototype:coretypes]. This
-    \quote{lifting} of a type into the value domain is done to allow for
-    type abstractions and applications to be handled as normal lambda
-    abstractions and applications above. This means that a type expression
-    in Core can only ever occur in the argument position of an application,
-    and only if the type of the function that is applied to expects a type
-    as the first argument. This happens for all polymorphic functions, for
-    example, the \lam{fst} function:
+    It is possible to use a Core type as a Core expression. To prevent
+    confusion between types and values, the \lam{@} sign is used to
+    explicitly mark a type that is used in a Core expression.
+
+    For the actual types supported by Core, see
+    \in{section}[sec:prototype:coretypes]. This \quote{lifting} of a
+    type into the value domain is done to allow for type abstractions
+    and applications to be handled as normal lambda abstractions and
+    applications above. This means that a type expression in Core can
+    only ever occur in the argument position of an application, and
+    only if the type of the function being applied expects a type as
+    the first argument. This happens for all polymorphic functions,
+    for example, the \lam{fst} function:
 
     \startlambda
-      fst :: \forall a. \forall b. (a, b) -> a
-      fst = λtup.case tup of (,) a b -> a
+      fst :: \forall t1. \forall t2. (t1, t2) -> t1
+      fst = λt1.λt2.λ(tup :: (t1, t2)). case tup of (,) a b -> a
 
       fstint :: (Int, Int) -> Int
       fstint = λa.λb.fst @Int @Int a b
     \stoplambda
 
     The type of \lam{fst} has two universally quantified type variables.
     When \lam{fst} is applied in \lam{fstint}, it is first applied to two types.
-    (which are substitued for \lam{a} and \lam{b} in the type of \lam{fst}, so
-    the type of \lam{fst} actual type of arguments and result can be found:
+    (which are substituted for \lam{t1} and \lam{t2} in the type of \lam{fst}, so
+    the actual type of arguments and result of \lam{fst} can be found:
     \lam{fst @Int @Int :: (Int, Int) -> Int}).
     \stopdesc
 
@@ -558,7 +615,8 @@
 
   In Core, every expression is typed. The translation to Core happens
   after the typechecker, so types in Core are always correct as well
-  (though you could of course construct invalidly typed expressions).
+  (though you could of course construct invalidly typed expressions
+  through the \GHC API).
 
   Any type in core is one of the following:
 
@@ -591,23 +649,23 @@
       Maybe Int
     \stoplambda
 
-    This applies a some type to another type. This is particularly used to
+    This applies some type to another type. This is particularly used to
     apply type variables (type constructors) to their arguments. As
     mentioned above, applications of some type constructors have special
     notation. In particular, these are applications of the \emph{function
     type constructor} and \emph{tuple type constructors}:
 
     \startlambda
-      foo :: a -> b
-      foo' :: -> a b
-      bar :: (a, b, c)
-      bar' :: (,,) a b c
+      foo :: t1 -> t2
+      foo' :: -> t1 t2
+      bar :: (t1, t2, t3)
+      bar' :: (,,) t1 t2 t3
     \stoplambda
   \stopdesc
 
   \startdesc{The forall type}
     \startlambda
-      id :: \forall a. a -> a
+      id :: \forall t. t -> t
     \stoplambda
 
     The forall type introduces polymorphism. It is the only way to
     introduce new type variables, which are completely unconstrained (Any
@@ -618,10 +676,10 @@
     expression.
For example, the Core translation of the id function is: \startlambda - id = λa.λx.x + id = λt.λ(x :: t).x \stoplambda - Here, the type of the binder \lam{x} is \lam{a}, referring to the + Here, the type of the binder \lam{x} is \lam{t}, referring to the binder in the topmost lambda. When using a value with a forall type, the actual type @@ -634,7 +692,7 @@ \stoplambda Here, id is first applied to the type to work with. Note that the type - then changes from \lam{id :: \forall a. a -> a} to \lam{id @Bool :: + then changes from \lam{id :: \forall t. t -> t} to \lam{id @Bool :: Bool -> Bool}. Note that the type variable \lam{a} has been substituted with the actual type. @@ -649,14 +707,14 @@ \startdesc{Predicate type} \startlambda - show :: \forall a. Show s ⇒ s → String + show :: \forall t. Show t ⇒ t → String \stoplambda \todo{Sidenote: type classes?} A predicate type introduces a constraint on a type variable introduced by a forall type (or type lambda). In the example above, the type - variable \lam{a} can only contain types that are an \emph{instance} of + variable \lam{t} can only contain types that are an \emph{instance} of the \emph{type class} \lam{Show}. \refdef{type class} There are other sorts of predicate types, used for the type families @@ -1132,10 +1190,11 @@ \small{VHDL}. In particular, State elements should be removed from tuples (and other datatypes) and arguments with a state type should not generate ports. - \item To make the state actually work, a simple \small{VHDL} proc - should be generated. This proc updates the state at every - clockcycle, by assigning the new state to the current state. This - will be recognized by synthesis tools as a register specification. + \item To make the state actually work, a simple \small{VHDL} + (sequential) process should be generated. This process updates + the state at every clockcycle, by assigning the new state to the + current state. This will be recognized by synthesis tools as a + register specification. \stopitemize When applying these rules to the description in @@ -1145,7 +1204,7 @@ actual flow of values in the final hardware. \startlambda - avg = iλ.--λspacked.-- + avg = iλ.λ--spacked.-- let s = --spacked ▶ (AccState, Word)-- --accs = case s of (accs, _) -> accs-- @@ -1165,7 +1224,7 @@ When we would really leave out the crossed out parts, we get a slightly weird program: There is a variable \lam{s} which has no value, and there is a variable \lam{s'} that is never used. Together, these two will form - the state proc of the function. \lam{s} contains the "current" state, + the state process of the function. \lam{s} contains the "current" state, \lam{s'} is assigned the "next" state. So, at the end of each clock cycle, \lam{s'} should be assigned to \lam{s}. diff --git a/Outline b/Outline index 997276d..b9939ea 100644 --- a/Outline +++ b/Outline @@ -64,4 +64,5 @@ TODO: Reorder future work. 
TODO: Future work: Use Cλash TODO: Abstract TODO: Preface -TOOD: Footnote font has not lambda +TODO: Footnote font has not lambda +TODO: eta-abstraction -> expansion diff --git a/Report.bib b/Report.bib index 549b958..a2e5417 100644 --- a/Report.bib +++ b/Report.bib @@ -45,4 +45,39 @@ address = {New York, NY, USA}, } +@inproceedings{jones96, + author = {Simon L Peyton Jones}, + booktitle = {Programming Languages and Systems — ESOP '96}, + title = {Compiling Haskell by program transformation: A report from the trenches}, + pages = {18--44}, + year = {1996}, + doi = {10.1007/3-540-61055-3_27}, + url = {http://www.springerlink.com/content/yjnv70x758167327}, + isbn = {978-3-540-61055-7}, + publisher={Springer}, + location={Berlin / Heidelberg}, +} + +@article{hughes98, + author = {John Hughes}, + title = {Generalising Monads to Arrows}, + journal = {Science of Computer Programming}, + year = {1998}, + volume = {37}, + pages = {67--111}, +} + +@inproceedings{paterson01, + author = {Paterson, Ross}, + title = {A new notation for arrows}, + booktitle = {ICFP '01: Proceedings of the sixth ACM SIGPLAN international conference on Functional programming}, + year = {2001}, + isbn = {1-58113-415-0}, + pages = {229--240}, + location = {Florence, Italy}, + doi = {http://doi.acm.org/10.1145/507635.507664}, + publisher = {ACM}, + address = {New York, NY, USA}, +} + % vim: set paste: diff --git a/Utils/Lambda.tex b/Utils/Lambda.tex index f88dfb4..f59ef0c 100644 --- a/Utils/Lambda.tex +++ b/Utils/Lambda.tex @@ -112,17 +112,18 @@ draw b; \stopboxed } -% Define a "definition" float. We would like some more backgroundoffset, but -% this offset doesn't move the caption, causing the border to cross the -% caption... Also, centering the content didn't seem to work... -% TODO: Make more pretty. -\definefloat[definition][definitions] -\setupfloat[definition][background=box,backgroundoffset=1mm] - % Define an "example" float. Don't add box around it, since an example will % commonly contain two boxed items (Before / after, code / picture). \definefloat[example][examples] \setupcaption[example][location=top] % Put captions on top +% Define a definition float that shares its numbering and setting with +% examples. +\definefloat[definition][definitions][example] +% Make sure the label really says definition instead of example, this +% seems to be a bug in ConTeXt (\redodefinefloat in strc-flt.mkiv has +% \presetlabeltext[#1=\Word{#3}~]% which should reference #1 instead of +% #3). +\setuplabeltext[en][definition=Definition] % Margin magic taken from % http://www.pragma-ade.com/general/manuals/details.pdf By setting negative