X-Git-Url: https://git.stderr.nl/gitweb?p=matthijs%2Fmaster-project%2Freport.git;a=blobdiff_plain;f=Chapters%2FNormalization.tex;h=d8f2fe9723182bdc4d66236e1b79b48a63454470;hp=f64f7909aab0fbedf59e4d5c5a0c7d2777a5f927;hb=25d228c0ea102570073ccf5dd642615cc2ebe7a1;hpb=21c8a86147c874c29156d1859a670c3f7c6df7ee

diff --git a/Chapters/Normalization.tex b/Chapters/Normalization.tex
index f64f790..d8f2fe9 100644
--- a/Chapters/Normalization.tex
+++ b/Chapters/Normalization.tex
@@ -14,11 +14,6 @@
   }
 }
 
-
-% A transformation example
-\definefloat[example][examples]
-\setupcaption[example][location=top] % Put captions on top
-
 \define[3]\transexample{
   \placeexample[here]{#1}
   \startcombination[2*1]
@@ -26,24 +21,16 @@
     {\example{#3}}{Transformed program}
   \stopcombination
 }
-%
-%\define[3]\transexampleh{
-%%  \placeexample[here]{#1}
-%%  \startcombination[1*2]
-%%    {\example{#2}}{Original program}
-%%    {\example{#3}}{Transformed program}
-%%  \stopcombination
-%}
-
-The first step in the core to VHDL translation process, is normalization. We
+
+The first step in the core to \small{VHDL} translation process, is normalization. We
 aim to bring the core description into a simpler form, which we can
-subsequently translate into VHDL easily. This normal form is needed because
-the full core language is more expressive than VHDL in some areas and because
+subsequently translate into \small{VHDL} easily. This normal form is needed because
+the full core language is more expressive than \small{VHDL} in some areas and because
 core can describe expressions that do not have a direct hardware
 interpretation.
 
-TODO: Describe core properties not supported in VHDL, and describe how the
-VHDL we want to generate should look like.
+TODO: Describe core properties not supported in \small{VHDL}, and describe how the
+\small{VHDL} we want to generate should look like.
 
 \section{Normal form}
 The transformations described here have a well-defined goal: To bring the
@@ -73,10 +60,10 @@ describing the things we want to not have in a normal form.
   generate a hardware signal that contains a function, so all values,
   arguments and returns values used must be first order.
 
-  \item Any complex \emph{nested scopes} must be removed. In the VHDL
+  \item Any complex \emph{nested scopes} must be removed. In the \small{VHDL}
   description, every signal is in a single scope. Also, full expressions are
   not supported everywhere (in particular port maps can only map signal names,
-  not expressions). To make the VHDL generation easy, all values must be bound
+  not expressions). To make the \small{VHDL} generation easy, all values must be bound
   on the \quote{top level}.
 \stopitemize
 
@@ -215,7 +202,13 @@ alu = Î»opcode.Î»a.Î»b.
 
 As a more complete example, consider \in{example}[ex:NormalComplete]. This
 example contains everything that is supported in normal form, with the
-exception of builtin higher order functions.
+exception of builtin higher order functions. The graphical version of the
+architecture contains a slightly simplified version, since the state tuple
+packing and unpacking have been left out. Instead, two seperate registers are
+drawn. Also note that most synthesis tools will further optimize this
+architecture by removing the multiplexers at the register input and replace
+them with some logic in the clock inputs, but we want to show the architecture
+as close to the description as possible.
 
 \startbuffer[NormalComplete]
   regbank :: Bit 
@@ -312,7 +305,7 @@ subtractor.}
     {\boxedgraphic{NormalComplete}}{The architecture described by the normal form.}
   \stopcombination
 
-\subsection{Normal form definition}
+\subsection{Intended normal form definition}
 Now we have some intuition for the normal form, we can describe how we want
 the normal form to look like in a slightly more formal manner. The following
 EBNF-like description completely captures the intended structure (and
@@ -364,826 +357,1310 @@ the output port. Most function applications bound by the let expression
 define a component instantiation, where the input and output ports are mapped
 to local signals or arguments. Some of the others use a builtin
 construction (\eg the \lam{case} statement) or call a builtin function
-(\eg \lam{add} or \lam{sub}). For these, a hardcoded VHDL translation is
+(\eg \lam{add} or \lam{sub}). For these, a hardcoded \small{VHDL} translation is
 available.
 
-\subsection{Definitions}
-In the following sections, we will be using a number of functions and
-notations, which we will define here.
+\section{Transformation notation}
+To be able to concisely present transformations, we use a specific format to
+them. It is a simple format, similar to one used in logic reasoning.
 
-\subsubsection{Transformations}
-The most important notation is the one for transformation, which looks like
-the following:
+Such a transformation description looks like the following.
 
 \starttrans
-context conditions
+<context conditions>
 ~
-from
-------------------------            expression conditions
-to
+<original expression>
+--------------------------          <expression conditions>
+<transformed expresssion>
 ~
-context additions
+<context additions>
 \stoptrans
 
-Here, we describe a transformation. The most import parts are \lam{from} and
-\lam{to}, which describe the Core expresssion that should be matched and the
-expression that it should be replaced with. This matching can occur anywhere
-in function that is being normalized, so it applies to any subexpression as
-well.
-
-The \lam{expression conditions} list a number of conditions on the \lam{from}
-expression that must hold for the transformation to apply.
-
-Furthermore, there is some way to look into the environment (\eg, other top
-level bindings).  The \lam{context conditions} part specifies any number of
-top level bindings that must be present for the transformation to apply.
-Usually, this lists a top level binding that binds an identfier that is also
-used in the \lam{from} expression, allowing us to "access" the value of a top
-level binding in the \lam{to} expression (\eg, for inlining).
-
-Finally, there is a way to influence the environment. The \lam{context
-additions} part lists any number of new top level bindings that should be
-added.
-
-If there are no \lam{context conditions} or \lam{context additions}, they can
-be left out alltogether, along with the separator \lam{~}.
-
-TODO: Example
-
-\subsubsection{Other concepts}
-A \emph{global variable} is any variable that is bound at the
-top level of a program, or an external module. A local variable is any other
-variable (\eg, variables local to a function, which can be bound by lambda
-abstractions, let expressions and case expressions).
-
-A \emph{hardware representable} type is a type that we can generate
-a signal for in hardware. For example, a bit, a vector of bits, a 32 bit
-unsigned word, etc. Types that are not runtime representable notably
-include (but are not limited to): Types, dictionaries, functions.
-
-A \emph{builtin function} is a function for which a builtin
-hardware translation is available, because its actual definition is not
-translatable. A user-defined function is any other function.
-
-\subsubsection{Functions}
-Here, we define a number of functions that can be used below to concisely
-specify conditions.
-
-\emph{gvar(expr)} is true when \emph{expr} is a variable that references a
-global variable. It is false when it references a local variable.
-
-\emph{lvar(expr)} is the inverse of \emph{gvar}; it is true when \emph{expr}
-references a local variable, false when it references a global variable.
-
-\emph{representable(expr)} or \emph{representable(var)} is true when
-\emph{expr} or \emph{var} has a type that is representable at runtime.
-
-\section{Transform passes}
-In this section we describe the actual transforms. Here we're using
-the core language in a notation that resembles lambda calculus.
-
-Each of these transforms is meant to be applied to every (sub)expression
-in a program, for as long as it applies. Only when none of the
-transformations can be applied anymore, the program is in normal form (by
-definition). We hope to be able to prove that this form will obey all of the
-constraints defined above, but this has yet to happen (though it seems likely
-that it will).
-
-Each of the transforms will be described informally first, explaining
-the need for and goal of the transform. Then, a formal definition is
-given, using a familiar syntax from the world of logic. Each transform
-is specified as a number of conditions (above the horizontal line) and a
-number of conclusions (below the horizontal line). The details of using
-this notation are still a bit fuzzy, so comments are welcom.
-
-TODO: Formally describe the "apply to every (sub)expression" in terms of
-rules with full transformations in the conditions.
-
-\subsection{Î·-abstraction}
-This transformation makes sure that all arguments of a function-typed
-expression are named, by introducing lambda expressions. When combined with
-Î²-reduction and function inlining below, all function-typed expressions should
-be lambda abstractions or global identifiers.
+This format desribes a transformation that applies to \lam{original
+expresssion} and transforms it into \lam{transformed expression}, assuming
+that all conditions apply. In this format, there are a number of placeholders
+in pointy brackets, most of which should be rather obvious in their meaning.
+Nevertheless, we will more precisely specify their meaning below:
+
+  \startdesc{<original expression>} The expression pattern that will be matched
+  against (subexpressions of) the expression to be transformed. We call this a
+  pattern, because it can contain \emph{placeholders} (variables), which match
+  any expression or binder. Any such placeholder is said to be \emph{bound} to
+  the expression it matches. It is convention to use an uppercase latter (\eg
+  \lam{M} or \lam{E} to refer to any expression (including a simple variable
+  reference) and lowercase letters (\eg \lam{v} or \lam{b}) to refer to
+  (references to) binders.
+
+  For example, the pattern \lam{a + B} will match the expression 
+  \lam{v + (2 * w)} (and bind \lam{a} to \lam{v} and \lam{B} to 
+  \lam{(2 * 2)}), but not \lam{v + (2 * w)}.
+  \stopdesc
+
+  \startdesc{<expression conditions>}
+  These are extra conditions on the expression that is matched. These
+  conditions can be used to further limit the cases in which the
+  transformation applies, in particular to prevent a transformation from
+  causing a loop with itself or another transformation.
+
+  Only if these if these conditions are \emph{all} true, this transformation
+  applies.
+  \stopdesc
+
+  \startdesc{<context conditions>}
+  These are a number of extra conditions on the context of the function. In
+  particular, these conditions can require some other top level function to be
+  present, whose value matches the pattern given here. The format of each of
+  these conditions is: \lam{binder = <pattern>}.
+
+  Typically, the binder is some placeholder bound in the \lam{<original
+  expression>}, while the pattern contains some placeholders that are used in
+  the \lam{transformed expression}.
+  
+  Only if a top level binder exists that matches each binder and pattern, this
+  transformation applies.
+  \stopdesc
+
+  \startdesc{<transformed expression>}
+  This is the expression template that is the result of the transformation. If, looking
+  at the above three items, the transformation applies, the \lam{original
+  expression} is completely replaced with the \lam{<transformed expression>}.
+  We call this a template, because it can contain placeholders, referring to
+  any placeholder bound by the \lam{<original expression>} or the
+  \lam{<context conditions>}. The resulting expression will have those
+  placeholders replaced by the values bound to them.
+
+  Any binder (lowercase) placeholder that has no value bound to it yet will be
+  bound to (and replaced with) a fresh binder.
+  \stopdesc
+
+  \startdesc{<context additions>}
+  These are templates for new functions to add to the context. This is a way
+  to have a transformation create new top level functiosn.
+
+  Each addition has the form \lam{binder = template}. As above, any
+  placeholder in the addition is replaced with the value bound to it, and any
+  binder placeholder that has no value bound to it yet will be bound to (and
+  replaced with) a fresh binder.
+  \stopdesc
+
+  As an example, we'll look at Î·-abstraction:
 
 \starttrans
-E                 \lam{E :: * -> *}
---------------    \lam{E} is not the first argument of an application.
+E                 \lam{E :: a -> b}
+--------------    \lam{E} does not occur on a function position in an application
 Î»x.E x            \lam{E} is not a lambda abstraction.
-                  \lam{x} is a variable that does not occur free in \lam{E}.
-\stoptrans
-
-\startbuffer[from]
-foo = Î»a.case a of 
-  True -> Î»b.mul b b
-  False -> id
-\stopbuffer
-
-\startbuffer[to]
-foo = Î»a.Î»x.(case a of 
-    True -> Î»b.mul b b
-    False -> Î»y.id y) x
-\stopbuffer
-
-\transexample{Î·-abstraction}{from}{to}
-
-\subsection{Extended Î²-reduction}
-This transformation is meant to propagate application expressions downwards
-into expressions as far as possible. In lambda calculus, this reduction
-is known as Î²-reduction, but it is of course only defined for
-applications of lambda abstractions. We extend this reduction to also
-work for the rest of core (case and let expressions).
-
-For let expressions:
-\starttrans
-let binds in E) M
------------------
-let binds in E M
-\stoptrans
-
-For case statements:
-\starttrans
-(case x of
-  p1 -> E1
-  \vdots
-  pn -> En) M
------------------
-case x of
-  p1 -> E1 M
-  \vdots
-  pn -> En M
-\stoptrans
-
-For lambda expressions:
-\starttrans
-(Î»x.E) M
------------------
-E[M/x]
 \stoptrans
 
-% And an example
-\startbuffer[from]
-( let a = (case x of 
-            True -> id
-            False -> neg
-          ) 1
-      b = (let y = 3 in add y) 2
-  in
-    (Î»z.add 1 z)
-) 3
-\stopbuffer
-
-\startbuffer[to]
-let a = case x of 
-           True -> id 1
-           False -> neg 1
-    b = let y = 3 in add y 2
-in
-  add 1 3
-\stopbuffer
-
-\transexample{Extended Î²-reduction}{from}{to}
-
-\subsection{Let derecursification}
-This transformation is meant to make lets non-recursive whenever possible.
-This might allow other optimizations to do their work better. TODO: Why is
-this needed exactly?
-
-\subsection{Let flattening}
-This transformation puts nested lets in the same scope, by lifting the
-binding(s) of the inner let into a new let around the outer let. Eventually,
-this will cause all let bindings to appear in the same scope (they will all be
-in scope for the function return value).
-
-Note that this transformation does not try to be smart when faced with
-recursive lets, it will just leave the lets recursive (possibly joining a
-recursive and non-recursive let into a single recursive let). The let
-rederursification transformation will do this instead.
-
-\starttrans
-letnonrec x = (let bindings in M) in N
-------------------------------------------
-let bindings in (letnonrec x = M) in N
-\stoptrans
-
-\starttrans
-letrec 
-  \vdots
-  x = (let bindings in M)
-  \vdots
-in
-  N
-------------------------------------------
-letrec
-  \vdots
-  bindings
-  x = M
-  \vdots
-in
-  N
-\stoptrans
-
-\startbuffer[from]
-let
-  a = letrec
-    x = 1
-    y = 2
-  in
-    x + y
-in
-  letrec
-    b = let c = 3 in a + c
-    d = 4
-  in
-    d + b
-\stopbuffer
-\startbuffer[to]
-letrec
-  x = 1
-  y = 2
-in
-  let
-    a = x + y
-  in
-    letrec
-      c = 3
-      b = a + c
-      d = 4
-    in
-      d + b
-\stopbuffer
-
-\transexample{Let flattening}{from}{to}
-
-\subsection{Empty let removal}
-This transformation is simple: It removes recursive lets that have no bindings
-(which usually occurs when let derecursification removes the last binding from
-it).
-
-\starttrans
-letrec in M
---------------
-M
-\stoptrans
-
-\subsection{Simple let binding removal}
-This transformation inlines simple let bindings (\eg a = b).
-
-This transformation is not needed to get into normal form, but makes the
-resulting VHDL a lot shorter.
-
-\starttrans
-letnonrec
-  a = b
-in
-  M
------------------
-M[b/a]
-\stoptrans
-
-\starttrans
-letrec
-  \vdots
-  a = b
-  \vdots
-in
-  M
------------------
-let
-  \vdots [b/a]
-  \vdots [b/a]
-in
-  M[b/a]
-\stoptrans
-
-\subsection{Unused let binding removal}
-This transformation removes let bindings that are never used. Usually,
-the desugarer introduces some unused let bindings.
-
-This normalization pass should really be unneeded to get into normal form
-(since ununsed bindings are not forbidden by the normal form), but in practice
-the desugarer or simplifier emits some unused bindings that cannot be
-normalized (e.g., calls to a \type{PatError} (TODO: Check this name)). Also,
-this transformation makes the resulting VHDL a lot shorter.
-
-\starttrans
-let a = E in M
-----------------------------    \lam{a} does not occur free in \lam{M}
-M
-\stoptrans
-
-\starttrans
-letrec
-  \vdots
-  a = E
-  \vdots
-in
-  M
-----------------------------    \lam{a} does not occur free in \lam{M}
-letrec
-  \vdots
-  \vdots
-in
-  M
-\stoptrans
-
-\subsection{Non-representable binding inlining}
-This transform inlines let bindings that have a non-representable type. Since
-we can never generate a signal assignment for these bindings (we cannot
-declare a signal assignment with a non-representable type, for obvious
-reasons), we have no choice but to inline the binding to remove it.
-
-If the binding is non-representable because it is a lambda abstraction, it is
-likely that it will inlined into an application and Î²-reduction will remove
-the lambda abstraction and turn it into a representable expression at the
-inline site. The same holds for partial applications, which can be turned into
-full applications by inlining.
-
-Other cases of non-representable bindings we see in practice are primitive
-Haskell types. In most cases, these will not result in a valid normalized
-output, but then the input would have been invalid to start with. There is one
-exception to this: When a builtin function is applied to a non-representable
-expression, things might work out in some cases. For example, when you write a
-literal \hs{SizedInt} in Haskell, like \hs{1 :: SizedInt D8}, this results in
-the following core: \lam{fromInteger (smallInteger 10)}, where for example
-\lam{10 :: GHC.Prim.Int\#} and \lam{smallInteger 10 :: Integer} have
-non-representable types. TODO: This/these paragraph(s) should probably become a
-separate discussion somewhere else.
-
-\starttrans
-letnonrec a = E in M
---------------------------    \lam{E} has a non-representable type.
-M[E/a]
-\stoptrans
-
-\starttrans
-letrec 
-  \vdots
-  a = E
-  \vdots
-in
-  M
---------------------------    \lam{E} has a non-representable type.
-letrec
-  \vdots [E/a]
-  \vdots [E/a]
-in
-  M[E/a]
-\stoptrans
-
-\startbuffer[from]
-letrec
-  a = smallInteger 10
-  inc = Î»a -> add a 1
-  inc' = add 1
-  x = fromInteger a 
-in
-  inc (inc' x)
-\stopbuffer
+  Consider the following function, which is a fairly obvious way to specify a
+  simple ALU (Note \at{example}[ex:AddSubAlu] is the normal form of this
+  function):
 
-\startbuffer[to]
-letrec
-  x = fromInteger (smallInteger 10)
-in
-  (Î»a -> add a 1) (add 1 x)
-\stopbuffer
-
-\transexample{Let flattening}{from}{to}
-
-\subsection{Compiler generated top level binding inlining}
-TODO
-
-\subsection{Scrutinee simplification}
-This transform ensures that the scrutinee of a case expression is always
-a simple variable reference.
-
-\starttrans
-case E of
-  alts
------------------        \lam{E} is not a local variable reference
-let x = E in 
-  case E of
-    alts
-\stoptrans
-
-\startbuffer[from]
-case (foo a) of
-  True -> a
-  False -> b
-\stopbuffer
-
-\startbuffer[to]
-let x = foo a in
-  case x of
-    True -> a
-    False -> b
-\stopbuffer
-
-\transexample{Let flattening}{from}{to}
-
-
-\subsection{Case simplification}
-This transformation ensures that all case expressions become normal form. This
-means they will become one of:
-\startitemize
-\item An extractor case with a single alternative that picks a single field
-from a datatype, \eg \lam{case x of (a, b) -> a}.
-\item A selector case with multiple alternatives and only wild binders, that
-makes a choice between expressions based on the constructor of another
-expression, \eg \lam{case x of Low -> a; High -> b}.
-\stopitemize
-
-\starttrans
-case E of
-  C0 v0,0 ... v0,m -> E0
-  \vdots
-  Cn vn,0 ... vn,m -> En
---------------------------------------------------- \forall i \forall j, 0 <= i <= n, 0 <= i < m (\lam{wi,j} is a wild (unused) binder)
-letnonrec
-  v0,0 = case x of C0 v0,0 .. v0,m -> v0,0
-  \vdots
-  v0,m = case x of C0 v0,0 .. v0,m -> v0,m
-  x0 = E0
-  \dots
-  vn,m = case x of Cn vn,0 .. vn,m -> vn,m
-  xn = En
-in
-  case E of
-    C0 w0,0 ... w0,m -> x0
-    \vdots
-    Cn wn,0 ... wn,m -> xn
-\stoptrans
-
-TODO: This transformation specified like this is complicated and misses
-conditions to prevent looping with itself. Perhaps we should split it here for
-discussion?
-
-\startbuffer[from]
-case a of
-  True -> add b 1
-  False -> add b 2
-\stopbuffer
-
-\startbuffer[to]
-letnonrec
-  x0 = add b 1
-  x1 = add b 2
-in
-  case a of
-    True -> x0
-    False -> x1
-\stopbuffer
-
-\transexample{Selector case simplification}{from}{to}
-
-\startbuffer[from]
-case a of
-  (,) b c -> add b c
-\stopbuffer
-\startbuffer[to]
-letnonrec
-  b = case a of (,) b c -> b
-  c = case a of (,) b c -> c
-  x0 = add b c
-in
-  case a of
-    (,) w0 w1 -> x0
-\stopbuffer
-
-\transexample{Extractor case simplification}{from}{to}
+\startlambda 
+alu :: Bit -> Word -> Word -> Word
+alu = Î»opcode. case opcode of
+  Low -> (+)
+  High -> (-)
+\stoplambda
 
-\subsection{Case removal}
-This transform removes any case statements with a single alternative and
-only wild binders.
+  There are a few subexpressions in this function to which we could possibly
+  apply the transformation. Since the pattern of the transformation is only
+  the placeholder \lam{E}, any expression will match that. Whether the
+  transformation applies to an expression is thus solely decided by the
+  conditions to the right of the transformation.
 
-These "useless" case statements are usually leftovers from case simplification
-on extractor case (see the previous example).
+  We will look at each expression in the function in a top down manner. The
+  first expression is the entire expression the function is bound to.
 
-\starttrans
-case x of
-  C v0 ... vm -> E
-----------------------     \lam{\forall i, 0 <= i <= m} (\lam{vi} does not occur free in E)
-E
-\stoptrans
+\startlambda
+Î»opcode. case opcode of
+  Low -> (+)
+  High -> (-)
+\stoplambda
 
-\startbuffer[from]
-case a of
-  (,) w0 w1 -> x0
-\stopbuffer
+  As said, the expression pattern matches this. The type of this expression is
+  \lam{Bit -> Word -> Word -> Word}, which matches \lam{a -> b} (Note that in
+  this case \lam{a = Bit} and \lam{b = Word -> Word -> Word}).
 
-\startbuffer[to]
-x0
-\stopbuffer
+  Since this expression is at top level, it does not occur at a function
+  position of an application. However, The expression is a lambda abstraction,
+  so this transformation does not apply.
 
-\transexample{Case removal}{from}{to}
+  The next expression we could apply this transformation to, is the body of
+  the lambda abstraction:
 
-\subsection{Argument simplification}
-The transforms in this section deal with simplifying application
-arguments into normal form. The goal here is to:
+\startlambda
+case opcode of
+  Low -> (+)
+  High -> (-)
+\stoplambda
 
-\startitemize
- \item Make all arguments of user-defined functions (\eg, of which
- we have a function body) simple variable references of a runtime
- representable type. This is needed, since these applications will be turned
- into component instantiations.
- \item Make all arguments of builtin functions one of:
-   \startitemize
-    \item A type argument.
-    \item A dictionary argument.
-    \item A type level expression.
-    \item A variable reference of a runtime representable type.
-    \item A variable reference or partial application of a function type.
-   \stopitemize
-\stopitemize
+  The type of this expression is \lam{Word -> Word -> Word}, which again
+  matches \lam{a -> b}. The expression is the body of a lambda expression, so
+  it does not occur at a function position of an application. Finally, the
+  expression is not a lambda abstraction but a case expression, so all the
+  conditions match. There are no context conditions to match, so the
+  transformation applies.
 
-When looking at the arguments of a user-defined function, we can
-divide them into two categories:
-\startitemize
-  \item Arguments of a runtime representable type (\eg bits or vectors).
-
-        These arguments can be preserved in the program, since they can
-        be translated to input ports later on.  However, since we can
-        only connect signals to input ports, these arguments must be
-        reduced to simple variables (for which signals will be
-        produced). This is taken care of by the argument extraction
-        transform.
-  \item Non-runtime representable typed arguments.
-        
-        These arguments cannot be preserved in the program, since we
-        cannot represent them as input or output ports in the resulting
-        VHDL. To remove them, we create a specialized version of the
-        called function with these arguments filled in. This is done by
-        the argument propagation transform.
-
-        Typically, these arguments are type and dictionary arguments that are
-        used to make functions polymorphic. By propagating these arguments, we
-        are essentially doing the same which GHC does when it specializes
-        functions: Creating multiple variants of the same function, one for
-        each type for which it is used. Other common non-representable
-        arguments are functions, e.g. when calling a higher order function
-        with another function or a lambda abstraction as an argument.
-
-        The reason for doing this is similar to the reasoning provided for
-        the inlining of non-representable let bindings above. In fact, this
-        argument propagation could be viewed as a form of cross-function
-        inlining.
-\stopitemize
+  By now, the placeholder \lam{E} is bound to the entire expression. The
+  placeholder \lam{x}, which occurs in the replacement template, is not bound
+  yet, so we need to generate a fresh binder for that. Let's use the binder
+  \lam{a}. This results in the following replacement expression:
 
-TODO: Check the following itemization.
+\startlambda
+Î»a.(case opcode of
+  Low -> (+)
+  High -> (-)) a
+\stoplambda
 
-When looking at the arguments of a builtin function, we can divide them
-into categories: 
+  Continuing with this expression, we see that the transformation does not
+  apply again (it is a lambda expression). Next we look at the body of this
+  labmda abstraction:
 
-\startitemize
-  \item Arguments of a runtime representable type.
-        
-        As we have seen with user-defined functions, these arguments can
-        always be reduced to a simple variable reference, by the
-        argument extraction transform. Performing this transform for
-        builtin functions as well, means that the translation of builtin
-        functions can be limited to signal references, instead of
-        needing to support all possible expressions.
-
-  \item Arguments of a function type.
-        
-        These arguments are functions passed to higher order builtins,
-        like \lam{map} and \lam{foldl}. Since implementing these
-        functions for arbitrary function-typed expressions (\eg, lambda
-        expressions) is rather comlex, we reduce these arguments to
-        (partial applications of) global functions.
-        
-        We can still support arbitrary expressions from the user code,
-        by creating a new global function containing that expression.
-        This way, we can simply replace the argument with a reference to
-        that new function. However, since the expression can contain any
-        number of free variables we also have to include partial
-        applications in our normal form.
-
-        This category of arguments is handled by the function extraction
-        transform.
-  \item Other unrepresentable arguments.
-        
-        These arguments can take a few different forms:
-        \startdesc{Type arguments}
-          In the core language, type arguments can only take a single
-          form: A type wrapped in the Type constructor. Also, there is
-          nothing that can be done with type expressions, except for
-          applying functions to them, so we can simply leave type
-          arguments as they are.
-        \stopdesc
-        \startdesc{Dictionary arguments}
-          In the core language, dictionary arguments are used to find
-          operations operating on one of the type arguments (mostly for
-          finding class methods). Since we will not actually evaluatie
-          the function body for builtin functions and can generate
-          code for builtin functions by just looking at the type
-          arguments, these arguments can be ignored and left as they
-          are.
-        \stopdesc
-        \startdesc{Type level arguments}
-          Sometimes, we want to pass a value to a builtin function, but
-          we need to know the value at compile time. Additionally, the
-          value has an impact on the type of the function. This is
-          encoded using type-level values, where the actual value of the
-          argument is not important, but the type encodes some integer,
-          for example. Since the value is not important, the actual form
-          of the expression does not matter either and we can leave
-          these arguments as they are.
-        \stopdesc
-        \startdesc{Other arguments}
-          Technically, there is still a wide array of arguments that can
-          be passed, but does not fall into any of the above categories.
-          However, none of the supported builtin functions requires such
-          an argument. This leaves use with passing unsupported types to
-          a function, such as calling \lam{head} on a list of functions.
-
-          In these cases, it would be impossible to generate hardware
-          for such a function call anyway, so we can ignore these
-          arguments.
-
-          The only way to generate hardware for builtin functions with
-          arguments like these, is to expand the function call into an
-          equivalent core expression (\eg, expand map into a series of
-          function applications). But for now, we choose to simply not
-          support expressions like these.
-        \stopdesc
-
-        From the above, we can conclude that we can simply ignore these
-        other unrepresentable arguments and focus on the first two
-        categories instead.
-\stopitemize
+\startlambda
+(case opcode of
+  Low -> (+)
+  High -> (-)) a
+\stoplambda
+  
+  Here, the transformation does apply, binding \lam{E} to the entire
+  expression and \lam{x} to the fresh binder \lam{b}, resulting in the
+  replacement:
 
-\subsubsection{Argument simplification}
-This transform deals with arguments to functions that
-are of a runtime representable type. It ensures that they will all become
-references to global variables, or local signals in the resulting VHDL. 
+\startlambda
+Î»b.(case opcode of
+  Low -> (+)
+  High -> (-)) a b
+\stoplambda
 
-TODO: It seems we can map an expression to a port, not only a signal.
-Perhaps this makes this transformation not needed?
-TODO: Say something about dataconstructors (without arguments, like True
-or False), which are variable references of a runtime representable
-type, but do not result in a signal.
+  Again, the transformation does not apply to this lambda abstraction, so we
+  look at its body. For brevity, we'll put the case statement on one line from
+  now on.
 
-To reduce a complex expression to a simple variable reference, we create
-a new let expression around the application, which binds the complex
-expression to a new variable. The original function is then applied to
-this variable.
+\startlambda
+(case opcode of Low -> (+); High -> (-)) a b
+\stoplambda
 
-\starttrans
-M N
---------------------    \lam{N} is of a representable type
-let x = N in M x        \lam{N} is not a local variable reference
-\stoptrans
+  The type of this expression is \lam{Word}, so it does not match \lam{a -> b}
+  and the transformation does not apply. Next, we have two options for the
+  next expression to look at: The function position and argument position of
+  the application. The expression in the argument position is \lam{b}, which
+  has type \lam{Word}, so the transformation does not apply. The expression in
+  the function position is:
 
-\startbuffer[from]
-add (add a 1) 1
-\stopbuffer
+\startlambda
+(case opcode of Low -> (+); High -> (-)) a
+\stoplambda
 
-\startbuffer[to]
-let x = add a 1 in add x 1
-\stopbuffer
+  Obviously, the transformation does not apply here, since it occurs in
+  function position. In the same way the transformation does not apply to both
+  components of this expression (\lam{case opcode of Low -> (+); High -> (-)}
+  and \lam{a}), so we'll skip to the components of the case expression: The
+  scrutinee and both alternatives. Since the opcode is not a function, it does
+  not apply here, and we'll leave both alternatives as an exercise to the
+  reader. The final function, after all these transformations becomes:
 
-\transexample{Argument extraction}{from}{to}
+\startlambda 
+alu :: Bit -> Word -> Word -> Word
+alu = Î»opcode.Î»a.b. (case opcode of
+  Low -> Î»a1.Î»b1 (+) a1 b1
+  High -> Î»a2.Î»b2 (-) a2 b2) a b
+\stoplambda
 
-\subsubsection{Function extraction}
-This transform deals with function-typed arguments to builtin functions.
-Since these arguments cannot be propagated, we choose to extract them
-into a new global function instead.
+  In this case, the transformation does not apply anymore, though this might
+  not always be the case (e.g., the application of a transformation on a
+  subexpression might open up possibilities to apply the transformation
+  further up in the expression).
 
-Any free variables occuring in the extracted arguments will become
-parameters to the new global function. The original argument is replaced
-with a reference to the new function, applied to any free variables from
-the original argument.
+\subsection{Transformation application}
+In this chapter we define a number of transformations, but how will we apply
+these? As stated before, our normal form is reached as soon as no
+transformation applies anymore. This means our application strategy is to
+simply apply any transformation that applies, and continuing to do that with
+the result of each transformation.
 
-This transformation is useful when applying higher order builtin functions
-like \hs{map} to a lambda abstraction, for example. In this case, the code
-that generates VHDL for \hs{map} only needs to handle top level functions and
-partial applications, not any other expression (such as lambda abstractions or
-even more complicated expressions).
+In particular, we define no particular order of transformations. Since
+transformation order should not influence the resulting normal form (see TODO
+ref), this leaves the implementation free to choose any application order that
+results in an efficient implementation.
 
-\starttrans
-M N                     \lam{M} is a (partial aplication of a) builtin function.
----------------------   \lam{f0 ... fn} = free local variables of \lam{N}
-M x f0 ... fn           \lam{N :: a -> b}
-~                       \lam{N} is not a (partial application of) a top level function
-x = Î»f0 ... Î»fn.N
-\stoptrans
+When applying a single transformation, we try to apply it to every (sub)expression
+in a function, not just the top level function. This allows us to keep the
+transformation descriptions concise and powerful.
 
-\startbuffer[from]
-map (Î»a . add a b) xs
+\subsection{Definitions}
+In the following sections, we will be using a number of functions and
+notations, which we will define here.
 
-map (add b) ys
-\stopbuffer
+\subsubsection{Other concepts}
+A \emph{global variable} is any variable that is bound at the
+top level of a program, or an external module. A \emph{local variable} is any
+other variable (\eg, variables local to a function, which can be bound by
+lambda abstractions, let expressions and pattern matches of case
+alternatives).  Note that this is a slightly different notion of global versus
+local than what \small{GHC} uses internally.
+\defref{global variable} \defref{local variable}
+
+A \emph{hardware representable} (or just \emph{representable}) type or value
+is (a value of) a type that we can generate a signal for in hardware. For
+example, a bit, a vector of bits, a 32 bit unsigned word, etc. Types that are
+not runtime representable notably include (but are not limited to): Types,
+dictionaries, functions.
+\defref{representable}
+
+A \emph{builtin function} is a function supplied by the CÎ»ash framework, whose
+implementation is not valid CÎ»ash. The implementation is of course valid
+Haskell, for simulation, but it is not expressable in CÎ»ash.
+\defref{builtin function} \defref{user-defined function}
+
+For these functions, CÎ»ash has a \emph{builtin hardware translation}, so calls
+to these functions can still be translated. These are functions like
+\lam{map}, \lam{hwor} and \lam{length}.
+
+A \emph{user-defined} function is a function for which we do have a CÎ»ash
+implementation available.
 
-\startbuffer[to]
-x0 = Î»b.Î»a.add a b
-~
-map x0 xs
+\subsubsection{Functions}
+Here, we define a number of functions that can be used below to concisely
+specify conditions.
 
-x1 = Î»b.add b
-map x1 ys
-\stopbuffer
+\refdef{global variable}\emph{gvar(expr)} is true when \emph{expr} is a variable that references a
+global variable. It is false when it references a local variable.
 
-\transexample{Function extraction}{from}{to}
+\refdef{local variable}\emph{lvar(expr)} is the complement of \emph{gvar}; it is true when \emph{expr}
+references a local variable, false when it references a global variable.
 
-\subsubsection{Argument propagation}
-This transform deals with arguments to user-defined functions that are
-not representable at runtime. This means these arguments cannot be
-preserved in the final form and most be {\em propagated}.
+\refdef{representable}\emph{representable(expr)} or \emph{representable(var)} is true when
+\emph{expr} or \emph{var} is \emph{representable}.
 
-Propagation means to create a specialized version of the called
-function, with the propagated argument already filled in. As a simple
-example, in the following program:
+\subsection{Binder uniqueness}
+A common problem in transformation systems, is binder uniqueness. When not
+considering this problem, it is easy to create transformations that mix up
+bindings and cause name collisions. Take for example, the following core
+expression:
 
 \startlambda
-f = Î»a.Î»b.a + b
-inc = Î»a.f a 1
+(Î»a.Î»b.Î»c. a * b * c) x c
 \stoplambda
 
-we could {\em propagate} the constant argument 1, with the following
-result:
+By applying Î²-reduction (see below) once, we can simplify this expression to:
 
 \startlambda
-f' = Î»a.a + 1
-inc = Î»a.f' a
+(Î»b.Î»c. x * b * c) c
 \stoplambda
 
-Special care must be taken when the to-be-propagated expression has any
-free variables. If this is the case, the original argument should not be
-removed alltogether, but replaced by all the free variables of the
-expression. In this way, the original expression can still be evaluated
-inside the new function. Also, this brings us closer to our goal: All
-these free variables will be simple variable references.
-
-To prevent us from propagating the same argument over and over, a simple
-local variable reference is not propagated (since is has exactly one
-free variable, itself, we would only replace that argument with itself).
+Now, we have replaced the \lam{a} binder with a reference to the \lam{x}
+binder. No harm done here. But note that we see multiple occurences of the
+\lam{c} binder. The first is a binding occurence, to which the second refers.
+The last, however refers to \emph{another} instance of \lam{c}, which is
+bound somewhere outside of this expression. Now, if we would apply beta
+reduction without taking heed of binder uniqueness, we would get:
 
-This shows that any free local variables that are not runtime representable
-cannot be brought into normal form by this transform. We rely on an
-inlining transformation to replace such a variable with an expression we
-can propagate again.
+\startlambda
+Î»c. x * c * c
+\stoplambda
 
-\starttrans
-x = E
-~
-x Y0 ... Yi ... Yn                               \lam{Yi} is not of a runtime representable type
----------------------------------------------    \lam{Yi} is not a local variable reference
-x' y0 ... yi-1 f0 ...  fm Yi+1 ... Yn            \lam{f0 ... fm} = free local vars of \lam{Yi}
-~
-x' = Î»y0 ... yi-1 f0 ... fm yi+1 ... yn .       
-      E y0 ... yi-1 Yi yi+1 ... yn   
+This is obviously not what was supposed to happen! The root of this problem is
+the reuse of binders: Identical binders can be bound in different scopes, such
+that only the inner one is \quote{visible} in the inner expression. In the example
+above, the \lam{c} binder was bound outside of the expression and in the inner
+lambda expression. Inside that lambda expression, only the inner \lam{c} is
+visible.
+
+There are a number of ways to solve this. \small{GHC} has isolated this
+problem to their binder substitution code, which performs \emph{deshadowing}
+during its expression traversal. This means that any binding that shadows
+another binding on a higher level is replaced by a new binder that does not
+shadow any other binding. This non-shadowing invariant is enough to prevent
+binder uniqueness problems in \small{GHC}.
+
+In our transformation system, maintaining this non-shadowing invariant is
+a bit harder to do (mostly due to implementation issues, the prototype doesn't
+use \small{GHC}'s subsitution code). Also, we can observe the following
+points.
 
-\stoptrans
+\startitemize
+\item Deshadowing does not guarantee overall uniqueness. For example, the
+following (slightly contrived) expression shows the identifier \lam{x} bound in
+two seperate places (and to different values), even though no shadowing
+occurs.
 
-TODO: Example
+\startlambda
+(let x = 1 in x) + (let x = 2 in x)
+\stoplambda
 
-\subsection{Cast propagation / simplification}
-This transform pushes casts down into the expression as far as possible. Since
-its exact role and need is not clear yet, this transformation is not yet
-specified.
+\item In our normal form (and the resulting \small{VHDL}), all binders
+(signals) will end up in the same scope. To allow this, all binders within the
+same function should be unique.
 
-\subsection{Return value simplification}
-This transformation ensures that the return value of a function is always a
-simple local variable reference.
+\item When we know that all binders in an expression are unique, moving around
+or removing a subexpression will never cause any binder conflicts. If we have
+some way to generate fresh binders, introducing new subexpressions will not
+cause any problems either. The only way to cause conflicts is thus to
+duplicate an existing subexpression.
+\stopitemize
 
-Currently implemented using lambda simplification, let simplification, and
-top simplification. Should change into something like the following, which
-works only on the result of a function instead of any subexpression. This is
-achieved by the contexts, like \lam{x = E}, though this is strictly not
-correct (you could read this as "if there is any function \lam{x} that binds
-\lam{E}, any \lam{E} can be transformed, while we only mean the \lam{E} that
-is bound by \lam{x}. This might need some extra notes or something).
+Given the above, our prototype maintains a unique binder invariant. This
+meanst that in any given moment during normalization, all binders \emph{within
+a single function} must be unique. To achieve this, we apply the following
+technique.
 
-\starttrans
-x = E                            \lam{E} is representable
-~                                \lam{E} is not a lambda abstraction
-E                                \lam{E} is not a let expression
----------------------------      \lam{E} is not a local variable reference
-let x = E in x
-\stoptrans
+TODO: Define fresh binders and unique supplies
 
-\starttrans
-x = Î»v0 ... Î»vn.E
-~                                \lam{E} is representable
-E                                \lam{E} is not a let expression
----------------------------      \lam{E} is not a local variable reference
-let x = E in x
-\stoptrans
+\startitemize
+\item Before starting normalization, all binders in the function are made
+unique. This is done by generating a fresh binder for every binder used. This
+also replaces binders that did not pose any conflict, but it does ensure that
+all binders within the function are generated by the same unique supply. See
+(TODO: ref fresh binder).
+\item Whenever a new binder must be generated, we generate a fresh binder that
+is guaranteed to be different from \emph{all binders generated so far}. This
+can thus never introduce duplication and will maintain the invariant.
+\item Whenever (part of) an expression is duplicated (for example when
+inlining), all binders in the expression are replaced with fresh binders
+(using the same method as at the start of normalization). These fresh binders
+can never introduce duplication, so this will maintain the invariant.
+\item Whenever we move part of an expression around within the function, there
+is no need to do anything special. There is obviously no way to introduce
+duplication by moving expressions around. Since we know that each of the
+binders is already unique, there is no way to introduce (incorrect) shadowing
+either.
+\stopitemize
 
-\starttrans
-x = Î»v0 ... Î»vn.let ... in E
-~                                \lam{E} is representable
-E                                \lam{E} is not a local variable reference
----------------------------
-let x = E in x
-\stoptrans
+\section{Transform passes}
+In this section we describe the actual transforms. Here we're using
+the core language in a notation that resembles lambda calculus.
 
-\startbuffer[from]
-x = add 1 2
-\stopbuffer
+Each of these transforms is meant to be applied to every (sub)expression
+in a program, for as long as it applies. Only when none of the
+transformations can be applied anymore, the program is in normal form (by
+definition). We hope to be able to prove that this form will obey all of the
+constraints defined above, but this has yet to happen (though it seems likely
+that it will).
 
-\startbuffer[to]
-x = let x = add 1 2 in x
-\stopbuffer
+Each of the transforms will be described informally first, explaining
+the need for and goal of the transform. Then, a formal definition is
+given, using a familiar syntax from the world of logic. Each transform
+is specified as a number of conditions (above the horizontal line) and a
+number of conclusions (below the horizontal line). The details of using
+this notation are still a bit fuzzy, so comments are welcom.
 
-\transexample{Return value simplification}{from}{to}
+  \subsection{General cleanup}
+
+    \subsubsection{Î²-reduction}
+      Î²-reduction is a well known transformation from lambda calculus, where it is
+      the main reduction step. It reduces applications of labmda abstractions,
+      removing both the lambda abstraction and the application.
+
+      In our transformation system, this step helps to remove unwanted lambda
+      abstractions (basically all but the ones at the top level). Other
+      transformations (application propagation, non-representable inlining) make
+      sure that most lambda abstractions will eventually be reducable by
+      Î²-reduction.
+
+      TODO: Define substitution syntax
+
+      \starttrans
+      (Î»x.E) M
+      -----------------
+      E[M/x]
+      \stoptrans
+
+      % And an example
+      \startbuffer[from]
+      (Î»a. 2 * a) (2 * b)
+      \stopbuffer
+
+      \startbuffer[to]
+      2 * (2 * b)
+      \stopbuffer
+
+      \transexample{Î²-reduction}{from}{to}
+
+    \subsubsection{Application propagation}
+      This transformation is meant to propagate application expressions downwards
+      into expressions as far as possible. This allows partial applications inside
+      expressions to become fully applied and exposes new transformation
+      possibilities for other transformations (like Î²-reduction).
+
+      \starttrans
+      let binds in E) M
+      -----------------
+      let binds in E M
+      \stoptrans
+
+      % And an example
+      \startbuffer[from]
+      ( let 
+          val = 1
+        in 
+          add val
+      ) 3
+      \stopbuffer
+
+      \startbuffer[to]
+      let 
+        val = 1
+      in 
+        add val 3
+      \stopbuffer
+
+      \transexample{Application propagation for a let expression}{from}{to}
+
+      \starttrans
+      (case x of
+        p1 -> E1
+        \vdots
+        pn -> En) M
+      -----------------
+      case x of
+        p1 -> E1 M
+        \vdots
+        pn -> En M
+      \stoptrans
+
+      % And an example
+      \startbuffer[from]
+      ( case x of 
+          True -> id
+          False -> neg
+      ) 1
+      \stopbuffer
+
+      \startbuffer[to]
+      case x of 
+        True -> id 1
+        False -> neg 1
+      \stopbuffer
+
+      \transexample{Application propagation for a case expression}{from}{to}
+
+    \subsubsection{Empty let removal}
+      This transformation is simple: It removes recursive lets that have no bindings
+      (which usually occurs when let derecursification removes the last binding from
+      it).
+
+      \starttrans
+      letrec in M
+      --------------
+      M
+      \stoptrans
+
+      \subsubsection{Simple let binding removal}
+      This transformation inlines simple let bindings (\eg a = b).
+
+      This transformation is not needed to get into normal form, but makes the
+      resulting \small{VHDL} a lot shorter.
+
+      \starttrans
+      letnonrec
+        a = b
+      in
+        M
+      -----------------
+      M[b/a]
+      \stoptrans
+
+      \starttrans
+      letrec
+        \vdots
+        a = b
+        \vdots
+      in
+        M
+      -----------------
+      let
+        \vdots [b/a]
+        \vdots [b/a]
+      in
+        M[b/a]
+      \stoptrans
+
+    \subsubsection{Unused let binding removal}
+      This transformation removes let bindings that are never used. Usually,
+      the desugarer introduces some unused let bindings.
+
+      This normalization pass should really be unneeded to get into normal form
+      (since ununsed bindings are not forbidden by the normal form), but in practice
+      the desugarer or simplifier emits some unused bindings that cannot be
+      normalized (e.g., calls to a \type{PatError} (TODO: Check this name)). Also,
+      this transformation makes the resulting \small{VHDL} a lot shorter.
+
+      \starttrans
+      let a = E in M
+      ----------------------------    \lam{a} does not occur free in \lam{M}
+      M
+      \stoptrans
+
+      \starttrans
+      letrec
+        \vdots
+        a = E
+        \vdots
+      in
+        M
+      ----------------------------    \lam{a} does not occur free in \lam{M}
+      letrec
+        \vdots
+        \vdots
+      in
+        M
+      \stoptrans
+
+    \subsubsection{Cast propagation / simplification}
+      This transform pushes casts down into the expression as far as possible.
+      Since its exact role and need is not clear yet, this transformation is
+      not yet specified.
+
+    \subsubsection{Compiler generated top level binding inlining}
+      TODO
+
+  \section{Program structure}
+
+    \subsubsection{Î·-abstraction}
+      This transformation makes sure that all arguments of a function-typed
+      expression are named, by introducing lambda expressions. When combined with
+      Î²-reduction and function inlining below, all function-typed expressions should
+      be lambda abstractions or global identifiers.
+
+      \starttrans
+      E                 \lam{E :: a -> b}
+      --------------    \lam{E} is not the first argument of an application.
+      Î»x.E x            \lam{E} is not a lambda abstraction.
+                        \lam{x} is a variable that does not occur free in \lam{E}.
+      \stoptrans
+
+      \startbuffer[from]
+      foo = Î»a.case a of 
+        True -> Î»b.mul b b
+        False -> id
+      \stopbuffer
+
+      \startbuffer[to]
+      foo = Î»a.Î»x.(case a of 
+          True -> Î»b.mul b b
+          False -> Î»y.id y) x
+      \stopbuffer
+
+      \transexample{Î·-abstraction}{from}{to}
+
+    \subsubsection{Let derecursification}
+      This transformation is meant to make lets non-recursive whenever possible.
+      This might allow other optimizations to do their work better. TODO: Why is
+      this needed exactly?
+
+    \subsubsection{Let flattening}
+      This transformation puts nested lets in the same scope, by lifting the
+      binding(s) of the inner let into a new let around the outer let. Eventually,
+      this will cause all let bindings to appear in the same scope (they will all be
+      in scope for the function return value).
+
+      Note that this transformation does not try to be smart when faced with
+      recursive lets, it will just leave the lets recursive (possibly joining a
+      recursive and non-recursive let into a single recursive let). The let
+      rederecursification transformation will do this instead.
+
+      \starttrans
+      letnonrec x = (let bindings in M) in N
+      ------------------------------------------
+      let bindings in (letnonrec x = M) in N
+      \stoptrans
+
+      \starttrans
+      letrec 
+        \vdots
+        x = (let bindings in M)
+        \vdots
+      in
+        N
+      ------------------------------------------
+      letrec
+        \vdots
+        bindings
+        x = M
+        \vdots
+      in
+        N
+      \stoptrans
+
+      \startbuffer[from]
+      let
+        a = letrec
+          x = 1
+          y = 2
+        in
+          x + y
+      in
+        letrec
+          b = let c = 3 in a + c
+          d = 4
+        in
+          d + b
+      \stopbuffer
+      \startbuffer[to]
+      letrec
+        x = 1
+        y = 2
+      in
+        let
+          a = x + y
+        in
+          letrec
+            c = 3
+            b = a + c
+            d = 4
+          in
+            d + b
+      \stopbuffer
+
+      \transexample{Let flattening}{from}{to}
+
+    \subsubsection{Return value simplification}
+      This transformation ensures that the return value of a function is always a
+      simple local variable reference.
+
+      Currently implemented using lambda simplification, let simplification, and
+      top simplification. Should change into something like the following, which
+      works only on the result of a function instead of any subexpression. This is
+      achieved by the contexts, like \lam{x = E}, though this is strictly not
+      correct (you could read this as "if there is any function \lam{x} that binds
+      \lam{E}, any \lam{E} can be transformed, while we only mean the \lam{E} that
+      is bound by \lam{x}. This might need some extra notes or something).
+
+      \starttrans
+      x = E                            \lam{E} is representable
+      ~                                \lam{E} is not a lambda abstraction
+      E                                \lam{E} is not a let expression
+      ---------------------------      \lam{E} is not a local variable reference
+      let x = E in x
+      \stoptrans
+
+      \starttrans
+      x = Î»v0 ... Î»vn.E
+      ~                                \lam{E} is representable
+      E                                \lam{E} is not a let expression
+      ---------------------------      \lam{E} is not a local variable reference
+      let x = E in x
+      \stoptrans
+
+      \starttrans
+      x = Î»v0 ... Î»vn.let ... in E
+      ~                                \lam{E} is representable
+      E                                \lam{E} is not a local variable reference
+      ---------------------------
+      let x = E in x
+      \stoptrans
+
+      \startbuffer[from]
+      x = add 1 2
+      \stopbuffer
+
+      \startbuffer[to]
+      x = let x = add 1 2 in x
+      \stopbuffer
+
+      \transexample{Return value simplification}{from}{to}
+
+  \subsection{Argument simplification}
+    The transforms in this section deal with simplifying application
+    arguments into normal form. The goal here is to:
+
+    \startitemize
+     \item Make all arguments of user-defined functions (\eg, of which
+     we have a function body) simple variable references of a runtime
+     representable type. This is needed, since these applications will be turned
+     into component instantiations.
+     \item Make all arguments of builtin functions one of:
+       \startitemize
+        \item A type argument.
+        \item A dictionary argument.
+        \item A type level expression.
+        \item A variable reference of a runtime representable type.
+        \item A variable reference or partial application of a function type.
+       \stopitemize
+    \stopitemize
+
+    When looking at the arguments of a user-defined function, we can
+    divide them into two categories:
+    \startitemize
+      \item Arguments of a runtime representable type (\eg bits or vectors).
+
+            These arguments can be preserved in the program, since they can
+            be translated to input ports later on.  However, since we can
+            only connect signals to input ports, these arguments must be
+            reduced to simple variables (for which signals will be
+            produced). This is taken care of by the argument extraction
+            transform.
+      \item Non-runtime representable typed arguments.
+            
+            These arguments cannot be preserved in the program, since we
+            cannot represent them as input or output ports in the resulting
+            \small{VHDL}. To remove them, we create a specialized version of the
+            called function with these arguments filled in. This is done by
+            the argument propagation transform.
+
+            Typically, these arguments are type and dictionary arguments that are
+            used to make functions polymorphic. By propagating these arguments, we
+            are essentially doing the same which GHC does when it specializes
+            functions: Creating multiple variants of the same function, one for
+            each type for which it is used. Other common non-representable
+            arguments are functions, e.g. when calling a higher order function
+            with another function or a lambda abstraction as an argument.
+
+            The reason for doing this is similar to the reasoning provided for
+            the inlining of non-representable let bindings above. In fact, this
+            argument propagation could be viewed as a form of cross-function
+            inlining.
+    \stopitemize
+
+    TODO: Check the following itemization.
+
+    When looking at the arguments of a builtin function, we can divide them
+    into categories: 
+
+    \startitemize
+      \item Arguments of a runtime representable type.
+            
+            As we have seen with user-defined functions, these arguments can
+            always be reduced to a simple variable reference, by the
+            argument extraction transform. Performing this transform for
+            builtin functions as well, means that the translation of builtin
+            functions can be limited to signal references, instead of
+            needing to support all possible expressions.
+
+      \item Arguments of a function type.
+            
+            These arguments are functions passed to higher order builtins,
+            like \lam{map} and \lam{foldl}. Since implementing these
+            functions for arbitrary function-typed expressions (\eg, lambda
+            expressions) is rather comlex, we reduce these arguments to
+            (partial applications of) global functions.
+            
+            We can still support arbitrary expressions from the user code,
+            by creating a new global function containing that expression.
+            This way, we can simply replace the argument with a reference to
+            that new function. However, since the expression can contain any
+            number of free variables we also have to include partial
+            applications in our normal form.
+
+            This category of arguments is handled by the function extraction
+            transform.
+      \item Other unrepresentable arguments.
+            
+            These arguments can take a few different forms:
+            \startdesc{Type arguments}
+              In the core language, type arguments can only take a single
+              form: A type wrapped in the Type constructor. Also, there is
+              nothing that can be done with type expressions, except for
+              applying functions to them, so we can simply leave type
+              arguments as they are.
+            \stopdesc
+            \startdesc{Dictionary arguments}
+              In the core language, dictionary arguments are used to find
+              operations operating on one of the type arguments (mostly for
+              finding class methods). Since we will not actually evaluatie
+              the function body for builtin functions and can generate
+              code for builtin functions by just looking at the type
+              arguments, these arguments can be ignored and left as they
+              are.
+            \stopdesc
+            \startdesc{Type level arguments}
+              Sometimes, we want to pass a value to a builtin function, but
+              we need to know the value at compile time. Additionally, the
+              value has an impact on the type of the function. This is
+              encoded using type-level values, where the actual value of the
+              argument is not important, but the type encodes some integer,
+              for example. Since the value is not important, the actual form
+              of the expression does not matter either and we can leave
+              these arguments as they are.
+            \stopdesc
+            \startdesc{Other arguments}
+              Technically, there is still a wide array of arguments that can
+              be passed, but does not fall into any of the above categories.
+              However, none of the supported builtin functions requires such
+              an argument. This leaves use with passing unsupported types to
+              a function, such as calling \lam{head} on a list of functions.
+
+              In these cases, it would be impossible to generate hardware
+              for such a function call anyway, so we can ignore these
+              arguments.
+
+              The only way to generate hardware for builtin functions with
+              arguments like these, is to expand the function call into an
+              equivalent core expression (\eg, expand map into a series of
+              function applications). But for now, we choose to simply not
+              support expressions like these.
+            \stopdesc
+
+            From the above, we can conclude that we can simply ignore these
+            other unrepresentable arguments and focus on the first two
+            categories instead.
+    \stopitemize
+
+    \subsubsection{Argument simplification}
+      This transform deals with arguments to functions that
+      are of a runtime representable type. It ensures that they will all become
+      references to global variables, or local signals in the resulting \small{VHDL}. 
+
+      TODO: It seems we can map an expression to a port, not only a signal.
+      Perhaps this makes this transformation not needed?
+      TODO: Say something about dataconstructors (without arguments, like True
+      or False), which are variable references of a runtime representable
+      type, but do not result in a signal.
+
+      To reduce a complex expression to a simple variable reference, we create
+      a new let expression around the application, which binds the complex
+      expression to a new variable. The original function is then applied to
+      this variable.
+
+      \starttrans
+      M N
+      --------------------    \lam{N} is of a representable type
+      let x = N in M x        \lam{N} is not a local variable reference
+      \stoptrans
+
+      \startbuffer[from]
+      add (add a 1) 1
+      \stopbuffer
+
+      \startbuffer[to]
+      let x = add a 1 in add x 1
+      \stopbuffer
+
+      \transexample{Argument extraction}{from}{to}
+
+    \subsubsection{Function extraction}
+      This transform deals with function-typed arguments to builtin functions.
+      Since these arguments cannot be propagated, we choose to extract them
+      into a new global function instead.
+
+      Any free variables occuring in the extracted arguments will become
+      parameters to the new global function. The original argument is replaced
+      with a reference to the new function, applied to any free variables from
+      the original argument.
+
+      This transformation is useful when applying higher order builtin functions
+      like \hs{map} to a lambda abstraction, for example. In this case, the code
+      that generates \small{VHDL} for \hs{map} only needs to handle top level functions and
+      partial applications, not any other expression (such as lambda abstractions or
+      even more complicated expressions).
+
+      \starttrans
+      M N                     \lam{M} is a (partial aplication of a) builtin function.
+      ---------------------   \lam{f0 ... fn} = free local variables of \lam{N}
+      M x f0 ... fn           \lam{N :: a -> b}
+      ~                       \lam{N} is not a (partial application of) a top level function
+      x = Î»f0 ... Î»fn.N
+      \stoptrans
+
+      \startbuffer[from]
+      map (Î»a . add a b) xs
+
+      map (add b) ys
+      \stopbuffer
+
+      \startbuffer[to]
+      x0 = Î»b.Î»a.add a b
+      ~
+      map x0 xs
+
+      x1 = Î»b.add b
+      map x1 ys
+      \stopbuffer
+
+      \transexample{Function extraction}{from}{to}
+
+    \subsubsection{Argument propagation}
+      This transform deals with arguments to user-defined functions that are
+      not representable at runtime. This means these arguments cannot be
+      preserved in the final form and most be {\em propagated}.
+
+      Propagation means to create a specialized version of the called
+      function, with the propagated argument already filled in. As a simple
+      example, in the following program:
+
+      \startlambda
+      f = Î»a.Î»b.a + b
+      inc = Î»a.f a 1
+      \stoplambda
+
+      we could {\em propagate} the constant argument 1, with the following
+      result:
+
+      \startlambda
+      f' = Î»a.a + 1
+      inc = Î»a.f' a
+      \stoplambda
+
+      Special care must be taken when the to-be-propagated expression has any
+      free variables. If this is the case, the original argument should not be
+      removed alltogether, but replaced by all the free variables of the
+      expression. In this way, the original expression can still be evaluated
+      inside the new function. Also, this brings us closer to our goal: All
+      these free variables will be simple variable references.
+
+      To prevent us from propagating the same argument over and over, a simple
+      local variable reference is not propagated (since is has exactly one
+      free variable, itself, we would only replace that argument with itself).
+
+      This shows that any free local variables that are not runtime representable
+      cannot be brought into normal form by this transform. We rely on an
+      inlining transformation to replace such a variable with an expression we
+      can propagate again.
+
+      \starttrans
+      x = E
+      ~
+      x Y0 ... Yi ... Yn                               \lam{Yi} is not of a runtime representable type
+      ---------------------------------------------    \lam{Yi} is not a local variable reference
+      x' y0 ... yi-1 f0 ...  fm Yi+1 ... Yn            \lam{f0 ... fm} = free local vars of \lam{Yi}
+      ~
+      x' = Î»y0 ... yi-1 f0 ... fm yi+1 ... yn .       
+            E y0 ... yi-1 Yi yi+1 ... yn   
+
+      \stoptrans
+
+      TODO: Example
+
+  \subsection{Case simplification}
+    \subsubsection{Scrutinee simplification}
+      This transform ensures that the scrutinee of a case expression is always
+      a simple variable reference.
+
+      \starttrans
+      case E of
+        alts
+      -----------------        \lam{E} is not a local variable reference
+      let x = E in 
+        case E of
+          alts
+      \stoptrans
+
+      \startbuffer[from]
+      case (foo a) of
+        True -> a
+        False -> b
+      \stopbuffer
+
+      \startbuffer[to]
+      let x = foo a in
+        case x of
+          True -> a
+          False -> b
+      \stopbuffer
+
+      \transexample{Let flattening}{from}{to}
+
+
+    \subsubsection{Case simplification}
+      This transformation ensures that all case expressions become normal form. This
+      means they will become one of:
+      \startitemize
+      \item An extractor case with a single alternative that picks a single field
+      from a datatype, \eg \lam{case x of (a, b) -> a}.
+      \item A selector case with multiple alternatives and only wild binders, that
+      makes a choice between expressions based on the constructor of another
+      expression, \eg \lam{case x of Low -> a; High -> b}.
+      \stopitemize
+
+      \starttrans
+      case E of
+        C0 v0,0 ... v0,m -> E0
+        \vdots
+        Cn vn,0 ... vn,m -> En
+      --------------------------------------------------- \forall i \forall j, 0 <= i <= n, 0 <= i < m (\lam{wi,j} is a wild (unused) binder)
+      letnonrec
+        v0,0 = case x of C0 v0,0 .. v0,m -> v0,0
+        \vdots
+        v0,m = case x of C0 v0,0 .. v0,m -> v0,m
+        x0 = E0
+        \dots
+        vn,m = case x of Cn vn,0 .. vn,m -> vn,m
+        xn = En
+      in
+        case E of
+          C0 w0,0 ... w0,m -> x0
+          \vdots
+          Cn wn,0 ... wn,m -> xn
+      \stoptrans
+
+      TODO: This transformation specified like this is complicated and misses
+      conditions to prevent looping with itself. Perhaps we should split it here for
+      discussion?
+
+      \startbuffer[from]
+      case a of
+        True -> add b 1
+        False -> add b 2
+      \stopbuffer
+
+      \startbuffer[to]
+      letnonrec
+        x0 = add b 1
+        x1 = add b 2
+      in
+        case a of
+          True -> x0
+          False -> x1
+      \stopbuffer
+
+      \transexample{Selector case simplification}{from}{to}
+
+      \startbuffer[from]
+      case a of
+        (,) b c -> add b c
+      \stopbuffer
+      \startbuffer[to]
+      letnonrec
+        b = case a of (,) b c -> b
+        c = case a of (,) b c -> c
+        x0 = add b c
+      in
+        case a of
+          (,) w0 w1 -> x0
+      \stopbuffer
+
+      \transexample{Extractor case simplification}{from}{to}
+
+    \subsubsection{Case removal}
+      This transform removes any case statements with a single alternative and
+      only wild binders.
+
+      These "useless" case statements are usually leftovers from case simplification
+      on extractor case (see the previous example).
+
+      \starttrans
+      case x of
+        C v0 ... vm -> E
+      ----------------------     \lam{\forall i, 0 <= i <= m} (\lam{vi} does not occur free in E)
+      E
+      \stoptrans
+
+      \startbuffer[from]
+      case a of
+        (,) w0 w1 -> x0
+      \stopbuffer
+
+      \startbuffer[to]
+      x0
+      \stopbuffer
+
+      \transexample{Case removal}{from}{to}
+
+\subsection{Monomorphisation}
+  TODO: Better name for this section
+
+  Reference type-specialization (== argument propagation)
+
+\subsubsection{Defunctionalization}
+  Reference higher-order-specialization (== argument propagation)
+
+    \subsubsection{Non-representable binding inlining}
+      This transform inlines let bindings that have a non-representable type. Since
+      we can never generate a signal assignment for these bindings (we cannot
+      declare a signal assignment with a non-representable type, for obvious
+      reasons), we have no choice but to inline the binding to remove it.
+
+      If the binding is non-representable because it is a lambda abstraction, it is
+      likely that it will inlined into an application and Î²-reduction will remove
+      the lambda abstraction and turn it into a representable expression at the
+      inline site. The same holds for partial applications, which can be turned into
+      full applications by inlining.
+
+      Other cases of non-representable bindings we see in practice are primitive
+      Haskell types. In most cases, these will not result in a valid normalized
+      output, but then the input would have been invalid to start with. There is one
+      exception to this: When a builtin function is applied to a non-representable
+      expression, things might work out in some cases. For example, when you write a
+      literal \hs{SizedInt} in Haskell, like \hs{1 :: SizedInt D8}, this results in
+      the following core: \lam{fromInteger (smallInteger 10)}, where for example
+      \lam{10 :: GHC.Prim.Int\#} and \lam{smallInteger 10 :: Integer} have
+      non-representable types. TODO: This/these paragraph(s) should probably become a
+      separate discussion somewhere else.
+
+      \starttrans
+      letnonrec a = E in M
+      --------------------------    \lam{E} has a non-representable type.
+      M[E/a]
+      \stoptrans
+
+      \starttrans
+      letrec 
+        \vdots
+        a = E
+        \vdots
+      in
+        M
+      --------------------------    \lam{E} has a non-representable type.
+      letrec
+        \vdots [E/a]
+        \vdots [E/a]
+      in
+        M[E/a]
+      \stoptrans
+
+      \startbuffer[from]
+      letrec
+        a = smallInteger 10
+        inc = Î»a -> add a 1
+        inc' = add 1
+        x = fromInteger a 
+      in
+        inc (inc' x)
+      \stopbuffer
+
+      \startbuffer[to]
+      letrec
+        x = fromInteger (smallInteger 10)
+      in
+        (Î»a -> add a 1) (add 1 x)
+      \stopbuffer
+
+      \transexample{Let flattening}{from}{to}
+
+
+\section{Provable properties}
+  When looking at the system of transformations outlined above, there are a
+  number of questions that we can ask ourselves. The main question is of course:
+  \quote{Does our system work as intended?}. We can split this question into a
+  number of subquestions:
+
+  \startitemize[KR]
+  \item[q:termination] Does our system \emph{terminate}? Since our system will
+  keep running as long as transformations apply, there is an obvious risk that
+  it will keep running indefinitely. One transformation produces a result that
+  is transformed back to the original by another transformation, for example.
+  \item[q:soundness] Is our system \emph{sound}? Since our transformations
+  continuously modify the expression, there is an obvious risk that the final
+  normal form will not be equivalent to the original program: Its meaning could
+  have changed.
+  \item[q:completeness] Is our system \emph{complete}? Since we have a complex
+  system of transformations, there is an obvious risk that some expressions will
+  not end up in our intended normal form, because we forgot some transformation.
+  In other words: Does our transformation system result in our intended normal
+  form for all possible inputs?
+  \item[q:determinism] Is our system \emph{deterministic}? Since we have defined
+  no particular order in which the transformation should be applied, there is an
+  obvious risk that different transformation orderings will result in
+  \emph{different} normal forms. They might still both be intended normal forms
+  (if our system is \emph{complete}) and describe correct hardware (if our
+  system is \emph{sound}), so this property is less important than the previous
+  three: The translator would still function properly without it.
+  \stopitemize
+
+  \subsection{Graph representation}
+    Before looking into how to prove these properties, we'll look at our
+    transformation system from a graph perspective. The nodes of the graph are
+    all possible Core expressions. The (directed) edges of the graph are
+    transformations. When a transformation Î± applies to an expression \lam{A} to
+    produce an expression \lam{B}, we add an edge from the node for \lam{A} to the
+    node for \lam{B}, labeled Î±.
+
+    \startuseMPgraphic{TransformGraph}
+      save a, b, c, d;
+
+      % Nodes
+      newCircle.a(btex \lam{(Î»x.Î»y. (+) x y) 1} etex);
+      newCircle.b(btex \lam{Î»y. (+) 1 y} etex);
+      newCircle.c(btex \lam{(Î»x.(+) x) 1} etex);
+      newCircle.d(btex \lam{(+) 1} etex);
+
+      b.c = origin;
+      c.c = b.c + (4cm, 0cm);
+      a.c = midpoint(b.c, c.c) + (0cm, 4cm);
+      d.c = midpoint(b.c, c.c) - (0cm, 3cm);
+
+      % Î²-conversion between a and b
+      ncarc.a(a)(b) "name(bred)";
+      ObjLabel.a(btex $\xrightarrow[normal]{}{Î²}$ etex) "labpathname(bred)", "labdir(rt)";
+      ncarc.b(b)(a) "name(bexp)", "linestyle(dashed withdots)";
+      ObjLabel.b(btex $\xleftarrow[normal]{}{Î²}$ etex) "labpathname(bexp)", "labdir(lft)";
+
+      % Î·-conversion between a and c
+      ncarc.a(a)(c) "name(ered)";
+      ObjLabel.a(btex $\xrightarrow[normal]{}{Î·}$ etex) "labpathname(ered)", "labdir(rt)";
+      ncarc.c(c)(a) "name(eexp)", "linestyle(dashed withdots)";
+      ObjLabel.c(btex $\xleftarrow[normal]{}{Î·}$ etex) "labpathname(eexp)", "labdir(lft)";
+
+      % Î·-conversion between b and d
+      ncarc.b(b)(d) "name(ered)";
+      ObjLabel.b(btex $\xrightarrow[normal]{}{Î·}$ etex) "labpathname(ered)", "labdir(rt)";
+      ncarc.d(d)(b) "name(eexp)", "linestyle(dashed withdots)";
+      ObjLabel.d(btex $\xleftarrow[normal]{}{Î·}$ etex) "labpathname(eexp)", "labdir(lft)";
+
+      % Î²-conversion between c and d
+      ncarc.c(c)(d) "name(bred)";
+      ObjLabel.c(btex $\xrightarrow[normal]{}{Î²}$ etex) "labpathname(bred)", "labdir(rt)";
+      ncarc.d(d)(c) "name(bexp)", "linestyle(dashed withdots)";
+      ObjLabel.d(btex $\xleftarrow[normal]{}{Î²}$ etex) "labpathname(bexp)", "labdir(lft)";
+
+      % Draw objects and lines
+      drawObj(a, b, c, d);
+    \stopuseMPgraphic
+
+    \placeexample[right][ex:TransformGraph]{Partial graph of a labmda calculus
+    system with Î² and Î· reduction (solid lines) and expansion (dotted lines).}
+        \boxedgraphic{TransformGraph}
+
+    Of course our graph is unbounded, since we can construct an infinite amount of
+    Core expressions. Also, there might potentially be multiple edges between two
+    given nodes (with different labels), though seems unlikely to actually happen
+    in our system.
+
+    See \in{example}[ex:TransformGraph] for the graph representation of a very
+    simple lambda calculus that contains just the expressions \lam{(Î»x.Î»y. (+) x
+    y) 1}, \lam{Î»y. (+) 1 y}, \lam{(Î»x.(+) x) 1} and \lam{(+) 1}. The
+    transformation system consists of Î²-reduction and Î·-reduction (solid edges) or
+    Î²-reduction and Î·-reduction (dotted edges).
+
+    TODO: Define Î²-reduction and Î·-reduction?
+
+    Note that the normal form of such a system consists of the set of nodes
+    (expressions) without outgoing edges, since those are the expression to which
+    no transformation applies anymore. We call this set of nodes the \emph{normal
+    set}.
+
+    From such a graph, we can derive some properties easily:
+    \startitemize[KR]
+      \item A system will \emph{terminate} if there is no path of infinite length
+      in the graph (this includes cycles).
+      \item Soundness is not easily represented in the graph.
+      \item A system is \emph{complete} if all of the nodes in the normal set have
+      the intended normal form. The inverse (that all of the nodes outside of
+      the normal set are \emph{not} in the intended normal form) is not
+      strictly required.
+      \item A system is deterministic if all paths from a node, which end in a node
+      in the normal set, end at the same node.
+    \stopitemize
+
+    When looking at the \in{example}[ex:TransformGraph], we see that the system
+    terminates for both the reduction and expansion systems (but note that, for
+    expansion, this is only true because we've limited the possible expressions!
+    In comlete lambda calculus, there would be a path from \lam{(Î»x.Î»y. (+) x y)
+    1} to \lam{(Î»x.Î»y.(Î»z.(+) z) x y) 1} to \lam{(Î»x.Î»y.(Î»z.(Î»q.(+) q) z) x y) 1}
+    etc.)
+
+    If we would consider the system with both expansion and reduction, there would
+    no longer be termination, since there would be cycles all over the place.
+
+    The reduction and expansion systems have a normal set of containing just
+    \lam{(+) 1} or \lam{(Î»x.Î»y. (+) x y) 1} respectively. Since all paths in
+    either system end up in these normal forms, both systems are \emph{complete}.
+    Also, since there is only one normal form, it must obviously be
+    \emph{deterministic} as well.
+
+  \subsection{Termination}
+    Approach: Counting.
+
+    Church-Rosser?
+
+  \subsection{Soundness}
+    Needs formal definition of semantics.
+    Prove for each transformation seperately, implies soundness of the system.
+   
+  \subsection{Completeness}
+    Show that any transformation applies to every Core expression that is not
+    in normal form. To prove: no transformation applies => in intended form.
+    Show the reverse: Not in intended form => transformation applies.
+
+  \subsection{Determinism}
+    How to prove this?