More indent fixes to the Normalization chapter.

author Matthijs Kooijman <matthijs@stdin.nl>

Mon, 2 Nov 2009 11:21:06 +0000 (12:21 +0100)

committer Matthijs Kooijman <matthijs@stdin.nl>

Mon, 2 Nov 2009 12:12:32 +0000 (13:12 +0100)
author Matthijs Kooijman <matthijs@stdin.nl>
Mon, 2 Nov 2009 11:21:06 +0000 (12:21 +0100)
committer Matthijs Kooijman <matthijs@stdin.nl>
Mon, 2 Nov 2009 12:12:32 +0000 (13:12 +0100)
diff --git a/Chapters/Normalization.tex b/Chapters/Normalization.tex

index d8f2fe9723182bdc4d66236e1b79b48a63454470..d93717a1337e968253320a41d5a25cf32971ae7f 100644 (file)
--- a/Chapters/Normalization.tex
+++ b/Chapters/Normalization.tex
@@ -1,1666 +1,1665 @@
  \chapter[chap:normalization]{Normalization}
+  % A helper to print a single example in the half the page width. The example
+  % text should be in a buffer whose name is given in an argument.
+  %
+  % The align=right option really does left-alignment, but without the program
+  % will end up on a single line. The strut=no option prevents a bunch of empty
+  % space at the start of the frame.
+  \define[1]\example{
+    \framed[offset=1mm,align=right,strut=no,background=box,frame=off]{
+      \setuptyping[option=LAM,style=sans,before=,after=]
+      \typebuffer[#1]
+      \setuptyping[option=none,style=\tttf]
+    }
+  }
  
-% A helper to print a single example in the half the page width. The example
-% text should be in a buffer whose name is given in an argument.
-%
-% The align=right option really does left-alignment, but without the program
-% will end up on a single line. The strut=no option prevents a bunch of empty
-% space at the start of the frame.
-\define[1]\example{
-  \framed[offset=1mm,align=right,strut=no,background=box,frame=off]{
-    \setuptyping[option=LAM,style=sans,before=,after=]
-    \typebuffer[#1]
-    \setuptyping[option=none,style=\tttf]
+  \define[3]\transexample{
+    \placeexample[here]{#1}
+    \startcombination[2*1]
+      {\example{#2}}{Original program}
+      {\example{#3}}{Transformed program}
+    \stopcombination
    }
-}
-
-\define[3]\transexample{
-  \placeexample[here]{#1}
-  \startcombination[2*1]
-    {\example{#2}}{Original program}
-    {\example{#3}}{Transformed program}
-  \stopcombination
-}
-
-The first step in the core to \small{VHDL} translation process, is normalization. We
-aim to bring the core description into a simpler form, which we can
-subsequently translate into \small{VHDL} easily. This normal form is needed because
-the full core language is more expressive than \small{VHDL} in some areas and because
-core can describe expressions that do not have a direct hardware
-interpretation.
-
-TODO: Describe core properties not supported in \small{VHDL}, and describe how the
-\small{VHDL} we want to generate should look like.
-
-\section{Normal form}
-The transformations described here have a well-defined goal: To bring the
-program in a well-defined form that is directly translatable to hardware,
-while fully preserving the semantics of the program. We refer to this form as
-the \emph{normal form} of the program. The formal definition of this normal
-form is quite simple:
-
-\placedefinition{}{A program is in \emph{normal form} if none of the
-transformations from this chapter apply.}
-
-Of course, this is an \quote{easy} definition of the normal form, since our
-program will end up in normal form automatically. The more interesting part is
-to see if this normal form actually has the properties we would like it to
-have.
-
-But, before getting into more definitions and details about this normal form,
-let's try to get a feeling for it first. The easiest way to do this is by
-describing the things we want to not have in a normal form.
-
-\startitemize
-  \item Any \emph{polymorphism} must be removed. When laying down hardware, we
-  can't generate any signals that can have multiple types. All types must be
-  completely known to generate hardware.
-  
-  \item Any \emph{higher order} constructions must be removed. We can't
-  generate a hardware signal that contains a function, so all values,
-  arguments and returns values used must be first order.
-
-  \item Any complex \emph{nested scopes} must be removed. In the \small{VHDL}
-  description, every signal is in a single scope. Also, full expressions are
-  not supported everywhere (in particular port maps can only map signal names,
-  not expressions). To make the \small{VHDL} generation easy, all values must be bound
-  on the \quote{top level}.
-\stopitemize
-
-TODO: Intermezzo: functions vs plain values
-
-A very simple example of a program in normal form is given in
-\in{example}[ex:MulSum]. As you can see, all arguments to the function (which
-will become input ports in the final hardware) are at the top. This means that
-the body of the final lambda abstraction is never a function, but always a
-plain value.
-
-After the lambda abstractions, we see a single let expression, that binds two
-variables (\lam{mul} and \lam{sum}). These variables will be signals in the
-final hardware, bound to the output port of the \lam{*} and \lam{+}
-components.
-
-The final line (the \quote{return value} of the function) selects the
-\lam{sum} signal to be the output port of the function. This \quote{return
-value} can always only be a variable reference, never a more complex
-expression.
-
-\startbuffer[MulSum]
-alu :: Bit -> Word -> Word -> Word
-alu = λa.λb.λc.
-    let
-      mul = (*) a b
-      sum = (+) mul c
-    in
-      sum
-\stopbuffer
-
-\startuseMPgraphic{MulSum}
-  save a, b, c, mul, add, sum;
-
-  % I/O ports
-  newCircle.a(btex $a$ etex) "framed(false)";
-  newCircle.b(btex $b$ etex) "framed(false)";
-  newCircle.c(btex $c$ etex) "framed(false)";
-  newCircle.sum(btex $res$ etex) "framed(false)";
-
-  % Components
-  newCircle.mul(btex - etex);
-  newCircle.add(btex + etex);
-
-  a.c      - b.c   = (0cm, 2cm);
-  b.c      - c.c   = (0cm, 2cm);
-  add.c            = c.c + (2cm, 0cm);
-  mul.c            = midpoint(a.c, b.c) + (2cm, 0cm);
-  sum.c            = add.c + (2cm, 0cm);
-  c.c              = origin;
-
-  % Draw objects and lines
-  drawObj(a, b, c, mul, add, sum);
-
-  ncarc(a)(mul) "arcangle(15)";
-  ncarc(b)(mul) "arcangle(-15)";
-  ncline(c)(add);
-  ncline(mul)(add);
-  ncline(add)(sum);
-\stopuseMPgraphic
-
-\placeexample[here][ex:MulSum]{Simple architecture consisting of an adder and a
-subtractor.}
-  \startcombination[2*1]
-    {\typebufferlam{MulSum}}{Core description in normal form.}
-    {\boxedgraphic{MulSum}}{The architecture described by the normal form.}
-  \stopcombination
-
-The previous example described composing an architecture by calling other
-functions (operators), resulting in a simple architecture with component and
-connection. There is of course also some mechanism for choice in the normal
-form. In a normal Core program, the \emph{case} expression can be used in a
-few different ways to describe choice. In normal form, this is limited to a
-very specific form.
-
-\in{Example}[ex:AddSubAlu] shows an example describing a
-simple \small{ALU}, which chooses between two operations based on an opcode
-bit. The main structure is the same as in \in{example}[ex:MulSum], but this
-time the \lam{res} variable is bound to a case expression. This case
-expression scrutinizes the variable \lam{opcode} (and scrutinizing more
-complex expressions is not supported). The case expression can select a
-different variable based on the constructor of \lam{opcode}.
-
-\startbuffer[AddSubAlu]
-alu :: Bit -> Word -> Word -> Word
-alu = λopcode.λa.λb.
-    let
-      res1 = (+) a b
-      res2 = (-) a b
-      res = case opcode of
-        Low -> res1
-        High -> res2
-    in
-      res
-\stopbuffer
-
-\startuseMPgraphic{AddSubAlu}
-  save opcode, a, b, add, sub, mux, res;
-
-  % I/O ports
-  newCircle.opcode(btex $opcode$ etex) "framed(false)";
-  newCircle.a(btex $a$ etex) "framed(false)";
-  newCircle.b(btex $b$ etex) "framed(false)";
-  newCircle.res(btex $res$ etex) "framed(false)";
-  % Components
-  newCircle.add(btex + etex);
-  newCircle.sub(btex - etex);
-  newMux.mux;
-
-  opcode.c - a.c   = (0cm, 2cm);
-  add.c    - a.c   = (4cm, 0cm);
-  sub.c    - b.c   = (4cm, 0cm);
-  a.c      - b.c   = (0cm, 3cm);
-  mux.c            = midpoint(add.c, sub.c) + (1.5cm, 0cm);
-  res.c    - mux.c = (1.5cm, 0cm);
-  b.c              = origin;
-
-  % Draw objects and lines
-  drawObj(opcode, a, b, res, add, sub, mux);
-
-  ncline(a)(add) "posA(e)";
-  ncline(b)(sub) "posA(e)";
-  nccurve(a)(sub) "posA(e)", "angleA(0)";
-  nccurve(b)(add) "posA(e)", "angleA(0)";
-  nccurve(add)(mux) "posB(inpa)", "angleB(0)";
-  nccurve(sub)(mux) "posB(inpb)", "angleB(0)";
-  nccurve(opcode)(mux) "posB(n)", "angleA(0)", "angleB(-90)";
-  ncline(mux)(res) "posA(out)";
-\stopuseMPgraphic
-
-\placeexample[here][ex:AddSubAlu]{Simple \small{ALU} supporting two operations.}
-  \startcombination[2*1]
-    {\typebufferlam{AddSubAlu}}{Core description in normal form.}
-    {\boxedgraphic{AddSubAlu}}{The architecture described by the normal form.}
-  \stopcombination
-
-As a more complete example, consider \in{example}[ex:NormalComplete]. This
-example contains everything that is supported in normal form, with the
-exception of builtin higher order functions. The graphical version of the
-architecture contains a slightly simplified version, since the state tuple
-packing and unpacking have been left out. Instead, two seperate registers are
-drawn. Also note that most synthesis tools will further optimize this
-architecture by removing the multiplexers at the register input and replace
-them with some logic in the clock inputs, but we want to show the architecture
-as close to the description as possible.
-
-\startbuffer[NormalComplete]
-  regbank :: Bit 
-             -> Word 
-             -> State (Word, Word) 
-             -> (State (Word, Word), Word)
-
-  -- All arguments are an inital lambda
-  regbank = λa.λd.λsp.
-  -- There are nested let expressions at top level
-  let
-    -- Unpack the state by coercion (\eg, cast from
-    -- State (Word, Word) to (Word, Word))
-    s = sp :: (Word, Word)
-    -- Extract both registers from the state
-    r1 = case s of (fst, snd) -> fst
-    r2 = case s of (fst, snd) -> snd
-    -- Calling some other user-defined function.
-    d' = foo d
-    -- Conditional connections
-    out = case a of
-      High -> r1
-      Low -> r2
-    r1' = case a of
-      High -> d'
-      Low -> r1
-    r2' = case a of
-      High -> r2
-      Low -> d'
-    -- Packing a tuple
-    s' = (,) r1' r2'
-    -- pack the state by coercion (\eg, cast from
-    -- (Word, Word) to State (Word, Word))
-    sp' = s' :: State (Word, Word)
-    -- Pack our return value
-    res = (,) sp' out
-  in
-    -- The actual result
-    res
-\stopbuffer
-
-\startuseMPgraphic{NormalComplete}
-  save a, d, r, foo, muxr, muxout, out;
-
-  % I/O ports
-  newCircle.a(btex \lam{a} etex) "framed(false)";
-  newCircle.d(btex \lam{d} etex) "framed(false)";
-  newCircle.out(btex \lam{out} etex) "framed(false)";
-  % Components
-  %newCircle.add(btex + etex);
-  newBox.foo(btex \lam{foo} etex);
-  newReg.r1(btex $\lam{r1}$ etex) "dx(4mm)", "dy(6mm)";
-  newReg.r2(btex $\lam{r2}$ etex) "dx(4mm)", "dy(6mm)", "reflect(true)";
-  newMux.muxr1;
-  % Reflect over the vertical axis
-  reflectObj(muxr1)((0,0), (0,1));
-  newMux.muxr2;
-  newMux.muxout;
-  rotateObj(muxout)(-90);
-
-  d.c               = foo.c + (0cm, 1.5cm); 
-  a.c               = (xpart r2.c + 2cm, ypart d.c - 0.5cm);
-  foo.c             = midpoint(muxr1.c, muxr2.c) + (0cm, 2cm);
-  muxr1.c           = r1.c + (0cm, 2cm);
-  muxr2.c           = r2.c + (0cm, 2cm);
-  r2.c              = r1.c + (4cm, 0cm);
-  r1.c              = origin;
-  muxout.c          = midpoint(r1.c, r2.c) - (0cm, 2cm);
-  out.c             = muxout.c - (0cm, 1.5cm);
-
-%  % Draw objects and lines
-  drawObj(a, d, foo, r1, r2, muxr1, muxr2, muxout, out);
-  
-  ncline(d)(foo);
-  nccurve(foo)(muxr1) "angleA(-90)", "posB(inpa)", "angleB(180)";
-  nccurve(foo)(muxr2) "angleA(-90)", "posB(inpb)", "angleB(0)";
-  nccurve(muxr1)(r1) "posA(out)", "angleA(180)", "posB(d)", "angleB(0)";
-  nccurve(r1)(muxr1) "posA(out)", "angleA(0)", "posB(inpb)", "angleB(180)";
-  nccurve(muxr2)(r2) "posA(out)", "angleA(0)", "posB(d)", "angleB(180)";
-  nccurve(r2)(muxr2) "posA(out)", "angleA(180)", "posB(inpa)", "angleB(0)";
-  nccurve(r1)(muxout) "posA(out)", "angleA(0)", "posB(inpb)", "angleB(-90)";
-  nccurve(r2)(muxout) "posA(out)", "angleA(180)", "posB(inpa)", "angleB(-90)";
-  % Connect port a
-  nccurve(a)(muxout) "angleA(-90)", "angleB(180)", "posB(sel)";
-  nccurve(a)(muxr1) "angleA(180)", "angleB(-90)", "posB(sel)";
-  nccurve(a)(muxr2) "angleA(180)", "angleB(-90)", "posB(sel)";
-  ncline(muxout)(out) "posA(out)";
-\stopuseMPgraphic
-
-\placeexample[here][ex:NormalComplete]{Simple architecture consisting of an adder and a
-subtractor.}
-  \startcombination[2*1]
-    {\typebufferlam{NormalComplete}}{Core description in normal form.}
-    {\boxedgraphic{NormalComplete}}{The architecture described by the normal form.}
-  \stopcombination
-
-\subsection{Intended normal form definition}
-Now we have some intuition for the normal form, we can describe how we want
-the normal form to look like in a slightly more formal manner. The following
-EBNF-like description completely captures the intended structure (and
-generates a subset of GHC's core format).
-
-Some clauses have an expression listed in parentheses. These are conditions
-that need to apply to the clause.
-
-\startlambda
-\italic{normal} = \italic{lambda}
-\italic{lambda} = λvar.\italic{lambda} (representable(var))
-                | \italic{toplet} 
-\italic{toplet} = let \italic{binding} in \italic{toplet} 
-                | letrec [\italic{binding}] in \italic{toplet}
-                | var (representable(varvar))
-\italic{binding} = var = \italic{rhs} (representable(rhs))
-                 -- State packing and unpacking by coercion
-                 | var0 = var1 :: State ty (lvar(var1))
-                 | var0 = var1 :: ty (var0 :: State ty) (lvar(var1))
-\italic{rhs} = userapp
-             | builtinapp
-             -- Extractor case
-             | case var of C a0 ... an -> ai (lvar(var))
-             -- Selector case
-             | case var of (lvar(var))
-                DEFAULT -> var0 (lvar(var0))
-                C w0 ... wn -> resvar (\forall{}i, wi \neq resvar, lvar(resvar))
-\italic{userapp} = \italic{userfunc}
-                 | \italic{userapp} {userarg}
-\italic{userfunc} = var (gvar(var))
-\italic{userarg} = var (lvar(var))
-\italic{builtinapp} = \italic{builtinfunc}
-                    | \italic{builtinapp} \italic{builtinarg}
-\italic{builtinfunc} = var (bvar(var))
-\italic{builtinarg} = \italic{coreexpr}
-\stoplambda
-
--- TODO: Limit builtinarg further
-
--- TODO: There can still be other casts around (which the code can handle,
-e.g., ignore), which still need to be documented here.
-
--- TODO: Note about the selector case. It just supports Bit and Bool
-currently, perhaps it should be generalized in the normal form?
-
-When looking at such a program from a hardware perspective, the top level
-lambda's define the input ports. The value produced by the let expression is
-the output port. Most function applications bound by the let expression
-define a component instantiation, where the input and output ports are mapped
-to local signals or arguments. Some of the others use a builtin
-construction (\eg the \lam{case} statement) or call a builtin function
-(\eg \lam{add} or \lam{sub}). For these, a hardcoded \small{VHDL} translation is
-available.
-
-\section{Transformation notation}
-To be able to concisely present transformations, we use a specific format to
-them. It is a simple format, similar to one used in logic reasoning.
-
-Such a transformation description looks like the following.
-
-\starttrans
-<context conditions>
-~
-<original expression>
---------------------------          <expression conditions>
-<transformed expresssion>
-~
-<context additions>
-\stoptrans
-
-This format desribes a transformation that applies to \lam{original
-expresssion} and transforms it into \lam{transformed expression}, assuming
-that all conditions apply. In this format, there are a number of placeholders
-in pointy brackets, most of which should be rather obvious in their meaning.
-Nevertheless, we will more precisely specify their meaning below:
-
-  \startdesc{<original expression>} The expression pattern that will be matched
-  against (subexpressions of) the expression to be transformed. We call this a
-  pattern, because it can contain \emph{placeholders} (variables), which match
-  any expression or binder. Any such placeholder is said to be \emph{bound} to
-  the expression it matches. It is convention to use an uppercase latter (\eg
-  \lam{M} or \lam{E} to refer to any expression (including a simple variable
-  reference) and lowercase letters (\eg \lam{v} or \lam{b}) to refer to
-  (references to) binders.
-
-  For example, the pattern \lam{a + B} will match the expression 
-  \lam{v + (2 * w)} (and bind \lam{a} to \lam{v} and \lam{B} to 
-  \lam{(2 * 2)}), but not \lam{v + (2 * w)}.
-  \stopdesc
-
-  \startdesc{<expression conditions>}
-  These are extra conditions on the expression that is matched. These
-  conditions can be used to further limit the cases in which the
-  transformation applies, in particular to prevent a transformation from
-  causing a loop with itself or another transformation.
-
-  Only if these if these conditions are \emph{all} true, this transformation
-  applies.
-  \stopdesc
-
-  \startdesc{<context conditions>}
-  These are a number of extra conditions on the context of the function. In
-  particular, these conditions can require some other top level function to be
-  present, whose value matches the pattern given here. The format of each of
-  these conditions is: \lam{binder = <pattern>}.
-
-  Typically, the binder is some placeholder bound in the \lam{<original
-  expression>}, while the pattern contains some placeholders that are used in
-  the \lam{transformed expression}.
-  
-  Only if a top level binder exists that matches each binder and pattern, this
-  transformation applies.
-  \stopdesc
-
-  \startdesc{<transformed expression>}
-  This is the expression template that is the result of the transformation. If, looking
-  at the above three items, the transformation applies, the \lam{original
-  expression} is completely replaced with the \lam{<transformed expression>}.
-  We call this a template, because it can contain placeholders, referring to
-  any placeholder bound by the \lam{<original expression>} or the
-  \lam{<context conditions>}. The resulting expression will have those
-  placeholders replaced by the values bound to them.
-
-  Any binder (lowercase) placeholder that has no value bound to it yet will be
-  bound to (and replaced with) a fresh binder.
-  \stopdesc
-
-  \startdesc{<context additions>}
-  These are templates for new functions to add to the context. This is a way
-  to have a transformation create new top level functiosn.
-
-  Each addition has the form \lam{binder = template}. As above, any
-  placeholder in the addition is replaced with the value bound to it, and any
-  binder placeholder that has no value bound to it yet will be bound to (and
-  replaced with) a fresh binder.
-  \stopdesc
-
-  As an example, we'll look at η-abstraction:
-
-\starttrans
-E                 \lam{E :: a -> b}
---------------    \lam{E} does not occur on a function position in an application
-λx.E x            \lam{E} is not a lambda abstraction.
-\stoptrans
-
-  Consider the following function, which is a fairly obvious way to specify a
-  simple ALU (Note \at{example}[ex:AddSubAlu] is the normal form of this
-  function):
-
-\startlambda 
-alu :: Bit -> Word -> Word -> Word
-alu = λopcode. case opcode of
-  Low -> (+)
-  High -> (-)
-\stoplambda
-
-  There are a few subexpressions in this function to which we could possibly
-  apply the transformation. Since the pattern of the transformation is only
-  the placeholder \lam{E}, any expression will match that. Whether the
-  transformation applies to an expression is thus solely decided by the
-  conditions to the right of the transformation.
-
-  We will look at each expression in the function in a top down manner. The
-  first expression is the entire expression the function is bound to.
-
-\startlambda
-λopcode. case opcode of
-  Low -> (+)
-  High -> (-)
-\stoplambda
-
-  As said, the expression pattern matches this. The type of this expression is
-  \lam{Bit -> Word -> Word -> Word}, which matches \lam{a -> b} (Note that in
-  this case \lam{a = Bit} and \lam{b = Word -> Word -> Word}).
-
-  Since this expression is at top level, it does not occur at a function
-  position of an application. However, The expression is a lambda abstraction,
-  so this transformation does not apply.
-
-  The next expression we could apply this transformation to, is the body of
-  the lambda abstraction:
-
-\startlambda
-case opcode of
-  Low -> (+)
-  High -> (-)
-\stoplambda
-
-  The type of this expression is \lam{Word -> Word -> Word}, which again
-  matches \lam{a -> b}. The expression is the body of a lambda expression, so
-  it does not occur at a function position of an application. Finally, the
-  expression is not a lambda abstraction but a case expression, so all the
-  conditions match. There are no context conditions to match, so the
-  transformation applies.
-
-  By now, the placeholder \lam{E} is bound to the entire expression. The
-  placeholder \lam{x}, which occurs in the replacement template, is not bound
-  yet, so we need to generate a fresh binder for that. Let's use the binder
-  \lam{a}. This results in the following replacement expression:
-
-\startlambda
-λa.(case opcode of
-  Low -> (+)
-  High -> (-)) a
-\stoplambda
-
-  Continuing with this expression, we see that the transformation does not
-  apply again (it is a lambda expression). Next we look at the body of this
-  labmda abstraction:
-
-\startlambda
-(case opcode of
-  Low -> (+)
-  High -> (-)) a
-\stoplambda
-  
-  Here, the transformation does apply, binding \lam{E} to the entire
-  expression and \lam{x} to the fresh binder \lam{b}, resulting in the
-  replacement:
-
-\startlambda
-λb.(case opcode of
-  Low -> (+)
-  High -> (-)) a b
-\stoplambda
-
-  Again, the transformation does not apply to this lambda abstraction, so we
-  look at its body. For brevity, we'll put the case statement on one line from
-  now on.
-
-\startlambda
-(case opcode of Low -> (+); High -> (-)) a b
-\stoplambda
-
-  The type of this expression is \lam{Word}, so it does not match \lam{a -> b}
-  and the transformation does not apply. Next, we have two options for the
-  next expression to look at: The function position and argument position of
-  the application. The expression in the argument position is \lam{b}, which
-  has type \lam{Word}, so the transformation does not apply. The expression in
-  the function position is:
-
-\startlambda
-(case opcode of Low -> (+); High -> (-)) a
-\stoplambda
-
-  Obviously, the transformation does not apply here, since it occurs in
-  function position. In the same way the transformation does not apply to both
-  components of this expression (\lam{case opcode of Low -> (+); High -> (-)}
-  and \lam{a}), so we'll skip to the components of the case expression: The
-  scrutinee and both alternatives. Since the opcode is not a function, it does
-  not apply here, and we'll leave both alternatives as an exercise to the
-  reader. The final function, after all these transformations becomes:
-
-\startlambda 
-alu :: Bit -> Word -> Word -> Word
-alu = λopcode.λa.b. (case opcode of
-  Low -> λa1.λb1 (+) a1 b1
-  High -> λa2.λb2 (-) a2 b2) a b
-\stoplambda
-
-  In this case, the transformation does not apply anymore, though this might
-  not always be the case (e.g., the application of a transformation on a
-  subexpression might open up possibilities to apply the transformation
-  further up in the expression).
-
-\subsection{Transformation application}
-In this chapter we define a number of transformations, but how will we apply
-these? As stated before, our normal form is reached as soon as no
-transformation applies anymore. This means our application strategy is to
-simply apply any transformation that applies, and continuing to do that with
-the result of each transformation.
-
-In particular, we define no particular order of transformations. Since
-transformation order should not influence the resulting normal form (see TODO
-ref), this leaves the implementation free to choose any application order that
-results in an efficient implementation.
-
-When applying a single transformation, we try to apply it to every (sub)expression
-in a function, not just the top level function. This allows us to keep the
-transformation descriptions concise and powerful.
-
-\subsection{Definitions}
-In the following sections, we will be using a number of functions and
-notations, which we will define here.
-
-\subsubsection{Other concepts}
-A \emph{global variable} is any variable that is bound at the
-top level of a program, or an external module. A \emph{local variable} is any
-other variable (\eg, variables local to a function, which can be bound by
-lambda abstractions, let expressions and pattern matches of case
-alternatives).  Note that this is a slightly different notion of global versus
-local than what \small{GHC} uses internally.
-\defref{global variable} \defref{local variable}
-
-A \emph{hardware representable} (or just \emph{representable}) type or value
-is (a value of) a type that we can generate a signal for in hardware. For
-example, a bit, a vector of bits, a 32 bit unsigned word, etc. Types that are
-not runtime representable notably include (but are not limited to): Types,
-dictionaries, functions.
-\defref{representable}
-
-A \emph{builtin function} is a function supplied by the Cλash framework, whose
-implementation is not valid Cλash. The implementation is of course valid
-Haskell, for simulation, but it is not expressable in Cλash.
-\defref{builtin function} \defref{user-defined function}
-
-For these functions, Cλash has a \emph{builtin hardware translation}, so calls
-to these functions can still be translated. These are functions like
-\lam{map}, \lam{hwor} and \lam{length}.
-
-A \emph{user-defined} function is a function for which we do have a Cλash
-implementation available.
-
-\subsubsection{Functions}
-Here, we define a number of functions that can be used below to concisely
-specify conditions.
-
-\refdef{global variable}\emph{gvar(expr)} is true when \emph{expr} is a variable that references a
-global variable. It is false when it references a local variable.
-
-\refdef{local variable}\emph{lvar(expr)} is the complement of \emph{gvar}; it is true when \emph{expr}
-references a local variable, false when it references a global variable.
-
-\refdef{representable}\emph{representable(expr)} or \emph{representable(var)} is true when
-\emph{expr} or \emph{var} is \emph{representable}.
-
-\subsection{Binder uniqueness}
-A common problem in transformation systems, is binder uniqueness. When not
-considering this problem, it is easy to create transformations that mix up
-bindings and cause name collisions. Take for example, the following core
-expression:
-
-\startlambda
-(λa.λb.λc. a * b * c) x c
-\stoplambda
-
-By applying β-reduction (see below) once, we can simplify this expression to:
-
-\startlambda
-(λb.λc. x * b * c) c
-\stoplambda
-
-Now, we have replaced the \lam{a} binder with a reference to the \lam{x}
-binder. No harm done here. But note that we see multiple occurences of the
-\lam{c} binder. The first is a binding occurence, to which the second refers.
-The last, however refers to \emph{another} instance of \lam{c}, which is
-bound somewhere outside of this expression. Now, if we would apply beta
-reduction without taking heed of binder uniqueness, we would get:
-
-\startlambda
-λc. x * c * c
-\stoplambda
-
-This is obviously not what was supposed to happen! The root of this problem is
-the reuse of binders: Identical binders can be bound in different scopes, such
-that only the inner one is \quote{visible} in the inner expression. In the example
-above, the \lam{c} binder was bound outside of the expression and in the inner
-lambda expression. Inside that lambda expression, only the inner \lam{c} is
-visible.
-
-There are a number of ways to solve this. \small{GHC} has isolated this
-problem to their binder substitution code, which performs \emph{deshadowing}
-during its expression traversal. This means that any binding that shadows
-another binding on a higher level is replaced by a new binder that does not
-shadow any other binding. This non-shadowing invariant is enough to prevent
-binder uniqueness problems in \small{GHC}.
-
-In our transformation system, maintaining this non-shadowing invariant is
-a bit harder to do (mostly due to implementation issues, the prototype doesn't
-use \small{GHC}'s subsitution code). Also, we can observe the following
-points.
-
-\startitemize
-\item Deshadowing does not guarantee overall uniqueness. For example, the
-following (slightly contrived) expression shows the identifier \lam{x} bound in
-two seperate places (and to different values), even though no shadowing
-occurs.
-
-\startlambda
-(let x = 1 in x) + (let x = 2 in x)
-\stoplambda
-
-\item In our normal form (and the resulting \small{VHDL}), all binders
-(signals) will end up in the same scope. To allow this, all binders within the
-same function should be unique.
-
-\item When we know that all binders in an expression are unique, moving around
-or removing a subexpression will never cause any binder conflicts. If we have
-some way to generate fresh binders, introducing new subexpressions will not
-cause any problems either. The only way to cause conflicts is thus to
-duplicate an existing subexpression.
-\stopitemize
-
-Given the above, our prototype maintains a unique binder invariant. This
-meanst that in any given moment during normalization, all binders \emph{within
-a single function} must be unique. To achieve this, we apply the following
-technique.
-
-TODO: Define fresh binders and unique supplies
-
-\startitemize
-\item Before starting normalization, all binders in the function are made
-unique. This is done by generating a fresh binder for every binder used. This
-also replaces binders that did not pose any conflict, but it does ensure that
-all binders within the function are generated by the same unique supply. See
-(TODO: ref fresh binder).
-\item Whenever a new binder must be generated, we generate a fresh binder that
-is guaranteed to be different from \emph{all binders generated so far}. This
-can thus never introduce duplication and will maintain the invariant.
-\item Whenever (part of) an expression is duplicated (for example when
-inlining), all binders in the expression are replaced with fresh binders
-(using the same method as at the start of normalization). These fresh binders
-can never introduce duplication, so this will maintain the invariant.
-\item Whenever we move part of an expression around within the function, there
-is no need to do anything special. There is obviously no way to introduce
-duplication by moving expressions around. Since we know that each of the
-binders is already unique, there is no way to introduce (incorrect) shadowing
-either.
-\stopitemize
-
-\section{Transform passes}
-In this section we describe the actual transforms. Here we're using
-the core language in a notation that resembles lambda calculus.
-
-Each of these transforms is meant to be applied to every (sub)expression
-in a program, for as long as it applies. Only when none of the
-transformations can be applied anymore, the program is in normal form (by
-definition). We hope to be able to prove that this form will obey all of the
-constraints defined above, but this has yet to happen (though it seems likely
-that it will).
-
-Each of the transforms will be described informally first, explaining
-the need for and goal of the transform. Then, a formal definition is
-given, using a familiar syntax from the world of logic. Each transform
-is specified as a number of conditions (above the horizontal line) and a
-number of conclusions (below the horizontal line). The details of using
-this notation are still a bit fuzzy, so comments are welcom.
-
-  \subsection{General cleanup}
-
-    \subsubsection{β-reduction}
-      β-reduction is a well known transformation from lambda calculus, where it is
-      the main reduction step. It reduces applications of labmda abstractions,
-      removing both the lambda abstraction and the application.
-
-      In our transformation system, this step helps to remove unwanted lambda
-      abstractions (basically all but the ones at the top level). Other
-      transformations (application propagation, non-representable inlining) make
-      sure that most lambda abstractions will eventually be reducable by
-      β-reduction.
-
-      TODO: Define substitution syntax
-
-      \starttrans
-      (λx.E) M
-      -----------------
-      E[M/x]
-      \stoptrans
-
-      % And an example
-      \startbuffer[from]
-      (λa. 2 * a) (2 * b)
-      \stopbuffer
-
-      \startbuffer[to]
-      2 * (2 * b)
-      \stopbuffer
-
-      \transexample{β-reduction}{from}{to}
-
-    \subsubsection{Application propagation}
-      This transformation is meant to propagate application expressions downwards
-      into expressions as far as possible. This allows partial applications inside
-      expressions to become fully applied and exposes new transformation
-      possibilities for other transformations (like β-reduction).
-
-      \starttrans
-      let binds in E) M
-      -----------------
-      let binds in E M
-      \stoptrans
-
-      % And an example
-      \startbuffer[from]
-      ( let 
-          val = 1
-        in 
-          add val
-      ) 3
-      \stopbuffer
-
-      \startbuffer[to]
-      let 
-        val = 1
-      in 
-        add val 3
-      \stopbuffer
-
-      \transexample{Application propagation for a let expression}{from}{to}
-
-      \starttrans
-      (case x of
-        p1 -> E1
-        \vdots
-        pn -> En) M
-      -----------------
-      case x of
-        p1 -> E1 M
-        \vdots
-        pn -> En M
-      \stoptrans
-
-      % And an example
-      \startbuffer[from]
-      ( case x of 
-          True -> id
-          False -> neg
-      ) 1
-      \stopbuffer
-
-      \startbuffer[to]
-      case x of 
-        True -> id 1
-        False -> neg 1
-      \stopbuffer
-
-      \transexample{Application propagation for a case expression}{from}{to}
-
-    \subsubsection{Empty let removal}
-      This transformation is simple: It removes recursive lets that have no bindings
-      (which usually occurs when let derecursification removes the last binding from
-      it).
-
-      \starttrans
-      letrec in M
-      --------------
-      M
-      \stoptrans
-
-      \subsubsection{Simple let binding removal}
-      This transformation inlines simple let bindings (\eg a = b).
-
-      This transformation is not needed to get into normal form, but makes the
-      resulting \small{VHDL} a lot shorter.
-
-      \starttrans
-      letnonrec
-        a = b
-      in
-        M
-      -----------------
-      M[b/a]
-      \stoptrans
-
-      \starttrans
-      letrec
-        \vdots
-        a = b
-        \vdots
-      in
-        M
-      -----------------
-      let
-        \vdots [b/a]
-        \vdots [b/a]
-      in
-        M[b/a]
-      \stoptrans
-
-    \subsubsection{Unused let binding removal}
-      This transformation removes let bindings that are never used. Usually,
-      the desugarer introduces some unused let bindings.
-
-      This normalization pass should really be unneeded to get into normal form
-      (since ununsed bindings are not forbidden by the normal form), but in practice
-      the desugarer or simplifier emits some unused bindings that cannot be
-      normalized (e.g., calls to a \type{PatError} (TODO: Check this name)). Also,
-      this transformation makes the resulting \small{VHDL} a lot shorter.
-
-      \starttrans
-      let a = E in M
-      ----------------------------    \lam{a} does not occur free in \lam{M}
-      M
-      \stoptrans
-
-      \starttrans
-      letrec
-        \vdots
-        a = E
-        \vdots
-      in
-        M
-      ----------------------------    \lam{a} does not occur free in \lam{M}
-      letrec
-        \vdots
-        \vdots
-      in
-        M
-      \stoptrans
-
-    \subsubsection{Cast propagation / simplification}
-      This transform pushes casts down into the expression as far as possible.
-      Since its exact role and need is not clear yet, this transformation is
-      not yet specified.
-
-    \subsubsection{Compiler generated top level binding inlining}
-      TODO
-
-  \section{Program structure}
-
-    \subsubsection{η-abstraction}
-      This transformation makes sure that all arguments of a function-typed
-      expression are named, by introducing lambda expressions. When combined with
-      β-reduction and function inlining below, all function-typed expressions should
-      be lambda abstractions or global identifiers.
-
-      \starttrans
-      E                 \lam{E :: a -> b}
-      --------------    \lam{E} is not the first argument of an application.
-      λx.E x            \lam{E} is not a lambda abstraction.
-                        \lam{x} is a variable that does not occur free in \lam{E}.
-      \stoptrans
-
-      \startbuffer[from]
-      foo = λa.case a of 
-        True -> λb.mul b b
-        False -> id
-      \stopbuffer
-
-      \startbuffer[to]
-      foo = λa.λx.(case a of 
-          True -> λb.mul b b
-          False -> λy.id y) x
-      \stopbuffer
-
-      \transexample{η-abstraction}{from}{to}
-
-    \subsubsection{Let derecursification}
-      This transformation is meant to make lets non-recursive whenever possible.
-      This might allow other optimizations to do their work better. TODO: Why is
-      this needed exactly?
-
-    \subsubsection{Let flattening}
-      This transformation puts nested lets in the same scope, by lifting the
-      binding(s) of the inner let into a new let around the outer let. Eventually,
-      this will cause all let bindings to appear in the same scope (they will all be
-      in scope for the function return value).
-
-      Note that this transformation does not try to be smart when faced with
-      recursive lets, it will just leave the lets recursive (possibly joining a
-      recursive and non-recursive let into a single recursive let). The let
-      rederecursification transformation will do this instead.
-
-      \starttrans
-      letnonrec x = (let bindings in M) in N
-      ------------------------------------------
-      let bindings in (letnonrec x = M) in N
-      \stoptrans
-
-      \starttrans
-      letrec 
-        \vdots
-        x = (let bindings in M)
-        \vdots
-      in
-        N
-      ------------------------------------------
-      letrec
-        \vdots
-        bindings
-        x = M
-        \vdots
-      in
-        N
-      \stoptrans
  
-      \startbuffer[from]
-      let
-        a = letrec
-          x = 1
-          y = 2
-        in
-          x + y
-      in
-        letrec
-          b = let c = 3 in a + c
-          d = 4
+  The first step in the core to \small{VHDL} translation process, is normalization. We
+  aim to bring the core description into a simpler form, which we can
+  subsequently translate into \small{VHDL} easily. This normal form is needed because
+  the full core language is more expressive than \small{VHDL} in some areas and because
+  core can describe expressions that do not have a direct hardware
+  interpretation.
+
+  TODO: Describe core properties not supported in \small{VHDL}, and describe how the
+  \small{VHDL} we want to generate should look like.
+
+  \section{Normal form}
+    The transformations described here have a well-defined goal: To bring the
+    program in a well-defined form that is directly translatable to hardware,
+    while fully preserving the semantics of the program. We refer to this form as
+    the \emph{normal form} of the program. The formal definition of this normal
+    form is quite simple:
+
+    \placedefinition{}{A program is in \emph{normal form} if none of the
+    transformations from this chapter apply.}
+
+    Of course, this is an \quote{easy} definition of the normal form, since our
+    program will end up in normal form automatically. The more interesting part is
+    to see if this normal form actually has the properties we would like it to
+    have.
+
+    But, before getting into more definitions and details about this normal form,
+    let's try to get a feeling for it first. The easiest way to do this is by
+    describing the things we want to not have in a normal form.
+
+    \startitemize
+      \item Any \emph{polymorphism} must be removed. When laying down hardware, we
+      can't generate any signals that can have multiple types. All types must be
+      completely known to generate hardware.
+      
+      \item Any \emph{higher order} constructions must be removed. We can't
+      generate a hardware signal that contains a function, so all values,
+      arguments and returns values used must be first order.
+
+      \item Any complex \emph{nested scopes} must be removed. In the \small{VHDL}
+      description, every signal is in a single scope. Also, full expressions are
+      not supported everywhere (in particular port maps can only map signal names,
+      not expressions). To make the \small{VHDL} generation easy, all values must be bound
+      on the \quote{top level}.
+    \stopitemize
+
+    TODO: Intermezzo: functions vs plain values
+
+    A very simple example of a program in normal form is given in
+    \in{example}[ex:MulSum]. As you can see, all arguments to the function (which
+    will become input ports in the final hardware) are at the top. This means that
+    the body of the final lambda abstraction is never a function, but always a
+    plain value.
+
+    After the lambda abstractions, we see a single let expression, that binds two
+    variables (\lam{mul} and \lam{sum}). These variables will be signals in the
+    final hardware, bound to the output port of the \lam{*} and \lam{+}
+    components.
+
+    The final line (the \quote{return value} of the function) selects the
+    \lam{sum} signal to be the output port of the function. This \quote{return
+    value} can always only be a variable reference, never a more complex
+    expression.
+
+    \startbuffer[MulSum]
+    alu :: Bit -> Word -> Word -> Word
+    alu = λa.λb.λc.
+        let
+          mul = (*) a b
+          sum = (+) mul c
          in
-          d + b
-      \stopbuffer
-      \startbuffer[to]
-      letrec
-        x = 1
-        y = 2
-      in
+          sum
+    \stopbuffer
+
+    \startuseMPgraphic{MulSum}
+      save a, b, c, mul, add, sum;
+
+      % I/O ports
+      newCircle.a(btex $a$ etex) "framed(false)";
+      newCircle.b(btex $b$ etex) "framed(false)";
+      newCircle.c(btex $c$ etex) "framed(false)";
+      newCircle.sum(btex $res$ etex) "framed(false)";
+
+      % Components
+      newCircle.mul(btex - etex);
+      newCircle.add(btex + etex);
+
+      a.c      - b.c   = (0cm, 2cm);
+      b.c      - c.c   = (0cm, 2cm);
+      add.c            = c.c + (2cm, 0cm);
+      mul.c            = midpoint(a.c, b.c) + (2cm, 0cm);
+      sum.c            = add.c + (2cm, 0cm);
+      c.c              = origin;
+
+      % Draw objects and lines
+      drawObj(a, b, c, mul, add, sum);
+
+      ncarc(a)(mul) "arcangle(15)";
+      ncarc(b)(mul) "arcangle(-15)";
+      ncline(c)(add);
+      ncline(mul)(add);
+      ncline(add)(sum);
+    \stopuseMPgraphic
+
+    \placeexample[here][ex:MulSum]{Simple architecture consisting of an adder and a
+    subtractor.}
+      \startcombination[2*1]
+        {\typebufferlam{MulSum}}{Core description in normal form.}
+        {\boxedgraphic{MulSum}}{The architecture described by the normal form.}
+      \stopcombination
+
+    The previous example described composing an architecture by calling other
+    functions (operators), resulting in a simple architecture with component and
+    connection. There is of course also some mechanism for choice in the normal
+    form. In a normal Core program, the \emph{case} expression can be used in a
+    few different ways to describe choice. In normal form, this is limited to a
+    very specific form.
+
+    \in{Example}[ex:AddSubAlu] shows an example describing a
+    simple \small{ALU}, which chooses between two operations based on an opcode
+    bit. The main structure is the same as in \in{example}[ex:MulSum], but this
+    time the \lam{res} variable is bound to a case expression. This case
+    expression scrutinizes the variable \lam{opcode} (and scrutinizing more
+    complex expressions is not supported). The case expression can select a
+    different variable based on the constructor of \lam{opcode}.
+
+    \startbuffer[AddSubAlu]
+    alu :: Bit -> Word -> Word -> Word
+    alu = λopcode.λa.λb.
          let
-          a = x + y
+          res1 = (+) a b
+          res2 = (-) a b
+          res = case opcode of
+            Low -> res1
+            High -> res2
          in
-          letrec
-            c = 3
-            b = a + c
-            d = 4
-          in
-            d + b
-      \stopbuffer
-
-      \transexample{Let flattening}{from}{to}
-
-    \subsubsection{Return value simplification}
-      This transformation ensures that the return value of a function is always a
-      simple local variable reference.
-
-      Currently implemented using lambda simplification, let simplification, and
-      top simplification. Should change into something like the following, which
-      works only on the result of a function instead of any subexpression. This is
-      achieved by the contexts, like \lam{x = E}, though this is strictly not
-      correct (you could read this as "if there is any function \lam{x} that binds
-      \lam{E}, any \lam{E} can be transformed, while we only mean the \lam{E} that
-      is bound by \lam{x}. This might need some extra notes or something).
-
-      \starttrans
-      x = E                            \lam{E} is representable
-      ~                                \lam{E} is not a lambda abstraction
-      E                                \lam{E} is not a let expression
-      ---------------------------      \lam{E} is not a local variable reference
-      let x = E in x
-      \stoptrans
-
-      \starttrans
-      x = λv0 ... λvn.E
-      ~                                \lam{E} is representable
-      E                                \lam{E} is not a let expression
-      ---------------------------      \lam{E} is not a local variable reference
-      let x = E in x
-      \stoptrans
-
-      \starttrans
-      x = λv0 ... λvn.let ... in E
-      ~                                \lam{E} is representable
-      E                                \lam{E} is not a local variable reference
-      ---------------------------
-      let x = E in x
-      \stoptrans
-
-      \startbuffer[from]
-      x = add 1 2
-      \stopbuffer
-
-      \startbuffer[to]
-      x = let x = add 1 2 in x
-      \stopbuffer
-
-      \transexample{Return value simplification}{from}{to}
-
-  \subsection{Argument simplification}
-    The transforms in this section deal with simplifying application
-    arguments into normal form. The goal here is to:
+          res
+    \stopbuffer
+
+    \startuseMPgraphic{AddSubAlu}
+      save opcode, a, b, add, sub, mux, res;
+
+      % I/O ports
+      newCircle.opcode(btex $opcode$ etex) "framed(false)";
+      newCircle.a(btex $a$ etex) "framed(false)";
+      newCircle.b(btex $b$ etex) "framed(false)";
+      newCircle.res(btex $res$ etex) "framed(false)";
+      % Components
+      newCircle.add(btex + etex);
+      newCircle.sub(btex - etex);
+      newMux.mux;
+
+      opcode.c - a.c   = (0cm, 2cm);
+      add.c    - a.c   = (4cm, 0cm);
+      sub.c    - b.c   = (4cm, 0cm);
+      a.c      - b.c   = (0cm, 3cm);
+      mux.c            = midpoint(add.c, sub.c) + (1.5cm, 0cm);
+      res.c    - mux.c = (1.5cm, 0cm);
+      b.c              = origin;
  
-    \startitemize
-     \item Make all arguments of user-defined functions (\eg, of which
-     we have a function body) simple variable references of a runtime
-     representable type. This is needed, since these applications will be turned
-     into component instantiations.
-     \item Make all arguments of builtin functions one of:
-       \startitemize
-        \item A type argument.
-        \item A dictionary argument.
-        \item A type level expression.
-        \item A variable reference of a runtime representable type.
-        \item A variable reference or partial application of a function type.
-       \stopitemize
-    \stopitemize
+      % Draw objects and lines
+      drawObj(opcode, a, b, res, add, sub, mux);
+
+      ncline(a)(add) "posA(e)";
+      ncline(b)(sub) "posA(e)";
+      nccurve(a)(sub) "posA(e)", "angleA(0)";
+      nccurve(b)(add) "posA(e)", "angleA(0)";
+      nccurve(add)(mux) "posB(inpa)", "angleB(0)";
+      nccurve(sub)(mux) "posB(inpb)", "angleB(0)";
+      nccurve(opcode)(mux) "posB(n)", "angleA(0)", "angleB(-90)";
+      ncline(mux)(res) "posA(out)";
+    \stopuseMPgraphic
  
-    When looking at the arguments of a user-defined function, we can
-    divide them into two categories:
-    \startitemize
-      \item Arguments of a runtime representable type (\eg bits or vectors).
-
-            These arguments can be preserved in the program, since they can
-            be translated to input ports later on.  However, since we can
-            only connect signals to input ports, these arguments must be
-            reduced to simple variables (for which signals will be
-            produced). This is taken care of by the argument extraction
-            transform.
-      \item Non-runtime representable typed arguments.
-            
-            These arguments cannot be preserved in the program, since we
-            cannot represent them as input or output ports in the resulting
-            \small{VHDL}. To remove them, we create a specialized version of the
-            called function with these arguments filled in. This is done by
-            the argument propagation transform.
-
-            Typically, these arguments are type and dictionary arguments that are
-            used to make functions polymorphic. By propagating these arguments, we
-            are essentially doing the same which GHC does when it specializes
-            functions: Creating multiple variants of the same function, one for
-            each type for which it is used. Other common non-representable
-            arguments are functions, e.g. when calling a higher order function
-            with another function or a lambda abstraction as an argument.
-
-            The reason for doing this is similar to the reasoning provided for
-            the inlining of non-representable let bindings above. In fact, this
-            argument propagation could be viewed as a form of cross-function
-            inlining.
-    \stopitemize
+    \placeexample[here][ex:AddSubAlu]{Simple \small{ALU} supporting two operations.}
+      \startcombination[2*1]
+        {\typebufferlam{AddSubAlu}}{Core description in normal form.}
+        {\boxedgraphic{AddSubAlu}}{The architecture described by the normal form.}
+      \stopcombination
+
+    As a more complete example, consider \in{example}[ex:NormalComplete]. This
+    example contains everything that is supported in normal form, with the
+    exception of builtin higher order functions. The graphical version of the
+    architecture contains a slightly simplified version, since the state tuple
+    packing and unpacking have been left out. Instead, two seperate registers are
+    drawn. Also note that most synthesis tools will further optimize this
+    architecture by removing the multiplexers at the register input and replace
+    them with some logic in the clock inputs, but we want to show the architecture
+    as close to the description as possible.
+
+    \startbuffer[NormalComplete]
+      regbank :: Bit 
+                 -> Word 
+                 -> State (Word, Word) 
+                 -> (State (Word, Word), Word)
+
+      -- All arguments are an inital lambda
+      regbank = λa.λd.λsp.
+      -- There are nested let expressions at top level
+      let
+        -- Unpack the state by coercion (\eg, cast from
+        -- State (Word, Word) to (Word, Word))
+        s = sp :: (Word, Word)
+        -- Extract both registers from the state
+        r1 = case s of (fst, snd) -> fst
+        r2 = case s of (fst, snd) -> snd
+        -- Calling some other user-defined function.
+        d' = foo d
+        -- Conditional connections
+        out = case a of
+          High -> r1
+          Low -> r2
+        r1' = case a of
+          High -> d'
+          Low -> r1
+        r2' = case a of
+          High -> r2
+          Low -> d'
+        -- Packing a tuple
+        s' = (,) r1' r2'
+        -- pack the state by coercion (\eg, cast from
+        -- (Word, Word) to State (Word, Word))
+        sp' = s' :: State (Word, Word)
+        -- Pack our return value
+        res = (,) sp' out
+      in
+        -- The actual result
+        res
+    \stopbuffer
+
+    \startuseMPgraphic{NormalComplete}
+      save a, d, r, foo, muxr, muxout, out;
+
+      % I/O ports
+      newCircle.a(btex \lam{a} etex) "framed(false)";
+      newCircle.d(btex \lam{d} etex) "framed(false)";
+      newCircle.out(btex \lam{out} etex) "framed(false)";
+      % Components
+      %newCircle.add(btex + etex);
+      newBox.foo(btex \lam{foo} etex);
+      newReg.r1(btex $\lam{r1}$ etex) "dx(4mm)", "dy(6mm)";
+      newReg.r2(btex $\lam{r2}$ etex) "dx(4mm)", "dy(6mm)", "reflect(true)";
+      newMux.muxr1;
+      % Reflect over the vertical axis
+      reflectObj(muxr1)((0,0), (0,1));
+      newMux.muxr2;
+      newMux.muxout;
+      rotateObj(muxout)(-90);
+
+      d.c               = foo.c + (0cm, 1.5cm); 
+      a.c               = (xpart r2.c + 2cm, ypart d.c - 0.5cm);
+      foo.c             = midpoint(muxr1.c, muxr2.c) + (0cm, 2cm);
+      muxr1.c           = r1.c + (0cm, 2cm);
+      muxr2.c           = r2.c + (0cm, 2cm);
+      r2.c              = r1.c + (4cm, 0cm);
+      r1.c              = origin;
+      muxout.c          = midpoint(r1.c, r2.c) - (0cm, 2cm);
+      out.c             = muxout.c - (0cm, 1.5cm);
+
+    %  % Draw objects and lines
+      drawObj(a, d, foo, r1, r2, muxr1, muxr2, muxout, out);
+      
+      ncline(d)(foo);
+      nccurve(foo)(muxr1) "angleA(-90)", "posB(inpa)", "angleB(180)";
+      nccurve(foo)(muxr2) "angleA(-90)", "posB(inpb)", "angleB(0)";
+      nccurve(muxr1)(r1) "posA(out)", "angleA(180)", "posB(d)", "angleB(0)";
+      nccurve(r1)(muxr1) "posA(out)", "angleA(0)", "posB(inpb)", "angleB(180)";
+      nccurve(muxr2)(r2) "posA(out)", "angleA(0)", "posB(d)", "angleB(180)";
+      nccurve(r2)(muxr2) "posA(out)", "angleA(180)", "posB(inpa)", "angleB(0)";
+      nccurve(r1)(muxout) "posA(out)", "angleA(0)", "posB(inpb)", "angleB(-90)";
+      nccurve(r2)(muxout) "posA(out)", "angleA(180)", "posB(inpa)", "angleB(-90)";
+      % Connect port a
+      nccurve(a)(muxout) "angleA(-90)", "angleB(180)", "posB(sel)";
+      nccurve(a)(muxr1) "angleA(180)", "angleB(-90)", "posB(sel)";
+      nccurve(a)(muxr2) "angleA(180)", "angleB(-90)", "posB(sel)";
+      ncline(muxout)(out) "posA(out)";
+    \stopuseMPgraphic
  
-    TODO: Check the following itemization.
+    \placeexample[here][ex:NormalComplete]{Simple architecture consisting of an adder and a
+    subtractor.}
+      \startcombination[2*1]
+        {\typebufferlam{NormalComplete}}{Core description in normal form.}
+        {\boxedgraphic{NormalComplete}}{The architecture described by the normal form.}
+      \stopcombination
  
-    When looking at the arguments of a builtin function, we can divide them
-    into categories: 
+    \subsection{Intended normal form definition}
+      Now we have some intuition for the normal form, we can describe how we want
+      the normal form to look like in a slightly more formal manner. The following
+      EBNF-like description completely captures the intended structure (and
+      generates a subset of GHC's core format).
  
-    \startitemize
-      \item Arguments of a runtime representable type.
-            
-            As we have seen with user-defined functions, these arguments can
-            always be reduced to a simple variable reference, by the
-            argument extraction transform. Performing this transform for
-            builtin functions as well, means that the translation of builtin
-            functions can be limited to signal references, instead of
-            needing to support all possible expressions.
-
-      \item Arguments of a function type.
-            
-            These arguments are functions passed to higher order builtins,
-            like \lam{map} and \lam{foldl}. Since implementing these
-            functions for arbitrary function-typed expressions (\eg, lambda
-            expressions) is rather comlex, we reduce these arguments to
-            (partial applications of) global functions.
-            
-            We can still support arbitrary expressions from the user code,
-            by creating a new global function containing that expression.
-            This way, we can simply replace the argument with a reference to
-            that new function. However, since the expression can contain any
-            number of free variables we also have to include partial
-            applications in our normal form.
-
-            This category of arguments is handled by the function extraction
-            transform.
-      \item Other unrepresentable arguments.
-            
-            These arguments can take a few different forms:
-            \startdesc{Type arguments}
-              In the core language, type arguments can only take a single
-              form: A type wrapped in the Type constructor. Also, there is
-              nothing that can be done with type expressions, except for
-              applying functions to them, so we can simply leave type
-              arguments as they are.
-            \stopdesc
-            \startdesc{Dictionary arguments}
-              In the core language, dictionary arguments are used to find
-              operations operating on one of the type arguments (mostly for
-              finding class methods). Since we will not actually evaluatie
-              the function body for builtin functions and can generate
-              code for builtin functions by just looking at the type
-              arguments, these arguments can be ignored and left as they
-              are.
-            \stopdesc
-            \startdesc{Type level arguments}
-              Sometimes, we want to pass a value to a builtin function, but
-              we need to know the value at compile time. Additionally, the
-              value has an impact on the type of the function. This is
-              encoded using type-level values, where the actual value of the
-              argument is not important, but the type encodes some integer,
-              for example. Since the value is not important, the actual form
-              of the expression does not matter either and we can leave
-              these arguments as they are.
-            \stopdesc
-            \startdesc{Other arguments}
-              Technically, there is still a wide array of arguments that can
-              be passed, but does not fall into any of the above categories.
-              However, none of the supported builtin functions requires such
-              an argument. This leaves use with passing unsupported types to
-              a function, such as calling \lam{head} on a list of functions.
-
-              In these cases, it would be impossible to generate hardware
-              for such a function call anyway, so we can ignore these
-              arguments.
-
-              The only way to generate hardware for builtin functions with
-              arguments like these, is to expand the function call into an
-              equivalent core expression (\eg, expand map into a series of
-              function applications). But for now, we choose to simply not
-              support expressions like these.
-            \stopdesc
-
-            From the above, we can conclude that we can simply ignore these
-            other unrepresentable arguments and focus on the first two
-            categories instead.
-    \stopitemize
+      Some clauses have an expression listed in parentheses. These are conditions
+      that need to apply to the clause.
  
-    \subsubsection{Argument simplification}
-      This transform deals with arguments to functions that
-      are of a runtime representable type. It ensures that they will all become
-      references to global variables, or local signals in the resulting \small{VHDL}. 
-
-      TODO: It seems we can map an expression to a port, not only a signal.
-      Perhaps this makes this transformation not needed?
-      TODO: Say something about dataconstructors (without arguments, like True
-      or False), which are variable references of a runtime representable
-      type, but do not result in a signal.
-
-      To reduce a complex expression to a simple variable reference, we create
-      a new let expression around the application, which binds the complex
-      expression to a new variable. The original function is then applied to
-      this variable.
-
-      \starttrans
-      M N
-      --------------------    \lam{N} is of a representable type
-      let x = N in M x        \lam{N} is not a local variable reference
-      \stoptrans
-
-      \startbuffer[from]
-      add (add a 1) 1
-      \stopbuffer
-
-      \startbuffer[to]
-      let x = add a 1 in add x 1
-      \stopbuffer
-
-      \transexample{Argument extraction}{from}{to}
-
-    \subsubsection{Function extraction}
-      This transform deals with function-typed arguments to builtin functions.
-      Since these arguments cannot be propagated, we choose to extract them
-      into a new global function instead.
-
-      Any free variables occuring in the extracted arguments will become
-      parameters to the new global function. The original argument is replaced
-      with a reference to the new function, applied to any free variables from
-      the original argument.
-
-      This transformation is useful when applying higher order builtin functions
-      like \hs{map} to a lambda abstraction, for example. In this case, the code
-      that generates \small{VHDL} for \hs{map} only needs to handle top level functions and
-      partial applications, not any other expression (such as lambda abstractions or
-      even more complicated expressions).
-
-      \starttrans
-      M N                     \lam{M} is a (partial aplication of a) builtin function.
-      ---------------------   \lam{f0 ... fn} = free local variables of \lam{N}
-      M x f0 ... fn           \lam{N :: a -> b}
-      ~                       \lam{N} is not a (partial application of) a top level function
-      x = λf0 ... λfn.N
-      \stoptrans
-
-      \startbuffer[from]
-      map (λa . add a b) xs
-
-      map (add b) ys
-      \stopbuffer
-
-      \startbuffer[to]
-      x0 = λb.λa.add a b
-      ~
-      map x0 xs
-
-      x1 = λb.add b
-      map x1 ys
-      \stopbuffer
-
-      \transexample{Function extraction}{from}{to}
-
-    \subsubsection{Argument propagation}
-      This transform deals with arguments to user-defined functions that are
-      not representable at runtime. This means these arguments cannot be
-      preserved in the final form and most be {\em propagated}.
-
-      Propagation means to create a specialized version of the called
-      function, with the propagated argument already filled in. As a simple
-      example, in the following program:
+      \startlambda
+      \italic{normal} = \italic{lambda}
+      \italic{lambda} = λvar.\italic{lambda} (representable(var))
+                      | \italic{toplet} 
+      \italic{toplet} = let \italic{binding} in \italic{toplet} 
+                      | letrec [\italic{binding}] in \italic{toplet}
+                      | var (representable(varvar))
+      \italic{binding} = var = \italic{rhs} (representable(rhs))
+                       -- State packing and unpacking by coercion
+                       | var0 = var1 :: State ty (lvar(var1))
+                       | var0 = var1 :: ty (var0 :: State ty) (lvar(var1))
+      \italic{rhs} = userapp
+                   | builtinapp
+                   -- Extractor case
+                   | case var of C a0 ... an -> ai (lvar(var))
+                   -- Selector case
+                   | case var of (lvar(var))
+                      DEFAULT -> var0 (lvar(var0))
+                      C w0 ... wn -> resvar (\forall{}i, wi \neq resvar, lvar(resvar))
+      \italic{userapp} = \italic{userfunc}
+                       | \italic{userapp} {userarg}
+      \italic{userfunc} = var (gvar(var))
+      \italic{userarg} = var (lvar(var))
+      \italic{builtinapp} = \italic{builtinfunc}
+                          | \italic{builtinapp} \italic{builtinarg}
+      \italic{builtinfunc} = var (bvar(var))
+      \italic{builtinarg} = \italic{coreexpr}
+      \stoplambda
+
+      -- TODO: Limit builtinarg further
+
+      -- TODO: There can still be other casts around (which the code can handle,
+      e.g., ignore), which still need to be documented here.
+
+      -- TODO: Note about the selector case. It just supports Bit and Bool
+      currently, perhaps it should be generalized in the normal form?
+
+      When looking at such a program from a hardware perspective, the top level
+      lambda's define the input ports. The value produced by the let expression is
+      the output port. Most function applications bound by the let expression
+      define a component instantiation, where the input and output ports are mapped
+      to local signals or arguments. Some of the others use a builtin
+      construction (\eg the \lam{case} statement) or call a builtin function
+      (\eg \lam{add} or \lam{sub}). For these, a hardcoded \small{VHDL} translation is
+      available.
+
+  \section{Transformation notation}
+    To be able to concisely present transformations, we use a specific format to
+    them. It is a simple format, similar to one used in logic reasoning.
+
+    Such a transformation description looks like the following.
+
+    \starttrans
+    <context conditions>
+    ~
+    <original expression>
+    --------------------------          <expression conditions>
+    <transformed expresssion>
+    ~
+    <context additions>
+    \stoptrans
+
+    This format desribes a transformation that applies to \lam{original
+    expresssion} and transforms it into \lam{transformed expression}, assuming
+    that all conditions apply. In this format, there are a number of placeholders
+    in pointy brackets, most of which should be rather obvious in their meaning.
+    Nevertheless, we will more precisely specify their meaning below:
+
+      \startdesc{<original expression>} The expression pattern that will be matched
+      against (subexpressions of) the expression to be transformed. We call this a
+      pattern, because it can contain \emph{placeholders} (variables), which match
+      any expression or binder. Any such placeholder is said to be \emph{bound} to
+      the expression it matches. It is convention to use an uppercase latter (\eg
+      \lam{M} or \lam{E} to refer to any expression (including a simple variable
+      reference) and lowercase letters (\eg \lam{v} or \lam{b}) to refer to
+      (references to) binders.
+
+      For example, the pattern \lam{a + B} will match the expression 
+      \lam{v + (2 * w)} (and bind \lam{a} to \lam{v} and \lam{B} to 
+      \lam{(2 * 2)}), but not \lam{v + (2 * w)}.
+      \stopdesc
+
+      \startdesc{<expression conditions>}
+      These are extra conditions on the expression that is matched. These
+      conditions can be used to further limit the cases in which the
+      transformation applies, in particular to prevent a transformation from
+      causing a loop with itself or another transformation.
+
+      Only if these if these conditions are \emph{all} true, this transformation
+      applies.
+      \stopdesc
+
+      \startdesc{<context conditions>}
+      These are a number of extra conditions on the context of the function. In
+      particular, these conditions can require some other top level function to be
+      present, whose value matches the pattern given here. The format of each of
+      these conditions is: \lam{binder = <pattern>}.
+
+      Typically, the binder is some placeholder bound in the \lam{<original
+      expression>}, while the pattern contains some placeholders that are used in
+      the \lam{transformed expression}.
+      
+      Only if a top level binder exists that matches each binder and pattern, this
+      transformation applies.
+      \stopdesc
+
+      \startdesc{<transformed expression>}
+      This is the expression template that is the result of the transformation. If, looking
+      at the above three items, the transformation applies, the \lam{original
+      expression} is completely replaced with the \lam{<transformed expression>}.
+      We call this a template, because it can contain placeholders, referring to
+      any placeholder bound by the \lam{<original expression>} or the
+      \lam{<context conditions>}. The resulting expression will have those
+      placeholders replaced by the values bound to them.
+
+      Any binder (lowercase) placeholder that has no value bound to it yet will be
+      bound to (and replaced with) a fresh binder.
+      \stopdesc
+
+      \startdesc{<context additions>}
+      These are templates for new functions to add to the context. This is a way
+      to have a transformation create new top level functiosn.
+
+      Each addition has the form \lam{binder = template}. As above, any
+      placeholder in the addition is replaced with the value bound to it, and any
+      binder placeholder that has no value bound to it yet will be bound to (and
+      replaced with) a fresh binder.
+      \stopdesc
+
+    As an example, we'll look at η-abstraction:
+
+    \starttrans
+    E                 \lam{E :: a -> b}
+    --------------    \lam{E} does not occur on a function position in an application
+    λx.E x            \lam{E} is not a lambda abstraction.
+    \stoptrans
+
+    Consider the following function, which is a fairly obvious way to specify a
+    simple ALU (Note \at{example}[ex:AddSubAlu] is the normal form of this
+    function):
+
+    \startlambda 
+    alu :: Bit -> Word -> Word -> Word
+    alu = λopcode. case opcode of
+      Low -> (+)
+      High -> (-)
+    \stoplambda
+
+    There are a few subexpressions in this function to which we could possibly
+    apply the transformation. Since the pattern of the transformation is only
+    the placeholder \lam{E}, any expression will match that. Whether the
+    transformation applies to an expression is thus solely decided by the
+    conditions to the right of the transformation.
+
+    We will look at each expression in the function in a top down manner. The
+    first expression is the entire expression the function is bound to.
+
+    \startlambda
+    λopcode. case opcode of
+      Low -> (+)
+      High -> (-)
+    \stoplambda
+
+    As said, the expression pattern matches this. The type of this expression is
+    \lam{Bit -> Word -> Word -> Word}, which matches \lam{a -> b} (Note that in
+    this case \lam{a = Bit} and \lam{b = Word -> Word -> Word}).
+
+    Since this expression is at top level, it does not occur at a function
+    position of an application. However, The expression is a lambda abstraction,
+    so this transformation does not apply.
+
+    The next expression we could apply this transformation to, is the body of
+    the lambda abstraction:
+
+    \startlambda
+    case opcode of
+      Low -> (+)
+      High -> (-)
+    \stoplambda
+
+    The type of this expression is \lam{Word -> Word -> Word}, which again
+    matches \lam{a -> b}. The expression is the body of a lambda expression, so
+    it does not occur at a function position of an application. Finally, the
+    expression is not a lambda abstraction but a case expression, so all the
+    conditions match. There are no context conditions to match, so the
+    transformation applies.
+
+    By now, the placeholder \lam{E} is bound to the entire expression. The
+    placeholder \lam{x}, which occurs in the replacement template, is not bound
+    yet, so we need to generate a fresh binder for that. Let's use the binder
+    \lam{a}. This results in the following replacement expression:
+
+    \startlambda
+    λa.(case opcode of
+      Low -> (+)
+      High -> (-)) a
+    \stoplambda
+
+    Continuing with this expression, we see that the transformation does not
+    apply again (it is a lambda expression). Next we look at the body of this
+    labmda abstraction:
+
+    \startlambda
+    (case opcode of
+      Low -> (+)
+      High -> (-)) a
+    \stoplambda
+    
+    Here, the transformation does apply, binding \lam{E} to the entire
+    expression and \lam{x} to the fresh binder \lam{b}, resulting in the
+    replacement:
+
+    \startlambda
+    λb.(case opcode of
+      Low -> (+)
+      High -> (-)) a b
+    \stoplambda
+
+    Again, the transformation does not apply to this lambda abstraction, so we
+    look at its body. For brevity, we'll put the case statement on one line from
+    now on.
+
+    \startlambda
+    (case opcode of Low -> (+); High -> (-)) a b
+    \stoplambda
+
+    The type of this expression is \lam{Word}, so it does not match \lam{a -> b}
+    and the transformation does not apply. Next, we have two options for the
+    next expression to look at: The function position and argument position of
+    the application. The expression in the argument position is \lam{b}, which
+    has type \lam{Word}, so the transformation does not apply. The expression in
+    the function position is:
+
+    \startlambda
+    (case opcode of Low -> (+); High -> (-)) a
+    \stoplambda
+
+    Obviously, the transformation does not apply here, since it occurs in
+    function position. In the same way the transformation does not apply to both
+    components of this expression (\lam{case opcode of Low -> (+); High -> (-)}
+    and \lam{a}), so we'll skip to the components of the case expression: The
+    scrutinee and both alternatives. Since the opcode is not a function, it does
+    not apply here, and we'll leave both alternatives as an exercise to the
+    reader. The final function, after all these transformations becomes:
+
+    \startlambda 
+    alu :: Bit -> Word -> Word -> Word
+    alu = λopcode.λa.b. (case opcode of
+      Low -> λa1.λb1 (+) a1 b1
+      High -> λa2.λb2 (-) a2 b2) a b
+    \stoplambda
+
+    In this case, the transformation does not apply anymore, though this might
+    not always be the case (e.g., the application of a transformation on a
+    subexpression might open up possibilities to apply the transformation
+    further up in the expression).
+
+    \subsection{Transformation application}
+      In this chapter we define a number of transformations, but how will we apply
+      these? As stated before, our normal form is reached as soon as no
+      transformation applies anymore. This means our application strategy is to
+      simply apply any transformation that applies, and continuing to do that with
+      the result of each transformation.
+
+      In particular, we define no particular order of transformations. Since
+      transformation order should not influence the resulting normal form (see TODO
+      ref), this leaves the implementation free to choose any application order that
+      results in an efficient implementation.
+
+      When applying a single transformation, we try to apply it to every (sub)expression
+      in a function, not just the top level function. This allows us to keep the
+      transformation descriptions concise and powerful.
+
+    \subsection{Definitions}
+      In the following sections, we will be using a number of functions and
+      notations, which we will define here.
+
+      \subsubsection{Other concepts}
+        A \emph{global variable} is any variable that is bound at the
+        top level of a program, or an external module. A \emph{local variable} is any
+        other variable (\eg, variables local to a function, which can be bound by
+        lambda abstractions, let expressions and pattern matches of case
+        alternatives).  Note that this is a slightly different notion of global versus
+        local than what \small{GHC} uses internally.
+        \defref{global variable} \defref{local variable}
+
+        A \emph{hardware representable} (or just \emph{representable}) type or value
+        is (a value of) a type that we can generate a signal for in hardware. For
+        example, a bit, a vector of bits, a 32 bit unsigned word, etc. Types that are
+        not runtime representable notably include (but are not limited to): Types,
+        dictionaries, functions.
+        \defref{representable}
+
+        A \emph{builtin function} is a function supplied by the Cλash framework, whose
+        implementation is not valid Cλash. The implementation is of course valid
+        Haskell, for simulation, but it is not expressable in Cλash.
+        \defref{builtin function} \defref{user-defined function}
+
+      For these functions, Cλash has a \emph{builtin hardware translation}, so calls
+      to these functions can still be translated. These are functions like
+      \lam{map}, \lam{hwor} and \lam{length}.
+
+      A \emph{user-defined} function is a function for which we do have a Cλash
+      implementation available.
+
+      \subsubsection{Functions}
+        Here, we define a number of functions that can be used below to concisely
+        specify conditions.
+
+        \refdef{global variable}\emph{gvar(expr)} is true when \emph{expr} is a variable that references a
+        global variable. It is false when it references a local variable.
+
+        \refdef{local variable}\emph{lvar(expr)} is the complement of \emph{gvar}; it is true when \emph{expr}
+        references a local variable, false when it references a global variable.
+
+        \refdef{representable}\emph{representable(expr)} or \emph{representable(var)} is true when
+        \emph{expr} or \emph{var} is \emph{representable}.
+
+    \subsection{Binder uniqueness}
+      A common problem in transformation systems, is binder uniqueness. When not
+      considering this problem, it is easy to create transformations that mix up
+      bindings and cause name collisions. Take for example, the following core
+      expression:
  
        \startlambda
-      f = λa.λb.a + b
-      inc = λa.f a 1
+      (λa.λb.λc. a * b * c) x c
        \stoplambda
  
-      we could {\em propagate} the constant argument 1, with the following
-      result:
+      By applying β-reduction (see below) once, we can simplify this expression to:
  
        \startlambda
-      f' = λa.a + 1
-      inc = λa.f' a
+      (λb.λc. x * b * c) c
        \stoplambda
  
-      Special care must be taken when the to-be-propagated expression has any
-      free variables. If this is the case, the original argument should not be
-      removed alltogether, but replaced by all the free variables of the
-      expression. In this way, the original expression can still be evaluated
-      inside the new function. Also, this brings us closer to our goal: All
-      these free variables will be simple variable references.
-
-      To prevent us from propagating the same argument over and over, a simple
-      local variable reference is not propagated (since is has exactly one
-      free variable, itself, we would only replace that argument with itself).
-
-      This shows that any free local variables that are not runtime representable
-      cannot be brought into normal form by this transform. We rely on an
-      inlining transformation to replace such a variable with an expression we
-      can propagate again.
-
-      \starttrans
-      x = E
-      ~
-      x Y0 ... Yi ... Yn                               \lam{Yi} is not of a runtime representable type
-      ---------------------------------------------    \lam{Yi} is not a local variable reference
-      x' y0 ... yi-1 f0 ...  fm Yi+1 ... Yn            \lam{f0 ... fm} = free local vars of \lam{Yi}
-      ~
-      x' = λy0 ... yi-1 f0 ... fm yi+1 ... yn .       
-            E y0 ... yi-1 Yi yi+1 ... yn   
-
-      \stoptrans
-
-      TODO: Example
-
-  \subsection{Case simplification}
-    \subsubsection{Scrutinee simplification}
-      This transform ensures that the scrutinee of a case expression is always
-      a simple variable reference.
-
-      \starttrans
-      case E of
-        alts
-      -----------------        \lam{E} is not a local variable reference
-      let x = E in 
-        case E of
-          alts
-      \stoptrans
+      Now, we have replaced the \lam{a} binder with a reference to the \lam{x}
+      binder. No harm done here. But note that we see multiple occurences of the
+      \lam{c} binder. The first is a binding occurence, to which the second refers.
+      The last, however refers to \emph{another} instance of \lam{c}, which is
+      bound somewhere outside of this expression. Now, if we would apply beta
+      reduction without taking heed of binder uniqueness, we would get:
  
-      \startbuffer[from]
-      case (foo a) of
-        True -> a
-        False -> b
-      \stopbuffer
+      \startlambda
+      λc. x * c * c
+      \stoplambda
  
-      \startbuffer[to]
-      let x = foo a in
+      This is obviously not what was supposed to happen! The root of this problem is
+      the reuse of binders: Identical binders can be bound in different scopes, such
+      that only the inner one is \quote{visible} in the inner expression. In the example
+      above, the \lam{c} binder was bound outside of the expression and in the inner
+      lambda expression. Inside that lambda expression, only the inner \lam{c} is
+      visible.
+
+      There are a number of ways to solve this. \small{GHC} has isolated this
+      problem to their binder substitution code, which performs \emph{deshadowing}
+      during its expression traversal. This means that any binding that shadows
+      another binding on a higher level is replaced by a new binder that does not
+      shadow any other binding. This non-shadowing invariant is enough to prevent
+      binder uniqueness problems in \small{GHC}.
+
+      In our transformation system, maintaining this non-shadowing invariant is
+      a bit harder to do (mostly due to implementation issues, the prototype doesn't
+      use \small{GHC}'s subsitution code). Also, we can observe the following
+      points.
+
+      \startitemize
+      \item Deshadowing does not guarantee overall uniqueness. For example, the
+      following (slightly contrived) expression shows the identifier \lam{x} bound in
+      two seperate places (and to different values), even though no shadowing
+      occurs.
+
+      \startlambda
+      (let x = 1 in x) + (let x = 2 in x)
+      \stoplambda
+
+      \item In our normal form (and the resulting \small{VHDL}), all binders
+      (signals) will end up in the same scope. To allow this, all binders within the
+      same function should be unique.
+
+      \item When we know that all binders in an expression are unique, moving around
+      or removing a subexpression will never cause any binder conflicts. If we have
+      some way to generate fresh binders, introducing new subexpressions will not
+      cause any problems either. The only way to cause conflicts is thus to
+      duplicate an existing subexpression.
+      \stopitemize
+
+      Given the above, our prototype maintains a unique binder invariant. This
+      meanst that in any given moment during normalization, all binders \emph{within
+      a single function} must be unique. To achieve this, we apply the following
+      technique.
+
+      TODO: Define fresh binders and unique supplies
+
+      \startitemize
+      \item Before starting normalization, all binders in the function are made
+      unique. This is done by generating a fresh binder for every binder used. This
+      also replaces binders that did not pose any conflict, but it does ensure that
+      all binders within the function are generated by the same unique supply. See
+      (TODO: ref fresh binder).
+      \item Whenever a new binder must be generated, we generate a fresh binder that
+      is guaranteed to be different from \emph{all binders generated so far}. This
+      can thus never introduce duplication and will maintain the invariant.
+      \item Whenever (part of) an expression is duplicated (for example when
+      inlining), all binders in the expression are replaced with fresh binders
+      (using the same method as at the start of normalization). These fresh binders
+      can never introduce duplication, so this will maintain the invariant.
+      \item Whenever we move part of an expression around within the function, there
+      is no need to do anything special. There is obviously no way to introduce
+      duplication by moving expressions around. Since we know that each of the
+      binders is already unique, there is no way to introduce (incorrect) shadowing
+      either.
+      \stopitemize
+
+  \section{Transform passes}
+    In this section we describe the actual transforms. Here we're using
+    the core language in a notation that resembles lambda calculus.
+
+    Each of these transforms is meant to be applied to every (sub)expression
+    in a program, for as long as it applies. Only when none of the
+    transformations can be applied anymore, the program is in normal form (by
+    definition). We hope to be able to prove that this form will obey all of the
+    constraints defined above, but this has yet to happen (though it seems likely
+    that it will).
+
+    Each of the transforms will be described informally first, explaining
+    the need for and goal of the transform. Then, a formal definition is
+    given, using a familiar syntax from the world of logic. Each transform
+    is specified as a number of conditions (above the horizontal line) and a
+    number of conclusions (below the horizontal line). The details of using
+    this notation are still a bit fuzzy, so comments are welcom.
+
+    \subsection{General cleanup}
+
+      \subsubsection{β-reduction}
+        β-reduction is a well known transformation from lambda calculus, where it is
+        the main reduction step. It reduces applications of labmda abstractions,
+        removing both the lambda abstraction and the application.
+
+        In our transformation system, this step helps to remove unwanted lambda
+        abstractions (basically all but the ones at the top level). Other
+        transformations (application propagation, non-representable inlining) make
+        sure that most lambda abstractions will eventually be reducable by
+        β-reduction.
+
+        TODO: Define substitution syntax
+
+        \starttrans
+        (λx.E) M
+        -----------------
+        E[M/x]
+        \stoptrans
+
+        % And an example
+        \startbuffer[from]
+        (λa. 2 * a) (2 * b)
+        \stopbuffer
+
+        \startbuffer[to]
+        2 * (2 * b)
+        \stopbuffer
+
+        \transexample{β-reduction}{from}{to}
+
+      \subsubsection{Application propagation}
+        This transformation is meant to propagate application expressions downwards
+        into expressions as far as possible. This allows partial applications inside
+        expressions to become fully applied and exposes new transformation
+        possibilities for other transformations (like β-reduction).
+
+        \starttrans
+        let binds in E) M
+        -----------------
+        let binds in E M
+        \stoptrans
+
+        % And an example
+        \startbuffer[from]
+        ( let 
+            val = 1
+          in 
+            add val
+        ) 3
+        \stopbuffer
+
+        \startbuffer[to]
+        let 
+          val = 1
+        in 
+          add val 3
+        \stopbuffer
+
+        \transexample{Application propagation for a let expression}{from}{to}
+
+        \starttrans
+        (case x of
+          p1 -> E1
+          \vdots
+          pn -> En) M
+        -----------------
          case x of
-          True -> a
-          False -> b
-      \stopbuffer
+          p1 -> E1 M
+          \vdots
+          pn -> En M
+        \stoptrans
+
+        % And an example
+        \startbuffer[from]
+        ( case x of 
+            True -> id
+            False -> neg
+        ) 1
+        \stopbuffer
+
+        \startbuffer[to]
+        case x of 
+          True -> id 1
+          False -> neg 1
+        \stopbuffer
+
+        \transexample{Application propagation for a case expression}{from}{to}
+
+      \subsubsection{Empty let removal}
+        This transformation is simple: It removes recursive lets that have no bindings
+        (which usually occurs when let derecursification removes the last binding from
+        it).
+
+        \starttrans
+        letrec in M
+        --------------
+        M
+        \stoptrans
+
+        \subsubsection{Simple let binding removal}
+        This transformation inlines simple let bindings (\eg a = b).
+
+        This transformation is not needed to get into normal form, but makes the
+        resulting \small{VHDL} a lot shorter.
+
+        \starttrans
+        letnonrec
+          a = b
+        in
+          M
+        -----------------
+        M[b/a]
+        \stoptrans
+
+        \starttrans
+        letrec
+          \vdots
+          a = b
+          \vdots
+        in
+          M
+        -----------------
+        let
+          \vdots [b/a]
+          \vdots [b/a]
+        in
+          M[b/a]
+        \stoptrans
+
+      \subsubsection{Unused let binding removal}
+        This transformation removes let bindings that are never used. Usually,
+        the desugarer introduces some unused let bindings.
+
+        This normalization pass should really be unneeded to get into normal form
+        (since ununsed bindings are not forbidden by the normal form), but in practice
+        the desugarer or simplifier emits some unused bindings that cannot be
+        normalized (e.g., calls to a \type{PatError} (TODO: Check this name)). Also,
+        this transformation makes the resulting \small{VHDL} a lot shorter.
+
+        \starttrans
+        let a = E in M
+        ----------------------------    \lam{a} does not occur free in \lam{M}
+        M
+        \stoptrans
+
+        \starttrans
+        letrec
+          \vdots
+          a = E
+          \vdots
+        in
+          M
+        ----------------------------    \lam{a} does not occur free in \lam{M}
+        letrec
+          \vdots
+          \vdots
+        in
+          M
+        \stoptrans
+
+      \subsubsection{Cast propagation / simplification}
+        This transform pushes casts down into the expression as far as possible.
+        Since its exact role and need is not clear yet, this transformation is
+        not yet specified.
+
+      \subsubsection{Compiler generated top level binding inlining}
+        TODO
+
+    \section{Program structure}
+
+      \subsubsection{η-abstraction}
+        This transformation makes sure that all arguments of a function-typed
+        expression are named, by introducing lambda expressions. When combined with
+        β-reduction and function inlining below, all function-typed expressions should
+        be lambda abstractions or global identifiers.
+
+        \starttrans
+        E                 \lam{E :: a -> b}
+        --------------    \lam{E} is not the first argument of an application.
+        λx.E x            \lam{E} is not a lambda abstraction.
+                          \lam{x} is a variable that does not occur free in \lam{E}.
+        \stoptrans
+
+        \startbuffer[from]
+        foo = λa.case a of 
+          True -> λb.mul b b
+          False -> id
+        \stopbuffer
+
+        \startbuffer[to]
+        foo = λa.λx.(case a of 
+            True -> λb.mul b b
+            False -> λy.id y) x
+        \stopbuffer
+
+        \transexample{η-abstraction}{from}{to}
+
+      \subsubsection{Let derecursification}
+        This transformation is meant to make lets non-recursive whenever possible.
+        This might allow other optimizations to do their work better. TODO: Why is
+        this needed exactly?
+
+      \subsubsection{Let flattening}
+        This transformation puts nested lets in the same scope, by lifting the
+        binding(s) of the inner let into a new let around the outer let. Eventually,
+        this will cause all let bindings to appear in the same scope (they will all be
+        in scope for the function return value).
+
+        Note that this transformation does not try to be smart when faced with
+        recursive lets, it will just leave the lets recursive (possibly joining a
+        recursive and non-recursive let into a single recursive let). The let
+        rederecursification transformation will do this instead.
+
+        \starttrans
+        letnonrec x = (let bindings in M) in N
+        ------------------------------------------
+        let bindings in (letnonrec x = M) in N
+        \stoptrans
+
+        \starttrans
+        letrec 
+          \vdots
+          x = (let bindings in M)
+          \vdots
+        in
+          N
+        ------------------------------------------
+        letrec
+          \vdots
+          bindings
+          x = M
+          \vdots
+        in
+          N
+        \stoptrans
  
-      \transexample{Let flattening}{from}{to}
+        \startbuffer[from]
+        let
+          a = letrec
+            x = 1
+            y = 2
+          in
+            x + y
+        in
+          letrec
+            b = let c = 3 in a + c
+            d = 4
+          in
+            d + b
+        \stopbuffer
+        \startbuffer[to]
+        letrec
+          x = 1
+          y = 2
+        in
+          let
+            a = x + y
+          in
+            letrec
+              c = 3
+              b = a + c
+              d = 4
+            in
+              d + b
+        \stopbuffer
+
+        \transexample{Let flattening}{from}{to}
+
+      \subsubsection{Return value simplification}
+        This transformation ensures that the return value of a function is always a
+        simple local variable reference.
+
+        Currently implemented using lambda simplification, let simplification, and
+        top simplification. Should change into something like the following, which
+        works only on the result of a function instead of any subexpression. This is
+        achieved by the contexts, like \lam{x = E}, though this is strictly not
+        correct (you could read this as "if there is any function \lam{x} that binds
+        \lam{E}, any \lam{E} can be transformed, while we only mean the \lam{E} that
+        is bound by \lam{x}. This might need some extra notes or something).
+
+        \starttrans
+        x = E                            \lam{E} is representable
+        ~                                \lam{E} is not a lambda abstraction
+        E                                \lam{E} is not a let expression
+        ---------------------------      \lam{E} is not a local variable reference
+        let x = E in x
+        \stoptrans
+
+        \starttrans
+        x = λv0 ... λvn.E
+        ~                                \lam{E} is representable
+        E                                \lam{E} is not a let expression
+        ---------------------------      \lam{E} is not a local variable reference
+        let x = E in x
+        \stoptrans
+
+        \starttrans
+        x = λv0 ... λvn.let ... in E
+        ~                                \lam{E} is representable
+        E                                \lam{E} is not a local variable reference
+        ---------------------------
+        let x = E in x
+        \stoptrans
+
+        \startbuffer[from]
+        x = add 1 2
+        \stopbuffer
+
+        \startbuffer[to]
+        x = let x = add 1 2 in x
+        \stopbuffer
+
+        \transexample{Return value simplification}{from}{to}
+
+    \subsection{Argument simplification}
+      The transforms in this section deal with simplifying application
+      arguments into normal form. The goal here is to:
  
+      \startitemize
+       \item Make all arguments of user-defined functions (\eg, of which
+       we have a function body) simple variable references of a runtime
+       representable type. This is needed, since these applications will be turned
+       into component instantiations.
+       \item Make all arguments of builtin functions one of:
+         \startitemize
+          \item A type argument.
+          \item A dictionary argument.
+          \item A type level expression.
+          \item A variable reference of a runtime representable type.
+          \item A variable reference or partial application of a function type.
+         \stopitemize
+      \stopitemize
  
-    \subsubsection{Case simplification}
-      This transformation ensures that all case expressions become normal form. This
-      means they will become one of:
+      When looking at the arguments of a user-defined function, we can
+      divide them into two categories:
        \startitemize
-      \item An extractor case with a single alternative that picks a single field
-      from a datatype, \eg \lam{case x of (a, b) -> a}.
-      \item A selector case with multiple alternatives and only wild binders, that
-      makes a choice between expressions based on the constructor of another
-      expression, \eg \lam{case x of Low -> a; High -> b}.
+        \item Arguments of a runtime representable type (\eg bits or vectors).
+
+              These arguments can be preserved in the program, since they can
+              be translated to input ports later on.  However, since we can
+              only connect signals to input ports, these arguments must be
+              reduced to simple variables (for which signals will be
+              produced). This is taken care of by the argument extraction
+              transform.
+        \item Non-runtime representable typed arguments.
+              
+              These arguments cannot be preserved in the program, since we
+              cannot represent them as input or output ports in the resulting
+              \small{VHDL}. To remove them, we create a specialized version of the
+              called function with these arguments filled in. This is done by
+              the argument propagation transform.
+
+              Typically, these arguments are type and dictionary arguments that are
+              used to make functions polymorphic. By propagating these arguments, we
+              are essentially doing the same which GHC does when it specializes
+              functions: Creating multiple variants of the same function, one for
+              each type for which it is used. Other common non-representable
+              arguments are functions, e.g. when calling a higher order function
+              with another function or a lambda abstraction as an argument.
+
+              The reason for doing this is similar to the reasoning provided for
+              the inlining of non-representable let bindings above. In fact, this
+              argument propagation could be viewed as a form of cross-function
+              inlining.
        \stopitemize
  
-      \starttrans
-      case E of
-        C0 v0,0 ... v0,m -> E0
-        \vdots
-        Cn vn,0 ... vn,m -> En
-      --------------------------------------------------- \forall i \forall j, 0 <= i <= n, 0 <= i < m (\lam{wi,j} is a wild (unused) binder)
-      letnonrec
-        v0,0 = case x of C0 v0,0 .. v0,m -> v0,0
-        \vdots
-        v0,m = case x of C0 v0,0 .. v0,m -> v0,m
-        x0 = E0
-        \dots
-        vn,m = case x of Cn vn,0 .. vn,m -> vn,m
-        xn = En
-      in
+      TODO: Check the following itemization.
+
+      When looking at the arguments of a builtin function, we can divide them
+      into categories: 
+
+      \startitemize
+        \item Arguments of a runtime representable type.
+              
+              As we have seen with user-defined functions, these arguments can
+              always be reduced to a simple variable reference, by the
+              argument extraction transform. Performing this transform for
+              builtin functions as well, means that the translation of builtin
+              functions can be limited to signal references, instead of
+              needing to support all possible expressions.
+
+        \item Arguments of a function type.
+              
+              These arguments are functions passed to higher order builtins,
+              like \lam{map} and \lam{foldl}. Since implementing these
+              functions for arbitrary function-typed expressions (\eg, lambda
+              expressions) is rather comlex, we reduce these arguments to
+              (partial applications of) global functions.
+              
+              We can still support arbitrary expressions from the user code,
+              by creating a new global function containing that expression.
+              This way, we can simply replace the argument with a reference to
+              that new function. However, since the expression can contain any
+              number of free variables we also have to include partial
+              applications in our normal form.
+
+              This category of arguments is handled by the function extraction
+              transform.
+        \item Other unrepresentable arguments.
+              
+              These arguments can take a few different forms:
+              \startdesc{Type arguments}
+                In the core language, type arguments can only take a single
+                form: A type wrapped in the Type constructor. Also, there is
+                nothing that can be done with type expressions, except for
+                applying functions to them, so we can simply leave type
+                arguments as they are.
+              \stopdesc
+              \startdesc{Dictionary arguments}
+                In the core language, dictionary arguments are used to find
+                operations operating on one of the type arguments (mostly for
+                finding class methods). Since we will not actually evaluatie
+                the function body for builtin functions and can generate
+                code for builtin functions by just looking at the type
+                arguments, these arguments can be ignored and left as they
+                are.
+              \stopdesc
+              \startdesc{Type level arguments}
+                Sometimes, we want to pass a value to a builtin function, but
+                we need to know the value at compile time. Additionally, the
+                value has an impact on the type of the function. This is
+                encoded using type-level values, where the actual value of the
+                argument is not important, but the type encodes some integer,
+                for example. Since the value is not important, the actual form
+                of the expression does not matter either and we can leave
+                these arguments as they are.
+              \stopdesc
+              \startdesc{Other arguments}
+                Technically, there is still a wide array of arguments that can
+                be passed, but does not fall into any of the above categories.
+                However, none of the supported builtin functions requires such
+                an argument. This leaves use with passing unsupported types to
+                a function, such as calling \lam{head} on a list of functions.
+
+                In these cases, it would be impossible to generate hardware
+                for such a function call anyway, so we can ignore these
+                arguments.
+
+                The only way to generate hardware for builtin functions with
+                arguments like these, is to expand the function call into an
+                equivalent core expression (\eg, expand map into a series of
+                function applications). But for now, we choose to simply not
+                support expressions like these.
+              \stopdesc
+
+              From the above, we can conclude that we can simply ignore these
+              other unrepresentable arguments and focus on the first two
+              categories instead.
+      \stopitemize
+
+      \subsubsection{Argument simplification}
+        This transform deals with arguments to functions that
+        are of a runtime representable type. It ensures that they will all become
+        references to global variables, or local signals in the resulting \small{VHDL}. 
+
+        TODO: It seems we can map an expression to a port, not only a signal.
+        Perhaps this makes this transformation not needed?
+        TODO: Say something about dataconstructors (without arguments, like True
+        or False), which are variable references of a runtime representable
+        type, but do not result in a signal.
+
+        To reduce a complex expression to a simple variable reference, we create
+        a new let expression around the application, which binds the complex
+        expression to a new variable. The original function is then applied to
+        this variable.
+
+        \starttrans
+        M N
+        --------------------    \lam{N} is of a representable type
+        let x = N in M x        \lam{N} is not a local variable reference
+        \stoptrans
+
+        \startbuffer[from]
+        add (add a 1) 1
+        \stopbuffer
+
+        \startbuffer[to]
+        let x = add a 1 in add x 1
+        \stopbuffer
+
+        \transexample{Argument extraction}{from}{to}
+
+      \subsubsection{Function extraction}
+        This transform deals with function-typed arguments to builtin functions.
+        Since these arguments cannot be propagated, we choose to extract them
+        into a new global function instead.
+
+        Any free variables occuring in the extracted arguments will become
+        parameters to the new global function. The original argument is replaced
+        with a reference to the new function, applied to any free variables from
+        the original argument.
+
+        This transformation is useful when applying higher order builtin functions
+        like \hs{map} to a lambda abstraction, for example. In this case, the code
+        that generates \small{VHDL} for \hs{map} only needs to handle top level functions and
+        partial applications, not any other expression (such as lambda abstractions or
+        even more complicated expressions).
+
+        \starttrans
+        M N                     \lam{M} is a (partial aplication of a) builtin function.
+        ---------------------   \lam{f0 ... fn} = free local variables of \lam{N}
+        M x f0 ... fn           \lam{N :: a -> b}
+        ~                       \lam{N} is not a (partial application of) a top level function
+        x = λf0 ... λfn.N
+        \stoptrans
+
+        \startbuffer[from]
+        map (λa . add a b) xs
+
+        map (add b) ys
+        \stopbuffer
+
+        \startbuffer[to]
+        x0 = λb.λa.add a b
+        ~
+        map x0 xs
+
+        x1 = λb.add b
+        map x1 ys
+        \stopbuffer
+
+        \transexample{Function extraction}{from}{to}
+
+      \subsubsection{Argument propagation}
+        This transform deals with arguments to user-defined functions that are
+        not representable at runtime. This means these arguments cannot be
+        preserved in the final form and most be {\em propagated}.
+
+        Propagation means to create a specialized version of the called
+        function, with the propagated argument already filled in. As a simple
+        example, in the following program:
+
+        \startlambda
+        f = λa.λb.a + b
+        inc = λa.f a 1
+        \stoplambda
+
+        we could {\em propagate} the constant argument 1, with the following
+        result:
+
+        \startlambda
+        f' = λa.a + 1
+        inc = λa.f' a
+        \stoplambda
+
+        Special care must be taken when the to-be-propagated expression has any
+        free variables. If this is the case, the original argument should not be
+        removed alltogether, but replaced by all the free variables of the
+        expression. In this way, the original expression can still be evaluated
+        inside the new function. Also, this brings us closer to our goal: All
+        these free variables will be simple variable references.
+
+        To prevent us from propagating the same argument over and over, a simple
+        local variable reference is not propagated (since is has exactly one
+        free variable, itself, we would only replace that argument with itself).
+
+        This shows that any free local variables that are not runtime representable
+        cannot be brought into normal form by this transform. We rely on an
+        inlining transformation to replace such a variable with an expression we
+        can propagate again.
+
+        \starttrans
+        x = E
+        ~
+        x Y0 ... Yi ... Yn                               \lam{Yi} is not of a runtime representable type
+        ---------------------------------------------    \lam{Yi} is not a local variable reference
+        x' y0 ... yi-1 f0 ...  fm Yi+1 ... Yn            \lam{f0 ... fm} = free local vars of \lam{Yi}
+        ~
+        x' = λy0 ... yi-1 f0 ... fm yi+1 ... yn .       
+              E y0 ... yi-1 Yi yi+1 ... yn   
+
+        \stoptrans
+
+        TODO: Example
+
+    \subsection{Case simplification}
+      \subsubsection{Scrutinee simplification}
+        This transform ensures that the scrutinee of a case expression is always
+        a simple variable reference.
+
+        \starttrans
+        case E of
+          alts
+        -----------------        \lam{E} is not a local variable reference
+        let x = E in 
+          case E of
+            alts
+        \stoptrans
+
+        \startbuffer[from]
+        case (foo a) of
+          True -> a
+          False -> b
+        \stopbuffer
+
+        \startbuffer[to]
+        let x = foo a in
+          case x of
+            True -> a
+            False -> b
+        \stopbuffer
+
+        \transexample{Let flattening}{from}{to}
+
+
+      \subsubsection{Case simplification}
+        This transformation ensures that all case expressions become normal form. This
+        means they will become one of:
+        \startitemize
+        \item An extractor case with a single alternative that picks a single field
+        from a datatype, \eg \lam{case x of (a, b) -> a}.
+        \item A selector case with multiple alternatives and only wild binders, that
+        makes a choice between expressions based on the constructor of another
+        expression, \eg \lam{case x of Low -> a; High -> b}.
+        \stopitemize
+
+        \starttrans
          case E of
-          C0 w0,0 ... w0,m -> x0
+          C0 v0,0 ... v0,m -> E0
            \vdots
-          Cn wn,0 ... wn,m -> xn
-      \stoptrans
-
-      TODO: This transformation specified like this is complicated and misses
-      conditions to prevent looping with itself. Perhaps we should split it here for
-      discussion?
-
-      \startbuffer[from]
-      case a of
-        True -> add b 1
-        False -> add b 2
-      \stopbuffer
-
-      \startbuffer[to]
-      letnonrec
-        x0 = add b 1
-        x1 = add b 2
-      in
+          Cn vn,0 ... vn,m -> En
+        --------------------------------------------------- \forall i \forall j, 0 <= i <= n, 0 <= i < m (\lam{wi,j} is a wild (unused) binder)
+        letnonrec
+          v0,0 = case x of C0 v0,0 .. v0,m -> v0,0
+          \vdots
+          v0,m = case x of C0 v0,0 .. v0,m -> v0,m
+          x0 = E0
+          \dots
+          vn,m = case x of Cn vn,0 .. vn,m -> vn,m
+          xn = En
+        in
+          case E of
+            C0 w0,0 ... w0,m -> x0
+            \vdots
+            Cn wn,0 ... wn,m -> xn
+        \stoptrans
+
+        TODO: This transformation specified like this is complicated and misses
+        conditions to prevent looping with itself. Perhaps we should split it here for
+        discussion?
+
+        \startbuffer[from]
          case a of
-          True -> x0
-          False -> x1
-      \stopbuffer
-
-      \transexample{Selector case simplification}{from}{to}
-
-      \startbuffer[from]
-      case a of
-        (,) b c -> add b c
-      \stopbuffer
-      \startbuffer[to]
-      letnonrec
-        b = case a of (,) b c -> b
-        c = case a of (,) b c -> c
-        x0 = add b c
-      in
+          True -> add b 1
+          False -> add b 2
+        \stopbuffer
+
+        \startbuffer[to]
+        letnonrec
+          x0 = add b 1
+          x1 = add b 2
+        in
+          case a of
+            True -> x0
+            False -> x1
+        \stopbuffer
+
+        \transexample{Selector case simplification}{from}{to}
+
+        \startbuffer[from]
+        case a of
+          (,) b c -> add b c
+        \stopbuffer
+        \startbuffer[to]
+        letnonrec
+          b = case a of (,) b c -> b
+          c = case a of (,) b c -> c
+          x0 = add b c
+        in
+          case a of
+            (,) w0 w1 -> x0
+        \stopbuffer
+
+        \transexample{Extractor case simplification}{from}{to}
+
+      \subsubsection{Case removal}
+        This transform removes any case statements with a single alternative and
+        only wild binders.
+
+        These "useless" case statements are usually leftovers from case simplification
+        on extractor case (see the previous example).
+
+        \starttrans
+        case x of
+          C v0 ... vm -> E
+        ----------------------     \lam{\forall i, 0 <= i <= m} (\lam{vi} does not occur free in E)
+        E
+        \stoptrans
+
+        \startbuffer[from]
          case a of
            (,) w0 w1 -> x0
-      \stopbuffer
-
-      \transexample{Extractor case simplification}{from}{to}
-
-    \subsubsection{Case removal}
-      This transform removes any case statements with a single alternative and
-      only wild binders.
-
-      These "useless" case statements are usually leftovers from case simplification
-      on extractor case (see the previous example).
-
-      \starttrans
-      case x of
-        C v0 ... vm -> E
-      ----------------------     \lam{\forall i, 0 <= i <= m} (\lam{vi} does not occur free in E)
-      E
-      \stoptrans
-
-      \startbuffer[from]
-      case a of
-        (,) w0 w1 -> x0
-      \stopbuffer
-
-      \startbuffer[to]
-      x0
-      \stopbuffer
-
-      \transexample{Case removal}{from}{to}
-
-\subsection{Monomorphisation}
-  TODO: Better name for this section
-
-  Reference type-specialization (== argument propagation)
-
-\subsubsection{Defunctionalization}
-  Reference higher-order-specialization (== argument propagation)
-
-    \subsubsection{Non-representable binding inlining}
-      This transform inlines let bindings that have a non-representable type. Since
-      we can never generate a signal assignment for these bindings (we cannot
-      declare a signal assignment with a non-representable type, for obvious
-      reasons), we have no choice but to inline the binding to remove it.
-
-      If the binding is non-representable because it is a lambda abstraction, it is
-      likely that it will inlined into an application and β-reduction will remove
-      the lambda abstraction and turn it into a representable expression at the
-      inline site. The same holds for partial applications, which can be turned into
-      full applications by inlining.
-
-      Other cases of non-representable bindings we see in practice are primitive
-      Haskell types. In most cases, these will not result in a valid normalized
-      output, but then the input would have been invalid to start with. There is one
-      exception to this: When a builtin function is applied to a non-representable
-      expression, things might work out in some cases. For example, when you write a
-      literal \hs{SizedInt} in Haskell, like \hs{1 :: SizedInt D8}, this results in
-      the following core: \lam{fromInteger (smallInteger 10)}, where for example
-      \lam{10 :: GHC.Prim.Int\#} and \lam{smallInteger 10 :: Integer} have
-      non-representable types. TODO: This/these paragraph(s) should probably become a
-      separate discussion somewhere else.
-
-      \starttrans
-      letnonrec a = E in M
-      --------------------------    \lam{E} has a non-representable type.
-      M[E/a]
-      \stoptrans
-
-      \starttrans
-      letrec 
-        \vdots
-        a = E
-        \vdots
-      in
-        M
-      --------------------------    \lam{E} has a non-representable type.
-      letrec
-        \vdots [E/a]
-        \vdots [E/a]
-      in
+        \stopbuffer
+
+        \startbuffer[to]
+        x0
+        \stopbuffer
+
+        \transexample{Case removal}{from}{to}
+
+  \subsection{Monomorphisation}
+    TODO: Better name for this section
+
+    Reference type-specialization (== argument propagation)
+
+  \subsubsection{Defunctionalization}
+    Reference higher-order-specialization (== argument propagation)
+
+      \subsubsection{Non-representable binding inlining}
+        This transform inlines let bindings that have a non-representable type. Since
+        we can never generate a signal assignment for these bindings (we cannot
+        declare a signal assignment with a non-representable type, for obvious
+        reasons), we have no choice but to inline the binding to remove it.
+
+        If the binding is non-representable because it is a lambda abstraction, it is
+        likely that it will inlined into an application and β-reduction will remove
+        the lambda abstraction and turn it into a representable expression at the
+        inline site. The same holds for partial applications, which can be turned into
+        full applications by inlining.
+
+        Other cases of non-representable bindings we see in practice are primitive
+        Haskell types. In most cases, these will not result in a valid normalized
+        output, but then the input would have been invalid to start with. There is one
+        exception to this: When a builtin function is applied to a non-representable
+        expression, things might work out in some cases. For example, when you write a
+        literal \hs{SizedInt} in Haskell, like \hs{1 :: SizedInt D8}, this results in
+        the following core: \lam{fromInteger (smallInteger 10)}, where for example
+        \lam{10 :: GHC.Prim.Int\#} and \lam{smallInteger 10 :: Integer} have
+        non-representable types. TODO: This/these paragraph(s) should probably become a
+        separate discussion somewhere else.
+
+        \starttrans
+        letnonrec a = E in M
+        --------------------------    \lam{E} has a non-representable type.
          M[E/a]
-      \stoptrans
-
-      \startbuffer[from]
-      letrec
-        a = smallInteger 10
-        inc = λa -> add a 1
-        inc' = add 1
-        x = fromInteger a 
-      in
-        inc (inc' x)
-      \stopbuffer
-
-      \startbuffer[to]
-      letrec
-        x = fromInteger (smallInteger 10)
-      in
-        (λa -> add a 1) (add 1 x)
-      \stopbuffer
-
-      \transexample{Let flattening}{from}{to}
-
-
-\section{Provable properties}
-  When looking at the system of transformations outlined above, there are a
-  number of questions that we can ask ourselves. The main question is of course:
-  \quote{Does our system work as intended?}. We can split this question into a
-  number of subquestions:
-
-  \startitemize[KR]
-  \item[q:termination] Does our system \emph{terminate}? Since our system will
-  keep running as long as transformations apply, there is an obvious risk that
-  it will keep running indefinitely. One transformation produces a result that
-  is transformed back to the original by another transformation, for example.
-  \item[q:soundness] Is our system \emph{sound}? Since our transformations
-  continuously modify the expression, there is an obvious risk that the final
-  normal form will not be equivalent to the original program: Its meaning could
-  have changed.
-  \item[q:completeness] Is our system \emph{complete}? Since we have a complex
-  system of transformations, there is an obvious risk that some expressions will
-  not end up in our intended normal form, because we forgot some transformation.
-  In other words: Does our transformation system result in our intended normal
-  form for all possible inputs?
-  \item[q:determinism] Is our system \emph{deterministic}? Since we have defined
-  no particular order in which the transformation should be applied, there is an
-  obvious risk that different transformation orderings will result in
-  \emph{different} normal forms. They might still both be intended normal forms
-  (if our system is \emph{complete}) and describe correct hardware (if our
-  system is \emph{sound}), so this property is less important than the previous
-  three: The translator would still function properly without it.
-  \stopitemize
-
-  \subsection{Graph representation}
-    Before looking into how to prove these properties, we'll look at our
-    transformation system from a graph perspective. The nodes of the graph are
-    all possible Core expressions. The (directed) edges of the graph are
-    transformations. When a transformation α applies to an expression \lam{A} to
-    produce an expression \lam{B}, we add an edge from the node for \lam{A} to the
-    node for \lam{B}, labeled α.
-
-    \startuseMPgraphic{TransformGraph}
-      save a, b, c, d;
-
-      % Nodes
-      newCircle.a(btex \lam{(λx.λy. (+) x y) 1} etex);
-      newCircle.b(btex \lam{λy. (+) 1 y} etex);
-      newCircle.c(btex \lam{(λx.(+) x) 1} etex);
-      newCircle.d(btex \lam{(+) 1} etex);
-
-      b.c = origin;
-      c.c = b.c + (4cm, 0cm);
-      a.c = midpoint(b.c, c.c) + (0cm, 4cm);
-      d.c = midpoint(b.c, c.c) - (0cm, 3cm);
-
-      % β-conversion between a and b
-      ncarc.a(a)(b) "name(bred)";
-      ObjLabel.a(btex $\xrightarrow[normal]{}{β}$ etex) "labpathname(bred)", "labdir(rt)";
-      ncarc.b(b)(a) "name(bexp)", "linestyle(dashed withdots)";
-      ObjLabel.b(btex $\xleftarrow[normal]{}{β}$ etex) "labpathname(bexp)", "labdir(lft)";
-
-      % η-conversion between a and c
-      ncarc.a(a)(c) "name(ered)";
-      ObjLabel.a(btex $\xrightarrow[normal]{}{η}$ etex) "labpathname(ered)", "labdir(rt)";
-      ncarc.c(c)(a) "name(eexp)", "linestyle(dashed withdots)";
-      ObjLabel.c(btex $\xleftarrow[normal]{}{η}$ etex) "labpathname(eexp)", "labdir(lft)";
-
-      % η-conversion between b and d
-      ncarc.b(b)(d) "name(ered)";
-      ObjLabel.b(btex $\xrightarrow[normal]{}{η}$ etex) "labpathname(ered)", "labdir(rt)";
-      ncarc.d(d)(b) "name(eexp)", "linestyle(dashed withdots)";
-      ObjLabel.d(btex $\xleftarrow[normal]{}{η}$ etex) "labpathname(eexp)", "labdir(lft)";
-
-      % β-conversion between c and d
-      ncarc.c(c)(d) "name(bred)";
-      ObjLabel.c(btex $\xrightarrow[normal]{}{β}$ etex) "labpathname(bred)", "labdir(rt)";
-      ncarc.d(d)(c) "name(bexp)", "linestyle(dashed withdots)";
-      ObjLabel.d(btex $\xleftarrow[normal]{}{β}$ etex) "labpathname(bexp)", "labdir(lft)";
+        \stoptrans
  
-      % Draw objects and lines
-      drawObj(a, b, c, d);
-    \stopuseMPgraphic
+        \starttrans
+        letrec 
+          \vdots
+          a = E
+          \vdots
+        in
+          M
+        --------------------------    \lam{E} has a non-representable type.
+        letrec
+          \vdots [E/a]
+          \vdots [E/a]
+        in
+          M[E/a]
+        \stoptrans
  
-    \placeexample[right][ex:TransformGraph]{Partial graph of a labmda calculus
-    system with β and η reduction (solid lines) and expansion (dotted lines).}
-        \boxedgraphic{TransformGraph}
+        \startbuffer[from]
+        letrec
+          a = smallInteger 10
+          inc = λa -> add a 1
+          inc' = add 1
+          x = fromInteger a 
+        in
+          inc (inc' x)
+        \stopbuffer
  
-    Of course our graph is unbounded, since we can construct an infinite amount of
-    Core expressions. Also, there might potentially be multiple edges between two
-    given nodes (with different labels), though seems unlikely to actually happen
-    in our system.
+        \startbuffer[to]
+        letrec
+          x = fromInteger (smallInteger 10)
+        in
+          (λa -> add a 1) (add 1 x)
+        \stopbuffer
  
-    See \in{example}[ex:TransformGraph] for the graph representation of a very
-    simple lambda calculus that contains just the expressions \lam{(λx.λy. (+) x
-    y) 1}, \lam{λy. (+) 1 y}, \lam{(λx.(+) x) 1} and \lam{(+) 1}. The
-    transformation system consists of β-reduction and η-reduction (solid edges) or
-    β-reduction and η-reduction (dotted edges).
+        \transexample{Let flattening}{from}{to}
  
-    TODO: Define β-reduction and η-reduction?
  
-    Note that the normal form of such a system consists of the set of nodes
-    (expressions) without outgoing edges, since those are the expression to which
-    no transformation applies anymore. We call this set of nodes the \emph{normal
-    set}.
+  \section{Provable properties}
+    When looking at the system of transformations outlined above, there are a
+    number of questions that we can ask ourselves. The main question is of course:
+    \quote{Does our system work as intended?}. We can split this question into a
+    number of subquestions:
  
-    From such a graph, we can derive some properties easily:
      \startitemize[KR]
-      \item A system will \emph{terminate} if there is no path of infinite length
-      in the graph (this includes cycles).
-      \item Soundness is not easily represented in the graph.
-      \item A system is \emph{complete} if all of the nodes in the normal set have
-      the intended normal form. The inverse (that all of the nodes outside of
-      the normal set are \emph{not} in the intended normal form) is not
-      strictly required.
-      \item A system is deterministic if all paths from a node, which end in a node
-      in the normal set, end at the same node.
+    \item[q:termination] Does our system \emph{terminate}? Since our system will
+    keep running as long as transformations apply, there is an obvious risk that
+    it will keep running indefinitely. One transformation produces a result that
+    is transformed back to the original by another transformation, for example.
+    \item[q:soundness] Is our system \emph{sound}? Since our transformations
+    continuously modify the expression, there is an obvious risk that the final
+    normal form will not be equivalent to the original program: Its meaning could
+    have changed.
+    \item[q:completeness] Is our system \emph{complete}? Since we have a complex
+    system of transformations, there is an obvious risk that some expressions will
+    not end up in our intended normal form, because we forgot some transformation.
+    In other words: Does our transformation system result in our intended normal
+    form for all possible inputs?
+    \item[q:determinism] Is our system \emph{deterministic}? Since we have defined
+    no particular order in which the transformation should be applied, there is an
+    obvious risk that different transformation orderings will result in
+    \emph{different} normal forms. They might still both be intended normal forms
+    (if our system is \emph{complete}) and describe correct hardware (if our
+    system is \emph{sound}), so this property is less important than the previous
+    three: The translator would still function properly without it.
      \stopitemize
  
-    When looking at the \in{example}[ex:TransformGraph], we see that the system
-    terminates for both the reduction and expansion systems (but note that, for
-    expansion, this is only true because we've limited the possible expressions!
-    In comlete lambda calculus, there would be a path from \lam{(λx.λy. (+) x y)
-    1} to \lam{(λx.λy.(λz.(+) z) x y) 1} to \lam{(λx.λy.(λz.(λq.(+) q) z) x y) 1}
-    etc.)
-
-    If we would consider the system with both expansion and reduction, there would
-    no longer be termination, since there would be cycles all over the place.
-
-    The reduction and expansion systems have a normal set of containing just
-    \lam{(+) 1} or \lam{(λx.λy. (+) x y) 1} respectively. Since all paths in
-    either system end up in these normal forms, both systems are \emph{complete}.
-    Also, since there is only one normal form, it must obviously be
-    \emph{deterministic} as well.
-
-  \subsection{Termination}
-    Approach: Counting.
-
-    Church-Rosser?
-
-  \subsection{Soundness}
-    Needs formal definition of semantics.
-    Prove for each transformation seperately, implies soundness of the system.
-   
-  \subsection{Completeness}
-    Show that any transformation applies to every Core expression that is not
-    in normal form. To prove: no transformation applies => in intended form.
-    Show the reverse: Not in intended form => transformation applies.
-
-  \subsection{Determinism}
-    How to prove this?
+    \subsection{Graph representation}
+      Before looking into how to prove these properties, we'll look at our
+      transformation system from a graph perspective. The nodes of the graph are
+      all possible Core expressions. The (directed) edges of the graph are
+      transformations. When a transformation α applies to an expression \lam{A} to
+      produce an expression \lam{B}, we add an edge from the node for \lam{A} to the
+      node for \lam{B}, labeled α.
+
+      \startuseMPgraphic{TransformGraph}
+        save a, b, c, d;
+
+        % Nodes
+        newCircle.a(btex \lam{(λx.λy. (+) x y) 1} etex);
+        newCircle.b(btex \lam{λy. (+) 1 y} etex);
+        newCircle.c(btex \lam{(λx.(+) x) 1} etex);
+        newCircle.d(btex \lam{(+) 1} etex);
+
+        b.c = origin;
+        c.c = b.c + (4cm, 0cm);
+        a.c = midpoint(b.c, c.c) + (0cm, 4cm);
+        d.c = midpoint(b.c, c.c) - (0cm, 3cm);
+
+        % β-conversion between a and b
+        ncarc.a(a)(b) "name(bred)";
+        ObjLabel.a(btex $\xrightarrow[normal]{}{β}$ etex) "labpathname(bred)", "labdir(rt)";
+        ncarc.b(b)(a) "name(bexp)", "linestyle(dashed withdots)";
+        ObjLabel.b(btex $\xleftarrow[normal]{}{β}$ etex) "labpathname(bexp)", "labdir(lft)";
+
+        % η-conversion between a and c
+        ncarc.a(a)(c) "name(ered)";
+        ObjLabel.a(btex $\xrightarrow[normal]{}{η}$ etex) "labpathname(ered)", "labdir(rt)";
+        ncarc.c(c)(a) "name(eexp)", "linestyle(dashed withdots)";
+        ObjLabel.c(btex $\xleftarrow[normal]{}{η}$ etex) "labpathname(eexp)", "labdir(lft)";
+
+        % η-conversion between b and d
+        ncarc.b(b)(d) "name(ered)";
+        ObjLabel.b(btex $\xrightarrow[normal]{}{η}$ etex) "labpathname(ered)", "labdir(rt)";
+        ncarc.d(d)(b) "name(eexp)", "linestyle(dashed withdots)";
+        ObjLabel.d(btex $\xleftarrow[normal]{}{η}$ etex) "labpathname(eexp)", "labdir(lft)";
+
+        % β-conversion between c and d
+        ncarc.c(c)(d) "name(bred)";
+        ObjLabel.c(btex $\xrightarrow[normal]{}{β}$ etex) "labpathname(bred)", "labdir(rt)";
+        ncarc.d(d)(c) "name(bexp)", "linestyle(dashed withdots)";
+        ObjLabel.d(btex $\xleftarrow[normal]{}{β}$ etex) "labpathname(bexp)", "labdir(lft)";
+
+        % Draw objects and lines
+        drawObj(a, b, c, d);
+      \stopuseMPgraphic
+
+      \placeexample[right][ex:TransformGraph]{Partial graph of a labmda calculus
+      system with β and η reduction (solid lines) and expansion (dotted lines).}
+          \boxedgraphic{TransformGraph}
+
+      Of course our graph is unbounded, since we can construct an infinite amount of
+      Core expressions. Also, there might potentially be multiple edges between two
+      given nodes (with different labels), though seems unlikely to actually happen
+      in our system.
+
+      See \in{example}[ex:TransformGraph] for the graph representation of a very
+      simple lambda calculus that contains just the expressions \lam{(λx.λy. (+) x
+      y) 1}, \lam{λy. (+) 1 y}, \lam{(λx.(+) x) 1} and \lam{(+) 1}. The
+      transformation system consists of β-reduction and η-reduction (solid edges) or
+      β-reduction and η-reduction (dotted edges).
+
+      TODO: Define β-reduction and η-reduction?
+
+      Note that the normal form of such a system consists of the set of nodes
+      (expressions) without outgoing edges, since those are the expression to which
+      no transformation applies anymore. We call this set of nodes the \emph{normal
+      set}.
+
+      From such a graph, we can derive some properties easily:
+      \startitemize[KR]
+        \item A system will \emph{terminate} if there is no path of infinite length
+        in the graph (this includes cycles).
+        \item Soundness is not easily represented in the graph.
+        \item A system is \emph{complete} if all of the nodes in the normal set have
+        the intended normal form. The inverse (that all of the nodes outside of
+        the normal set are \emph{not} in the intended normal form) is not
+        strictly required.
+        \item A system is deterministic if all paths from a node, which end in a node
+        in the normal set, end at the same node.
+      \stopitemize
+
+      When looking at the \in{example}[ex:TransformGraph], we see that the system
+      terminates for both the reduction and expansion systems (but note that, for
+      expansion, this is only true because we've limited the possible expressions!
+      In comlete lambda calculus, there would be a path from \lam{(λx.λy. (+) x y)
+      1} to \lam{(λx.λy.(λz.(+) z) x y) 1} to \lam{(λx.λy.(λz.(λq.(+) q) z) x y) 1}
+      etc.)
+
+      If we would consider the system with both expansion and reduction, there would
+      no longer be termination, since there would be cycles all over the place.
+
+      The reduction and expansion systems have a normal set of containing just
+      \lam{(+) 1} or \lam{(λx.λy. (+) x y) 1} respectively. Since all paths in
+      either system end up in these normal forms, both systems are \emph{complete}.
+      Also, since there is only one normal form, it must obviously be
+      \emph{deterministic} as well.
+
+    \subsection{Termination}
+      Approach: Counting.
+
+      Church-Rosser?
+
+    \subsection{Soundness}
+      Needs formal definition of semantics.
+      Prove for each transformation seperately, implies soundness of the system.
+     
+    \subsection{Completeness}
+      Show that any transformation applies to every Core expression that is not
+      in normal form. To prove: no transformation applies => in intended form.
+      Show the reverse: Not in intended form => transformation applies.
+
+    \subsection{Determinism}
+      How to prove this?
author	Matthijs Kooijman <matthijs@stdin.nl>
	Mon, 2 Nov 2009 11:21:06 +0000 (12:21 +0100)
committer	Matthijs Kooijman <matthijs@stdin.nl>
	Mon, 2 Nov 2009 12:12:32 +0000 (13:12 +0100)