(\eg \lam{add} or \lam{sub}). For these, a hardcoded \small{VHDL} translation is
available.
-\subsection{Definitions}
-In the following sections, we will be using a number of functions and
-notations, which we will define here.
+\section{Transformation notation}
+To be able to concisely present transformations, we use a specific format to
+them. It is a simple format, similar to one used in logic reasoning.
-\subsubsection{Transformations}
-The most important notation is the one for transformation, which looks like
-the following:
+Such a transformation description looks like the following.
\starttrans
-context conditions
+<context conditions>
~
-from
------------------------- expression conditions
-to
+<original expression>
+-------------------------- <expression conditions>
+<transformed expresssion>
~
-context additions
+<context additions>
+\stoptrans
+
+This format desribes a transformation that applies to \lam{original
+expresssion} and transforms it into \lam{transformed expression}, assuming
+that all conditions apply. In this format, there are a number of placeholders
+in pointy brackets, most of which should be rather obvious in their meaning.
+Nevertheless, we will more precisely specify their meaning below:
+
+ \startdesc{<original expression>} The expression pattern that will be matched
+ against (subexpressions of) the expression to be transformed. We call this a
+ pattern, because it can contain \emph{placeholders} (variables), which match
+ any expression or binder. Any such placeholder is said to be \emph{bound} to
+ the expression it matches. It is convention to use an uppercase latter (\eg
+ \lam{M} or \lam{E} to refer to any expression (including a simple variable
+ reference) and lowercase letters (\eg \lam{v} or \lam{b}) to refer to
+ (references to) binders.
+
+ For example, the pattern \lam{a + B} will match the expression
+ \lam{v + (2 * w)} (and bind \lam{a} to \lam{v} and \lam{B} to
+ \lam{(2 * 2)}), but not \lam{v + (2 * w)}.
+ \stopdesc
+
+ \startdesc{<expression conditions>}
+ These are extra conditions on the expression that is matched. These
+ conditions can be used to further limit the cases in which the
+ transformation applies, in particular to prevent a transformation from
+ causing a loop with itself or another transformation.
+
+ Only if these if these conditions are \emph{all} true, this transformation
+ applies.
+ \stopdesc
+
+ \startdesc{<context conditions>}
+ These are a number of extra conditions on the context of the function. In
+ particular, these conditions can require some other top level function to be
+ present, whose value matches the pattern given here. The format of each of
+ these conditions is: \lam{binder = <pattern>}.
+
+ Typically, the binder is some placeholder bound in the \lam{<original
+ expression>}, while the pattern contains some placeholders that are used in
+ the \lam{transformed expression}.
+
+ Only if a top level binder exists that matches each binder and pattern, this
+ transformation applies.
+ \stopdesc
+
+ \startdesc{<transformed expression>}
+ This is the expression template that is the result of the transformation. If, looking
+ at the above three items, the transformation applies, the \lam{original
+ expression} is completely replaced with the \lam{<transformed expression>}.
+ We call this a template, because it can contain placeholders, referring to
+ any placeholder bound by the \lam{<original expression>} or the
+ \lam{<context conditions>}. The resulting expression will have those
+ placeholders replaced by the values bound to them.
+
+ Any binder (lowercase) placeholder that has no value bound to it yet will be
+ bound to (and replaced with) a fresh binder.
+ \stopdesc
+
+ \startdesc{<context additions>}
+ These are templates for new functions to add to the context. This is a way
+ to have a transformation create new top level functiosn.
+
+ Each addition has the form \lam{binder = template}. As above, any
+ placeholder in the addition is replaced with the value bound to it, and any
+ binder placeholder that has no value bound to it yet will be bound to (and
+ replaced with) a fresh binder.
+ \stopdesc
+
+ As an example, we'll look at η-abstraction:
+
+\starttrans
+E \lam{E :: a -> b}
+-------------- \lam{E} does not occur on a function position in an application
+λx.E x \lam{E} is not a lambda abstraction.
\stoptrans
-Here, we describe a transformation. The most import parts are \lam{from} and
-\lam{to}, which describe the Core expresssion that should be matched and the
-expression that it should be replaced with. This matching can occur anywhere
-in function that is being normalized, so it applies to any subexpression as
-well.
+ Consider the following function, which is a fairly obvious way to specify a
+ simple ALU (Note \at{example}[ex:AddSubAlu] is the normal form of this
+ function):
-The \lam{expression conditions} list a number of conditions on the \lam{from}
-expression that must hold for the transformation to apply.
+\startlambda
+alu :: Bit -> Word -> Word -> Word
+alu = λopcode. case opcode of
+ Low -> (+)
+ High -> (-)
+\stoplambda
-Furthermore, there is some way to look into the environment (\eg, other top
-level bindings). The \lam{context conditions} part specifies any number of
-top level bindings that must be present for the transformation to apply.
-Usually, this lists a top level binding that binds an identfier that is also
-used in the \lam{from} expression, allowing us to "access" the value of a top
-level binding in the \lam{to} expression (\eg, for inlining).
+ There are a few subexpressions in this function to which we could possibly
+ apply the transformation. Since the pattern of the transformation is only
+ the placeholder \lam{E}, any expression will match that. Whether the
+ transformation applies to an expression is thus solely decided by the
+ conditions to the right of the transformation.
-Finally, there is a way to influence the environment. The \lam{context
-additions} part lists any number of new top level bindings that should be
-added.
+ We will look at each expression in the function in a top down manner. The
+ first expression is the entire expression the function is bound to.
-If there are no \lam{context conditions} or \lam{context additions}, they can
-be left out alltogether, along with the separator \lam{~}.
+\startlambda
+λopcode. case opcode of
+ Low -> (+)
+ High -> (-)
+\stoplambda
-TODO: Example
+ As said, the expression pattern matches this. The type of this expression is
+ \lam{Bit -> Word -> Word -> Word}, which matches \lam{a -> b} (Note that in
+ this case \lam{a = Bit} and \lam{b = Word -> Word -> Word}).
-\subsubsection{Other concepts}
-A \emph{global variable} is any variable that is bound at the
-top level of a program, or an external module. A local variable is any other
-variable (\eg, variables local to a function, which can be bound by lambda
-abstractions, let expressions and case expressions).
+ Since this expression is at top level, it does not occur at a function
+ position of an application. However, The expression is a lambda abstraction,
+ so this transformation does not apply.
-A \emph{hardware representable} type is a type that we can generate
-a signal for in hardware. For example, a bit, a vector of bits, a 32 bit
-unsigned word, etc. Types that are not runtime representable notably
-include (but are not limited to): Types, dictionaries, functions.
+ The next expression we could apply this transformation to, is the body of
+ the lambda abstraction:
+
+\startlambda
+case opcode of
+ Low -> (+)
+ High -> (-)
+\stoplambda
+
+ The type of this expression is \lam{Word -> Word -> Word}, which again
+ matches \lam{a -> b}. The expression is the body of a lambda expression, so
+ it does not occur at a function position of an application. Finally, the
+ expression is not a lambda abstraction but a case expression, so all the
+ conditions match. There are no context conditions to match, so the
+ transformation applies.
+
+ By now, the placeholder \lam{E} is bound to the entire expression. The
+ placeholder \lam{x}, which occurs in the replacement template, is not bound
+ yet, so we need to generate a fresh binder for that. Let's use the binder
+ \lam{a}. This results in the following replacement expression:
+
+\startlambda
+λa.(case opcode of
+ Low -> (+)
+ High -> (-)) a
+\stoplambda
+
+ Continuing with this expression, we see that the transformation does not
+ apply again (it is a lambda expression). Next we look at the body of this
+ labmda abstraction:
+
+\startlambda
+(case opcode of
+ Low -> (+)
+ High -> (-)) a
+\stoplambda
+
+ Here, the transformation does apply, binding \lam{E} to the entire
+ expression and \lam{x} to the fresh binder \lam{b}, resulting in the
+ replacement:
+
+\startlambda
+λb.(case opcode of
+ Low -> (+)
+ High -> (-)) a b
+\stoplambda
-A \emph{builtin function} is a function for which a builtin
-hardware translation is available, because its actual definition is not
-translatable. A user-defined function is any other function.
+ Again, the transformation does not apply to this lambda abstraction, so we
+ look at its body. For brevity, we'll put the case statement on one line from
+ now on.
+
+\startlambda
+(case opcode of Low -> (+); High -> (-)) a b
+\stoplambda
+
+ The type of this expression is \lam{Word}, so it does not match \lam{a -> b}
+ and the transformation does not apply. Next, we have two options for the
+ next expression to look at: The function position and argument position of
+ the application. The expression in the argument position is \lam{b}, which
+ has type \lam{Word}, so the transformation does not apply. The expression in
+ the function position is:
+
+\startlambda
+(case opcode of Low -> (+); High -> (-)) a
+\stoplambda
+
+ Obviously, the transformation does not apply here, since it occurs in
+ function position. In the same way the transformation does not apply to both
+ components of this expression (\lam{case opcode of Low -> (+); High -> (-)}
+ and \lam{a}), so we'll skip to the components of the case expression: The
+ scrutinee and both alternatives. Since the opcode is not a function, it does
+ not apply here, and we'll leave both alternatives as an exercise to the
+ reader. The final function, after all these transformations becomes:
+
+\startlambda
+alu :: Bit -> Word -> Word -> Word
+alu = λopcode.λa.b. (case opcode of
+ Low -> λa1.λb1 (+) a1 b1
+ High -> λa2.λb2 (-) a2 b2) a b
+\stoplambda
+
+ In this case, the transformation does not apply anymore, though this might
+ not always be the case (e.g., the application of a transformation on a
+ subexpression might open up possibilities to apply the transformation
+ further up in the expression).
+
+\subsection{Transformation application}
+In this chapter we define a number of transformations, but how will we apply
+these? As stated before, our normal form is reached as soon as no
+transformation applies anymore. This means our application strategy is to
+simply apply any transformation that applies, and continuing to do that with
+the result of each transformation.
+
+In particular, we define no particular order of transformations. Since
+transformation order should not influence the resulting normal form (see TODO
+ref), this leaves the implementation free to choose any application order that
+results in an efficient implementation.
+
+When applying a single transformation, we try to apply it to every (sub)expression
+in a function, not just the top level function. This allows us to keep the
+transformation descriptions concise and powerful.
+
+\subsection{Definitions}
+In the following sections, we will be using a number of functions and
+notations, which we will define here.
+
+\subsubsection{Other concepts}
+A \emph{global variable} is any variable that is bound at the
+top level of a program, or an external module. A \emph{local variable} is any
+other variable (\eg, variables local to a function, which can be bound by
+lambda abstractions, let expressions and pattern matches of case
+alternatives). Note that this is a slightly different notion of global versus
+local than what \small{GHC} uses internally.
+\defref{global variable} \defref{local variable}
+
+A \emph{hardware representable} (or just \emph{representable}) type or value
+is (a value of) a type that we can generate a signal for in hardware. For
+example, a bit, a vector of bits, a 32 bit unsigned word, etc. Types that are
+not runtime representable notably include (but are not limited to): Types,
+dictionaries, functions.
+\defref{representable}
+
+A \emph{builtin function} is a function supplied by the Cλash framework, whose
+implementation is not valid Cλash. The implementation is of course valid
+Haskell, for simulation, but it is not expressable in Cλash.
+\defref{builtin function} \defref{user-defined function}
+
+For these functions, Cλash has a \emph{builtin hardware translation}, so calls
+to these functions can still be translated. These are functions like
+\lam{map}, \lam{hwor} and \lam{length}.
+
+A \emph{user-defined} function is a function for which we do have a Cλash
+implementation available.
\subsubsection{Functions}
Here, we define a number of functions that can be used below to concisely
specify conditions.
-\emph{gvar(expr)} is true when \emph{expr} is a variable that references a
+\refdef{global variable}\emph{gvar(expr)} is true when \emph{expr} is a variable that references a
global variable. It is false when it references a local variable.
-\emph{lvar(expr)} is the inverse of \emph{gvar}; it is true when \emph{expr}
+\refdef{local variable}\emph{lvar(expr)} is the complement of \emph{gvar}; it is true when \emph{expr}
references a local variable, false when it references a global variable.
-\emph{representable(expr)} or \emph{representable(var)} is true when
-\emph{expr} or \emph{var} has a type that is representable at runtime.
-
-\section{Transform passes}
-In this section we describe the actual transforms. Here we're using
-the core language in a notation that resembles lambda calculus.
-
-Each of these transforms is meant to be applied to every (sub)expression
-in a program, for as long as it applies. Only when none of the
-transformations can be applied anymore, the program is in normal form (by
-definition). We hope to be able to prove that this form will obey all of the
-constraints defined above, but this has yet to happen (though it seems likely
-that it will).
-
-Each of the transforms will be described informally first, explaining
-the need for and goal of the transform. Then, a formal definition is
-given, using a familiar syntax from the world of logic. Each transform
-is specified as a number of conditions (above the horizontal line) and a
-number of conclusions (below the horizontal line). The details of using
-this notation are still a bit fuzzy, so comments are welcom.
-
-TODO: Formally describe the "apply to every (sub)expression" in terms of
-rules with full transformations in the conditions.
+\refdef{representable}\emph{representable(expr)} or \emph{representable(var)} is true when
+\emph{expr} or \emph{var} is \emph{representable}.
\subsection{Binder uniqueness}
A common problem in transformation systems, is binder uniqueness. When not
either.
\stopitemize
+\section{Transform passes}
+In this section we describe the actual transforms. Here we're using
+the core language in a notation that resembles lambda calculus.
+
+Each of these transforms is meant to be applied to every (sub)expression
+in a program, for as long as it applies. Only when none of the
+transformations can be applied anymore, the program is in normal form (by
+definition). We hope to be able to prove that this form will obey all of the
+constraints defined above, but this has yet to happen (though it seems likely
+that it will).
+
+Each of the transforms will be described informally first, explaining
+the need for and goal of the transform. Then, a formal definition is
+given, using a familiar syntax from the world of logic. Each transform
+is specified as a number of conditions (above the horizontal line) and a
+number of conclusions (below the horizontal line). The details of using
+this notation are still a bit fuzzy, so comments are welcom.
+
\subsection{η-abstraction}
This transformation makes sure that all arguments of a function-typed
expression are named, by introducing lambda expressions. When combined with
be lambda abstractions or global identifiers.
\starttrans
-E \lam{E :: * -> *}
+E \lam{E :: a -> b}
-------------- \lam{E} is not the first argument of an application.
λx.E x \lam{E} is not a lambda abstraction.
\lam{x} is a variable that does not occur free in \lam{E}.
\transexample{η-abstraction}{from}{to}
-\subsection{Extended β-reduction}
+\subsection{β-reduction}
+β-reduction is a well known transformation from lambda calculus, where it is
+the main reduction step. It reduces applications of labmda abstractions,
+removing both the lambda abstraction and the application.
+
+In our transformation system, this step helps to remove unwanted lambda
+abstractions (basically all but the ones at the top level). Other
+transformations (application propagation, non-representable inlining) make
+sure that most lambda abstractions will eventually be reducable by
+β-reduction.
+
+TODO: Define substitution syntax
+
+\starttrans
+(λx.E) M
+-----------------
+E[M/x]
+\stoptrans
+
+% And an example
+\startbuffer[from]
+(λa. 2 * a) (2 * b)
+\stopbuffer
+
+\startbuffer[to]
+2 * (2 * b)
+\stopbuffer
+
+\transexample{β-reduction}{from}{to}
+
+\subsection{Application propagation}
This transformation is meant to propagate application expressions downwards
-into expressions as far as possible. In lambda calculus, this reduction
-is known as β-reduction, but it is of course only defined for
-applications of lambda abstractions. We extend this reduction to also
-work for the rest of core (case and let expressions).
+into expressions as far as possible. This allows partial applications inside
+expressions to become fully applied and exposes new transformation
+possibilities for other transformations (like β-reduction).
-For let expressions:
\starttrans
let binds in E) M
-----------------
let binds in E M
\stoptrans
-For case statements:
+% And an example
+\startbuffer[from]
+( let
+ val = 1
+ in
+ add val
+) 3
+\stopbuffer
+
+\startbuffer[to]
+let
+ val = 1
+in
+ add val 3
+\stopbuffer
+
+\transexample{Application propagation for a let expression}{from}{to}
+
\starttrans
(case x of
p1 -> E1
pn -> En M
\stoptrans
-For lambda expressions:
-\starttrans
-(λx.E) M
------------------
-E[M/x]
-\stoptrans
-
% And an example
\startbuffer[from]
-( let a = (case x of
- True -> id
- False -> neg
- ) 1
- b = (let y = 3 in add y) 2
- in
- (λz.add 1 z)
-) 3
+( case x of
+ True -> id
+ False -> neg
+) 1
\stopbuffer
\startbuffer[to]
-let a = case x of
- True -> id 1
- False -> neg 1
- b = let y = 3 in add y 2
-in
- add 1 3
+case x of
+ True -> id 1
+ False -> neg 1
\stopbuffer
-\transexample{Extended β-reduction}{from}{to}
+\transexample{Application propagation for a case expression}{from}{to}
\subsection{Let derecursification}
This transformation is meant to make lets non-recursive whenever possible.
Note that this transformation does not try to be smart when faced with
recursive lets, it will just leave the lets recursive (possibly joining a
recursive and non-recursive let into a single recursive let). The let
-rederursification transformation will do this instead.
+rederecursification transformation will do this instead.
\starttrans
letnonrec x = (let bindings in M) in N