\stopframedtext
}
-% A shortcut for italicized e.g.
+% A shortcut for italicized e.g. and i.e.
\define[0]\eg{{\em e.g.}}
+\define[0]\ie{{\em i.e.}}
+
+\definedescription
+ [desc]
+ [location=hanging,hang=20,width=broad]
+ %command=\hskip-1cm,margin=1cm]
% Install the lambda calculus pretty-printer, as defined in pret-lam.lua.
\installprettytype [LAM] [LAM]
\definetyping[lambda][option=LAM,style=sans]
\definetype[lam][option=LAM,style=sans]
+\installprettytype [TRANS] [TRANS]
+\definetyping[trans][option=TRANS,style=normal,before=,after=]
+
% An (invisible) frame to hold a lambda expression
\define[1]\lamframe{
% Put a frame around lambda expressions, so they can have multiple
\section{Introduction}
As a new approach to translating Core to VHDL, we investigate a number of
-transformations on our Core program, which should bring the program into a
-well-defined "canonical" form, which is subsequently trivial to translate to
-VHDL.
-
-The transformations as presented here are far from complete, but are meant as
-an exploration of possible transformations. In the running example below, we
-apply each of the transformations exactly once, in sequence. As will be
-apparent from the end result, there will be additional transformations needed
-to fully reach our goal, and some transformations must be applied more than
-once. How exactly to (efficiently) do this, has not been investigated.
-
-Lastly, I hope to be able to state a number of pre- and postconditions for
-each transformation. If these can be proven for each transformation, and it
-can be shown that there exists some ordering of transformations for which the
-postcondition implies the canonical form, we can show that the transformations
-do indeed transform any program (probably satisfying a number of
-preconditions) to the canonical form.
+transforms on our Core program, which should bring the program into a
+well-defined {\em normal} form, which is subsequently trivial to
+translate to VHDL.
+
+The transforms as presented here are far from complete, but are meant as
+an exploration of possible transformations.
\section{Goal}
The transformations described here have a well-defined goal: To bring the
program in a well-defined form that is directly translatable to hardware,
while fully preserving the semantics of the program.
-This {\em canonical form} is again a Core program, but with a very specific
-structure. A function in canonical form has nested lambda's at the top, which
+This {\em normal form} is again a Core program, but with a very specific
+structure. A function in normal form has nested lambda's at the top, which
produce a let expression. This let expression binds every function application
-in the function and produces either a simple identifier or a tuple of
-identifiers. Every bound value in the let expression is either a simple
-function application or a case expression to extract a single element from a
-tuple returned by a function.
+in the function and produces a simple identifier. Every bound value in
+the let expression is either a simple function application or a case
+expression to extract a single element from a tuple returned by a
+function.
An example of a program in canonical form would be:
-\starttyping
+\startlambda
-- All arguments are an inital lambda
- \x c d ->
+ λx.λc.λd.
-- There is one let expression at the top level
let
-- Calling some other user-defined function.
in
-- The actual result
r
-\stoptyping
-
-In this and all following programs, the following definitions are assumed to
-be available:
-
-\starttyping
-data Bit = Low | High
-
-foo :: Int -> (Bit, Bit)
-add :: Int -> Int -> Int
-sub :: Int -> Int -> Int
-\stoptyping
+\stoplambda
When looking at such a program from a hardware perspective, the top level
-lambda's define the input ports. The value produced by the let expression are
-the output ports. Each function application bound by the let expression
-defines a component instantiation, where the input and output ports are mapped
-to local signals or arguments. The tuple extracting case expressions don't map
-to
-
-\subsection{Canonical form definition}
-Formally, the canonical form is a core program obeying the following
-constraints.
+lambda's define the input ports. The value produced by the let expression is
+the output port. Most function applications bound by the let expression
+define a component instantiation, where the input and output ports are mapped
+to local signals or arguments. Some of the others use a builtin
+construction (\eg the \lam{case} statement) or call a builtin function
+(\eg \lam{add} or \lam{sub}). For these, a hardcoded VHDL translation is
+available.
+
+\subsection{Normal definition}
+Formally, the normal form is a core program obeying the following
+constraints. TODO: Update this section, this is probably not completely
+accurate or relevant anymore.
\startitemize[R,inmargin]
\item All top level binds must have the form $\expr{\bind{fun}{lamexpr}}$.
\item TODO: Many more!
\stopitemize
-\section{Transformation passes}
+\section{Transform passes}
-In this section we describe the actual transformations. Here we're using
-mostly Core-like notation, with a few notable points.
+In this section we describe the actual transforms. Here we're using
+the core language in a notation that resembles lambda calculus.
-\startitemize
-\item A core expression (in contrast with a transformation function, for
-example), is enclosed in pipes. For example, $\app{transform}{\expr{\lam{z}{z+1}}}$
-is the transform function applied to a lambda core expression.
-
-Note that this notation might not be consistently applied everywhere. In
-particular where a non-core function is used inside a core expression, things
-get slightly confusing.
-\item A bind is written as $\expr{\bind{x}{expr}}$. This means binding the identifier
-$x$ to the expression $expr$.
-\item In the core examples, the layout rule from Haskell is loosely applied.
-It should be evident what was meant from indentation, even though it might nog
-be strictly correct.
-\stopitemize
+Each of these transforms is meant to be applied to every (sub)expression
+in a program, for as long as it applies. Only when none of the
+expressions can be applied anymore, the program is in normal form. We
+hope to be able to prove that this form will obey all of the constraints
+defined above, but this has yet to happen (though it seems likely that
+it will).
-\subsection{Example}
-In the descriptions of transformations below, the following (slightly
-contrived) example program will be used as the running example. Note that this
-the example for the canonical form given above is the same program, but in
-canonical form.
+Each of the transforms will be described informally first, explaining
+the need for and goal of the transform. Then, a formal definition is
+given, using a familiar syntax from the world of logic. Each transform
+is specified as a number of conditions (above the horizontal line) and a
+number of conclusions (below the horizontal line). The details of using
+this notation are still a bit fuzzy, so comments are welcom.
-\starttyping
- \x ->
- let s = foo x
- in
- case s of
- (a, b) ->
- case a of
- High -> add
- Low -> let
- op' = case b of
- High -> sub
- Low -> \c d -> c
- in
- \c d -> op' d c
-\stoptyping
+TODO: Formally describe the "apply to every (sub)expression" in terms of
+rules with full transformations in the conditions.
\subsection{η-abstraction}
This transformation makes sure that all arguments of a function-typed
β-reduction and function inlining below, all function-typed expressions should
be lambda abstractions or global identifiers.
-\transform{η-abstraction}
-{
-\lam{E :: * -> *}
-
-\lam{E} is not the first argument of an application.
-
-\lam{E} is not a lambda abstraction.
-
-\lam{x} is a variable that does not occur free in E.
-
-\conclusion
-
-\trans{E}{λx.E x}
-}
+\starttrans
+E \lam{E :: * -> *}
+-------------- \lam{E} is not the first argument of an application.
+λx.E x \lam{E} is not a lambda abstraction.
+ \lam{x} is a variable that does not occur free in \lam{E}.
+\stoptrans
\startbuffer[from]
foo = λa -> case a of
\stopitemize
\stopitemize
-\subsubsection{User-defined functions}
-We can divide the arguments of a user-defined function into two
-categories:
+When looking at the arguments of a user-defined function, we can
+divide them into two categories:
\startitemize
- \item Runtime representable typed arguments (\eg bits or vectors).
+ \item Arguments with a runtime representable type (\eg bits or vectors).
+
+ These arguments can be preserved in the program, since they can
+ be translated to input ports later on. However, since we can
+ only connect signals to input ports, these arguments must be
+ reduced to simple variables (for which signals will be
+ produced). This is taken care of by the argument extraction
+ transform.
\item Non-runtime representable typed arguments.
+
+ These arguments cannot be preserved in the program, since we
+ cannot represent them as input or output ports in the resulting
+ VHDL. To remove them, we create a specialized version of the
+ called function with these arguments filled in. This is done by
+ the argument propagation transform.
\stopitemize
-The next two transformations will deal with each of these two kinds of argument respectively.
+When looking at the arguments of a builtin function, we can divide them
+into categories:
-\subsubsubsection{Argument extraction}
-This transform deals with arguments to user-defined functions that
-are of a runtime representable type. These arguments can be preserved in
-the program, since they can be translated to input ports later on.
-However, since we can only connect signals to input ports, these
-arguments must be reduced to simple variables (for which signals will be
-produced).
+\startitemize
+ \item Arguments with a runtime representable type.
+
+ As we have seen with user-defined functions, these arguments can
+ always be reduced to a simple variable reference, by the
+ argument extraction transform. Performing this transform for
+ builtin functions as well, means that the translation of builtin
+ functions can be limited to signal references, instead of
+ needing to support all possible expressions.
+
+ \item Arguments with a function type.
+
+ These arguments are functions passed to higher order builtins,
+ like \lam{map} and \lam{foldl}. Since implementing these
+ functions for arbitrary function-typed expressions (\eg, lambda
+ expressions) is rather comlex, we reduce these arguments to
+ (partial applications of) global functions.
+
+ We can still support arbitrary expressions from the user code,
+ by creating a new global function containing that expression.
+ This way, we can simply replace the argument with a reference to
+ that new function. However, since the expression can contain any
+ number of free variables we also have to include partial
+ applications in our normal form.
+
+ This category of arguments is handled by the function extraction
+ transform.
+ \item Other unrepresentable arguments.
+
+ These arguments can take a few different forms:
+ \startdesc{Type arguments}
+ In the core language, type arguments can only take a single
+ form: A type wrapped in the Type constructor. Also, there is
+ nothing that can be done with type expressions, except for
+ applying functions to them, so we can simply leave type
+ arguments as they are.
+ \stopdesc
+ \startdesc{Dictionary arguments}
+ In the core language, dictionary arguments are used to find
+ operations operating on one of the type arguments (mostly for
+ finding class methods). Since we will not actually evaluatie
+ the function body for builtin functions and can generate
+ code for builtin functions by just looking at the type
+ arguments, these arguments can be ignored and left as they
+ are.
+ \stopdesc
+ \startdesc{Type level arguments}
+ Sometimes, we want to pass a value to a builtin function, but
+ we need to know the value at compile time. Additionally, the
+ value has an impact on the type of the function. This is
+ encoded using type-level values, where the actual value of the
+ argument is not important, but the type encodes some integer,
+ for example. Since the value is not important, the actual form
+ of the expression does not matter either and we can leave
+ these arguments as they are.
+ \stopdesc
+ \startdesc{Other arguments}
+ Technically, there is still a wide array of arguments that can
+ be passed, but does not fall into any of the above categories.
+ However, none of the supported builtin functions requires such
+ an argument. This leaves use with passing unsupported types to
+ a function, such as calling \lam{head} on a list of functions.
+
+ In these cases, it would be impossible to generate hardware
+ for such a function call anyway, so we can ignore these
+ arguments.
+
+ The only way to generate hardware for builtin functions with
+ arguments like these, is to expand the function call into an
+ equivalent core expression (\eg, expand map into a series of
+ function applications). But for now, we choose to simply not
+ support expressions like these.
+ \stopdesc
+
+ From the above, we can conclude that we can simply ignore these
+ other unrepresentable arguments and focus on the first two
+ categories instead.
+\stopitemize
+
+\subsubsection{Argument extraction}
+This transform deals with arguments to functions that
+are of a runtime representable type.
TODO: It seems we can map an expression to a port, not only a signal.
Perhaps this makes this transformation not needed?
\transform{Argument extract}
{
-\lam{X} is a (partial application of) a user-defined function
-
\lam{Y} is of a hardware representable type
\lam{Y} is not a variable referene
\trans{X Y}{let z = Y in X z}
}
-\subsubsubsection{Argument propagation}
+
+\subsubsection{Function extraction}
+This transform deals with function-typed arguments to builtin functions.
+Since these arguments cannot be propagated, we choose to extract them
+into a new global function instead.
+
+Any free variables occuring in the extracted arguments will become
+parameters to the new global function. The original argument is replaced
+with a reference to the new function, applied to any free variables from
+the original argument.
+
+\transform{Function extraction}
+{
+\lam{X} is a (partial application of) a builtin function
+
+\lam{Y} is not an application
+
+\lam{Y} is not a variable reference
+
+\conclusion
+
+\lam{f0 ... fm} = free local vars of \lam{Y}
+
+\lam{y} is a new global variable
+
+\lam{y = λf0 ... fn.Y}
+
+\trans{X Y}{X (y f0 ... fn)}
+}
+
+\subsubsection{Argument propagation}
This transform deals with arguments to user-defined functions that are
not representable at runtime. This means these arguments cannot be
preserved in the final form and most be {\em propagated}.
hardware translation is available, because its actual definition is not
translatable. A user-defined function is any other function.
+\starttrans
+x = E
+~
+x Y0 ... Yi ... Yn \lam{Y_i} is not of a runtime representable type
+--------------------------------------------- \lam{Y_i} is not a local variable reference
+x' = λy0 ... yi-1 f0 ... fm yi+1 ... yn . \lam{f0 ... fm} = free local vars of \lam{Y_i}
+ E y0 ... yi-1 Yi yi+1 ... yn
+~
+x' y0 ... yi-1 f0 ... fm Yi+1 ... Yn
+\stoptrans
+
\transform{Argument propagation}
{
\lam{x} is a global variable, bound to a user-defined function
TODO: The above definition looks too complicated... Can we find
something more concise?
-\subsubsection{Builtin functions}
-
-argument categories:
-
-function typed
-
-type / dictionary / other
-
-hardware representable
-
-TODO
-
-\subsection{Introducing main scope}
-This transformation is meant to introduce a single let expression that will be
-the "main scope". This is the let expression as described under requirement
-\ref[letexpr]. This let expression will contain only a single binding, which
-binds the original expression to some identifier and then evalutes to just
-this identifier (to comply with requirement \in[retexpr]).
-
-Formally, we can describe the transformation as follows.
-
-\transformold{Main scope introduction}
-{
-\NC \app{transform}{\expr{\bind{f}{expr}}} \NC = \expr{\bind{f}{\app{transform'(expr)}}}\NR
-\NR
-\NC \app{transform'}{\expr{\lam{v}{expr}}} \NC = \expr{\lam{v}{\app{transform'}{expr}}}\NR
-\NC \app{transform'}{\expr{expr}} \NC = \expr{\letexpr{\bind{x}{expr}}{x}} \NR
-}
-
-When applying this transformation to our running example, we get the following
-program.
-
-\starttyping
- \x c d ->
- let r = (let s = foo x
- in
- case s of
- (a, b) ->
- case a of
- High -> add c d
- Low -> let
- op' = case b of
- High -> sub
- Low -> \c d -> c
- in
- op' d c
- )
- in
- r
-\stoptyping
-
-\subsection{Scope flattening}
-This transformation tries to flatten the topmost let expression in a bind,
-{\em i.e.}, bind all identifiers in the same scope, and bind them to simple
-expressions only (so simplify deeply nested expressions).
-
-Formally, we can describe the transformation as follows.
-
-\transformold{Main scope introduction} { \NC \app{transform}{\expr{\bind{f}{expr}}} \NC = \expr{\bind{f}{\app{transform'(expr)}}}\NR
-\NR
-\NC \app{transform'}{\expr{\lam{v}{expr}}} \NC = \expr{\lam{v}{\app{transform'}{expr}}}\NR
-\NC \app{transform'}{\expr{\letexpr{binds}{expr}}} \NC = \expr{\letexpr{\app{concat . map . flatten}{binds}}{expr}}\NR
-\NR
-\NC \app{flatten}{\expr{\bind{x}{\letexpr{binds}{expr}}}} \NC = (\app{concat . map . flatten}{binds}) \cup \{\app{flatten}{\expr{\bind{x}{expr}}}\}\NR
-\NC \app{flatten}{\expr{\bind{x}{\case{s}{alts}}}} \NC = \app{concat}{binds'} \cup \{\bind{x}{\case{s}{alts'}}\}\NR
-\NC \NC \where{(binds', alts')=\app{unzip.map.(flattenalt s)}{alts}}\NR
-\NC \app{\app{flattenalt}{s}}{\expr{\alt{\app{con}{x_0\;..\;x_n}}{expr}}} \NC = (extracts \cup \{\app{flatten}{bind}\}, alt)\NR
-\NC \NC \where{extracts =\{\expr{\case{s}{\alt{\app{con}{x_0\;..\;x_n}}{x_0}}},}\NR
-\NC \NC \;..\;,\expr{\case{s}{\alt{\app{con}{x_0\;..\;x_n}}{x_n}}}\} \NR
-\NC \NC bind = \expr{\bind{y}{expr}}\NR
-\NC \NC alt = \expr{\alt{\app{con}{\_\;..\;\_}}{y}}\NR
-}
-
-When applying this transformation to our running example, we get the following
-program.
-
-\starttyping
- \x c d ->
- let s = foo x
- r = case s of
- (_, _) -> y
- a = case s of (a, b) -> a
- b = case s of (a, b) -> b
- y = case a of
- High -> g
- Low -> h
- g = add c d
- h = op' d c
- op' = case b of
- High -> i
- Low -> j
- i = sub
- j = \c d -> c
- in
- r
-\stoptyping
-
-\subsection{More transformations}
-As noted before, the above transformations are not complete. Other needed
-transformations include:
-\startitemize
-\item Inlining of local identifiers with a function type. Since these cannot
-be represented in hardware directly, they must be transformed into something
-else. Inlining them should always be possible without loss of semantics (TODO:
-How true is this?) and can expose new possibilities for other transformations
-passes (such as application propagation when inlining {\tt j} above).
-\item A variation on inlining local identifiers is the propagation of
-function arguments with a function type. This will probably be initiated when
-transforming the caller (and not the callee), but it will also deal with
-identifiers with a function type that are unrepresentable in hardware.
-
-Special care must be taken here, since the expression that is propagated into
-the callee comes from a different scope. The function typed argument must thus
-be replaced by any identifiers from the callers scope that the propagated
-expression uses.
-
-Note that propagating an argument will change both a function's interface and
-implementation. For this to work, a new function should be created instead of
-modifying the original function, so any other callers will not be broken.
-\item Something similar should happen with return values with function types.
-\item Polymorphism must be removed from all user-defined functions. This is
-again similar to propagation function typed arguments, since this will also
-create duplicates of functions (for a specific type). This is an operation
-that is commonly known as "specialization" and already happens in GHC (since
-non-polymorph functions can be a lot faster than generic ones).
-\item More builtin expressions should be added and these should be evaluated
-by the compiler. For example, map, fold, +.
-\stopitemize
-
-Initial example
+\subsection{Cast propagation}
+This transform pushes casts down into the expression as far as possible.
+\subsection{Let recursification}
+This transform makes all lets recursive.
+\subsection{Let simplification}
+This transform makes the result value of all let expressions a simple
+variable reference.
+\subsection{Let flattening}
+This transform turns two nested lets (\lam{let x = (let ... in ...) in
+...}) into a single let.
+\subsection{Simple let binding removal}
+This transforms inlines simple let bindings (\eg a = b).
+\subsection{Function inlining}
+This transform inlines let bindings of a funtion type. TODO: This should
+be generelized to anything that is non representable at runtime, or
+something like that.
+\subsection{Scrutinee simplification}
+This transform ensures that the scrutinee of a case expression is always
+a simple variable reference.
+\subsection{Case binder wildening}
+This transform replaces all binders of a each case alternative with a
+wild binder (\ie, one that is never referred to). This will possibly
+introduce a number of new "selector" case statements, that only select
+one element from an algebraic datatype and bind it to a variable.
+\subsection{Case value simplification}
+This transform simplifies the result value of each case alternative by
+binding the value in a let expression and replacing the value by a
+simple variable reference.
+\subsection{Case removal}
+This transform removes any case statements with a single alternative and
+only wild binders.
+
+\subsection{Example sequence}
+
+This section lists an example expression, with a sequence of transforms
+applied to it. The exact transforms given here probably don't exactly
+match the transforms given above anymore, but perhaps this can clarify
+the big picture a bit.
+
+TODO: Update or remove this section.
\startlambda
λx.