\mainlanguage [en]
\setuppapersize[A4][A4]
% Define a custom typescript. We could also have put the \definetypeface's
% directly in the script, without a typescript, but I guess this is more
% elegant...
\starttypescript[Custom]
% This is a sans font that supports greek symbols
\definetypeface [Custom] [ss] [sans] [iwona] [default]
\definetypeface [Custom] [rm] [serif] [antykwa-torunska] [default]
\definetypeface [Custom] [tt] [mono] [modern] [default]
\definetypeface [Custom] [mm] [math] [modern] [default]
\stoptypescript
\usetypescript [Custom]
% Use our custom typeface
\switchtotypeface [Custom] [10pt]
% The function application operator, which expands to a space in math mode
\define[1]\expr{|#1|}
\define[2]\app{#1\;#2}
\define[2]\lam{λ#1 \xrightarrow #2}
\define[2]\letexpr{{\bold let}\;#1\;{\bold in}\;#2}
\define[2]\case{{\bold case}\;#1\;{\bold of}\;#2}
\define[2]\alt{#1 \xrightarrow #2}
\define[2]\bind{#1:#2}
\define[1]\where{{\bold where}\;#1}
% A transformation
\definefloat[transformation][transformations]
\define[2]\transform{
\startframedtext[width=\textwidth]
#2
\stopframedtext
}
\define\conclusion{\blackrule[height=0.5pt,depth=0pt,width=.5\textwidth]}
\define\nextrule{\vskip1cm}
\define[2]\transformold{
%\placetransformation[here]{#1}
\startframedtext[width=\textwidth]
\startformula \startalign
#2
\stopalign \stopformula
\stopframedtext
}
% A shortcut for italicized e.g.
\define[0]\eg{{\em e.g.}}
% Install the lambda calculus pretty-printer, as defined in pret-lam.lua.
\installprettytype [LAM] [LAM]
\definetyping[lambda][option=LAM,style=sans]
\definetype[lam][option=LAM,style=sans]
% An (invisible) frame to hold a lambda expression
\define[1]\lamframe{
% Put a frame around lambda expressions, so they can have multiple
% lines and still appear inline.
% The align=right option really does left-alignment, but without the
% program will end up on a single line. The strut=no option prevents a
% bunch of empty space at the start of the frame.
\framed[offset=0mm,location=middle,strut=no,align=right,frame=off]{#1}
}
\define[2]\transbuf{
% Make \typebuffer uses the LAM pretty printer and a sans-serif font
% Also prevent any extra spacing above and below caused by the default
% before=\blank and after=\blank.
\setuptyping[option=LAM,style=sans,before=,after=]
% Prevent the arrow from ending up below the first frame (a \framed
% at the start of a line defaults to using vmode).
\dontleavehmode
% Put the elements in frames, so they can have multiple lines and be
% middle-aligned
\lamframe{\typebuffer[#1]}
\lamframe{\Rightarrow}
\lamframe{\typebuffer[#2]}
% Reset the typing settings to their defaults
\setuptyping[option=none,style=\tttf]
}
% This is the same as \transbuf above, but it accepts text directly instead
% of through buffers. This only works for single lines, however.
\define[2]\trans{
\dontleavehmode
\lamframe{\lam{#1}}
\lamframe{\Rightarrow}
\lamframe{\lam{#2}}
}
% A helper to print a single example in the half the page width. The example
% text should be in a buffer whose name is given in an argument.
%
% The align=right option really does left-alignment, but without the program
% will end up on a single line. The strut=no option prevents a bunch of empty
% space at the start of the frame.
\define[1]\example{
\framed[offset=1mm,align=right,strut=no]{
\setuptyping[option=LAM,style=sans,before=,after=]
\typebuffer[#1]
\setuptyping[option=none,style=\tttf]
}
}
% A transformation example
\definefloat[example][examples]
\setupcaption[example][location=top] % Put captions on top
\define[3]\transexample{
\placeexample[here]{#1}
\startcombination[2*1]
{\example{#2}}{Original program}
{\example{#3}}{Transformed program}
\stopcombination
}
\define[3]\transexampleh{
% \placeexample[here]{#1}
% \startcombination[1*2]
% {\example{#2}}{Original program}
% {\example{#3}}{Transformed program}
% \stopcombination
}
% Define a custom description format for the builtinexprs below
\definedescription[builtindesc][headstyle=bold,style=normal,location=top]
\starttext
\title {Core transformations for hardware generation}
Matthijs Kooijman
\section{Introduction}
As a new approach to translating Core to VHDL, we investigate a number of
transformations on our Core program, which should bring the program into a
well-defined "canonical" form, which is subsequently trivial to translate to
VHDL.
The transformations as presented here are far from complete, but are meant as
an exploration of possible transformations. In the running example below, we
apply each of the transformations exactly once, in sequence. As will be
apparent from the end result, there will be additional transformations needed
to fully reach our goal, and some transformations must be applied more than
once. How exactly to (efficiently) do this, has not been investigated.
Lastly, I hope to be able to state a number of pre- and postconditions for
each transformation. If these can be proven for each transformation, and it
can be shown that there exists some ordering of transformations for which the
postcondition implies the canonical form, we can show that the transformations
do indeed transform any program (probably satisfying a number of
preconditions) to the canonical form.
\section{Goal}
The transformations described here have a well-defined goal: To bring the
program in a well-defined form that is directly translatable to hardware,
while fully preserving the semantics of the program.
This {\em canonical form} is again a Core program, but with a very specific
structure. A function in canonical form has nested lambda's at the top, which
produce a let expression. This let expression binds every function application
in the function and produces either a simple identifier or a tuple of
identifiers. Every bound value in the let expression is either a simple
function application or a case expression to extract a single element from a
tuple returned by a function.
An example of a program in canonical form would be:
\starttyping
-- All arguments are an inital lambda
\x c d ->
-- There is one let expression at the top level
let
-- Calling some other user-defined function.
s = foo x
-- Extracting result values from a tuple
a = case s of (a, b) -> a
b = case s of (a, b) -> b
-- Some builtin expressions
rh = add c d
rhh = sub d c
-- Conditional connections
rl = case b of
High -> rhh
Low -> d
r = case a of
High -> rh
Low -> rl
in
-- The actual result
r
\stoptyping
In this and all following programs, the following definitions are assumed to
be available:
\starttyping
data Bit = Low | High
foo :: Int -> (Bit, Bit)
add :: Int -> Int -> Int
sub :: Int -> Int -> Int
\stoptyping
When looking at such a program from a hardware perspective, the top level
lambda's define the input ports. The value produced by the let expression are
the output ports. Each function application bound by the let expression
defines a component instantiation, where the input and output ports are mapped
to local signals or arguments. The tuple extracting case expressions don't map
to
\subsection{Canonical form definition}
Formally, the canonical form is a core program obeying the following
constraints.
\startitemize[R,inmargin]
\item All top level binds must have the form $\expr{\bind{fun}{lamexpr}}$.
$fun$ is an identifier that will be bound as a global identifier.
\item A $lamexpr$ has the form $\expr{\lam{arg}{lamexpr}}$ or
$\expr{letexpr}$. $arg$ is an identifier which will be bound as an $argument$.
\item[letexpr] A $letexpr$ has the form $\expr{\letexpr{letbinds}{retexpr}}$.
\item $letbinds$ is a list with elements of the form
$\expr{\bind{res}{appexpr}}$ or $\expr{\bind{res}{builtinexpr}}$, where $res$ is
an identifier that will be bound as local identifier. The type of the bound
value must be a $hardware\;type$.
\item[builtinexpr] A $builtinexpr$ is an expression that can be mapped to an
equivalent VHDL expression. Since there are many supported forms for this,
these are defined in a separate table.
\item An $appexpr$ has the form $\expr{fun}$ or $\expr{\app{appexpr}{x}}$,
where $fun$ is a global identifier and $x$ is a local identifier.
\item[retexpr] A $retexpr$ has the form $\expr{x}$ or $\expr{tupexpr}$, where $x$ is a local identifier that is bound as an $argument$ or $result$. A $retexpr$ must
be of a $hardware\;type$.
\item A $tupexpr$ has the form $\expr{con}$ or $\expr{\app{tupexpr}{x}}$,
where $con$ is a tuple constructor ({\em e.g.} $(,)$ or $(,,,)$) and $x$ is
a local identifier.
\item A $hardware\;type$ is a type that can be directly translated to
hardware. This includes the types $Bit$, $SizedWord$, tuples containing
elements of $hardware\;type$s, and will include others. This explicitely
excludes function types.
\stopitemize
TODO: Say something about uniqueness of identifiers
\subsection{Builtin expressions}
A $builtinexpr$, as defined at \in[builtinexpr] can have any of the following forms.
\startitemize[m,inmargin]
\item
$tuple\_extract=\expr{\case{t}{\alt{\app{con}{x_0\;x_1\;..\;x_n}}{x_i}}}$,
where $t$ can be any local identifier, $con$ is a tuple constructor ({\em
e.g.} $(,)$ or $(,,,)$), $x_0$ to $x_n$ can be any identifier, and $x_i$ can
be any of $x_0$ to $x_n$. A case expression must have a $hardware\;type$.
\item TODO: Many more!
\stopitemize
\section{Transformation passes}
In this section we describe the actual transformations. Here we're using
mostly Core-like notation, with a few notable points.
\startitemize
\item A core expression (in contrast with a transformation function, for
example), is enclosed in pipes. For example, $\app{transform}{\expr{\lam{z}{z+1}}}$
is the transform function applied to a lambda core expression.
Note that this notation might not be consistently applied everywhere. In
particular where a non-core function is used inside a core expression, things
get slightly confusing.
\item A bind is written as $\expr{\bind{x}{expr}}$. This means binding the identifier
$x$ to the expression $expr$.
\item In the core examples, the layout rule from Haskell is loosely applied.
It should be evident what was meant from indentation, even though it might nog
be strictly correct.
\stopitemize
\subsection{Example}
In the descriptions of transformations below, the following (slightly
contrived) example program will be used as the running example. Note that this
the example for the canonical form given above is the same program, but in
canonical form.
\starttyping
\x ->
let s = foo x
in
case s of
(a, b) ->
case a of
High -> add
Low -> let
op' = case b of
High -> sub
Low -> \c d -> c
in
\c d -> op' d c
\stoptyping
\subsection{η-abstraction}
This transformation makes sure that all arguments of a function-typed
expression are named, by introducing lambda expressions. When combined with
β-reduction and function inlining below, all function-typed expressions should
be lambda abstractions or global identifiers.
\transform{η-abstraction}
{
\lam{E :: * -> *}
\lam{E} is not the first argument of an application.
\lam{E} is not a lambda abstraction.
\lam{x} is a variable that does not occur free in E.
\conclusion
\trans{E}{λx.E x}
}
\startbuffer[from]
foo = λa -> case a of
True -> λb.mul b b
False -> id
\stopbuffer
\startbuffer[to]
foo = λa.λx -> (case a of
True -> λb.mul b b
False -> λy.id y) x
\stopbuffer
\transexample{η-abstraction}{from}{to}
\subsection{Extended β-reduction}
This transformation is meant to propagate application expressions downwards
into expressions as far as possible. In lambda calculus, this reduction
is known as β-reduction, but it is of course only defined for
applications of lambda abstractions. We extend this reduction to also
work for the rest of core (case and let expressions).
\startbuffer[from]
(case x of
p1 -> E1
\vdots
pn -> En) M
\stopbuffer
\startbuffer[to]
case x of
p1 -> E1 M
\vdots
pn -> En M
\stopbuffer
\transform{Extended β-reduction}
{
\conclusion
\trans{(λx.E) M}{E[M/x]}
\nextrule
\conclusion
\trans{(let binds in E) M}{let binds in E M}
\nextrule
\conclusion
\transbuf{from}{to}
}
\startbuffer[from]
let a = (case x of
True -> id
False -> neg
) 1
b = (let y = 3 in add y) 2
in
(λz.add 1 z) 3
\stopbuffer
\startbuffer[to]
let a = case x of
True -> id 1
False -> neg 1
b = let y = 3 in add y 2
in
add 1 3
\stopbuffer
\transexample{Extended β-reduction}{from}{to}
\subsection{Argument simplification}
The transforms in this section deal with simplifying application
arguments into normal form. The goal here is to:
\startitemize
\item Make all arguments of user-defined functions (\eg, of which
we have a function body) simple variable references of a runtime
representable type.
\item Make all arguments of builtin functions either:
\startitemize
\item A type argument.
\item A dictionary argument.
\item A type level expression.
\item A variable reference of a runtime representable type.
\item A variable reference or partial application of a function type.
\stopitemize
\stopitemize
\subsubsection{User-defined functions}
We can divide the arguments of a user-defined function into two
categories:
\startitemize
\item Runtime representable typed arguments (\eg bits or vectors).
\item Non-runtime representable typed arguments.
\stopitemize
The next two transformations will deal with each of these two kinds of argument respectively.
\subsubsubsection{Argument extraction}
This transform deals with arguments to user-defined functions that
are of a runtime representable type. These arguments can be preserved in
the program, since they can be translated to input ports later on.
However, since we can only connect signals to input ports, these
arguments must be reduced to simple variables (for which signals will be
produced).
TODO: It seems we can map an expression to a port, not only a signal.
Perhaps this makes this transformation not needed?
TODO: Say something about dataconstructors (without arguments, like True
or False), which are variable references of a runtime representable
type, but do not result in a signal.
To reduce a complex expression to a simple variable reference, we create
a new let expression around the application, which binds the complex
expression to a new variable. The original function is then applied to
this variable.
\transform{Argument extract}
{
\lam{X} is a (partial application of) a user-defined function
\lam{Y} is of a hardware representable type
\lam{Y} is not a variable referene
\conclusion
\trans{X Y}{let z = Y in X z}
}
\subsubsubsection{Argument propagation}
This transform deals with arguments to user-defined functions that are
not representable at runtime. This means these arguments cannot be
preserved in the final form and most be {\em propagated}.
Propagation means to create a specialized version of the called
function, with the propagated argument already filled in. As a simple
example, in the following program:
\startlambda
f = λa.λb.a + b
inc = λa.f a 1
\stoplambda
we could {\em propagate} the constant argument 1, with the following
result:
\startlambda
f' = λa.a + 1
inc = λa.f' a
\stoplambda
Special care must be taken when the to-be-propagated expression has any
free variables. If this is the case, the original argument should not be
removed alltogether, but replaced by all the free variables of the
expression. In this way, the original expression can still be evaluated
inside the new function. Also, this brings us closer to our goal: All
these free variables will be simple variable references.
To prevent us from propagating the same argument over and over, a simple
local variable reference is not propagated (since is has exactly one
free variable, itself, we would only replace that argument with itself).
This shows that any free local variables that are not runtime representable
cannot be brought into normal form by this transform. We rely on an
inlining transformation to replace such a variable with an expression we
can propagate again.
TODO: Move these definitions somewhere sensible.
Definition: A global variable is any variable that is bound at the
top level of a program. A local variable is any other variable.
Definition: A hardware representable type is a type that we can generate
a signal for in hardware. For example, a bit, a vector of bits, a 32 bit
unsigned word, etc. Types that are not runtime representable notably
include (but are not limited to): Types, dictionaries, functions.
Definition: A builtin function is a function for which a builtin
hardware translation is available, because its actual definition is not
translatable. A user-defined function is any other function.
\transform{Argument propagation}
{
\lam{x} is a global variable, bound to a user-defined function
\lam{x = E}
\lam{Y_i} is not of a runtime representable type
\lam{Y_i} is not a local variable reference
\conclusion
\lam{f0 ... fm} = free local vars of \lam{Y_i}
\lam{x'} is a new global variable
\lam{x' = λy0 ... yi-1 f0 ... fm yi+1 ... yn . E y0 ... yi-1 Yi yi+1 ... yn}
\trans{x Y0 ... Yi ... Yn}{x' y0 ... yi-1 f0 ... fm Yi+1 ... Yn}
}
TODO: The above definition looks too complicated... Can we find
something more concise?
\subsubsection{Builtin functions}
argument categories:
function typed
type / dictionary / other
hardware representable
TODO
\subsection{Introducing main scope}
This transformation is meant to introduce a single let expression that will be
the "main scope". This is the let expression as described under requirement
\ref[letexpr]. This let expression will contain only a single binding, which
binds the original expression to some identifier and then evalutes to just
this identifier (to comply with requirement \in[retexpr]).
Formally, we can describe the transformation as follows.
\transformold{Main scope introduction}
{
\NC \app{transform}{\expr{\bind{f}{expr}}} \NC = \expr{\bind{f}{\app{transform'(expr)}}}\NR
\NR
\NC \app{transform'}{\expr{\lam{v}{expr}}} \NC = \expr{\lam{v}{\app{transform'}{expr}}}\NR
\NC \app{transform'}{\expr{expr}} \NC = \expr{\letexpr{\bind{x}{expr}}{x}} \NR
}
When applying this transformation to our running example, we get the following
program.
\starttyping
\x c d ->
let r = (let s = foo x
in
case s of
(a, b) ->
case a of
High -> add c d
Low -> let
op' = case b of
High -> sub
Low -> \c d -> c
in
op' d c
)
in
r
\stoptyping
\subsection{Scope flattening}
This transformation tries to flatten the topmost let expression in a bind,
{\em i.e.}, bind all identifiers in the same scope, and bind them to simple
expressions only (so simplify deeply nested expressions).
Formally, we can describe the transformation as follows.
\transformold{Main scope introduction} { \NC \app{transform}{\expr{\bind{f}{expr}}} \NC = \expr{\bind{f}{\app{transform'(expr)}}}\NR
\NR
\NC \app{transform'}{\expr{\lam{v}{expr}}} \NC = \expr{\lam{v}{\app{transform'}{expr}}}\NR
\NC \app{transform'}{\expr{\letexpr{binds}{expr}}} \NC = \expr{\letexpr{\app{concat . map . flatten}{binds}}{expr}}\NR
\NR
\NC \app{flatten}{\expr{\bind{x}{\letexpr{binds}{expr}}}} \NC = (\app{concat . map . flatten}{binds}) \cup \{\app{flatten}{\expr{\bind{x}{expr}}}\}\NR
\NC \app{flatten}{\expr{\bind{x}{\case{s}{alts}}}} \NC = \app{concat}{binds'} \cup \{\bind{x}{\case{s}{alts'}}\}\NR
\NC \NC \where{(binds', alts')=\app{unzip.map.(flattenalt s)}{alts}}\NR
\NC \app{\app{flattenalt}{s}}{\expr{\alt{\app{con}{x_0\;..\;x_n}}{expr}}} \NC = (extracts \cup \{\app{flatten}{bind}\}, alt)\NR
\NC \NC \where{extracts =\{\expr{\case{s}{\alt{\app{con}{x_0\;..\;x_n}}{x_0}}},}\NR
\NC \NC \;..\;,\expr{\case{s}{\alt{\app{con}{x_0\;..\;x_n}}{x_n}}}\} \NR
\NC \NC bind = \expr{\bind{y}{expr}}\NR
\NC \NC alt = \expr{\alt{\app{con}{\_\;..\;\_}}{y}}\NR
}
When applying this transformation to our running example, we get the following
program.
\starttyping
\x c d ->
let s = foo x
r = case s of
(_, _) -> y
a = case s of (a, b) -> a
b = case s of (a, b) -> b
y = case a of
High -> g
Low -> h
g = add c d
h = op' d c
op' = case b of
High -> i
Low -> j
i = sub
j = \c d -> c
in
r
\stoptyping
\subsection{More transformations}
As noted before, the above transformations are not complete. Other needed
transformations include:
\startitemize
\item Inlining of local identifiers with a function type. Since these cannot
be represented in hardware directly, they must be transformed into something
else. Inlining them should always be possible without loss of semantics (TODO:
How true is this?) and can expose new possibilities for other transformations
passes (such as application propagation when inlining {\tt j} above).
\item A variation on inlining local identifiers is the propagation of
function arguments with a function type. This will probably be initiated when
transforming the caller (and not the callee), but it will also deal with
identifiers with a function type that are unrepresentable in hardware.
Special care must be taken here, since the expression that is propagated into
the callee comes from a different scope. The function typed argument must thus
be replaced by any identifiers from the callers scope that the propagated
expression uses.
Note that propagating an argument will change both a function's interface and
implementation. For this to work, a new function should be created instead of
modifying the original function, so any other callers will not be broken.
\item Something similar should happen with return values with function types.
\item Polymorphism must be removed from all user-defined functions. This is
again similar to propagation function typed arguments, since this will also
create duplicates of functions (for a specific type). This is an operation
that is commonly known as "specialization" and already happens in GHC (since
non-polymorph functions can be a lot faster than generic ones).
\item More builtin expressions should be added and these should be evaluated
by the compiler. For example, map, fold, +.
\stopitemize
Initial example
\startlambda
λx.
let s = foo x
in
case s of
(a, b) ->
case a of
High -> add
Low -> let
op' = case b of
High -> sub
Low -> λc.λd.c
in
λc.λd.op' d c
\stoplambda
After top-level η-abstraction:
\startlambda
λx.λc.λd.
(let s = foo x
in
case s of
(a, b) ->
case a of
High -> add
Low -> let
op' = case b of
High -> sub
Low -> λc.λd.c
in
λc.λd.op' d c
) c d
\stoplambda
After (extended) β-reduction:
\startlambda
λx.λc.λd.
let s = foo x
in
case s of
(a, b) ->
case a of
High -> add c d
Low -> let
op' = case b of
High -> sub
Low -> λc.λd.c
in
op' d c
\stoplambda
After return value extraction:
\startlambda
λx.λc.λd.
let s = foo x
r = case s of
(a, b) ->
case a of
High -> add c d
Low -> let
op' = case b of
High -> sub
Low -> λc.λd.c
in
op' d c
in
r
\stoplambda
Scrutinee simplification does not apply.
After case binder wildening:
\startlambda
λx.λc.λd.
let s = foo x
a = case s of (a, _) -> a
b = case s of (_, b) -> b
r = case s of (_, _) ->
case a of
High -> add c d
Low -> let op' = case b of
High -> sub
Low -> λc.λd.c
in
op' d c
in
r
\stoplambda
After case value simplification
\startlambda
λx.λc.λd.
let s = foo x
a = case s of (a, _) -> a
b = case s of (_, b) -> b
r = case s of (_, _) -> r'
rh = add c d
rl = let rll = λc.λd.c
op' = case b of
High -> sub
Low -> rll
in
op' d c
r' = case a of
High -> rh
Low -> rl
in
r
\stoplambda
After let flattening:
\startlambda
λx.λc.λd.
let s = foo x
a = case s of (a, _) -> a
b = case s of (_, b) -> b
r = case s of (_, _) -> r'
rh = add c d
rl = op' d c
rll = λc.λd.c
op' = case b of
High -> sub
Low -> rll
r' = case a of
High -> rh
Low -> rl
in
r
\stoplambda
After function inlining:
\startlambda
λx.λc.λd.
let s = foo x
a = case s of (a, _) -> a
b = case s of (_, b) -> b
r = case s of (_, _) -> r'
rh = add c d
rl = (case b of
High -> sub
Low -> λc.λd.c) d c
r' = case a of
High -> rh
Low -> rl
in
r
\stoplambda
After (extended) β-reduction again:
\startlambda
λx.λc.λd.
let s = foo x
a = case s of (a, _) -> a
b = case s of (_, b) -> b
r = case s of (_, _) -> r'
rh = add c d
rl = case b of
High -> sub d c
Low -> d
r' = case a of
High -> rh
Low -> rl
in
r
\stoplambda
After case value simplification again:
\startlambda
λx.λc.λd.
let s = foo x
a = case s of (a, _) -> a
b = case s of (_, b) -> b
r = case s of (_, _) -> r'
rh = add c d
rlh = sub d c
rl = case b of
High -> rlh
Low -> d
r' = case a of
High -> rh
Low -> rl
in
r
\stoplambda
After case removal:
\startlambda
λx.λc.λd.
let s = foo x
a = case s of (a, _) -> a
b = case s of (_, b) -> b
r = r'
rh = add c d
rlh = sub d c
rl = case b of
High -> rlh
Low -> d
r' = case a of
High -> rh
Low -> rl
in
r
\stoplambda
After let bind removal:
\startlambda
λx.λc.λd.
let s = foo x
a = case s of (a, _) -> a
b = case s of (_, b) -> b
rh = add c d
rlh = sub d c
rl = case b of
High -> rlh
Low -> d
r' = case a of
High -> rh
Low -> rl
in
r'
\stoplambda
Application simplification is not applicable.
\stoptext