\mainlanguage [en] \setuppapersize[A4][A4] % Define a custom typescript. We could also have put the \definetypeface's % directly in the script, without a typescript, but I guess this is more % elegant... \starttypescript[Custom] % This is a sans font that supports greek symbols \definetypeface [Custom] [ss] [sans] [iwona] [default] \definetypeface [Custom] [rm] [serif] [antykwa-torunska] [default] \definetypeface [Custom] [tt] [mono] [modern] [default] \definetypeface [Custom] [mm] [math] [modern] [default] \stoptypescript \usetypescript [Custom] % Use our custom typeface \switchtotypeface [Custom] [10pt] % The function application operator, which expands to a space in math mode \define[1]\expr{|#1|} \define[2]\app{#1\;#2} \define[2]\lam{λ#1 \xrightarrow #2} \define[2]\letexpr{{\bold let}\;#1\;{\bold in}\;#2} \define[2]\case{{\bold case}\;#1\;{\bold of}\;#2} \define[2]\alt{#1 \xrightarrow #2} \define[2]\bind{#1:#2} \define[1]\where{{\bold where}\;#1} % A transformation \definefloat[transformation][transformations] \define[2]\transform{ %\placetransformation[here]{#1} \startframedtext[width=\textwidth] \startformula \startalign #2 \stopalign \stopformula \stopframedtext } % Install the lambda calculus pretty-printer, as defined in pret-lam.lua. \installprettytype [LAM] [LAM] % An (invisible) frame to hold a lambda expression \define[1]\lamframe{ % Put a frame around lambda expressions, so they can have multiple % lines and still appear inline. % The align=right option really does left-alignment, but without the % program will end up on a single line. The strut=no option prevents a % bunch of empty space at the start of the frame. \framed[offset=0mm,location=middle,strut=no,align=right,frame=off]{#1} } \define[2]\trans{ % Make \typebuffer uses the LAM pretty printer and a sans-serif font % Also prevent any extra spacing above and below caused by the default % before=\blank and after=\blank. \setuptyping[option=LAM,style=sans,before=,after=] % Prevent the arrow from ending up below the first frame (a \framed % at the start of a line defaults to using vmode). \dontleavehmode % Put the elements in frames, so they can have multiple lines and be % middle-aligned \lamframe{\typebuffer[#1]} \lamframe{\Rightarrow} \lamframe{\typebuffer[#2]} % Reset the typing settings to their defaults \setuptyping[option=none,style=\tttf] } % A helper to print a single example in the half the page width. The example % text should be in a buffer whose name is given in an argument. % % The align=right option really does left-alignment, but without the program % will end up on a single line. The strut=no option prevents a bunch of empty % space at the start of the frame. \define[1]\example{\framed[frameoffset=2mm,align=right,strut=no]{\typebuffer[#1]}} % A transformation example \definefloat[example][examples] \setupcaption[example][location=top] % Put captions on top \define[3]\transexample{ \placeexample[here]{#1} \startcombination[2*1] {\example{#2}}{Original program} {\example{#3}}{Transformed program} \stopcombination } \define[3]\transexampleh{ \placeexample[here]{#1} \startcombination[1*2] {\example{#2}}{Original program} {\example{#3}}{Transformed program} \stopcombination } % Define a custom description format for the builtinexprs below \definedescription[builtindesc][headstyle=bold,style=normal,location=top] \starttext \title {Core transformations for hardware generation} Matthijs Kooijman \section{Introduction} As a new approach to translating Core to VHDL, we investigate a number of transformations on our Core program, which should bring the program into a well-defined "canonical" form, which is subsequently trivial to translate to VHDL. The transformations as presented here are far from complete, but are meant as an exploration of possible transformations. In the running example below, we apply each of the transformations exactly once, in sequence. As will be apparent from the end result, there will be additional transformations needed to fully reach our goal, and some transformations must be applied more than once. How exactly to (efficiently) do this, has not been investigated. Lastly, I hope to be able to state a number of pre- and postconditions for each transformation. If these can be proven for each transformation, and it can be shown that there exists some ordering of transformations for which the postcondition implies the canonical form, we can show that the transformations do indeed transform any program (probably satisfying a number of preconditions) to the canonical form. \section{Goal} The transformations described here have a well-defined goal: To bring the program in a well-defined form that is directly translatable to hardware, while fully preserving the semantics of the program. This {\em canonical form} is again a Core program, but with a very specific structure. A function in canonical form has nested lambda's at the top, which produce a let expression. This let expression binds every function application in the function and produces either a simple identifier or a tuple of identifiers. Every bound value in the let expression is either a simple function application or a case expression to extract a single element from a tuple returned by a function. An example of a program in canonical form would be: \starttyping -- All arguments are an inital lambda \x c d -> -- There is one let expression at the top level let -- Calling some other user-defined function. s = foo x -- Extracting result values from a tuple a = case s of (a, b) -> a b = case s of (a, b) -> b -- Some builtin expressions rh = add c d rhh = sub d c -- Conditional connections rl = case b of High -> rhh Low -> d r = case a of High -> rh Low -> rl in -- The actual result r \stoptyping In this and all following programs, the following definitions are assumed to be available: \starttyping data Bit = Low | High foo :: Int -> (Bit, Bit) add :: Int -> Int -> Int sub :: Int -> Int -> Int \stoptyping When looking at such a program from a hardware perspective, the top level lambda's define the input ports. The value produced by the let expression are the output ports. Each function application bound by the let expression defines a component instantiation, where the input and output ports are mapped to local signals or arguments. The tuple extracting case expressions don't map to \subsection{Canonical form definition} Formally, the canonical form is a core program obeying the following constraints. \startitemize[R,inmargin] \item All top level binds must have the form $\expr{\bind{fun}{lamexpr}}$. $fun$ is an identifier that will be bound as a global identifier. \item A $lamexpr$ has the form $\expr{\lam{arg}{lamexpr}}$ or $\expr{letexpr}$. $arg$ is an identifier which will be bound as an $argument$. \item[letexpr] A $letexpr$ has the form $\expr{\letexpr{letbinds}{retexpr}}$. \item $letbinds$ is a list with elements of the form $\expr{\bind{res}{appexpr}}$ or $\expr{\bind{res}{builtinexpr}}$, where $res$ is an identifier that will be bound as local identifier. The type of the bound value must be a $hardware\;type$. \item[builtinexpr] A $builtinexpr$ is an expression that can be mapped to an equivalent VHDL expression. Since there are many supported forms for this, these are defined in a separate table. \item An $appexpr$ has the form $\expr{fun}$ or $\expr{\app{appexpr}{x}}$, where $fun$ is a global identifier and $x$ is a local identifier. \item[retexpr] A $retexpr$ has the form $\expr{x}$ or $\expr{tupexpr}$, where $x$ is a local identifier that is bound as an $argument$ or $result$. A $retexpr$ must be of a $hardware\;type$. \item A $tupexpr$ has the form $\expr{con}$ or $\expr{\app{tupexpr}{x}}$, where $con$ is a tuple constructor ({\em e.g.} $(,)$ or $(,,,)$) and $x$ is a local identifier. \item A $hardware\;type$ is a type that can be directly translated to hardware. This includes the types $Bit$, $SizedWord$, tuples containing elements of $hardware\;type$s, and will include others. This explicitely excludes function types. \stopitemize TODO: Say something about uniqueness of identifiers \subsection{Builtin expressions} A $builtinexpr$, as defined at \in[builtinexpr] can have any of the following forms. \startitemize[m,inmargin] \item $tuple\_extract=\expr{\case{t}{\alt{\app{con}{x_0\;x_1\;..\;x_n}}{x_i}}}$, where $t$ can be any local identifier, $con$ is a tuple constructor ({\em e.g.} $(,)$ or $(,,,)$), $x_0$ to $x_n$ can be any identifier, and $x_i$ can be any of $x_0$ to $x_n$. A case expression must have a $hardware\;type$. \item TODO: Many more! \stopitemize \section{Transformation passes} In this section we describe the actual transformations. Here we're using mostly Core-like notation, with a few notable points. \startitemize \item A core expression (in contrast with a transformation function, for example), is enclosed in pipes. For example, $\app{transform}{\expr{\lam{z}{z+1}}}$ is the transform function applied to a lambda core expression. Note that this notation might not be consistently applied everywhere. In particular where a non-core function is used inside a core expression, things get slightly confusing. \item A bind is written as $\expr{\bind{x}{expr}}$. This means binding the identifier $x$ to the expression $expr$. \item In the core examples, the layout rule from Haskell is loosely applied. It should be evident what was meant from indentation, even though it might nog be strictly correct. \stopitemize \subsection{Example} In the descriptions of transformations below, the following (slightly contrived) example program will be used as the running example. Note that this the example for the canonical form given above is the same program, but in canonical form. \starttyping \x -> let s = foo x in case s of (a, b) -> case a of High -> add Low -> let op' = case b of High -> sub Low -> \c d -> c in \c d -> op' d c \stoptyping \subsection{Argument extraction} This transformation makes sure that all of a bindings arguments are always bound to variables at the top level of the bound value. Formally, we can describe this transformation as follows. \transform{Argument extraction} { \NC \app{transform}{\expr{\bind{f}{expr}}} \NC = \expr{\bind{f}{\app{transform'(expr)}}}\NR \NR \NC \app{transform'}{\expr{\lam{v}{expr}}} \NC = \expr{\lam{v}{\app{transform'}{expr}}}\NR \NC \app{transform'}{\expr{expr :: a \xrightarrow b}} \NC = \expr{\lam{x}{\app{transform'}{\expr{(\app{expr}{x})}}}} \NR } When applying this transformation to our running example, we get the following program. \starttyping \x c d -> (let s = foo x in case s of (a, b) -> case a of High -> add Low -> let op' = case b of High -> sub Low -> \c d -> c in \c d -> op' d c ) c d \stoptyping \startbuffer[from] foo = \x -> case x of True -> (\y -> mul y y); False -> id \stopbuffer \startbuffer[to] foo = \x z -> (case x of True -> (\y -> mul y y); False -> id) z \stopbuffer \transexampleh{Argument extraction example}{from}{to} \subsection{Application propagation} This transformation is meant to propagate application expressions downwards into expressions as far as possible. Formally, we can describe this transformation as follows. \transform{Application propagation} { \NC \app{transform}{\expr{\app{(\letexpr{binds}{expr})}{y}}} \NC = \expr{\letexpr{binds}{(\app{expr}{y})}}\NR \NC \app{transform}{\expr{\app{(\lam{x}{expr})}{y}}} \NC = \app{\app{subs}{x \xRightarrow y}}{\expr{expr}}\NR \NC \app{transform}{\expr{\app{(\case{x}{\alt{p}{e};...})}{y}}} \NC = \expr{\case{x}{\alt{p}{\app{e}{y}};...}}\;(for\;every\;alt)\NR } When applying this transformation to our running example, we get the following program. \starttyping \x c d -> let s = foo x in case s of (a, b) -> case a of High -> add c d Low -> let op' = case b of High -> sub Low -> \c d -> c in op' d c \stoptyping \startbuffer[from] foo = \x z -> (case x of True -> (\y -> mul y y); False -> id) z \stopbuffer \startbuffer[to] foo = \x z -> case x of True -> mul z z; False -> id z \stopbuffer \transexampleh{Application propagation example}{from}{to} \subsection{Introducing main scope} This transformation is meant to introduce a single let expression that will be the "main scope". This is the let expression as described under requirement \ref[letexpr]. This let expression will contain only a single binding, which binds the original expression to some identifier and then evalutes to just this identifier (to comply with requirement \in[retexpr]). Formally, we can describe the transformation as follows. \transform{Main scope introduction} { \NC \app{transform}{\expr{\bind{f}{expr}}} \NC = \expr{\bind{f}{\app{transform'(expr)}}}\NR \NR \NC \app{transform'}{\expr{\lam{v}{expr}}} \NC = \expr{\lam{v}{\app{transform'}{expr}}}\NR \NC \app{transform'}{\expr{expr}} \NC = \expr{\letexpr{\bind{x}{expr}}{x}} \NR } When applying this transformation to our running example, we get the following program. \starttyping \x c d -> let r = (let s = foo x in case s of (a, b) -> case a of High -> add c d Low -> let op' = case b of High -> sub Low -> \c d -> c in op' d c ) in r \stoptyping \subsection{Scope flattening} This transformation tries to flatten the topmost let expression in a bind, {\em i.e.}, bind all identifiers in the same scope, and bind them to simple expressions only (so simplify deeply nested expressions). Formally, we can describe the transformation as follows. \transform{Main scope introduction} { \NC \app{transform}{\expr{\bind{f}{expr}}} \NC = \expr{\bind{f}{\app{transform'(expr)}}}\NR \NR \NC \app{transform'}{\expr{\lam{v}{expr}}} \NC = \expr{\lam{v}{\app{transform'}{expr}}}\NR \NC \app{transform'}{\expr{\letexpr{binds}{expr}}} \NC = \expr{\letexpr{\app{concat . map . flatten}{binds}}{expr}}\NR \NR \NC \app{flatten}{\expr{\bind{x}{\letexpr{binds}{expr}}}} \NC = (\app{concat . map . flatten}{binds}) \cup \{\app{flatten}{\expr{\bind{x}{expr}}}\}\NR \NC \app{flatten}{\expr{\bind{x}{\case{s}{alts}}}} \NC = \app{concat}{binds'} \cup \{\bind{x}{\case{s}{alts'}}\}\NR \NC \NC \where{(binds', alts')=\app{unzip.map.(flattenalt s)}{alts}}\NR \NC \app{\app{flattenalt}{s}}{\expr{\alt{\app{con}{x_0\;..\;x_n}}{expr}}} \NC = (extracts \cup \{\app{flatten}{bind}\}, alt)\NR \NC \NC \where{extracts =\{\expr{\case{s}{\alt{\app{con}{x_0\;..\;x_n}}{x_0}}},}\NR \NC \NC \;..\;,\expr{\case{s}{\alt{\app{con}{x_0\;..\;x_n}}{x_n}}}\} \NR \NC \NC bind = \expr{\bind{y}{expr}}\NR \NC \NC alt = \expr{\alt{\app{con}{\_\;..\;\_}}{y}}\NR } When applying this transformation to our running example, we get the following program. \starttyping \x c d -> let s = foo x r = case s of (_, _) -> y a = case s of (a, b) -> a b = case s of (a, b) -> b y = case a of High -> g Low -> h g = add c d h = op' d c op' = case b of High -> i Low -> j i = sub j = \c d -> c in r \stoptyping \subsection{More transformations} As noted before, the above transformations are not complete. Other needed transformations include: \startitemize \item Inlining of local identifiers with a function type. Since these cannot be represented in hardware directly, they must be transformed into something else. Inlining them should always be possible without loss of semantics (TODO: How true is this?) and can expose new possibilities for other transformations passes (such as application propagation when inlining {\tt j} above). \item A variation on inlining local identifiers is the propagation of function arguments with a function type. This will probably be initiated when transforming the caller (and not the callee), but it will also deal with identifiers with a function type that are unrepresentable in hardware. Special care must be taken here, since the expression that is propagated into the callee comes from a different scope. The function typed argument must thus be replaced by any identifiers from the callers scope that the propagated expression uses. Note that propagating an argument will change both a function's interface and implementation. For this to work, a new function should be created instead of modifying the original function, so any other callers will not be broken. \item Something similar should happen with return values with function types. \item Polymorphism must be removed from all user-defined functions. This is again similar to propagation function typed arguments, since this will also create duplicates of functions (for a specific type). This is an operation that is commonly known as "specialization" and already happens in GHC (since non-polymorph functions can be a lot faster than generic ones). \item More builtin expressions should be added and these should be evaluated by the compiler. For example, map, fold, +. \stopitemize Initial example \starttyping \x -> let s = foo x in case s of (a, b) -> case a of High -> add Low -> let op' = case b of High -> sub Low -> \c d -> c in \c d -> op' d c \stoptyping After top-level η-abstraction: \starttyping \x c d -> (let s = foo x in case s of (a, b) -> case a of High -> add Low -> let op' = case b of High -> sub Low -> \c d -> c in \c d -> op' d c ) c d \stoptyping After (extended) β-reduction: \starttyping \x c d -> let s = foo x in case s of (a, b) -> case a of High -> add c d Low -> let op' = case b of High -> sub Low -> \c d -> c in op' d c \stoptyping After return value extraction: \starttyping \x c d -> let s = foo x r = case s of (a, b) -> case a of High -> add c d Low -> let op' = case b of High -> sub Low -> \c d -> c in op' d c in r \stoptyping Scrutinee simplification does not apply. After case binder wildening: \starttyping \x c d -> let s = foo x a = case s of (a, _) -> a b = case s of (_, b) -> b r = case s of (_, _) -> case a of High -> add c d Low -> let op' = case b of High -> sub Low -> \c d -> c in op' d c in r \stoptyping After case value simplification \starttyping \x c d -> let s = foo x a = case s of (a, _) -> a b = case s of (_, b) -> b r = case s of (_, _) -> r' rh = add c d rl = let rll = \c d -> c op' = case b of High -> sub Low -> rll in op' d c r' = case a of High -> rh Low -> rl in r \stoptyping After let flattening: \starttyping \x c d -> let s = foo x a = case s of (a, _) -> a b = case s of (_, b) -> b r = case s of (_, _) -> r' rh = add c d rl = op' d c rll = \c d -> c op' = case b of High -> sub Low -> rll r' = case a of High -> rh Low -> rl in r \stoptyping After function inlining: \starttyping \x c d -> let s = foo x a = case s of (a, _) -> a b = case s of (_, b) -> b r = case s of (_, _) -> r' rh = add c d rl = (case b of High -> sub Low -> \c d -> c) d c r' = case a of High -> rh Low -> rl in r \stoptyping After (extended) β-reduction again: \starttyping \x c d -> let s = foo x a = case s of (a, _) -> a b = case s of (_, b) -> b r = case s of (_, _) -> r' rh = add c d rl = case b of High -> sub d c Low -> d r' = case a of High -> rh Low -> rl in r \stoptyping After case value simplification again: \starttyping \x c d -> let s = foo x a = case s of (a, _) -> a b = case s of (_, b) -> b r = case s of (_, _) -> r' rh = add c d rlh = sub d c rl = case b of High -> rlh Low -> d r' = case a of High -> rh Low -> rl in r \stoptyping After case removal: \starttyping \x c d -> let s = foo x a = case s of (a, _) -> a b = case s of (_, b) -> b r = r' rh = add c d rlh = sub d c rl = case b of High -> rlh Low -> d r' = case a of High -> rh Low -> rl in r \stoptyping After let bind removal: \starttyping \x c d -> let s = foo x a = case s of (a, _) -> a b = case s of (_, b) -> b rh = add c d rlh = sub d c rl = case b of High -> rlh Low -> d r' = case a of High -> rh Low -> rl in r' \stoptyping Application simplification is not applicable. \stoptext