From 8e496c475d663131ad09c62dc5a3efe30e3f2e15 Mon Sep 17 00:00:00 2001 From: Matthijs Kooijman Date: Wed, 20 May 2009 11:44:59 +0200 Subject: [PATCH 1/1] Add initial version of the Core2Core document. This document describes core transformations and the canonical form. --- Core2Core.tex | 425 ++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 425 insertions(+) create mode 100644 Core2Core.tex diff --git a/Core2Core.tex b/Core2Core.tex new file mode 100644 index 0000000..437aae7 --- /dev/null +++ b/Core2Core.tex @@ -0,0 +1,425 @@ +\mainlanguage [en] +\setuppapersize[A4][A4] + +\setupbodyfont[10pt] +%\usetypescript [lbr][ec] +%\switchtotypeface [lbr] [10pt] + +% The function application operator, which expands to a space in math mode +\define[1]\expr{|#1|} +\define[2]\app{#1\;#2} +\define[2]\lam{Î»#1 \xrightarrow #2} +\define[2]\letexpr{{\bold let}\;#1\;{\bold in}\;#2} +\define[2]\case{{\bold case}\;#1\;{\bold of}\;#2} +\define[2]\alt{#1 \xrightarrow #2} +\define[2]\bind{#1:#2} +\define[1]\where{{\bold where}\;#1} +% A transformation +\definefloat[transformation][transformations] +\define[2]\transform{ + %\placetransformation[here]{#1} + \startframedtext[width=\textwidth] + \startformula \startalign + #2 + \stopalign \stopformula + \stopframedtext +} + +% A helper to print a single example in the half the page width. The example +% text should be in a buffer whose name is given in an argument. +% +% The align=right option really does left-alignment, but without the program +% will end up on a single line. The strut=no option prevents a bunch of empty +% space at the start of the frame. +\define[1]\example{\framed[frameoffset=2mm,align=right,strut=no]{\typebuffer[#1]}} + +% A transformation example +\definefloat[example][examples] +\setupcaption[example][location=top] % Put captions on top + +\define[3]\transexample{ + \placeexample[here]{#1} + \startcombination[2*1] + {\example{#2}}{Original program} + {\example{#3}}{Transformed program} + \stopcombination +} + +\define[3]\transexampleh{ + \placeexample[here]{#1} + \startcombination[1*2] + {\example{#2}}{Original program} + {\example{#3}}{Transformed program} + \stopcombination +} + +% Define a custom description format for the builtinexprs below +\definedescription[builtindesc][headstyle=bold,style=normal,location=top] + +\starttext +\title {Core transformations for hardware generation} +Matthijs Kooijman + +\section{Introduction} +As a new approach to translating Core to VHDL, we investigate a number of +transformations on our Core program, which should bring the program into a +well-defined "canonical" state, which is subsequently trivial to translate to +VHDL. + +The transformations as presented here are far from complete, but are meant as +an exploration of possible transformations. In the running example below, we +apply each of the transformations exactly once, in sequence. As will be +apparent from the end result, there will be additional transformations needed +to fully reach our goal, and some transformations must be applied more than +once. How exactly to (efficiently) do this, has not been investigated. + +Lastly, I hope to be able to state a number of pre- and postconditinos for +each transformation. If these can be proven for each transformation, and it +can be shown that ther exists some ordering of transformations for which the +postcondition implies the canonical form, we can show that the transformations +do indeed transform any program (probably satisfying a number of +preconditions) to the canonical form. + +\section{Goal} +The transformations described here have a well-defined goal: To bring the +program in a well-defined program that is directly translatable to hardware, +while fully preserving the semantics of the program. + +This {\em canonical form} is again a Core program, but with a very specific +structure. A function in canonical form has nested lambda's at the top, which +produce a let expression. This let expression binds every function application +in the function and produces either a simple identifier or a tuple of +identifiers. Every bound value in the let expression is either a simple +function application or a case expression to extract a single element from a +tuple returned by a function. + +An example of a program in canonical form would be: + +\starttyping + -- All arguments are an inital lambda + \x c d -> + -- There is one let expression at the top level + let + -- Calling some other user-defined function. + s = foo x + -- Extracting result values from a tuple + a = case s of (a, b) -> a + b = case s of (a, b) -> b + -- Some builtin expressions + rh = add c d + rhh = sub d c + -- Conditional connections + rl = case b of + High -> rhh + Low -> d + r = case a of + High -> rh + Low -> rl + in + -- The actual result + r +\stoptyping + +In this and all following programs, the following definitions are assumed to +be available: + +\starttyping +data Bit = Low | High + +foo :: Int -> (Bit, Bit) +add :: Int -> Int -> Int +sub :: Int -> Int -> Int +\stoptyping + +When looking at such a program from a hardware perspective, the top level +lambda's define the input ports. The value produced by the let expression are +the output ports. Each function application bound by the let expression +defines a component instantiation, where the input and output ports are mapped +to local signals or arguments. The tuple extracting case expressions don't map +to + +\subsection{Canonical form definition} +Formally, the canonical form is a core program obeying the following +constraints. + +\startitemize[R,inmargin] +\item All top level binds must have the form $\expr{\bind{fun}{lamexpr}}$. +$fun$ is an identifier that will be bound as a global identifier. +\item A $lamexpr$ has the form $\expr{\lam{arg}{lamexpr}}$ or +$\expr{letexpr}$. $arg$ is an identifier which will be bound as an $argument$. +\item[letexpr] A $letexpr$ has the form $\expr{\letexpr{letbinds}{retexpr}}$. +\item $letbinds$ is a list with elements of the form +$\expr{\bind{res}{appexpr}}$ or $\expr{\bind{res}{builtinexpr}}$, where $res$ is +an identifier that will be bound as local identifier. The type of the bound +value must be a $hardware\;type$. +\item[builtinexpr] A $builtinexpr$ is an expression that can be mapped to an +equivalent VHDL expression. Since there are many supported forms for this, +these are defined in a separate table. +\item An $appexpr$ has the form $\expr{fun}$ or $\expr{\app{appexpr}{x}}$, +where $fun$ is a global identifier and $x$ is a local identifier. +\item[retexpr] A $retexpr$ has the form $\expr{x}$ or $\expr{tupexpr}$, where $x$ is a local identifier that is bound as an $argument$ or $result$. A $retexpr$ must +be of a $hardware\;type$. +\item A $tupexpr$ has the form $\expr{con}$ or $\expr{\app{tupexpr}{x}}$, +where $con$ is a tuple constructor ({\em e.g.} $(,)$ or $(,,,)$) and $x$ is +a local identifier. +\item A $hardware\;type$ is a type that can be directly translated to +hardware. This includes the types $Bit$, $SizedWord$, tuples containing +elements of $hardware\;type$s, and will include others. This explicitely +excludes function types. +\stopitemize + +TODO: Say something about uniqueness of identifiers + +\subsection{Builtin expressions} +A $builtinexpr$, as defined at \in[builtinexpr] can have any of the following forms. + +\startitemize[m,inmargin] +\item +$tuple\_extract=\expr{\case{t}{\alt{\app{con}{x_0\;x_1\;..\;x_n}}{x_i}}}$, +where $t$ can be any local identifier, $con$ is a tuple constructor ({\em +e.g.} $(,)$ or $(,,,)$), $x_0$ to $x_n$ can be any identifier, and $x_i$ can +be any of $x_0$ to $x_n$. A case expression must have a $hardware\;type$. +\item TODO: Many more! +\stopitemize + +\section{Transformation passes} + +In this section we describe the actual transformations. Here we're using +mostly Core-like notation, with a few notable points. + +\startitemize +\item A core expression (in contrast with a transformation function, for +example), is enclosed in pipes. For example, $\app{transform}{\expr{\lam{z}{z+1}}}$ +is the transform function applied to a lambda core expression. + +Note that this notation might not be consistently applied everywhere. In +particular where a non-core function is used inside a core expression, things +get slightly confusing. +\item A bind is written as $\expr{\bind{x}{expr}}$. This means binding the identifier +$x$ to the expression $expr$. +\item In the core examples, the layout rule from Haskell is loosely applied. +It should be evident what was meant from indentation, even though it might nog +be strictly correct. +\stopitemize + +\subsection{Example} +In the descriptions of transformations below, the following (slightly +contrived) example program will be used as the running example. Note that this +the example for the canonical form given above is the same program, but in +canonical form. + +\starttyping + \x -> + let s = foo x + in + case s of + (a, b) -> + r = case a of + High -> add + Low -> let + op' = case b of + High -> sub + Low -> \c d -> c + in + \c d -> op' d c +\stoptyping + +\subsection{Argument extraction} +This transformation makes sure that all of a bindings arguments are always +bound to variables at the top level of the bound value. Formally, we can +describe this transformation as follows. + +\transform{Argument extraction} +{ +\NC \app{transform}{\expr{\bind{f}{expr}}} \NC = \expr{\bind{f}{\app{transform'(expr)}}}\NR +\NR +\NC \app{transform'}{\expr{\lam{v}{expr}}} \NC = \expr{\lam{v}{\app{transform'}{expr}}}\NR +\NC \app{transform'}{\expr{expr :: a \xrightarrow b}} \NC = \expr{\lam{x}{\app{transform'}{\expr{(\app{expr}{x})}}}} \NR +} + +When applying this transformation to our running example, we get the following +program. + +\starttyping + \x c d -> + (let s = foo x + in + case s of + (a, b) -> + case a of + High -> add + Low -> let + op' = case b of + High -> sub + Low -> \c d -> c + in + \c d -> op' d c + ) c d +\stoptyping + +\startbuffer[from] +foo = \x -> case x of True -> (\y -> mul y y); False -> id +\stopbuffer +\startbuffer[to] +foo = \x z -> (case x of True -> (\y -> mul y y); False -> id) z +\stopbuffer + +\transexampleh{Argument extraction example}{from}{to} + +\subsection{Application propagation} +This transformation is meant to propagate application expressions downwards +into expressions as far as possible. Formally, we can describe this +transformation as follows. + +\transform{Application propagation} +{ +\NC \app{transform}{\expr{\app{(\letexpr{binds}{expr})}{y}}} \NC = \expr{\letexpr{binds}{(\app{expr}{y})}}\NR +\NC \app{transform}{\expr{\app{(\lam{x}{expr})}{y}}} \NC = \app{\app{subs}{x \xRightarrow y}}{\expr{expr}}\NR +\NC \app{transform}{\expr{\app{(\case{x}{\alt{p}{e};...})}{y}}} \NC = \expr{\case{x}{\alt{p}{\app{e}{y}};...}}\;(for\;every\;alt)\NR +} + +When applying this transformation to our running example, we get the following +program. + +\starttyping + \x c d -> + let s = foo x + in + case s of + (a, b) -> + case a of + High -> add c d + Low -> let + op' = case b of + High -> sub + Low -> \c d -> c + in + op' d c +\stoptyping + +\startbuffer[from] +foo = \x z -> (case x of True -> (\y -> mul y y); False -> id) z +\stopbuffer +\startbuffer[to] +foo = \x z -> case x of True -> mul z z; False -> id z +\stopbuffer + +\transexampleh{Application propagation example}{from}{to} + +\subsection{Introducing main scope} +This transformation is meant to introduce a single let expression that will be +the "main scope". This is the let expression as described under requirement +\ref[letexpr]. This let expression will contain only a single binding, which +binds the original expression to some identifier and then evalutes to just +this identifier (to comply with requirement \in[retexpr]). + +Formally, we can describe the transformation as follows. + +\transform{Main scope introduction} +{ +\NC \app{transform}{\expr{\bind{f}{expr}}} \NC = \expr{\bind{f}{\app{transform'(expr)}}}\NR +\NR +\NC \app{transform'}{\expr{\lam{v}{expr}}} \NC = \expr{\lam{v}{\app{transform'}{expr}}}\NR +\NC \app{transform'}{\expr{expr}} \NC = \expr{\letexpr{\bind{x}{expr}}{x}} \NR +} + +When applying this transformation to our running example, we get the following +program. + +\starttyping + \x c d -> + let r = (let s = foo x + in + case s of + (a, b) -> + case a of + High -> add c d + Low -> let + op' = case b of + High -> sub + Low -> \c d -> c + in + op' d c + ) + in + r +\stoptyping + +\subsection{Scope flattening} +This transformation tries to flatten the topmost let expression in a bind, +{\em i.e.}, bind all identifiers in the same scope, and bind them to simple +expressions only (so simplify deeply nested expressions). + +Formally, we can describe the transformation as follows. + +\transform{Main scope introduction} { \NC \app{transform}{\expr{\bind{f}{expr}}} \NC = \expr{\bind{f}{\app{transform'(expr)}}}\NR +\NR +\NC \app{transform'}{\expr{\lam{v}{expr}}} \NC = \expr{\lam{v}{\app{transform'}{expr}}}\NR +\NC \app{transform'}{\expr{\letexpr{binds}{expr}}} \NC = \expr{\letexpr{\app{concat . map . flatten}{binds}}{expr}}\NR +\NR +\NC \app{flatten}{\expr{\bind{x}{\letexpr{binds}{expr}}}} \NC = (\app{concat . map . flatten}{binds}) \cup \{\app{flatten}{\expr{\bind{x}{expr}}}\}\NR +\NC \app{flatten}{\expr{\bind{x}{\case{s}{alts}}}} \NC = \app{concat}{binds'} \cup \{\bind{x}{\case{s}{alts'}}\}\NR +\NC \NC \where{(binds', alts')=\app{unzip.map.(flattenalt s)}{alts}}\NR +\NC \app{\app{flattenalt}{s}}{\expr{\alt{\app{con}{x_0\;..\;x_n}}{expr}}} \NC = (extracts \cup \{\app{flatten}{bind}\}, alt)\NR +\NC \NC \where{extracts =\{\expr{\case{s}{\alt{\app{con}{x_0\;..\;x_n}}{x_0}}},}\NR +\NC \NC \;..\;,\expr{\case{s}{\alt{\app{con}{x_0\;..\;x_n}}{x_n}}}\} \NR +\NC \NC bind = \expr{\bind{y}{expr}}\NR +\NC \NC alt = \expr{\alt{\app{con}{\_\;..\;\_}}{y}}\NR +} + +When applying this transformation to our running example, we get the following +program. + +\starttyping + \x c d -> + let s = foo x + r = case s of + (_, _) -> y + a = case s of (a, b) -> a + b = case s of (a, b) -> b + y = case a of + High -> g + Low -> h + g = add c d + h = op' d c + op' = case b of + High -> i + Low -> j + i = sub + j = \c d -> c + in + r +\stoptyping + +\subsection{More transformations} +As noted before, the above transformations are not complete. Other needed +transformations include: +\startitemize +\item Inlining of local identifiers with a function type. Since these cannot +be represented in hardware directly, they must be transformated into something +else. Inlining them should always be possible without loss of semantics (TODO: +How true is this?) and can expose new possibilities for other transformations +passes (such as application propagation when inlining {\tt j} above). +\item A variation on inlining local identifiers is the propagation of +function arguments with a function type. This will probably be initiated when +transforming the caller (and not the callee), but it will also deal with +identifiers with a function type that are unrepresentable in hardware. + +Special care must be taken here, since the expression that is propagated into +the callee comes from a different scope. The function typed argument must thus +be replaced by any identifiers from the callers scope that the propagated +expression uses. + +Note that propagating an argument will change both a function's interface and +implementation. For this to work, a new function should be created instead of +modifying the original function, so any other callers will not be broken. +\item Something similar should happen with return values with function types. +\item Polymorphism must be removed from all user-defined functions. This is +again similar to propagation function typed arguments, since this will also +create duplicates of functions (for a specific type). This is an operation +that is commonly known as "specialization" and already happens in GHC (since +non-polymorph functions can be a lot faster than generic ones). +\item More builtin expressions should be added and these should be evaluated +by the compiler. For example, map, fold, +. +\stopitemize +\stoptext -- 2.20.1