X-Git-Url: https://git.stderr.nl/gitweb?p=matthijs%2Fmaster-project%2Freport.git;a=blobdiff_plain;f=Core2Core.tex;h=5df24873befeb7b28d0cf018dfb0924cc3efd85a;hp=437aae70425fbc87c8cdbe9b5b9a1c8215d938b5;hb=32d52cda513c45334ab256c9f42d41ed6938fc48;hpb=8e496c475d663131ad09c62dc5a3efe30e3f2e15 diff --git a/Core2Core.tex b/Core2Core.tex index 437aae7..5df2487 100644 --- a/Core2Core.tex +++ b/Core2Core.tex @@ -1,9 +1,20 @@ \mainlanguage [en] \setuppapersize[A4][A4] -\setupbodyfont[10pt] -%\usetypescript [lbr][ec] -%\switchtotypeface [lbr] [10pt] +% Define a custom typescript. We could also have put the \definetypeface's +% directly in the script, without a typescript, but I guess this is more +% elegant... +\starttypescript[Custom] +% This is a sans font that supports greek symbols +\definetypeface [Custom] [ss] [sans] [iwona] [default] +\definetypeface [Custom] [rm] [serif] [antykwa-torunska] [default] +\definetypeface [Custom] [tt] [mono] [modern] [default] +\definetypeface [Custom] [mm] [math] [modern] [default] +\stoptypescript +\usetypescript [Custom] + +% Use our custom typeface +\switchtotypeface [Custom] [10pt] % The function application operator, which expands to a space in math mode \define[1]\expr{|#1|} @@ -17,6 +28,15 @@ % A transformation \definefloat[transformation][transformations] \define[2]\transform{ + \startframedtext[width=\textwidth] + #2 + \stopframedtext +} + +\define\conclusion{\blackrule[height=0.5pt,depth=0pt,width=.5\textwidth]} +\define\nextrule{\vskip1cm} + +\define[2]\transformold{ %\placetransformation[here]{#1} \startframedtext[width=\textwidth] \startformula \startalign @@ -25,13 +45,74 @@ \stopframedtext } +% A shortcut for italicized e.g. and i.e. +\define[0]\eg{{\em e.g.}} +\define[0]\ie{{\em i.e.}} + +\definedescription + [desc] + [location=hanging,hang=20,width=broad] + %command=\hskip-1cm,margin=1cm] + +% Install the lambda calculus pretty-printer, as defined in pret-lam.lua. +\installprettytype [LAM] [LAM] + +\definetyping[lambda][option=LAM,style=sans] +\definetype[lam][option=LAM,style=sans] + +\installprettytype [TRANS] [TRANS] +\definetyping[trans][option=TRANS,style=normal,before=,after=] + +% An (invisible) frame to hold a lambda expression +\define[1]\lamframe{ + % Put a frame around lambda expressions, so they can have multiple + % lines and still appear inline. + % The align=right option really does left-alignment, but without the + % program will end up on a single line. The strut=no option prevents a + % bunch of empty space at the start of the frame. + \framed[offset=0mm,location=middle,strut=no,align=right,frame=off]{#1} +} + +\define[2]\transbuf{ + % Make \typebuffer uses the LAM pretty printer and a sans-serif font + % Also prevent any extra spacing above and below caused by the default + % before=\blank and after=\blank. + \setuptyping[option=LAM,style=sans,before=,after=] + % Prevent the arrow from ending up below the first frame (a \framed + % at the start of a line defaults to using vmode). + \dontleavehmode + % Put the elements in frames, so they can have multiple lines and be + % middle-aligned + \lamframe{\typebuffer[#1]} + \lamframe{\Rightarrow} + \lamframe{\typebuffer[#2]} + % Reset the typing settings to their defaults + \setuptyping[option=none,style=\tttf] +} +% This is the same as \transbuf above, but it accepts text directly instead +% of through buffers. This only works for single lines, however. +\define[2]\trans{ + \dontleavehmode + \lamframe{\lam{#1}} + \lamframe{\Rightarrow} + \lamframe{\lam{#2}} +} + + % A helper to print a single example in the half the page width. The example % text should be in a buffer whose name is given in an argument. % % The align=right option really does left-alignment, but without the program % will end up on a single line. The strut=no option prevents a bunch of empty % space at the start of the frame. -\define[1]\example{\framed[frameoffset=2mm,align=right,strut=no]{\typebuffer[#1]}} +\define[1]\example{ + \framed[offset=1mm,align=right,strut=no]{ + \setuptyping[option=LAM,style=sans,before=,after=] + \typebuffer[#1] + \setuptyping[option=none,style=\tttf] + } +} + % A transformation example \definefloat[example][examples] @@ -46,11 +127,11 @@ } \define[3]\transexampleh{ - \placeexample[here]{#1} - \startcombination[1*2] - {\example{#2}}{Original program} - {\example{#3}}{Transformed program} - \stopcombination +% \placeexample[here]{#1} +% \startcombination[1*2] +% {\example{#2}}{Original program} +% {\example{#3}}{Transformed program} +% \stopcombination } % Define a custom description format for the builtinexprs below @@ -62,42 +143,31 @@ Matthijs Kooijman \section{Introduction} As a new approach to translating Core to VHDL, we investigate a number of -transformations on our Core program, which should bring the program into a -well-defined "canonical" state, which is subsequently trivial to translate to -VHDL. - -The transformations as presented here are far from complete, but are meant as -an exploration of possible transformations. In the running example below, we -apply each of the transformations exactly once, in sequence. As will be -apparent from the end result, there will be additional transformations needed -to fully reach our goal, and some transformations must be applied more than -once. How exactly to (efficiently) do this, has not been investigated. - -Lastly, I hope to be able to state a number of pre- and postconditinos for -each transformation. If these can be proven for each transformation, and it -can be shown that ther exists some ordering of transformations for which the -postcondition implies the canonical form, we can show that the transformations -do indeed transform any program (probably satisfying a number of -preconditions) to the canonical form. +transforms on our Core program, which should bring the program into a +well-defined {\em normal} form, which is subsequently trivial to +translate to VHDL. + +The transforms as presented here are far from complete, but are meant as +an exploration of possible transformations. \section{Goal} The transformations described here have a well-defined goal: To bring the -program in a well-defined program that is directly translatable to hardware, +program in a well-defined form that is directly translatable to hardware, while fully preserving the semantics of the program. -This {\em canonical form} is again a Core program, but with a very specific -structure. A function in canonical form has nested lambda's at the top, which +This {\em normal form} is again a Core program, but with a very specific +structure. A function in normal form has nested lambda's at the top, which produce a let expression. This let expression binds every function application -in the function and produces either a simple identifier or a tuple of -identifiers. Every bound value in the let expression is either a simple -function application or a case expression to extract a single element from a -tuple returned by a function. +in the function and produces a simple identifier. Every bound value in +the let expression is either a simple function application or a case +expression to extract a single element from a tuple returned by a +function. An example of a program in canonical form would be: -\starttyping +\startlambda -- All arguments are an inital lambda - \x c d -> + λx.λc.λd. -- There is one let expression at the top level let -- Calling some other user-defined function. @@ -118,29 +188,21 @@ An example of a program in canonical form would be: in -- The actual result r -\stoptyping - -In this and all following programs, the following definitions are assumed to -be available: - -\starttyping -data Bit = Low | High - -foo :: Int -> (Bit, Bit) -add :: Int -> Int -> Int -sub :: Int -> Int -> Int -\stoptyping +\stoplambda When looking at such a program from a hardware perspective, the top level -lambda's define the input ports. The value produced by the let expression are -the output ports. Each function application bound by the let expression -defines a component instantiation, where the input and output ports are mapped -to local signals or arguments. The tuple extracting case expressions don't map -to - -\subsection{Canonical form definition} -Formally, the canonical form is a core program obeying the following -constraints. +lambda's define the input ports. The value produced by the let expression is +the output port. Most function applications bound by the let expression +define a component instantiation, where the input and output ports are mapped +to local signals or arguments. Some of the others use a builtin +construction (\eg the \lam{case} statement) or call a builtin function +(\eg \lam{add} or \lam{sub}). For these, a hardcoded VHDL translation is +available. + +\subsection{Normal definition} +Formally, the normal form is a core program obeying the following +constraints. TODO: Update this section, this is probably not completely +accurate or relevant anymore. \startitemize[R,inmargin] \item All top level binds must have the form $\expr{\bind{fun}{lamexpr}}$. @@ -182,66 +244,432 @@ be any of $x_0$ to $x_n$. A case expression must have a $hardware\;type$. \item TODO: Many more! \stopitemize -\section{Transformation passes} +\section{Transform passes} + +In this section we describe the actual transforms. Here we're using +the core language in a notation that resembles lambda calculus. + +Each of these transforms is meant to be applied to every (sub)expression +in a program, for as long as it applies. Only when none of the +expressions can be applied anymore, the program is in normal form. We +hope to be able to prove that this form will obey all of the constraints +defined above, but this has yet to happen (though it seems likely that +it will). + +Each of the transforms will be described informally first, explaining +the need for and goal of the transform. Then, a formal definition is +given, using a familiar syntax from the world of logic. Each transform +is specified as a number of conditions (above the horizontal line) and a +number of conclusions (below the horizontal line). The details of using +this notation are still a bit fuzzy, so comments are welcom. + +TODO: Formally describe the "apply to every (sub)expression" in terms of +rules with full transformations in the conditions. + +\subsection{η-abstraction} +This transformation makes sure that all arguments of a function-typed +expression are named, by introducing lambda expressions. When combined with +β-reduction and function inlining below, all function-typed expressions should +be lambda abstractions or global identifiers. + +\starttrans +E \lam{E :: * -> *} +-------------- \lam{E} is not the first argument of an application. +λx.E x \lam{E} is not a lambda abstraction. + \lam{x} is a variable that does not occur free in \lam{E}. +\stoptrans -In this section we describe the actual transformations. Here we're using -mostly Core-like notation, with a few notable points. +\startbuffer[from] +foo = λa -> case a of + True -> λb.mul b b + False -> id +\stopbuffer + +\startbuffer[to] +foo = λa.λx -> (case a of + True -> λb.mul b b + False -> λy.id y) x +\stopbuffer + +\transexample{η-abstraction}{from}{to} + +\subsection{Extended β-reduction} +This transformation is meant to propagate application expressions downwards +into expressions as far as possible. In lambda calculus, this reduction +is known as β-reduction, but it is of course only defined for +applications of lambda abstractions. We extend this reduction to also +work for the rest of core (case and let expressions). +\startbuffer[from] +(case x of + p1 -> E1 + \vdots + pn -> En) M +\stopbuffer +\startbuffer[to] +case x of + p1 -> E1 M + \vdots + pn -> En M +\stopbuffer + +\transform{Extended β-reduction} +{ +\conclusion +\trans{(λx.E) M}{E[M/x]} + +\nextrule +\conclusion +\trans{(let binds in E) M}{let binds in E M} + +\nextrule +\conclusion +\transbuf{from}{to} +} + +\startbuffer[from] +let a = (case x of + True -> id + False -> neg + ) 1 + b = (let y = 3 in add y) 2 +in + (λz.add 1 z) 3 +\stopbuffer + +\startbuffer[to] +let a = case x of + True -> id 1 + False -> neg 1 + b = let y = 3 in add y 2 +in + add 1 3 +\stopbuffer + +\transexample{Extended β-reduction}{from}{to} + +\subsection{Argument simplification} +The transforms in this section deal with simplifying application +arguments into normal form. The goal here is to: + +\startitemize + \item Make all arguments of user-defined functions (\eg, of which + we have a function body) simple variable references of a runtime + representable type. + \item Make all arguments of builtin functions either: + \startitemize + \item A type argument. + \item A dictionary argument. + \item A type level expression. + \item A variable reference of a runtime representable type. + \item A variable reference or partial application of a function type. + \stopitemize +\stopitemize + +When looking at the arguments of a user-defined function, we can +divide them into two categories: +\startitemize + \item Arguments with a runtime representable type (\eg bits or vectors). + + These arguments can be preserved in the program, since they can + be translated to input ports later on. However, since we can + only connect signals to input ports, these arguments must be + reduced to simple variables (for which signals will be + produced). This is taken care of by the argument extraction + transform. + \item Non-runtime representable typed arguments. + + These arguments cannot be preserved in the program, since we + cannot represent them as input or output ports in the resulting + VHDL. To remove them, we create a specialized version of the + called function with these arguments filled in. This is done by + the argument propagation transform. +\stopitemize + +When looking at the arguments of a builtin function, we can divide them +into categories: \startitemize -\item A core expression (in contrast with a transformation function, for -example), is enclosed in pipes. For example, $\app{transform}{\expr{\lam{z}{z+1}}}$ -is the transform function applied to a lambda core expression. - -Note that this notation might not be consistently applied everywhere. In -particular where a non-core function is used inside a core expression, things -get slightly confusing. -\item A bind is written as $\expr{\bind{x}{expr}}$. This means binding the identifier -$x$ to the expression $expr$. -\item In the core examples, the layout rule from Haskell is loosely applied. -It should be evident what was meant from indentation, even though it might nog -be strictly correct. + \item Arguments with a runtime representable type. + + As we have seen with user-defined functions, these arguments can + always be reduced to a simple variable reference, by the + argument extraction transform. Performing this transform for + builtin functions as well, means that the translation of builtin + functions can be limited to signal references, instead of + needing to support all possible expressions. + + \item Arguments with a function type. + + These arguments are functions passed to higher order builtins, + like \lam{map} and \lam{foldl}. Since implementing these + functions for arbitrary function-typed expressions (\eg, lambda + expressions) is rather comlex, we reduce these arguments to + (partial applications of) global functions. + + We can still support arbitrary expressions from the user code, + by creating a new global function containing that expression. + This way, we can simply replace the argument with a reference to + that new function. However, since the expression can contain any + number of free variables we also have to include partial + applications in our normal form. + + This category of arguments is handled by the function extraction + transform. + \item Other unrepresentable arguments. + + These arguments can take a few different forms: + \startdesc{Type arguments} + In the core language, type arguments can only take a single + form: A type wrapped in the Type constructor. Also, there is + nothing that can be done with type expressions, except for + applying functions to them, so we can simply leave type + arguments as they are. + \stopdesc + \startdesc{Dictionary arguments} + In the core language, dictionary arguments are used to find + operations operating on one of the type arguments (mostly for + finding class methods). Since we will not actually evaluatie + the function body for builtin functions and can generate + code for builtin functions by just looking at the type + arguments, these arguments can be ignored and left as they + are. + \stopdesc + \startdesc{Type level arguments} + Sometimes, we want to pass a value to a builtin function, but + we need to know the value at compile time. Additionally, the + value has an impact on the type of the function. This is + encoded using type-level values, where the actual value of the + argument is not important, but the type encodes some integer, + for example. Since the value is not important, the actual form + of the expression does not matter either and we can leave + these arguments as they are. + \stopdesc + \startdesc{Other arguments} + Technically, there is still a wide array of arguments that can + be passed, but does not fall into any of the above categories. + However, none of the supported builtin functions requires such + an argument. This leaves use with passing unsupported types to + a function, such as calling \lam{head} on a list of functions. + + In these cases, it would be impossible to generate hardware + for such a function call anyway, so we can ignore these + arguments. + + The only way to generate hardware for builtin functions with + arguments like these, is to expand the function call into an + equivalent core expression (\eg, expand map into a series of + function applications). But for now, we choose to simply not + support expressions like these. + \stopdesc + + From the above, we can conclude that we can simply ignore these + other unrepresentable arguments and focus on the first two + categories instead. \stopitemize -\subsection{Example} -In the descriptions of transformations below, the following (slightly -contrived) example program will be used as the running example. Note that this -the example for the canonical form given above is the same program, but in -canonical form. +\subsubsection{Argument extraction} +This transform deals with arguments to functions that +are of a runtime representable type. + +TODO: It seems we can map an expression to a port, not only a signal. +Perhaps this makes this transformation not needed? +TODO: Say something about dataconstructors (without arguments, like True +or False), which are variable references of a runtime representable +type, but do not result in a signal. + +To reduce a complex expression to a simple variable reference, we create +a new let expression around the application, which binds the complex +expression to a new variable. The original function is then applied to +this variable. + +\transform{Argument extract} +{ +\lam{Y} is of a hardware representable type + +\lam{Y} is not a variable referene + +\conclusion + +\trans{X Y}{let z = Y in X z} +} + +\subsubsection{Function extraction} +This transform deals with function-typed arguments to builtin functions. +Since these arguments cannot be propagated, we choose to extract them +into a new global function instead. + +Any free variables occuring in the extracted arguments will become +parameters to the new global function. The original argument is replaced +with a reference to the new function, applied to any free variables from +the original argument. + +\transform{Function extraction} +{ +\lam{X} is a (partial application of) a builtin function + +\lam{Y} is not an application + +\lam{Y} is not a variable reference + +\conclusion + +\lam{f0 ... fm} = free local vars of \lam{Y} + +\lam{y} is a new global variable -\starttyping - \x -> +\lam{y = λf0 ... fn.Y} + +\trans{X Y}{X (y f0 ... fn)} +} + +\subsubsection{Argument propagation} +This transform deals with arguments to user-defined functions that are +not representable at runtime. This means these arguments cannot be +preserved in the final form and most be {\em propagated}. + +Propagation means to create a specialized version of the called +function, with the propagated argument already filled in. As a simple +example, in the following program: + +\startlambda +f = λa.λb.a + b +inc = λa.f a 1 +\stoplambda + +we could {\em propagate} the constant argument 1, with the following +result: + +\startlambda +f' = λa.a + 1 +inc = λa.f' a +\stoplambda + +Special care must be taken when the to-be-propagated expression has any +free variables. If this is the case, the original argument should not be +removed alltogether, but replaced by all the free variables of the +expression. In this way, the original expression can still be evaluated +inside the new function. Also, this brings us closer to our goal: All +these free variables will be simple variable references. + +To prevent us from propagating the same argument over and over, a simple +local variable reference is not propagated (since is has exactly one +free variable, itself, we would only replace that argument with itself). + +This shows that any free local variables that are not runtime representable +cannot be brought into normal form by this transform. We rely on an +inlining transformation to replace such a variable with an expression we +can propagate again. + +TODO: Move these definitions somewhere sensible. + +Definition: A global variable is any variable that is bound at the +top level of a program. A local variable is any other variable. + +Definition: A hardware representable type is a type that we can generate +a signal for in hardware. For example, a bit, a vector of bits, a 32 bit +unsigned word, etc. Types that are not runtime representable notably +include (but are not limited to): Types, dictionaries, functions. + +Definition: A builtin function is a function for which a builtin +hardware translation is available, because its actual definition is not +translatable. A user-defined function is any other function. + +\starttrans +x = E +~ +x Y0 ... Yi ... Yn \lam{Y_i} is not of a runtime representable type +--------------------------------------------- \lam{Y_i} is not a local variable reference +x' = λy0 ... yi-1 f0 ... fm yi+1 ... yn . \lam{f0 ... fm} = free local vars of \lam{Y_i} + E y0 ... yi-1 Yi yi+1 ... yn +~ +x' y0 ... yi-1 f0 ... fm Yi+1 ... Yn +\stoptrans + +\transform{Argument propagation} +{ +\lam{x} is a global variable, bound to a user-defined function + +\lam{x = E} + +\lam{Y_i} is not of a runtime representable type + +\lam{Y_i} is not a local variable reference + +\conclusion + +\lam{f0 ... fm} = free local vars of \lam{Y_i} + +\lam{x'} is a new global variable + +\lam{x' = λy0 ... yi-1 f0 ... fm yi+1 ... yn . E y0 ... yi-1 Yi yi+1 ... yn} + +\trans{x Y0 ... Yi ... Yn}{x' y0 ... yi-1 f0 ... fm Yi+1 ... Yn} +} + +TODO: The above definition looks too complicated... Can we find +something more concise? + +\subsection{Cast propagation} +This transform pushes casts down into the expression as far as possible. +\subsection{Let recursification} +This transform makes all lets recursive. +\subsection{Let simplification} +This transform makes the result value of all let expressions a simple +variable reference. +\subsection{Let flattening} +This transform turns two nested lets (\lam{let x = (let ... in ...) in +...}) into a single let. +\subsection{Simple let binding removal} +This transforms inlines simple let bindings (\eg a = b). +\subsection{Function inlining} +This transform inlines let bindings of a funtion type. TODO: This should +be generelized to anything that is non representable at runtime, or +something like that. +\subsection{Scrutinee simplification} +This transform ensures that the scrutinee of a case expression is always +a simple variable reference. +\subsection{Case binder wildening} +This transform replaces all binders of a each case alternative with a +wild binder (\ie, one that is never referred to). This will possibly +introduce a number of new "selector" case statements, that only select +one element from an algebraic datatype and bind it to a variable. +\subsection{Case value simplification} +This transform simplifies the result value of each case alternative by +binding the value in a let expression and replacing the value by a +simple variable reference. +\subsection{Case removal} +This transform removes any case statements with a single alternative and +only wild binders. + +\subsection{Example sequence} + +This section lists an example expression, with a sequence of transforms +applied to it. The exact transforms given here probably don't exactly +match the transforms given above anymore, but perhaps this can clarify +the big picture a bit. + +TODO: Update or remove this section. + +\startlambda + λx. let s = foo x in case s of (a, b) -> - r = case a of - High -> add - Low -> let - op' = case b of - High -> sub - Low -> \c d -> c - in - \c d -> op' d c -\stoptyping - -\subsection{Argument extraction} -This transformation makes sure that all of a bindings arguments are always -bound to variables at the top level of the bound value. Formally, we can -describe this transformation as follows. - -\transform{Argument extraction} -{ -\NC \app{transform}{\expr{\bind{f}{expr}}} \NC = \expr{\bind{f}{\app{transform'(expr)}}}\NR -\NR -\NC \app{transform'}{\expr{\lam{v}{expr}}} \NC = \expr{\lam{v}{\app{transform'}{expr}}}\NR -\NC \app{transform'}{\expr{expr :: a \xrightarrow b}} \NC = \expr{\lam{x}{\app{transform'}{\expr{(\app{expr}{x})}}}} \NR -} + case a of + High -> add + Low -> let + op' = case b of + High -> sub + Low -> λc.λd.c + in + λc.λd.op' d c +\stoplambda -When applying this transformation to our running example, we get the following -program. +After top-level η-abstraction: -\starttyping - \x c d -> +\startlambda + λx.λc.λd. (let s = foo x in case s of @@ -251,38 +679,16 @@ program. Low -> let op' = case b of High -> sub - Low -> \c d -> c + Low -> λc.λd.c in - \c d -> op' d c + λc.λd.op' d c ) c d -\stoptyping - -\startbuffer[from] -foo = \x -> case x of True -> (\y -> mul y y); False -> id -\stopbuffer -\startbuffer[to] -foo = \x z -> (case x of True -> (\y -> mul y y); False -> id) z -\stopbuffer +\stoplambda -\transexampleh{Argument extraction example}{from}{to} - -\subsection{Application propagation} -This transformation is meant to propagate application expressions downwards -into expressions as far as possible. Formally, we can describe this -transformation as follows. +After (extended) β-reduction: -\transform{Application propagation} -{ -\NC \app{transform}{\expr{\app{(\letexpr{binds}{expr})}{y}}} \NC = \expr{\letexpr{binds}{(\app{expr}{y})}}\NR -\NC \app{transform}{\expr{\app{(\lam{x}{expr})}{y}}} \NC = \app{\app{subs}{x \xRightarrow y}}{\expr{expr}}\NR -\NC \app{transform}{\expr{\app{(\case{x}{\alt{p}{e};...})}{y}}} \NC = \expr{\case{x}{\alt{p}{\app{e}{y}};...}}\;(for\;every\;alt)\NR -} - -When applying this transformation to our running example, we get the following -program. - -\starttyping - \x c d -> +\startlambda + λx.λc.λd. let s = foo x in case s of @@ -292,134 +698,190 @@ program. Low -> let op' = case b of High -> sub - Low -> \c d -> c + Low -> λc.λd.c in op' d c -\stoptyping +\stoplambda -\startbuffer[from] -foo = \x z -> (case x of True -> (\y -> mul y y); False -> id) z -\stopbuffer -\startbuffer[to] -foo = \x z -> case x of True -> mul z z; False -> id z -\stopbuffer +After return value extraction: -\transexampleh{Application propagation example}{from}{to} +\startlambda + λx.λc.λd. + let s = foo x + r = case s of + (a, b) -> + case a of + High -> add c d + Low -> let + op' = case b of + High -> sub + Low -> λc.λd.c + in + op' d c + in + r +\stoplambda -\subsection{Introducing main scope} -This transformation is meant to introduce a single let expression that will be -the "main scope". This is the let expression as described under requirement -\ref[letexpr]. This let expression will contain only a single binding, which -binds the original expression to some identifier and then evalutes to just -this identifier (to comply with requirement \in[retexpr]). +Scrutinee simplification does not apply. -Formally, we can describe the transformation as follows. +After case binder wildening: -\transform{Main scope introduction} -{ -\NC \app{transform}{\expr{\bind{f}{expr}}} \NC = \expr{\bind{f}{\app{transform'(expr)}}}\NR -\NR -\NC \app{transform'}{\expr{\lam{v}{expr}}} \NC = \expr{\lam{v}{\app{transform'}{expr}}}\NR -\NC \app{transform'}{\expr{expr}} \NC = \expr{\letexpr{\bind{x}{expr}}{x}} \NR -} +\startlambda + λx.λc.λd. + let s = foo x + a = case s of (a, _) -> a + b = case s of (_, b) -> b + r = case s of (_, _) -> + case a of + High -> add c d + Low -> let op' = case b of + High -> sub + Low -> λc.λd.c + in + op' d c + in + r +\stoplambda -When applying this transformation to our running example, we get the following -program. +After case value simplification -\starttyping - \x c d -> - let r = (let s = foo x - in - case s of - (a, b) -> - case a of - High -> add c d - Low -> let - op' = case b of - High -> sub - Low -> \c d -> c - in - op' d c - ) +\startlambda + λx.λc.λd. + let s = foo x + a = case s of (a, _) -> a + b = case s of (_, b) -> b + r = case s of (_, _) -> r' + rh = add c d + rl = let rll = λc.λd.c + op' = case b of + High -> sub + Low -> rll + in + op' d c + r' = case a of + High -> rh + Low -> rl in r -\stoptyping - -\subsection{Scope flattening} -This transformation tries to flatten the topmost let expression in a bind, -{\em i.e.}, bind all identifiers in the same scope, and bind them to simple -expressions only (so simplify deeply nested expressions). - -Formally, we can describe the transformation as follows. - -\transform{Main scope introduction} { \NC \app{transform}{\expr{\bind{f}{expr}}} \NC = \expr{\bind{f}{\app{transform'(expr)}}}\NR -\NR -\NC \app{transform'}{\expr{\lam{v}{expr}}} \NC = \expr{\lam{v}{\app{transform'}{expr}}}\NR -\NC \app{transform'}{\expr{\letexpr{binds}{expr}}} \NC = \expr{\letexpr{\app{concat . map . flatten}{binds}}{expr}}\NR -\NR -\NC \app{flatten}{\expr{\bind{x}{\letexpr{binds}{expr}}}} \NC = (\app{concat . map . flatten}{binds}) \cup \{\app{flatten}{\expr{\bind{x}{expr}}}\}\NR -\NC \app{flatten}{\expr{\bind{x}{\case{s}{alts}}}} \NC = \app{concat}{binds'} \cup \{\bind{x}{\case{s}{alts'}}\}\NR -\NC \NC \where{(binds', alts')=\app{unzip.map.(flattenalt s)}{alts}}\NR -\NC \app{\app{flattenalt}{s}}{\expr{\alt{\app{con}{x_0\;..\;x_n}}{expr}}} \NC = (extracts \cup \{\app{flatten}{bind}\}, alt)\NR -\NC \NC \where{extracts =\{\expr{\case{s}{\alt{\app{con}{x_0\;..\;x_n}}{x_0}}},}\NR -\NC \NC \;..\;,\expr{\case{s}{\alt{\app{con}{x_0\;..\;x_n}}{x_n}}}\} \NR -\NC \NC bind = \expr{\bind{y}{expr}}\NR -\NC \NC alt = \expr{\alt{\app{con}{\_\;..\;\_}}{y}}\NR -} +\stoplambda -When applying this transformation to our running example, we get the following -program. +After let flattening: -\starttyping - \x c d -> +\startlambda + λx.λc.λd. let s = foo x - r = case s of - (_, _) -> y - a = case s of (a, b) -> a - b = case s of (a, b) -> b - y = case a of - High -> g - Low -> h - g = add c d - h = op' d c + a = case s of (a, _) -> a + b = case s of (_, b) -> b + r = case s of (_, _) -> r' + rh = add c d + rl = op' d c + rll = λc.λd.c op' = case b of - High -> i - Low -> j - i = sub - j = \c d -> c + High -> sub + Low -> rll + r' = case a of + High -> rh + Low -> rl in r -\stoptyping +\stoplambda -\subsection{More transformations} -As noted before, the above transformations are not complete. Other needed -transformations include: -\startitemize -\item Inlining of local identifiers with a function type. Since these cannot -be represented in hardware directly, they must be transformated into something -else. Inlining them should always be possible without loss of semantics (TODO: -How true is this?) and can expose new possibilities for other transformations -passes (such as application propagation when inlining {\tt j} above). -\item A variation on inlining local identifiers is the propagation of -function arguments with a function type. This will probably be initiated when -transforming the caller (and not the callee), but it will also deal with -identifiers with a function type that are unrepresentable in hardware. - -Special care must be taken here, since the expression that is propagated into -the callee comes from a different scope. The function typed argument must thus -be replaced by any identifiers from the callers scope that the propagated -expression uses. - -Note that propagating an argument will change both a function's interface and -implementation. For this to work, a new function should be created instead of -modifying the original function, so any other callers will not be broken. -\item Something similar should happen with return values with function types. -\item Polymorphism must be removed from all user-defined functions. This is -again similar to propagation function typed arguments, since this will also -create duplicates of functions (for a specific type). This is an operation -that is commonly known as "specialization" and already happens in GHC (since -non-polymorph functions can be a lot faster than generic ones). -\item More builtin expressions should be added and these should be evaluated -by the compiler. For example, map, fold, +. -\stopitemize +After function inlining: + +\startlambda + λx.λc.λd. + let s = foo x + a = case s of (a, _) -> a + b = case s of (_, b) -> b + r = case s of (_, _) -> r' + rh = add c d + rl = (case b of + High -> sub + Low -> λc.λd.c) d c + r' = case a of + High -> rh + Low -> rl + in + r +\stoplambda + +After (extended) β-reduction again: + +\startlambda + λx.λc.λd. + let s = foo x + a = case s of (a, _) -> a + b = case s of (_, b) -> b + r = case s of (_, _) -> r' + rh = add c d + rl = case b of + High -> sub d c + Low -> d + r' = case a of + High -> rh + Low -> rl + in + r +\stoplambda + +After case value simplification again: + +\startlambda + λx.λc.λd. + let s = foo x + a = case s of (a, _) -> a + b = case s of (_, b) -> b + r = case s of (_, _) -> r' + rh = add c d + rlh = sub d c + rl = case b of + High -> rlh + Low -> d + r' = case a of + High -> rh + Low -> rl + in + r +\stoplambda + +After case removal: + +\startlambda + λx.λc.λd. + let s = foo x + a = case s of (a, _) -> a + b = case s of (_, b) -> b + r = r' + rh = add c d + rlh = sub d c + rl = case b of + High -> rlh + Low -> d + r' = case a of + High -> rh + Low -> rl + in + r +\stoplambda + +After let bind removal: + +\startlambda + λx.λc.λd. + let s = foo x + a = case s of (a, _) -> a + b = case s of (_, b) -> b + rh = add c d + rlh = sub d c + rl = case b of + High -> rlh + Low -> d + r' = case a of + High -> rh + Low -> rl + in + r' +\stoplambda + +Application simplification is not applicable. \stoptext