X-Git-Url: https://git.stderr.nl/gitweb?p=matthijs%2Fmaster-project%2Freport.git;a=blobdiff_plain;f=Chapters%2FNormalization.tex;h=b74333ae103c15d51b08fa37cc708cc7c3f1503b;hp=d30cb6b99e3680eb32ba84bcd5ef3515c493c78b;hb=951d8f0c4551cbc1adc75c80ae21c86bee6d1e80;hpb=5b652f5aabadc31f785d7860c32c6fb4c7fb50e5 diff --git a/Chapters/Normalization.tex b/Chapters/Normalization.tex index d30cb6b..b74333a 100644 --- a/Chapters/Normalization.tex +++ b/Chapters/Normalization.tex @@ -15,10 +15,6 @@ } -% A transformation example -\definefloat[example][examples] -\setupcaption[example][location=top] % Put captions on top - \define[3]\transexample{ \placeexample[here]{#1} \startcombination[2*1] @@ -373,76 +369,371 @@ construction (\eg the \lam{case} statement) or call a builtin function (\eg \lam{add} or \lam{sub}). For these, a hardcoded \small{VHDL} translation is available. -\subsection{Definitions} -In the following sections, we will be using a number of functions and -notations, which we will define here. +\section{Transformation notation} +To be able to concisely present transformations, we use a specific format to +them. It is a simple format, similar to one used in logic reasoning. -\subsubsection{Transformations} -The most important notation is the one for transformation, which looks like -the following: +Such a transformation description looks like the following. \starttrans -context conditions + ~ -from ------------------------- expression conditions -to + +-------------------------- + ~ -context additions + \stoptrans -Here, we describe a transformation. The most import parts are \lam{from} and -\lam{to}, which describe the Core expresssion that should be matched and the -expression that it should be replaced with. This matching can occur anywhere -in function that is being normalized, so it applies to any subexpression as -well. +This format desribes a transformation that applies to \lam{original +expresssion} and transforms it into \lam{transformed expression}, assuming +that all conditions apply. In this format, there are a number of placeholders +in pointy brackets, most of which should be rather obvious in their meaning. +Nevertheless, we will more precisely specify their meaning below: + + \startdesc{} The expression pattern that will be matched + against (subexpressions of) the expression to be transformed. We call this a + pattern, because it can contain \emph{placeholders} (variables), which match + any expression or binder. Any such placeholder is said to be \emph{bound} to + the expression it matches. It is convention to use an uppercase latter (\eg + \lam{M} or \lam{E} to refer to any expression (including a simple variable + reference) and lowercase letters (\eg \lam{v} or \lam{b}) to refer to + (references to) binders. + + For example, the pattern \lam{a + B} will match the expression + \lam{v + (2 * w)} (and bind \lam{a} to \lam{v} and \lam{B} to + \lam{(2 * 2)}), but not \lam{v + (2 * w)}. + \stopdesc + + \startdesc{} + These are extra conditions on the expression that is matched. These + conditions can be used to further limit the cases in which the + transformation applies, in particular to prevent a transformation from + causing a loop with itself or another transformation. + + Only if these if these conditions are \emph{all} true, this transformation + applies. + \stopdesc + + \startdesc{} + These are a number of extra conditions on the context of the function. In + particular, these conditions can require some other top level function to be + present, whose value matches the pattern given here. The format of each of + these conditions is: \lam{binder = }. + + Typically, the binder is some placeholder bound in the \lam{}, while the pattern contains some placeholders that are used in + the \lam{transformed expression}. + + Only if a top level binder exists that matches each binder and pattern, this + transformation applies. + \stopdesc + + \startdesc{} + This is the expression template that is the result of the transformation. If, looking + at the above three items, the transformation applies, the \lam{original + expression} is completely replaced with the \lam{}. + We call this a template, because it can contain placeholders, referring to + any placeholder bound by the \lam{} or the + \lam{}. The resulting expression will have those + placeholders replaced by the values bound to them. + + Any binder (lowercase) placeholder that has no value bound to it yet will be + bound to (and replaced with) a fresh binder. + \stopdesc + + \startdesc{} + These are templates for new functions to add to the context. This is a way + to have a transformation create new top level functiosn. + + Each addition has the form \lam{binder = template}. As above, any + placeholder in the addition is replaced with the value bound to it, and any + binder placeholder that has no value bound to it yet will be bound to (and + replaced with) a fresh binder. + \stopdesc + + As an example, we'll look at η-abstraction: -The \lam{expression conditions} list a number of conditions on the \lam{from} -expression that must hold for the transformation to apply. +\starttrans +E \lam{E :: a -> b} +-------------- \lam{E} does not occur on a function position in an application +λx.E x \lam{E} is not a lambda abstraction. +\stoptrans -Furthermore, there is some way to look into the environment (\eg, other top -level bindings). The \lam{context conditions} part specifies any number of -top level bindings that must be present for the transformation to apply. -Usually, this lists a top level binding that binds an identfier that is also -used in the \lam{from} expression, allowing us to "access" the value of a top -level binding in the \lam{to} expression (\eg, for inlining). + Consider the following function, which is a fairly obvious way to specify a + simple ALU (Note \at{example}[ex:AddSubAlu] is the normal form of this + function): -Finally, there is a way to influence the environment. The \lam{context -additions} part lists any number of new top level bindings that should be -added. +\startlambda +alu :: Bit -> Word -> Word -> Word +alu = λopcode. case opcode of + Low -> (+) + High -> (-) +\stoplambda -If there are no \lam{context conditions} or \lam{context additions}, they can -be left out alltogether, along with the separator \lam{~}. + There are a few subexpressions in this function to which we could possibly + apply the transformation. Since the pattern of the transformation is only + the placeholder \lam{E}, any expression will match that. Whether the + transformation applies to an expression is thus solely decided by the + conditions to the right of the transformation. -TODO: Example + We will look at each expression in the function in a top down manner. The + first expression is the entire expression the function is bound to. -\subsubsection{Other concepts} -A \emph{global variable} is any variable that is bound at the -top level of a program, or an external module. A local variable is any other -variable (\eg, variables local to a function, which can be bound by lambda -abstractions, let expressions and case expressions). +\startlambda +λopcode. case opcode of + Low -> (+) + High -> (-) +\stoplambda -A \emph{hardware representable} type is a type that we can generate -a signal for in hardware. For example, a bit, a vector of bits, a 32 bit -unsigned word, etc. Types that are not runtime representable notably -include (but are not limited to): Types, dictionaries, functions. + As said, the expression pattern matches this. The type of this expression is + \lam{Bit -> Word -> Word -> Word}, which matches \lam{a -> b} (Note that in + this case \lam{a = Bit} and \lam{b = Word -> Word -> Word}). -A \emph{builtin function} is a function for which a builtin -hardware translation is available, because its actual definition is not -translatable. A user-defined function is any other function. + Since this expression is at top level, it does not occur at a function + position of an application. However, The expression is a lambda abstraction, + so this transformation does not apply. + + The next expression we could apply this transformation to, is the body of + the lambda abstraction: + +\startlambda +case opcode of + Low -> (+) + High -> (-) +\stoplambda + + The type of this expression is \lam{Word -> Word -> Word}, which again + matches \lam{a -> b}. The expression is the body of a lambda expression, so + it does not occur at a function position of an application. Finally, the + expression is not a lambda abstraction but a case expression, so all the + conditions match. There are no context conditions to match, so the + transformation applies. + + By now, the placeholder \lam{E} is bound to the entire expression. The + placeholder \lam{x}, which occurs in the replacement template, is not bound + yet, so we need to generate a fresh binder for that. Let's use the binder + \lam{a}. This results in the following replacement expression: + +\startlambda +λa.(case opcode of + Low -> (+) + High -> (-)) a +\stoplambda + + Continuing with this expression, we see that the transformation does not + apply again (it is a lambda expression). Next we look at the body of this + labmda abstraction: + +\startlambda +(case opcode of + Low -> (+) + High -> (-)) a +\stoplambda + + Here, the transformation does apply, binding \lam{E} to the entire + expression and \lam{x} to the fresh binder \lam{b}, resulting in the + replacement: + +\startlambda +λb.(case opcode of + Low -> (+) + High -> (-)) a b +\stoplambda + + Again, the transformation does not apply to this lambda abstraction, so we + look at its body. For brevity, we'll put the case statement on one line from + now on. + +\startlambda +(case opcode of Low -> (+); High -> (-)) a b +\stoplambda + + The type of this expression is \lam{Word}, so it does not match \lam{a -> b} + and the transformation does not apply. Next, we have two options for the + next expression to look at: The function position and argument position of + the application. The expression in the argument position is \lam{b}, which + has type \lam{Word}, so the transformation does not apply. The expression in + the function position is: + +\startlambda +(case opcode of Low -> (+); High -> (-)) a +\stoplambda + + Obviously, the transformation does not apply here, since it occurs in + function position. In the same way the transformation does not apply to both + components of this expression (\lam{case opcode of Low -> (+); High -> (-)} + and \lam{a}), so we'll skip to the components of the case expression: The + scrutinee and both alternatives. Since the opcode is not a function, it does + not apply here, and we'll leave both alternatives as an exercise to the + reader. The final function, after all these transformations becomes: + +\startlambda +alu :: Bit -> Word -> Word -> Word +alu = λopcode.λa.b. (case opcode of + Low -> λa1.λb1 (+) a1 b1 + High -> λa2.λb2 (-) a2 b2) a b +\stoplambda + + In this case, the transformation does not apply anymore, though this might + not always be the case (e.g., the application of a transformation on a + subexpression might open up possibilities to apply the transformation + further up in the expression). + +\subsection{Transformation application} +In this chapter we define a number of transformations, but how will we apply +these? As stated before, our normal form is reached as soon as no +transformation applies anymore. This means our application strategy is to +simply apply any transformation that applies, and continuing to do that with +the result of each transformation. + +In particular, we define no particular order of transformations. Since +transformation order should not influence the resulting normal form (see TODO +ref), this leaves the implementation free to choose any application order that +results in an efficient implementation. + +When applying a single transformation, we try to apply it to every (sub)expression +in a function, not just the top level function. This allows us to keep the +transformation descriptions concise and powerful. + +\subsection{Definitions} +In the following sections, we will be using a number of functions and +notations, which we will define here. + +\subsubsection{Other concepts} +A \emph{global variable} is any variable that is bound at the +top level of a program, or an external module. A \emph{local variable} is any +other variable (\eg, variables local to a function, which can be bound by +lambda abstractions, let expressions and pattern matches of case +alternatives). Note that this is a slightly different notion of global versus +local than what \small{GHC} uses internally. +\defref{global variable} \defref{local variable} + +A \emph{hardware representable} (or just \emph{representable}) type or value +is (a value of) a type that we can generate a signal for in hardware. For +example, a bit, a vector of bits, a 32 bit unsigned word, etc. Types that are +not runtime representable notably include (but are not limited to): Types, +dictionaries, functions. +\defref{representable} + +A \emph{builtin function} is a function supplied by the Cλash framework, whose +implementation is not valid Cλash. The implementation is of course valid +Haskell, for simulation, but it is not expressable in Cλash. +\defref{builtin function} \defref{user-defined function} + +For these functions, Cλash has a \emph{builtin hardware translation}, so calls +to these functions can still be translated. These are functions like +\lam{map}, \lam{hwor} and \lam{length}. + +A \emph{user-defined} function is a function for which we do have a Cλash +implementation available. \subsubsection{Functions} Here, we define a number of functions that can be used below to concisely specify conditions. -\emph{gvar(expr)} is true when \emph{expr} is a variable that references a +\refdef{global variable}\emph{gvar(expr)} is true when \emph{expr} is a variable that references a global variable. It is false when it references a local variable. -\emph{lvar(expr)} is the inverse of \emph{gvar}; it is true when \emph{expr} +\refdef{local variable}\emph{lvar(expr)} is the complement of \emph{gvar}; it is true when \emph{expr} references a local variable, false when it references a global variable. -\emph{representable(expr)} or \emph{representable(var)} is true when -\emph{expr} or \emph{var} has a type that is representable at runtime. +\refdef{representable}\emph{representable(expr)} or \emph{representable(var)} is true when +\emph{expr} or \emph{var} is \emph{representable}. + +\subsection{Binder uniqueness} +A common problem in transformation systems, is binder uniqueness. When not +considering this problem, it is easy to create transformations that mix up +bindings and cause name collisions. Take for example, the following core +expression: + +\startlambda +(λa.λb.λc. a * b * c) x c +\stoplambda + +By applying β-reduction (see below) once, we can simplify this expression to: + +\startlambda +(λb.λc. x * b * c) c +\stoplambda + +Now, we have replaced the \lam{a} binder with a reference to the \lam{x} +binder. No harm done here. But note that we see multiple occurences of the +\lam{c} binder. The first is a binding occurence, to which the second refers. +The last, however refers to \emph{another} instance of \lam{c}, which is +bound somewhere outside of this expression. Now, if we would apply beta +reduction without taking heed of binder uniqueness, we would get: + +\startlambda +λc. x * c * c +\stoplambda + +This is obviously not what was supposed to happen! The root of this problem is +the reuse of binders: Identical binders can be bound in different scopes, such +that only the inner one is \quote{visible} in the inner expression. In the example +above, the \lam{c} binder was bound outside of the expression and in the inner +lambda expression. Inside that lambda expression, only the inner \lam{c} is +visible. + +There are a number of ways to solve this. \small{GHC} has isolated this +problem to their binder substitution code, which performs \emph{deshadowing} +during its expression traversal. This means that any binding that shadows +another binding on a higher level is replaced by a new binder that does not +shadow any other binding. This non-shadowing invariant is enough to prevent +binder uniqueness problems in \small{GHC}. + +In our transformation system, maintaining this non-shadowing invariant is +a bit harder to do (mostly due to implementation issues, the prototype doesn't +use \small{GHC}'s subsitution code). Also, we can observe the following +points. + +\startitemize +\item Deshadowing does not guarantee overall uniqueness. For example, the +following (slightly contrived) expression shows the identifier \lam{x} bound in +two seperate places (and to different values), even though no shadowing +occurs. + +\startlambda +(let x = 1 in x) + (let x = 2 in x) +\stoplambda + +\item In our normal form (and the resulting \small{VHDL}), all binders +(signals) will end up in the same scope. To allow this, all binders within the +same function should be unique. + +\item When we know that all binders in an expression are unique, moving around +or removing a subexpression will never cause any binder conflicts. If we have +some way to generate fresh binders, introducing new subexpressions will not +cause any problems either. The only way to cause conflicts is thus to +duplicate an existing subexpression. +\stopitemize + +Given the above, our prototype maintains a unique binder invariant. This +meanst that in any given moment during normalization, all binders \emph{within +a single function} must be unique. To achieve this, we apply the following +technique. + +TODO: Define fresh binders and unique supplies + +\startitemize +\item Before starting normalization, all binders in the function are made +unique. This is done by generating a fresh binder for every binder used. This +also replaces binders that did not pose any conflict, but it does ensure that +all binders within the function are generated by the same unique supply. See +(TODO: ref fresh binder). +\item Whenever a new binder must be generated, we generate a fresh binder that +is guaranteed to be different from \emph{all binders generated so far}. This +can thus never introduce duplication and will maintain the invariant. +\item Whenever (part of) an expression is duplicated (for example when +inlining), all binders in the expression are replaced with fresh binders +(using the same method as at the start of normalization). These fresh binders +can never introduce duplication, so this will maintain the invariant. +\item Whenever we move part of an expression around within the function, there +is no need to do anything special. There is obviously no way to introduce +duplication by moving expressions around. Since we know that each of the +binders is already unique, there is no way to introduce (incorrect) shadowing +either. +\stopitemize \section{Transform passes} In this section we describe the actual transforms. Here we're using @@ -462,9 +753,6 @@ is specified as a number of conditions (above the horizontal line) and a number of conclusions (below the horizontal line). The details of using this notation are still a bit fuzzy, so comments are welcom. -TODO: Formally describe the "apply to every (sub)expression" in terms of -rules with full transformations in the conditions. - \subsection{η-abstraction} This transformation makes sure that all arguments of a function-typed expression are named, by introducing lambda expressions. When combined with @@ -472,7 +760,7 @@ expression are named, by introducing lambda expressions. When combined with be lambda abstractions or global identifiers. \starttrans -E \lam{E :: * -> *} +E \lam{E :: a -> b} -------------- \lam{E} is not the first argument of an application. λx.E x \lam{E} is not a lambda abstraction. \lam{x} is a variable that does not occur free in \lam{E}. @@ -492,21 +780,66 @@ foo = λa.λx.(case a of \transexample{η-abstraction}{from}{to} -\subsection{Extended β-reduction} +\subsection{β-reduction} +β-reduction is a well known transformation from lambda calculus, where it is +the main reduction step. It reduces applications of labmda abstractions, +removing both the lambda abstraction and the application. + +In our transformation system, this step helps to remove unwanted lambda +abstractions (basically all but the ones at the top level). Other +transformations (application propagation, non-representable inlining) make +sure that most lambda abstractions will eventually be reducable by +β-reduction. + +TODO: Define substitution syntax + +\starttrans +(λx.E) M +----------------- +E[M/x] +\stoptrans + +% And an example +\startbuffer[from] +(λa. 2 * a) (2 * b) +\stopbuffer + +\startbuffer[to] +2 * (2 * b) +\stopbuffer + +\transexample{β-reduction}{from}{to} + +\subsection{Application propagation} This transformation is meant to propagate application expressions downwards -into expressions as far as possible. In lambda calculus, this reduction -is known as β-reduction, but it is of course only defined for -applications of lambda abstractions. We extend this reduction to also -work for the rest of core (case and let expressions). +into expressions as far as possible. This allows partial applications inside +expressions to become fully applied and exposes new transformation +possibilities for other transformations (like β-reduction). -For let expressions: \starttrans let binds in E) M ----------------- let binds in E M \stoptrans -For case statements: +% And an example +\startbuffer[from] +( let + val = 1 + in + add val +) 3 +\stopbuffer + +\startbuffer[to] +let + val = 1 +in + add val 3 +\stopbuffer + +\transexample{Application propagation for a let expression}{from}{to} + \starttrans (case x of p1 -> E1 @@ -519,35 +852,21 @@ case x of pn -> En M \stoptrans -For lambda expressions: -\starttrans -(λx.E) M ------------------ -E[M/x] -\stoptrans - % And an example \startbuffer[from] -( let a = (case x of - True -> id - False -> neg - ) 1 - b = (let y = 3 in add y) 2 - in - (λz.add 1 z) -) 3 +( case x of + True -> id + False -> neg +) 1 \stopbuffer \startbuffer[to] -let a = case x of - True -> id 1 - False -> neg 1 - b = let y = 3 in add y 2 -in - add 1 3 +case x of + True -> id 1 + False -> neg 1 \stopbuffer -\transexample{Extended β-reduction}{from}{to} +\transexample{Application propagation for a case expression}{from}{to} \subsection{Let derecursification} This transformation is meant to make lets non-recursive whenever possible. @@ -563,7 +882,7 @@ in scope for the function return value). Note that this transformation does not try to be smart when faced with recursive lets, it will just leave the lets recursive (possibly joining a recursive and non-recursive let into a single recursive let). The let -rederursification transformation will do this instead. +rederecursification transformation will do this instead. \starttrans letnonrec x = (let bindings in M) in N