X-Git-Url: https://git.stderr.nl/gitweb?p=matthijs%2Fmaster-project%2Freport.git;a=blobdiff_plain;f=Chapters%2FNormalization.tex;h=445c32389ad91ccd8f874ddbd49ebb8f7fa71682;hp=a2ccfa4c9b6111f27acf6a23d486244d1d0c7b1b;hb=35ce834593526d613598a431723cd5be7fe440af;hpb=6cb18e51dfaa9f30754a2e47e1cf8f845b701ec6 diff --git a/Chapters/Normalization.tex b/Chapters/Normalization.tex index a2ccfa4..445c323 100644 --- a/Chapters/Normalization.tex +++ b/Chapters/Normalization.tex @@ -373,48 +373,236 @@ construction (\eg the \lam{case} statement) or call a builtin function (\eg \lam{add} or \lam{sub}). For these, a hardcoded \small{VHDL} translation is available. -\subsection{Definitions} -In the following sections, we will be using a number of functions and -notations, which we will define here. +\section{Transformation notation} +To be able to concisely present transformations, we use a specific format to +them. It is a simple format, similar to one used in logic reasoning. -\subsubsection{Transformations} -The most important notation is the one for transformation, which looks like -the following: +Such a transformation description looks like the following. \starttrans -context conditions + ~ -from ------------------------- expression conditions -to + +-------------------------- + ~ -context additions + \stoptrans -Here, we describe a transformation. The most import parts are \lam{from} and -\lam{to}, which describe the Core expresssion that should be matched and the -expression that it should be replaced with. This matching can occur anywhere -in function that is being normalized, so it applies to any subexpression as -well. +This format desribes a transformation that applies to \lam{original +expresssion} and transforms it into \lam{transformed expression}, assuming +that all conditions apply. In this format, there are a number of placeholders +in pointy brackets, most of which should be rather obvious in their meaning. +Nevertheless, we will more precisely specify their meaning below: + + \startdesc{} The expression pattern that will be matched + against (subexpressions of) the expression to be transformed. We call this a + pattern, because it can contain \emph{placeholders} (variables), which match + any expression or binder. Any such placeholder is said to be \emph{bound} to + the expression it matches. It is convention to use an uppercase latter (\eg + \lam{M} or \lam{E} to refer to any expression (including a simple variable + reference) and lowercase letters (\eg \lam{v} or \lam{b}) to refer to + (references to) binders. + + For example, the pattern \lam{a + B} will match the expression + \lam{v + (2 * w)} (and bind \lam{a} to \lam{v} and \lam{B} to + \lam{(2 * 2)}), but not \lam{v + (2 * w)}. + \stopdesc + + \startdesc{} + These are extra conditions on the expression that is matched. These + conditions can be used to further limit the cases in which the + transformation applies, in particular to prevent a transformation from + causing a loop with itself or another transformation. + + Only if these if these conditions are \emph{all} true, this transformation + applies. + \stopdesc + + \startdesc{} + These are a number of extra conditions on the context of the function. In + particular, these conditions can require some other top level function to be + present, whose value matches the pattern given here. The format of each of + these conditions is: \lam{binder = }. + + Typically, the binder is some placeholder bound in the \lam{}, while the pattern contains some placeholders that are used in + the \lam{transformed expression}. + + Only if a top level binder exists that matches each binder and pattern, this + transformation applies. + \stopdesc + + \startdesc{} + This is the expression template that is the result of the transformation. If, looking + at the above three items, the transformation applies, the \lam{original + expression} is completely replaced with the \lam{}. + We call this a template, because it can contain placeholders, referring to + any placeholder bound by the \lam{} or the + \lam{}. The resulting expression will have those + placeholders replaced by the values bound to them. + + Any binder (lowercase) placeholder that has no value bound to it yet will be + bound to (and replaced with) a fresh binder. + \stopdesc + + \startdesc{} + These are templates for new functions to add to the context. This is a way + to have a transformation create new top level functiosn. + + Each addition has the form \lam{binder = template}. As above, any + placeholder in the addition is replaced with the value bound to it, and any + binder placeholder that has no value bound to it yet will be bound to (and + replaced with) a fresh binder. + \stopdesc + + As an example, we'll look at η-abstraction: + +\starttrans +E \lam{E :: a -> b} +-------------- \lam{E} does not occur on a function position in an application +λx.E x \lam{E} is not a lambda abstraction. +\stoptrans -The \lam{expression conditions} list a number of conditions on the \lam{from} -expression that must hold for the transformation to apply. + Consider the following function, which is a fairly obvious way to specify a + simple ALU (Note \at{example}[ex:AddSubAlu] is the normal form of this + function): -Furthermore, there is some way to look into the environment (\eg, other top -level bindings). The \lam{context conditions} part specifies any number of -top level bindings that must be present for the transformation to apply. -Usually, this lists a top level binding that binds an identfier that is also -used in the \lam{from} expression, allowing us to "access" the value of a top -level binding in the \lam{to} expression (\eg, for inlining). +\startlambda +alu :: Bit -> Word -> Word -> Word +alu = λopcode. case opcode of + Low -> (+) + High -> (-) +\stoplambda -Finally, there is a way to influence the environment. The \lam{context -additions} part lists any number of new top level bindings that should be -added. + There are a few subexpressions in this function to which we could possibly + apply the transformation. Since the pattern of the transformation is only + the placeholder \lam{E}, any expression will match that. Whether the + transformation applies to an expression is thus solely decided by the + conditions to the right of the transformation. -If there are no \lam{context conditions} or \lam{context additions}, they can -be left out alltogether, along with the separator \lam{~}. + We will look at each expression in the function in a top down manner. The + first expression is the entire expression the function is bound to. -TODO: Example +\startlambda +λopcode. case opcode of + Low -> (+) + High -> (-) +\stoplambda + + As said, the expression pattern matches this. The type of this expression is + \lam{Bit -> Word -> Word -> Word}, which matches \lam{a -> b} (Note that in + this case \lam{a = Bit} and \lam{b = Word -> Word -> Word}). + + Since this expression is at top level, it does not occur at a function + position of an application. However, The expression is a lambda abstraction, + so this transformation does not apply. + + The next expression we could apply this transformation to, is the body of + the lambda abstraction: + +\startlambda +case opcode of + Low -> (+) + High -> (-) +\stoplambda + + The type of this expression is \lam{Word -> Word -> Word}, which again + matches \lam{a -> b}. The expression is the body of a lambda expression, so + it does not occur at a function position of an application. Finally, the + expression is not a lambda abstraction but a case expression, so all the + conditions match. There are no context conditions to match, so the + transformation applies. + + By now, the placeholder \lam{E} is bound to the entire expression. The + placeholder \lam{x}, which occurs in the replacement template, is not bound + yet, so we need to generate a fresh binder for that. Let's use the binder + \lam{a}. This results in the following replacement expression: + +\startlambda +λa.(case opcode of + Low -> (+) + High -> (-)) a +\stoplambda + + Continuing with this expression, we see that the transformation does not + apply again (it is a lambda expression). Next we look at the body of this + labmda abstraction: + +\startlambda +(case opcode of + Low -> (+) + High -> (-)) a +\stoplambda + + Here, the transformation does apply, binding \lam{E} to the entire + expression and \lam{x} to the fresh binder \lam{b}, resulting in the + replacement: + +\startlambda +λb.(case opcode of + Low -> (+) + High -> (-)) a b +\stoplambda + + Again, the transformation does not apply to this lambda abstraction, so we + look at its body. For brevity, we'll put the case statement on one line from + now on. + +\startlambda +(case opcode of Low -> (+); High -> (-)) a b +\stoplambda + + The type of this expression is \lam{Word}, so it does not match \lam{a -> b} + and the transformation does not apply. Next, we have two options for the + next expression to look at: The function position and argument position of + the application. The expression in the argument position is \lam{b}, which + has type \lam{Word}, so the transformation does not apply. The expression in + the function position is: + +\startlambda +(case opcode of Low -> (+); High -> (-)) a +\stoplambda + + Obviously, the transformation does not apply here, since it occurs in + function position. In the same way the transformation does not apply to both + components of this expression (\lam{case opcode of Low -> (+); High -> (-)} + and \lam{a}), so we'll skip to the components of the case expression: The + scrutinee and both alternatives. Since the opcode is not a function, it does + not apply here, and we'll leave both alternatives as an exercise to the + reader. The final function, after all these transformations becomes: + +\startlambda +alu :: Bit -> Word -> Word -> Word +alu = λopcode.λa.b. (case opcode of + Low -> λa1.λb1 (+) a1 b1 + High -> λa2.λb2 (-) a2 b2) a b +\stoplambda + + In this case, the transformation does not apply anymore, though this might + not always be the case (e.g., the application of a transformation on a + subexpression might open up possibilities to apply the transformation + further up in the expression). + +\subsection{Transformation application} +In this chapter we define a number of transformations, but how will we apply +these? As stated before, our normal form is reached as soon as no +transformation applies anymore. This means our application strategy is to +simply apply any transformation that applies, and continuing to do that with +the result of each transformation. + +In particular, we define no particular order of transformations. Since +transformation order should not influence the resulting normal form (see TODO +ref), this leaves the implementation free to choose any application order that +results in an efficient implementation. + +When applying a single transformation, we try to apply it to every (sub)expression +in a function, not just the top level function. This allows us to keep the +transformation descriptions concise and powerful. + +\subsection{Definitions} +In the following sections, we will be using a number of functions and +notations, which we will define here. \subsubsection{Other concepts} A \emph{global variable} is any variable that is bound at the @@ -444,27 +632,6 @@ references a local variable, false when it references a global variable. \emph{representable(expr)} or \emph{representable(var)} is true when \emph{expr} or \emph{var} has a type that is representable at runtime. -\section{Transform passes} -In this section we describe the actual transforms. Here we're using -the core language in a notation that resembles lambda calculus. - -Each of these transforms is meant to be applied to every (sub)expression -in a program, for as long as it applies. Only when none of the -transformations can be applied anymore, the program is in normal form (by -definition). We hope to be able to prove that this form will obey all of the -constraints defined above, but this has yet to happen (though it seems likely -that it will). - -Each of the transforms will be described informally first, explaining -the need for and goal of the transform. Then, a formal definition is -given, using a familiar syntax from the world of logic. Each transform -is specified as a number of conditions (above the horizontal line) and a -number of conclusions (below the horizontal line). The details of using -this notation are still a bit fuzzy, so comments are welcom. - -TODO: Formally describe the "apply to every (sub)expression" in terms of -rules with full transformations in the conditions. - \subsection{Binder uniqueness} A common problem in transformation systems, is binder uniqueness. When not considering this problem, it is easy to create transformations that mix up @@ -559,6 +726,24 @@ binders is already unique, there is no way to introduce (incorrect) shadowing either. \stopitemize +\section{Transform passes} +In this section we describe the actual transforms. Here we're using +the core language in a notation that resembles lambda calculus. + +Each of these transforms is meant to be applied to every (sub)expression +in a program, for as long as it applies. Only when none of the +transformations can be applied anymore, the program is in normal form (by +definition). We hope to be able to prove that this form will obey all of the +constraints defined above, but this has yet to happen (though it seems likely +that it will). + +Each of the transforms will be described informally first, explaining +the need for and goal of the transform. Then, a formal definition is +given, using a familiar syntax from the world of logic. Each transform +is specified as a number of conditions (above the horizontal line) and a +number of conclusions (below the horizontal line). The details of using +this notation are still a bit fuzzy, so comments are welcom. + \subsection{η-abstraction} This transformation makes sure that all arguments of a function-typed expression are named, by introducing lambda expressions. When combined with @@ -566,7 +751,7 @@ expression are named, by introducing lambda expressions. When combined with be lambda abstractions or global identifiers. \starttrans -E \lam{E :: * -> *} +E \lam{E :: a -> b} -------------- \lam{E} is not the first argument of an application. λx.E x \lam{E} is not a lambda abstraction. \lam{x} is a variable that does not occur free in \lam{E}.