From b1f08ad1bc712b096ea8330252e7a343955004f7 Mon Sep 17 00:00:00 2001 From: Matthijs Kooijman Date: Wed, 2 Sep 2009 11:16:48 +0200 Subject: [PATCH] Improve some text and add definitions in Normalization. --- Chapters/Normalization.tex | 182 +++++++++++++++++++++---------------- 1 file changed, 103 insertions(+), 79 deletions(-) diff --git a/Chapters/Normalization.tex b/Chapters/Normalization.tex index 6c9b434..1f9f62b 100644 --- a/Chapters/Normalization.tex +++ b/Chapters/Normalization.tex @@ -52,11 +52,17 @@ while fully preserving the semantics of the program. This {\em normal form} is again a Core program, but with a very specific structure. A function in normal form has nested lambda's at the top, which -produce a let expression. This let expression binds every function application -in the function and produces a simple identifier. Every bound value in -the let expression is either a simple function application or a case -expression to extract a single element from a tuple returned by a -function. +produce a number of nested let expressions. These let expressions bind a +number of simple expressions in the function and produce a simple identifier. +Every bound value in the let expressions is either a simple function +application, a case expression to extract a single element from a tuple +returned by a function, or a case expression to choose between two signals +based on some other signal. + +This structure is easy to translate to VHDL, since each top level lambda will +be an input port, every bound value will become a concurrent statement (such +as a component instantiation or conditional signal assignment) and the result +variable will become the output port. An example of a program in canonical form would be: @@ -93,36 +99,114 @@ An example of a program in canonical form would be: res \stoplambda +\subsection{Definitions} +In the following sections, we will be using a number of functions and +notations, which we will define here. 
+ +\subsubsection{Transformations} +The most important notation is the one for transformation, which looks like +the following: + +\starttrans +context conditions +~ +from +------------------------ expression conditions +to +~ +context additions +\stoptrans + +Here, we describe a transformation. The most important parts are \lam{from} and +\lam{to}, which describe the Core expression that should be matched and the +expression that it should be replaced with. This matching can occur anywhere +in the function that is being normalized, so it applies to any subexpression as +well. + +The \lam{expression conditions} list a number of conditions on the \lam{from} +expression that must hold for the transformation to apply. + +Furthermore, there is some way to look into the environment (\eg, other top +level bindings). The \lam{context conditions} part specifies any number of +top level bindings that must be present for the transformation to apply. +Usually, this lists a top level binding that binds an identifier that is also +used in the \lam{from} expression, allowing us to "access" the value of a top +level binding in the \lam{to} expression (\eg, for inlining). + +Finally, there is a way to influence the environment. The \lam{context +additions} part lists any number of new top level bindings that should be +added. + +If there are no \lam{context conditions} or \lam{context additions}, they can +be left out altogether, along with the separator \lam{~}. + +TODO: Example + +\subsubsection{Other concepts} +A \emph{global variable} is any variable that is bound at the +top level of a program, or an external module. A local variable is any other +variable (\eg, variables local to a function, which can be bound by lambda +abstractions, let expressions and case expressions). + +A \emph{hardware representable} type is a type that we can generate +a signal for in hardware. For example, a bit, a vector of bits, a 32 bit +unsigned word, etc. 
Types that are not runtime representable notably +include (but are not limited to): Types, dictionaries, functions. + +A \emph{builtin function} is a function for which a builtin +hardware translation is available, because its actual definition is not +translatable. A user-defined function is any other function. + +\subsubsection{Functions} +Here, we define a number of functions that can be used below to concisely +specify conditions. + +\emph{gvar(expr)} is true when \emph{expr} is a variable that references a +global variable. It is false when it references a local variable. + +\emph{lvar(expr)} is the inverse of \emph{gvar}; it is true when \emph{expr} +references a local variable, false when it references a global variable. + +\emph{representable(expr)} or \emph{representable(var)} is true when +\emph{expr} or \emph{var} has a type that is representable at runtime. + +\subsection{Normal form definition} +We can describe this normal form in a slightly more formal manner. The +following EBNF-like description completely captures the intended structure +(and generates a subset of GHC's core format). + +Some clauses have an expression listed in parentheses. These are conditions +that need to apply to the clause. 
+ \startlambda \italic{normal} = \italic{lambda} -\italic{lambda} = λvar.\italic{lambda} (representable(typeof(var))) +\italic{lambda} = λvar.\italic{lambda} (representable(var)) | \italic{toplet} \italic{toplet} = let \italic{binding} in \italic{toplet} | letrec [\italic{binding}] in \italic{toplet} - | var (representable(typeof(var)), fvar(var)) -\italic{binding} = var = \italic{rhs} (representable(typeof(rhs))) + | var (representable(var)) +\italic{binding} = var = \italic{rhs} (representable(rhs)) -- State packing and unpacking by coercion - | var0 = var1 :: State ty (fvar(var1)) - | var0 = var1 :: ty (var0 :: State ty) (fvar(var1)) + | var0 = var1 :: State ty (lvar(var1)) + | var0 = var1 :: ty (var0 :: State ty) (lvar(var1)) \italic{rhs} = userapp | builtinapp -- Extractor case - | case var of C a0 ... an -> ai (fvar(var)) + | case var of C a0 ... an -> ai (lvar(var)) -- Selector case - | case var of (fvar(var)) - DEFAULT -> var0 (fvar(var0)) - C w0 ... wn -> resvar (\forall{}i, wi \neq resvar, fvar(resvar)) + | case var of (lvar(var)) + DEFAULT -> var0 (lvar(var0)) + C w0 ... wn -> resvar (\forall{}i, wi \neq resvar, lvar(resvar)) \italic{userapp} = \italic{userfunc} | \italic{userapp} {userarg} -\italic{userfunc} = var (tvar(var)) -\italic{userarg} = var (fvar(var)) +\italic{userfunc} = var (gvar(var)) +\italic{userarg} = var (lvar(var)) \italic{builtinapp} = \italic{builtinfunc} | \italic{builtinapp} \italic{builtinarg} \italic{builtinfunc} = var (bvar(var)) \italic{builtinarg} = \italic{coreexpr} \stoplambda --- TODO: Define tvar, fvar, typeof, representable -- TODO: Limit builtinarg further -- TODO: There can still be other casts around (which the code can handle, @@ -140,53 +224,7 @@ construction (\eg the \lam{case} statement) or call a builtin function (\eg \lam{add} or \lam{sub}). For these, a hardcoded VHDL translation is available. -\subsection{Normal definition} -Formally, the normal form is a core program obeying the following -constraints. 
TODO: Update this section, this is probably not completely -accurate or relevant anymore. - -\startitemize[R,inmargin] -%\item All top level binds must have the form $\expr{\bind{fun}{lamexpr}}$. -%$fun$ is an identifier that will be bound as a global identifier. -%\item A $lamexpr$ has the form $\expr{\lam{arg}{lamexpr}}$ or -%$\expr{letexpr}$. $arg$ is an identifier which will be bound as an $argument$. -%\item[letexpr] A $letexpr$ has the form $\expr{\letexpr{letbinds}{retexpr}}$. -%\item $letbinds$ is a list with elements of the form -%$\expr{\bind{res}{appexpr}}$ or $\expr{\bind{res}{builtinexpr}}$, where $res$ is -%an identifier that will be bound as local identifier. The type of the bound -%value must be a $hardware\;type$. -%\item[builtinexpr] A $builtinexpr$ is an expression that can be mapped to an -%equivalent VHDL expression. Since there are many supported forms for this, -%these are defined in a separate table. -%\item An $appexpr$ has the form $\expr{fun}$ or $\expr{\app{appexpr}{x}}$, -%where $fun$ is a global identifier and $x$ is a local identifier. -%\item[retexpr] A $retexpr$ has the form $\expr{x}$ or $\expr{tupexpr}$, where $x$ is a local identifier that is bound as an $argument$ or $result$. A $retexpr$ must -%be of a $hardware\;type$. -%\item A $tupexpr$ has the form $\expr{con}$ or $\expr{\app{tupexpr}{x}}$, -%where $con$ is a tuple constructor ({\em e.g.} $(,)$ or $(,,,)$) and $x$ is -%a local identifier. -%\item A $hardware\;type$ is a type that can be directly translated to -%hardware. This includes the types $Bit$, $SizedWord$, tuples containing -%elements of $hardware\;type$s, and will include others. This explicitely -%excludes function types. -\stopitemize - -TODO: Say something about uniqueness of identifiers - -\subsection{Builtin expressions} -A $builtinexpr$, as defined at \in[builtinexpr] can have any of the following forms. 
- -\startitemize[m,inmargin] -%\item -%$tuple\_extract=\expr{\case{t}{\alt{\app{con}{x_0\;x_1\;..\;x_n}}{x_i}}}$, -%where $t$ can be any local identifier, $con$ is a tuple constructor ({\em -%e.g.} $(,)$ or $(,,,)$), $x_0$ to $x_n$ can be any identifier, and $x_i$ can -%be any of $x_0$ to $x_n$. A case expression must have a $hardware\;type$. -%\item TODO: Many more! -\stopitemize - \section{Transform passes} - In this section we describe the actual transforms. Here we're using the core language in a notation that resembles lambda calculus. @@ -221,13 +259,13 @@ E \lam{E :: * -> *} \stoptrans \startbuffer[from] -foo = λa -> case a of +foo = λa.case a of True -> λb.mul b b False -> id \stopbuffer \startbuffer[to] -foo = λa.λx -> (case a of +foo = λa.λx.(case a of True -> λb.mul b b False -> λy.id y) x \stopbuffer @@ -871,20 +909,6 @@ cannot be brought into normal form by this transform. We rely on an inlining transformation to replace such a variable with an expression we can propagate again. -TODO: Move these definitions somewhere sensible. - -Definition: A global variable is any variable that is bound at the -top level of a program. A local variable is any other variable. - -Definition: A hardware representable type is a type that we can generate -a signal for in hardware. For example, a bit, a vector of bits, a 32 bit -unsigned word, etc. Types that are not runtime representable notably -include (but are not limited to): Types, dictionaries, functions. - -Definition: A builtin function is a function for which a builtin -hardware translation is available, because its actual definition is not -translatable. A user-defined function is any other function. - \starttrans x = E ~ -- 2.30.2
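
Reviewer note: the gvar and lvar predicates and the userapp clause of the grammar introduced by this patch can be illustrated with a small executable sketch. It is not part of the patch and not the compiler's actual code. The Var and App classes, the explicit set of global names, and the is_userapp helper are all hypothetical names chosen for illustration, and builtin functions (bvar) are deliberately left out.

```python
from dataclasses import dataclass


# Hypothetical, minimal stand-ins for Core expressions (illustration only).
@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class App:
    func: object
    arg: object


def gvar(expr, global_names):
    """True when expr is a variable that references a global variable."""
    return isinstance(expr, Var) and expr.name in global_names


def lvar(expr, global_names):
    """True when expr is a variable that references a local variable."""
    return isinstance(expr, Var) and expr.name not in global_names


def is_userapp(expr, global_names):
    """Check the clause: userapp = userfunc | userapp userarg,
    with userfunc = var (gvar(var)) and userarg = var (lvar(var))."""
    if isinstance(expr, App):
        return is_userapp(expr.func, global_names) and lvar(expr.arg, global_names)
    return gvar(expr, global_names)
```

Under these assumptions, an application of a global function to local variables is accepted, while a nested application in argument position or a local variable in function position is rejected. Passing the set of global names explicitly is a simplification; the text only says that a global variable is one bound at the top level of a program or in an external module.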