From 6cb18e51dfaa9f30754a2e47e1cf8f845b701ec6 Mon Sep 17 00:00:00 2001 From: Matthijs Kooijman Date: Thu, 8 Oct 2009 14:42:50 +0200 Subject: [PATCH 1/1] Add a section on binder uniqueness. --- Chapters/Normalization.tex | 94 ++++++++++++++++++++++++++++++++++++++ 1 file changed, 94 insertions(+) diff --git a/Chapters/Normalization.tex b/Chapters/Normalization.tex index d30cb6b..a2ccfa4 100644 --- a/Chapters/Normalization.tex +++ b/Chapters/Normalization.tex @@ -465,6 +465,100 @@ this notation are still a bit fuzzy, so comments are welcom. TODO: Formally describe the "apply to every (sub)expression" in terms of rules with full transformations in the conditions. +\subsection{Binder uniqueness} +A common problem in transformation systems, is binder uniqueness. When not +considering this problem, it is easy to create transformations that mix up +bindings and cause name collisions. Take for example, the following core +expression: + +\startlambda +(λa.λb.λc. a * b * c) x c +\stoplambda + +By applying β-reduction (see below) once, we can simplify this expression to: + +\startlambda +(λb.λc. x * b * c) c +\stoplambda + +Now, we have replaced the \lam{a} binder with a reference to the \lam{x} +binder. No harm done here. But note that we see multiple occurences of the +\lam{c} binder. The first is a binding occurence, to which the second refers. +The last, however refers to \emph{another} instance of \lam{c}, which is +bound somewhere outside of this expression. Now, if we would apply beta +reduction without taking heed of binder uniqueness, we would get: + +\startlambda +λc. x * c * c +\stoplambda + +This is obviously not what was supposed to happen! The root of this problem is +the reuse of binders: Identical binders can be bound in different scopes, such +that only the inner one is \quote{visible} in the inner expression. In the example +above, the \lam{c} binder was bound outside of the expression and in the inner +lambda expression. Inside that lambda expression, only the inner \lam{c} is +visible. + +There are a number of ways to solve this. \small{GHC} has isolated this +problem to their binder substitution code, which performs \emph{deshadowing} +during its expression traversal. This means that any binding that shadows +another binding on a higher level is replaced by a new binder that does not +shadow any other binding. This non-shadowing invariant is enough to prevent +binder uniqueness problems in \small{GHC}. + +In our transformation system, maintaining this non-shadowing invariant is +a bit harder to do (mostly due to implementation issues, the prototype doesn't +use \small{GHC}'s subsitution code). Also, we can observe the following +points. + +\startitemize +\item Deshadowing does not guarantee overall uniqueness. For example, the +following (slightly contrived) expression shows the identifier \lam{x} bound in +two seperate places (and to different values), even though no shadowing +occurs. + +\startlambda +(let x = 1 in x) + (let x = 2 in x) +\stoplambda + +\item In our normal form (and the resulting \small{VHDL}), all binders +(signals) will end up in the same scope. To allow this, all binders within the +same function should be unique. + +\item When we know that all binders in an expression are unique, moving around +or removing a subexpression will never cause any binder conflicts. If we have +some way to generate fresh binders, introducing new subexpressions will not +cause any problems either. The only way to cause conflicts is thus to +duplicate an existing subexpression. +\stopitemize + +Given the above, our prototype maintains a unique binder invariant. This +meanst that in any given moment during normalization, all binders \emph{within +a single function} must be unique. To achieve this, we apply the following +technique. + +TODO: Define fresh binders and unique supplies + +\startitemize +\item Before starting normalization, all binders in the function are made +unique. This is done by generating a fresh binder for every binder used. This +also replaces binders that did not pose any conflict, but it does ensure that +all binders within the function are generated by the same unique supply. See +(TODO: ref fresh binder). +\item Whenever a new binder must be generated, we generate a fresh binder that +is guaranteed to be different from \emph{all binders generated so far}. This +can thus never introduce duplication and will maintain the invariant. +\item Whenever (part of) an expression is duplicated (for example when +inlining), all binders in the expression are replaced with fresh binders +(using the same method as at the start of normalization). These fresh binders +can never introduce duplication, so this will maintain the invariant. +\item Whenever we move part of an expression around within the function, there +is no need to do anything special. There is obviously no way to introduce +duplication by moving expressions around. Since we know that each of the +binders is already unique, there is no way to introduce (incorrect) shadowing +either. +\stopitemize + \subsection{η-abstraction} This transformation makes sure that all arguments of a function-typed expression are named, by introducing lambda expressions. When combined with -- 2.30.2