From: Matthijs Kooijman <matthijs@stdin.nl>
Date: Mon, 30 Nov 2009 19:50:12 +0000 (+0100)
Subject: Finalize the proofs section.
X-Git-Tag: final-thesis~136
X-Git-Url: https://git.stderr.nl/gitweb?p=matthijs%2Fmaster-project%2Freport.git;a=commitdiff_plain;h=627861d0a5c3dfeb94b75f0f09c3e45906e46ea0

Finalize the proofs section.

Also add a section about proofs to the future work section.
---

diff --git a/Chapters/Future.tex b/Chapters/Future.tex
index daeee3f..852d14f 100644
--- a/Chapters/Future.tex
+++ b/Chapters/Future.tex
@@ -616,4 +616,12 @@ lightly.
   methods for describing don't care conditions. Possibly there are completely
   other methods which work better.
 
+\section{Correctness proofs of the normalization system}
+As stated in \in{section}[sec:normalization:properties], there are a
+number of properties we would like to see verified about the
+normalization system.  In particular, the \emph{termination} and
+\emph{completeness} of the system would be a good candidate for future
+research. Specifying formal semantics for the Core language in
+order to verify the \emph{soundness} of the system would be an even more
+challenging task.
 % vim: set sw=2 sts=2 expandtab:
diff --git a/Chapters/Normalization.tex b/Chapters/Normalization.tex
index 941958d..4e00759 100644
--- a/Chapters/Normalization.tex
+++ b/Chapters/Normalization.tex
@@ -325,6 +325,7 @@
       Some clauses have an expression listed in parentheses. These are conditions
       that need to apply to the clause.
 
+      \defref{intended normal form definition}
       \todo{Fix indentation}
       \startlambda
       \italic{normal} = \italic{lambda}
@@ -1712,7 +1713,7 @@
         \transexample{nonrepinline}{Nonrepresentable binding inlining}{from}{to}
 
 
-  \section{Provable properties}
+  \section[sec:normalization:properties]{Provable properties}
     When looking at the system of transformations outlined above, there are a
     number of questions that we can ask ourselves. The main question is of course:
     \quote{Does our system work as intended?}. We can split this question into a
@@ -1743,6 +1744,21 @@
     three: The translator would still function properly without it.
     \stopitemize
 
+    Unfortunately, the final transformation system has only been
+    developed in the final part of the research, leaving no more time
+    for verifying these properties. In fact, it is likely that the
+    current transformation system still violates some of these
+    properties in some cases and should be improved (or extra conditions
+    on the input hardware descriptions should be formulated).
+
+    This is most likely the case with the completeness and determinism
+    properties, perhaps als the termination property. The soundness
+    property probably holds, since it is easier to manually verify (each
+    transformation can be reviewed separately).
+
+    Even though no complete proofs have been made, some ideas for
+    possible proof strategies are shown below.
+
     \subsection{Graph representation}
       Before looking into how to prove these properties, we'll look at our
       transformation system from a graph perspective. The nodes of the graph are
@@ -1813,7 +1829,9 @@
       Note that the normal form of such a system consists of the set of nodes
       (expressions) without outgoing edges, since those are the expression to which
       no transformation applies anymore. We call this set of nodes the \emph{normal
-      set}.
+      set}. The set of nodes containing expressions in intended normal
+      form \refdef{intended normal form} is called the \emph{intended
+      normal set}.
 
       From such a graph, we can derive some properties easily:
       \startitemize[KR]
@@ -1823,7 +1841,10 @@
         \item A system is \emph{complete} if all of the nodes in the normal set have
         the intended normal form. The inverse (that all of the nodes outside of
         the normal set are \emph{not} in the intended normal form) is not
-        strictly required.
+        strictly required. In other words, our normal set must be a
+        subset of the intended normal form, but they do not need to be
+        the same set.
+        form.
         \item A system is deterministic if all paths starting at a particular
         node, which end in a node in the normal set, end at the same node.
       \stopitemize
@@ -1847,20 +1868,112 @@
 
     \todo{Add content to these sections}
     \subsection{Termination}
-      Approach: Counting.
-
-      Church-Rosser?
+      In general, proving termination of an arbitrary program is a very
+      hard problem. \todo{Ref about arbitrary termination} Fortunately,
+      we only have to prove termination for our specific transformation
+      system.
+
+      A common approach for these kinds of proofs is to associate a
+      measure with each possible expression in our system. If we can
+      show that each transformation strictly decreases this measure
+      (\ie, the expression transformed to has a lower measure than the
+      expression transformed from).  \todo{ref about measure-based
+      termination proofs / analysis}
+      
+      A good measure for a system consisting of just Î²-reduction would
+      be the number of lambda expressions in the expression. Since every
+      application of Î²-reduction removes a lambda abstraction (and there
+      is always a bounded number of lambda abstractions in every
+      expression) we can easily see that a transformation system with
+      just Î²-reduction will always terminate.
+
+      For our complete system, this measure would be fairly complex
+      (probably the sum of a lot of things). Since the (conditions on)
+      our transformations are pretty complex, we would need to include
+      both simple things like the number of let expressions as well as
+      more complex things like the number of case expressions that are
+      not yet in normal form.
+
+      No real attempt has been made at finding a suitable measure for
+      our system yet.
 
     \subsection{Soundness}
-      Needs formal definition of semantics.
-      Prove for each transformation seperately, implies soundness of the system.
-     
+      Soundness is a property that can be proven for each transformation
+      separately. Since our system only runs separate transformations
+      sequentially, if each of our transformations leaves the
+      \emph{meaning} of the expression unchanged, then the entire system
+      will of course leave the meaning unchanged and is thus
+      \emph{sound}.
+
+      The current prototype has only been verified in an ad-hoc fashion
+      by inspecting (the code for) each transformation. A more formal
+      verification would be more appropriate.
+
+      To be able to formally show that each transformation properly
+      preserves the meaning of every expression, we require an exact
+      definition of the \emph{meaning} of every expression, so we can
+      compare them. Currently there seems to be no formal definition of
+      the meaning or semantics of \GHC's core language, only informal
+      descriptions are available.
+
+      It should be possible to have a single formal definition of
+      meaning for Core for both normal Core compilation by \GHC and for
+      our compilation to \VHDL. The main difference seems to be that in
+      hardware every expression is always evaluated, while in software
+      it is only evaluated if needed, but it should be possible to
+      assign a meaning to core expressions that assumes neither.
+      
+      Since each of the transformations can be applied to any
+      subexpression as well, there is a constraint on our meaning
+      definition: The meaning of an expression should depend only on the
+      meaning of subexpressions, not on the expressions themselves. For
+      example, the meaning of the application in \lam{f (let x = 4 in
+      x)} should be the same as the meaning of the application in \lam{f
+      4}, since the argument subexpression has the same meaning (though
+      the actual expression is different).
+      
     \subsection{Completeness}
-      Show that any transformation applies to every Core expression that is not
-      in normal form. To prove: no transformation applies => in intended form.
-      Show the reverse: Not in intended form => transformation applies.
+      Proving completeness is probably not hard, but it could be a lot
+      of work. We have seen above that to prove completeness, we must
+      show that the normal set of our graph representation is a subset
+      of the intended normal set.
+
+      However, it is hard to systematically generate or reason about the
+      normal set, since it is defined as any nodes to which no
+      transformation applies. To determine this set, each transformation
+      must be considered and when a transformation is added, the entire
+      set should be re-evaluated. This means it is hard to show that
+      each node in the normal set is also in the intended normal set.
+      Reasoning about our intended normal set is easier, since we know
+      how to generate it from its definition. \refdef{intended normal
+      form definition}.
+
+      Fortunately, we can also prove the complement (which is
+      equivalent, since $A \subseteq B \Leftrightarrow \overline{B}
+      \subseteq \overline{A}$): Show that the set of nodes not in
+      intended normal form is a subset of the set of nodes not in normal
+      form. In other words, show that for every expression that is not
+      in intended normal form, that there is at least one transformation
+      that applies to it (since that means it is not in normal form
+      either and since $A \subseteq C \Leftrightarrow \forall x (x \in A
+      \rightarrow x \in C)$).
+
+      By systematically reviewing the entire Core language definition
+      along with the intended normal form definition (both of which have
+      a similar structure), it should be possible to identify all
+      possible (sets of) core expressions that are not in intended
+      normal form and identify a transformation that applies to it.
+      
+      This approach is especially useful for proving completeness of our
+      system, since if expressions exist to which none of the
+      transformations apply (\ie if the system is not yet complete), it
+      is immediately clear which expressions these are and adding
+      (or modifying) transformations to fix this should be relatively
+      easy.
+
+      As observed above, applying this approach is a lot of work, since
+      we need to check every (set of) transformation(s) separately.
 
-    \subsection{Determinism}
-      How to prove this?
+      \todo{Perhaps do a few steps of the proofs as proof-of-concept}
 
 % vim: set sw=2 sts=2 expandtab:
diff --git a/Outline b/Outline
index 8f2d0c3..533ca69 100644
--- a/Outline
+++ b/Outline
@@ -56,3 +56,5 @@ TODO: Expand on "representable"
 TODO: Register
 TODO: Variable vs binder
 TODO: simplification -> Normalisation?
+TODO: Use saturated (application) instead of complete (application)?
+TODO: core => Core