Chapters/Prototype.tex

   1 \chapter{Prototype}
   2   An important step in this research is the creation of a prototype compiler.
   3   Having this prototype allows us to apply the ideas from the previous chapter
   4   to actual hardware descriptions and evaluate their usefulness. Having a
   5   prototype also helps to find new techniques and test possible
   6   interpretations.
   7
   8   Obviously the prototype was not created after all research
   9   ideas were formed, but its implementation has been interleaved with the
  10   research itself. Also, the prototype described here is the final version, it
  11   has gone through a number of design iterations which we will not completely
  12   describe here.
  13
  14   \section{Choice of language}
  15     When implementing this prototype, the first question to ask is: What
  16     (functional) language will we use to describe our hardware? (Note that
  17     this does not concern the \emph{implementation language} of the compiler,
  18     just the language \emph{translated by} the compiler).
  19
  20     On the highest level, we have two choices:
  21
  22     \startitemize
  23       \item Create a new functional language from scratch. This has the
  24       advantage of having a language that contains exactly those elements that
  25       are convenient for describing hardware and can contain special
  26       constructs that might.
  27       \item Use an existing language and create a new backend for it. This has
  28       the advantage that existing tools can be reused, which will speed up
  29       development.
  30     \stopitemize
  31
  32     Considering that we required a prototype which should be working quickly,
  33     and that implementing parsers, semantic checkers and especially
  34     typcheckers isn't exactly the core of this research (but it is lots and
  35     lots of work!), using an existing language is the obvious choice. This
  36     also has the advantage that a large set of language features is available
  37     to experiment with and it is easy to find which features apply well and
  38     which don't. A possible second prototype could use a custom language with
  39     just the useful features (and possibly extra features that are specific to
  40     the domain of hardware description as well).
  41
  42     The second choice is to pick one of the many existing languages. As
  43     mentioned before, this language is Haskell.  This choice has not been the
  44     result of a thorough comparison of languages, for the simple reason that
  45     the requirements on the language were completely unclear at the start of
  46     this language. The fact that Haskell is a language with a broad spectrum
  47     of features, that it is commonly used in research projects and that the
  48     primary compiler, GHC, provides a high level API to its internals, made
  49     Haskell an obvious choice.
  50
  51     TODO: Was Haskell really a good choice? Perhaps say this somewhere else?
  52
  53   \section{Prototype design}
  54     As stated above, we will use the Glasgow Haskell Compiler (\small{GHC}) to
  55     implement our prototype compiler. To understand the design of the
  56     compiler, we will first dive into the \small{GHC} compiler a bit. It's
  57     compilation consists of the following steps (slightly simplified):
  58
  59     \startuseMPgraphic{ghc-pipeline}
  60       % Create objects
  61       save inp, front, desugar, simpl, back, out;
  62       newEmptyBox.inp(0,0);
  63       newBox.front(btex Parser etex);
  64       newBox.desugar(btex Desugarer etex);
  65       newBox.simpl(btex Simplifier etex);
  66       newBox.back(btex Backend etex);
  67       newEmptyBox.out(0,0);
  68
  69       % Space the boxes evenly
  70       inp.c - front.c = front.c - desugar.c = desugar.c - simpl.c
  71         = simpl.c - back.c = back.c - out.c = (0, 1.5cm);
  72       out.c = origin;
  73
  74       % Draw lines between the boxes. We make these lines "deferred" and give
  75       % them a name, so we can use ObjLabel to draw a label beside them.
  76       ncline.inp(inp)(front) "name(haskell)";
  77       ncline.front(front)(desugar) "name(ast)";
  78       ncline.desugar(desugar)(simpl) "name(core)";
  79       ncline.simpl(simpl)(back) "name(simplcore)";
  80       ncline.back(back)(out) "name(native)";
  81       ObjLabel.inp(btex Haskell source etex) "labpathname(haskell)", "labdir(rt)";
  82       ObjLabel.front(btex Haskell AST etex) "labpathname(ast)", "labdir(rt)";
  83       ObjLabel.desugar(btex Core etex) "labpathname(core)", "labdir(rt)";
  84       ObjLabel.simpl(btex Simplified core etex) "labpathname(simplcore)", "labdir(rt)";
  85       ObjLabel.back(btex Native code etex) "labpathname(native)", "labdir(rt)";
  86
  87       % Draw the objects (and deferred labels)
  88       drawObj (inp, front, desugar, simpl, back, out);
  89     \stopuseMPgraphic
  90     \placefigure[right]{GHC compiler pipeline}{\useMPgraphic{ghc-pipeline}}
  91
  92     \startdesc{Frontend}
  93       This step takes the Haskell source files and parses them into an
  94       abstract syntax tree (\small{AST}). This \small{AST} can express the
  95       complete Haskell language and is thus a very complex one (in contrast
  96       with the Core \small{AST}, later on). All identifiers in this
  97       \small{AST} are resolved by the renamer and all types are checked by the
  98       typechecker.
  99     \stopdesc
 100     \startdesc{Desugaring}
 101       This steps takes the full \small{AST} and translates it to the
 102       \emph{Core} language. Core is a very small functional language with lazy
 103       semantics, that can still express everything Haskell can express. Its
 104       simpleness makes Core very suitable for further simplification and
 105       translation. Core is the language we will be working on as well.
 106     \stopdesc
 107     \startdesc{Simplification}
 108       Through a number of simplification steps (such as inlining, common
 109       subexpression elimination, etc.) the Core program is simplified to make
 110       it faster or easier to process further.
 111     \stopdesc
 112     \startdesc{Backend}
 113       This step takes the simplified Core program and generates an actual
 114       runnable program for it. This is a big and complicated step we will not
 115       discuss it any further, since it is not required for our prototype.
 116     \stopdesc
 117
 118     In this process, there a number of places where we can start our work.
 119     Assuming that we don't want to deal with (or modify) parsing, typechecking
 120     and other frontend business and that native code isn't really a useful
 121     format anymore, we are left with the choice between the full Haskell
 122     \small{AST}, or the smaller (simplified) core representation.
 123
 124     The advantage of taking the full \small{AST} is that the exact structure
 125     of the source program is preserved. We can see exactly what the hardware
 126     descriiption looks like and which syntax constructs were used. However,
 127     the full \small{AST} is a very complicated datastructure. If we are to
 128     handle everything it offers, we will quickly get a big compiler.
 129
 130     Using the core representation gives us a much more compact datastructure
 131     (a core expression only uses 9 constructors). Note that this does not mean
 132     that the core representation itself is smaller, on the contrary. Since the
 133     core language has less constructs, a lot of things will take a larger
 134     expression to express.
 135
 136     However, the fact that the core language is so much smaller, means it is a
 137     lot easier to analyze and translate it into something else. For the same
 138     reason, \small{GHC} runs its simplifications and optimizations on the core
 139     representation as well.
 140
 141     However, we will use the normal core representation, not the simplified
 142     core. Reasons for this are detailed below.
 143
 144     The final prototype roughly consists of three steps:
 145
 146     \startuseMPgraphic{ghc-pipeline}
 147       % Create objects
 148       save inp, front, norm, vhdl, out;
 149       newEmptyBox.inp(0,0);
 150       newBox.front(btex \small{GHC} frontend + desugarer etex);
 151       newBox.norm(btex Normalization etex);
 152       newBox.vhdl(btex VHDL generation etex);
 153       newEmptyBox.out(0,0);
 154
 155       % Space the boxes evenly
 156       inp.c - front.c = front.c - norm.c = norm.c - vhdl.c
 157         = vhdl.c - out.c = (0, 1.5cm);
 158       out.c = origin;
 159
 160       % Draw lines between the boxes. We make these lines "deferred" and give
 161       % them a name, so we can use ObjLabel to draw a label beside them.
 162       ncline.inp(inp)(front) "name(haskell)";
 163       ncline.front(front)(norm) "name(core)";
 164       ncline.norm(norm)(vhdl) "name(normal)";
 165       ncline.vhdl(vhdl)(out) "name(vhdl)";
 166       ObjLabel.inp(btex Haskell source etex) "labpathname(haskell)", "labdir(rt)";
 167       ObjLabel.front(btex Core etex) "labpathname(core)", "labdir(rt)";
 168       ObjLabel.norm(btex Normalized core etex) "labpathname(normal)", "labdir(rt)";
 169       ObjLabel.vhdl(btex VHDL description etex) "labpathname(vhdl)", "labdir(rt)";
 170
 171       % Draw the objects (and deferred labels)
 172       drawObj (inp, front, norm, vhdl, out);
 173     \stopuseMPgraphic
 174     \placefigure[right]{GHC compiler pipeline}{\useMPgraphic{ghc-pipeline}}
 175
 176     \startdesc{Frontend}
 177       This is exactly the frontend and desugarer from the \small{GHC}
 178       pipeline, that translates Haskell sources to a core representation.
 179     \stopdesc
 180     \startdesc{Normalization}
 181       This is a step that transforms the core representation into a normal
 182       form. This normal form is still expressed in the core language, but has
 183       to adhere to an extra set of constraints. This normal form is less
 184       expressive than the full core language (e.g., it can have limited higher
 185       order expressions, has a specific structure, etc.), but is also very
 186       close to directly describing hardware.
 187     \stopdesc
 188     \startdesc{VHDL generation}
 189       The last step takes the normal formed core representation and generates
 190       VHDL for it. Since the normal form has a specific, hardware-like
 191       structure, this final step is very straightforward.
 192     \stopdesc
 193
 194     The most interesting step in this process is the normalization step. That
 195     is where more complicated functional constructs, which have no direct
 196     hardware interpretation, are removed and translated into hardware
 197     constructs. This step is described in a lot of detail at
 198     \in{chapter}[chap:normalization].
 199
 200
 201         Core - description of the language (appendix?)
 202         Implementation issues
 203         Simplified core?
 204
 205         Haskell language coverage / constraints
 206                 Recursion
 207                 Builtin types
 208                 Custom types (Sum types, product types)
 209                 Function types / higher order expressions
 210