From: Matthijs Kooijman
Date: Tue, 29 Sep 2009 14:22:50 +0000 (+0200)
Subject: Further expand the Prototype chapter.
X-Git-Tag: final-thesis~247
X-Git-Url: https://git.stderr.nl/gitweb?a=commitdiff_plain;h=fc1688977fdd8ee18e027876b8d86b0c38e25540;p=matthijs%2Fmaster-project%2Freport.git

Further expand the Prototype chapter.
---

diff --git a/Chapters/Prototype.tex b/Chapters/Prototype.tex
index 64fbbe7..945e920 100644
--- a/Chapters/Prototype.tex
+++ b/Chapters/Prototype.tex
@@ -56,6 +56,39 @@
  compiler, we will first dive into the \small{GHC} compiler a bit. It's
  compilation consists of the following steps (slightly simplified):
+ \startuseMPgraphic{ghc-pipeline}
+ % Create objects
+ save inp, front, desugar, simpl, back, out;
+ newEmptyBox.inp(0,0);
+ newBox.front(btex Parser etex);
+ newBox.desugar(btex Desugarer etex);
+ newBox.simpl(btex Simplifier etex);
+ newBox.back(btex Backend etex);
+ newEmptyBox.out(0,0);
+
+ % Space the boxes evenly
+ inp.c - front.c = front.c - desugar.c = desugar.c - simpl.c
+ = simpl.c - back.c = back.c - out.c = (0, 1.5cm);
+ out.c = origin;
+
+ % Draw lines between the boxes. We make these lines "deferred" and give
+ % them a name, so we can use ObjLabel to draw a label beside them.
+ ncline.inp(inp)(front) "name(haskell)";
+ ncline.front(front)(desugar) "name(ast)";
+ ncline.desugar(desugar)(simpl) "name(core)";
+ ncline.simpl(simpl)(back) "name(simplcore)";
+ ncline.back(back)(out) "name(native)";
+ ObjLabel.inp(btex Haskell source etex) "labpathname(haskell)", "labdir(rt)";
+ ObjLabel.front(btex Haskell AST etex) "labpathname(ast)", "labdir(rt)";
+ ObjLabel.desugar(btex Core etex) "labpathname(core)", "labdir(rt)";
+ ObjLabel.simpl(btex Simplified core etex) "labpathname(simplcore)", "labdir(rt)";
+ ObjLabel.back(btex Native code etex) "labpathname(native)", "labdir(rt)";
+
+ % Draw the objects (and deferred labels)
+ drawObj (inp, front, desugar, simpl, back, out);
+ \stopuseMPgraphic
+ \placefigure[right]{GHC compiler pipeline}{\useMPgraphic{ghc-pipeline}}
+
  \startdesc{Frontend}
  This step takes the Haskell source files and parses them into an
  abstract syntax tree (\small{AST}). This \small{AST} can express the
@@ -82,9 +115,92 @@
  discuss it any further, since it is not required for our prototype.
  \stopdesc
+ In this process, there are a number of places where we can start our work.
+ Assuming that we don't want to deal with (or modify) parsing, typechecking
+ and other frontend business and that native code isn't really a useful
+ format anymore, we are left with the choice between the full Haskell
+ \small{AST}, or the smaller (simplified) core representation.
+
+ The advantage of taking the full \small{AST} is that the exact structure
+ of the source program is preserved. We can see exactly what the hardware
+ description looks like and which syntax constructs were used. However,
+ the full \small{AST} is a very complicated datastructure. If we are to
+ handle everything it offers, we will quickly get a big compiler.
+
+ Using the core representation gives us a much more compact datastructure
+ (a core expression only uses 9 constructors). Note that this does not mean
+ that the core representation itself is smaller, on the contrary.
Since the
+ core language has fewer constructs, a lot of things will take a larger
+ expression to express.
+
+ However, the fact that the core language is so much smaller means it is a
+ lot easier to analyze and translate it into something else. For the same
+ reason, \small{GHC} runs its simplifications and optimizations on the core
+ representation as well.
+
+ However, we will use the normal core representation, not the simplified
+ core. Reasons for this are detailed below.
+
+ The final prototype roughly consists of three steps:
+
+ \startuseMPgraphic{ghc-pipeline}
+ % Create objects
+ save inp, front, norm, vhdl, out;
+ newEmptyBox.inp(0,0);
+ newBox.front(btex \small{GHC} frontend + desugarer etex);
+ newBox.norm(btex Normalization etex);
+ newBox.vhdl(btex VHDL generation etex);
+ newEmptyBox.out(0,0);
+
+ % Space the boxes evenly
+ inp.c - front.c = front.c - norm.c = norm.c - vhdl.c
+ = vhdl.c - out.c = (0, 1.5cm);
+ out.c = origin;
+
+ % Draw lines between the boxes. We make these lines "deferred" and give
+ % them a name, so we can use ObjLabel to draw a label beside them.
+ ncline.inp(inp)(front) "name(haskell)";
+ ncline.front(front)(norm) "name(core)";
+ ncline.norm(norm)(vhdl) "name(normal)";
+ ncline.vhdl(vhdl)(out) "name(vhdl)";
+ ObjLabel.inp(btex Haskell source etex) "labpathname(haskell)", "labdir(rt)";
+ ObjLabel.front(btex Core etex) "labpathname(core)", "labdir(rt)";
+ ObjLabel.norm(btex Normalized core etex) "labpathname(normal)", "labdir(rt)";
+ ObjLabel.vhdl(btex VHDL description etex) "labpathname(vhdl)", "labdir(rt)";
+
+ % Draw the objects (and deferred labels)
+ drawObj (inp, front, norm, vhdl, out);
+ \stopuseMPgraphic
+ \placefigure[right]{GHC compiler pipeline}{\useMPgraphic{ghc-pipeline}}
+
+ \startdesc{Frontend}
+ This is exactly the frontend and desugarer from the \small{GHC}
+ pipeline, which translates Haskell sources to a core representation.
+ \stopdesc
+ \startdesc{Normalization}
+ This is a step that transforms the core representation into a normal
+ form. This normal form is still expressed in the core language, but has
+ to adhere to an extra set of constraints. This normal form is less
+ expressive than the full core language (e.g., it allows only limited
+ higher-order expressions, has a specific structure, etc.), but is also very
+ close to directly describing hardware.
+ \stopdesc
+ \startdesc{VHDL generation}
+ The last step takes the normalized core representation and generates
+ VHDL for it. Since the normal form has a specific, hardware-like
+ structure, this final step is very straightforward.
+ \stopdesc
+
+ The most interesting step in this process is the normalization step. That
+ is where more complicated functional constructs, which have no direct
+ hardware interpretation, are removed and translated into hardware
+ constructs. This step is described in detail in
+ \in{chapter}[chap:normalization].
+
+
  Core
- description of the language (appendix?)
- Stages (-> Core, Normalization, -> VHDL)
  Implementation issues
+ Simplified core?
  Haskell language coverage / constraints
  Recursion
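
The normalization stage described in the patch above is a core-to-core transformation: its input and output are both expressed in the core language, but the output obeys extra constraints. As a rough illustration only, the Haskell sketch below applies one such rewrite (removing a lambda that is applied to a literal) to a toy expression type. The names `Expr`, `substLit` and `normalize` are hypothetical and were made up for this sketch; this is neither GHC's Core data type nor the prototype's actual normalization code.

```haskell
module Main where

-- Toy stand-in for a core-like language (assumption: far smaller than
-- GHC's real Core type, and untyped).
data Expr
  = Var String
  | Lit Int
  | App Expr Expr
  | Lam String Expr
  deriving (Eq, Show)

-- Substitute a literal for a variable, respecting shadowing.
-- A literal has no free variables, so no variable capture can occur.
substLit :: String -> Int -> Expr -> Expr
substLit x n e = case e of
  Var y | y == x    -> Lit n
        | otherwise -> Var y
  Lit k             -> Lit k
  App f a           -> App (substLit x n f) (substLit x n a)
  Lam y b | y == x    -> Lam y b                 -- x is shadowed here
          | otherwise -> Lam y (substLit x n b)

-- Hypothetical normalization rule: a lambda applied to a literal is
-- replaced by its body with the literal substituted in.
normalize :: Expr -> Expr
normalize e = case e of
  App f a -> case (normalize f, normalize a) of
    (Lam x b, Lit n) -> normalize (substLit x n b)
    (f', a')         -> App f' a'
  Lam x b -> Lam x (normalize b)
  _       -> e

main :: IO ()
main =
  -- (\x -> f x) 3 is rewritten to f 3
  print (normalize (App (Lam "x" (App (Var "f") (Var "x"))) (Lit 3)))
  -- prints: App (Var "f") (Lit 3)
```

The result is still a value of the same `Expr` type, which mirrors the point made in the Normalization description: the normal form is a restricted subset of the core language rather than a separate representation.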