X-Git-Url: https://git.stderr.nl/gitweb?p=matthijs%2Fmaster-project%2Freport.git;a=blobdiff_plain;f=Chapters%2FPrototype.tex;h=9e6893acccee0ff3d319545c78064c8a4365d5ae;hp=1db9c6d73aff1d766dabffeed186d63bd43cebcc;hb=dd26f4ae34699581838a0c1b736d14adfcafe10d;hpb=d957c07d7ba3f6b2bccc6aafb1c8012882556c6e diff --git a/Chapters/Prototype.tex b/Chapters/Prototype.tex index 1db9c6d..9e6893a 100644 --- a/Chapters/Prototype.tex +++ b/Chapters/Prototype.tex @@ -1,8 +1,206 @@ -\chapter{Prototype} - Choice of Haskell +\chapter[chap:prototype]{Prototype} + An important step in this research is the creation of a prototype compiler. + Having this prototype allows us to apply the ideas from the previous chapter + to actual hardware descriptions and evaluate their usefulness. Having a + prototype also helps to find new techniques and test possible + interpretations. + + Obviously the prototype was not created after all research + ideas were formed, but its implementation has been interleaved with the + research itself. Also, the prototype described here is the final version, it + has gone through a number of design iterations which we will not completely + describe here. + + \section{Choice of language} + When implementing this prototype, the first question to ask is: What + (functional) language will we use to describe our hardware? (Note that + this does not concern the \emph{implementation language} of the compiler, + just the language \emph{translated by} the compiler). + + On the highest level, we have two choices: + + \startitemize + \item Create a new functional language from scratch. This has the + advantage of having a language that contains exactly those elements that + are convenient for describing hardware and can contain special + constructs that might. + \item Use an existing language and create a new backend for it. This has + the advantage that existing tools can be reused, which will speed up + development. + \stopitemize + + Considering that we required a prototype which should be working quickly, + and that implementing parsers, semantic checkers and especially + typcheckers isn't exactly the core of this research (but it is lots and + lots of work!), using an existing language is the obvious choice. This + also has the advantage that a large set of language features is available + to experiment with and it is easy to find which features apply well and + which don't. A possible second prototype could use a custom language with + just the useful features (and possibly extra features that are specific to + the domain of hardware description as well). + + The second choice is to pick one of the many existing languages. As + mentioned before, this language is Haskell. This choice has not been the + result of a thorough comparison of languages, for the simple reason that + the requirements on the language were completely unclear at the start of + this language. The fact that Haskell is a language with a broad spectrum + of features, that it is commonly used in research projects and that the + primary compiler, GHC, provides a high level API to its internals, made + Haskell an obvious choice. + + TODO: Was Haskell really a good choice? Perhaps say this somewhere else? + + \section{Prototype design} + As stated above, we will use the Glasgow Haskell Compiler (\small{GHC}) to + implement our prototype compiler. To understand the design of the + compiler, we will first dive into the \small{GHC} compiler a bit. It's + compilation consists of the following steps (slightly simplified): + + \startuseMPgraphic{ghc-pipeline} + % Create objects + save inp, front, desugar, simpl, back, out; + newEmptyBox.inp(0,0); + newBox.front(btex Parser etex); + newBox.desugar(btex Desugarer etex); + newBox.simpl(btex Simplifier etex); + newBox.back(btex Backend etex); + newEmptyBox.out(0,0); + + % Space the boxes evenly + inp.c - front.c = front.c - desugar.c = desugar.c - simpl.c + = simpl.c - back.c = back.c - out.c = (0, 1.5cm); + out.c = origin; + + % Draw lines between the boxes. We make these lines "deferred" and give + % them a name, so we can use ObjLabel to draw a label beside them. + ncline.inp(inp)(front) "name(haskell)"; + ncline.front(front)(desugar) "name(ast)"; + ncline.desugar(desugar)(simpl) "name(core)"; + ncline.simpl(simpl)(back) "name(simplcore)"; + ncline.back(back)(out) "name(native)"; + ObjLabel.inp(btex Haskell source etex) "labpathname(haskell)", "labdir(rt)"; + ObjLabel.front(btex Haskell AST etex) "labpathname(ast)", "labdir(rt)"; + ObjLabel.desugar(btex Core etex) "labpathname(core)", "labdir(rt)"; + ObjLabel.simpl(btex Simplified core etex) "labpathname(simplcore)", "labdir(rt)"; + ObjLabel.back(btex Native code etex) "labpathname(native)", "labdir(rt)"; + + % Draw the objects (and deferred labels) + drawObj (inp, front, desugar, simpl, back, out); + \stopuseMPgraphic + \placefigure[right]{GHC compiler pipeline}{\useMPgraphic{ghc-pipeline}} + + \startdesc{Frontend} + This step takes the Haskell source files and parses them into an + abstract syntax tree (\small{AST}). This \small{AST} can express the + complete Haskell language and is thus a very complex one (in contrast + with the Core \small{AST}, later on). All identifiers in this + \small{AST} are resolved by the renamer and all types are checked by the + typechecker. + \stopdesc + \startdesc{Desugaring} + This steps takes the full \small{AST} and translates it to the + \emph{Core} language. Core is a very small functional language with lazy + semantics, that can still express everything Haskell can express. Its + simpleness makes Core very suitable for further simplification and + translation. Core is the language we will be working on as well. + \stopdesc + \startdesc{Simplification} + Through a number of simplification steps (such as inlining, common + subexpression elimination, etc.) the Core program is simplified to make + it faster or easier to process further. + \stopdesc + \startdesc{Backend} + This step takes the simplified Core program and generates an actual + runnable program for it. This is a big and complicated step we will not + discuss it any further, since it is not required for our prototype. + \stopdesc + + In this process, there a number of places where we can start our work. + Assuming that we don't want to deal with (or modify) parsing, typechecking + and other frontend business and that native code isn't really a useful + format anymore, we are left with the choice between the full Haskell + \small{AST}, or the smaller (simplified) core representation. + + The advantage of taking the full \small{AST} is that the exact structure + of the source program is preserved. We can see exactly what the hardware + descriiption looks like and which syntax constructs were used. However, + the full \small{AST} is a very complicated datastructure. If we are to + handle everything it offers, we will quickly get a big compiler. + + Using the core representation gives us a much more compact datastructure + (a core expression only uses 9 constructors). Note that this does not mean + that the core representation itself is smaller, on the contrary. Since the + core language has less constructs, a lot of things will take a larger + expression to express. + + However, the fact that the core language is so much smaller, means it is a + lot easier to analyze and translate it into something else. For the same + reason, \small{GHC} runs its simplifications and optimizations on the core + representation as well. + + However, we will use the normal core representation, not the simplified + core. Reasons for this are detailed below. + + The final prototype roughly consists of three steps: + + \startuseMPgraphic{ghc-pipeline} + % Create objects + save inp, front, norm, vhdl, out; + newEmptyBox.inp(0,0); + newBox.front(btex \small{GHC} frontend + desugarer etex); + newBox.norm(btex Normalization etex); + newBox.vhdl(btex VHDL generation etex); + newEmptyBox.out(0,0); + + % Space the boxes evenly + inp.c - front.c = front.c - norm.c = norm.c - vhdl.c + = vhdl.c - out.c = (0, 1.5cm); + out.c = origin; + + % Draw lines between the boxes. We make these lines "deferred" and give + % them a name, so we can use ObjLabel to draw a label beside them. + ncline.inp(inp)(front) "name(haskell)"; + ncline.front(front)(norm) "name(core)"; + ncline.norm(norm)(vhdl) "name(normal)"; + ncline.vhdl(vhdl)(out) "name(vhdl)"; + ObjLabel.inp(btex Haskell source etex) "labpathname(haskell)", "labdir(rt)"; + ObjLabel.front(btex Core etex) "labpathname(core)", "labdir(rt)"; + ObjLabel.norm(btex Normalized core etex) "labpathname(normal)", "labdir(rt)"; + ObjLabel.vhdl(btex VHDL description etex) "labpathname(vhdl)", "labdir(rt)"; + + % Draw the objects (and deferred labels) + drawObj (inp, front, norm, vhdl, out); + \stopuseMPgraphic + \placefigure[right]{GHC compiler pipeline}{\useMPgraphic{ghc-pipeline}} + + \startdesc{Frontend} + This is exactly the frontend and desugarer from the \small{GHC} + pipeline, that translates Haskell sources to a core representation. + \stopdesc + \startdesc{Normalization} + This is a step that transforms the core representation into a normal + form. This normal form is still expressed in the core language, but has + to adhere to an extra set of constraints. This normal form is less + expressive than the full core language (e.g., it can have limited higher + order expressions, has a specific structure, etc.), but is also very + close to directly describing hardware. + \stopdesc + \startdesc{VHDL generation} + The last step takes the normal formed core representation and generates + VHDL for it. Since the normal form has a specific, hardware-like + structure, this final step is very straightforward. + \stopdesc + + The most interesting step in this process is the normalization step. That + is where more complicated functional constructs, which have no direct + hardware interpretation, are removed and translated into hardware + constructs. This step is described in a lot of detail at + \in{chapter}[chap:normalization]. + + Core - description of the language (appendix?) - Stages (-> Core, Normalization, -> VHDL) Implementation issues + Simplified core? Haskell language coverage / constraints Recursion