Chapters/Prototype.tex

   1 \chapter[chap:prototype]{Prototype}
   2   An important step in this research is the creation of a prototype compiler.
   3   Having this prototype allows us to apply the ideas from the previous chapter
   4   to actual hardware descriptions and evaluate their usefulness. Having a
   5   prototype also helps to find new techniques and test possible
   6   interpretations.
   7
   8   Obviously the prototype was not created after all research
   9   ideas were formed, but its implementation has been interleaved with the
  10   research itself. Also, the prototype described here is the final version, it
  11   has gone through a number of design iterations which we will not completely
  12   describe here.
  13
  14   \section[sec:prototype:input]{Input language}
  15     When implementing this prototype, the first question to ask is: What
  16     (functional) language will we use to describe our hardware? (Note that
  17     this does not concern the \emph{implementation language} of the compiler,
  18     just the language \emph{translated by} the compiler).
  19
  20     Initially, we have two choices:
  21
  22     \startitemize
  23       \item Create a new functional language from scratch. This has the
  24       advantage of having a language that contains exactly those elements that
  25       are convenient for describing hardware and can contain special
  26       constructs that allows our hardware descriptions to be more powerful or
  27       concise.
  28       \item Use an existing language and create a new backend for it. This has
  29       the advantage that existing tools can be reused, which will speed up
  30       development.
  31     \stopitemize
  32
  33     \todo{Sidenote: No EDSL}
  34
  35     Considering that we required a prototype which should be working quickly,
  36     and that implementing parsers, semantic checkers and especially
  37     typcheckers is not exactly the core of this research (but it is lots and
  38     lots of work!), using an existing language is the obvious choice. This
  39     also has the advantage that a large set of language features is available
  40     to experiment with and it is easy to find which features apply well and
  41     which don't. A possible second prototype could use a custom language with
  42     just the useful features (and possibly extra features that are specific to
  43     the domain of hardware description as well).
  44
  45     The second choice is which of the many existing languages to use. As
  46     mentioned before, the chosen language is Haskell.  This choice has not been the
  47     result of a thorough comparison of languages, for the simple reason that
  48     the requirements on the language were completely unclear at the start of
  49     this research. The fact that Haskell is a language with a broad spectrum
  50     of features, that it is commonly used in research projects and that the
  51     primary compiler, \GHC, provides a high level API to its internals, made
  52     Haskell an obvious choice.
  53
  54   \section[sec:prototype:output]{Output format}
  55     The second important question is: What will be our output format? Since
  56     our prototype won't be able to program FPGA's directly, we'll have to have
  57     output our hardware in some format that can be later processed and
  58     programmed by other tools.
  59
  60     Looking at other tools in the industry, the Electronic Design Interchange
  61     Format (\small{EDIF}) is commonly used for storing intermediate
  62     \emph{netlists} (lists of components and connections between these
  63     components) and is commonly the target for \small{VHDL} and Verilog
  64     compilers.
  65
  66     However, \small{EDIF} is not completely tool-independent. It specifies a
  67     meta-format, but the hardware components that can be used vary between
  68     various tool and hardware vendors, as well as the interpretation of the
  69     \small{EDIF} standard. \todo{Is this still true? Reference:
  70     http://delivery.acm.org/10.1145/80000/74534/p803-li.pdf?key1=74534\&key2=8370537521\&coll=GUIDE\&dl=GUIDE\&CFID=61207158\&CFTOKEN=61908473}
  71
  72     This means that when working with \small{EDIF}, our prototype would become
  73     technology dependent (\eg only work with \small{FPGA}s of a specific
  74     vendor, or even only with specific chips). This limits the applicability
  75     of our prototype. Also, the tools we'd like to use for verifying,
  76     simulating and draw pretty pictures of our output (like Precision, or
  77     QuestaSim) are designed for \small{VHDL} or Verilog input.
  78
  79     For these reasons, we will not use \small{EDIF}, but \small{VHDL} as our
  80     output language.  We choose \VHDL over Verilog simply because we are
  81     familiar with \small{VHDL} already. The differences between \small{VHDL}
  82     and Verilog are on the higher level, while we will be using \small{VHDL}
  83     mainly to write low level, netlist-like descriptions anyway.
  84
  85     An added advantage of using VHDL is that we can profit from existing
  86     optimizations in VHDL synthesizers. A lot of optimizations are done on the
  87     VHDL level by existing tools. These tools have years of experience in this
  88     field, so it would not be reasonable to assume we could achieve a similar
  89     amount of optimization in our prototype (nor should it be a goal,
  90     considering this is just a prototype).
  91
  92     Note that we will be using \small{VHDL} as our output language, but will
  93     not use its full expressive power. Our output will be limited to using
  94     simple, structural descriptions, without any behavioural descriptions
  95     (which might not be supported by all tools). This ensures that any tool
  96     that works with \VHDL will understand our output (most tools don't support
  97     synthesis of more complex \VHDL). This also leaves open the option to
  98     switch to \small{EDIF} in the future, with minimal changes to the
  99     prototype.
 100
 101   \section{Prototype design}
 102     As suggested above, we will use the Glasgow Haskell Compiler (\small{GHC}) to
 103     implement our prototype compiler. To understand the design of the
 104     compiler, we will first dive into the \small{GHC} compiler a bit. It's
 105     compilation consists of the following steps (slightly simplified):
 106
 107     \startuseMPgraphic{ghc-pipeline}
 108       % Create objects
 109       save inp, front, desugar, simpl, back, out;
 110       newEmptyBox.inp(0,0);
 111       newBox.front(btex Fronted etex);
 112       newBox.desugar(btex Desugarer etex);
 113       newBox.simpl(btex Simplifier etex);
 114       newBox.back(btex Backend etex);
 115       newEmptyBox.out(0,0);
 116
 117       % Space the boxes evenly
 118       inp.c - front.c = front.c - desugar.c = desugar.c - simpl.c
 119         = simpl.c - back.c = back.c - out.c = (0, 1.5cm);
 120       out.c = origin;
 121
 122       % Draw lines between the boxes. We make these lines "deferred" and give
 123       % them a name, so we can use ObjLabel to draw a label beside them.
 124       ncline.inp(inp)(front) "name(haskell)";
 125       ncline.front(front)(desugar) "name(ast)";
 126       ncline.desugar(desugar)(simpl) "name(core)";
 127       ncline.simpl(simpl)(back) "name(simplcore)";
 128       ncline.back(back)(out) "name(native)";
 129       ObjLabel.inp(btex Haskell source etex) "labpathname(haskell)", "labdir(rt)";
 130       ObjLabel.front(btex Haskell AST etex) "labpathname(ast)", "labdir(rt)";
 131       ObjLabel.desugar(btex Core etex) "labpathname(core)", "labdir(rt)";
 132       ObjLabel.simpl(btex Simplified core etex) "labpathname(simplcore)", "labdir(rt)";
 133       ObjLabel.back(btex Native code etex) "labpathname(native)", "labdir(rt)";
 134
 135       % Draw the objects (and deferred labels)
 136       drawObj (inp, front, desugar, simpl, back, out);
 137     \stopuseMPgraphic
 138     \placefigure[right]{GHC compiler pipeline}{\useMPgraphic{ghc-pipeline}}
 139
 140     \startdesc{Frontend}
 141       This step takes the Haskell source files and parses them into an
 142       abstract syntax tree (\small{AST}). This \small{AST} can express the
 143       complete Haskell language and is thus a very complex one (in contrast
 144       with the Core \small{AST}, later on). All identifiers in this
 145       \small{AST} are resolved by the renamer and all types are checked by the
 146       typechecker.
 147     \stopdesc
 148     \startdesc{Desugaring}
 149       This steps takes the full \small{AST} and translates it to the
 150       \emph{Core} language. Core is a very small functional language with lazy
 151       semantics, that can still express everything Haskell can express. Its
 152       simpleness makes Core very suitable for further simplification and
 153       translation. Core is the language we will be working with as well.
 154     \stopdesc
 155     \startdesc{Simplification}
 156       Through a number of simplification steps (such as inlining, common
 157       subexpression elimination, etc.) the Core program is simplified to make
 158       it faster or easier to process further.
 159     \stopdesc
 160     \startdesc{Backend}
 161       This step takes the simplified Core program and generates an actual
 162       runnable program for it. This is a big and complicated step we will not
 163       discuss it any further, since it is not required for our prototype.
 164     \stopdesc
 165
 166     In this process, there a number of places where we can start our work.
 167     Assuming that we don't want to deal with (or modify) parsing, typechecking
 168     and other frontend business and that native code isn't really a useful
 169     format anymore, we are left with the choice between the full Haskell
 170     \small{AST}, or the smaller (simplified) core representation.
 171
 172     The advantage of taking the full \small{AST} is that the exact structure
 173     of the source program is preserved. We can see exactly what the hardware
 174     descriiption looks like and which syntax constructs were used. However,
 175     the full \small{AST} is a very complicated datastructure. If we are to
 176     handle everything it offers, we will quickly get a big compiler.
 177
 178     Using the core representation gives us a much more compact datastructure
 179     (a core expression only uses 9 constructors). Note that this does not mean
 180     that the core representation itself is smaller, on the contrary. Since the
 181     core language has less constructs, a lot of things will take a larger
 182     expression to express.
 183
 184     However, the fact that the core language is so much smaller, means it is a
 185     lot easier to analyze and translate it into something else. For the same
 186     reason, \small{GHC} runs its simplifications and optimizations on the core
 187     representation as well.
 188
 189     However, we will use the normal core representation, not the simplified
 190     core. Reasons for this are detailed below. \todo{Ref}
 191
 192     The final prototype roughly consists of three steps:
 193
 194     \startuseMPgraphic{clash-pipeline}
 195       % Create objects
 196       save inp, front, norm, vhdl, out;
 197       newEmptyBox.inp(0,0);
 198       newBox.front(btex \small{GHC} frontend + desugarer etex);
 199       newBox.norm(btex Normalization etex);
 200       newBox.vhdl(btex \small{VHDL} generation etex);
 201       newEmptyBox.out(0,0);
 202
 203       % Space the boxes evenly
 204       inp.c - front.c = front.c - norm.c = norm.c - vhdl.c
 205         = vhdl.c - out.c = (0, 1.5cm);
 206       out.c = origin;
 207
 208       % Draw lines between the boxes. We make these lines "deferred" and give
 209       % them a name, so we can use ObjLabel to draw a label beside them.
 210       ncline.inp(inp)(front) "name(haskell)";
 211       ncline.front(front)(norm) "name(core)";
 212       ncline.norm(norm)(vhdl) "name(normal)";
 213       ncline.vhdl(vhdl)(out) "name(vhdl)";
 214       ObjLabel.inp(btex Haskell source etex) "labpathname(haskell)", "labdir(rt)";
 215       ObjLabel.front(btex Core etex) "labpathname(core)", "labdir(rt)";
 216       ObjLabel.norm(btex Normalized core etex) "labpathname(normal)", "labdir(rt)";
 217       ObjLabel.vhdl(btex \small{VHDL} description etex) "labpathname(vhdl)", "labdir(rt)";
 218
 219       % Draw the objects (and deferred labels)
 220       drawObj (inp, front, norm, vhdl, out);
 221     \stopuseMPgraphic
 222     \placefigure[right]{Cλash compiler pipeline}{\useMPgraphic{clash-pipeline}}
 223
 224     \startdesc{Frontend}
 225       This is exactly the frontend and desugarer from the \small{GHC}
 226       pipeline, that translates Haskell sources to a core representation.
 227     \stopdesc
 228     \startdesc{Normalization}
 229       This is a step that transforms the core representation into a normal
 230       form. This normal form is still expressed in the core language, but has
 231       to adhere to an extra set of constraints. This normal form is less
 232       expressive than the full core language (e.g., it can have limited higher
 233       order expressions, has a specific structure, etc.), but is also very
 234       close to directly describing hardware.
 235     \stopdesc
 236     \startdesc{\small{VHDL} generation}
 237       The last step takes the normal formed core representation and generates
 238       \small{VHDL} for it. Since the normal form has a specific, hardware-like
 239       structure, this final step is very straightforward.
 240     \stopdesc
 241
 242     The most interesting step in this process is the normalization step. That
 243     is where more complicated functional constructs, which have no direct
 244     hardware interpretation, are removed and translated into hardware
 245     constructs. This step is described in a lot of detail at
 246     \in{chapter}[chap:normalization].
 247
 248   \section{The Core language}
 249     \defreftxt{core}{the Core language}
 250     Most of the prototype deals with handling the program in the Core
 251     language. In this section we will show what this language looks like and
 252     how it works.
 253
 254     The Core language is a functional language that describes
 255     \emph{expressions}. Every identifier used in Core is called a
 256     \emph{binder}, since it is bound to a value somewhere. On the highest
 257     level, a Core program is a collection of functions, each of which bind a
 258     binder (the function name) to an expression (the function value, which has
 259     a function type).
 260
 261     The Core language itself does not prescribe any program structure, only
 262     expression structure. In the \small{GHC} compiler, the Haskell module
 263     structure is used for the resulting Core code as well. Since this is not
 264     so relevant for understanding the Core language or the Normalization
 265     process, we'll only look at the Core expression language here.
 266
 267     Each Core expression consists of one of these possible expressions.
 268
 269     \startdesc{Variable reference}
 270       \defref{variable reference}
 271       \startlambda
 272       x :: T
 273       \stoplambda
 274       This is a reference to a binder. It's written down as the
 275       name of the binder that is being referred to along with its type. The
 276       binder name should of course be bound in a containing scope (including
 277       top level scope, so a reference to a top level function is also a
 278       variable reference). Additionally, constructors from algebraic datatypes
 279       also become variable references.
 280
 281       The value of this expression is the value bound to the given binder.
 282
 283       Each binder also carries around its type (explicitly shown above), but
 284       this is usually not shown in the Core expressions. Only when the type is
 285       relevant (when a new binder is introduced, for example) will it be
 286       shown. In other cases, the binder is either not relevant, or easily
 287       derived from the context of the expression. \todo{Ref sidenote on type
 288       annotations}
 289     \stopdesc
 290
 291     \startdesc{Literal}
 292       \defref{literal}
 293       \startlambda
 294       10
 295       \stoplambda
 296       This is a literal. Only primitive types are supported, like
 297       chars, strings, ints and doubles. The types of these literals are the
 298       \quote{primitive} versions, like \lam{Char\#} and \lam{Word\#}, not the
 299       normal Haskell versions (but there are builtin conversion functions).
 300     \stopdesc
 301
 302     \startdesc{Application}
 303       \defref{application}
 304       \startlambda
 305       func arg
 306       \stoplambda
 307       This is function application. Each application consists of two
 308       parts: The function part and the argument part. Applications are used
 309       for normal function \quote{calls}, but also for applying type
 310       abstractions and data constructors.
 311
 312       The value of an application is the value of the function part, with the
 313       first argument binder bound to the argument part.
 314     \stopdesc
 315
 316     \startdesc{Lambda abstraction}
 317       \defref{lambda abstraction}
 318       \startlambda
 319       λbndr.body
 320       \stoplambda
 321       This is the basic lambda abstraction, as it occurs in labmda calculus.
 322       It consists of a binder part and a body part.  A lambda abstraction
 323       creates a function, that can be applied to an argument. The binder is
 324       usually a value binder, but it can also be a \emph{type binder} (or
 325       \emph{type variable}). The latter case introduces a new polymorphic
 326       variable, which can be used in types later on. See
 327       \in{section}[sec:prototype:coretypes] for details.
 328
 329       Note that the body of a lambda abstraction extends all the way to the
 330       end of the expression, or the closing bracket surrounding the lambda. In
 331       other words, the lambda abstraction \quote{operator} has the lowest
 332       priority of all.
 333
 334       The value of an application is the value of the body part, with the
 335       binder bound to the value the entire lambda abstraction is applied to.
 336     \stopdesc
 337
 338     \startdesc{Non-recursive let expression}
 339       \defref{let expression}
 340       \startlambda
 341       let bndr = value in body
 342       \stoplambda
 343       A let expression allows you to bind a binder to some value, while
 344       evaluating to some other value (where that binder is in scope). This
 345       allows for sharing of subexpressions (you can use a binder twice) and
 346       explicit \quote{naming} of arbitrary expressions. Note that the binder
 347       is not in scope in the value bound to it, so it's not possible to make
 348       recursive definitions with the normal form of the let expression (see
 349       the recursive form below).
 350
 351       Even though this let expression is an extension on the basic lambda
 352       calculus, it is easily translated to a lambda abstraction. The let
 353       expression above would then become:
 354
 355       \startlambda
 356       (λbndr.body) value
 357       \stoplambda
 358
 359       This notion might be useful for verifying certain properties on
 360       transformations, since a lot of verification work has been done on
 361       lambda calculus already.
 362
 363       The value of a let expression is the value of the body part, with the
 364       binder bound to the value.
 365     \stopdesc
 366
 367     \startdesc{Recursive let expression}
 368       \startlambda
 369       letrec
 370         bndr1 = value1
 371         \vdots
 372         bndrn = valuen
 373       in
 374         body
 375       \stoplambda
 376       This is the recursive version of the let expression. In \small{GHC}'s
 377       Core implementation, non-recursive and recursive lets are not so
 378       distinct as we present them here, but this provides a clearer overview.
 379
 380       The main difference with the normal let expression is that each of the
 381       binders is in scope in each of the values, in addition to the body. This
 382       allows for self-recursive or mutually recursive definitions.
 383
 384       It should also be possible to express a recursive let using normal
 385       lambda calculus, if we use the \emph{least fixed-point operator},
 386       \lam{Y}. This falls beyond the scope of this report, since it is not
 387       needed for this research.
 388     \stopdesc
 389
 390     \startdesc{Case expression}
 391       \defref{case expression}
 392       \startlambda
 393         case scrutinee of bndr
 394           DEFAULT -> defaultbody
 395           C0 bndr0,0 ... bndr0,m -> body0
 396           \vdots
 397           Cn bndrn,0 ... bndrn,m -> bodyn
 398       \stoplambda
 399
 400       \todo{Define WHNF}
 401
 402       A case expression is the only way in Core to choose between values. All
 403       \hs{if} expressions and pattern matchings from the original Haskell
 404       PRogram have been translated to case expressions by the desugarer.
 405
 406       A case expression evaluates its scrutinee, which should have an
 407       algebraic datatype, into weak head normal form (\small{WHNF}) and
 408       (optionally) binds it to \lam{bndr}. It then chooses a body depending on
 409       the constructor of its scrutinee. If none of the constructors match, the
 410       \lam{DEFAULT} alternative is chosen. A case expression must always be
 411       exhaustive, \ie it must cover all possible constructors that the
 412       scrutinee can have (if all of them are covered explicitly, the
 413       \lam{DEFAULT} alternative can be left out).
 414
 415       Since we can only match the top level constructor, there can be no overlap
 416       in the alternatives and thus order of alternatives is not relevant (though
 417       the \lam{DEFAULT} alternative must appear first for implementation
 418       efficiency).
 419
 420       Any arguments to the constructor in the scrutinee are bound to each of the
 421       binders after the constructor and are in scope only in the corresponding
 422       body.
 423
 424       To support strictness, the scrutinee is always evaluated into
 425       \small{WHNF}, even when there is only a \lam{DEFAULT} alternative. This
 426       allows aplication of the strict function \lam{f} to the argument \lam{a}
 427       to be written like:
 428
 429       \startlambda
 430       f (case a of arg DEFAULT -> arg)
 431       \stoplambda
 432
 433       According to the \GHC documentation, this is the only use for the extra
 434       binder to which the scrutinee is bound.  When not using strictness
 435       annotations (which is rather pointless in hardware descriptions),
 436       \small{GHC} seems to never generate any code making use of this binder.
 437       In fact, \GHC has never been observed to generate code using this
 438       binder, even when strictness was involved.  Nonetheless, the prototype
 439       handles this binder as expected.
 440
 441       Note that these case statements are less powerful than the full Haskell
 442       case statements. In particular, they do not support complex patterns like
 443       in Haskell. Only the constructor of an expression can be matched,
 444       complex patterns are implemented using multiple nested case expressions.
 445
 446       Case statements are also used for unpacking of algebraic datatypes, even
 447       when there is only a single constructor. For examples, to add the elements
 448       of a tuple, the following Core is generated:
 449
 450       \startlambda
 451       sum = λtuple.case tuple of
 452         (,) a b -> a + b
 453       \stoplambda
 454
 455       Here, there is only a single alternative (but no \lam{DEFAULT}
 456       alternative, since the single alternative is already exhaustive). When
 457       it's body is evaluated, the arguments to the tuple constructor \lam{(,)}
 458       (\eg, the elements of the tuple) are bound to \lam{a} and \lam{b}.
 459     \stopdesc
 460
 461     \startdesc{Cast expression}
 462       \defref{cast expression}
 463       \startlambda
 464       body ▶ targettype
 465       \stoplambda
 466       A cast expression allows you to change the type of an expression to an
 467       equivalent type. Note that this is not meant to do any actual work, like
 468       conversion of data from one format to another, or force a complete type
 469       change. Instead, it is meant to change between different representations
 470       of the same type, \eg switch between types that are provably equal (but
 471       look different).
 472
 473       In our hardware descriptions, we typically see casts to change between a
 474       Haskell newtype and its contained type, since those are effectively
 475       different types (so a cast is needed) with the same representation (but
 476       no work is done by the cast).
 477
 478       More complex are types that are proven to be equal by the typechecker,
 479       but look different at first glance. To ensure that, once the typechecker
 480       has proven equality, this information sticks around, explicit casts are
 481       added. In our notation we only write the target type, but in reality a
 482       cast expressions carries around a \emph{coercion}, which can be seen as a
 483       proof of equality. \todo{Example}
 484
 485       The value of a cast is the value of its body, unchanged. The type of this
 486       value is equal to the target type, not the type of its body.
 487
 488       \todo{Move and update this paragraph}
 489       Note that this syntax is also used sometimes to indicate that a particular
 490       expression has a particular type, even when no cast expression is
 491       involved. This is then purely informational, since the only elements that
 492       are explicitely typed in the Core language are the binder references and
 493       cast expressions, the types of all other elements are determined at
 494       runtime.
 495     \stopdesc
 496
 497     \startdesc{Note}
 498       The Core language in \small{GHC} allows adding \emph{notes}, which serve
 499       as hints to the inliner or add custom (string) annotations to a core
 500       expression. These shouldn't be generated normally, so these are not
 501       handled in any way in the prototype.
 502     \stopdesc
 503
 504     \startdesc{Type}
 505       \defref{type expression}
 506       \startlambda
 507       @type
 508       \stoplambda
 509       It is possibly to use a Core type as a Core expression. For the actual
 510       types supported by Core, see \in{section}[sec:prototype:coretypes]. This
 511       \quote{lifting} of a type into the value domain is done to allow for
 512       type abstractions and applications to be handled as normal lambda
 513       abstractions and applications above. This means that a type expression
 514       in Core can only ever occur in the argument position of an application,
 515       and only if the type of the function that is applied to expects a type
 516       as the first argument. This happens for all polymorphic functions, for
 517       example, the \lam{fst} function:
 518
 519       \startlambda
 520       fst :: \forall a. \forall b. (a, b) -> a
 521       fst = λtup.case tup of (,) a b -> a
 522
 523       fstint :: (Int, Int) -> Int
 524       fstint = λa.λb.fst @Int @Int a b
 525       \stoplambda
 526
 527       The type of \lam{fst} has two universally quantified type variables. When
 528       \lam{fst} is applied in \lam{fstint}, it is first applied to two types.
 529       (which are substitued for \lam{a} and \lam{b} in the type of \lam{fst}, so
 530       the type of \lam{fst} actual type of arguments and result can be found:
 531       \lam{fst @Int @Int :: (Int, Int) -> Int}).
 532     \stopdesc
 533
 534     \subsection[sec:prototype:coretypes]{Core type system}
 535       Whereas the expression syntax of Core is very simple, its type system is
 536       a bit more complicated. It turns out it is harder to \quote{desugar}
 537       Haskell's complex type system into something more simple. Most of the
 538       type system is thus very similar to that of Haskell.
 539
 540       We will slightly limit our view on Core's type system, since the more
 541       complicated parts of it are only meant to support Haskell's (or rather,
 542       \GHC's) type extensions, such as existential types, \small{GADT}s, type
 543       families and other non-standard Haskell stuff which we don't (plan to)
 544       support.
 545
 546       In Core, every expression is typed. The translation to Core happens
 547       after the typechecker, so types in Core are always correct as well
 548       (though you could of course construct invalidly typed expressions).
 549
 550       Any type in core is one of the following:
 551
 552       \startdesc{A type variable}
 553         \startlambda
 554         t
 555         \stoplambda
 556
 557         This is a reference to a type defined elsewhere. This can either be a
 558         polymorphic type (like the latter two \lam{t}'s in \lam{id :: \forall t.
 559         t -> t}), or a type constructor (like \lam{Bool} in \lam{not :: Bool ->
 560         Bool}). Like in Haskell, polymorphic type variables always
 561         start with a lowercase letter, while type constructors always start
 562         with an uppercase letter.
 563
 564         \todo{How to define (new) type constructors?}
 565
 566         A special case of a type constructor is the \emph{function type
 567         constructor}, \lam{->}. This is a type constructor taking two arguments
 568         (using application below). The function type constructor is commonly
 569         written inline, so we write \lam{a -> b} when we really mean \lam{-> a
 570         b}, the function type constructor applied to \lam{a} and \lam{b}.
 571
 572         Polymorphic type variables can only be defined by a lambda
 573         abstraction, see the forall type below.
 574       \stopdesc
 575
 576       \startdesc{A type application}
 577         \startlambda
 578           Maybe Int
 579         \stoplambda
 580
 581         This applies a some type to another type. This is particularly used to
 582         apply type variables (type constructors) to their arguments.
 583
 584         As mentioned above, applications of some type constructors have
 585         special notation. In particular, these are applications of the
 586         \emph{function type constructor} and \emph{tuple type constructors}:
 587         \startlambda
 588           foo :: a -> b
 589           foo' :: -> a b
 590           bar :: (a, b, c)
 591           bar' :: (,,) a b c
 592         \stoplambda
 593       \stopdesc
 594
 595       \startdesc{The forall type}
 596         \startlambda
 597           id :: \forall a. a -> a
 598         \stoplambda
 599         The forall type introduces polymorphism. It is the only way to
 600         introduce new type variables, which are completely unconstrained (Any
 601         possible type can be assigned to it). Constraints can be added later
 602         using predicate types, see below.
 603
 604         A forall type is always (and only) introduced by a type lambda
 605         expression. For example, the Core translation of the
 606         id function is:
 607         \startlambda
 608           id = λa.λx.x
 609         \stoplambda
 610
 611         Here, the type of the binder \lam{x} is \lam{a}, referring to the
 612         binder in the topmost lambda.
 613
 614         When using a value with a forall type, the actual type
 615         used must be applied first. For example haskell expression \hs{id
 616         True} (the function \hs{id} appleid to the dataconstructor \hs{True})
 617         translates to the following Core:
 618
 619         \startlambda
 620         id @Bool True
 621         \stoplambda
 622
 623         Here, id is first applied to the type to work with. Note that the type
 624         then changes from \lam{id :: \forall a. a -> a} to \lam{id @Bool ::
 625         Bool -> Bool}. Note that the type variable \lam{a} has been
 626         substituted with the actual type.
 627
 628         In Haskell, forall types are usually not explicitly specified (The use
 629         of a lowercase type variable implicitly introduces a forall type for
 630         that variable). In fact, in standard Haskell there is no way to
 631         explicitly specify forall types. Through a language extension, the
 632         \hs{forall} keyword is available, but still optional for normal forall
 633         types (it is needed for \emph{existentially quantified types}, which
 634         Cλash does not support).
 635       \stopdesc
 636
 637       \startdesc{Predicate type}
 638         \startlambda
 639           show :: \forall a. Show s ⇒ s → String
 640         \stoplambda
 641
 642         \todo{Sidenote: type classes?}
 643
 644         A predicate type introduces a constraint on a type variable introduced
 645         by a forall type (or type lambda). In the example above, the type
 646         variable \lam{a} can only contain types that are an \emph{instance} of
 647         the \emph{type class} \lam{Show}. \refdef{type class}
 648
 649         There are other sorts of predicate types, used for the type families
 650         extension, which we will not discuss here.
 651
 652         A predicate type is introduced by a lambda abstraction. Unlike with
 653         the forall type, this is a value lambda abstraction, that must be
 654         applied to a value. We call this value a \emph{dictionary}.
 655
 656         Without going into the implementation details, a dictionary can be
 657         seen as a lookup table all the methods for a given (single) type class
 658         instance. This means that all the dictionaries for the same type class
 659         look the same (\eg contain methods with the same names). However,
 660         dictionaries for different instances of the same class contain
 661         different methods, of course.
 662
 663         A dictionary is introduced by \small{GHC} whenever it encounters an
 664         instance declaration. This dictionary, as well as the binder
 665         introduced by a lambda that introduces a dictionary, have the
 666         predicate type as their type. These binders are usually named starting
 667         with a \lam{\$}. Usually the name of the type concerned is not
 668         reflected in the name of the dictionary, but the name of the type
 669         class is. The Haskell expression \hs{show True} thus becomes:
 670
 671         \startlambda
 672         show @Bool \$dShow True
 673         \stoplambda
 674       \stopdesc
 675
 676       Using this set of types, all types in basic Haskell can be represented.
 677
 678       \todo{Overview of polymorphism with more examples (or move examples
 679       here)}.
 680
 681   \section[sec:prototype:statetype]{State annotations in Haskell}
 682     As noted in \in{section}[sec:description:stateann], Cλash needs some
 683     way to let the programmer explicitly specify which of a function's
 684     arguments and which part of a function's result represent the
 685     function's state.
 686
 687     Using the Haskell type systems, there are a few ways we can tackle this.
 688
 689     \subsection{Type synonyms}
 690       Haskell provides type synonyms as a way to declare a new type that is
 691       equal to an existing type (or rather, a new name for an existing type).
 692       This allows both the original type and the synonym to be used
 693       interchangedly in a Haskell program. This means no explicit conversion
 694       is needed either. For example, a simple accumulator would become:
 695
 696       \starthaskell
 697       type State s = s
 698       acc :: Word -> State Word -> (State Word, Word)
 699       acc i s = let sum = s + i in (sum, sum)
 700       \stophaskell
 701
 702       This looks nice in Haskell, but turns out to be hard to implement. There
 703       are no explicit conversion in Haskell, but not in Core either. This
 704       means the type of a value might be show as \hs{AccState} in some places,
 705       but \hs{Word} in others (and this can even change due to
 706       transformations). Since every binder has an explicit type associated
 707       with it, the type of every function type will be properly preserved and
 708       could be used to track down the statefulness of each value by the
 709       compiler. However, this makes the implementation a lot more complicated
 710       than it currently is using \hs{newtypes}.
 711     \subsection{Type renaming (\hs{newtype})}
 712       Haskell also supports type renamings as a way to declare a new type that
 713       has the same (runtime) representation as an existing type (but is in
 714       fact a different type to the typechecker). With type renaming, an
 715       explicit conversion between values of the two types is needed. The
 716       accumulator would then become:
 717
 718       \starthaskell
 719       newtype State s = State s
 720       acc :: Word -> State Word -> (State Word, Word)
 721       acc i (State s) = let sum = s + i in (State sum, sum)
 722       \stophaskell
 723
 724       The \hs{newtype} line declares a new type \hs{State} that has one type
 725       argument, \hs{s}. This type contains one \quote{constructor} \hs{State}
 726       with a single argument of type \hs{s}. It is customary to name the
 727       constructor the same as the type, which is allowed (since types can
 728       never cause name collisions with values). The difference with the type
 729       synonym example is in the explicit conversion between the \hs{State
 730       Word} and \hs{Word} types by pattern matching and by using the explicit
 731       the \hs{State constructor}.
 732
 733       This explicit conversion makes the \VHDL generation easier: Whenever we
 734       remove (unpack) the \hs{State} type, this means we are accessing the
 735       current state (\eg, accessing the register output). Whenever we are a
 736       adding (packing) the \hs{State} type, we are producing a new value for
 737       the state (\eg, providing the register input).
 738
 739       When dealing with nested states (a stateful function that calls stateful
 740       functions, which might call stateful functions, etc.) the state type
 741       could quickly grow complex because of all the \hs{State} type constructors
 742       needed. For example, consider the following state type (this is just the
 743       state type, not the entire function type):
 744
 745       \starttyping
 746       State (State Bit, State (State Word, Bit), Word)
 747       \stoptyping
 748
 749       We cannot leave all these \hs{State} type constructors out, since that
 750       would change the type (unlike when using type synonyms). However, when
 751       using type synonyms to hide away substates, \todo{see below} this
 752       disadvantage should be limited.
 753
 754       \subsubsection{Different input and output types}
 755         An alternative could be to use different types for input and output
 756         state (\ie current and updated state). The accumulator example would
 757         then become something like:
 758
 759         \starthaskell
 760         newtype StateIn s = StateIn s
 761         newtype StateOut s = StateOut s
 762         acc :: Word -> StateIn Word -> (StateIn Word, Word)
 763         acc i (StateIn s) = let sum = s + i in (StateIn sum, sum)
 764         \stophaskell
 765
 766         This could make the implementation easier and the hardware
 767         descriptions less errorprone (you can no longer \quote{forget} to
 768         unpack and repack a state variable and just return it directly, which
 769         can be a problem in the current prototype). However, it also means we
 770         need twice as many type synonyms to hide away substates, making this
 771         approach a bit cumbersome. It also makes it harder to copmare input
 772         and output state types, possible reducing the type safety of the
 773         descriptions.
 774
 775     \subsection{Type synonyms for substates}
 776       As noted above, when using nested (hierarchical) states, the state types
 777       of the \quote{upper} functions (those that call other functions, which
 778       call other functions, etc.) quickly becomes complicated. Also, when the
 779       state type of one of the \quote{lower} functions changes, the state
 780       types of all the upper functions changes as well. If the state type for
 781       each function is explicitly and completely specified, this means that a
 782       lot of code needs updating whenever a state type changes.
 783
 784       To prevent this, it is recommended (but not enforced) to use a type
 785       synonym for the state type of every function. Every function calling
 786       other functions will then use the state type synonym of the called
 787       functions in its own type, requiring no code changes when the state type
 788       of a called function changes. This approach is used in
 789       \in{example}[ex:AvgState] below. The \hs{AccState} and \hs{AvgState}
 790       are examples of such state type synonyms.
 791
 792     \subsection{Chosen approach}
 793       To keep implementation simple, the current prototype uses the type
 794       renaming approach, with a single type for both input and output states.
 795       \todo{}
 796
 797       \subsubsection{Example}
 798         As an example of the used approach, there is a simple averaging circuit in
 799         \in{example}[ex:AvgState]. This circuit lets the accumulation of the
 800         inputs be done by a subcomponent, \hs{acc}, but keeps a count of value
 801         accumulated in its own state.
 802
 803
 804         \startbuffer[AvgState]
 805           -- The state type annotation
 806           newtype State s = State s
 807
 808           -- The accumulator state type
 809           type AccState = State Bit
 810           -- The accumulator
 811           acc :: Word -> AccState -> (AccState, Word)
 812           acc i (State s) = let sum = s + i in (State sum, sum)
 813
 814           -- The averaging circuit state type
 815           type AvgState = (AccState, Word)
 816           -- The averaging circuit
 817           avg :: Word -> AvgState -> (AvgState, Word)
 818           avg i (State s) = (State s', o)
 819             where
 820               (accs, count) = s
 821               -- Pass our input through the accumulator, which outputs a sum
 822               (accs', sum) = acc i accs
 823               -- Increment the count (which will be our new state)
 824               count' = count + 1
 825               -- Compute the average
 826               o = sum / count'
 827               s' = (accs', count')
 828         \stopbuffer
 829
 830         \placeexample[here][ex:AvgState]{Simple stateful averaging circuit.}
 831           %\startcombination[2*1]
 832             {\typebufferhs{AvgState}}%{Haskell description using function applications.}
 833           %  {\boxedgraphic{AvgState}}{The architecture described by the Haskell description.}
 834           %\stopcombination
 835         \todo{Picture}
 836
 837         And the normalized, core-like versions:
 838
 839         \startbuffer[AvgStateNormal]
 840           acc = λi.λspacked.
 841             let
 842               -- Remove the State newtype
 843               s = spacked ▶ Word
 844               s' = s + i
 845               o = s + i
 846               -- Add the State newtype again
 847               spacked' = s' ▶ State Word
 848               res = (spacked', o)
 849             in
 850               res
 851
 852           avg = λi.λspacked.
 853             let
 854               s = spacked ▶ (AccState, Word)
 855               accs = case s of (accs, _) -> accs
 856               count = case s of (_, count) -> count
 857               accres = acc i accs
 858               accs' = case accres of (accs', _) -> accs'
 859               sum = case accres of (_, sum) -> sum
 860               count' = count + 1
 861               o = sum / count'
 862               s' = (accs', count')
 863               spacked' = s' ▶ State (AccState, Word)
 864               res = (spacked', o)
 865             in
 866               res
 867         \stopbuffer
 868
 869         \placeexample[here][ex:AvgStateNormal]{Normalized version of \in{example}[ex:AvgState]}
 870             {\typebufferlam{AvgStateNormal}}
 871
 872       \subsection{What can you do with state?}
 873         Now that we've seen how state variables are typically used, the
 874         question rises of what can be done with these state variables exactly?
 875         What limitations are there on their use to guarantee that proper \VHDL
 876         can be generated?
 877
 878         We will try to formulate a number of rules about what operations are
 879         allowed with state variables. These rules apply to the normalized Core
 880         representation, but will in practice apply to the original Haskell
 881         hardware description as well. Ideally, these rules would become part
 882         of the intended normal form definition \refdef{intended normal form
 883         definition}, but this is not the case right now. This can cause some
 884         problems, which are detailed in
 885         \in{section}[sec:normalization:stateproblems].
 886
 887         In these rules we use the terms state variable to refer to any
 888         variable that has a \lam{State} type. A state-containing variable is
 889         any variable whose type contains a \lam{State} type, but is not one
 890         itself (like \lam{(AccState, Word)} in the example, which is a tuple
 891         type, but contains \lam{AccState}, which is again equal to \lam{State
 892         Word}).
 893
 894         We also use a distinction between input and output (state) variables,
 895         which will be defined in the rules themselves.
 896
 897         \startdesc{State variables can appear as an argument.}
 898           \startlam
 899             λspacked. ...
 900           \stoplam
 901
 902           Any lambda that binds a variable with a state type, creates a new
 903           input state variable.
 904         \stopdesc
 905
 906         \startdesc{Input state variables can be unpacked.}
 907           \startlam
 908             s = spacked ▶ (AccState, Word)
 909           \stoplam
 910
 911           An input state variable may be unpacked using a cast operation. This
 912           removes the \lam{State} type renaming and the result has no longer a
 913           \lam{State} type.
 914
 915           If the result of this unpacking does not have a state type and does
 916           not contain state variables, there are no limitations on its use.
 917           Otherwise if it does not have a state type but does contain
 918           substates, we refer to it as a \emph{state-containing input
 919           variable} and the limitations below apply. If it has a state type
 920           itself, we refer to it as an \emph{input substate variable} and the
 921           below limitations apply as well.
 922
 923           It may seem strange to consider a variable that still has a state
 924           type directly after unpacking, but consider the case where a
 925           function does not have any state of its own, but does call a single
 926           stateful function. This means it must have a state argument that
 927           contains just a substate. The function signature of such a function
 928           could look like:
 929
 930           \starthaskell
 931             type FooState = State AccState
 932           \stophaskell
 933
 934           Which is of course equivalent to \lam{State (State Word)}.
 935         \stopdesc
 936
 937         \startdesc{Variables can be extracted from state-containing input variables.}
 938           \startlam
 939             accs = case s of (accs, _) -> accs
 940           \stoplam
 941
 942           A state-containing input variable is typically a tuple containing
 943           multiple elements (like the current function's state, substates or
 944           more tuples containing substates). All of these can be extracted
 945           from an input variable using an extractor case (or possibly
 946           multiple, when the input variable is nested).
 947
 948           If the result has no state type and does not contain any state
 949           variables either, there are no further limitations on its use. If
 950           the result has no state type but does contain state variables we
 951           refer to it as a \emph{state-containing input variable} and this
 952           limitation keeps applying. If the variable has a state type itself,
 953           we refer to it as an \emph{input substate variable} and below
 954           limitations apply.
 955
 956         \startdesc{Input substate variables can be passed to functions.}
 957           \startlam
 958             accres = acc i accs
 959             accs' = case accres of (accs', _) -> accs'
 960           \stoplam
 961
 962           An input substate variable can (only) be passed to a function.
 963           Additionally, every input substate variable must be used in exactly
 964           \emph{one} application, no more and no less.
 965
 966           The function result should contain exactly one state variable, which
 967           can be extracted using (multiple) case statements. The extracted
 968           state variable is referred to the \emph{output substate}
 969
 970           The type of this output substate must be identical to the type of
 971           the input substate passed to the function.
 972         \stopdesc
 973
 974         \startdesc{Variables can be inserted into state-containing output variables.}
 975           \startlam
 976             s' = (accs', count')
 977           \stoplam
 978
 979           A function's output state is usually a tuple containing its own
 980           updated state variables and all output substates. This result is
 981           built up using any single-constructor algebraic datatype.
 982
 983           The result of these expressions is referred to as a
 984           \emph{state-containing output variable}, which are subject to these
 985           limitations.
 986         \stopdesc
 987
 988         \startdesc{State containing output variables can be packed}
 989           As soon as all a functions own update state and output substate
 990           variables have been joined together, the va...
 991           todo{}
 992               spacked' = s' ▶ State (AccState, Word)
 993
 994         \desc{State variables can appear as (part of) a function result.} When
 995         generating \small{VHDL}, we can completely ignore any part of a
 996         function result that has a state type. If the entire result is a state
 997         type, this will mean the entity will not have an output port.
 998         Otherwise, the state elements will be removed from the type of the
 999         output port.
1000
1001       \subsection{Translating to \VHDL}
1002         \in{Example}[ex:AvgStateNormal] shows the normalized Core version of the
1003         same description (note that _ is a wild binder). \refdef{wild binder}
1004         The normal form is used to translated to \VHDL.
1005
1006         As noted above, the basic approach when generating \VHDL for stateful
1007         functions is to generate a single register for every stateful function.
1008         We look around the normal form to find the let binding that removes the
1009         \lam{State} newtype (using a cast). We also find the let binding that
1010         adds a \lam{State} type. These are connected to the output and the input
1011         of the generated let binding respectively. This means that there can
1012         only be one let binding that adds and one that removes the \lam{State}
1013         type. It is easy to violate this constraint. This problem is detailed in
1014         \in{section}[sec:normalization:stateproblems].
1015
1016         This approach seems simple enough, but will this also work for more
1017         complex stateful functions involving substates?  Observe that any
1018         component of a function's state that is a substate, \ie passed on as
1019         the state of another function, should have no influence on the
1020         hardware generated for the calling function. Any state-specific
1021         \small{VHDL} for this component can be generated entirely within the
1022         called function.  So, we can completely ignore substates when
1023         generating \VHDL for a function.
1024
1025         From this observation, we might think to remove the substates from a
1026         function's states alltogether, and leave only the state components
1027         which are actual states of the current function. While doing this
1028         would not remove any information needed to generate \small{VHDL} from
1029         the function, it would cause the function definition to become invalid
1030         (since we won't have any substate to pass to the functions anymore).
1031         We could solve the syntactic problems by passing \type{undefined} for
1032         state variables, but that would still break the code on the semantic
1033         level (\ie, the function would no longer be semantically equivalent to
1034         the original input).
1035
1036         To keep the function definition correct until the very end of the
1037         process, we will not deal with (sub)states until we get to the
1038         \small{VHDL} generation.  Then, we are translating from Core to
1039         \small{VHDL}, and we can simply ignore substates, effectively removing
1040         the substate components alltogether.
1041
1042         There are a few important points when ignoring substates.
1043
1044         First, we have to have some definition of "substate". Since any state
1045         argument or return value that represents state must be of the
1046         \type{State} type, we can look at the type of a value. However, we
1047         must be careful to ignore only \emph{substates}, and not a function's
1048         own state.
1049
1050         In \in{example}[ex:AvgStateNorm] above, we should generate a register
1051         connected with its output connected to \lam{s} and its input connected
1052         to \lam{s'}. However, \lam{s'} is build up from both \lam{accs'} and
1053         \lam{count'}, while only \lam{count'} should end up in the register.
1054         \lam{accs'} is a substate for the \lam{acc} function, for which a
1055         register will be created when generating \VHDL for the \lam{acc}
1056         function.
1057
1058         Fortunately, the \lam{accs'} variable (and any other substate) has a
1059         property that we can easily check: It has a \lam{State} type
1060         annotation. This means that whenever \VHDL is generated for a tuple
1061         (or other algebraic type), we can simply leave out all elements that
1062         have a \lam{State} type. This will leave just the parts of the state
1063         that do not have a \lam{State} type themselves, like \lam{count'},
1064         which is exactly a function's own state. This approach also means that
1065         the state part of the result is automatically excluded when generating
1066         the output port, which is also required.
1067
1068
1069
1070
1071
1072
1073
1074
1075
1076
1077         Now, we know how to handle each use of a state variable separately. If we
1078         look at the whole, we can conclude the following:
1079
1080         \startitemize
1081           \item A state unpack operation should not generate any \small{VHDL}.
1082           The binder to which the unpacked state is bound should still be
1083           declared, this signal will become the register and will hold the
1084           current state.
1085           \item A state pack operation should not generate any \small{VHDL}.
1086           The binder to which the packed state is bound should not be
1087           declared. The binder that is packed is the signal that will hold the
1088           new state.
1089           \item Any values of a State type should not be translated to
1090           \small{VHDL}. In particular, State elements should be removed from
1091           tuples (and other datatypes) and arguments with a state type should
1092           not generate ports.
1093           \item To make the state actually work, a simple \small{VHDL} proc
1094           should be generated. This proc updates the state at every
1095           clockcycle, by assigning the new state to the current state. This
1096           will be recognized by synthesis tools as a register specification.
1097         \stopitemize
1098
1099         When applying these rules to the example program (in normal form), we will
1100         get the following result. All the parts that don't generate any value are
1101         crossed out, leaving some very boring assignments here and there.
1102
1103
1104         \starthaskell
1105           avg i --spacked-- = res
1106             where
1107               s = --case spacked of (State s) -> s--
1108               --accums = case s of (accums, \_) -> accums--
1109               count = case s of (--\_,-- count) -> count
1110               accumres = accum i --accums--
1111               --accums' = case accumres of (accums', \_) -> accums'--
1112               sum = case accumres of (--\_,-- sum) -> sum
1113               count' = count + 1
1114               o = sum / count'
1115               s' = (--accums',-- count')
1116               --spacked' = State s'--
1117               res = (--spacked',-- o)
1118         \stophaskell
1119
1120         When we would really leave out the crossed out parts, we get a slightly
1121         weird program: There is a variable \type{s} which has no value, and there
1122         is a variable \type{s'} that is never used. Together, these two will form
1123         the state proc of the function. \type{s} contains the "current" state,
1124         \type{s'} is assigned the "next" state. So, at the end of each clock
1125         cycle, \type{s'} should be assigned to \type{s}.
1126
1127         Note that the definition of \type{s'} is not removed, even though one
1128         might think it as having a state type. Since the state type has a single
1129         argument constructor \type{State}, some type that should be the resulting
1130         state should always be explicitly packed with the State constructor,
1131         allowing us to remove the packed version, but still generate \small{VHDL} for the
1132         unpacked version (of course with any substates removed).
1133
1134         As you can see, the definition of \type{s'} is still present, since it
1135         does not have a state type (The State constructor. The \type{accums'} substate has been removed,
1136         leaving us just with the state of \type{avg} itself.
1137     \subsection{Initial state}
1138       How to specify the initial state? Cannot be done inside a hardware
1139       function, since the initial state is its own state argument for the first
1140       call (unless you add an explicit, synchronous reset port).
1141
1142       External init state is natural for simulation.
1143
1144       External init state works for hardware generation as well.
1145
1146       Implementation issues: state splitting, linking input to output state,
1147       checking usage constraints on state variables.
1148
1149       \todo{Implementation issues: Separate compilation, simplified core.}
1150
1151 % vim: set sw=2 sts=2 expandtab: