Chapters/Prototype.tex

   1 \chapter[chap:prototype]{Prototype}
   2   An important step in this research is the creation of a prototype compiler.
   3   Having this prototype allows us to apply the ideas from the previous chapter
   4   to actual hardware descriptions and evaluate their usefulness. Having a
   5   prototype also helps to find new techniques and test possible
   6   interpretations.
   7
   8   Obviously the prototype was not created after all research
   9   ideas were formed, but its implementation has been interleaved with the
  10   research itself. Also, the prototype described here is the final version, it
  11   has gone through a number of design iterations which we will not completely
  12   describe here.
  13
  14   \section[sec:prototype:input]{Input language}
  15     When implementing this prototype, the first question to ask is: What
  16     (functional) language will we use to describe our hardware? (Note that
  17     this does not concern the \emph{implementation language} of the compiler,
  18     just the language \emph{translated by} the compiler).
  19
  20     Initially, we have two choices:
  21
  22     \startitemize
  23       \item Create a new functional language from scratch. This has the
  24       advantage of having a language that contains exactly those elements that
  25       are convenient for describing hardware and can contain special
  26       constructs that allows our hardware descriptions to be more powerful or
  27       concise.
  28       \item Use an existing language and create a new backend for it. This has
  29       the advantage that existing tools can be reused, which will speed up
  30       development.
  31     \stopitemize
  32
  33
  34     \placeintermezzo{}{
  35       \startframedtext[width=8cm,background=box,frame=no]
  36       \startalignment[center]
  37         {\tfa No \small{EDSL} or Template Haskell}
  38       \stopalignment
  39       \blank[medium]
  40
  41       Note that in this consideration, embedded domain-specific
  42       languages (\small{EDSL}) and Template Haskell (\small{TH})
  43       approaches have not been included. As we've seen in
  44       \in{section}[sec:context:fhdls], these approaches have all kinds
  45       of limitations on the description language that we would like to
  46       avoid.
  47       \stopframedtext
  48     }
  49     Considering that we required a prototype which should be working quickly,
  50     and that implementing parsers, semantic checkers and especially
  51     typcheckers is not exactly the core of this research (but it is lots and
  52     lots of work!), using an existing language is the obvious choice. This
  53     also has the advantage that a large set of language features is available
  54     to experiment with and it is easy to find which features apply well and
  55     which don't. A possible second prototype could use a custom language with
  56     just the useful features (and possibly extra features that are specific to
  57     the domain of hardware description as well).
  58
  59     The second choice is which of the many existing languages to use. As
  60     mentioned before, the chosen language is Haskell.  This choice has not been the
  61     result of a thorough comparison of languages, for the simple reason that
  62     the requirements on the language were completely unclear at the start of
  63     this research. The fact that Haskell is a language with a broad spectrum
  64     of features, that it is commonly used in research projects and that the
  65     primary compiler, \GHC, provides a high level API to its internals, made
  66     Haskell an obvious choice.
  67
  68   \section[sec:prototype:output]{Output format}
  69     The second important question is: What will be our output format? Since
  70     our prototype won't be able to program FPGA's directly, we'll have to have
  71     output our hardware in some format that can be later processed and
  72     programmed by other tools.
  73
  74     Looking at other tools in the industry, the Electronic Design Interchange
  75     Format (\small{EDIF}) is commonly used for storing intermediate
  76     \emph{netlists} (lists of components and connections between these
  77     components) and is commonly the target for \small{VHDL} and Verilog
  78     compilers.
  79
  80     However, \small{EDIF} is not completely tool-independent. It specifies a
  81     meta-format, but the hardware components that can be used vary between
  82     various tool and hardware vendors, as well as the interpretation of the
  83     \small{EDIF} standard. \cite[li89]
  84
  85     This means that when working with \small{EDIF}, our prototype would become
  86     technology dependent (\eg only work with \small{FPGA}s of a specific
  87     vendor, or even only with specific chips). This limits the applicability
  88     of our prototype. Also, the tools we'd like to use for verifying,
  89     simulating and draw pretty pictures of our output (like Precision, or
  90     QuestaSim) are designed for \small{VHDL} or Verilog input.
  91
  92     For these reasons, we will not use \small{EDIF}, but \small{VHDL} as our
  93     output language.  We choose \VHDL over Verilog simply because we are
  94     familiar with \small{VHDL} already. The differences between \small{VHDL}
  95     and Verilog are on the higher level, while we will be using \small{VHDL}
  96     mainly to write low level, netlist-like descriptions anyway.
  97
  98     An added advantage of using VHDL is that we can profit from existing
  99     optimizations in VHDL synthesizers. A lot of optimizations are done on the
 100     VHDL level by existing tools. These tools have years of experience in this
 101     field, so it would not be reasonable to assume we could achieve a similar
 102     amount of optimization in our prototype (nor should it be a goal,
 103     considering this is just a prototype).
 104
 105     Note that we will be using \small{VHDL} as our output language, but will
 106     not use its full expressive power. Our output will be limited to using
 107     simple, structural descriptions, without any behavioural descriptions
 108     (which might not be supported by all tools). This ensures that any tool
 109     that works with \VHDL will understand our output (most tools don't support
 110     synthesis of more complex \VHDL). This also leaves open the option to
 111     switch to \small{EDIF} in the future, with minimal changes to the
 112     prototype.
 113
 114   \section[sec:prototype:design]{Prototype design}
 115     As suggested above, we will use the Glasgow Haskell Compiler (\small{GHC}) to
 116     implement our prototype compiler. To understand the design of the
 117     compiler, we will first dive into the \small{GHC} compiler a bit. It's
 118     compilation consists of the following steps (slightly simplified):
 119
 120     \startuseMPgraphic{ghc-pipeline}
 121       % Create objects
 122       save inp, front, desugar, simpl, back, out;
 123       newEmptyBox.inp(0,0);
 124       newBox.front(btex Fronted etex);
 125       newBox.desugar(btex Desugarer etex);
 126       newBox.simpl(btex Simplifier etex);
 127       newBox.back(btex Backend etex);
 128       newEmptyBox.out(0,0);
 129
 130       % Space the boxes evenly
 131       inp.c - front.c = front.c - desugar.c = desugar.c - simpl.c
 132         = simpl.c - back.c = back.c - out.c = (0, 1.5cm);
 133       out.c = origin;
 134
 135       % Draw lines between the boxes. We make these lines "deferred" and give
 136       % them a name, so we can use ObjLabel to draw a label beside them.
 137       ncline.inp(inp)(front) "name(haskell)";
 138       ncline.front(front)(desugar) "name(ast)";
 139       ncline.desugar(desugar)(simpl) "name(core)";
 140       ncline.simpl(simpl)(back) "name(simplcore)";
 141       ncline.back(back)(out) "name(native)";
 142       ObjLabel.inp(btex Haskell source etex) "labpathname(haskell)", "labdir(rt)";
 143       ObjLabel.front(btex Haskell AST etex) "labpathname(ast)", "labdir(rt)";
 144       ObjLabel.desugar(btex Core etex) "labpathname(core)", "labdir(rt)";
 145       ObjLabel.simpl(btex Simplified core etex) "labpathname(simplcore)", "labdir(rt)";
 146       ObjLabel.back(btex Native code etex) "labpathname(native)", "labdir(rt)";
 147
 148       % Draw the objects (and deferred labels)
 149       drawObj (inp, front, desugar, simpl, back, out);
 150     \stopuseMPgraphic
 151     \placefigure[right]{GHC compiler pipeline}{\useMPgraphic{ghc-pipeline}}
 152
 153     \startdesc{Frontend}
 154       This step takes the Haskell source files and parses them into an
 155       abstract syntax tree (\small{AST}). This \small{AST} can express the
 156       complete Haskell language and is thus a very complex one (in contrast
 157       with the Core \small{AST}, later on). All identifiers in this
 158       \small{AST} are resolved by the renamer and all types are checked by the
 159       typechecker.
 160     \stopdesc
 161     \startdesc{Desugaring}
 162       This steps takes the full \small{AST} and translates it to the
 163       \emph{Core} language. Core is a very small functional language with lazy
 164       semantics, that can still express everything Haskell can express. Its
 165       simpleness makes Core very suitable for further simplification and
 166       translation. Core is the language we will be working with as well.
 167     \stopdesc
 168     \startdesc{Simplification}
 169       Through a number of simplification steps (such as inlining, common
 170       subexpression elimination, etc.) the Core program is simplified to make
 171       it faster or easier to process further.
 172     \stopdesc
 173     \startdesc{Backend}
 174       This step takes the simplified Core program and generates an actual
 175       runnable program for it. This is a big and complicated step we will not
 176       discuss it any further, since it is not required for our prototype.
 177     \stopdesc
 178
 179     In this process, there are a number of places where we can start our work.
 180     Assuming that we don't want to deal with (or modify) parsing, typechecking
 181     and other frontend business and that native code isn't really a useful
 182     format anymore, we are left with the choice between the full Haskell
 183     \small{AST}, or the smaller (simplified) core representation.
 184
 185     The advantage of taking the full \small{AST} is that the exact structure
 186     of the source program is preserved. We can see exactly what the hardware
 187     description looks like and which syntax constructs were used. However,
 188     the full \small{AST} is a very complicated datastructure. If we are to
 189     handle everything it offers, we will quickly get a big compiler.
 190
 191     Using the core representation gives us a much more compact datastructure
 192     (a core expression only uses 9 constructors). Note that this does not mean
 193     that the core representation itself is smaller, on the contrary. Since the
 194     core language has less constructs, a lot of things will take a larger
 195     expression to express.
 196
 197     However, the fact that the core language is so much smaller, means it is a
 198     lot easier to analyze and translate it into something else. For the same
 199     reason, \small{GHC} runs its simplifications and optimizations on the core
 200     representation as well \cite[jones96].
 201
 202     However, we will use the normal core representation, not the simplified
 203     core. Reasons for this are detailed below. \todo{Ref}
 204
 205     The final prototype roughly consists of three steps:
 206
 207     \startuseMPgraphic{clash-pipeline}
 208       % Create objects
 209       save inp, front, norm, vhdl, out;
 210       newEmptyBox.inp(0,0);
 211       newBox.front(btex \small{GHC} frontend + desugarer etex);
 212       newBox.norm(btex Normalization etex);
 213       newBox.vhdl(btex \small{VHDL} generation etex);
 214       newEmptyBox.out(0,0);
 215
 216       % Space the boxes evenly
 217       inp.c - front.c = front.c - norm.c = norm.c - vhdl.c
 218         = vhdl.c - out.c = (0, 1.5cm);
 219       out.c = origin;
 220
 221       % Draw lines between the boxes. We make these lines "deferred" and give
 222       % them a name, so we can use ObjLabel to draw a label beside them.
 223       ncline.inp(inp)(front) "name(haskell)";
 224       ncline.front(front)(norm) "name(core)";
 225       ncline.norm(norm)(vhdl) "name(normal)";
 226       ncline.vhdl(vhdl)(out) "name(vhdl)";
 227       ObjLabel.inp(btex Haskell source etex) "labpathname(haskell)", "labdir(rt)";
 228       ObjLabel.front(btex Core etex) "labpathname(core)", "labdir(rt)";
 229       ObjLabel.norm(btex Normalized core etex) "labpathname(normal)", "labdir(rt)";
 230       ObjLabel.vhdl(btex \small{VHDL} description etex) "labpathname(vhdl)", "labdir(rt)";
 231
 232       % Draw the objects (and deferred labels)
 233       drawObj (inp, front, norm, vhdl, out);
 234     \stopuseMPgraphic
 235     \placefigure[right]{Cλash compiler pipeline}{\useMPgraphic{clash-pipeline}}
 236
 237     \startdesc{Frontend}
 238       This is exactly the frontend and desugarer from the \small{GHC}
 239       pipeline, that translates Haskell sources to a core representation.
 240     \stopdesc
 241     \startdesc{Normalization}
 242       This is a step that transforms the core representation into a normal
 243       form. This normal form is still expressed in the core language, but has
 244       to adhere to an extra set of constraints. This normal form is less
 245       expressive than the full core language (e.g., it can have limited higher
 246       order expressions, has a specific structure, etc.), but is also very
 247       close to directly describing hardware.
 248     \stopdesc
 249     \startdesc{\small{VHDL} generation}
 250       The last step takes the normal formed core representation and generates
 251       \small{VHDL} for it. Since the normal form has a specific, hardware-like
 252       structure, this final step is very straightforward.
 253     \stopdesc
 254
 255     The most interesting step in this process is the normalization step. That
 256     is where more complicated functional constructs, which have no direct
 257     hardware interpretation, are removed and translated into hardware
 258     constructs. This step is described in a lot of detail at
 259     \in{chapter}[chap:normalization].
 260
 261   \section{The Core language}
 262     \defreftxt{core}{the Core language}
 263     Most of the prototype deals with handling the program in the Core
 264     language. In this section we will show what this language looks like and
 265     how it works.
 266
 267     The Core language is a functional language that describes
 268     \emph{expressions}. Every identifier used in Core is called a
 269     \emph{binder}, since it is bound to a value somewhere. On the highest
 270     level, a Core program is a collection of functions, each of which bind a
 271     binder (the function name) to an expression (the function value, which has
 272     a function type).
 273
 274     The Core language itself does not prescribe any program structure
 275     (like modules, declarations, imports, etc.), only expression
 276     structure. In the \small{GHC} compiler, the Haskell module structure
 277     is used for the resulting Core code as well. Since this is not so
 278     relevant for understanding the Core language or the Normalization
 279     process, we'll only look at the Core expression language here.
 280
 281     Each Core expression consists of one of these possible expressions.
 282
 283     \startdesc{Variable reference}
 284       \defref{variable reference}
 285       \startlambda
 286       bndr :: T
 287       \stoplambda
 288       This is a reference to a binder. It's written down as the
 289       name of the binder that is being referred to along with its type. The
 290       binder name should of course be bound in a containing scope
 291       (including top level scope, so a reference to a top level function
 292       is also a variable reference). Additionally, constructors from
 293       algebraic datatypes also become variable references.
 294
 295       In our examples, binders will commonly consist of a single
 296       characters, but they can have any length.
 297
 298       The value of this expression is the value bound to the given
 299       binder.
 300
 301       Each binder also carries around its type (explicitly shown above), but
 302       this is usually not shown in the Core expressions. Only when the type is
 303       relevant (when a new binder is introduced, for example) will it be
 304       shown. In other cases, the binder is either not relevant, or easily
 305       derived from the context of the expression. \todo{Ref sidenote on type
 306       annotations}
 307     \stopdesc
 308
 309     \startdesc{Literal}
 310       \defref{literal}
 311       \startlambda
 312       10
 313       \stoplambda
 314       This is a literal. Only primitive types are supported, like
 315       chars, strings, ints and doubles. The types of these literals are the
 316       \quote{primitive}, unboxed versions, like \lam{Char\#} and \lam{Word\#}, not the
 317       normal Haskell versions (but there are builtin conversion
 318       functions). Without going into detail about these types, note that
 319       a few conversion functions exist to convert these to the normal
 320       (boxed) Haskell equivalents.
 321     \stopdesc
 322
 323     \startdesc{Application}
 324       \defref{application}
 325       \startlambda
 326       func arg
 327       \stoplambda
 328       This is function application. Each application consists of two
 329       parts: The function part and the argument part. Applications are used
 330       for normal function \quote{calls}, but also for applying type
 331       abstractions and data constructors.
 332
 333       The value of an application is the value of the function part, with the
 334       first argument binder bound to the argument part.
 335     \stopdesc
 336
 337     \startdesc{Lambda abstraction}
 338       \defref{lambda abstraction}
 339       \startlambda
 340       λbndr.body
 341       \stoplambda
 342       This is the basic lambda abstraction, as it occurs in lambda calculus.
 343       It consists of a binder part and a body part.  A lambda abstraction
 344       creates a function, that can be applied to an argument. The binder is
 345       usually a value binder, but it can also be a \emph{type binder} (or
 346       \emph{type variable}). The latter case introduces a new polymorphic
 347       variable, which can be used in types later on. See
 348       \in{section}[sec:prototype:coretypes] for details.
 349
 350       The body of a lambda abstraction extends all the way to the end of
 351       the expression, or the closing bracket surrounding the lambda. In
 352       other words, the lambda abstraction \quote{operator} has the
 353       lowest priority of all.
 354
 355       The value of an application is the value of the body part, with the
 356       binder bound to the value the entire lambda abstraction is applied to.
 357     \stopdesc
 358
 359     \startdesc{Non-recursive let expression}
 360       \defref{let expression}
 361       \startlambda
 362       let bndr = value in body
 363       \stoplambda
 364       A let expression allows you to bind a binder to some value, while
 365       evaluating to some other value (for which that binder is in scope). This
 366       allows for sharing of subexpressions (you can use a binder twice) and
 367       explicit \quote{naming} of arbitrary expressions. A binder is not
 368       in scope in the value bound it is bound to, so it's not possible
 369       to make recursive definitions with a non-recursive let expression
 370       (see the recursive form below).
 371
 372       Even though this let expression is an extension on the basic lambda
 373       calculus, it is easily translated to a lambda abstraction. The let
 374       expression above would then become:
 375
 376       \startlambda
 377       (λbndr.body) value
 378       \stoplambda
 379
 380       This notion might be useful for verifying certain properties on
 381       transformations, since a lot of verification work has been done on
 382       lambda calculus already.
 383
 384       The value of a let expression is the value of the body part, with the
 385       binder bound to the value.
 386     \stopdesc
 387
 388     \startdesc{Recursive let expression}
 389       \startlambda
 390       letrec
 391         bndr1 = value1
 392         \vdots
 393         bndrn = valuen
 394       in
 395         body
 396       \stoplambda
 397       This is the recursive version of the let expression. In \small{GHC}'s
 398       Core implementation, non-recursive and recursive lets are not so
 399       distinct as we present them here, but this provides a clearer overview.
 400
 401       The main difference with the normal let expression is that each of the
 402       binders is in scope in each of the values, in addition to the body. This
 403       allows for self-recursive or mutually recursive definitions.
 404
 405       It is also possible to express a recursive let expression using
 406       normal lambda calculus, if we use the \emph{least fixed-point
 407       operator}, \lam{Y} (but the details are too complicated to help
 408       clarify the let expression, so this will not be explored further).
 409     \stopdesc
 410
 411     \placeintermezzo{}{
 412       \startframedtext[width=8cm,background=box,frame=no]
 413       \startalignment[center]
 414         {\tfa Weak head normal form (\small{WHNF})}
 415       \stopalignment
 416       \blank[medium]
 417         An expression is in weak head normal form if it is either an
 418         constructor application or lambda abstraction. \todo{How about
 419         atoms?}
 420
 421         Without going into detail about the differences with head
 422         normal form and normal form, note that evaluating the scrutinee
 423         of a case expression to normal form (evaluating any function
 424         applications, variable references and case expressions) is
 425         sufficient to decide which case alternatives should be chosen.
 426         \todo{ref?}
 427       \stopframedtext
 428
 429     }
 430
 431     \startdesc{Case expression}
 432       \defref{case expression}
 433       \startlambda
 434         case scrutinee of bndr
 435           DEFAULT -> defaultbody
 436           C0 bndr0,0 ... bndr0,m -> body0
 437           \vdots
 438           Cn bndrn,0 ... bndrn,m -> bodyn
 439       \stoplambda
 440
 441       A case expression is the only way in Core to choose between values. All
 442       \hs{if} expressions and pattern matchings from the original Haskell
 443       PRogram have been translated to case expressions by the desugarer.
 444
 445       A case expression evaluates its scrutinee, which should have an
 446       algebraic datatype, into weak head normal form (\small{WHNF}) and
 447       (optionally) binds it to \lam{bndr}. Every alternative lists a
 448       single constructor (\lam{C0 ... Cn}). Based on the actual
 449       constructor of the scrutinee, the corresponding alternative is
 450       chosen. The binders in the chosen alternative (\lam{bndr0,0 ....
 451       bndr0,m} are bound to the actual arguments to the constructor in
 452       the scrutinee.
 453
 454       This is best illustrated with an example. Assume
 455       there is an algebraic datatype declared as follows\footnote{This
 456       datatype is not suported by the current Cλash implementation, but
 457       serves well to illustrate the case expression}:
 458
 459       \starthaskell
 460       data D = A Word | B Bit
 461       \stophaskell
 462
 463       This is an algebraic datatype with two constructors, each getting
 464       a single argument. A case expression scrutinizing this datatype
 465       could look like the following:
 466
 467       \startlambda
 468         case s of
 469           A word -> High
 470           B bit -> bit
 471       \stoplambda
 472
 473       What this expression does is check the constructor of the
 474       scrutinee \lam{s}. If it is \lam{A}, it always evaluates to
 475       \lam{High}. If the constructor is \lam{B}, the binder \lam{bit} is
 476       bound to the argument passed to \lam{B} and the case expression
 477       evaluates to this bit.
 478
 479       If none of the alternatives match, the \lam{DEFAULT} alternative
 480       is chosen. A case expression must always be exhaustive, \ie it
 481       must cover all possible constructors that the scrutinee can have
 482       (if all of them are covered explicitly, the \lam{DEFAULT}
 483       alternative can be left out).
 484
 485       Since we can only match the top level constructor, there can be no overlap
 486       in the alternatives and thus order of alternatives is not relevant (though
 487       the \lam{DEFAULT} alternative must appear first for implementation
 488       efficiency).
 489
 490       To support strictness, the scrutinee is always evaluated into
 491       \small{WHNF}, even when there is only a \lam{DEFAULT} alternative. This
 492       allows aplication of the strict function \lam{f} to the argument \lam{a}
 493       to be written like:
 494
 495       \startlambda
 496       f (case a of arg DEFAULT -> arg)
 497       \stoplambda
 498
 499       According to the \GHC documentation, this is the only use for the extra
 500       binder to which the scrutinee is bound.  When not using strictness
 501       annotations (which is rather pointless in hardware descriptions),
 502       \small{GHC} seems to never generate any code making use of this binder.
 503       In fact, \GHC has never been observed to generate code using this
 504       binder, even when strictness was involved.  Nonetheless, the prototype
 505       handles this binder as expected.
 506
 507       Note that these case expressions are less powerful than the full Haskell
 508       case expressions. In particular, they do not support complex patterns like
 509       in Haskell. Only the constructor of an expression can be matched,
 510       complex patterns are implemented using multiple nested case expressions.
 511
 512       Case expressions are also used for unpacking of algebraic datatypes, even
 513       when there is only a single constructor. For examples, to add the elements
 514       of a tuple, the following Core is generated:
 515
 516       \startlambda
 517       sum = λtuple.case tuple of
 518         (,) a b -> a + b
 519       \stoplambda
 520
 521       Here, there is only a single alternative (but no \lam{DEFAULT}
 522       alternative, since the single alternative is already exhaustive). When
 523       it's body is evaluated, the arguments to the tuple constructor \lam{(,)}
 524       (\eg, the elements of the tuple) are bound to \lam{a} and \lam{b}.
 525     \stopdesc
 526
 527     \startdesc{Cast expression}
 528       \defref{cast expression}
 529       \startlambda
 530       body ▶ targettype
 531       \stoplambda
 532       A cast expression allows you to change the type of an expression to an
 533       equivalent type. Note that this is not meant to do any actual work, like
 534       conversion of data from one format to another, or force a complete type
 535       change. Instead, it is meant to change between different representations
 536       of the same type, \eg switch between types that are provably equal (but
 537       look different).
 538
 539       In our hardware descriptions, we typically see casts to change between a
 540       Haskell newtype and its contained type, since those are effectively
 541       different types (so a cast is needed) with the same representation (but
 542       no work is done by the cast).
 543
 544       More complex are types that are proven to be equal by the typechecker,
 545       but look different at first glance. To ensure that, once the typechecker
 546       has proven equality, this information sticks around, explicit casts are
 547       added. In our notation we only write the target type, but in reality a
 548       cast expressions carries around a \emph{coercion}, which can be seen as a
 549       proof of equality. \todo{Example}
 550
 551       The value of a cast is the value of its body, unchanged. The type of this
 552       value is equal to the target type, not the type of its body.
 553
 554       \todo{Move and update this paragraph}
 555       Note that this syntax is also used sometimes to indicate that a particular
 556       expression has a particular type, even when no cast expression is
 557       involved. This is then purely informational, since the only elements that
 558       are explicitely typed in the Core language are the binder references and
 559       cast expressions, the types of all other elements are determined at
 560       runtime.
 561     \stopdesc
 562
 563     \startdesc{Note}
 564       The Core language in \small{GHC} allows adding \emph{notes}, which serve
 565       as hints to the inliner or add custom (string) annotations to a core
 566       expression. These shouldn't be generated normally, so these are not
 567       handled in any way in the prototype.
 568     \stopdesc
 569
 570     \startdesc{Type}
 571       \defref{type expression}
 572       \startlambda
 573       @T
 574       \stoplambda
 575       It is possibly to use a Core type as a Core expression. To prevent
 576       confusion between types and values, the \lam{@} sign is used to
 577       explicitly mark a type that is used in a Core expression.
 578
 579       For the actual types supported by Core, see
 580       \in{section}[sec:prototype:coretypes]. This \quote{lifting} of a
 581       type into the value domain is done to allow for type abstractions
 582       and applications to be handled as normal lambda abstractions and
 583       applications above. This means that a type expression in Core can
 584       only ever occur in the argument position of an application, and
 585       only if the type of the function that is applied to expects a type
 586       as the first argument. This happens for all polymorphic functions,
 587       for example, the \lam{fst} function:
 588
 589       \startlambda
 590       fst :: \forall t1. \forall t2. (t1, t2) ->t1
 591       fst = λt1.λt2.λ(tup :: (t1, t2)). case tup of (,) a b -> a
 592
 593       fstint :: (Int, Int) -> Int
 594       fstint = λa.λb.fst @Int @Int a b
 595       \stoplambda
 596
 597       The type of \lam{fst} has two universally quantified type variables. When
 598       \lam{fst} is applied in \lam{fstint}, it is first applied to two types.
 599       (which are substitued for \lam{t1} and \lam{t2} in the type of \lam{fst}, so
 600       the actual type of arguments and result of \lam{fst} can be found:
 601       \lam{fst @Int @Int :: (Int, Int) -> Int}).
 602     \stopdesc
 603
 604     \subsection[sec:prototype:coretypes]{Core type system}
 605       Whereas the expression syntax of Core is very simple, its type system is
 606       a bit more complicated. It turns out it is harder to \quote{desugar}
 607       Haskell's complex type system into something more simple. Most of the
 608       type system is thus very similar to that of Haskell.
 609
 610       We will slightly limit our view on Core's type system, since the more
 611       complicated parts of it are only meant to support Haskell's (or rather,
 612       \GHC's) type extensions, such as existential types, \small{GADT}s, type
 613       families and other non-standard Haskell stuff which we don't (plan to)
 614       support.
 615
 616       In Core, every expression is typed. The translation to Core happens
 617       after the typechecker, so types in Core are always correct as well
 618       (though you could of course construct invalidly typed expressions
 619       through the \GHC API).
 620
 621       Any type in core is one of the following:
 622
 623       \startdesc{A type variable}
 624         \startlambda
 625         t
 626         \stoplambda
 627
 628         This is a reference to a type defined elsewhere. This can either be a
 629         polymorphic type (like the latter two \lam{t}'s in \lam{id :: \forall t.
 630         t -> t}), or a type constructor (like \lam{Bool} in \lam{not :: Bool ->
 631         Bool}). Like in Haskell, polymorphic type variables always
 632         start with a lowercase letter, while type constructors always start
 633         with an uppercase letter.
 634
 635         \todo{How to define (new) type constructors?}
 636
 637         A special case of a type constructor is the \emph{function type
 638         constructor}, \lam{->}. This is a type constructor taking two arguments
 639         (using application below). The function type constructor is commonly
 640         written inline, so we write \lam{a -> b} when we really mean \lam{-> a
 641         b}, the function type constructor applied to \lam{a} and \lam{b}.
 642
 643         Polymorphic type variables can only be defined by a lambda
 644         abstraction, see the forall type below.
 645       \stopdesc
 646
 647       \startdesc{A type application}
 648         \startlambda
 649           Maybe Int
 650         \stoplambda
 651
 652         This applies some type to another type. This is particularly used to
 653         apply type variables (type constructors) to their arguments.
 654
 655         As mentioned above, applications of some type constructors have
 656         special notation. In particular, these are applications of the
 657         \emph{function type constructor} and \emph{tuple type constructors}:
 658         \startlambda
 659           foo :: t1 -> t2
 660           foo' :: -> t1 t2
 661           bar :: (t1, t2, t3)
 662           bar' :: (,,) t1 t2 t3
 663         \stoplambda
 664       \stopdesc
 665
 666       \startdesc{The forall type}
 667         \startlambda
 668           id :: \forall t. t -> t
 669         \stoplambda
 670         The forall type introduces polymorphism. It is the only way to
 671         introduce new type variables, which are completely unconstrained (Any
 672         possible type can be assigned to it). Constraints can be added later
 673         using predicate types, see below.
 674
 675         A forall type is always (and only) introduced by a type lambda
 676         expression. For example, the Core translation of the
 677         id function is:
 678         \startlambda
 679           id = λt.λ(x :: t).x
 680         \stoplambda
 681
 682         Here, the type of the binder \lam{x} is \lam{t}, referring to the
 683         binder in the topmost lambda.
 684
 685         When using a value with a forall type, the actual type
 686         used must be applied first. For example haskell expression \hs{id
 687         True} (the function \hs{id} appleid to the dataconstructor \hs{True})
 688         translates to the following Core:
 689
 690         \startlambda
 691         id @Bool True
 692         \stoplambda
 693
 694         Here, id is first applied to the type to work with. Note that the type
 695         then changes from \lam{id :: \forall t. t -> t} to \lam{id @Bool ::
 696         Bool -> Bool}. Note that the type variable \lam{a} has been
 697         substituted with the actual type.
 698
 699         In Haskell, forall types are usually not explicitly specified (The use
 700         of a lowercase type variable implicitly introduces a forall type for
 701         that variable). In fact, in standard Haskell there is no way to
 702         explicitly specify forall types. Through a language extension, the
 703         \hs{forall} keyword is available, but still optional for normal forall
 704         types (it is needed for \emph{existentially quantified types}, which
 705         Cλash does not support).
 706       \stopdesc
 707
 708       \startdesc{Predicate type}
 709         \startlambda
 710           show :: \forall t. Show t ⇒ t → String
 711         \stoplambda
 712
 713         \todo{Sidenote: type classes?}
 714
 715         A predicate type introduces a constraint on a type variable introduced
 716         by a forall type (or type lambda). In the example above, the type
 717         variable \lam{t} can only contain types that are an \emph{instance} of
 718         the \emph{type class} \lam{Show}. \refdef{type class}
 719
 720         There are other sorts of predicate types, used for the type families
 721         extension, which we will not discuss here.
 722
 723         A predicate type is introduced by a lambda abstraction. Unlike with
 724         the forall type, this is a value lambda abstraction, that must be
 725         applied to a value. We call this value a \emph{dictionary}.
 726
 727         Without going into the implementation details, a dictionary can be
 728         seen as a lookup table all the methods for a given (single) type class
 729         instance. This means that all the dictionaries for the same type class
 730         look the same (\eg contain methods with the same names). However,
 731         dictionaries for different instances of the same class contain
 732         different methods, of course.
 733
 734         A dictionary is introduced by \small{GHC} whenever it encounters an
 735         instance declaration. This dictionary, as well as the binder
 736         introduced by a lambda that introduces a dictionary, have the
 737         predicate type as their type. These binders are usually named starting
 738         with a \lam{\$}. Usually the name of the type concerned is not
 739         reflected in the name of the dictionary, but the name of the type
 740         class is. The Haskell expression \hs{show True} thus becomes:
 741
 742         \startlambda
 743         show @Bool \$dShow True
 744         \stoplambda
 745       \stopdesc
 746
 747       Using this set of types, all types in basic Haskell can be represented.
 748
 749       \todo{Overview of polymorphism with more examples (or move examples
 750       here)}.
 751
 752   \section[sec:prototype:statetype]{State annotations in Haskell}
 753     As noted in \in{section}[sec:description:stateann], Cλash needs some
 754     way to let the programmer explicitly specify which of a function's
 755     arguments and which part of a function's result represent the
 756     function's state.
 757
 758     Using the Haskell type systems, there are a few ways we can tackle this.
 759
 760     \subsection{Type synonyms}
 761       Haskell provides type synonyms as a way to declare a new type that is
 762       equal to an existing type (or rather, a new name for an existing type).
 763       This allows both the original type and the synonym to be used
 764       interchangedly in a Haskell program. This means no explicit conversion
 765       is needed either. For example, a simple accumulator would become:
 766
 767       \starthaskell
 768       type State s = s
 769       acc :: Word -> State Word -> (State Word, Word)
 770       acc i s = let sum = s + i in (sum, sum)
 771       \stophaskell
 772
 773       This looks nice in Haskell, but turns out to be hard to implement. There
 774       are no explicit conversion in Haskell, but not in Core either. This
 775       means the type of a value might be show as \hs{AccState} in some places,
 776       but \hs{Word} in others (and this can even change due to
 777       transformations). Since every binder has an explicit type associated
 778       with it, the type of every function type will be properly preserved and
 779       could be used to track down the statefulness of each value by the
 780       compiler. However, this makes the implementation a lot more complicated
 781       than it currently is using \hs{newtypes}.
 782
 783     % Use \type instead of \hs here, since the latter breaks inside
 784     % section headings.
 785     \subsection{Type renaming (\type{newtype})}
 786       Haskell also supports type renamings as a way to declare a new type that
 787       has the same (runtime) representation as an existing type (but is in
 788       fact a different type to the typechecker). With type renaming, an
 789       explicit conversion between values of the two types is needed. The
 790       accumulator would then become:
 791
 792       \starthaskell
 793       newtype State s = State s
 794       acc :: Word -> State Word -> (State Word, Word)
 795       acc i (State s) = let sum = s + i in (State sum, sum)
 796       \stophaskell
 797
 798       The \hs{newtype} line declares a new type \hs{State} that has one type
 799       argument, \hs{s}. This type contains one \quote{constructor} \hs{State}
 800       with a single argument of type \hs{s}. It is customary to name the
 801       constructor the same as the type, which is allowed (since types can
 802       never cause name collisions with values). The difference with the type
 803       synonym example is in the explicit conversion between the \hs{State
 804       Word} and \hs{Word} types by pattern matching and by using the explicit
 805       the \hs{State constructor}.
 806
 807       This explicit conversion makes the \VHDL generation easier: Whenever we
 808       remove (unpack) the \hs{State} type, this means we are accessing the
 809       current state (\eg, accessing the register output). Whenever we are a
 810       adding (packing) the \hs{State} type, we are producing a new value for
 811       the state (\eg, providing the register input).
 812
 813       When dealing with nested states (a stateful function that calls stateful
 814       functions, which might call stateful functions, etc.) the state type
 815       could quickly grow complex because of all the \hs{State} type constructors
 816       needed. For example, consider the following state type (this is just the
 817       state type, not the entire function type):
 818
 819       \starttyping
 820       State (State Bit, State (State Word, Bit), Word)
 821       \stoptyping
 822
 823       We cannot leave all these \hs{State} type constructors out, since that
 824       would change the type (unlike when using type synonyms). However, when
 825       using type synonyms to hide away substates (see
 826       \in{section}[sec:prototype:substatesynonyms] below), this
 827       disadvantage should be limited.
 828
 829       \subsubsection{Different input and output types}
 830         An alternative could be to use different types for input and output
 831         state (\ie current and updated state). The accumulator example would
 832         then become something like:
 833
 834         \starthaskell
 835         newtype StateIn s = StateIn s
 836         newtype StateOut s = StateOut s
 837         acc :: Word -> StateIn Word -> (StateIn Word, Word)
 838         acc i (StateIn s) = let sum = s + i in (StateIn sum, sum)
 839         \stophaskell
 840
 841         This could make the implementation easier and the hardware
 842         descriptions less errorprone (you can no longer \quote{forget} to
 843         unpack and repack a state variable and just return it directly, which
 844         can be a problem in the current prototype). However, it also means we
 845         need twice as many type synonyms to hide away substates, making this
 846         approach a bit cumbersome. It also makes it harder to copmare input
 847         and output state types, possible reducing the type safety of the
 848         descriptions.
 849
 850     \subsection[sec:prototype:substatesynonyms]{Type synonyms for substates}
 851       As noted above, when using nested (hierarchical) states, the state types
 852       of the \quote{upper} functions (those that call other functions, which
 853       call other functions, etc.) quickly becomes complicated. Also, when the
 854       state type of one of the \quote{lower} functions changes, the state
 855       types of all the upper functions changes as well. If the state type for
 856       each function is explicitly and completely specified, this means that a
 857       lot of code needs updating whenever a state type changes.
 858
 859       To prevent this, it is recommended (but not enforced) to use a type
 860       synonym for the state type of every function. Every function calling
 861       other functions will then use the state type synonym of the called
 862       functions in its own type, requiring no code changes when the state type
 863       of a called function changes. This approach is used in
 864       \in{example}[ex:AvgState] below. The \hs{AccState} and \hs{AvgState}
 865       are examples of such state type synonyms.
 866
 867     \subsection{Chosen approach}
 868       To keep implementation simple, the current prototype uses the type
 869       renaming approach, with a single type for both input and output
 870       states. In the future, it might be worthwhile to revisit this
 871       approach if more complicated flow analysis is implemented for
 872       state variables. This analysis is needed to add proper error
 873       checking anyway and might allow the use of type synonyms without
 874       losing any expressivity.
 875
 876       \subsubsection{Example}
 877         As an example of the used approach, there is a simple averaging circuit in
 878         \in{example}[ex:AvgState]. This circuit lets the accumulation of the
 879         inputs be done by a subcomponent, \hs{acc}, but keeps a count of value
 880         accumulated in its own state.\footnote{Currently, the prototype
 881         is not able to compile this example, since the builtin function
 882         for division has not been added.}
 883
 884         \startbuffer[AvgState]
 885           -- The state type annotation
 886           newtype State s = State s
 887
 888           -- The accumulator state type
 889           type AccState = State Word
 890           -- The accumulator
 891           acc :: Word -> AccState -> (AccState, Word)
 892           acc i (State s) = let sum = s + i in (State sum, sum)
 893
 894           -- The averaging circuit state type
 895           type AvgState = State (AccState, Word)
 896           -- The averaging circuit
 897           avg :: Word -> AvgState -> (AvgState, Word)
 898           avg i (State s) = (State s', o)
 899             where
 900               (accs, count) = s
 901               -- Pass our input through the accumulator, which outputs a sum
 902               (accs', sum) = acc i accs
 903               -- Increment the count (which will be our new state)
 904               count' = count + 1
 905               -- Compute the average
 906               o = sum / count'
 907               s' = (accs', count')
 908         \stopbuffer
 909
 910         \placeexample[here][ex:AvgState]{Simple stateful averaging circuit.}
 911           %\startcombination[2*1]
 912             {\typebufferhs{AvgState}}%{Haskell description using function applications.}
 913           %  {\boxedgraphic{AvgState}}{The architecture described by the Haskell description.}
 914           %\stopcombination
 915         \todo{Picture}
 916
 917   \section{Implementing state}
 918     Now its clear how to put state annotations in the Haskell source,
 919     there is the question of how to implement this state translation. As
 920     we've seen in \in{section}[sec:prototype:design], the translation to
 921     \VHDL happens as a simple, final step in the compilation process.
 922     This step works on a core expression in normal form. The specifics
 923     of normal form will be explained in
 924     \in{chapter}[chap:normalization], but the examples given should be
 925     easy to understand using the definitin of Core given above.
 926
 927         \startbuffer[AvgStateNormal]
 928           acc = λi.λspacked.
 929             let
 930               -- Remove the State newtype
 931               s = spacked ▶ Word
 932               s' = s + i
 933               o = s + i
 934               -- Add the State newtype again
 935               spacked' = s' ▶ State Word
 936               res = (spacked', o)
 937             in
 938               res
 939
 940           avg = λi.λspacked.
 941             let
 942               s = spacked ▶ (AccState, Word)
 943               accs = case s of (accs, _) -> accs
 944               count = case s of (_, count) -> count
 945               accres = acc i accs
 946               accs' = case accres of (accs', _) -> accs'
 947               sum = case accres of (_, sum) -> sum
 948               count' = count + 1
 949               o = sum / count'
 950               s' = (accs', count')
 951               spacked' = s' ▶ State (AccState, Word)
 952               res = (spacked', o)
 953             in
 954               res
 955         \stopbuffer
 956
 957         \placeexample[here][ex:AvgStateNormal]{Normalized version of \in{example}[ex:AvgState]}
 958             {\typebufferlam{AvgStateNormal}}
 959
 960     \subsection[sec:prototype:statelimits]{State in normal form}
 961       Before describing how to translate state from normal form to
 962       \VHDL, we will first see how state handling looks in normal form.
 963       What limitations are there on their use to guarantee that proper
 964       \VHDL can be generated?
 965
 966       We will try to formulate a number of rules about what operations are
 967       allowed with state variables. These rules apply to the normalized Core
 968       representation, but will in practice apply to the original Haskell
 969       hardware description as well. Ideally, these rules would become part
 970       of the intended normal form definition \refdef{intended normal form
 971       definition}, but this is not the case right now. This can cause some
 972       problems, which are detailed in
 973       \in{section}[sec:normalization:stateproblems].
 974
 975       In these rules we use the terms \emph{state variable} to refer to any
 976       variable that has a \lam{State} type. A \emph{state-containing
 977       variable} is any variable whose type contains a \lam{State} type,
 978       but is not one itself (like \lam{(AccState, Word)} in the example,
 979       which is a tuple type, but contains \lam{AccState}, which is again
 980       equal to \lam{State Word}).
 981
 982       We also use a distinction between \emph{input} and \emph{output
 983       (state) variables} and \emph{substate variables}, which will be
 984       defined in the rules themselves.
 985
 986       \startdesc{State variables can appear as an argument.}
 987         \startlambda
 988           avg = λi.λspacked. ...
 989         \stoplambda
 990
 991         Any lambda that binds a variable with a state type, creates a new
 992         input state variable.
 993       \stopdesc
 994
 995       \startdesc{Input state variables can be unpacked.}
 996         \startlambda
 997           s = spacked ▶ (AccState, Word)
 998         \stoplambda
 999
1000         An input state variable may be unpacked using a cast operation. This
1001         removes the \lam{State} type renaming and the result has no longer a
1002         \lam{State} type.
1003
1004         If the result of this unpacking does not have a state type and does
1005         not contain state variables, there are no limitations on its use.
1006         Otherwise if it does not have a state type but does contain
1007         substates, we refer to it as a \emph{state-containing input
1008         variable} and the limitations below apply. If it has a state type
1009         itself, we refer to it as an \emph{input substate variable} and the
1010         below limitations apply as well.
1011
1012         It may seem strange to consider a variable that still has a state
1013         type directly after unpacking, but consider the case where a
1014         function does not have any state of its own, but does call a single
1015         stateful function. This means it must have a state argument that
1016         contains just a substate. The function signature of such a function
1017         could look like:
1018
1019         \starthaskell
1020           type FooState = State AccState
1021         \stophaskell
1022
1023         Which is of course equivalent to \lam{State (State Word)}.
1024       \stopdesc
1025
1026       \startdesc{Variables can be extracted from state-containing input variables.}
1027         \startlambda
1028           accs = case s of (accs, _) -> accs
1029         \stoplambda
1030
1031         A state-containing input variable is typically a tuple containing
1032         multiple elements (like the current function's state, substates or
1033         more tuples containing substates). All of these can be extracted
1034         from an input variable using an extractor case (or possibly
1035         multiple, when the input variable is nested).
1036
1037         If the result has no state type and does not contain any state
1038         variables either, there are no further limitations on its use. If
1039         the result has no state type but does contain state variables we
1040         refer to it as a \emph{state-containing input variable} and this
1041         limitation keeps applying. If the variable has a state type itself,
1042         we refer to it as an \emph{input substate variable} and below
1043         limitations apply.
1044
1045       \startdesc{Input substate variables can be passed to functions.}
1046         \startlambda
1047           accres = acc i accs
1048           accs' = case accres of (accs', _) -> accs'
1049         \stoplambda
1050
1051         An input substate variable can (only) be passed to a function.
1052         Additionally, every input substate variable must be used in exactly
1053         \emph{one} application, no more and no less.
1054
1055         The function result should contain exactly one state variable, which
1056         can be extracted using (multiple) case expressions. The extracted
1057         state variable is referred to the \emph{output substate}
1058
1059         The type of this output substate must be identical to the type of
1060         the input substate passed to the function.
1061       \stopdesc
1062
1063       \startdesc{Variables can be inserted into a state-containing output variable.}
1064         \startlambda
1065           s' = (accs', count')
1066         \stoplambda
1067
1068         A function's output state is usually a tuple containing its own
1069         updated state variables and all output substates. This result is
1070         built up using any single-constructor algebraic datatype.
1071
1072         The result of these expressions is referred to as a
1073         \emph{state-containing output variable}, which are subject to these
1074         limitations.
1075       \stopdesc
1076
1077       \startdesc{State containing output variables can be packed.}
1078         \startlambda
1079           spacked' = s' ▶ State (AccState, Word)
1080         \stoplambda
1081
1082         As soon as all a functions own update state and output substate
1083         variables have been joined together, the resulting
1084         state-containing output variable can be packed into an output
1085         state variable. Packing is done by casting into a state type.
1086       \stopdesc
1087
1088       \startdesc{Output state variables can appear as (part of) a function result.}
1089         \startlambda
1090           avg = λi.λspacked.
1091             let
1092             \vdots
1093             res = (spacked', o)
1094           in
1095             res
1096         \stoplambda
1097         When the output state is packed, it can be returned as a part
1098         of the function result. Nothing else can be done with this
1099         value (or any value that contains it).
1100       \stopdesc
1101
1102       There is one final limitation that is hard to express in the above
1103       itemization. Whenever substates are extracted from the input state
1104       to be passed to functions, the corresponding output substates
1105       should be inserted into the output state in the same way. In other
1106       words, each pair of corresponding substates in the input and
1107       output states should be passed / returned from the same called
1108       function.
1109
1110       The prototype currently does not check much of the above
1111       conditions. This means that if the conditions are violated,
1112       sometimes a compile error is generated, but in other cases output
1113       can be generated that is not valid \VHDL or at the very least does
1114       not correspond to the input.
1115
1116     \subsection{Translating to \VHDL}
1117       As noted above, the basic approach when generating \VHDL for stateful
1118       functions is to generate a single register for every stateful function.
1119       We look around the normal form to find the let binding that removes the
1120       \lam{State} newtype (using a cast). We also find the let binding that
1121       adds a \lam{State} type. These are connected to the output and the input
1122       of the generated let binding respectively. This means that there can
1123       only be one let binding that adds and one that removes the \lam{State}
1124       type. It is easy to violate this constraint. This problem is detailed in
1125       \in{section}[sec:normalization:stateproblems].
1126
1127       This approach seems simple enough, but will this also work for more
1128       complex stateful functions involving substates?  Observe that any
1129       component of a function's state that is a substate, \ie passed on as
1130       the state of another function, should have no influence on the
1131       hardware generated for the calling function. Any state-specific
1132       \small{VHDL} for this component can be generated entirely within the
1133       called function.  So, we can completely ignore substates when
1134       generating \VHDL for a function.
1135
1136       From this observation, we might think to remove the substates from a
1137       function's states alltogether, and leave only the state components
1138       which are actual states of the current function. While doing this
1139       would not remove any information needed to generate \small{VHDL} from
1140       the function, it would cause the function definition to become invalid
1141       (since we won't have any substate to pass to the functions anymore).
1142       We could solve the syntactic problems by passing \type{undefined} for
1143       state variables, but that would still break the code on the semantic
1144       level (\ie, the function would no longer be semantically equivalent to
1145       the original input).
1146
1147       To keep the function definition correct until the very end of the
1148       process, we will not deal with (sub)states until we get to the
1149       \small{VHDL} generation.  Then, we are translating from Core to
1150       \small{VHDL}, and we can simply ignore substates, effectively removing
1151       the substate components alltogether.
1152
1153       But, how will we know what exactly is a substate? Since any state
1154       argument or return value that represents state must be of the
1155       \type{State} type, we can look at the type of a value. However, we
1156       must be careful to ignore only \emph{substates}, and not a
1157       function's own state.
1158
1159       In \in{example}[ex:AvgStateNorm] above, we should generate a register
1160       connected with its output connected to \lam{s} and its input connected
1161       to \lam{s'}. However, \lam{s'} is build up from both \lam{accs'} and
1162       \lam{count'}, while only \lam{count'} should end up in the register.
1163       \lam{accs'} is a substate for the \lam{acc} function, for which a
1164       register will be created when generating \VHDL for the \lam{acc}
1165       function.
1166
1167       Fortunately, the \lam{accs'} variable (and any other substate) has a
1168       property that we can easily check: It has a \lam{State} type
1169       annotation. This means that whenever \VHDL is generated for a tuple
1170       (or other algebraic type), we can simply leave out all elements that
1171       have a \lam{State} type. This will leave just the parts of the state
1172       that do not have a \lam{State} type themselves, like \lam{count'},
1173       which is exactly a function's own state. This approach also means that
1174       the state part of the result is automatically excluded when generating
1175       the output port, which is also required.
1176
1177       We can formalize this translation a bit, using the following
1178       rules.
1179
1180       \startitemize
1181         \item A state unpack operation should not generate any \small{VHDL}.
1182         The binder to which the unpacked state is bound should still be
1183         declared, this signal will become the register and will hold the
1184         current state.
1185         \item A state pack operation should not generate any \small{VHDL}.
1186         The binder to which the packed state is bound should not be
1187         declared. The binder that is packed is the signal that will hold the
1188         new state.
1189         \item Any values of a State type should not be translated to
1190         \small{VHDL}. In particular, State elements should be removed from
1191         tuples (and other datatypes) and arguments with a state type should
1192         not generate ports.
1193         \item To make the state actually work, a simple \small{VHDL}
1194         (sequential) process should be generated. This process updates
1195         the state at every clockcycle, by assigning the new state to the
1196         current state. This will be recognized by synthesis tools as a
1197         register specification.
1198       \stopitemize
1199
1200       When applying these rules to the description in
1201       \in{example}[ex:AvgStateNormal], we be left with the description
1202       in \in{example}[ex:AvgStateRemoved]. All the parts that don't
1203       generate any \VHDL directly are crossed out, leaving just the
1204       actual flow of values in the final hardware.
1205
1206       \startlambda
1207         avg = iλ.λ--spacked.--
1208           let
1209             s = --spacked ▶ (AccState, Word)--
1210             --accs = case s of (accs, _) -> accs--
1211             count = case s of (--_,-- count) -> count
1212             accres = acc i --accs--
1213             --accs' = case accres of (accs', _) -> accs'--
1214             sum = case accres of (--_,-- sum) -> sum
1215             count' = count + 1
1216             o = sum / count'
1217             s' = (--accs',-- count')
1218             --spacked' = s' ▶ State (AccState, Word)--
1219             res = (--spacked',-- o)
1220           in
1221             res
1222       \stoplambda
1223
1224       When we would really leave out the crossed out parts, we get a slightly
1225       weird program: There is a variable \lam{s} which has no value, and there
1226       is a variable \lam{s'} that is never used. Together, these two will form
1227       the state process of the function. \lam{s} contains the "current" state,
1228       \lam{s'} is assigned the "next" state. So, at the end of each clock
1229       cycle, \lam{s'} should be assigned to \lam{s}.
1230
1231       As you can see, the definition of \lam{s'} is still present, since
1232       it does not have a state type. The \lam{accums'} substate has been
1233       removed, leaving us just with the state of \lam{avg} itself.
1234
1235       As an illustration of the result of this function,
1236       \in{example}[ex:AccStateVHDL] and \in{example}[ex:AvgStateVHDL] show the the \VHDL that is
1237       generated from the examples is this section.
1238
1239       \startbuffer[AvgStateVHDL]
1240         entity avgComponent_0 is
1241              port (\izAlE2\ : in \unsigned_31\;
1242                    \foozAo1zAo12\ : out \(,)unsigned_31\;
1243                    clock : in std_logic;
1244                    resetn : in std_logic);
1245         end entity avgComponent_0;
1246
1247
1248         architecture structural of avgComponent_0 is
1249              signal \szAlG2\ : \(,)unsigned_31\;
1250              signal \countzAlW2\ : \unsigned_31\;
1251              signal \dszAm62\ : \(,)unsigned_31\;
1252              signal \sumzAmk3\ : \unsigned_31\;
1253              signal \reszAnCzAnM2\ : \unsigned_31\;
1254              signal \foozAnZzAnZ2\ : \unsigned_31\;
1255              signal \reszAnfzAnj3\ : \unsigned_31\;
1256              signal \s'zAmC2\ : \(,)unsigned_31\;
1257         begin
1258              \countzAlW2\ <= \szAlG2\.A;
1259
1260              \comp_ins_dszAm62\ : entity accComponent_1
1261                                        port map (\izAob3\ => \izAlE2\,
1262                                                  \foozAoBzAoB2\ => \dszAm62\,
1263                                                  clock => clock,
1264                                                  resetn => resetn);
1265
1266              \sumzAmk3\ <= \dszAm62\.A;
1267
1268              \reszAnCzAnM2\ <= to_unsigned(1, 32);
1269
1270              \foozAnZzAnZ2\ <= \countzAlW2\ + \reszAnCzAnM2\;
1271
1272              \reszAnfzAnj3\ <= \sumzAmk3\ * \foozAnZzAnZ2\;
1273
1274              \s'zAmC2\.A <= \foozAnZzAnZ2\;
1275
1276              \foozAo1zAo12\.A <= \reszAnfzAnj3\;
1277
1278              state : process (clock, resetn)
1279              begin
1280                   if resetn = '0' then
1281                   elseif rising_edge(clock) then
1282                        \szAlG2\ <= \s'zAmC2\;
1283                   end if;
1284              end process state;
1285         end architecture structural;
1286       \stopbuffer
1287       \startbuffer[AccStateVHDL]
1288         entity accComponent_1 is
1289              port (\izAob3\ : in \unsigned_31\;
1290                    \foozAoBzAoB2\ : out \(,)unsigned_31\;
1291                    clock : in std_logic;
1292                    resetn : in std_logic);
1293         end entity accComponent_1;
1294
1295
1296         architecture structural of accComponent_1 is
1297              signal \szAod3\ : \unsigned_31\;
1298              signal \reszAonzAor3\ : \unsigned_31\;
1299         begin
1300              \reszAonzAor3\ <= \szAod3\ + \izAob3\;
1301
1302              \foozAoBzAoB2\.A <= \reszAonzAor3\;
1303
1304              state : process (clock, resetn)
1305              begin
1306                   if resetn = '0' then
1307                   elseif rising_edge(clock) then
1308                        \szAod3\ <= \reszAonzAor3\;
1309                   end if;
1310              end process state;
1311         end architecture structural;
1312       \stopbuffer
1313
1314       \placeexample[][ex:AccStateVHDL]{\VHDL generated for acc from \in{example}[ex:AvgState]}
1315           {\typebuffer[AccStateVHDL]}
1316       \placeexample[][ex:AvgStateVHDL]{\VHDL generated for avg from \in{example}[ex:AvgState]}
1317           {\typebuffer[AvgStateVHDL]}
1318 %    \subsection{Initial state}
1319 %      How to specify the initial state? Cannot be done inside a hardware
1320 %      function, since the initial state is its own state argument for the first
1321 %      call (unless you add an explicit, synchronous reset port).
1322 %
1323 %      External init state is natural for simulation.
1324 %
1325 %      External init state works for hardware generation as well.
1326 %
1327 %      Implementation issues: state splitting, linking input to output state,
1328 %      checking usage constraints on state variables.
1329 %
1330 %      \todo{Implementation issues: Separate compilation, simplified core.}
1331 %
1332 % vim: set sw=2 sts=2 expandtab: