Chapters/Prototype.tex

   1 \chapter[chap:prototype]{Prototype}
   2   An important step in this research is the creation of a prototype compiler.
   3   Having this prototype allows us to apply the ideas from the previous chapter
   4   to actual hardware descriptions and evaluate their usefulness. Having a
   5   prototype also helps to find new techniques and test possible
   6   interpretations.
   7
   8   Obviously the prototype was not created after all research
   9   ideas were formed, but its implementation has been interleaved with the
  10   research itself. Also, the prototype described here is the final version, it
  11   has gone through a number of design iterations which we will not completely
  12   describe here.
  13
  14   \section[sec:prototype:input]{Input language}
  15     When implementing this prototype, the first question to ask is:
  16     Which (functional) language will be used to describe our hardware?
  17     (Note that this does not concern the \emph{implementation language}
  18     of the compiler, just the language \emph{translated by} the
  19     compiler).
  20
  21     Initially, we have two choices:
  22
  23     \startitemize
  24       \item Create a new functional language from scratch. This has the
  25       advantage of having a language that contains exactly those elements that
  26       are convenient for describing hardware and can contain special
  27       constructs that allows our hardware descriptions to be more powerful or
  28       concise.
  29       \item Use an existing language and create a new backend for it. This has
  30       the advantage that existing tools can be reused, which will speed up
  31       development.
  32     \stopitemize
  33
  34
  35     \placeintermezzo{}{
  36       \startframedtext[width=8cm,background=box,frame=no]
  37       \startalignment[center]
  38         {\tfa No \small{EDSL} or Template Haskell}
  39       \stopalignment
  40       \blank[medium]
  41
  42       Note that in this consideration, embedded domain-specific
  43       languages (\small{EDSL}) and Template Haskell (\small{TH})
  44       approaches have not been included. As we have seen in
  45       \in{section}[sec:context:fhdls], these approaches have all kinds
  46       of limitations on the description language that we would like to
  47       avoid.
  48       \stopframedtext
  49     }
  50     Considering that we required a prototype which should be working quickly,
  51     and that implementing parsers, semantic checkers and especially
  52     typcheckers is not exactly the core of this research (but it is lots and
  53     lots of work!), using an existing language is the obvious choice. This
  54     also has the advantage that a large set of language features is available
  55     to experiment with and it is easy to find which features apply well and
  56     which do not. A possible second prototype could use a custom language with
  57     just the useful features (and possibly extra features that are specific to
  58     the domain of hardware description as well).
  59
  60     The second choice is which of the many existing languages to use. As
  61     mentioned before, the chosen language is Haskell.  This choice has not been the
  62     result of a thorough comparison of languages, for the simple reason that
  63     the requirements on the language were completely unclear at the start of
  64     this research. The fact that Haskell is a language with a broad spectrum
  65     of features, that it is commonly used in research projects and that the
  66     primary compiler, \GHC, provides a high level API to its internals, made
  67     Haskell an obvious choice.
  68
  69   \section[sec:prototype:output]{Output format}
  70     The second important question is: What will be our output format?
  71     This output format should at least allow for programming the
  72     hardware design into a field-programmable gate array (\small{FPGA}).
  73     The choice of output format is thus limited by what hardware
  74     synthesis and programming tools can process.
  75
  76     Looking at other tools in the industry, the Electronic Design Interchange
  77     Format (\small{EDIF}) is commonly used for storing intermediate
  78     \emph{netlists} (lists of components and connections between these
  79     components) and is commonly the target for \small{VHDL} and Verilog
  80     compilers.
  81
  82     However, \small{EDIF} is not completely tool-independent. It specifies a
  83     meta-format, but the hardware components that can be used vary between
  84     various tool and hardware vendors, as well as the interpretation of the
  85     \small{EDIF} standard. \cite[li89]
  86
  87     This means that when working with \small{EDIF}, our prototype would become
  88     technology dependent (\eg only work with \small{FPGA}s of a specific
  89     vendor, or even only with specific chips). This limits the applicability
  90     of our prototype. Also, the tools we would like to use for verifying,
  91     simulating and draw pretty pictures of our output (like Precision, or
  92     QuestaSim) are designed for \small{VHDL} or Verilog input.
  93
  94     For these reasons, we will not use \small{EDIF}, but \small{VHDL} as our
  95     output language.  We choose \VHDL over Verilog simply because we are
  96     familiar with \small{VHDL} already. The differences between \small{VHDL}
  97     and Verilog are on the higher level, while we will be using \small{VHDL}
  98     mainly to write low level, netlist-like descriptions anyway.
  99
 100     An added advantage of using VHDL is that we can profit from existing
 101     optimizations in VHDL synthesizers. A lot of optimizations are done on the
 102     VHDL level by existing tools. These tools have been under
 103     development for years, so it would not be reasonable to assume we
 104     could achieve a similar amount of optimization in our prototype (nor
 105     should it be a goal, considering this is just a prototype).
 106
 107     Note that we will be using \small{VHDL} as our output language, but will
 108     not use its full expressive power. Our output will be limited to using
 109     simple, structural descriptions, without any complex behavioural
 110     descriptions like arbitrary sequential statements (which might not
 111     be supported by all tools). This ensures that any tool that works
 112     with \VHDL will understand our output (most tools do not support
 113     synthesis of more complex \VHDL).  This also leaves open the option
 114     to switch to \small{EDIF} in the future, with minimal changes to the
 115     prototype.
 116
 117   \section[sec:prototype:design]{Prototype design}
 118     As suggested above, we will use the Glasgow Haskell Compiler (\small{GHC}) to
 119     implement our prototype compiler. To understand the design of the
 120     compiler, we will first dive into the \small{GHC} compiler a bit. Its
 121     compilation consists of the following steps (slightly simplified):
 122
 123     \startuseMPgraphic{ghc-pipeline}
 124       % Create objects
 125       save inp, front, desugar, simpl, back, out;
 126       newEmptyBox.inp(0,0);
 127       newBox.front(btex Frontend etex);
 128       newBox.desugar(btex Desugarer etex);
 129       newBox.simpl(btex Simplifier etex);
 130       newBox.back(btex Backend etex);
 131       newEmptyBox.out(0,0);
 132
 133       % Space the boxes evenly
 134       inp.c - front.c = front.c - desugar.c = desugar.c - simpl.c
 135         = simpl.c - back.c = back.c - out.c = (0, 1.5cm);
 136       out.c = origin;
 137
 138       % Draw lines between the boxes. We make these lines "deferred" and give
 139       % them a name, so we can use ObjLabel to draw a label beside them.
 140       ncline.inp(inp)(front) "name(haskell)";
 141       ncline.front(front)(desugar) "name(ast)";
 142       ncline.desugar(desugar)(simpl) "name(core)";
 143       ncline.simpl(simpl)(back) "name(simplcore)";
 144       ncline.back(back)(out) "name(native)";
 145       ObjLabel.inp(btex Haskell source etex) "labpathname(haskell)", "labdir(rt)";
 146       ObjLabel.front(btex Haskell AST etex) "labpathname(ast)", "labdir(rt)";
 147       ObjLabel.desugar(btex Core etex) "labpathname(core)", "labdir(rt)";
 148       ObjLabel.simpl(btex Simplified core etex) "labpathname(simplcore)", "labdir(rt)";
 149       ObjLabel.back(btex Native code etex) "labpathname(native)", "labdir(rt)";
 150
 151       % Draw the objects (and deferred labels)
 152       drawObj (inp, front, desugar, simpl, back, out);
 153     \stopuseMPgraphic
 154     \placefigure[right]{GHC compiler pipeline}{\useMPgraphic{ghc-pipeline}}
 155
 156     \startdesc{Frontend}
 157       This step takes the Haskell source files and parses them into an
 158       abstract syntax tree (\small{AST}). This \small{AST} can express the
 159       complete Haskell language and is thus a very complex one (in contrast
 160       with the Core \small{AST}, later on). All identifiers in this
 161       \small{AST} are resolved by the renamer and all types are checked by the
 162       typechecker.
 163     \stopdesc
 164     \startdesc{Desugaring}
 165       This steps takes the full \small{AST} and translates it to the
 166       \emph{Core} language. Core is a very small functional language with lazy
 167       semantics, that can still express everything Haskell can express. Its
 168       simpleness makes Core very suitable for further simplification and
 169       translation. Core is the language we will be working with as well.
 170     \stopdesc
 171     \startdesc{Simplification}
 172       Through a number of simplification steps (such as inlining, common
 173       subexpression elimination, etc.) the Core program is simplified to make
 174       it faster or easier to process further.
 175     \stopdesc
 176     \startdesc{Backend}
 177       This step takes the simplified Core program and generates an actual
 178       runnable program for it. This is a big and complicated step we will not
 179       discuss it any further, since it is not required for our prototype.
 180     \stopdesc
 181
 182     In this process, there are a number of places where we can start our work.
 183     Assuming that we do not want to deal with (or modify) parsing, typechecking
 184     and other frontend business and that native code is not really a useful
 185     format anymore, we are left with the choice between the full Haskell
 186     \small{AST}, or the smaller (simplified) core representation.
 187
 188     The advantage of taking the full \small{AST} is that the exact structure
 189     of the source program is preserved. We can see exactly what the hardware
 190     description looks like and which syntax constructs were used. However,
 191     the full \small{AST} is a very complicated datastructure. If we are to
 192     handle everything it offers, we will quickly get a big compiler.
 193
 194     Using the core representation gives us a much more compact datastructure
 195     (a core expression only uses 9 constructors). Note that this does not mean
 196     that the core representation itself is smaller, on the contrary.
 197     Since the core language has less constructs, most Core expressions
 198     are larger than the equivalent versions in Haskell.
 199
 200     However, the fact that the core language is so much smaller, means it is a
 201     lot easier to analyze and translate it into something else. For the same
 202     reason, \small{GHC} runs its simplifications and optimizations on the core
 203     representation as well \cite[jones96].
 204
 205     However, we will use the normal core representation, not the simplified
 206     core. Reasons for this are detailed below. \todo{Ref}
 207
 208     The final prototype roughly consists of three steps:
 209
 210     \startuseMPgraphic{clash-pipeline}
 211       % Create objects
 212       save inp, front, norm, vhdl, out;
 213       newEmptyBox.inp(0,0);
 214       newBox.front(btex \small{GHC} frontend etex);
 215       newBox.norm(btex Normalization etex);
 216       newBox.vhdl(btex \small{VHDL} generation etex);
 217       newEmptyBox.out(0,0);
 218
 219       % Space the boxes evenly
 220       inp.c - front.c = front.c - norm.c = norm.c - vhdl.c
 221         = vhdl.c - out.c = (0, 1.5cm);
 222       out.c = origin;
 223
 224       % Draw lines between the boxes. We make these lines "deferred" and give
 225       % them a name, so we can use ObjLabel to draw a label beside them.
 226       ncline.inp(inp)(front) "name(haskell)";
 227       ncline.front(front)(norm) "name(core)";
 228       ncline.norm(norm)(vhdl) "name(normal)";
 229       ncline.vhdl(vhdl)(out) "name(vhdl)";
 230       ObjLabel.inp(btex Haskell source etex) "labpathname(haskell)", "labdir(rt)";
 231       ObjLabel.front(btex Core etex) "labpathname(core)", "labdir(rt)";
 232       ObjLabel.norm(btex Normalized core etex) "labpathname(normal)", "labdir(rt)";
 233       ObjLabel.vhdl(btex \small{VHDL} description etex) "labpathname(vhdl)", "labdir(rt)";
 234
 235       % Draw the objects (and deferred labels)
 236       drawObj (inp, front, norm, vhdl, out);
 237     \stopuseMPgraphic
 238     \placefigure[right]{Cλash compiler pipeline}{\useMPgraphic{clash-pipeline}}
 239
 240     \startdesc{Frontend}
 241       This is exactly the frontend from the \small{GHC} pipeline, that
 242       translates Haskell sources to a typed Core representation.
 243     \stopdesc
 244     \startdesc{Normalization}
 245       This is a step that transforms the core representation into a normal
 246       form. This normal form is still expressed in the core language, but has
 247       to adhere to an additional set of constraints. This normal form is less
 248       expressive than the full core language (e.g., it can have limited
 249       higher-order expressions, has a specific structure, etc.), but is
 250       also very close to directly describing hardware.
 251     \stopdesc
 252     \startdesc{\small{VHDL} generation}
 253       The last step takes the normal formed core representation and generates
 254       \small{VHDL} for it. Since the normal form has a specific, hardware-like
 255       structure, this final step is very straightforward.
 256     \stopdesc
 257
 258     The most interesting step in this process is the normalization step. That
 259     is where more complicated functional constructs, which have no direct
 260     hardware interpretation, are removed and translated into hardware
 261     constructs. This step is described in a lot of detail at
 262     \in{chapter}[chap:normalization].
 263
 264
 265     \defref{entry function}Translation of a hardware description always
 266     starts at a single function, which is referred to as the \emph{entry
 267     function}. \VHDL is generated for this function first, followed by
 268     any functions used by the entry functions (recursively).
 269
 270   \section[sec:prototype:core]{The Core language}
 271     \defreftxt{core}{the Core language}
 272     Most of the prototype deals with handling the program in the Core
 273     language. In this section we will show what this language looks like and
 274     how it works.
 275
 276     The Core language is a functional language that describes
 277     \emph{expressions}. Every identifier used in Core is called a
 278     \emph{binder}, since it is bound to a value somewhere. On the highest
 279     level, a Core program is a collection of functions, each of which bind a
 280     binder (the function name) to an expression (the function value, which has
 281     a function type).
 282
 283     The Core language itself does not prescribe any program structure
 284     (like modules, declarations, imports, etc.), only expression
 285     structure. In the \small{GHC} compiler, the Haskell module structure
 286     is used for the resulting Core code as well. Since this is not so
 287     relevant for understanding the Core language or the Normalization
 288     process, we will only look at the Core expression language here.
 289
 290     Each Core expression consists of one of these possible expressions.
 291
 292     \startdesc{Variable reference}
 293       \defref{variable reference}
 294       \startlambda
 295       bndr :: T
 296       \stoplambda
 297       This is a reference to a binder. It is written down as the
 298       name of the binder that is being referred to along with its type. The
 299       binder name should of course be bound in a containing scope
 300       (including top level scope, so a reference to a top level function
 301       is also a variable reference). Additionally, constructors from
 302       algebraic datatypes also become variable references.
 303
 304       In our examples, binders will commonly consist of a single
 305       characters, but they can have any length.
 306
 307       The value of this expression is the value bound to the given
 308       binder.
 309
 310       Each binder also carries around its type (explicitly shown above), but
 311       this is usually not shown in the Core expressions. Only when the type is
 312       relevant (when a new binder is introduced, for example) will it be
 313       shown. In other cases, the binder is either not relevant, or easily
 314       derived from the context of the expression. \todo{Ref sidenote on type
 315       annotations}
 316     \stopdesc
 317
 318     \startdesc{Literal}
 319       \defref{literal}
 320       \startlambda
 321       10
 322       \stoplambda
 323       This is a literal. Only primitive types are supported, like
 324       chars, strings, ints and doubles. The types of these literals are the
 325       \quote{primitive}, unboxed versions, like \lam{Char\#} and \lam{Word\#}, not the
 326       normal Haskell versions (but there are built-in conversion
 327       functions). Without going into detail about these types, note that
 328       a few conversion functions exist to convert these to the normal
 329       (boxed) Haskell equivalents.
 330     \stopdesc
 331
 332     \startdesc{Application}
 333       \defref{application}
 334       \startlambda
 335       func arg
 336       \stoplambda
 337       This is function application. Each application consists of two
 338       parts: The function part and the argument part. Applications are used
 339       for normal function \quote{calls}, but also for applying type
 340       abstractions and data constructors.
 341
 342       In core, there is no distinction between an operator and a
 343       function. This means that, for example the addition of two numbers
 344       looks like the following in Core:
 345
 346       \startlambda
 347       (+) 1 2
 348       \stoplambda
 349
 350       Where the function \quote{\lam{(+)}} is applied to the numbers 1
 351       and 2. However, to increase readability, an application of an
 352       operator like \lam{(+)} is sometimes written infix. In this case,
 353       the parenthesis are also left out, just like in Haskell. In other
 354       words, the following means exactly the same as the addition above:
 355
 356       \startlambda
 357       1 + 2
 358       \stoplambda
 359
 360       The value of an application is the value of the function part, with the
 361       first argument binder bound to the argument part.
 362     \stopdesc
 363
 364     \startdesc{Lambda abstraction}
 365       \defref{lambda abstraction}
 366       \startlambda
 367       λbndr.body
 368       \stoplambda
 369       This is the basic lambda abstraction, as it occurs in lambda calculus.
 370       It consists of a binder part and a body part.  A lambda abstraction
 371       creates a function, that can be applied to an argument. The binder is
 372       usually a value binder, but it can also be a \emph{type binder} (or
 373       \emph{type variable}). The latter case introduces a new polymorphic
 374       variable, which can be used in types later on. See
 375       \in{section}[sec:prototype:coretypes] for details.
 376
 377       The body of a lambda abstraction extends all the way to the end of
 378       the expression, or the closing bracket surrounding the lambda. In
 379       other words, the lambda abstraction \quote{operator} has the
 380       lowest priority of all.
 381
 382       The value of an application is the value of the body part, with the
 383       binder bound to the value the entire lambda abstraction is applied to.
 384     \stopdesc
 385
 386     \startdesc{Non-recursive let expression}
 387       \defref{let expression}
 388       \startlambda
 389       let bndr = value in body
 390       \stoplambda
 391       A let expression allows you to bind a binder to some value, while
 392       evaluating to some other value (for which that binder is in scope). This
 393       allows for sharing of subexpressions (you can use a binder twice) and
 394       explicit \quote{naming} of arbitrary expressions. A binder is not
 395       in scope in the value bound it is bound to, so it is not possible
 396       to make recursive definitions with a non-recursive let expression
 397       (see the recursive form below).
 398
 399       Even though this let expression is an extension on the basic lambda
 400       calculus, it is easily translated to a lambda abstraction. The let
 401       expression above would then become:
 402
 403       \startlambda
 404       (λbndr.body) value
 405       \stoplambda
 406
 407       This notion might be useful for verifying certain properties on
 408       transformations, since a lot of verification work has been done on
 409       lambda calculus already.
 410
 411       The value of a let expression is the value of the body part, with the
 412       binder bound to the value.
 413     \stopdesc
 414
 415     \startdesc{Recursive let expression}
 416       \startlambda
 417       letrec
 418         bndr1 = value1
 419         \vdots
 420         bndrn = valuen
 421       in
 422         body
 423       \stoplambda
 424       This is the recursive version of the let expression. In \small{GHC}'s
 425       Core implementation, non-recursive and recursive lets are not so
 426       distinct as we present them here, but this provides a clearer overview.
 427
 428       The main difference with the normal let expression is that it can
 429       contain multiple bindings (or even none) and each of the binders
 430       is in scope in each of the values, in addition to the body. This
 431       allows for self-recursive or mutually recursive definitions.
 432
 433       It is also possible to express a recursive let expression using
 434       normal lambda calculus, if we use the \emph{least fixed-point
 435       operator}, \lam{Y} (but the details are too complicated to help
 436       clarify the let expression, so this will not be explored further).
 437     \stopdesc
 438
 439     \placeintermezzo{}{
 440       \startframedtext[width=8cm,background=box,frame=no]
 441       \startalignment[center]
 442         {\tfa Weak head normal form (\small{WHNF})}
 443       \stopalignment
 444       \blank[medium]
 445         An expression is in weak head normal form if it is either an
 446         constructor application or lambda abstraction. \todo{How about
 447         atoms?}
 448
 449         Without going into detail about the differences with head
 450         normal form and normal form, note that evaluating the scrutinee
 451         of a case expression to normal form (evaluating any function
 452         applications, variable references and case expressions) is
 453         sufficient to decide which case alternatives should be chosen.
 454         \todo{ref?}
 455       \stopframedtext
 456
 457     }
 458
 459     \startdesc{Case expression}
 460       \defref{case expression}
 461       \startlambda
 462         case scrutinee of bndr
 463           DEFAULT -> defaultbody
 464           C0 bndr0,0 ... bndr0,m -> body0
 465           \vdots
 466           Cn bndrn,0 ... bndrn,m -> bodyn
 467       \stoplambda
 468
 469       A case expression is the only way in Core to choose between values. All
 470       \hs{if} expressions and pattern matchings from the original Haskell
 471       PRogram have been translated to case expressions by the desugarer.
 472
 473       A case expression evaluates its scrutinee, which should have an
 474       algebraic datatype, into weak head normal form (\small{WHNF}) and
 475       (optionally) binds it to \lam{bndr}. Every alternative lists a
 476       single constructor (\lam{C0 ... Cn}). Based on the actual
 477       constructor of the scrutinee, the corresponding alternative is
 478       chosen. The binders in the chosen alternative (\lam{bndr0,0 ....
 479       bndr0,m} are bound to the actual arguments to the constructor in
 480       the scrutinee.
 481
 482       This is best illustrated with an example. Assume
 483       there is an algebraic datatype declared as follows\footnote{This
 484       datatype is not suported by the current Cλash implementation, but
 485       serves well to illustrate the case expression}:
 486
 487       \starthaskell
 488       data D = A Word | B Bit
 489       \stophaskell
 490
 491       This is an algebraic datatype with two constructors, each getting
 492       a single argument. A case expression scrutinizing this datatype
 493       could look like the following:
 494
 495       \startlambda
 496         case s of
 497           A word -> High
 498           B bit -> bit
 499       \stoplambda
 500
 501       What this expression does is check the constructor of the
 502       scrutinee \lam{s}. If it is \lam{A}, it always evaluates to
 503       \lam{High}. If the constructor is \lam{B}, the binder \lam{bit} is
 504       bound to the argument passed to \lam{B} and the case expression
 505       evaluates to this bit.
 506
 507       If none of the alternatives match, the \lam{DEFAULT} alternative
 508       is chosen. A case expression must always be exhaustive, \ie it
 509       must cover all possible constructors that the scrutinee can have
 510       (if all of them are covered explicitly, the \lam{DEFAULT}
 511       alternative can be left out).
 512
 513       Since we can only match the top level constructor, there can be no overlap
 514       in the alternatives and thus order of alternatives is not relevant (though
 515       the \lam{DEFAULT} alternative must appear first for implementation
 516       efficiency).
 517
 518       To support strictness, the scrutinee is always evaluated into
 519       \small{WHNF}, even when there is only a \lam{DEFAULT} alternative. This
 520       allows aplication of the strict function \lam{f} to the argument \lam{a}
 521       to be written like:
 522
 523       \startlambda
 524       f (case a of arg DEFAULT -> arg)
 525       \stoplambda
 526
 527       According to the \GHC documentation, this is the only use for the extra
 528       binder to which the scrutinee is bound.  When not using strictness
 529       annotations (which is rather pointless in hardware descriptions),
 530       \small{GHC} seems to never generate any code making use of this binder.
 531       In fact, \GHC has never been observed to generate code using this
 532       binder, even when strictness was involved.  Nonetheless, the prototype
 533       handles this binder as expected.
 534
 535       Note that these case expressions are less powerful than the full Haskell
 536       case expressions. In particular, they do not support complex patterns like
 537       in Haskell. Only the constructor of an expression can be matched,
 538       complex patterns are implemented using multiple nested case expressions.
 539
 540       Case expressions are also used for unpacking of algebraic datatypes, even
 541       when there is only a single constructor. For examples, to add the elements
 542       of a tuple, the following Core is generated:
 543
 544       \startlambda
 545       sum = λtuple.case tuple of
 546         (,) a b -> a + b
 547       \stoplambda
 548
 549       Here, there is only a single alternative (but no \lam{DEFAULT}
 550       alternative, since the single alternative is already exhaustive). When
 551       its body is evaluated, the arguments to the tuple constructor \lam{(,)}
 552       (\eg, the elements of the tuple) are bound to \lam{a} and \lam{b}.
 553     \stopdesc
 554
 555     \startdesc{Cast expression}
 556       \defref{cast expression}
 557       \startlambda
 558       body ▶ targettype
 559       \stoplambda
 560       A cast expression allows you to change the type of an expression to an
 561       equivalent type. Note that this is not meant to do any actual work, like
 562       conversion of data from one format to another, or force a complete type
 563       change. Instead, it is meant to change between different representations
 564       of the same type, \eg switch between types that are provably equal (but
 565       look different).
 566
 567       In our hardware descriptions, we typically see casts to change between a
 568       Haskell newtype and its contained type, since those are effectively
 569       different types (so a cast is needed) with the same representation (but
 570       no work is done by the cast).
 571
 572       More complex are types that are proven to be equal by the typechecker,
 573       but look different at first glance. To ensure that, once the typechecker
 574       has proven equality, this information sticks around, explicit casts are
 575       added. In our notation we only write the target type, but in reality a
 576       cast expressions carries around a \emph{coercion}, which can be seen as a
 577       proof of equality. \todo{Example}
 578
 579       The value of a cast is the value of its body, unchanged. The type of this
 580       value is equal to the target type, not the type of its body.
 581
 582       \todo{Move and update this paragraph}
 583       Note that this syntax is also used sometimes to indicate that a particular
 584       expression has a particular type, even when no cast expression is
 585       involved. This is then purely informational, since the only elements that
 586       are explicitly typed in the Core language are the binder references and
 587       cast expressions, the types of all other elements are determined at
 588       runtime.
 589     \stopdesc
 590
 591     \startdesc{Note}
 592       The Core language in \small{GHC} allows adding \emph{notes}, which serve
 593       as hints to the inliner or add custom (string) annotations to a core
 594       expression. These should not be generated normally, so these are not
 595       handled in any way in the prototype.
 596     \stopdesc
 597
 598     \startdesc{Type}
 599       \defref{type expression}
 600       \startlambda
 601       @T
 602       \stoplambda
 603       It is possibly to use a Core type as a Core expression. To prevent
 604       confusion between types and values, the \lam{@} sign is used to
 605       explicitly mark a type that is used in a Core expression.
 606
 607       For the actual types supported by Core, see
 608       \in{section}[sec:prototype:coretypes]. This \quote{lifting} of a
 609       type into the value domain is done to allow for type abstractions
 610       and applications to be handled as normal lambda abstractions and
 611       applications above. This means that a type expression in Core can
 612       only ever occur in the argument position of an application, and
 613       only if the type of the function that is applied to expects a type
 614       as the first argument. This happens in applications of all
 615       polymorphic functions. Consider the \lam{fst} function:
 616
 617       \startlambda
 618       fst :: \forall t1. \forall t2. (t1, t2) ->t1
 619       fst = λt1.λt2.λ(tup :: (t1, t2)). case tup of (,) a b -> a
 620
 621       fstint :: (Int, Int) -> Int
 622       fstint = λa.λb.fst @Int @Int a b
 623       \stoplambda
 624
 625       The type of \lam{fst} has two universally quantified type variables. When
 626       \lam{fst} is applied in \lam{fstint}, it is first applied to two types.
 627       (which are substitued for \lam{t1} and \lam{t2} in the type of \lam{fst}, so
 628       the actual type of arguments and result of \lam{fst} can be found:
 629       \lam{fst @Int @Int :: (Int, Int) -> Int}).
 630     \stopdesc
 631
 632     \subsection[sec:prototype:coretypes]{Core type system}
 633       Whereas the expression syntax of Core is very simple, its type system is
 634       a bit more complicated. It turns out it is harder to \quote{desugar}
 635       Haskell's complex type system into something more simple. Most of the
 636       type system is thus very similar to that of Haskell.
 637
 638       We will slightly limit our view on Core's type system, since the more
 639       complicated parts of it are only meant to support Haskell's (or rather,
 640       \GHC's) type extensions, such as existential types, \small{GADT}s, type
 641       families and other non-standard Haskell stuff which we do not (plan to)
 642       support.
 643
 644       In Core, every expression is typed. The translation to Core happens
 645       after the typechecker, so types in Core are always correct as well
 646       (though you could of course construct invalidly typed expressions
 647       through the \GHC API).
 648
 649       Any type in core is one of the following:
 650
 651       \startdesc{A type variable}
 652         \startlambda
 653         t
 654         \stoplambda
 655
 656         This is a reference to a type defined elsewhere. This can either be a
 657         polymorphic type (like the latter two \lam{t}'s in \lam{id :: \forall t.
 658         t -> t}), or a type constructor (like \lam{Bool} in \lam{not :: Bool ->
 659         Bool}). Like in Haskell, polymorphic type variables always
 660         start with a lowercase letter, while type constructors always start
 661         with an uppercase letter.
 662
 663         \todo{How to define (new) type constructors?}
 664
 665         A special case of a type constructor is the \emph{function type
 666         constructor}, \lam{->}. This is a type constructor taking two arguments
 667         (using application below). The function type constructor is commonly
 668         written inline, so we write \lam{a -> b} when we really mean \lam{-> a
 669         b}, the function type constructor applied to \lam{a} and \lam{b}.
 670
 671         Polymorphic type variables can only be defined by a lambda
 672         abstraction, see the forall type below.
 673       \stopdesc
 674
 675       \startdesc{A type application}
 676         \startlambda
 677           Maybe Int
 678         \stoplambda
 679
 680         This applies some type to another type. This is particularly used to
 681         apply type variables (type constructors) to their arguments.
 682
 683         As mentioned above, applications of some type constructors have
 684         special notation. In particular, these are applications of the
 685         \emph{function type constructor} and \emph{tuple type constructors}:
 686         \startlambda
 687           foo :: t1 -> t2
 688           foo' :: -> t1 t2
 689           bar :: (t1, t2, t3)
 690           bar' :: (,,) t1 t2 t3
 691         \stoplambda
 692       \stopdesc
 693
 694       \startdesc{The forall type}
 695         \startlambda
 696           id :: \forall t. t -> t
 697         \stoplambda
 698         The forall type introduces polymorphism. It is the only way to
 699         introduce new type variables, which are completely unconstrained (Any
 700         possible type can be assigned to it). Constraints can be added later
 701         using predicate types, see below.
 702
 703         A forall type is always (and only) introduced by a type lambda
 704         expression. For example, the Core translation of the
 705         id function is:
 706         \startlambda
 707           id = λt.λ(x :: t).x
 708         \stoplambda
 709
 710         Here, the type of the binder \lam{x} is \lam{t}, referring to the
 711         binder in the topmost lambda.
 712
 713         When using a value with a forall type, the actual type
 714         used must be applied first. For example Haskell expression \hs{id
 715         True} (the function \hs{id} appleid to the dataconstructor \hs{True})
 716         translates to the following Core:
 717
 718         \startlambda
 719         id @Bool True
 720         \stoplambda
 721
 722         Here, id is first applied to the type to work with. Note that the type
 723         then changes from \lam{id :: \forall t. t -> t} to \lam{id @Bool ::
 724         Bool -> Bool}. Note that the type variable \lam{a} has been
 725         substituted with the actual type.
 726
 727         In Haskell, forall types are usually not explicitly specified (The use
 728         of a lowercase type variable implicitly introduces a forall type for
 729         that variable). In fact, in standard Haskell there is no way to
 730         explicitly specify forall types. Through a language extension, the
 731         \hs{forall} keyword is available, but still optional for normal forall
 732         types (it is needed for \emph{existentially quantified types}, which
 733         Cλash does not support).
 734       \stopdesc
 735
 736       \startdesc{Predicate type}
 737         \startlambda
 738           show :: \forall t. Show t ⇒ t → String
 739         \stoplambda
 740
 741         \todo{Sidenote: type classes?}
 742
 743         A predicate type introduces a constraint on a type variable introduced
 744         by a forall type (or type lambda). In the example above, the type
 745         variable \lam{t} can only contain types that are an \emph{instance} of
 746         the \emph{type class} \lam{Show}. \refdef{type class}
 747
 748         There are other sorts of predicate types, used for the type families
 749         extension, which we will not discuss here.
 750
 751         A predicate type is introduced by a lambda abstraction. Unlike with
 752         the forall type, this is a value lambda abstraction, that must be
 753         applied to a value. We call this value a \emph{dictionary}.
 754
 755         Without going into the implementation details, a dictionary can be
 756         seen as a lookup table all the methods for a given (single) type class
 757         instance. This means that all the dictionaries for the same type class
 758         look the same (\eg contain methods with the same names). However,
 759         dictionaries for different instances of the same class contain
 760         different methods, of course.
 761
 762         A dictionary is introduced by \small{GHC} whenever it encounters an
 763         instance declaration. This dictionary, as well as the binder
 764         introduced by a lambda that introduces a dictionary, have the
 765         predicate type as their type. These binders are usually named starting
 766         with a \lam{\$}. Usually the name of the type concerned is not
 767         reflected in the name of the dictionary, but the name of the type
 768         class is. The Haskell expression \hs{show True} thus becomes:
 769
 770         \startlambda
 771         show @Bool \$dShow True
 772         \stoplambda
 773       \stopdesc
 774
 775       Using this set of types, all types in basic Haskell can be represented.
 776
 777       \todo{Overview of polymorphism with more examples (or move examples
 778       here)}.
 779
 780   \section[sec:prototype:statetype]{State annotations in Haskell}
 781     As noted in \in{section}[sec:description:stateann], Cλash needs some
 782     way to let the programmer explicitly specify which of a function's
 783     arguments and which part of a function's result represent the
 784     function's state.
 785
 786     Using the Haskell type systems, there are a few ways we can tackle this.
 787
 788     \subsection{Type synonyms}
 789       Haskell provides type synonyms as a way to declare a new type that is
 790       equal to an existing type (or rather, a new name for an existing type).
 791       This allows both the original type and the synonym to be used
 792       interchangedly in a Haskell program. This means no explicit conversion
 793       is needed either. For example, a simple accumulator would become:
 794
 795       \starthaskell
 796       type State s = s
 797       acc :: Word -> State Word -> (State Word, Word)
 798       acc i s = let sum = s + i in (sum, sum)
 799       \stophaskell
 800
 801       This looks nice in Haskell, but turns out to be hard to implement. There
 802       are no explicit conversion in Haskell, but not in Core either. This
 803       means the type of a value might be show as \hs{AccState} in some places,
 804       but \hs{Word} in others (and this can even change due to
 805       transformations). Since every binder has an explicit type associated
 806       with it, the type of every function type will be properly preserved and
 807       could be used to track down the statefulness of each value by the
 808       compiler. However, this makes the implementation a lot more complicated
 809       than it currently is using \hs{newtypes}.
 810
 811     % Use \type instead of \hs here, since the latter breaks inside
 812     % section headings.
 813     \subsection{Type renaming (\type{newtype})}
 814       Haskell also supports type renamings as a way to declare a new type that
 815       has the same (runtime) representation as an existing type (but is in
 816       fact a different type to the typechecker). With type renaming, an
 817       explicit conversion between values of the two types is needed. The
 818       accumulator would then become:
 819
 820       \starthaskell
 821       newtype State s = State s
 822       acc :: Word -> State Word -> (State Word, Word)
 823       acc i (State s) = let sum = s + i in (State sum, sum)
 824       \stophaskell
 825
 826       The \hs{newtype} line declares a new type \hs{State} that has one type
 827       argument, \hs{s}. This type contains one \quote{constructor} \hs{State}
 828       with a single argument of type \hs{s}. It is customary to name the
 829       constructor the same as the type, which is allowed (since types can
 830       never cause name collisions with values). The difference with the type
 831       synonym example is in the explicit conversion between the \hs{State
 832       Word} and \hs{Word} types by pattern matching and by using the explicit
 833       the \hs{State constructor}.
 834
 835       This explicit conversion makes the \VHDL generation easier: Whenever we
 836       remove (unpack) the \hs{State} type, this means we are accessing the
 837       current state (\eg, accessing the register output). Whenever we are a
 838       adding (packing) the \hs{State} type, we are producing a new value for
 839       the state (\eg, providing the register input).
 840
 841       When dealing with nested states (a stateful function that calls stateful
 842       functions, which might call stateful functions, etc.) the state type
 843       could quickly grow complex because of all the \hs{State} type constructors
 844       needed. For example, consider the following state type (this is just the
 845       state type, not the entire function type):
 846
 847       \starttyping
 848       State (State Bit, State (State Word, Bit), Word)
 849       \stoptyping
 850
 851       We cannot leave all these \hs{State} type constructors out, since that
 852       would change the type (unlike when using type synonyms). However, when
 853       using type synonyms to hide away substates (see
 854       \in{section}[sec:prototype:substatesynonyms] below), this
 855       disadvantage should be limited.
 856
 857       \subsubsection{Different input and output types}
 858         An alternative could be to use different types for input and output
 859         state (\ie current and updated state). The accumulator example would
 860         then become something like:
 861
 862         \starthaskell
 863         newtype StateIn s = StateIn s
 864         newtype StateOut s = StateOut s
 865         acc :: Word -> StateIn Word -> (StateIn Word, Word)
 866         acc i (StateIn s) = let sum = s + i in (StateIn sum, sum)
 867         \stophaskell
 868
 869         This could make the implementation easier and the hardware
 870         descriptions less errorprone (you can no longer \quote{forget} to
 871         unpack and repack a state variable and just return it directly, which
 872         can be a problem in the current prototype). However, it also means we
 873         need twice as many type synonyms to hide away substates, making this
 874         approach a bit cumbersome. It also makes it harder to copmare input
 875         and output state types, possible reducing the type safety of the
 876         descriptions.
 877
 878     \subsection[sec:prototype:substatesynonyms]{Type synonyms for substates}
 879       As noted above, when using nested (hierarchical) states, the state types
 880       of the \quote{upper} functions (those that call other functions, which
 881       call other functions, etc.) quickly becomes complicated. Also, when the
 882       state type of one of the \quote{lower} functions changes, the state
 883       types of all the upper functions changes as well. If the state type for
 884       each function is explicitly and completely specified, this means that a
 885       lot of code needs updating whenever a state type changes.
 886
 887       To prevent this, it is recommended (but not enforced) to use a type
 888       synonym for the state type of every function. Every function calling
 889       other functions will then use the state type synonym of the called
 890       functions in its own type, requiring no code changes when the state type
 891       of a called function changes. This approach is used in
 892       \in{example}[ex:AvgState] below. The \hs{AccState} and \hs{AvgState}
 893       are examples of such state type synonyms.
 894
 895     \subsection{Chosen approach}
 896       To keep implementation simple, the current prototype uses the type
 897       renaming approach, with a single type for both input and output
 898       states. In the future, it might be worthwhile to revisit this
 899       approach if more complicated flow analysis is implemented for
 900       state variables. This analysis is needed to add proper error
 901       checking anyway and might allow the use of type synonyms without
 902       losing any expressivity.
 903
 904       \subsubsection{Example}
 905         As an example of the used approach, there is a simple averaging circuit in
 906         \in{example}[ex:AvgState]. This circuit lets the accumulation of the
 907         inputs be done by a subcomponent, \hs{acc}, but keeps a count of value
 908         accumulated in its own state.\footnote{Currently, the prototype
 909         is not able to compile this example, since the built-in function
 910         for division has not been added.}
 911
 912         \startbuffer[AvgState]
 913           -- The state type annotation
 914           newtype State s = State s
 915
 916           -- The accumulator state type
 917           type AccState = State Word
 918           -- The accumulator
 919           acc :: Word -> AccState -> (AccState, Word)
 920           acc i (State s) = let sum = s + i in (State sum, sum)
 921
 922           -- The averaging circuit state type
 923           type AvgState = State (AccState, Word)
 924           -- The averaging circuit
 925           avg :: Word -> AvgState -> (AvgState, Word)
 926           avg i (State s) = (State s', o)
 927             where
 928               (accs, count) = s
 929               -- Pass our input through the accumulator, which outputs a sum
 930               (accs', sum) = acc i accs
 931               -- Increment the count (which will be our new state)
 932               count' = count + 1
 933               -- Compute the average
 934               o = sum / count'
 935               s' = (accs', count')
 936         \stopbuffer
 937
 938         \placeexample[here][ex:AvgState]{Simple stateful averaging circuit.}
 939           %\startcombination[2*1]
 940             {\typebufferhs{AvgState}}%{Haskell description using function applications.}
 941           %  {\boxedgraphic{AvgState}}{The architecture described by the Haskell description.}
 942           %\stopcombination
 943         \todo{Picture}
 944
 945   \section{Implementing state}
 946     Now its clear how to put state annotations in the Haskell source,
 947     there is the question of how to implement this state translation. As
 948     we have seen in \in{section}[sec:prototype:design], the translation to
 949     \VHDL happens as a simple, final step in the compilation process.
 950     This step works on a core expression in normal form. The specifics
 951     of normal form will be explained in
 952     \in{chapter}[chap:normalization], but the examples given should be
 953     easy to understand using the definitin of Core given above.
 954
 955         \startbuffer[AvgStateNormal]
 956           acc = λi.λspacked.
 957             let
 958               -- Remove the State newtype
 959               s = spacked ▶ Word
 960               s' = s + i
 961               o = s + i
 962               -- Add the State newtype again
 963               spacked' = s' ▶ State Word
 964               res = (spacked', o)
 965             in
 966               res
 967
 968           avg = λi.λspacked.
 969             let
 970               s = spacked ▶ (AccState, Word)
 971               accs = case s of (accs, _) -> accs
 972               count = case s of (_, count) -> count
 973               accres = acc i accs
 974               accs' = case accres of (accs', _) -> accs'
 975               sum = case accres of (_, sum) -> sum
 976               count' = count + 1
 977               o = sum / count'
 978               s' = (accs', count')
 979               spacked' = s' ▶ State (AccState, Word)
 980               res = (spacked', o)
 981             in
 982               res
 983         \stopbuffer
 984
 985         \placeexample[here][ex:AvgStateNormal]{Normalized version of \in{example}[ex:AvgState]}
 986             {\typebufferlam{AvgStateNormal}}
 987
 988     \subsection[sec:prototype:statelimits]{State in normal form}
 989       Before describing how to translate state from normal form to
 990       \VHDL, we will first see how state handling looks in normal form.
 991       What limitations are there on their use to guarantee that proper
 992       \VHDL can be generated?
 993
 994       We will try to formulate a number of rules about what operations are
 995       allowed with state variables. These rules apply to the normalized Core
 996       representation, but will in practice apply to the original Haskell
 997       hardware description as well. Ideally, these rules would become part
 998       of the intended normal form definition \refdef{intended normal form
 999       definition}, but this is not the case right now. This can cause some
1000       problems, which are detailed in
1001       \in{section}[sec:normalization:stateproblems].
1002
1003       In these rules we use the terms \emph{state variable} to refer to any
1004       variable that has a \lam{State} type. A \emph{state-containing
1005       variable} is any variable whose type contains a \lam{State} type,
1006       but is not one itself (like \lam{(AccState, Word)} in the example,
1007       which is a tuple type, but contains \lam{AccState}, which is again
1008       equal to \lam{State Word}).
1009
1010       We also use a distinction between \emph{input} and \emph{output
1011       (state) variables} and \emph{substate variables}, which will be
1012       defined in the rules themselves.
1013
1014       \startdesc{State variables can appear as an argument.}
1015         \startlambda
1016           avg = λi.λspacked. ...
1017         \stoplambda
1018
1019         Any lambda that binds a variable with a state type, creates a new
1020         input state variable.
1021       \stopdesc
1022
1023       \startdesc{Input state variables can be unpacked.}
1024         \startlambda
1025           s = spacked ▶ (AccState, Word)
1026         \stoplambda
1027
1028         An input state variable may be unpacked using a cast operation. This
1029         removes the \lam{State} type renaming and the result has no longer a
1030         \lam{State} type.
1031
1032         If the result of this unpacking does not have a state type and does
1033         not contain state variables, there are no limitations on its use.
1034         Otherwise if it does not have a state type but does contain
1035         substates, we refer to it as a \emph{state-containing input
1036         variable} and the limitations below apply. If it has a state type
1037         itself, we refer to it as an \emph{input substate variable} and the
1038         below limitations apply as well.
1039
1040         It may seem strange to consider a variable that still has a state
1041         type directly after unpacking, but consider the case where a
1042         function does not have any state of its own, but does call a single
1043         stateful function. This means it must have a state argument that
1044         contains just a substate. The function signature of such a function
1045         could look like:
1046
1047         \starthaskell
1048           type FooState = State AccState
1049         \stophaskell
1050
1051         Which is of course equivalent to \lam{State (State Word)}.
1052       \stopdesc
1053
1054       \startdesc{Variables can be extracted from state-containing input variables.}
1055         \startlambda
1056           accs = case s of (accs, _) -> accs
1057         \stoplambda
1058
1059         A state-containing input variable is typically a tuple containing
1060         multiple elements (like the current function's state, substates or
1061         more tuples containing substates). All of these can be extracted
1062         from an input variable using an extractor case (or possibly
1063         multiple, when the input variable is nested).
1064
1065         If the result has no state type and does not contain any state
1066         variables either, there are no further limitations on its use. If
1067         the result has no state type but does contain state variables we
1068         refer to it as a \emph{state-containing input variable} and this
1069         limitation keeps applying. If the variable has a state type itself,
1070         we refer to it as an \emph{input substate variable} and below
1071         limitations apply.
1072
1073       \startdesc{Input substate variables can be passed to functions.}
1074         \startlambda
1075           accres = acc i accs
1076           accs' = case accres of (accs', _) -> accs'
1077         \stoplambda
1078
1079         An input substate variable can (only) be passed to a function.
1080         Additionally, every input substate variable must be used in exactly
1081         \emph{one} application, no more and no less.
1082
1083         The function result should contain exactly one state variable, which
1084         can be extracted using (multiple) case expressions. The extracted
1085         state variable is referred to the \emph{output substate}
1086
1087         The type of this output substate must be identical to the type of
1088         the input substate passed to the function.
1089       \stopdesc
1090
1091       \startdesc{Variables can be inserted into a state-containing output variable.}
1092         \startlambda
1093           s' = (accs', count')
1094         \stoplambda
1095
1096         A function's output state is usually a tuple containing its own
1097         updated state variables and all output substates. This result is
1098         built up using any single-constructor algebraic datatype.
1099
1100         The result of these expressions is referred to as a
1101         \emph{state-containing output variable}, which are subject to these
1102         limitations.
1103       \stopdesc
1104
1105       \startdesc{State containing output variables can be packed.}
1106         \startlambda
1107           spacked' = s' ▶ State (AccState, Word)
1108         \stoplambda
1109
1110         As soon as all a functions own update state and output substate
1111         variables have been joined together, the resulting
1112         state-containing output variable can be packed into an output
1113         state variable. Packing is done by casting into a state type.
1114       \stopdesc
1115
1116       \startdesc{Output state variables can appear as (part of) a function result.}
1117         \startlambda
1118           avg = λi.λspacked.
1119             let
1120             \vdots
1121             res = (spacked', o)
1122           in
1123             res
1124         \stoplambda
1125         When the output state is packed, it can be returned as a part
1126         of the function result. Nothing else can be done with this
1127         value (or any value that contains it).
1128       \stopdesc
1129
1130       There is one final limitation that is hard to express in the above
1131       itemization. Whenever substates are extracted from the input state
1132       to be passed to functions, the corresponding output substates
1133       should be inserted into the output state in the same way. In other
1134       words, each pair of corresponding substates in the input and
1135       output states should be passed / returned from the same called
1136       function.
1137
1138       The prototype currently does not check much of the above
1139       conditions. This means that if the conditions are violated,
1140       sometimes a compile error is generated, but in other cases output
1141       can be generated that is not valid \VHDL or at the very least does
1142       not correspond to the input.
1143
1144     \subsection{Translating to \VHDL}
1145       As noted above, the basic approach when generating \VHDL for stateful
1146       functions is to generate a single register for every stateful function.
1147       We look around the normal form to find the let binding that removes the
1148       \lam{State} newtype (using a cast). We also find the let binding that
1149       adds a \lam{State} type. These are connected to the output and the input
1150       of the generated let binding respectively. This means that there can
1151       only be one let binding that adds and one that removes the \lam{State}
1152       type. It is easy to violate this constraint. This problem is detailed in
1153       \in{section}[sec:normalization:stateproblems].
1154
1155       This approach seems simple enough, but will this also work for more
1156       complex stateful functions involving substates?  Observe that any
1157       component of a function's state that is a substate, \ie passed on as
1158       the state of another function, should have no influence on the
1159       hardware generated for the calling function. Any state-specific
1160       \small{VHDL} for this component can be generated entirely within the
1161       called function.  So, we can completely ignore substates when
1162       generating \VHDL for a function.
1163
1164       From this observation it might seem logical to remove the
1165       substates from a function's states altogether and leave only the
1166       state components which are actual states of the current function.
1167       While doing this would not remove any information needed to
1168       generate \small{VHDL} from the function, it would cause the
1169       function definition to become invalid (since we will not have any
1170       substate to pass to the functions anymore).  We could solve the
1171       syntactic problems by passing \type{undefined} for state
1172       variables, but that would still break the code on the semantic
1173       level (\ie, the function would no longer be semantically
1174       equivalent to the original input).
1175
1176       To keep the function definition correct until the very end of the
1177       process, we will not deal with (sub)states until we get to the
1178       \small{VHDL} generation.  Then, we are translating from Core to
1179       \small{VHDL}, and we can simply ignore substates, effectively removing
1180       the substate components altogether.
1181
1182       But, how will we know what exactly is a substate? Since any state
1183       argument or return value that represents state must be of the
1184       \type{State} type, we can look at the type of a value. However, we
1185       must be careful to ignore only \emph{substates}, and not a
1186       function's own state.
1187
1188       In \in{example}[ex:AvgStateNorm] above, we should generate a register
1189       connected with its output connected to \lam{s} and its input connected
1190       to \lam{s'}. However, \lam{s'} is build up from both \lam{accs'} and
1191       \lam{count'}, while only \lam{count'} should end up in the register.
1192       \lam{accs'} is a substate for the \lam{acc} function, for which a
1193       register will be created when generating \VHDL for the \lam{acc}
1194       function.
1195
1196       Fortunately, the \lam{accs'} variable (and any other substate) has a
1197       property that we can easily check: It has a \lam{State} type
1198       annotation. This means that whenever \VHDL is generated for a tuple
1199       (or other algebraic type), we can simply leave out all elements that
1200       have a \lam{State} type. This will leave just the parts of the state
1201       that do not have a \lam{State} type themselves, like \lam{count'},
1202       which is exactly a function's own state. This approach also means that
1203       the state part of the result is automatically excluded when generating
1204       the output port, which is also required.
1205
1206       We can formalize this translation a bit, using the following
1207       rules.
1208
1209       \startitemize
1210         \item A state unpack operation should not generate any \small{VHDL}.
1211         The binder to which the unpacked state is bound should still be
1212         declared, this signal will become the register and will hold the
1213         current state.
1214         \item A state pack operation should not generate any \small{VHDL}.
1215         The binder to which the packed state is bound should not be
1216         declared. The binder that is packed is the signal that will hold the
1217         new state.
1218         \item Any values of a State type should not be translated to
1219         \small{VHDL}. In particular, State elements should be removed from
1220         tuples (and other datatypes) and arguments with a state type should
1221         not generate ports.
1222         \item To make the state actually work, a simple \small{VHDL}
1223         (sequential) process should be generated. This process updates
1224         the state at every clockcycle, by assigning the new state to the
1225         current state. This will be recognized by synthesis tools as a
1226         register specification.
1227       \stopitemize
1228
1229       When applying these rules to the description in
1230       \in{example}[ex:AvgStateNormal], we be left with the description
1231       in \in{example}[ex:AvgStateRemoved]. All the parts that do not
1232       generate any \VHDL directly are crossed out, leaving just the
1233       actual flow of values in the final hardware.
1234
1235       \startlambda
1236         avg = iλ.λ--spacked.--
1237           let
1238             s = --spacked ▶ (AccState, Word)--
1239             --accs = case s of (accs, _) -> accs--
1240             count = case s of (--_,-- count) -> count
1241             accres = acc i --accs--
1242             --accs' = case accres of (accs', _) -> accs'--
1243             sum = case accres of (--_,-- sum) -> sum
1244             count' = count + 1
1245             o = sum / count'
1246             s' = (--accs',-- count')
1247             --spacked' = s' ▶ State (AccState, Word)--
1248             res = (--spacked',-- o)
1249           in
1250             res
1251       \stoplambda
1252
1253       When we would really leave out the crossed out parts, we get a slightly
1254       weird program: There is a variable \lam{s} which has no value, and there
1255       is a variable \lam{s'} that is never used. Together, these two will form
1256       the state process of the function. \lam{s} contains the "current" state,
1257       \lam{s'} is assigned the "next" state. So, at the end of each clock
1258       cycle, \lam{s'} should be assigned to \lam{s}.
1259
1260       In the example the definition of \lam{s'} is still present, since
1261       it does not have a state type. The \lam{accums'} substate has been
1262       removed, leaving us just with the state of \lam{avg} itself.
1263
1264       As an illustration of the result of this function,
1265       \in{example}[ex:AccStateVHDL] and \in{example}[ex:AvgStateVHDL] show the the \VHDL that is
1266       generated from the examples is this section.
1267
1268       \startbuffer[AvgStateVHDL]
1269         entity avgComponent_0 is
1270              port (\izAlE2\ : in \unsigned_31\;
1271                    \foozAo1zAo12\ : out \(,)unsigned_31\;
1272                    clock : in std_logic;
1273                    resetn : in std_logic);
1274         end entity avgComponent_0;
1275
1276
1277         architecture structural of avgComponent_0 is
1278              signal \szAlG2\ : \(,)unsigned_31\;
1279              signal \countzAlW2\ : \unsigned_31\;
1280              signal \dszAm62\ : \(,)unsigned_31\;
1281              signal \sumzAmk3\ : \unsigned_31\;
1282              signal \reszAnCzAnM2\ : \unsigned_31\;
1283              signal \foozAnZzAnZ2\ : \unsigned_31\;
1284              signal \reszAnfzAnj3\ : \unsigned_31\;
1285              signal \s'zAmC2\ : \(,)unsigned_31\;
1286         begin
1287              \countzAlW2\ <= \szAlG2\.A;
1288
1289              \comp_ins_dszAm62\ : entity accComponent_1
1290                                        port map (\izAob3\ => \izAlE2\,
1291                                                  \foozAoBzAoB2\ => \dszAm62\,
1292                                                  clock => clock,
1293                                                  resetn => resetn);
1294
1295              \sumzAmk3\ <= \dszAm62\.A;
1296
1297              \reszAnCzAnM2\ <= to_unsigned(1, 32);
1298
1299              \foozAnZzAnZ2\ <= \countzAlW2\ + \reszAnCzAnM2\;
1300
1301              \reszAnfzAnj3\ <= \sumzAmk3\ * \foozAnZzAnZ2\;
1302
1303              \s'zAmC2\.A <= \foozAnZzAnZ2\;
1304
1305              \foozAo1zAo12\.A <= \reszAnfzAnj3\;
1306
1307              state : process (clock, resetn)
1308              begin
1309                   if resetn = '0' then
1310                   elseif rising_edge(clock) then
1311                        \szAlG2\ <= \s'zAmC2\;
1312                   end if;
1313              end process state;
1314         end architecture structural;
1315       \stopbuffer
1316       \startbuffer[AccStateVHDL]
1317         entity accComponent_1 is
1318              port (\izAob3\ : in \unsigned_31\;
1319                    \foozAoBzAoB2\ : out \(,)unsigned_31\;
1320                    clock : in std_logic;
1321                    resetn : in std_logic);
1322         end entity accComponent_1;
1323
1324
1325         architecture structural of accComponent_1 is
1326              signal \szAod3\ : \unsigned_31\;
1327              signal \reszAonzAor3\ : \unsigned_31\;
1328         begin
1329              \reszAonzAor3\ <= \szAod3\ + \izAob3\;
1330
1331              \foozAoBzAoB2\.A <= \reszAonzAor3\;
1332
1333              state : process (clock, resetn)
1334              begin
1335                   if resetn = '0' then
1336                   elseif rising_edge(clock) then
1337                        \szAod3\ <= \reszAonzAor3\;
1338                   end if;
1339              end process state;
1340         end architecture structural;
1341       \stopbuffer
1342
1343       \placeexample[][ex:AccStateVHDL]{\VHDL generated for acc from \in{example}[ex:AvgState]}
1344           {\typebuffer[AccStateVHDL]}
1345       \placeexample[][ex:AvgStateVHDL]{\VHDL generated for avg from \in{example}[ex:AvgState]}
1346           {\typebuffer[AvgStateVHDL]}
1347 %    \subsection{Initial state}
1348 %      How to specify the initial state? Cannot be done inside a hardware
1349 %      function, since the initial state is its own state argument for the first
1350 %      call (unless you add an explicit, synchronous reset port).
1351 %
1352 %      External init state is natural for simulation.
1353 %
1354 %      External init state works for hardware generation as well.
1355 %
1356 %      Implementation issues: state splitting, linking input to output state,
1357 %      checking usage constraints on state variables.
1358 %
1359 %      \todo{Implementation issues: Separate compilation, simplified core.}
1360 %
1361 % vim: set sw=2 sts=2 expandtab: