Chapters/HardwareDescription.tex

   1 \chapter[chap:description]{Hardware description}
   2   This chapter will provide an overview of the hardware description language
   3   that was created and the issues that have arisen in the process. It will
   4   focus on the issues of the language, not the implementation.
   5
   6   When translating Haskell to hardware, we need to make choices in what kind
   7   of hardware to generate for what Haskell constructs. When faced with
   8   choices, we've tried to stick with the most obvious choice wherever
   9   possible. In a lot of cases, when you look at a hardware description it is
  10   comletely clear what hardware is described. We want our translator to
  11   generate exactly that hardware whenever possible, to minimize the amount of
  12   surprise for people working with it.
  13
  14   In this chapter we try to describe how we interpret a Haskell program from a
  15   hardware perspective. We provide a description of each Haskell language
  16   element that needs translation, to provide a clear picture of what is
  17   supported and how.
  18
  19   \section{Function application}
  20   The basic syntactic element of a functional program are functions and
  21   function application. These have a single obvious \small{VHDL} translation: Each
  22   function becomes a hardware component, where each argument is an input port
  23   and the result value is the output port.
  24
  25   Each function application in turn becomes component instantiation. Here, the
  26   result of each argument expression is assigned to a signal, which is mapped
  27   to the corresponding input port. The output port of the function is also
  28   mapped to a signal, which is used as the result of the application.
  29
  30   \in{Example}[ex:And3] shows a simple program using only function
  31   application and the corresponding architecture.
  32
  33 \startbuffer[And3]
  34 -- | A simple function that returns
  35 --   the and of three bits
  36 and3 :: Bit -> Bit -> Bit -> Bit
  37 and3 a b c = and (and a b) c
  38 \stopbuffer
  39
  40   \startuseMPgraphic{And3}
  41     save a, b, c, anda, andb, out;
  42
  43     % I/O ports
  44     newCircle.a(btex $a$ etex) "framed(false)";
  45     newCircle.b(btex $b$ etex) "framed(false)";
  46     newCircle.c(btex $c$ etex) "framed(false)";
  47     newCircle.out(btex $out$ etex) "framed(false)";
  48
  49     % Components
  50     newCircle.anda(btex $and$ etex);
  51     newCircle.andb(btex $and$ etex);
  52
  53     a.c    = origin;
  54     b.c    = a.c + (0cm, 1cm);
  55     c.c    = b.c + (0cm, 1cm);
  56     anda.c = midpoint(a.c, b.c) + (2cm, 0cm);
  57     andb.c = midpoint(b.c, c.c) + (4cm, 0cm);
  58
  59     out.c   = andb.c + (2cm, 0cm);
  60
  61     % Draw objects and lines
  62     drawObj(a, b, c, anda, andb, out);
  63
  64     ncarc(a)(anda) "arcangle(-10)";
  65     ncarc(b)(anda);
  66     ncarc(anda)(andb);
  67     ncarc(c)(andb);
  68     ncline(andb)(out);
  69   \stopuseMPgraphic
  70
  71   \placeexample[here][ex:And3]{Simple three port and.}
  72     \startcombination[2*1]
  73       {\typebufferhs{And3}}{Haskell description using function applications.}
  74       {\boxedgraphic{And3}}{The architecture described by the Haskell description.}
  75     \stopcombination
  76
  77   \subsection{Partial application}
  78   It should be obvious that we cannot generate hardware signals for all
  79   expressions we can express in Haskell. The most obvious criterium for this
  80   is the type of an expression. We will see more of this below, but for now it
  81   should be obvious that any expression of a function type cannot be
  82   represented as a signal or i/o port to a component.
  83
  84   From this, we can see that the above translation rules do not apply to a
  85   partial application. \in{Example}[ex:Quadruple] shows an example use of
  86   partial application and the corresponding architecture.
  87
  88 \startbuffer[Quadruple]
  89 -- | Multiply the input word by four.
  90 quadruple :: Word -> Word
  91 quadruple n = mul (mul n)
  92   where
  93     mul = (*) 2
  94 \stopbuffer
  95
  96   \startuseMPgraphic{Quadruple}
  97     save in, two, mula, mulb, out;
  98
  99     % I/O ports
 100     newCircle.in(btex $n$ etex) "framed(false)";
 101     newCircle.two(btex $2$ etex) "framed(false)";
 102     newCircle.out(btex $out$ etex) "framed(false)";
 103
 104     % Components
 105     newCircle.mula(btex $\times$ etex);
 106     newCircle.mulb(btex $\times$ etex);
 107
 108     two.c    = origin;
 109     in.c     = two.c + (0cm, 1cm);
 110     mula.c  = in.c + (2cm, 0cm);
 111     mulb.c  = mula.c + (2cm, 0cm);
 112     out.c   = mulb.c + (2cm, 0cm);
 113
 114     % Draw objects and lines
 115     drawObj(in, two, mula, mulb, out);
 116
 117     nccurve(two)(mula) "angleA(0)", "angleB(45)";
 118     nccurve(two)(mulb) "angleA(0)", "angleB(45)";
 119     ncline(in)(mula);
 120     ncline(mula)(mulb);
 121     ncline(mulb)(out);
 122   \stopuseMPgraphic
 123
 124   \placeexample[here][ex:Quadruple]{Simple three port and.}
 125     \startcombination[2*1]
 126       {\typebufferhs{Quadruple}}{Haskell description using function applications.}
 127       {\boxedgraphic{Quadruple}}{The architecture described by the Haskell description.}
 128     \stopcombination
 129
 130   Here, the definition of mul is a partial function application: It applies
 131   \hs{2 :: Word} to the function \hs{(*) :: Word -> Word -> Word} resulting in
 132   the expression \hs{(*) 2 :: Word -> Word}. Since this resulting expression
 133   is again a function, we can't generate hardware for it directly. This is
 134   because the hardware to generate for \hs{mul} depends completely on where
 135   and how it is used. In this example, it is even used twice!
 136
 137   However, it is clear that the above hardware description actually describes
 138   valid hardware. In general, we can see that any partial applied function
 139   must eventually become completely applied, at which point we can generate
 140   hardware for it using the rules for function application above. It might
 141   mean that a partial application is passed around quite a bit (even beyond
 142   function boundaries), but eventually, the partial application will become
 143   completely applied.
 144
 145   \section{State}
 146     A very important concept in hardware designs is \emph{state}. In a
 147     stateless (or, \emph{combinatoric}) design, every output is a directly and solely dependent on the
 148     inputs. In a stateful design, the outputs can depend on the history of
 149     inputs, or the \emph{state}. State is usually stored in \emph{registers},
 150     which retain their value during a clockcycle, and are typically updated at
 151     the start of every clockcycle. Since the updating of the state is tightly
 152     coupled (synchronized) to the clock signal, these state updates are often
 153     called \emph{synchronous}.
 154
 155     To make our hardware description language useful to describe more that
 156     simple combinatoric designs, we'll need to be able to describe state in
 157     some way.
 158
 159     \subsection{Approaches to state}
 160       In Haskell, functions are always pure (except when using unsafe
 161       functions like \hs{unsafePerformIO}, which should be prevented whenever
 162       possible). This means that the output of a function solely depends on
 163       its inputs. If you evaluate a given function with given inputs, it will
 164       always provide the same output.
 165
 166       TODO: Define pure
 167
 168       This is a perfect match for a combinatoric circuit, where the output
 169       also soley depend on the inputs. However, when state is involved, this
 170       no longer holds. Since we're in charge of our own language, we could
 171       remove this purity constraint and allow a function to return different
 172       values depending on the cycle in which it is evaluated (or rather, the
 173       current state). However, this means that all kinds of interesting
 174       properties of our functional language get lost, and all kinds of
 175       transformations and optimizations might no longer be meaning preserving.
 176
 177       Provided that we want to keep the function pure, the current state has
 178       to be present in the function's arguments in some way. There seem to be
 179       two obvious ways to do this: Adding the current state as an argument, or
 180       including the full history of each argument.
 181
 182       \subsubsection{Stream arguments and results}
 183         Including the entire history of each input (\eg, the value of that
 184         input for each previous clockcycle) is an obvious way to make outputs
 185         depend on all previous input. This is easily done by making every
 186         input a list instead of a single value, containing all previous values
 187         as well as the current value.
 188
 189         An obvious downside of this solution is that on each cycle, all the
 190         previous cycles must be resimulated to obtain the current state. To do
 191         this, it might be needed to have a recursive helper function as well,
 192         wich might be hard to properly analyze by the compiler.
 193
 194         A slight variation on this approach is one taken by some of the other
 195         functional \small{HDL}s in the field (TODO: References to Lava,
 196         ForSyDe, ...): Make functions operate on complete streams. This means
 197         that a function is no longer called on every cycle, but just once. It
 198         takes stream as inputs instead of values, where each stream contains
 199         all the values for every clockcycle since system start. This is easily
 200         modeled using an (infinite) list, with one element for each clock
 201         cycle. Since the funciton is only evaluated once, its output is also a
 202         stream. Note that, since we are working with infinite lists and still
 203         want to be able to simulate the system cycle-by-cycle, this relies
 204         heavily on the lazy semantics of Haskell.
 205
 206         Since our inputs and outputs are streams, all other (intermediate)
 207         values must be streams. All of our primitive operators (\eg, addition,
 208         substraction, bitwise operations, etc.) must operate on streams as
 209         well (note that changing a single-element operation to a stream
 210         operation can done with \hs{map}, \hs{zipwith}, etc.).
 211
 212         Note that the concept of \emph{state} is no more than having some way
 213         to communicate a value from one cycle to the next. By introducing a
 214         \hs{delay} function, we can do exactly that: Delay (each value in) a
 215         stream so that we can "look into" the past. This \hs{delay} function
 216         simply outputs a stream where each value is the same as the input
 217         value, but shifted one cycle. This causes a \quote{gap} at the
 218         beginning of the stream: What is the value of the delay output in the
 219         first cycle? For this, the \hs{delay} function has a second input
 220         (which is a value, not a stream!).
 221
 222         \in{Example}[ex:DelayAcc] shows a simple accumulator expressed in this
 223         style.
 224
 225 \startbuffer[DelayAcc]
 226 acc :: Stream Word -> Stream Word
 227 acc in = out
 228   where
 229     out = (delay out 0) + in
 230 \stopbuffer
 231
 232 \startuseMPgraphic{DelayAcc}
 233   save in, out, add, reg;
 234
 235   % I/O ports
 236   newCircle.in(btex $in$ etex) "framed(false)";
 237   newCircle.out(btex $out$ etex) "framed(false)";
 238
 239   % Components
 240   newReg.reg("") "dx(4mm)", "dy(6mm)", "reflect(true)";
 241   newCircle.add(btex + etex);
 242
 243   in.c    = origin;
 244   add.c   = in.c + (2cm, 0cm);
 245   out.c   = add.c + (2cm, 0cm);
 246   reg.c   = add.c + (0cm, 2cm);
 247
 248   % Draw objects and lines
 249   drawObj(in, out, add, reg);
 250
 251   nccurve(add)(reg) "angleA(0)", "angleB(180)", "posB(d)";
 252   nccurve(reg)(add) "angleA(180)", "angleB(-45)", "posA(out)";
 253   ncline(in)(add);
 254   ncline(add)(out);
 255 \stopuseMPgraphic
 256
 257
 258         \placeexample[here][ex:DelayAcc]{Simple accumulator architecture.}
 259           \startcombination[2*1]
 260             {\typebufferhs{DelayAcc}}{Haskell description using streams.}
 261             {\boxedgraphic{DelayAcc}}{The architecture described by the Haskell description.}
 262           \stopcombination
 263
 264
 265         This notation can be confusing (especially due to the loop in the
 266         definition of out), but is essentially easy to interpret. There is a
 267         single call to delay, resulting in a circuit with a single register,
 268         whose input is connected to \hs{outl (which is the output of the
 269         adder)}, and it's output is the \hs{delay out 0} (which is connected
 270         to one of the adder inputs).
 271
 272         This notation has a number of downsides, amongst which are limited
 273         readability and ambiguity in the interpretation. TODO: Reference
 274         Christiaan.
 275
 276       \subsubsection{Explicit state arguments and results}
 277         A more explicit way to model state, is to simply add an extra argument
 278         containing the current state value. This allows an output to depend on
 279         both the inputs as well as the current state while keeping the
 280         function pure (letting the result depend only on the arguments), since
 281         the current state is now an argument.
 282
 283         In Haskell, this would look like \in{example}[ex:ExplicitAcc].
 284
 285 \startbuffer[ExplicitAcc]
 286 acc :: Word -> (State Word) -> (State Word, Word)
 287 acc in (State s) = (State s', out)
 288   where
 289     out = s + in
 290     s'  = out
 291 \stopbuffer
 292
 293         \placeexample[here][ex:ExplicitAcc]{Simple accumulator architecture.}
 294           \startcombination[2*1]
 295             {\typebufferhs{ExplicitAcc}}{Haskell description using explicit state arguments.}
 296             % Picture is identical to the one we had just now.
 297             {\boxedgraphic{DelayAcc}}{The architecture described by the Haskell description.}
 298           \stopcombination
 299
 300         This approach makes a function's state very explicit, which state
 301         variables are used by a function can be completely determined from its
 302         type signature (as opposed to the stream approach, where a function
 303         looks the same from the outside, regardless of what state variables it
 304         uses (or wether it's stateful at all).
 305
 306         A direct consequence of this, is that if a function calls other
 307         stateful functions (\eg, has subcircuits), it has to somehow know the
 308         current state for these called functions. The only way to do this, is
 309         to put these \emph{substates} inside the caller's state. This means
 310         that a function's state is the sum of the states of all functions it
 311         calls, and its own state.
 312
 313         This approach is the one chosen for Cλash and will be examined more
 314         closely below.
 315
 316     \subsection{Explicit state specification}
 317       Note about semantic correctness of top level state.
 318
 319       Note about automatic ``down-pushing'' of state.
 320
 321       Note about explicit state specification as the best solution.
 322
 323       Note about substates
 324
 325       Note about conditions on state variables and checking them.
 326
 327     \subsection{Explicit state implementation}
 328       Recording state variables at the type level.
 329
 330       Ideal: Type synonyms, since there is no additional code overhead for
 331       packing and unpacking. Downside: there is no explicit conversion in Core
 332       either, so type synonyms tend to get lost in expressions (they can be
 333       preserved in binders, but this makes implementation harder, since that
 334       statefulness of a value must be manually tracked).
 335
 336       Less ideal: Newtype. Requires explicit packing and unpacking of function
 337       arguments. If you don't unpack substates, there is no overhead for
 338       (un)packing substates. This will result in many nested State constructors
 339       in a nested state type. \eg:
 340
 341   \starttyping
 342   State (State Bit, State (State Word, Bit), Word)
 343   \stoptyping
 344
 345       Alternative: Provide different newtypes for input and output state. This
 346       makes the code even more explicit, and typechecking can find even more
 347       errors. However, this requires defining two type synomyms for each
 348       stateful function instead of just one. \eg:
 349   \starttyping
 350   type AccumStateIn = StateIn Bit
 351   type AccumStateOut = StateOut Bit
 352   \stoptyping
 353       This also increases the possibility of having different input and output
 354       states. Checking for identical input and output state types is also
 355       harder, since each element in the state must be unpacked and compared
 356       separately.
 357
 358       Alternative: Provide a type for the entire result type of a stateful
 359       function, not just the state part. \eg:
 360
 361   \starttyping
 362   newtype Result state result = Result (state, result)
 363   \stoptyping
 364
 365       This makes it easy to say "Any stateful function must return a
 366       \type{Result} type, without having to sort out result from state. However,
 367       this either requires a second type for input state (similar to
 368       \type{StateIn} / \type{StateOut} above), or requires the compiler to
 369       select the right argument for input state by looking at types (which works
 370       for complex states, but when that state has the same type as an argument,
 371       things get ambiguous) or by selecting a fixed (\eg, the last) argument,
 372       which might be limiting.
 373
 374       \subsubsection{Example}
 375       As an example of the used approach, a simple averaging circuit, that lets
 376       the accumulation of the inputs be done by a subcomponent.
 377
 378       \starttyping
 379         newtype State s = State s
 380
 381         type AccumState = State Bit
 382         accum :: Word -> AccumState -> (AccumState, Word)
 383         accum i (State s) = (State (s + i), s + i)
 384
 385         type AvgState = (AccumState, Word)
 386         avg :: Word -> AvgState -> (AvgState, Word)
 387         avg i (State s) = (State s', o)
 388           where
 389             (accums, count) = s
 390             -- Pass our input through the accumulator, which outputs a sum
 391             (accums', sum) = accum i accums
 392             -- Increment the count (which will be our new state)
 393             count' = count + 1
 394             -- Compute the average
 395             o = sum / count'
 396             s' = (accums', count')
 397       \stoptyping
 398
 399       And the normalized, core-like versions:
 400
 401       \starttyping
 402         accum i spacked = res
 403           where
 404             s = case spacked of (State s) -> s
 405             s' = s + i
 406             spacked' = State s'
 407             o = s + i
 408             res = (spacked', o)
 409
 410         avg i spacked = res
 411           where
 412             s = case spacked of (State s) -> s
 413             accums = case s of (accums, \_) -> accums
 414             count = case s of (\_, count) -> count
 415             accumres = accum i accums
 416             accums' = case accumres of (accums', \_) -> accums'
 417             sum = case accumres of (\_, sum) -> sum
 418             count' = count + 1
 419             o = sum / count'
 420             s' = (accums', count')
 421             spacked' = State s'
 422             res = (spacked', o)
 423       \stoptyping
 424
 425
 426
 427       As noted above, any component of a function's state that is a substate,
 428       \eg passed on as the state of another function, should have no influence
 429       on the hardware generated for the calling function. Any state-specific
 430       \small{VHDL} for this component can be generated entirely within the called
 431       function. So,we can completely leave out substates from any function.
 432
 433       From this observation, we might think to remove the substates from a
 434       function's states alltogether, and leave only the state components which
 435       are actual states of the current function. While doing this would not
 436       remove any information needed to generate \small{VHDL} from the function, it would
 437       cause the function definition to become invalid (since we won't have any
 438       substate to pass to the functions anymore). We could solve the syntactic
 439       problems by passing \type{undefined} for state variables, but that would
 440       still break the code on the semantic level (\ie, the function would no
 441       longer be semantically equivalent to the original input).
 442
 443       To keep the function definition correct until the very end of the process,
 444       we will not deal with (sub)states until we get to the \small{VHDL} generation.
 445       Here, we are translating from Core to \small{VHDL}, and we can simply not generate
 446       \small{VHDL} for substates, effectively removing the substate components
 447       alltogether.
 448
 449       There are a few important points when ignore substates.
 450
 451       First, we have to have some definition of "substate". Since any state
 452       argument or return value that represents state must be of the \type{State}
 453       type, we can simply look at its type. However, we must be careful to
 454       ignore only {\em substates}, and not a function's own state.
 455
 456       In the example above, this means we should remove \type{accums'} from
 457       \type{s'}, but not throw away \type{s'} entirely. We should, however,
 458       remove \type{s'} from the output port of the function, since the state
 459       will be handled by a \small{VHDL} procedure within the function.
 460
 461       When looking at substates, these can appear in two places: As part of an
 462       argument and as part of a return value. As noted above, these substates
 463       can only be used in very specific ways.
 464
 465       \desc{State variables can appear as an argument.} When generating \small{VHDL}, we
 466       completely ignore the argument and generate no input port for it.
 467
 468       \desc{State variables can be extracted from other state variables.} When
 469       extracting a state variable from another state variable, this always means
 470       we're extracting a substate, which we can ignore. So, we simply generate no
 471       \small{VHDL} for any extraction operation that has a state variable as a result.
 472
 473       \desc{State variables can be passed to functions.} When passing a
 474       state variable to a function, this always means we're passing a substate
 475       to a subcomponent. The entire argument can simply be ingored in the
 476       resulting port map.
 477
 478       \desc{State variables can be returned from functions.} When returning a
 479       state variable from a function (probably as a part of an algebraic
 480       datatype), this always mean we're returning a substate from a
 481       subcomponent. The entire state variable should be ignored in the resulting
 482       port map. The type binder of the binder that the function call is bound
 483       to should not include the state type either.
 484
 485       \startdesc{State variables can be inserted into other variables.} When inserting
 486       a state variable into another variable (usually by constructing that new
 487       variable using its constructor), we can identify two cases:
 488
 489       \startitemize
 490         \item The state is inserted into another state variable. In this case,
 491         the inserted state is a substate, and can be safely left out of the
 492         constructed variable.
 493         \item The state is inserted into a non-state variable. This happens when
 494         building up the return value of a function, where you put state and
 495         retsult variables together in an algebraic type (usually a tuple). In
 496         this case, we should leave the state variable out as well, since we
 497         don't want it to be included as an output port.
 498       \stopitemize
 499
 500       So, in both cases, we can simply leave out the state variable from the
 501       resulting value. In the latter case, however, we should generate a state
 502       proc instead, which assigns the state variable to the input state variable
 503       at each clock tick.
 504       \stopdesc
 505
 506       \desc{State variables can appear as (part of) a function result.} When
 507       generating \small{VHDL}, we can completely ignore any part of a function result
 508       that has a state type. If the entire result is a state type, this will
 509       mean the entity will not have an output port. Otherwise, the state
 510       elements will be removed from the type of the output port.
 511
 512
 513       Now, we know how to handle each use of a state variable separately. If we
 514       look at the whole, we can conclude the following:
 515
 516       \startitemize
 517       \item A state unpack operation should not generate any \small{VHDL}. The binder
 518       to which the unpacked state is bound should still be declared, this signal
 519       will become the register and will hold the current state.
 520       \item A state pack operation should not generate any \small{VHDL}. The binder th
 521       which the packed state is bound should not be declared. The binder that is
 522       packed is the signal that will hold the new state.
 523       \item Any values of a State type should not be translated to \small{VHDL}. In
 524       particular, State elements should be removed from tuples (and other
 525       datatypes) and arguments with a state type should not generate ports.
 526       \item To make the state actually work, a simple \small{VHDL} proc should be
 527       generated. This proc updates the state at every clockcycle, by assigning
 528       the new state to the current state. This will be recognized by synthesis
 529       tools as a register specification.
 530       \stopitemize
 531
 532
 533       When applying these rules to the example program (in normal form), we will
 534       get the following result. All the parts that don't generate any value are
 535       crossed out, leaving some very boring assignments here and there.
 536
 537
 538   \starthaskell
 539     avg i --spacked-- = res
 540       where
 541         s = --case spacked of (State s) -> s--
 542         --accums = case s of (accums, \_) -> accums--
 543         count = case s of (--\_,-- count) -> count
 544         accumres = accum i --accums--
 545         --accums' = case accumres of (accums', \_) -> accums'--
 546         sum = case accumres of (--\_,-- sum) -> sum
 547         count' = count + 1
 548         o = sum / count'
 549         s' = (--accums',-- count')
 550         --spacked' = State s'--
 551         res = (--spacked',-- o)
 552   \stophaskell
 553
 554       When we would really leave out the crossed out parts, we get a slightly
 555       weird program: There is a variable \type{s} which has no value, and there
 556       is a variable \type{s'} that is never used. Together, these two will form
 557       the state proc of the function. \type{s} contains the "current" state,
 558       \type{s'} is assigned the "next" state. So, at the end of each clock
 559       cycle, \type{s'} should be assigned to \type{s}.
 560
 561       Note that the definition of \type{s'} is not removed, even though one
 562       might think it as having a state type. Since the state type has a single
 563       argument constructor \type{State}, some type that should be the resulting
 564       state should always be explicitly packed with the State constructor,
 565       allowing us to remove the packed version, but still generate \small{VHDL} for the
 566       unpacked version (of course with any substates removed).
 567
 568       As you can see, the definition of \type{s'} is still present, since it
 569       does not have a state type (The State constructor. The \type{accums'} substate has been removed,
 570       leaving us just with the state of \type{avg} itself.
 571     \subsection{Initial state}
 572       How to specify the initial state? Cannot be done inside a hardware
 573       function, since the initial state is its own state argument for the first
 574       call (unless you add an explicit, synchronous reset port).
 575
 576       External init state is natural for simulation.
 577
 578       External init state works for hardware generation as well.
 579
 580       Implementation issues: state splitting, linking input to output state,
 581       checking usage constraints on state variables.
 582
 583   \section[sec:recursion]{Recursion}
 584   An import concept in functional languages is recursion. In it's most basic
 585   form, recursion is a function that is defined in terms of itself. This
 586   usually requires multiple evaluations of this function, with changing
 587   arguments, until eventually an evaluation of the function no longer requires
 588   itself.
 589
 590   Recursion in a hardware description is a bit of a funny thing. Usually,
 591   recursion is associated with a lot of nondeterminism, stack overflows, but
 592   also flexibility and expressive power.
 593
 594   Given the notion that each function application will translate to a
 595   component instantiation, we are presented with a problem. A recursive
 596   function would translate to a component that contains itself. Or, more
 597   precisely, that contains an instance of itself. This instance would again
 598   contain an instance of itself, and again, into infinity. This is obviously a
 599   problem for generating hardware.
 600
 601   This is expected for functions that describe infinite recursion. In that
 602   case, we can't generate hardware that shows correct behaviour in a single
 603   cycle (at best, we could generate hardware that needs an infinite number of
 604   cycles to complete).
 605
 606   However, most recursive hardware descriptions will describe finite
 607   recursion. This is because the recursive call is done conditionally. There
 608   is usually a case statement where at least one alternative does not contain
 609   the recursive call, which we call the "base case". If, for each call to the
 610   recursive function, we would be able to detect which alternative applies,
 611   we would be able to remove the case expression and leave only the base case
 612   when it applies. This will ensure that expanding the recursive functions
 613   will terminate after a bounded number of expansions.
 614
 615   This does imply the extra requirement that the base case is detectable at
 616   compile time. In particular, this means that the decision between the base
 617   case and the recursive case must not depend on runtime data.
 618
 619   \subsection{List recursion}
 620   The most common deciding factor in recursion is the length of a list that is
 621   passed in as an argument. Since we represent lists as vectors that encode
 622   the length in the vector type, it seems easy to determine the base case. We
 623   can simply look at the argument type for this. However, it turns out that
 624   this is rather non-trivial to write down in Haskell in the first place. As
 625   an example, we would like to write down something like this:
 626
 627   \starthaskell
 628     sum :: Vector n Word -> Word
 629     sum xs = case null xs of
 630       True -> 0
 631       False -> head xs + sum (tail xs)
 632   \stophaskell
 633
 634   However, the typechecker will now use the following reasoning (element type
 635   of the vector is left out):
 636
 637   \startitemize
 638   \item tail has the type \hs{(n > 0) => Vector n -> Vector (n - 1)}
 639   \item This means that xs must have the type \hs{(n > 0) => Vector n}
 640   \item This means that sum must have the type \hs{(n > 0) => Vector n -> a}
 641   \item sum is called with the result of tail as an argument, which has the
 642   type \hs{Vector n} (since \hs{(n > 0) => n - 1 == m}).
 643   \item This means that sum must have the type \hs{Vector n -> a}
 644   \item This is a contradiction between the type deduced from the body of sum
 645   (the input vector must be non-empty) and the use of sum (the input vector
 646   could have any length).
 647   \stopitemize
 648
 649   As you can see, using a simple case at value level causes the type checker
 650   to always typecheck both alternatives, which can't be done! This is a
 651   fundamental problem, that would seem perfectly suited for a type class.
 652   Considering that we need to switch between to implementations of the sum
 653   function, based on the type of the argument, this sounds like the perfect
 654   problem to solve with a type class. However, this approach has its own
 655   problems (not the least of them that you need to define a new typeclass for
 656   every recursive function you want to define).
 657
 658   Another approach tried involved using GADTs to be able to do pattern
 659   matching on empty / non empty lists. While this worked partially, it also
 660   created problems with more complex expressions.
 661
 662   TODO: How much detail should there be here? I can probably refer to
 663   Christiaan instead.
 664
 665   Evaluating all possible (and non-possible) ways to add recursion to our
 666   descriptions, it seems better to leave out list recursion alltogether. This
 667   allows us to focus on other interesting areas instead. By including
 668   (builtin) support for a number of higher order functions like map and fold,
 669   we can still express most of the things we would use list recursion for.
 670
 671   \subsection{General recursion}
 672   Of course there are other forms of recursion, that do not depend on the
 673   length (and thus type) of a list. For example, simple recursion using a
 674   counter could be expressed, but only translated to hardware for a fixed
 675   number of iterations. Also, this would require extensive support for compile
 676   time simplification (constant propagation) and compile time evaluation
 677   (evaluation constant comparisons), to ensure non-termination. Even then, it
 678   is hard to really guarantee termination, since the user (or GHC desugarer)
 679   might use some obscure notation that results in a corner case of the
 680   simplifier that is not caught and thus non-termination.
 681
 682   Due to these complications, we leave other forms of recursion as
 683   future work as well.