Chapters/HardwareDescription.tex

   1 \chapter[chap:description]{Hardware description}
   2   This chapter will provide an overview of the hardware description language
   3   that was created and the issues that have arisen in the process. It will
   4   focus on the issues of the language, not the implementation.
   5
   6   When translating Haskell to hardware, we need to make choices in what kind
   7   of hardware to generate for what Haskell constructs. When faced with
   8   choices, we've tried to stick with the most obvious choice wherever
   9   possible. In a lot of cases, when you look at a hardware description it is
  10   comletely clear what hardware is described. We want our translator to
  11   generate exactly that hardware whenever possible, to minimize the amount of
  12   surprise for people working with it.
  13
  14   In this chapter we try to describe how we interpret a Haskell program from a
  15   hardware perspective. We provide a description of each Haskell language
  16   element that needs translation, to provide a clear picture of what is
  17   supported and how.
  18
  19   \section{Function application}
  20   The basic syntactic element of a functional program are functions and
  21   function application. These have a single obvious \small{VHDL} translation: Each
  22   function becomes a hardware component, where each argument is an input port
  23   and the result value is the output port.
  24
  25   Each function application in turn becomes component instantiation. Here, the
  26   result of each argument expression is assigned to a signal, which is mapped
  27   to the corresponding input port. The output port of the function is also
  28   mapped to a signal, which is used as the result of the application.
  29
  30   An example of a simple program using only function application would be:
  31
  32   \starthaskell
  33   -- | A simple function that returns the and of three bits
  34   and3 :: Bit -> Bit -> Bit -> Bit
  35   and3 a b c = and (and a b) c
  36   \stophaskell
  37
  38   This results in the following hardware:
  39
  40   TODO: Pretty picture
  41
  42   \subsection{Partial application}
  43   It should be obvious that we cannot generate hardware signals for all
  44   expressions we can express in Haskell. The most obvious criterium for this
  45   is the type of an expression. We will see more of this below, but for now it
  46   should be obvious that any expression of a function type cannot be
  47   represented as a signal or i/o port to a component.
  48
  49   From this, we can see that the above translation rules do not apply to a
  50   partial application. Let's look at an example:
  51
  52   \starthaskell
  53   -- | Multiply the input word by four.
  54   quadruple :: Word -> Word
  55   quadruple n = mul (mul n)
  56     where
  57       mul = (*) 2
  58   \stophaskell
  59
  60   It should be clear that the above code describes the following hardware:
  61
  62   TODO: Pretty picture
  63
  64   Here, the definition of mul is a partial function application: It applies
  65   \hs{2 :: Word} to the function \hs{(*) :: Word -> Word -> Word} resulting in
  66   the expression \hs{(*) 2 :: Word -> Word}. Since this resulting expression
  67   is again a function, we can't generate hardware for it directly. This is
  68   because the hardware to generate for \hs{mul} depends completely on where
  69   and how it is used. In this example, it is even used twice!
  70
  71   However, it is clear that the above hardware description actually describes
  72   valid hardware. In general, we can see that any partial applied function
  73   must eventually become completely applied, at which point we can generate
  74   hardware for it using the rules for function application above. It might
  75   mean that a partial application is passed around quite a bit (even beyond
  76   function boundaries), but eventually, the partial application will become
  77   completely applied.
  78
  79   \section{State}
  80     \subsection{Introduction}
  81       Provide some examples
  82
  83
  84     \subsection{Approaches to state}
  85       Explain impact of state (or rather, temporal behaviour) on function signature.
  86       \subsubsection{Stream arguments and results}
  87       \subsubsection{Explicit state arguments and results}
  88         Nested state for called functions.
  89
  90     \subsection{Explicit state specification}
  91       Note about semantic correctness of top level state.
  92
  93       Note about automatic ``down-pushing'' of state.
  94
  95       Note about explicit state specification as the best solution.
  96
  97       Note about substates
  98
  99       Note about conditions on state variables and checking them.
 100
 101     \subsection{Explicit state implementation}
 102       Recording state variables at the type level.
 103
 104       Ideal: Type synonyms, since there is no additional code overhead for
 105       packing and unpacking. Downside: there is no explicit conversion in Core
 106       either, so type synonyms tend to get lost in expressions (they can be
 107       preserved in binders, but this makes implementation harder, since that
 108       statefulness of a value must be manually tracked).
 109
 110       Less ideal: Newtype. Requires explicit packing and unpacking of function
 111       arguments. If you don't unpack substates, there is no overhead for
 112       (un)packing substates. This will result in many nested State constructors
 113       in a nested state type. \eg:
 114
 115   \starttyping
 116   State (State Bit, State (State Word, Bit), Word)
 117   \stoptyping
 118
 119       Alternative: Provide different newtypes for input and output state. This
 120       makes the code even more explicit, and typechecking can find even more
 121       errors. However, this requires defining two type synomyms for each
 122       stateful function instead of just one. \eg:
 123   \starttyping
 124   type AccumStateIn = StateIn Bit
 125   type AccumStateOut = StateOut Bit
 126   \stoptyping
 127       This also increases the possibility of having different input and output
 128       states. Checking for identical input and output state types is also
 129       harder, since each element in the state must be unpacked and compared
 130       separately.
 131
 132       Alternative: Provide a type for the entire result type of a stateful
 133       function, not just the state part. \eg:
 134
 135   \starttyping
 136   newtype Result state result = Result (state, result)
 137   \stoptyping
 138
 139       This makes it easy to say "Any stateful function must return a
 140       \type{Result} type, without having to sort out result from state. However,
 141       this either requires a second type for input state (similar to
 142       \type{StateIn} / \type{StateOut} above), or requires the compiler to
 143       select the right argument for input state by looking at types (which works
 144       for complex states, but when that state has the same type as an argument,
 145       things get ambiguous) or by selecting a fixed (\eg, the last) argument,
 146       which might be limiting.
 147
 148       \subsubsection{Example}
 149       As an example of the used approach, a simple averaging circuit, that lets
 150       the accumulation of the inputs be done by a subcomponent.
 151
 152       \starttyping
 153         newtype State s = State s
 154
 155         type AccumState = State Bit
 156         accum :: Word -> AccumState -> (AccumState, Word)
 157         accum i (State s) = (State (s + i), s + i)
 158
 159         type AvgState = (AccumState, Word)
 160         avg :: Word -> AvgState -> (AvgState, Word)
 161         avg i (State s) = (State s', o)
 162           where
 163             (accums, count) = s
 164             -- Pass our input through the accumulator, which outputs a sum
 165             (accums', sum) = accum i accums
 166             -- Increment the count (which will be our new state)
 167             count' = count + 1
 168             -- Compute the average
 169             o = sum / count'
 170             s' = (accums', count')
 171       \stoptyping
 172
 173       And the normalized, core-like versions:
 174
 175       \starttyping
 176         accum i spacked = res
 177           where
 178             s = case spacked of (State s) -> s
 179             s' = s + i
 180             spacked' = State s'
 181             o = s + i
 182             res = (spacked', o)
 183
 184         avg i spacked = res
 185           where
 186             s = case spacked of (State s) -> s
 187             accums = case s of (accums, \_) -> accums
 188             count = case s of (\_, count) -> count
 189             accumres = accum i accums
 190             accums' = case accumres of (accums', \_) -> accums'
 191             sum = case accumres of (\_, sum) -> sum
 192             count' = count + 1
 193             o = sum / count'
 194             s' = (accums', count')
 195             spacked' = State s'
 196             res = (spacked', o)
 197       \stoptyping
 198
 199
 200
 201       As noted above, any component of a function's state that is a substate,
 202       \eg passed on as the state of another function, should have no influence
 203       on the hardware generated for the calling function. Any state-specific
 204       \small{VHDL} for this component can be generated entirely within the called
 205       function. So,we can completely leave out substates from any function.
 206
 207       From this observation, we might think to remove the substates from a
 208       function's states alltogether, and leave only the state components which
 209       are actual states of the current function. While doing this would not
 210       remove any information needed to generate \small{VHDL} from the function, it would
 211       cause the function definition to become invalid (since we won't have any
 212       substate to pass to the functions anymore). We could solve the syntactic
 213       problems by passing \type{undefined} for state variables, but that would
 214       still break the code on the semantic level (\ie, the function would no
 215       longer be semantically equivalent to the original input).
 216
 217       To keep the function definition correct until the very end of the process,
 218       we will not deal with (sub)states until we get to the \small{VHDL} generation.
 219       Here, we are translating from Core to \small{VHDL}, and we can simply not generate
 220       \small{VHDL} for substates, effectively removing the substate components
 221       alltogether.
 222
 223       There are a few important points when ignore substates.
 224
 225       First, we have to have some definition of "substate". Since any state
 226       argument or return value that represents state must be of the \type{State}
 227       type, we can simply look at its type. However, we must be careful to
 228       ignore only {\em substates}, and not a function's own state.
 229
 230       In the example above, this means we should remove \type{accums'} from
 231       \type{s'}, but not throw away \type{s'} entirely. We should, however,
 232       remove \type{s'} from the output port of the function, since the state
 233       will be handled by a \small{VHDL} procedure within the function.
 234
 235       When looking at substates, these can appear in two places: As part of an
 236       argument and as part of a return value. As noted above, these substates
 237       can only be used in very specific ways.
 238
 239       \desc{State variables can appear as an argument.} When generating \small{VHDL}, we
 240       completely ignore the argument and generate no input port for it.
 241
 242       \desc{State variables can be extracted from other state variables.} When
 243       extracting a state variable from another state variable, this always means
 244       we're extracting a substate, which we can ignore. So, we simply generate no
 245       \small{VHDL} for any extraction operation that has a state variable as a result.
 246
 247       \desc{State variables can be passed to functions.} When passing a
 248       state variable to a function, this always means we're passing a substate
 249       to a subcomponent. The entire argument can simply be ingored in the
 250       resulting port map.
 251
 252       \desc{State variables can be returned from functions.} When returning a
 253       state variable from a function (probably as a part of an algebraic
 254       datatype), this always mean we're returning a substate from a
 255       subcomponent. The entire state variable should be ignored in the resulting
 256       port map. The type binder of the binder that the function call is bound
 257       to should not include the state type either.
 258
 259       \startdesc{State variables can be inserted into other variables.} When inserting
 260       a state variable into another variable (usually by constructing that new
 261       variable using its constructor), we can identify two cases:
 262
 263       \startitemize
 264         \item The state is inserted into another state variable. In this case,
 265         the inserted state is a substate, and can be safely left out of the
 266         constructed variable.
 267         \item The state is inserted into a non-state variable. This happens when
 268         building up the return value of a function, where you put state and
 269         retsult variables together in an algebraic type (usually a tuple). In
 270         this case, we should leave the state variable out as well, since we
 271         don't want it to be included as an output port.
 272       \stopitemize
 273
 274       So, in both cases, we can simply leave out the state variable from the
 275       resulting value. In the latter case, however, we should generate a state
 276       proc instead, which assigns the state variable to the input state variable
 277       at each clock tick.
 278       \stopdesc
 279
 280       \desc{State variables can appear as (part of) a function result.} When
 281       generating \small{VHDL}, we can completely ignore any part of a function result
 282       that has a state type. If the entire result is a state type, this will
 283       mean the entity will not have an output port. Otherwise, the state
 284       elements will be removed from the type of the output port.
 285
 286
 287       Now, we know how to handle each use of a state variable separately. If we
 288       look at the whole, we can conclude the following:
 289
 290       \startitemize
 291       \item A state unpack operation should not generate any \small{VHDL}. The binder
 292       to which the unpacked state is bound should still be declared, this signal
 293       will become the register and will hold the current state.
 294       \item A state pack operation should not generate any \small{VHDL}. The binder th
 295       which the packed state is bound should not be declared. The binder that is
 296       packed is the signal that will hold the new state.
 297       \item Any values of a State type should not be translated to \small{VHDL}. In
 298       particular, State elements should be removed from tuples (and other
 299       datatypes) and arguments with a state type should not generate ports.
 300       \item To make the state actually work, a simple \small{VHDL} proc should be
 301       generated. This proc updates the state at every clockcycle, by assigning
 302       the new state to the current state. This will be recognized by synthesis
 303       tools as a register specification.
 304       \stopitemize
 305
 306
 307       When applying these rules to the example program (in normal form), we will
 308       get the following result. All the parts that don't generate any value are
 309       crossed out, leaving some very boring assignments here and there.
 310
 311
 312   \starthaskell
 313     avg i --spacked-- = res
 314       where
 315         s = --case spacked of (State s) -> s--
 316         --accums = case s of (accums, \_) -> accums--
 317         count = case s of (--\_,-- count) -> count
 318         accumres = accum i --accums--
 319         --accums' = case accumres of (accums', \_) -> accums'--
 320         sum = case accumres of (--\_,-- sum) -> sum
 321         count' = count + 1
 322         o = sum / count'
 323         s' = (--accums',-- count')
 324         --spacked' = State s'--
 325         res = (--spacked',-- o)
 326   \stophaskell
 327
 328       When we would really leave out the crossed out parts, we get a slightly
 329       weird program: There is a variable \type{s} which has no value, and there
 330       is a variable \type{s'} that is never used. Together, these two will form
 331       the state proc of the function. \type{s} contains the "current" state,
 332       \type{s'} is assigned the "next" state. So, at the end of each clock
 333       cycle, \type{s'} should be assigned to \type{s}.
 334
 335       Note that the definition of \type{s'} is not removed, even though one
 336       might think it as having a state type. Since the state type has a single
 337       argument constructor \type{State}, some type that should be the resulting
 338       state should always be explicitly packed with the State constructor,
 339       allowing us to remove the packed version, but still generate \small{VHDL} for the
 340       unpacked version (of course with any substates removed).
 341
 342       As you can see, the definition of \type{s'} is still present, since it
 343       does not have a state type (The State constructor. The \type{accums'} substate has been removed,
 344       leaving us just with the state of \type{avg} itself.
 345     \subsection{Initial state}
 346       How to specify the initial state? Cannot be done inside a hardware
 347       function, since the initial state is its own state argument for the first
 348       call (unless you add an explicit, synchronous reset port).
 349
 350       External init state is natural for simulation.
 351
 352       External init state works for hardware generation as well.
 353
 354       Implementation issues: state splitting, linking input to output state,
 355       checking usage constraints on state variables.
 356
 357   \section[sec:recursion]{Recursion}
 358   An import concept in functional languages is recursion. In it's most basic
 359   form, recursion is a function that is defined in terms of itself. This
 360   usually requires multiple evaluations of this function, with changing
 361   arguments, until eventually an evaluation of the function no longer requires
 362   itself.
 363
 364   Recursion in a hardware description is a bit of a funny thing. Usually,
 365   recursion is associated with a lot of nondeterminism, stack overflows, but
 366   also flexibility and expressive power.
 367
 368   Given the notion that each function application will translate to a
 369   component instantiation, we are presented with a problem. A recursive
 370   function would translate to a component that contains itself. Or, more
 371   precisely, that contains an instance of itself. This instance would again
 372   contain an instance of itself, and again, into infinity. This is obviously a
 373   problem for generating hardware.
 374
 375   This is expected for functions that describe infinite recursion. In that
 376   case, we can't generate hardware that shows correct behaviour in a single
 377   cycle (at best, we could generate hardware that needs an infinite number of
 378   cycles to complete).
 379
 380   However, most recursive hardware descriptions will describe finite
 381   recursion. This is because the recursive call is done conditionally. There
 382   is usually a case statement where at least one alternative does not contain
 383   the recursive call, which we call the "base case". If, for each call to the
 384   recursive function, we would be able to detect which alternative applies,
 385   we would be able to remove the case expression and leave only the base case
 386   when it applies. This will ensure that expanding the recursive functions
 387   will terminate after a bounded number of expansions.
 388
 389   This does imply the extra requirement that the base case is detectable at
 390   compile time. In particular, this means that the decision between the base
 391   case and the recursive case must not depend on runtime data.
 392
 393   \subsection{List recursion}
 394   The most common deciding factor in recursion is the length of a list that is
 395   passed in as an argument. Since we represent lists as vectors that encode
 396   the length in the vector type, it seems easy to determine the base case. We
 397   can simply look at the argument type for this. However, it turns out that
 398   this is rather non-trivial to write down in Haskell in the first place. As
 399   an example, we would like to write down something like this:
 400
 401   \starthaskell
 402     sum :: Vector n Word -> Word
 403     sum xs = case null xs of
 404       True -> 0
 405       False -> head xs + sum (tail xs)
 406   \stophaskell
 407
 408   However, the typechecker will now use the following reasoning (element type
 409   of the vector is left out):
 410
 411   \startitemize
 412   \item tail has the type \hs{(n > 0) => Vector n -> Vector (n - 1)}
 413   \item This means that xs must have the type \hs{(n > 0) => Vector n}
 414   \item This means that sum must have the type \hs{(n > 0) => Vector n -> a}
 415   \item sum is called with the result of tail as an argument, which has the
 416   type \hs{Vector n} (since \hs{(n > 0) => n - 1 == m}).
 417   \item This means that sum must have the type \hs{Vector n -> a}
 418   \item This is a contradiction between the type deduced from the body of sum
 419   (the input vector must be non-empty) and the use of sum (the input vector
 420   could have any length).
 421   \stopitemize
 422
 423   As you can see, using a simple case at value level causes the type checker
 424   to always typecheck both alternatives, which can't be done! This is a
 425   fundamental problem, that would seem perfectly suited for a type class.
 426   Considering that we need to switch between to implementations of the sum
 427   function, based on the type of the argument, this sounds like the perfect
 428   problem to solve with a type class. However, this approach has its own
 429   problems (not the least of them that you need to define a new typeclass for
 430   every recursive function you want to define).
 431
 432   Another approach tried involved using GADTs to be able to do pattern
 433   matching on empty / non empty lists. While this worked partially, it also
 434   created problems with more complex expressions.
 435
 436   TODO: How much detail should there be here? I can probably refer to
 437   Christiaan instead.
 438
 439   Evaluating all possible (and non-possible) ways to add recursion to our
 440   descriptions, it seems better to leave out list recursion alltogether. This
 441   allows us to focus on other interesting areas instead. By including
 442   (builtin) support for a number of higher order functions like map and fold,
 443   we can still express most of the things we would use list recursion for.
 444
 445   \subsection{General recursion}
 446   Of course there are other forms of recursion, that do not depend on the
 447   length (and thus type) of a list. For example, simple recursion using a
 448   counter could be expressed, but only translated to hardware for a fixed
 449   number of iterations. Also, this would require extensive support for compile
 450   time simplification (constant propagation) and compile time evaluation
 451   (evaluation constant comparisons), to ensure non-termination. Even then, it
 452   is hard to really guarantee termination, since the user (or GHC desugarer)
 453   might use some obscure notation that results in a corner case of the
 454   simplifier that is not caught and thus non-termination.
 455
 456   Due to these complications, we leave other forms of recursion as
 457   future work as well.