+ \section{State}
+ A very important concept in hardware designs is \emph{state}. In a
+ stateless (or \emph{combinatoric}) design, every output depends directly
+ and solely on the inputs. In a stateful design, the outputs can also depend
+ on the history of inputs, or the \emph{state}. State is usually stored in
+ \emph{registers}, which retain their value during a clock cycle and are
+ typically updated at the start of every clock cycle. Since the updating of
+ the state is tightly coupled (synchronized) to the clock signal, these
+ state updates are often called \emph{synchronous}.
+
+ To make our hardware description language useful for describing more than
+ simple combinatoric designs, we'll need to be able to describe state in
+ some way.
+
+ \subsection{Approaches to state}
+ In Haskell, functions are always pure (except when using unsafe
+ functions like \hs{unsafePerformIO}, which should be avoided whenever
+ possible). This means that the output of a function depends solely on
+ its inputs. If you evaluate a given function with given inputs, it will
+ always provide the same output.
+
+ TODO: Define pure
+
+ This is a perfect match for a combinatoric circuit, where the output
+ also depends solely on the inputs. However, when state is involved, this
+ no longer holds. Since we're in charge of our own language, we could
+ remove this purity constraint and allow a function to return different
+ values depending on the cycle in which it is evaluated (or rather, the
+ current state). However, this means that all kinds of interesting
+ properties of our functional language get lost, and all kinds of
+ transformations and optimizations might no longer be meaning preserving.
+
+ Provided that we want to keep the function pure, the current state has
+ to be present in the function's arguments in some way. There seem to be
+ two obvious ways to do this: Adding the current state as an argument, or
+ including the full history of each argument.
+
+ \subsubsection{Stream arguments and results}
+ Including the entire history of each input (\ie, the value of that
+ input for each previous clock cycle) is an obvious way to make outputs
+ depend on all previous inputs. This is easily done by making every
+ input a list instead of a single value, containing all previous values
+ as well as the current value.
+
+ An obvious downside of this solution is that on each cycle, all the
+ previous cycles must be resimulated to obtain the current state. To do
+ this, a recursive helper function might be needed as well, which might
+ be hard for the compiler to analyze properly.
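+
+ A minimal sketch of this approach, using an accumulator as the circuit.
+ The names \hs{accHist} and \hs{runHist} are hypothetical and only serve
+ as illustration; the history is assumed to be ordered oldest value first.
+
+ \starttyping
+ -- Hypothetical illustration: the argument holds every input value so
+ -- far, oldest first, with the current value last.
+ accHist :: [Word] -> Word
+ accHist history = go 0 history
+   where
+     -- Recursive helper that replays all previous cycles.
+     go total []       = total
+     go total (x : xs) = go (total + x) xs
+
+ -- Output for each of the first n cycles: cycle k replays k inputs.
+ runHist :: Int -> [Word] -> [Word]
+ runHist n inputs = [accHist (take k inputs) | k <- [1 .. n]]
+ \stoptyping
+
+ Here, \hs{runHist 3 [1, 2, 3]} evaluates to \hs{[1, 3, 6]}, but every
+ cycle replays the whole history, so the total work grows quadratically
+ with the number of simulated cycles.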
+
+ A slight variation on this approach is one taken by some of the other
+ functional \small{HDL}s in the field (TODO: References to Lava,
+ ForSyDe, ...): make functions operate on complete streams. This means
+ that a function is no longer called on every cycle, but just once. It
+ takes streams as inputs instead of values, where each stream contains
+ the values for every clock cycle since system start. This is easily
+ modeled using an (infinite) list, with one element for each clock
+ cycle. Since the function is only evaluated once, its output is also a
+ stream. Note that, since we are working with infinite lists and still
+ want to be able to simulate the system cycle-by-cycle, this relies
+ heavily on the lazy semantics of Haskell.
+
+ Since our inputs and outputs are streams, all other (intermediate)
+ values must be streams as well. All of our primitive operators (\eg,
+ addition, subtraction, bitwise operations, etc.) must therefore operate
+ on streams (note that changing a single-element operation to a stream
+ operation can be done with \hs{map}, \hs{zipWith}, etc.).
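+
+ As a small sketch of this lifting, assume a stream is modeled as a lazy
+ Haskell list. The type synonym \hs{Stream} and the names \hs{notS} and
+ \hs{addS} are assumptions made for this illustration only.
+
+ \starttyping
+ -- Assumption: a stream is a lazy, possibly infinite list with one
+ -- element per clock cycle.
+ type Stream a = [a]
+
+ -- Lift a one-input operation to streams.
+ notS :: Stream Bool -> Stream Bool
+ notS = map not
+
+ -- Lift a two-input operation to streams.
+ addS :: Stream Word -> Stream Word -> Stream Word
+ addS = zipWith (+)
+ \stoptyping
+
+ For example, \hs{take 3 (addS [1 ..] (repeat 10))} evaluates to
+ \hs{[11, 12, 13]}; thanks to laziness, only the requested cycles are
+ computed.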
+
+ Note that the concept of \emph{state} is no more than having some way
+ to communicate a value from one cycle to the next. By introducing a
+ \hs{delay} function, we can do exactly that: Delay (each value in) a
+ stream so that we can "look into" the past. This \hs{delay} function
+ simply outputs a stream where each value is the same as the input
+ value, but shifted one cycle. This causes a \quote{gap} at the
+ beginning of the stream: What is the value of the delay output in the
+ first cycle? For this, the \hs{delay} function has a second input
+ (which is a value, not a stream!).
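+
+ With streams modeled as lazy lists (an assumption of this sketch, not
+ necessarily how the cited languages implement it), such a \hs{delay}
+ function could be as simple as prepending the initial value:
+
+ \starttyping
+ -- Assumption: a stream is a lazy list with one element per cycle.
+ type Stream a = [a]
+
+ -- Shift a stream by one cycle; the second argument is the plain value
+ -- that fills the gap in the first cycle.
+ delay :: Stream a -> a -> Stream a
+ delay s initial = initial : s
+ \stoptyping
+
+ For example, \hs{take 4 (delay [1 ..] 0)} evaluates to \hs{[0, 1, 2, 3]}.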
+
+ \in{Example}[ex:DelayAcc] shows a simple accumulator expressed in this
+ style.
+
+\startbuffer[DelayAcc]
+acc :: Stream Word -> Stream Word
+acc inp = out
+  where
+    out = (delay out 0) + inp
+\stopbuffer
+
+\startuseMPgraphic{DelayAcc}
+ save in, out, add, reg;
+
+ % I/O ports
+ newCircle.in(btex $in$ etex) "framed(false)";
+ newCircle.out(btex $out$ etex) "framed(false)";
+
+ % Components
+ newReg.reg("") "dx(4mm)", "dy(6mm)", "reflect(true)";
+ newCircle.add(btex + etex);
+
+ in.c = origin;
+ add.c = in.c + (2cm, 0cm);
+ out.c = add.c + (2cm, 0cm);
+ reg.c = add.c + (0cm, 2cm);
+
+ % Draw objects and lines
+ drawObj(in, out, add, reg);
+
+ nccurve(add)(reg) "angleA(0)", "angleB(180)", "posB(d)";
+ nccurve(reg)(add) "angleA(180)", "angleB(-45)", "posA(out)";
+ ncline(in)(add);
+ ncline(add)(out);
+\stopuseMPgraphic
+
+
+ \placeexample[here][ex:DelayAcc]{Simple accumulator architecture.}
+ \startcombination[2*1]
+ {\typebufferhs{DelayAcc}}{Haskell description using streams.}
+ {\boxedgraphic{DelayAcc}}{The architecture described by the Haskell description.}
+ \stopcombination
+
+
+ This notation can be confusing (especially due to the loop in the
+ definition of \hs{out}), but is essentially easy to interpret. There is a
+ single call to \hs{delay}, resulting in a circuit with a single register,
+ whose input is connected to \hs{out} (which is the output of the
+ adder), and whose output is \hs{delay out 0} (which is connected
+ to one of the adder inputs).
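+
+ To see that the loop in the definition is well-defined, the accumulator
+ can be simulated with plain lazy lists. This is only a sketch: the list
+ model of \hs{Stream}, the one-line \hs{delay} and the explicit
+ \hs{zipWith} (instead of an overloaded \hs{+} on streams) are assumptions
+ made for this illustration.
+
+ \starttyping
+ type Stream a = [a]
+
+ delay :: Stream a -> a -> Stream a
+ delay s initial = initial : s
+
+ -- Same structure as the accumulator above, with the adder lifted to
+ -- streams explicitly. The definition of out refers to itself, which
+ -- works because each element only depends on earlier elements.
+ acc :: Stream Word -> Stream Word
+ acc inp = out
+   where
+     out = zipWith (+) (delay out 0) inp
+ \stoptyping
+
+ Evaluating \hs{take 5 (acc [1 ..])} gives \hs{[1, 3, 6, 10, 15]}: the
+ first output only needs the initial register value, and every later
+ output only needs the output of the cycle before it.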
+
+ This notation has a number of downsides, amongst which are limited
+ readability and ambiguity in the interpretation. TODO: Reference
+ Christiaan.
+
+ \subsubsection{Explicit state arguments and results}
+ A more explicit way to model state is to simply add an extra argument
+ containing the current state value. This allows an output to depend on
+ both the inputs and the current state while keeping the
+ function pure (letting the result depend only on the arguments), since
+ the current state is now an argument.
+
+ In Haskell, this would look like \in{example}[ex:ExplicitAcc].
+
+\startbuffer[ExplicitAcc]
+acc :: Word -> (State Word) -> (State Word, Word)
+acc inp (State s) = (State s', out)
+  where
+    out = s + inp
+    s'  = out
+\stopbuffer
+
+ \placeexample[here][ex:ExplicitAcc]{Simple accumulator architecture.}
+ \startcombination[2*1]
+ {\typebufferhs{ExplicitAcc}}{Haskell description using explicit state arguments.}
+ % Picture is identical to the one we had just now.
+ {\boxedgraphic{DelayAcc}}{The architecture described by the Haskell description.}
+ \stopcombination
+
+ This approach makes a function's state very explicit: which state
+ variables are used by a function can be completely determined from its
+ type signature. This is in contrast to the stream approach, where a
+ function looks the same from the outside, regardless of which state
+ variables it uses (or whether it is stateful at all).
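+
+ The sketch below shows what evaluating such a function for two
+ consecutive cycles could look like when the state is threaded by hand.
+ The helper \hs{twoCycles} and the initial state of 0 are assumptions
+ made for this illustration.
+
+ \starttyping
+ newtype State s = State s
+
+ acc :: Word -> State Word -> (State Word, Word)
+ acc inp (State s) = (State s', out)
+   where
+     out = s + inp
+     s'  = out
+
+ -- Two consecutive cycles: the state returned by the first call is the
+ -- state argument of the second call.
+ twoCycles :: Word -> Word -> (Word, Word)
+ twoCycles in1 in2 = (out1, out2)
+   where
+     (s1, out1) = acc in1 (State 0)  -- initial state 0 is an assumption
+     (_,  out2) = acc in2 s1
+ \stoptyping
+
+ Here, \hs{twoCycles 3 4} evaluates to \hs{(3, 7)}. The function itself
+ stays pure: all memory lives in the value that is passed from one call
+ to the next.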
+
+ A direct consequence of this is that if a function calls other
+ stateful functions (\ie, has subcircuits), it has to somehow know the
+ current state for these called functions. The only way to do this is
+ to put these \emph{substates} inside the caller's state. This means
+ that a function's state is the combination of the states of all
+ functions it calls and its own state.
+
+ This approach is the one chosen for Cλash and will be examined more
+ closely below.
+
+ \subsection{Explicit state specification}
+ Note about semantic correctness of top level state.
+
+ Note about automatic ``down-pushing'' of state.
+
+ Note about explicit state specification as the best solution.
+
+ Note about substates
+
+ Note about conditions on state variables and checking them.
+
+ \subsection{Explicit state implementation}
+ Recording state variables at the type level.
+
+ Ideal: Type synonyms, since there is no additional code overhead for
+ packing and unpacking. Downside: there is no explicit conversion in Core
+ either, so type synonyms tend to get lost in expressions (they can be
+ preserved in binders, but this makes implementation harder, since the
+ statefulness of a value must be manually tracked).
+
+ Less ideal: Newtype. Requires explicit packing and unpacking of function
+ arguments. If you don't unpack substates, there is no overhead for
+ (un)packing substates. This will result in many nested State constructors
+ in a nested state type. \eg:
+
+ \starttyping
+ State (State Bit, State (State Word, Bit), Word)
+ \stoptyping
+
+ Alternative: Provide different newtypes for input and output state. This
+ makes the code even more explicit, and typechecking can find even more
+ errors. However, this requires defining two type synonyms for each
+ stateful function instead of just one. \eg:
+ \starttyping
+ type AccumStateIn = StateIn Bit
+ type AccumStateOut = StateOut Bit
+ \stoptyping
+ This also increases the possibility of having different input and output
+ states. Checking for identical input and output state types is also
+ harder, since each element in the state must be unpacked and compared
+ separately.
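+
+ A sketch of what this alternative could look like for an accumulator.
+ The definitions of \type{StateIn} and \type{StateOut}, the use of
+ \type{Word} instead of \type{Bit} (so the addition typechecks) and the
+ helper \type{feedback} are all assumptions made for this illustration.
+
+ \starttyping
+ newtype StateIn s  = StateIn s
+ newtype StateOut s = StateOut s
+
+ type AccumStateIn  = StateIn Word
+ type AccumStateOut = StateOut Word
+
+ -- Mixing up input and output state is now a type error.
+ accum :: Word -> AccumStateIn -> (AccumStateOut, Word)
+ accum i (StateIn s) = (StateOut s', s')
+   where
+     s' = s + i
+
+ -- Hypothetical helper, for simulation only: feed the output state of
+ -- one cycle back as the input state of the next cycle.
+ feedback :: StateOut s -> StateIn s
+ feedback (StateOut s) = StateIn s
+ \stoptyping
+
+ Note that nothing in these types forces \type{AccumStateIn} and
+ \type{AccumStateOut} to wrap the same type, which is exactly the
+ checking problem described above.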
+
+ Alternative: Provide a type for the entire result type of a stateful
+ function, not just the state part. \eg:
+
+ \starttyping
+ newtype Result state result = Result (state, result)
+ \stoptyping
+
+ This makes it easy to say \quote{any stateful function must return a
+ \type{Result} type}, without having to sort out result from state. However,
+ this either requires a second type for input state (similar to
+ \type{StateIn} / \type{StateOut} above), or requires the compiler to
+ select the right argument for input state by looking at types (which works
+ for complex states, but when that state has the same type as an argument,
+ things get ambiguous) or by selecting a fixed (\eg, the last) argument,
+ which might be limiting.
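+
+ A sketch of this alternative, assuming the convention that the last
+ argument is the input state and reusing a \type{State} newtype for it
+ (both are assumptions of this illustration):
+
+ \starttyping
+ newtype State s = State s
+ newtype Result state result = Result (state, result)
+
+ -- Assumed convention: the last argument is the input state.
+ accum :: Word -> State Word -> Result (State Word) Word
+ accum i (State s) = Result (State s', s')
+   where
+     s' = s + i
+ \stoptyping
+
+ The return type now identifies \type{accum} as stateful at a glance, but
+ the input state is only recognizable by its position and type.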
+
+ \subsubsection{Example}
+ As an example of the approach used, consider a simple averaging circuit
+ that lets the accumulation of the inputs be done by a subcomponent.
+
+ \starttyping
+ newtype State s = State s
+
+ type AccumState = State Word
+ accum :: Word -> AccumState -> (AccumState, Word)
+ accum i (State s) = (State (s + i), s + i)
+
+ type AvgState = State (AccumState, Word)
+ avg :: Word -> AvgState -> (AvgState, Word)
+ avg i (State s) = (State s', o)
+ where
+ (accums, count) = s
+ -- Pass our input through the accumulator, which outputs a sum
+ (accums', sum) = accum i accums
+ -- Increment the count (which will be our new state)
+ count' = count + 1
+ -- Compute the average
+ o = sum / count'
+ s' = (accums', count')
+ \stoptyping
+
+ And the normalized, core-like versions:
+
+ \starttyping
+ accum i spacked = res
+ where
+ s = case spacked of (State s) -> s
+ s' = s + i
+ spacked' = State s'
+ o = s + i
+ res = (spacked', o)
+
+ avg i spacked = res
+ where
+ s = case spacked of (State s) -> s
+ accums = case s of (accums, \_) -> accums
+ count = case s of (\_, count) -> count
+ accumres = accum i accums
+ accums' = case accumres of (accums', \_) -> accums'
+ sum = case accumres of (\_, sum) -> sum
+ count' = count + 1
+ o = sum / count'
+ s' = (accums', count')
+ spacked' = State s'
+ res = (spacked', o)
+ \stoptyping
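+
+ To check the behaviour of the (unnormalized) description, it can be run
+ over a list of inputs. The sketch below is self-contained and makes a few
+ assumptions that are not part of the original example: the accumulator
+ state is a \type{Word}, \type{AvgState} is wrapped in \type{State}, the
+ division is integer division (\type{div}), the initial state is all
+ zeroes, and \type{runAvg} is a hypothetical helper.
+
+ \starttyping
+ import Data.List (mapAccumL)
+
+ newtype State s = State s
+
+ type AccumState = State Word
+ accum :: Word -> AccumState -> (AccumState, Word)
+ accum i (State s) = (State (s + i), s + i)
+
+ type AvgState = State (AccumState, Word)
+ avg :: Word -> AvgState -> (AvgState, Word)
+ avg i (State (accums, count)) = (State (accums', count'), o)
+   where
+     (accums', total) = accum i accums
+     count'           = count + 1
+     o                = total `div` count'  -- integer division (assumption)
+
+ -- Run avg over a list of inputs, starting from an all-zero state.
+ runAvg :: [Word] -> [Word]
+ runAvg = snd . mapAccumL step (State (State 0, 0))
+   where
+     step s i = avg i s
+ \stoptyping
+
+ For the inputs \type{[2, 4, 6]}, the running sums are 2, 6 and 12, so
+ \type{runAvg [2, 4, 6]} evaluates to \type{[2, 3, 4]}.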
+
+
+
+ As noted above, any component of a function's state that is a substate,
+ \ie, passed on as the state of another function, should have no influence
+ on the hardware generated for the calling function. Any state-specific
+ \small{VHDL} for this component can be generated entirely within the called
+ function. So, we can completely leave out substates from any function.
+
+ From this observation, we might think to remove the substates from a
+ function's states altogether, and leave only the state components which
+ are actual states of the current function. While doing this would not
+ remove any information needed to generate \small{VHDL} from the function, it would
+ cause the function definition to become invalid (since we won't have any
+ substate to pass to the called functions anymore). We could solve the syntactic
+ problems by passing \type{undefined} for state variables, but that would
+ still break the code on the semantic level (\ie, the function would no
+ longer be semantically equivalent to the original input).
+
+ To keep the function definition correct until the very end of the process,
+ we will not deal with (sub)states until we get to the \small{VHDL} generation.
+ Here, we are translating from Core to \small{VHDL}, and we can simply not generate
+ \small{VHDL} for substates, effectively removing the substate components
+ altogether.
+
+ There are a few important points to keep in mind when ignoring substates.
+
+ First, we need some definition of \quote{substate}. Since any
+ argument or return value that represents state must be of the \type{State}
+ type, we can simply look at its type. However, we must be careful to
+ ignore only \emph{substates}, and not a function's own state.
+
+ In the example above, this means we should remove \type{accums'} from
+ \type{s'}, but not throw away \type{s'} entirely. We should, however,
+ remove \type{s'} from the output port of the function, since the state
+ will be handled by a \small{VHDL} process within the function.
+
+ When looking at substates, these can appear in two places: As part of an
+ argument and as part of a return value. As noted above, these substates
+ can only be used in very specific ways.
+
+ \desc{State variables can appear as an argument.} When generating \small{VHDL}, we
+ completely ignore the argument and generate no input port for it.
+
+ \desc{State variables can be extracted from other state variables.} When
+ extracting a state variable from another state variable, this always means
+ we're extracting a substate, which we can ignore. So, we simply generate no
+ \small{VHDL} for any extraction operation that has a state variable as a result.
+
+ \desc{State variables can be passed to functions.} When passing a
+ state variable to a function, this always means we're passing a substate
+ to a subcomponent. The entire argument can simply be ignored in the
+ resulting port map.
+
+ \desc{State variables can be returned from functions.} When returning a
+ state variable from a function (probably as a part of an algebraic
+ datatype), this always means we're returning a substate from a
+ subcomponent. The entire state variable should be ignored in the resulting
+ port map. The type of the binder that the function call is bound
+ to should not include the state type either.
+
+ \startdesc{State variables can be inserted into other variables.} When inserting
+ a state variable into another variable (usually by constructing that new
+ variable using its constructor), we can identify two cases:
+
+ \startitemize
+ \item The state is inserted into another state variable. In this case,
+ the inserted state is a substate, and can be safely left out of the
+ constructed variable.
+ \item The state is inserted into a non-state variable. This happens when
+ building up the return value of a function, where you put state and
+ result variables together in an algebraic type (usually a tuple). In
+ this case, we should leave the state variable out as well, since we
+ don't want it to be included as an output port.
+ \stopitemize
+
+ So, in both cases, we can simply leave out the state variable from the
+ resulting value. In the latter case, however, we should generate a state
+ process instead, which assigns the state variable to the input state variable
+ at each clock tick.
+ \stopdesc
+
+ \desc{State variables can appear as (part of) a function result.} When
+ generating \small{VHDL}, we can completely ignore any part of a function result
+ that has a state type. If the entire result is a state type, this will
+ mean the entity will not have an output port. Otherwise, the state
+ elements will be removed from the type of the output port.
+
+
+ Now, we know how to handle each use of a state variable separately. If we
+ look at the whole, we can conclude the following:
+
+ \startitemize
+ \item A state unpack operation should not generate any \small{VHDL}. The binder
+ to which the unpacked state is bound should still be declared: this signal
+ will become the register and will hold the current state.
+ \item A state pack operation should not generate any \small{VHDL}. The binder to
+ which the packed state is bound should not be declared. The binder that is
+ packed is the signal that will hold the new state.
+ \item Any values of a State type should not be translated to \small{VHDL}. In
+ particular, State elements should be removed from tuples (and other
+ datatypes) and arguments with a state type should not generate ports.
+ \item To make the state actually work, a simple \small{VHDL} process should be
+ generated. This process updates the state at every clock cycle, by assigning
+ the new state to the current state. This will be recognized by synthesis
+ tools as a register specification.
+ \stopitemize
+
+
+ When applying these rules to the example program (in normal form), we will
+ get the following result. All the parts that don't generate any value are
+ crossed out, leaving some very boring assignments here and there.
+
+
+ \starthaskell
+ avg i --spacked-- = res
+ where
+ s = --case spacked of (State s) -> s--
+ --accums = case s of (accums, \_) -> accums--
+ count = case s of (--\_,-- count) -> count
+ accumres = accum i --accums--
+ --accums' = case accumres of (accums', \_) -> accums'--
+ sum = case accumres of (--\_,-- sum) -> sum
+ count' = count + 1
+ o = sum / count'
+ s' = (--accums',-- count')
+ --spacked' = State s'--
+ res = (--spacked',-- o)
+ \stophaskell
+
+ If we really left out the crossed-out parts, we would get a slightly
+ weird program: there is a variable \type{s} which has no value, and there
+ is a variable \type{s'} that is never used. Together, these two will form
+ the state process of the function. \type{s} contains the "current" state,
+ \type{s'} is assigned the "next" state. So, at the end of each clock
+ cycle, \type{s'} should be assigned to \type{s}.
+
+ Note that the definition of \type{s'} is not removed, even though one
+ might think of it as having a state type. Since the state type has a
+ single-argument constructor \type{State}, any value that is to become the
+ resulting state must always be explicitly packed with the \type{State}
+ constructor. This allows us to remove the packed version, but still
+ generate \small{VHDL} for the unpacked version (of course with any
+ substates removed).
+
+ As you can see, the definition of \type{s'} is still present, since it
+ does not have a state type (only the version packed with the \type{State}
+ constructor does). The \type{accums'} substate has been removed,
+ leaving us with just the state of \type{avg} itself.
+ \subsection{Initial state}
+ How to specify the initial state? Cannot be done inside a hardware
+ function, since the initial state is its own state argument for the first
+ call (unless you add an explicit, synchronous reset port).
+
+ External init state is natural for simulation.
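+
+ A sketch of how a simulation could supply the initial state from the
+ outside. The driver \type{simulate} is a hypothetical helper, and the
+ accumulator uses an unwrapped \type{Word} as state to keep the example
+ short.
+
+ \starttyping
+ import Data.List (mapAccumL)
+
+ -- Hypothetical simulation driver: the initial state is an ordinary
+ -- argument, and the state is threaded from each cycle to the next.
+ simulate :: (i -> s -> (s, o)) -> s -> [i] -> [o]
+ simulate f initial = snd . mapAccumL (\s i -> f i s) initial
+
+ -- Example circuit: an accumulator with an unwrapped state (assumption).
+ accum :: Word -> Word -> (Word, Word)
+ accum i s = (s + i, s + i)
+ \stoptyping
+
+ With this, \type{simulate accum 0 [1, 2, 3]} evaluates to
+ \type{[1, 3, 6]}, while \type{simulate accum 10 [1, 2, 3]} evaluates to
+ \type{[11, 13, 16]}: the hardware description is unchanged and only the
+ externally supplied initial state differs.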
+
+ External init state works for hardware generation as well.
+
+ Implementation issues: state splitting, linking input to output state,
+ checking usage constraints on state variables.
+
+ \section[sec:recursion]{Recursion}