function boundaries), but eventually, the partial application will become
completely applied.
+ \section{Costless specialization}
+ Each (complete) function application in our description generates a
+ component instantiation, or a specific piece of hardware in the final
+ design. It is interesting to note that each application of a function
+ generates a \emph{separate} piece of hardware. In the final design, none
+ of the hardware is shared between applications, even when the applied
+ function is the same.
+
+ This is distinctly different from normal program compilation: Two separate
+ calls to the same function share the same machine code. Having more
+ machine code has implications for speed (due to less efficient caching)
+ and memory usage. For normal compilation, it is therefore important to
+ keep the amount of functions limited and maximize the code sharing.
+
+ When generating hardware, this is hardly an issue. Having more \quote{code
+ sharing} does reduce the amount of \small{VHDL} output (Since different
+ component instantiations still share the same component), but after
+ synthesis, the amount of hardware generated is not affected.
+
+ In particular, if we would duplicate all functions so that there is a
+ duplicate for every application in the program (\eg, each function is then
+ only applied exactly once), there would be no increase in hardware size
+ whatsoever.
+
+ TODO: Perhaps these next two sections are a bit too
+ implementation-oriented?
+
+ \subsection{Specialization}
+ Because of this, a common optimization technique called
+ \emph{specialization} is as good as free for hardware generation. Given
+ some function that has a \emph{domain} of $D$ (\eg, the set of all
+ possible arguments that could be applied), we create a specialized
+ function with exactly the same behaviour, but with an domain of $D'
+ \subset D$. This subset can be derived in all sort of ways, but commonly
+ this is done by limiting a polymorphic argument to a single type (\eg,
+ removing polymorphism) or by limiting an argument to just a single value
+ (\eg, cross-function constant propagation, effectively removing the
+ argument).
+
+ Since we limit the argument domain of the specialized function, its
+ definition can often be optimized further (since now more types or even
+ values of arguments are already know). By replacing any application of
+ the function that falls within the reduced domain by an application of
+ the specialized version, the code gets faster (but the code also gets
+ bigger, since we now have two versions instead of one!). If we apply
+ this technique often enough, we can often replace all applications of a
+ function by specialized versions, allowing the original function to be
+ removed (in some cases, this can even give a net reduction of the code
+ compared to the non-specialized version).
+
+ Specialization is useful for our hardware descriptions for functions
+ that contain arguments that cannot be translated to hardware directly
+ (polymorphic or higher order arguments, for example). If we can create
+ specialized functions that remove the argument, or make it translatable,
+ we can use specialization to make the original, untranslatable, function
+ obsolete.
+
+ \section{Higher order values}
+ What holds for partial application, can be easily generalized to any
+ higher order expression. This includes partial applications, plain
+ variables (e.g., a binder referring to a top level function), lambda
+ expressions and more complex expressions with a function type (a case
+ expression returning lambda's, for example).
+
+ Each of these values cannot be directly represented in hardware (just like
+ partial applications). Also, to make them representable, they need to be
+ applied: function variables and partial applications will then eventually
+ become complete applications, applied lambda expressions disappear by
+ applying β-reduction, etc.
+
+ So any higher order value will be \quote{pushed down} towards its
+ application just like partial applications. Whenever a function boundary
+ needs to be crossed, the called function can be specialized.
+
+ TODO: This is section should be improved
+
+ \section{Polymorphism}
+ In Haskell, values can be polymorphic: They can have multiple types. For
+ example, the function \hs{fst :: (a, b) -> a} is an example of a
+ polymorphic function: It works for tuples with any element types. Haskell
+ typeclasses allow a function to work on a specific set of types, but the
+ general idea is the same.
+
+% A type class is a collection of types for which some operations are
+% defined. It is thus possible for a value to be polymorphic while having
+% any number of \emph{class constraints}: The value is not defined for
+% every type, but only for types in the type class. An example of this is
+% the \hs{even :: (Integral a) => a -> Bool} function, which can map any
+% value of a type that is member of the \hs{Integral} type class
+
+ When generating hardware, polymorphism can't be easily translated. How
+ many wire will you lay down for a value that could have any type? When
+ type classes are involved, what hardware components will you lay down for
+ a class method (whose behaviour depends on the type of its arguments)?
+
+ Fortunately, we can again use the principle of specialization: Since every
+ function application generates separate pieces of hardware, we can know
+ the types of all arguments exactly. Provided that we don't use existential
+ typing, all of the polymorphic types in a function must depend on the
+ types of the arguments (In other words, the only way to introduce a type
+ variable is in a lambda abstraction). Our top level function must not have
+ a polymorphic type (otherwise we wouldn't know the hardware interface to
+ our top level function).
+
+ If a function is monomorphic, all values inside it are monomorphic as
+ well, so any function that is applied within the function can only be
+ applied to monomorphic values. The applied functions can then be
+ specialized to work just for these specific types, removing the
+ polymorphism from the applied functions as well.
+
+ By induction, this means that all functions that are (indirectly) called
+ by our top level function (meaning all functions that are translated in
+ the final hardware) become monomorphic.
+
\section{State}
A very important concept in hardware designs is \emph{state}. In a
stateless (or, \emph{combinatoric}) design, every output is a directly and solely dependent on the
TODO: Note about conditions on state variables and checking them.
- \subsection{Explicit state implementation}
- Recording state variables at the type level.
-
- Ideal: Type synonyms, since there is no additional code overhead for
- packing and unpacking. Downside: there is no explicit conversion in Core
- either, so type synonyms tend to get lost in expressions (they can be
- preserved in binders, but this makes implementation harder, since that
- statefulness of a value must be manually tracked).
-
- Less ideal: Newtype. Requires explicit packing and unpacking of function
- arguments. If you don't unpack substates, there is no overhead for
- (un)packing substates. This will result in many nested State constructors
- in a nested state type. \eg:
-
- \starttyping
- State (State Bit, State (State Word, Bit), Word)
- \stoptyping
-
- Alternative: Provide different newtypes for input and output state. This
- makes the code even more explicit, and typechecking can find even more
- errors. However, this requires defining two type synomyms for each
- stateful function instead of just one. \eg:
- \starttyping
- type AccumStateIn = StateIn Bit
- type AccumStateOut = StateOut Bit
- \stoptyping
- This also increases the possibility of having different input and output
- states. Checking for identical input and output state types is also
- harder, since each element in the state must be unpacked and compared
- separately.
-
- Alternative: Provide a type for the entire result type of a stateful
- function, not just the state part. \eg:
-
- \starttyping
- newtype Result state result = Result (state, result)
- \stoptyping
-
- This makes it easy to say "Any stateful function must return a
- \type{Result} type, without having to sort out result from state. However,
- this either requires a second type for input state (similar to
- \type{StateIn} / \type{StateOut} above), or requires the compiler to
- select the right argument for input state by looking at types (which works
- for complex states, but when that state has the same type as an argument,
- things get ambiguous) or by selecting a fixed (\eg, the last) argument,
- which might be limiting.
-
- \subsubsection{Example}
- As an example of the used approach, a simple averaging circuit, that lets
- the accumulation of the inputs be done by a subcomponent.
-
- \starttyping
- newtype State s = State s
-
- type AccumState = State Bit
- accum :: Word -> AccumState -> (AccumState, Word)
- accum i (State s) = (State (s + i), s + i)
-
- type AvgState = (AccumState, Word)
- avg :: Word -> AvgState -> (AvgState, Word)
- avg i (State s) = (State s', o)
- where
- (accums, count) = s
- -- Pass our input through the accumulator, which outputs a sum
- (accums', sum) = accum i accums
- -- Increment the count (which will be our new state)
- count' = count + 1
- -- Compute the average
- o = sum / count'
- s' = (accums', count')
- \stoptyping
-
- And the normalized, core-like versions:
-
- \starttyping
- accum i spacked = res
- where
- s = case spacked of (State s) -> s
- s' = s + i
- spacked' = State s'
- o = s + i
- res = (spacked', o)
-
- avg i spacked = res
- where
- s = case spacked of (State s) -> s
- accums = case s of (accums, \_) -> accums
- count = case s of (\_, count) -> count
- accumres = accum i accums
- accums' = case accumres of (accums', \_) -> accums'
- sum = case accumres of (\_, sum) -> sum
- count' = count + 1
- o = sum / count'
- s' = (accums', count')
- spacked' = State s'
- res = (spacked', o)
- \stoptyping
-
-
-
- As noted above, any component of a function's state that is a substate,
- \eg passed on as the state of another function, should have no influence
- on the hardware generated for the calling function. Any state-specific
- \small{VHDL} for this component can be generated entirely within the called
- function. So,we can completely leave out substates from any function.
-
- From this observation, we might think to remove the substates from a
- function's states alltogether, and leave only the state components which
- are actual states of the current function. While doing this would not
- remove any information needed to generate \small{VHDL} from the function, it would
- cause the function definition to become invalid (since we won't have any
- substate to pass to the functions anymore). We could solve the syntactic
- problems by passing \type{undefined} for state variables, but that would
- still break the code on the semantic level (\ie, the function would no
- longer be semantically equivalent to the original input).
-
- To keep the function definition correct until the very end of the process,
- we will not deal with (sub)states until we get to the \small{VHDL} generation.
- Here, we are translating from Core to \small{VHDL}, and we can simply not generate
- \small{VHDL} for substates, effectively removing the substate components
- alltogether.
-
- There are a few important points when ignore substates.
-
- First, we have to have some definition of "substate". Since any state
- argument or return value that represents state must be of the \type{State}
- type, we can simply look at its type. However, we must be careful to
- ignore only {\em substates}, and not a function's own state.
-
- In the example above, this means we should remove \type{accums'} from
- \type{s'}, but not throw away \type{s'} entirely. We should, however,
- remove \type{s'} from the output port of the function, since the state
- will be handled by a \small{VHDL} procedure within the function.
-
- When looking at substates, these can appear in two places: As part of an
- argument and as part of a return value. As noted above, these substates
- can only be used in very specific ways.
-
- \desc{State variables can appear as an argument.} When generating \small{VHDL}, we
- completely ignore the argument and generate no input port for it.
-
- \desc{State variables can be extracted from other state variables.} When
- extracting a state variable from another state variable, this always means
- we're extracting a substate, which we can ignore. So, we simply generate no
- \small{VHDL} for any extraction operation that has a state variable as a result.
-
- \desc{State variables can be passed to functions.} When passing a
- state variable to a function, this always means we're passing a substate
- to a subcomponent. The entire argument can simply be ingored in the
- resulting port map.
-
- \desc{State variables can be returned from functions.} When returning a
- state variable from a function (probably as a part of an algebraic
- datatype), this always mean we're returning a substate from a
- subcomponent. The entire state variable should be ignored in the resulting
- port map. The type binder of the binder that the function call is bound
- to should not include the state type either.
-
- \startdesc{State variables can be inserted into other variables.} When inserting
- a state variable into another variable (usually by constructing that new
- variable using its constructor), we can identify two cases:
-
- \startitemize
- \item The state is inserted into another state variable. In this case,
- the inserted state is a substate, and can be safely left out of the
- constructed variable.
- \item The state is inserted into a non-state variable. This happens when
- building up the return value of a function, where you put state and
- retsult variables together in an algebraic type (usually a tuple). In
- this case, we should leave the state variable out as well, since we
- don't want it to be included as an output port.
- \stopitemize
-
- So, in both cases, we can simply leave out the state variable from the
- resulting value. In the latter case, however, we should generate a state
- proc instead, which assigns the state variable to the input state variable
- at each clock tick.
- \stopdesc
-
- \desc{State variables can appear as (part of) a function result.} When
- generating \small{VHDL}, we can completely ignore any part of a function result
- that has a state type. If the entire result is a state type, this will
- mean the entity will not have an output port. Otherwise, the state
- elements will be removed from the type of the output port.
-
-
- Now, we know how to handle each use of a state variable separately. If we
- look at the whole, we can conclude the following:
-
- \startitemize
- \item A state unpack operation should not generate any \small{VHDL}. The binder
- to which the unpacked state is bound should still be declared, this signal
- will become the register and will hold the current state.
- \item A state pack operation should not generate any \small{VHDL}. The binder th
- which the packed state is bound should not be declared. The binder that is
- packed is the signal that will hold the new state.
- \item Any values of a State type should not be translated to \small{VHDL}. In
- particular, State elements should be removed from tuples (and other
- datatypes) and arguments with a state type should not generate ports.
- \item To make the state actually work, a simple \small{VHDL} proc should be
- generated. This proc updates the state at every clockcycle, by assigning
- the new state to the current state. This will be recognized by synthesis
- tools as a register specification.
- \stopitemize
-
-
- When applying these rules to the example program (in normal form), we will
- get the following result. All the parts that don't generate any value are
- crossed out, leaving some very boring assignments here and there.
-
+ \subsection{Explicit state annotation}
+ To make our stateful descriptions unambigious and easier to translate,
+ we need some way for the developer to describe which arguments and
+ results are intended to become stateful.
- \starthaskell
- avg i --spacked-- = res
- where
- s = --case spacked of (State s) -> s--
- --accums = case s of (accums, \_) -> accums--
- count = case s of (--\_,-- count) -> count
- accumres = accum i --accums--
- --accums' = case accumres of (accums', \_) -> accums'--
- sum = case accumres of (--\_,-- sum) -> sum
- count' = count + 1
- o = sum / count'
- s' = (--accums',-- count')
- --spacked' = State s'--
- res = (--spacked',-- o)
- \stophaskell
-
- When we would really leave out the crossed out parts, we get a slightly
- weird program: There is a variable \type{s} which has no value, and there
- is a variable \type{s'} that is never used. Together, these two will form
- the state proc of the function. \type{s} contains the "current" state,
- \type{s'} is assigned the "next" state. So, at the end of each clock
- cycle, \type{s'} should be assigned to \type{s}.
-
- Note that the definition of \type{s'} is not removed, even though one
- might think it as having a state type. Since the state type has a single
- argument constructor \type{State}, some type that should be the resulting
- state should always be explicitly packed with the State constructor,
- allowing us to remove the packed version, but still generate \small{VHDL} for the
- unpacked version (of course with any substates removed).
-
- As you can see, the definition of \type{s'} is still present, since it
- does not have a state type (The State constructor. The \type{accums'} substate has been removed,
- leaving us just with the state of \type{avg} itself.
- \subsection{Initial state}
- How to specify the initial state? Cannot be done inside a hardware
- function, since the initial state is its own state argument for the first
- call (unless you add an explicit, synchronous reset port).
-
- External init state is natural for simulation.
+ Roughly, we have two ways to achieve this:
+ \startitemize[KR]
+ \item Use some kind of annotation method or syntactic construction in
+ the language to indicate exactly which argument and (part of the)
+ result is stateful. This means that the annotation lives
+ \quote{outside} of the function, it is completely invisible when
+ looking at the function body.
+ \item Use some kind of annotation on the type level, \eg give stateful
+ arguments and (part of) results a different type. This has the
+ potential to make this annotation visible inside the function as well,
+ such that when looking at a value inside the function body you can
+ tell if it's stateful by looking at its type. This could possibly make
+ the translation process a lot easier, since less analysis of the
+ program flow might be required.
+ \stopitemize
- External init state works for hardware generation as well.
+ From these approaches, the type level \quote{annotations} have been
+ implemented in Cλash. \in{Section}[sec:prototype:statetype] expands on
+ the possible ways this could have been implemented.
- Implementation issues: state splitting, linking input to output state,
- checking usage constraints on state variables.
+ TODO: Say something about dependent types and fixed size vectors
\section[sec:recursion]{Recursion}
An import concept in functional languages is recursion. In it's most basic
Due to these complications, we leave other forms of recursion as
future work as well.
+
+ \section{Supported types}
+ TODO