X-Git-Url: https://git.stderr.nl/gitweb?p=matthijs%2Fmaster-project%2Freport.git;a=blobdiff_plain;f=Chapters%2FFuture.tex;h=6be0ba8be0d465adc19063ffa2030c9779d016cd;hp=10a883ae145bbbfbfd8ed432430caa2f8e0f44cf;hb=48f4c2d4c18166aac6485e1c4ce759940c320bb2;hpb=9bb3281de202ae2e33675885eaca9936c76a8a82 diff --git a/Chapters/Future.tex b/Chapters/Future.tex index 10a883a..6be0ba8 100644 --- a/Chapters/Future.tex +++ b/Chapters/Future.tex @@ -1,4 +1,4 @@ -\chapter{Future work} +\chapter[chap:future]{Future work} \section{Improved notation for hierarchical state} The hierarchic state model requires quite some boilerplate code for unpacking and distributing the input state and collecting and repacking the output @@ -9,7 +9,7 @@ flexibility cost at all, since there are no other ways that would work. Options: Misuse the do notation in Haskell, create some abstraction in Haskell or add new syntax. -\section{Improved notation or abstraction for pipelining} +\section[sec:future:pipelining]{Improved notation or abstraction for pipelining} Since pipelining is a very common optimization for hardware systems, it should be easy to specify a pipelined system. Since it involves quite some registers into an otherwise regular combinatoric system, we might look for some way to @@ -20,3 +20,328 @@ Example of naive pipelined code Using monadic do does not fit the typing Using custom combinators would work + +\section{Recursion} +The main problems of recursion have been described in +\in{section}[sec:recursion]. In the current implementation, recursion is +therefore not possible, instead we rely on a number of implicit list-recursive +builtin functions. + +Since recursion is a very important and central concept in functional +programming, it would very much improve the flexibility and elegance of our +hardware descriptions if we could support full recursion. + +For this, there are two main problems to solve: + +\startitemize +\item For list recursion, how to make a description type check? It is quite +possible that improvements in the \small{GHC} typechecker will make this +possible, though it will still stay a challenge. Further advances in +dependent typing support for Haskell will probably help here as well. + +TODO: Reference Christiaan and other type-level work +(http://personal.cis.strath.ac.uk/conor/pub/she/) +\item For all recursion, there is the obvious challenge of deciding when +recursion is finished. For list recursion, this might be easier (Since the +base case of the recursion influences the type signatures). For general +recursion, this requires a complete set of simplification and evaluation +transformations to prevent infinite expansion. The main challenge here is how +to make this set complete, or at least define the constraints on possible +recursion which guarantee it will work. + +TODO: Loop unrolling +\stopitemize + +\section{Multiple clock domains and asynchronicity} +Cλash currently only supports synchronous systems with a single clock domain. +In a lot of real-world systems, both of these limitations pose problems. + +There might be asynchronous events to which a system wants to respond. The +most obvious asynchronous event is of course a reset signal. Though a reset +signal can be synchronous, that is less flexible (and a hassle to describe in +Cλash, currently). Since every function in Cλash describes the behaviour on +each cycle boundary, we really can't fit in asynchronous behaviour easily. + +Due to the same reason, multiple clock domains cannot be supported. There is +currently no way for the compiler to know in which clock domain a function +should operate and since the clock signal is never explicit, there is also no +way to express circuits that synchronize various clock domains. + +A possible way to express more complex timing behaviour would be to make +functions more generic event handlers, where the system generates a stream of +events (Like \quote{clock up}, \quote{clock down}, \quote{input A changed}, +\quote{reset}, etc.). When working with multiple clock domains, each domain +could get its own events. + +As an example, we would have something like the following: + +\starthaskell +data Event = ClockUp | Reset | ... + +type MainState = State Word + +-- Initial state +initstate :: MainState +initstate = State 0 + +main :: Word -> Event -> MainState -> (MainState, Word) +main inp event (State acc) = (State acc', acc') + where + acc' = case event of + -- On a reset signal, reset the accumulator and output + Reset -> initstate + -- On a normal clock cycle, accumulate the result of func + ClockUp -> acc + (func inp event) + -- On any other event, leave state and output unchanged + _ -> acc + +-- func is some combinatoric expression +func :: Word -> Event -> Word +func inp _ = inp * 2 + 3 +\stophaskell + +TODO: Picture + +In this example, we see that every function takes an input of type +\hs{Event}. The function \hs{main} that takes the output of +\hs{func} and accumulates it on every clock cycle. On a reset signal, the +accumulator is reset. The function \hs{func} is just a combinatoric +function, with no synchronous elements. We can see this because there is no +state and the event input is completely ignored. If the compiler is smart +enough, we could even leave the event input out for functions that don't use +it, either because they are completely combinatoric (like in this example), or +because they rely on the the caller to select the clock signal. + +This structure is similar to the event handling structure used to perform I/O +in languages like Amanda. TODO: Ref. There is a top level case expression that +decides what to do depending on the current input event. + +A slightly more complex example that shows a system with two clock domains. +There is no real integration between the clock domains (there is one input and +one output for each clock domain), but it does show how multiple clocks can be +distinguished. + +\starthaskell +data Event = ClockUpA | ClockUpB | ... + +type MainState = State (SubAState, SubBState) + +-- Initial state +initstate :: MainState +initstate = State (initsubastate, initsubbstate) + +main :: Word -> Word -> Event -> MainState -> (MainState, Word, Word) +main inpa inpb event (State (sa, sb)) = (State (sa', sb'), outa, outb) + where + -- Only update the substates if the corresponding clock has an up + -- transition. + sa' = case event of + ClockUpA -> sa'' + _ -> sa + sb' = case event of + ClockUpB -> sb'' + _ -> sb + -- Always call suba and subb, so we can always have a value for our output + -- ports. + (sa'', outa) = suba inpa sa + (sb'', outb) = subb inpb sb + +type SubAState = State ... +suba :: Word -> SubAState -> (SubAState, Word) +suba = ... + +type SubBState = State ... +subb :: Word -> SubAState -> (SubAState, Word) +subb = ... +\stophaskell + +Note that in the above example the \hs{suba} and \hs{subb} functions are +\emph{always} called, to get at their combinatoric outputs. The new state +is calculated as well, but only saved when the right clock has an up +transition. + +As you can see, there is some code duplication in the case statement that +selects the right clock. One of the advantages of an explicit approach like +this, is that some of this duplication can be extracted away into helper +functions. For example, we could imagine a \hs{select_clock} function, which +takes a stateful function that is not aware of events, and changes it into a +function that only updates its state on a specific (clock) event. + +\starthaskell +select_clock :: Event + -> (input -> State s -> (State s, output)) + -> (input -> State s -> Event -> (State s, output)) +select_clock clock func inp state event = (state', out) + where + state' = if clock == event then state'' else state + (state'', out) = func inp state + +main :: Word -> Word -> Event -> MainState -> (MainState, Word, Word) +main inpa inpb event (State (sa, sb)) = (State (sa', sb'), outa, outb) + where + (sa'', outa) = (select_clock ClockUpA suba) inpa sa event + (sb'', outb) = (select_clock ClockUpB subb) inpb sb event +\stophaskell + +As you can see, this can greatly reduce the length of the main function, while +increasing the readability. As you might also have noted, the select\_clock +function takes any stateful function from the current Cλash prototype and +turns it into an event-aware function! + +Going along this route for more complex timing behaviour seems promising, +espicially since it seems possible to express very advanced timing behaviours, +while still allowing simple functions without any extra overhead when complex +behaviour is not needed. + +The main cost of this approach will probably be extra complexity in the +compiler: The paths (state) data can take become very non-trivial, and it +is probably hard to properly analyze these paths and produce the intended VHDL +description. + +\section{Multiple cycle descriptions} +In the current Cλash prototype, every description is a single-cycle +description. In other words, every function describes behaviour for each +separate cycle and any recursion (or folds, maps, etc.) is expanded in space. + +Sometimes, you might want to have a complex description that could possibly +take multiple cycles. Some examples include: + +\startitemize + \item Some kind of map or fold over a list that could be expanded in time + instead of space. This would result in a function that describes n cycles + instead of just one, where n is the lenght of the list. + \item A large combinatoric expressions that would introduce a very long + combinatoric path and thus limit clock frequency. Such an expression could + be broken up into multiple stages, which effectively results in a pipelined + system (see also \in{section}[sec:future:pipelining]) with a known delay. + There should probably be some way for the developer to specify the cycle + division of the expression, since automatically deciding on such a division + is too complex and contains too many tradeoffs, at least initially. + \item Unbounded recursion. It is not possible to expand unbounded (or even + recursion with a depth that is not known at compile time) in space, since + there is no way to tell how much hardware to create (it might even be + infinite). + + When expanding infinite recursion over time, each step of the recursion can + simply become a single clockcycle. When expanding bounded but unknown + recursion, we probably need to add an extra data valid output bit or + something similar. +\stopitemize + +Apart from translating each of these multiple cycle descriptions into a per-cycle +description, we also need to somehow match the type signature of the multiple +cycle description to the type signature of the single cycle description that +the rest of the system expects (assuming that the rest of the system is +described in the \quote{normal} per-cycle manner). For example, an infinitely +recursive expression typically has the return type \lam{[a]}, while the rest +of the system would expect just \lam{a} (since the recursive expression +generates just a single element each cycle). + +Naively, this matching could be done using a (builtin) function with a +signature like \lam{[a] -> a}, which also serves as an indicator to the +compiler that some expanding over time is required. However, this poses a +problem for simulation: How will our Haskell implementation of this magical +builtin function know which element of the input list to return. This +obviously depends on the current cycle number, but there is no way for this +function to know the current cycle without breaking all kinds of safety and +purity properties. Making this function have some state input and output could +help, though this state is not present in the actual hardware (or perhaps +there is some state, if there are value passed from one recursion step to the +next, but how can the type of that state be determined by the typechecker?). + +It seems that this is the most pressing problem for multicycle descriptions: +How to interface with the rest of the system, without sacrificing safety and +simulation capabilities? + +\section{Higher order values in state} +Another interesting idea is putting a higher order value inside a function's +state value. Since we can use higher order values anywhere, why not in the +state? + +As a (contrived) example, consider the following accumulator: + +\starthaskell +-- The accumulator function that takes a word and returns a new accumulator +-- and the result so far. This is the function we want to put inside the +-- state. +type Acc = Word -> (Acc, Word) +acc = \a -> (\b -> acc ( a + b ), a ) + +main :: Word -> State Acc -> (State Acc, Word) +main a s = (State s', out) + where (s', out) = s a +\stophaskell + +This code uses a function as its state, which implicitely contains the value +accumulated so far. This is a fairly trivial example, that is more easy to +write with a simple \hs{Word} state value, but for more complex descriptions +this style might pay off. Note that in a way we are using the +\emph{continuation passing style} of writing functions, where we pass a +sort of \emph{continuation} from one cycle to the next. + +Our normalization process completely removes higher order values inside a +function by moving applications and higher order values around so that every +higher order value will eventually be full applied. However, if we would put a +higher order value inside our state, the path from the higher order value +definition to its application runs through the state, which is external to the +function. A higher order value defined in one cycle is not applied until a +later cycle. Our normalization process only works within the function, so it +cannot remove this use of a higher order value. + +However, we cannot leave a higher order value in our state value, since it is +impossible to generate a register containing a higher order value, we simply +can't translate a function type to a hardware type. To solve this, we must +replace the higher order value inside our state with some other value +representing these higher order values. + +On obvious approach here is to use an algebraic datatype where each +constructor represents one possible higher order value that can end up in the +state and each constructor has an argument for each free variable of the +higher order value replaced. This allows us to completely reconstruct the +higher order value at the spot where it is applied, and thus the higher order +value disappears. + +This approach is commonly known as the \quote{Reynolds approach to +defuntionalization}, first described by J.C. Reynolds and seems to apply well +to this situation. One note here is that Reynolds' approach puts all the +higher order values in a single datatype. For a typed language, we will at +least have to have a single datatype for each function type, since we can't +mix them. It would be even better to split these datatypes a bit further, so +that such a datatype will never hold a constructor that is never used for a +particular state variable. This separation is probably a non-trivial problem, +though. + +TODO: Reference "Definitional interpreters for higher-order programming +languages": +http://portal.acm.org/citation.cfm?id=805852\&dl=GUIDE\&coll=GUIDE\&CFID=58835435\&CFTOKEN=81856623 + +\section{New language} +During the development of the prototype, it has become clear that Haskell is +not the perfect language for this work. There are two main problems: +\startitemize +\item Haskell is not expressive enough. Haskell is still quite new in the area +of dependent typing, something we use extensively. Also, Haskell has some +special syntax to write monadic composition and arrow composition, which is +well suited to normal Haskell programs. However, for our hardware +descriptions, we could use some similar, but other, syntactic sugar (as we +have seen above). +\item Haskell is too expressive. There are some things that Haskell can +express, but we can never properly translate. There are certain types (both +primitive types and certain type constructions like recursive types) we can +never translate, as well as certain forms of recursion. +\stopitemize + +It might be good to reevaluate the choice of language for Cλash, perhaps there +are other languages which are better suited to describe hardware, now that +we've learned a lot about it. + +Perhaps Haskell (or some other language) can be extended by using a +preprocessor. There has been some (point of proof) work on a the Strathclyde +Haskell Enhancement (\small{SHE}) preprocessor for type-level programming, +perhaps we can extend that, or create something similar for hardware-specific +notation. + +It is not unlikely that the most flexible way +forward would be to define a completely new language with exactly the needed +features. This is of course an enormous effort, which should not be taken +lightly.