From: Matthijs Kooijman
Date: Wed, 11 Nov 2009 16:43:58 +0000 (+0100)
Subject: Expand the improved notation for state and pipelining sections.
X-Git-Tag: final-thesis~158
X-Git-Url: https://git.stderr.nl/gitweb?a=commitdiff_plain;h=4a1874f15102d7ad201d9f1f7fa8800bebff4ffc;p=matthijs%2Fmaster-project%2Freport.git

Expand the improved notation for state and pipelining sections.
---
diff --git a/Chapters/Future.tex b/Chapters/Future.tex
index 6be0ba8..f59e034 100644
--- a/Chapters/Future.tex
+++ b/Chapters/Future.tex

\section{Improved notation for hierarchical state}
The hierarchic state model requires quite some boilerplate code for unpacking
and distributing the input state and collecting and repacking the output
state.

\in{Example}[ex:NestedState] shows a simple composition of two stateful
functions, \hs{funca} and \hs{funcb}. The state annotation using the
\hs{State} newtype has been left out, for clarity and because the proposed
approach below cannot handle that (yet).

\startbuffer[NestedState]
  type FooState = ( AState, BState )
  foo :: Word -> FooState -> (FooState, Word)
  foo inp s = (s', outb)
    where
      (sa, sb)    = s
      (sa', outa) = funca inp sa
      (sb', outb) = funcb outa sb
      s'          = (sa', sb')
\stopbuffer
\placeexample[here][ex:NestedState]{Simple function composing two stateful
  functions.}{\typebufferhs{NestedState}}

Since state handling always follows strict rules, there is really no other way
in which it can be done (you can of course move some things around from the
where clause to the pattern or vice versa, but it would still be effectively
the same thing). This means it makes extra sense to hide this boilerplate
away, which would incur no flexibility cost at all, since there are no other
ways that would work.

One particular notation in Haskell that seemed promising was the \hs{do}
notation. It is meant to simplify Monad notation by hiding away some details.
It allows one to write a list of expressions, which are composed using the
monadic \emph{bind} operator, written in Haskell as \hs{>>}. For example, the
snippet:

\starthaskell
do
  somefunc a
  otherfunc b
\stophaskell

will be desugared into:

\starthaskell
(somefunc a) >> (otherfunc b)
\stophaskell

There is also the \hs{>>=} operator, which allows for passing variables from
one expression to the next. If we could use this notation to compose a
stateful computation from a number of other stateful functions, this could
move all the boilerplate code into the \hs{>>} operator. Perhaps the compiler
should be taught to always inline the \hs{>>} operator, but after that no
further changes to the compiler should be required.
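To make the role of \hs{>>=} concrete, using the same placeholder functions as
above (and an arbitrary variable name \hs{x}), a snippet like:

\starthaskell
do
  x <- somefunc a
  otherfunc x
\stophaskell

will be desugared into:

\starthaskell
(somefunc a) >>= (\x -> otherfunc x)
\stophaskell

so the result of \hs{somefunc a} is bound to \hs{x} and available to the
expressions that follow.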
This highlights an important aspect of using a functional language for our
descriptions: we can use the language itself to provide abstractions of common
patterns, making our code smaller.

\subsection{Outside the Monad}
However, simply using the monad notation is not as easy as it sounds. The main
problem is that the \hs{Monad} type class poses a number of limitations on the
bind operator \hs{>>}. Most importantly, it has the following type signature:

\starthaskell
(>>) :: (Monad m) => m a -> m b -> m b
\stophaskell

This means that every expression in our composition must use the same
\hs{Monad} instance as its type; only the returned value (the \hs{a} and
\hs{b} above) can be different between expressions.

Ideally, we would like the \hs{>>} operator to have a type like the following:

\starthaskell
type Stateful s r = s -> (s, r)
(>>) :: Stateful s1 r1 -> Stateful s2 r2 -> Stateful (s1, s2) r2
\stophaskell

What we see here is that when we compose two stateful functions (whose inputs
have already been applied, leaving just the state argument to be applied), the
result is again a stateful function whose state is composed of the two
\emph{substates}. The return value is simply the return value of the second
function, discarding the first (to preserve that one, the \hs{>>=} operator
can be used).

There is a trick we can apply to change the signature of the \hs{>>} operator.
\small{GHC} does not require the bind operators to be part of the \hs{Monad}
type class, as long as it can use them to translate the do notation. This
means we can define our own \hs{>>} and \hs{>>=} operators, outside of the
\hs{Monad} type class. This does conflict with the existing methods of the
\hs{Monad} type class, so we should prevent \small{GHC} from loading those
(and all of the Prelude) by passing \type{-XNoImplicitPrelude} to \type{ghc}.
This is slightly inconvenient, but since we hardly use anything from the
Prelude, this is not a big problem. We might even be able to replace some of
the Prelude with hardware-translatable versions by doing this.

We can now define the following binding operators. For completeness, we also
supply the \hs{return} function, which is not called by \small{GHC}
implicitly, but can be called explicitly by a hardware description.

\starthaskell
(>>) :: Stateful s1 r1 -> Stateful s2 r2 -> Stateful (s1, s2) r2
f1 >> f2 = f1 >>= \_ -> f2

(>>=) :: Stateful s1 r1 -> (r1 -> Stateful s2 r2) -> Stateful (s1, s2) r2
f1 >>= f2 = \(s1, s2) -> let (s1', r1) = f1 s1
                             (s2', r2) = f2 r1 s2
                         in  ((s1', s2'), r2)

return :: r -> Stateful () r
return r = \s -> (s, r)
\stophaskell

As you can see, this closely resembles the boilerplate of unpacking state,
passing it to two functions and repacking the new state. With these
definitions, we could have written \in{example}[ex:NestedState] a lot shorter,
see \in{example}[ex:DoState]. In this example the type signature of \hs{foo}
is the same (though it is now written using the \hs{Stateful} type synonym, it
is still completely equivalent to the original): \hs{foo :: Word -> FooState
-> (FooState, Word)}.

Note that the \hs{FooState} type has changed (so indirectly the type of
\hs{foo} as well). Since the state composition by the \hs{>>} operator works
on two stateful functions at a time, the final state consists of nested
two-tuples. The final \hs{()} in the state originates from the fact that the
\hs{return} function has no real state, but is part of the composition. We
could have left out the return statement (and the \hs{outb <-} part) to make
\hs{foo}'s return value equal to \hs{funcb}'s, but this approach makes it
clearer what is happening.
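To see where this nested state type comes from, it helps to look at how the do
block in \in{example}[ex:DoState] is desugared. Roughly (this expansion is
shown here only for illustration and is not part of the example itself),
\small{GHC} translates it into:

\starthaskell
foo :: Word -> Stateful FooState Word
foo inp = funca inp >>= \outa ->
          funcb outa >>= \outb ->
          return outb
\stophaskell

The inner \hs{>>=} pairs \hs{funcb}'s \hs{BState} with the \hs{()} state of
\hs{return}, and the outer \hs{>>=} then pairs \hs{funca}'s \hs{AState} with
that, yielding the \hs{( AState, (BState, ()) )} state type used in the
example.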
\startbuffer[DoState]
  type FooState = ( AState, (BState, ()) )
  foo :: Word -> Stateful FooState Word
  foo inp = do
    outa <- funca inp
    outb <- funcb outa
    return outb
\stopbuffer
\placeexample[here][ex:DoState]{Simple function composing two stateful
  functions, using do notation.}
  {\typebufferhs{DoState}}

An important implication of this approach is that the order of writing
function applications affects the state type. Fortunately, this problem can be
localized by consistently using type synonyms for state types, which should
prevent changes in other functions' source when a function changes.

A less obvious implication of this approach is that the scope of variables
produced by each of these expressions (using the \hs{<-} syntax) is limited to
the expressions that come after it. This prevents values from flowing between
two functions (components) in two directions. For most Monad instances, this
is a requirement, but here it could have been different.

\subsection{Alternative syntax}
Because of the above issues, misusing Haskell's do notation is probably not
the best solution here. However, it does show that using fairly simple
abstractions, we could hide a lot of the boilerplate code. Extending
\small{GHC} with some new syntax sugar similar to the do notation might be
feasible.

\section[sec:future:pipelining]{Improved notation or abstraction for pipelining}
Since pipelining is a very common optimization for hardware systems, it should
be easy to specify a pipelined system. Since it involves adding quite some
registers to an otherwise regular combinatoric system, we might look for some
way to abstract away some of the boilerplate for pipelining.

Something similar to the state boilerplate removal above might be appropriate:
abstract away some of the boilerplate code using combinators, then hide away
the combinators in special syntax. The combinators will be slightly different,
since there is a (typing) distinction between a pipeline stage and a pipeline
consisting of multiple stages. Also, it seems necessary to treat either the
first or the last pipeline stage differently, to prevent an off-by-one error
in the number of registers (which is similar to the extra \hs{()} state type
in \in{example}[ex:DoState], which is harmless there, but would be a problem
if it introduced an extra, useless, pipeline stage).

This problem is slightly more complex than the problem we've seen before. One
significant difference is that each variable that crosses a stage boundary
needs a register. However, when a variable crosses multiple stage boundaries,
it must be stored for a longer period and should receive multiple registers.
Since we can't tell from the combinator code where the result of each combined
function is used (at least not without using Template Haskell to inspect the
\small{AST}), there seems to be no easy way to find out how many registers are
needed.

There seem to be two obvious ways of handling this problem (a sketch of what
such stage combinators could look like is given after this list):

\startitemize
  \item Limit the scoping of each variable produced by a stage to the next
  stage only. This means that any variables that are to be used in subsequent
  stages should be passed on explicitly, which should allocate the required
  number of registers.

  This produces cumbersome code, where there is still a lot of explicitness
  (though this could be hidden in syntax sugar).
  \item Scope each variable over every subsequent pipeline stage and allocate
  the maximum number of registers that \emph{could} be needed. This means we
  will allocate registers that are never used, but those could be optimized
  away later. This does mean we need some way to introduce a variable number
  of variables (depending on the total number of stages), assign the output of
  a different register to each (\eg, a different part of the state) and scope
  a different one of them over each of the subsequent stages.

  This also means that adding a stage to an existing pipeline will change the
  state type of each of the subsequent pipeline stages, and that the state
  type of the added stage depends on the number of subsequent stages.

  Properly describing this will probably also require quite explicit syntax,
  meaning this is not feasible without some special syntax.
\stopitemize
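To make the combinator idea above a bit more concrete, the following is a
minimal sketch of what such pipeline combinators could look like, reusing the
\hs{Stateful} type synonym from the previous section. The names \hs{Pipeline},
\hs{stage} and \hs{>->} are made up purely for illustration; the sketch
corresponds to the first option above, since any value needed by a later stage
still has to be passed along explicitly through every intermediate stage.

\starthaskell
-- A pipelined computation with input a, output b and register state s,
-- reusing the Stateful type synonym defined above.
type Pipeline s a b = a -> Stateful s b

-- Lift a combinatoric function into a single pipeline stage with one register
-- (of type b) on its output: the new register contents are the value computed
-- this cycle, the stage output is the value stored in the previous cycle.
stage :: (a -> b) -> Pipeline b a b
stage f = \a r -> (f a, r)

-- Compose two pipelines. As with the >> operator above, the state of the
-- result pairs up both substates, so adding a stage changes the state type.
(>->) :: Pipeline s1 a b -> Pipeline s2 b c -> Pipeline (s1, s2) a c
p1 >-> p2 = \a (s1, s2) -> let (s1', b) = p1 a s1
                               (s2', c) = p2 b s2
                           in  ((s1', s2'), c)
\stophaskell

A pipeline could then be written as \hs{stage f >-> stage g >-> stage h},
giving one register per stage in the composed (nested) state type. Note that
this also puts a register after the last stage, so, as noted above, the first
or last stage would probably need to be treated differently to avoid that
extra register.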
Some other interesting issues include pipeline stages which are already
stateful, mixing pipelined with normal computation, etc.

\section{Recursion}
The main problems of recursion have been described in