From 84d984a32326eefa22249bd377e452aed14671df Mon Sep 17 00:00:00 2001 From: Matthijs Kooijman Date: Wed, 20 May 2009 11:50:12 +0200 Subject: [PATCH] Add a few progress documents. These documents contain miscellaneous notes from the early stages of the project. --- progress-2009.01.14.txt | 125 ++++++++++++++++++++ progress-2009.01.28.txt | 245 ++++++++++++++++++++++++++++++++++++++++ progress-2009.02.04.txt | 211 ++++++++++++++++++++++++++++++++++ 3 files changed, 581 insertions(+) create mode 100644 progress-2009.01.14.txt create mode 100644 progress-2009.01.28.txt create mode 100644 progress-2009.02.04.txt diff --git a/progress-2009.01.14.txt b/progress-2009.01.14.txt new file mode 100644 index 0000000..cc70be2 --- /dev/null +++ b/progress-2009.01.14.txt @@ -0,0 +1,125 @@ +Language Design +=============== +A central question is, what do we want our language to look like? Deciding to +use haskell would be a first step, but that still leaves a lot of +possibilities. However, why would we want to use Haskell? + + * Lots of existing support: Language specification, compilers, libraries, + userbase. All of this reduces the workload to implement something. + * All of Haskell is usable in simulation / testing: When not compiling to + VHDL, but when doing functional tests, we can use all of Haskell's syntax, + libraries, functions, etc. even when we don't support all of it in our + compiler. + * Haskell is probably expressive enough to be able to express a lot of + different things and different approaches. By defining new (algebraic) + types and operators, we can extend this even further. Finally, we could + even use Template Haskell to do almost anything (The ForSyDe project uses + Template Haskell extensively). + +Typing +------ +What kind of types will we be using in our programs? The base type should be +some kind of bit type, but there are two main options for these: + + * Use a single bit as a type. Calculations on these types represent the + calculations in a single time slice (ie, during one clock cycle). Any + state in the calculation is explicit, since the input state must be a + function argument and the output state part of the function's value. + + Through the state values, these functions are closely tied to the clock + domain in use (or whatever other model is used for the time domain). Ie, + having different clock domains is non-trivial and will require + modification to the function. + + Composition can be done by simply connection functions together. This does + require that the "state" of a function contains the states of each of the + function it calls. These are extracted from its state argument and passed + on to the callees, while the states returned are again inserted into the + resulting state of the outer function. + + This approach is the closest to normal functional programming and shows + state very explicitly. + + * Use a stream of bits as a type. Each value models a continuous (infinite) + stream of bits. Functions have streams as arguments and value and state + could probably be implicit in the calculation or could be made explicit + using delay elements (this is what Lava does). + + The primitive operations all operate on streams, which seems the + fundamental difference between this and the previous approach. + + This approach shows structure more explicitly when delay elements are + used, since the registers end up exactly there. + +It is probably possible to mix these approaches. In particular, the first +approache eventually needs to do some I/O, which can be modeled as recursion +over an infinite list of inputs resulting in an infinite list of outputs. This +can be done at the top level always, but it is perhaps possible to move this +recursion a bit downward and do composition based on stream functions instead +of "single clock" functions. One problem that comes to mind is that one needs +to guarantee that a function does not look into the future (ie, uses multiple +input elements to generate one output element) and the state becomes a lot +less explicit. This needs more thought. + +Implementation +============== +Assume we will want to translate a (subset of) haskell into something else. +We will need to do parsing, typechecking, interpretation, etc. A few options +for this are available: + + * Doing everything manually, possibly using a parser generator. + Lot's of work. Requires us to write down the haskell grammar, create a + parser, do all kinds of funky type checking -> Too much work. + * Use the Language.Haskell library and write something in haskell. This + library contains a Lexer, Parser and AST, but still does not do any type + checking, desugaring and simplification. Still lots of work. + * Hack into an existing compiler. By writing a backend for an existing + compiler, we can save most of the haskell-specific work. This requires a + compiler that is modular enough to have its backend replaced by another + implementation. Also, the border between backend and frontend must be + suited for the work we would like to do. This approach would use the + original compiler's driver for the most part + * Use components from an existing compiler. By reusing selected components + and creating an own driver that uses these components, we can work with + the haskell source without doing all the work ourselves. This requires a + compiler that is very modular, so we can decide exactly which parts we + would like to use and which we don't need. + +After some looking around it seems GHC provides a consistent and (as of +version 6.10) documented API into its internals This allows us to write + + core <- compileToCoreSimplified "adder.hs" + +to compile the file `adder.hs` into a version in GHC's simplified CoreSyn +representation. This is essentially an AST with type annotations, all +syntactic sugar removed, redundant expressions simplified, etc. This seems to +be a very useful form to work with. However, GHC provides other functions to +access other parts of the compiler, so if we would like to access the source +at another stage, or do some custom procesing in between, this is also +possible (albeit a bit more complex than this). + +Problems +-------- +Some problems encountered or expected when trying to do some initial +translator hackup: + + * Signal naming. In the generated VHDL the signals and ports need names. + These can of course be generated, but if we want to do some actual + debugging, we want useful signal names. For input ports generated from + function arguments, the binder name could perhaps be used, but that is not + always possible when patterns are more complex. + + For output ports, a name could be deduced when the output value is always + computed in a where or let clause. However the simplification phase + removes these clauses where possible, so this information is lost. + + Perhaps some way of explicitely labeling ports could be used. + * Type classes and universal qualification types are a powerful way of + making functions generic, which greatly increases their reusability. + However, this extra genericity makes the resulting Core representation + more complex. Since the actual types used are modeled as extra function + arguments, this shouldn't be too hard, but this is something that probably + needs implementation right away (because a lot of builtin haskell + constructs, such as tuples, use it). + + diff --git a/progress-2009.01.28.txt b/progress-2009.01.28.txt new file mode 100644 index 0000000..e690697 --- /dev/null +++ b/progress-2009.01.28.txt @@ -0,0 +1,245 @@ +How to model hardware? +====================== +So far, we've had some ideas about how to model hardware using a functional +language. The general idea is that every function that models hardware has +some input signals, some "input" state and a few output signals and "output" +state. But how do we represent this exactly in our haskell code? + +The output part is probably easiest, since a function always returns a single +value. Since it needs to return both output signals and a state, it seems +obvious to make it always return a 2-tuple containing the state and output +signals. So: + + circuit :: ? -> (State, Signals) (1) + +It's interesting to note that the State and Signal types here are not really +important when simulating, so let's look at the inputs first. + +Again, we have both input signals and a state to pass to the function. It +seems consistent to again do this a single tuple, so a circuit becomes: + + circuit :: (State, Signals) -> (State, Signals) (2) + +I've not given this variant a lot of thought yet, but i have the feeling this +is not all to useful. The variant I've been using so far separates the state +and signals into separate arguments: + + circuit :: Signals -> State -> (State, Signals) (4) + +Note that this uses a seemingly inconsistent order for the arguments, but +that's mostly because the current implementation uses this order (and wasn't +well thought out). The exact order is mostly relevant for partial application, +but I'm not quite sure if that's possible or useful at all. + +This variant makes it easy to define a dependent type for all circuits, which +you pass the input, state and output types as an argument: + + type Circuit a s b = a -> s -> (s, b) (5) + +A last variant would be to separate each (or perhaps only some) input signals +into different arguments, such that each input has its own argument. For +example, for a two input port circuit: + + circuit :: Signal -> Signal -> State -> (State, Signals) (6) + +This has the obvious advantage that it could support partial application of +the input ports (ie, connect a few input pins to a circuit and pass the +resulting function elsewhere). I'm not sure how that would or should work out, +but it seems useful. + +However, this partial application seems only possible with input ports, which +might limit its usefuleness. Also, this type signature makes it harder to +define a single Circuit type that catches all circuits as with (4) and (5). +This might make simulation more tricky, though it's probably not critical. + +Signals (or Ports?) +------------------- +Now, regardless of which of the above approaches we choose, there should be +some way to model these signals and ports. So far, a basic Bit signal type +has been used: + + data Bit = High | Low | DontCare (7) + +For now, this is the only primitive type supported by the (Haskell-to-VHDL) +translator, but as pointed out above the simulator doesn't really care about +these types (as long as the input provided is of the right type). Supporting +native Haskell types (or perhaps our own variants, *e.g.*, an Int with a fixed +bit width) is probably needed later on as well. + +To allow for multiple input and output ports in a single value, tuples can be +used. This allows for grouping multiple signals while still allowing each +signal to have a different type. + +An alternative that has not been explored yet is the use of an algebraic type +to group signals. This seems to have the direct advantage of allowing names +for each element to be specified, but does add some overhead in type +declarations. It's likely that a combination of tuples and algebraic types are +best. + +It appears that lists might become useful to model VHDL (bit)vectors, since +both lists and vectors have elements of the same type. This does require that +the length of a list can be deduced at compile time. + +DontCare +-------- +In the above definition of the Bit datatype a DontCare element was present. +The idea behind this element was to assign it anywhere the value of the bit +shouldn't matter (in the alu example, this is but on the input signals that +should be ignored). Since it is not used in any other place (*e.g.*, patterns +usually only match Low and High, including the display functions), Haskell +should error out as soon as a DontCare value is actually used. Due to the lazy +nature of Haskell's evaluation, this should never happen as long as the value +is not really used. But as soon as the value is (indirectly) needed, *i.e.*, +contributes to the output of a circuit, Haskell errors out. + +Some functions ignore some inputs. For example, hwor is defined as: + + High `hwor` _ = High (8) + _ `hwor` High = High + Low `hwor` Low = Low + +This means that High `hwxor` DontCare will still be High (with no error), but +that DontCare `hwand` DontCare will immediately throw an error. This seems to +be desired behaviour, but this needs more investigation. + +State +----- +For state, we can mostly do the same reasoning as for signals (from the +circuit description point of view, input state and output state is hardly any +differen from input signals and output signals). For combining multiple +elements of state (either because a circuit has multiple Bits of state or +because it combines the states of each of its subcircuits), tuples can be +used. Algebraic types might have a use here as well. + +Stateless circuits +------------------ +Above we've concentrated on stateful circuits, but probably a lot of circuits +will be stateless. This can simply be modeled by having an empty state +(*i.e.*, () ), but that's not very elegant. However, not all of the above +approaches are usable for supporting stateless circuits. For example, if we +adapt (2) to be stateless: + + stateless_circuit :: Signals -> Signals (9) + +Now, considering that we allowed Signals to be a tuple of multiple signals, +the following two circuit types are indistinguishable, since the State and +Signal types are the same. + + stateless_circuit :: (Signal, Signal) -> (Signal, Signal) (10) + stateful_circuit :: (State, Signal) -> (State, Signal) + +Something similar goes for (6), where leaving out the state for stateless +circuits would make a stateless circuit with two inputs and two output +indistinguishable from a stateful circuit with a single input and a single +output. + +(4) above seems best suited for adapting for stateless circuits, since it +always has two arguments (inputs and state). A stateless variant would always +have a single argument. + +Examples +-------- +To get a feeling for things, here are some example using approach (6). + +Some stateless adders: + + -- Combinatoric stateless no-carry adder + -- A -> B -> S + no_carry_adder :: (Bit, Bit) -> Bit + no_carry_adder (a, b) = a `hwxor` b + + -- Combinatoric stateless half adder + -- A -> B -> (S, C) + half_adder :: (Bit, Bit) -> (Bit, Bit) + half_adder (a, b) = + ( a `hwxor` b, a `hwand` b ) + + -- Combinatoric stateless full adder + -- (A, B, C) -> (S, C) + full_adder :: (Bit, Bit, Bit) -> (Bit, Bit) + full_adder (a, b, cin) = (s, c) + where + s = a `hwxor` b `hwxor` cin + c = a `hwand` b `hwor` (cin `hwand` (a `hwxor` b)) + +A simple four bit cyclic shift register, which has an input that can be used +to toggle (xor) the first position. + + type ShifterState = [Bit] + shifter :: Bit -> ShifterState -> (ShifterState, Bit) + shifter i s = + (s', o) + where + s' = (o `hwxor` i) : (init s) + o = last s + +An implementation of a simple ALU (supporting only two operations: or & and on +two one-bit inputs) combined with a register bank containing two one-bit +registers. It also contains a list of inputs (the program), an initial state +and some wrappers around the simulation functions (main and mainIO). + + main = Sim.simulate exec program initial_state + mainIO = Sim.simulateIO exec initial_state + + program = [ + -- (addr, we, op) + (High, Low, High), -- z = r1 and t (0) ; t = r1 (1) + (Low, Low, Low), -- z = r0 or t (1); t = r0 (0) + (Low, High, DontCare), -- r0 = z (1) + (High, Low, High), -- z = r1 and t (0); t = r1 (1) + (High, High, DontCare) -- r1 = z (0) + ] + + initial_state = ((Low, High), (), Low, Low) + + -- Register bank + + type RegAddr = Bit + type RegisterBankState = (Bit, Bit) + register_bank :: + (RegAddr, Bit, Bit) -> -- (addr, we, d) + RegisterBankState -> -- s + (RegisterBankState, Bit) -- (s', o) + + register_bank (Low, Low, _) s = -- Read r0 + (s, fst s) + + register_bank (High, Low, _) s = -- Read r1 + (s, snd s) + + register_bank (addr, High, d) s = -- Write + (s', DontCare) + where + (r0, r1) = s + r0' = if addr == Low then d else r0 + r1' = if addr == High then d else r1 + s' = (r0', r1') + + -- ALU + + type AluState = () + type AluOp = Bit + + alu :: (AluOp, Bit, Bit) -> AluState -> (AluState, Bit) + alu (High, a, b) s = ((), a `hwand` b) + alu (Low, a, b) s = ((), a `hwor` b) + + type ExecState = (RegisterBankState, AluState, Bit, Bit) + exec :: (RegAddr, Bit, AluOp) -> ExecState -> (ExecState, ()) + + -- Read & Exec + exec (addr, Low, op) s = + (s', ()) + where + (reg_s, alu_s, t, z) = s + (reg_s', t') = register_bank (addr, Low, DontCare) reg_s + (alu_s', z') = alu (op, t', t) alu_s + s' = (reg_s', alu_s', t', z') + + -- Write + exec (addr, High, op) s = + (s', ()) + where + (reg_s, alu_s, t, z) = s + (reg_s', _) = register_bank (addr, High, z) reg_s + s' = (reg_s', alu_s, t, z) diff --git a/progress-2009.02.04.txt b/progress-2009.02.04.txt new file mode 100644 index 0000000..0b475d1 --- /dev/null +++ b/progress-2009.02.04.txt @@ -0,0 +1,211 @@ +What is a function? +=================== +In general, a function is something providing a mapping from a number of input +arguments to a number of results. Each of these arguments and results can be +normal values, but also again functions. + +Function application is evaluating the mapping for a given set of input +operands. + +Looking at this from a hardware modeling perspective, we get different +results. Then, a function is essentially a template for a circuit, with each +of the arguments and results representing input and output ports. Optionally, +some of the arguments and results (always matched pairs) can become part of +the state, meaning that they generate a register that connects to the argument +and result concerned. + +Function application, then, is embedding the function's circuit template into +the callers circuit template, connecting any arguments and results to the +circuit's ports. This embedding will completely decouple the applied function +from the original function: The circuit may look alike, but different +applications have no connection or shared components in the final hardware. + +This is an important property, since it allows the caller complete freedom in +inlining, simplifying and changing any functions it applies. Or more +specifically, a caller can clone a function and modify the clone freely +(since it is the only caller of it). Since every application is unrelated to +all others, cloning a function has no extra cost whatsoever (unlike function +duplication in software, where the code size greatly increases). + +The above view works only for functions accepting and returning primitive +values, *i.e.*, that can be represented as (possibly multiple) VHDL signals +directly. In particular, functions cannot be used as arguments or returned +from functions. + +High order functions and partial evaluation +=========================================== +Before wondering how we implement high order functions and partial evaluation, +let's see where we would like to use them. + +Modifying function signature +---------------------------- +A simple application of high order functions is modifying a function's +signature. An example of this is the (un)curry function. In hardware, a +stateless function could be used to make a stateless function conform to the +standard signature and make it synthesisable / simulatable: + + stateless :: (a -> b) -> (a -> () -> ((), b)) + stateless f i s = (s, f i) + +This would be used to make some function foo statefull as follows: + + stateful_foo = stateless foo + +Another example would be to swap the order of inputs: + + swap_in :: ((a, b) -> o) -> ((b, a) -> o) + swap_in f (b, a) = f (a, b) + +Both of these examples take a function as argument, but return plain values +(though swap_in could return a high order function when the function passed in +has multiple arguments). + +Reusing common code +------------------- +When modeling an ALU, we could imagine generalizing the ALU operations and +making the control code independent of the actual operation. + +The following alu models a circuit with three input: A one bit opcode and two +inputs. The opcode selects between two operations, bitwis and and bitwise or. + + and_or_alu :: (Bit, Bit, Bit) -> BIt + and_or_alu (o, a, b) = + if o == Low + then hwand a b + else hwor a b + +Now, we might want to generalize this to be independent of the actual +operations peformed: + + type operation = Bit -> Bit -> Bit + alu :: operation -> operation -> (Bit, Bit, Bit) -> Bit + + alu op1 op2 (o, a, b) = + if o == Low + then op1 a b + else op2 a b + +Now, we can write the above specific alu as: + + and_or_alu = alu and or + +Parameterized circuits +---------------------- +This is probably a variant on the above, but using literal parameters instead +of functions as parameters. + +Finding a good example of this seems hard, since a lot of parameterized +circuits are implicitly parameterized using the length of lists in their +input, output or state. + +A possible example could be a program counter, that is a incrementing +register. The amount by which to increment the register depends on the +instruction word size, which you might want to make generic. + + pc :: Int -> Word32 -> (Word32, Word32) + pc n s = + (s', o) + where + s' = s + n + o = s + +Since this example is only useful when using some kind of integers instead of +just Bits, this assumes Word32 can actually be used. + +Partial application +------------------- +Say we have some encryption circuit, that encrypts a stream by xor'ing it with +some key stream. + + crypt :: (s -> (s, Bit)) -> Bit -> s -> (s, Bit) + crypt f i s = + (s', o) + where + (s', key) = f s + o = hwxor i key + +Now, we have some function taking in some source of entropy (modeled as a +signal of type Foo) and uses that to generate a (pseudo) random keystream. The +implementation of this function is not so important. Also, the type KeyState +represents the state of mkKey, but what it is exactly doesn't really matter. + + mkKey :: Foo -> KeyState -> (KeyState, Bit) + +Now, I can create a full crypt system as follows. This function has a Foo and +Bit input signals and a Bit output signal. + + crypt_full :: Foo -> Bit -> KeyState -> (KeyState, Bit) + crypt_full foo i s + crypt keygen i s + where + keygen = mkKey foo + +Now, this example isn't particularly good, since it would be just as easy to +simply make crypt take a Bit as input instead of a function generating bits +(This is probably also how you would synthesize something like this). Perhaps +more useful examples could be found, where passing a function makes notation +a lot easier than passing the function's (remaining) inputs and outputs +around. + +Preconnecting ports +------------------- +Sometimes it is useful to connect some ports of multiple similar pieces +of hardware to the same signal. For example, the following builds a +multibit multiplexer by mapping a bunch of single bit ones together. +Since all multiplexers use the same select input, we pass it to mplex_1 +and then pass the result to map. This makes the code clearer than +replicating the select input (length abs) times and using zipWith +instead of map. + + mplex_1 :: Bit -> (Bit, Bit) -> Bit + mplex_1 enable select (a, b) = + if select == Low then a else b + + mplex :: Bit -> [(Bit, Bit)] -> [Bit] + mplex select abs = + map (mplex_1 select) abs + +State ambiguity +=============== +Previously, we've concluded that choosing to translate an argument or result +to either a state variable or port would be possible by requiring a specific +type signature on the top level function and hierarchically looking which +variables are thus also state variables. + +In practice, this appears harder than we thought. We can observe that a state +variable should resolve to a register, but where exactly is undefined. In +particular, we could simply translate all arguments and results to ports, and +on the highest level instantiate registers for all state variables. +Functionally, this is completely correct, but this is probably not what we +want in most cases. Each function that has state will probably want to have +that state represented inside it's architecture, rather than to have two ports +which should be connected to a register externally. + +This means we need to "push" the registers down the hierarchy to get them +where they "belong". Somewhere down the line, however, a state variable will +eventually be used in some way, often by passing it to another function +(*i.e.*, connecting the output of the register to some other circuit). The +place where this happens is also the place where you want the register to end +up. + +In a lot of cases, this place is also as far as you can push the register down +(when you pass the current register value to some function a, but bind the new +register value to the result of some function b, you can't push further down +since both a and b use the register). The tricky part is now when you pass a +state value to some function a and that same function a produces the new value +of the state value. Is the state value now really a state of the function a, +or did the programmer just mean to connect a register to function a +externally? + +Note that this assumes some clever implementation for finding out if the state +can be pushed down, which takes things into account like "is the current state +only used once", "is the new state directly produced by the application that +also uses the state", "which state value in the arguments belongs to which +value in the function result". Actually implementing this will probably need +some extra thought, initially having all the state at the top level is +probably fine. + +It's probable that this case will not occur that often, and some extra +heuristics might help to guess the right thing more often. Adding some "force +state here" primitive will probably possible as well for final disambiguation. +Still, it would nicer to really solve this problem somehow. -- 2.30.2