1 \chapter[chap:prototype]{Prototype}
2 An important step in this research is the creation of a prototype compiler.
3 Having this prototype allows us to apply the ideas from the previous chapter
4 to actual hardware descriptions and evaluate their usefulness. Having a
5 prototype also helps to find new techniques and test possible
8 Obviously the prototype was not created after all research
9 ideas were formed, but its implementation has been interleaved with the
research itself. Also, the prototype described here is the final version; it
11 has gone through a number of design iterations which we will not completely
14 \section[sec:prototype:input]{Input language}
15 When implementing this prototype, the first question to ask is: What
16 (functional) language will we use to describe our hardware? (Note that
17 this does not concern the \emph{implementation language} of the compiler,
18 just the language \emph{translated by} the compiler).
20 Initially, we have two choices:
23 \item Create a new functional language from scratch. This has the
24 advantage of having a language that contains exactly those elements that
25 are convenient for describing hardware and can contain special
constructs that allow our hardware descriptions to be more powerful or
28 \item Use an existing language and create a new backend for it. This has
29 the advantage that existing tools can be reused, which will speed up
33 \todo{Sidenote: No EDSL}
Considering that we needed a working prototype quickly,
and that implementing parsers, semantic checkers and especially
typecheckers is not exactly the core of this research (but it is lots and
lots of work!), using an existing language is the obvious choice. This
39 also has the advantage that a large set of language features is available
40 to experiment with and it is easy to find which features apply well and
41 which don't. A possible second prototype could use a custom language with
42 just the useful features (and possibly extra features that are specific to
43 the domain of hardware description as well).
45 The second choice is which of the many existing languages to use. As
46 mentioned before, the chosen language is Haskell. This choice has not been the
47 result of a thorough comparison of languages, for the simple reason that
48 the requirements on the language were completely unclear at the start of
49 this research. The fact that Haskell is a language with a broad spectrum
50 of features, that it is commonly used in research projects and that the
51 primary compiler, \GHC, provides a high level API to its internals, made
52 Haskell an obvious choice.
54 \section[sec:prototype:output]{Output format}
55 The second important question is: What will be our output format? Since
our prototype won't be able to program \small{FPGA}s directly, we will have to
output our hardware in some format that can later be processed and
58 programmed by other tools.
60 Looking at other tools in the industry, the Electronic Design Interchange
61 Format (\small{EDIF}) is commonly used for storing intermediate
62 \emph{netlists} (lists of components and connections between these
63 components) and is commonly the target for \small{VHDL} and Verilog
66 However, \small{EDIF} is not completely tool-independent. It specifies a
67 meta-format, but the hardware components that can be used vary between
68 various tool and hardware vendors, as well as the interpretation of the
69 \small{EDIF} standard. \todo{Is this still true? Reference:
70 http://delivery.acm.org/10.1145/80000/74534/p803-li.pdf?key1=74534\&key2=8370537521\&coll=GUIDE\&dl=GUIDE\&CFID=61207158\&CFTOKEN=61908473}
72 This means that when working with \small{EDIF}, our prototype would become
73 technology dependent (\eg only work with \small{FPGA}s of a specific
74 vendor, or even only with specific chips). This limits the applicability
of our prototype. Also, the tools we'd like to use for verifying,
simulating and drawing pretty pictures of our output (like Precision, or
77 QuestaSim) are designed for \small{VHDL} or Verilog input.
79 For these reasons, we will not use \small{EDIF}, but \small{VHDL} as our
80 output language. We choose \VHDL over Verilog simply because we are
81 familiar with \small{VHDL} already. The differences between \small{VHDL}
82 and Verilog are on the higher level, while we will be using \small{VHDL}
83 mainly to write low level, netlist-like descriptions anyway.
An added advantage of using \small{VHDL} is that we can profit from existing
optimizations in \small{VHDL} synthesizers. A lot of optimizations are done on the
\small{VHDL} level by existing tools. These tools embody years of experience in this
field, so it would not be reasonable to assume we could achieve a similar
amount of optimization in our prototype (nor should it be a goal,
considering this is just a prototype).
92 Note that we will be using \small{VHDL} as our output language, but will
93 not use its full expressive power. Our output will be limited to using
94 simple, structural descriptions, without any behavioural descriptions
95 (which might not be supported by all tools). This ensures that any tool
96 that works with \VHDL will understand our output (most tools don't support
97 synthesis of more complex \VHDL). This also leaves open the option to
98 switch to \small{EDIF} in the future, with minimal changes to the
101 \section[sec:prototype:design]{Prototype design}
102 As suggested above, we will use the Glasgow Haskell Compiler (\small{GHC}) to
103 implement our prototype compiler. To understand the design of the
compiler, we will first dive into the \small{GHC} compiler a bit. Its
compilation consists of the following steps (slightly simplified):
107 \startuseMPgraphic{ghc-pipeline}
109 save inp, front, desugar, simpl, back, out;
110 newEmptyBox.inp(0,0);
newBox.front(btex Frontend etex);
112 newBox.desugar(btex Desugarer etex);
113 newBox.simpl(btex Simplifier etex);
114 newBox.back(btex Backend etex);
115 newEmptyBox.out(0,0);
117 % Space the boxes evenly
118 inp.c - front.c = front.c - desugar.c = desugar.c - simpl.c
119 = simpl.c - back.c = back.c - out.c = (0, 1.5cm);
122 % Draw lines between the boxes. We make these lines "deferred" and give
123 % them a name, so we can use ObjLabel to draw a label beside them.
124 ncline.inp(inp)(front) "name(haskell)";
125 ncline.front(front)(desugar) "name(ast)";
126 ncline.desugar(desugar)(simpl) "name(core)";
127 ncline.simpl(simpl)(back) "name(simplcore)";
128 ncline.back(back)(out) "name(native)";
129 ObjLabel.inp(btex Haskell source etex) "labpathname(haskell)", "labdir(rt)";
130 ObjLabel.front(btex Haskell AST etex) "labpathname(ast)", "labdir(rt)";
131 ObjLabel.desugar(btex Core etex) "labpathname(core)", "labdir(rt)";
132 ObjLabel.simpl(btex Simplified core etex) "labpathname(simplcore)", "labdir(rt)";
133 ObjLabel.back(btex Native code etex) "labpathname(native)", "labdir(rt)";
135 % Draw the objects (and deferred labels)
136 drawObj (inp, front, desugar, simpl, back, out);
138 \placefigure[right]{GHC compiler pipeline}{\useMPgraphic{ghc-pipeline}}
141 This step takes the Haskell source files and parses them into an
142 abstract syntax tree (\small{AST}). This \small{AST} can express the
143 complete Haskell language and is thus a very complex one (in contrast
144 with the Core \small{AST}, later on). All identifiers in this
145 \small{AST} are resolved by the renamer and all types are checked by the
148 \startdesc{Desugaring}
This step takes the full \small{AST} and translates it to the
\emph{Core} language. Core is a very small functional language with lazy
semantics, that can still express everything Haskell can express. Its
simplicity makes Core very suitable for further simplification and
153 translation. Core is the language we will be working with as well.
155 \startdesc{Simplification}
156 Through a number of simplification steps (such as inlining, common
157 subexpression elimination, etc.) the Core program is simplified to make
158 it faster or easier to process further.
161 This step takes the simplified Core program and generates an actual
runnable program for it. This is a big and complicated step that we will not
discuss any further, since it is not required for our prototype.
In this process, there are a number of places where we can start our work.
167 Assuming that we don't want to deal with (or modify) parsing, typechecking
168 and other frontend business and that native code isn't really a useful
169 format anymore, we are left with the choice between the full Haskell
170 \small{AST}, or the smaller (simplified) core representation.
172 The advantage of taking the full \small{AST} is that the exact structure
173 of the source program is preserved. We can see exactly what the hardware
description looks like and which syntax constructs were used. However,
the full \small{AST} is a very complicated data structure. If we are to
handle everything it offers, we will quickly get a big compiler.
Using the core representation gives us a much more compact data structure
(a core expression only uses 9 constructors). Note that this does not mean
that a program in the core representation is smaller; on the contrary. Since the
core language has fewer constructs, many things take a larger
expression to express.
184 However, the fact that the core language is so much smaller, means it is a
185 lot easier to analyze and translate it into something else. For the same
186 reason, \small{GHC} runs its simplifications and optimizations on the core
187 representation as well.
Note that we will use the normal core representation, not the simplified
190 core. Reasons for this are detailed below. \todo{Ref}
192 The final prototype roughly consists of three steps:
194 \startuseMPgraphic{clash-pipeline}
196 save inp, front, norm, vhdl, out;
197 newEmptyBox.inp(0,0);
198 newBox.front(btex \small{GHC} frontend + desugarer etex);
199 newBox.norm(btex Normalization etex);
200 newBox.vhdl(btex \small{VHDL} generation etex);
201 newEmptyBox.out(0,0);
203 % Space the boxes evenly
204 inp.c - front.c = front.c - norm.c = norm.c - vhdl.c
205 = vhdl.c - out.c = (0, 1.5cm);
208 % Draw lines between the boxes. We make these lines "deferred" and give
209 % them a name, so we can use ObjLabel to draw a label beside them.
210 ncline.inp(inp)(front) "name(haskell)";
211 ncline.front(front)(norm) "name(core)";
212 ncline.norm(norm)(vhdl) "name(normal)";
213 ncline.vhdl(vhdl)(out) "name(vhdl)";
214 ObjLabel.inp(btex Haskell source etex) "labpathname(haskell)", "labdir(rt)";
215 ObjLabel.front(btex Core etex) "labpathname(core)", "labdir(rt)";
216 ObjLabel.norm(btex Normalized core etex) "labpathname(normal)", "labdir(rt)";
217 ObjLabel.vhdl(btex \small{VHDL} description etex) "labpathname(vhdl)", "labdir(rt)";
219 % Draw the objects (and deferred labels)
220 drawObj (inp, front, norm, vhdl, out);
222 \placefigure[right]{Cλash compiler pipeline}{\useMPgraphic{clash-pipeline}}
225 This is exactly the frontend and desugarer from the \small{GHC}
226 pipeline, that translates Haskell sources to a core representation.
228 \startdesc{Normalization}
229 This is a step that transforms the core representation into a normal
230 form. This normal form is still expressed in the core language, but has
231 to adhere to an extra set of constraints. This normal form is less
232 expressive than the full core language (e.g., it can have limited higher
233 order expressions, has a specific structure, etc.), but is also very
234 close to directly describing hardware.
236 \startdesc{\small{VHDL} generation}
The last step takes the normalized core representation and generates
238 \small{VHDL} for it. Since the normal form has a specific, hardware-like
239 structure, this final step is very straightforward.
242 The most interesting step in this process is the normalization step. That
243 is where more complicated functional constructs, which have no direct
244 hardware interpretation, are removed and translated into hardware
constructs. This step is described in detail in
246 \in{chapter}[chap:normalization].
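
The three steps above can be read as a simple function composition. The following is only a minimal sketch of that structure: all types and function names (\hs{HaskellSource}, \hs{Core}, \hs{NormalCore}, \hs{VHDL}, \hs{frontend}, \hs{normalize}, \hs{generateVHDL}, \hs{compile}) are hypothetical placeholders that wrap strings, whereas the actual prototype works on \small{GHC}'s own data types.

\starttyping
-- Placeholder types, assumed for illustration only.
newtype HaskellSource = HaskellSource String
newtype Core          = Core String
newtype NormalCore    = NormalCore String
newtype VHDL          = VHDL String deriving (Show, Eq)

-- Stands in for the GHC frontend and desugarer.
frontend :: HaskellSource -> Core
frontend (HaskellSource src) = Core ("core(" ++ src ++ ")")

-- Stands in for the normalization step.
normalize :: Core -> NormalCore
normalize (Core c) = NormalCore ("normal(" ++ c ++ ")")

-- Stands in for the VHDL generation step.
generateVHDL :: NormalCore -> VHDL
generateVHDL (NormalCore n) = VHDL ("vhdl(" ++ n ++ ")")

-- The complete pipeline is the composition of the three steps.
compile :: HaskellSource -> VHDL
compile = generateVHDL . normalize . frontend
\stoptyping

The point of the sketch is only that each step consumes the output of the previous one, so the intermediate representations (Core and normalized Core) form the interfaces between the steps.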
248 \section{The Core language}
249 \defreftxt{core}{the Core language}
250 Most of the prototype deals with handling the program in the Core
251 language. In this section we will show what this language looks like and
254 The Core language is a functional language that describes
255 \emph{expressions}. Every identifier used in Core is called a
256 \emph{binder}, since it is bound to a value somewhere. On the highest
level, a Core program is a collection of functions, each of which binds a
258 binder (the function name) to an expression (the function value, which has
261 The Core language itself does not prescribe any program structure, only
262 expression structure. In the \small{GHC} compiler, the Haskell module
263 structure is used for the resulting Core code as well. Since this is not
264 so relevant for understanding the Core language or the Normalization
265 process, we'll only look at the Core expression language here.
Each Core expression takes one of the following forms.
269 \startdesc{Variable reference}
270 \defref{variable reference}
274 This is a reference to a binder. It's written down as the
275 name of the binder that is being referred to along with its type. The
276 binder name should of course be bound in a containing scope (including
277 top level scope, so a reference to a top level function is also a
278 variable reference). Additionally, constructors from algebraic datatypes
279 also become variable references.
281 The value of this expression is the value bound to the given binder.
283 Each binder also carries around its type (explicitly shown above), but
284 this is usually not shown in the Core expressions. Only when the type is
285 relevant (when a new binder is introduced, for example) will it be
shown. In other cases, the type is either not relevant, or easily
287 derived from the context of the expression. \todo{Ref sidenote on type
296 This is a literal. Only primitive types are supported, like
297 chars, strings, ints and doubles. The types of these literals are the
298 \quote{primitive} versions, like \lam{Char\#} and \lam{Word\#}, not the
299 normal Haskell versions (but there are builtin conversion functions).
302 \startdesc{Application}
307 This is function application. Each application consists of two
308 parts: The function part and the argument part. Applications are used
309 for normal function \quote{calls}, but also for applying type
310 abstractions and data constructors.
312 The value of an application is the value of the function part, with the
313 first argument binder bound to the argument part.
316 \startdesc{Lambda abstraction}
317 \defref{lambda abstraction}
This is the basic lambda abstraction, as it occurs in lambda calculus.
It consists of a binder part and a body part. A lambda abstraction
creates a function that can be applied to an argument. The binder is
324 usually a value binder, but it can also be a \emph{type binder} (or
325 \emph{type variable}). The latter case introduces a new polymorphic
326 variable, which can be used in types later on. See
327 \in{section}[sec:prototype:coretypes] for details.
329 Note that the body of a lambda abstraction extends all the way to the
330 end of the expression, or the closing bracket surrounding the lambda. In
331 other words, the lambda abstraction \quote{operator} has the lowest
When a lambda abstraction is applied, the value of that application is the
value of the body part, with the binder bound to the value the lambda
abstraction is applied to.
338 \startdesc{Non-recursive let expression}
339 \defref{let expression}
341 let bndr = value in body
343 A let expression allows you to bind a binder to some value, while
344 evaluating to some other value (where that binder is in scope). This
345 allows for sharing of subexpressions (you can use a binder twice) and
346 explicit \quote{naming} of arbitrary expressions. Note that the binder
347 is not in scope in the value bound to it, so it's not possible to make
348 recursive definitions with the normal form of the let expression (see
349 the recursive form below).
351 Even though this let expression is an extension on the basic lambda
352 calculus, it is easily translated to a lambda abstraction. The let
353 expression above would then become:
359 This notion might be useful for verifying certain properties on
360 transformations, since a lot of verification work has been done on
361 lambda calculus already.
363 The value of a let expression is the value of the body part, with the
364 binder bound to the value.
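
The correspondence between a non-recursive let and an applied lambda abstraction can also be observed in plain Haskell. The sketch below is only an illustration (the names \hs{withLet} and \hs{withLambda} are made up for this example). Note that Haskell's own \hs{let} is recursive, so the example deliberately does not refer to the bound name inside its own definition.

\starttyping
-- Binds y with a let expression.
withLet :: Int -> Int
withLet x = let y = x * 2 in y + y

-- The same computation, with the let rewritten as a lambda
-- abstraction that is immediately applied to the bound value.
withLambda :: Int -> Int
withLambda x = (\y -> y + y) (x * 2)
\stoptyping

Both functions compute the same result for every input; only the way the intermediate value is named differs.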
367 \startdesc{Recursive let expression}
376 This is the recursive version of the let expression. In \small{GHC}'s
377 Core implementation, non-recursive and recursive lets are not so
378 distinct as we present them here, but this provides a clearer overview.
380 The main difference with the normal let expression is that each of the
381 binders is in scope in each of the values, in addition to the body. This
382 allows for self-recursive or mutually recursive definitions.
384 It should also be possible to express a recursive let using normal
385 lambda calculus, if we use the \emph{least fixed-point operator},
386 \lam{Y}. This falls beyond the scope of this report, since it is not
387 needed for this research.
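
For readers who want to see the idea anyway, the following small sketch uses the \hs{fix} function from Haskell's \type{Data.Function} module, which plays the role of a fixed-point operator. It is an illustration in Haskell only, not part of the prototype; the names \hs{factRec} and \hs{factFix} are made up for this example.

\starttyping
import Data.Function (fix)

-- Factorial using an ordinary recursive definition.
factRec :: Integer -> Integer
factRec = let go n = if n <= 0 then 1 else n * go (n - 1) in go

-- The same function without a recursive binding: the recursion is
-- supplied by the fixed-point operator, which passes the function
-- to itself as the argument go.
factFix :: Integer -> Integer
factFix = fix (\go n -> if n <= 0 then 1 else n * go (n - 1))
\stoptyping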
390 \startdesc{Case expression}
391 \defref{case expression}
393 case scrutinee of bndr
394 DEFAULT -> defaultbody
395 C0 bndr0,0 ... bndr0,m -> body0
397 Cn bndrn,0 ... bndrn,m -> bodyn
402 A case expression is the only way in Core to choose between values. All
\hs{if} expressions and pattern matchings from the original Haskell
program have been translated to case expressions by the desugarer.
406 A case expression evaluates its scrutinee, which should have an
407 algebraic datatype, into weak head normal form (\small{WHNF}) and
408 (optionally) binds it to \lam{bndr}. It then chooses a body depending on
409 the constructor of its scrutinee. If none of the constructors match, the
410 \lam{DEFAULT} alternative is chosen. A case expression must always be
411 exhaustive, \ie it must cover all possible constructors that the
412 scrutinee can have (if all of them are covered explicitly, the
413 \lam{DEFAULT} alternative can be left out).
415 Since we can only match the top level constructor, there can be no overlap
416 in the alternatives and thus order of alternatives is not relevant (though
417 the \lam{DEFAULT} alternative must appear first for implementation
420 Any arguments to the constructor in the scrutinee are bound to each of the
421 binders after the constructor and are in scope only in the corresponding
424 To support strictness, the scrutinee is always evaluated into
425 \small{WHNF}, even when there is only a \lam{DEFAULT} alternative. This
allows application of the strict function \lam{f} to the argument \lam{a}
430 f (case a of arg DEFAULT -> arg)
433 According to the \GHC documentation, this is the only use for the extra
434 binder to which the scrutinee is bound. When not using strictness
435 annotations (which is rather pointless in hardware descriptions),
436 \small{GHC} seems to never generate any code making use of this binder.
437 In fact, \GHC has never been observed to generate code using this
438 binder, even when strictness was involved. Nonetheless, the prototype
439 handles this binder as expected.
Note that these case expressions are less powerful than the full Haskell
case expressions. In particular, they do not support complex patterns like
in Haskell. Only the constructor of an expression can be matched;
complex patterns are implemented using multiple nested case expressions.
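
To give an impression of what this nesting looks like, the sketch below shows a Haskell function with a nested pattern next to a hand-written equivalent that only matches a single constructor per case expression. The second version is written by hand for illustration; the Core that \small{GHC} actually generates may differ in its details. The function names are made up for this example.

\starttyping
-- Haskell source using a nested pattern: a tuple inside a Just.
firstOfPair :: Maybe (Int, Int) -> Int
firstOfPair (Just (a, _)) = a
firstOfPair Nothing       = 0

-- Equivalent formulation in which every case expression matches
-- only the top level constructor of its scrutinee.
firstOfPairNested :: Maybe (Int, Int) -> Int
firstOfPairNested m = case m of
  Nothing -> 0
  Just p  -> case p of
    (a, _) -> a
\stoptyping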
Case expressions are also used for unpacking of algebraic datatypes, even
when there is only a single constructor. For example, to add the elements
of a tuple, the following Core is generated:
451 sum = λtuple.case tuple of
455 Here, there is only a single alternative (but no \lam{DEFAULT}
456 alternative, since the single alternative is already exhaustive). When
its body is evaluated, the arguments to the tuple constructor \lam{(,)}
(\ie, the elements of the tuple) are bound to \lam{a} and \lam{b}.
461 \startdesc{Cast expression}
462 \defref{cast expression}
466 A cast expression allows you to change the type of an expression to an
467 equivalent type. Note that this is not meant to do any actual work, like
468 conversion of data from one format to another, or force a complete type
469 change. Instead, it is meant to change between different representations
470 of the same type, \eg switch between types that are provably equal (but
473 In our hardware descriptions, we typically see casts to change between a
474 Haskell newtype and its contained type, since those are effectively
475 different types (so a cast is needed) with the same representation (but
476 no work is done by the cast).
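
At the Haskell source level, a similar representation-preserving conversion is available through the \hs{coerce} function from \type{Data.Coerce}. The sketch below is only an illustration of the idea that converting to and from a newtype does no work; it assumes a \hs{State} newtype like the one used for state annotations later in this chapter, and the helper names are made up.

\starttyping
import Data.Coerce (coerce)

-- A newtype has the same representation as the type it wraps.
newtype State s = State s

-- Converting between the newtype and its contained type changes
-- the type only; the value itself is left untouched.
unpack :: State Int -> Int
unpack = coerce

pack :: Int -> State Int
pack = coerce

-- The conversion also works underneath other type constructors,
-- here for a whole list at once.
unpackAll :: [State Int] -> [Int]
unpackAll = coerce
\stoptyping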
478 More complex are types that are proven to be equal by the typechecker,
479 but look different at first glance. To ensure that, once the typechecker
480 has proven equality, this information sticks around, explicit casts are
481 added. In our notation we only write the target type, but in reality a
cast expression carries around a \emph{coercion}, which can be seen as a
483 proof of equality. \todo{Example}
485 The value of a cast is the value of its body, unchanged. The type of this
486 value is equal to the target type, not the type of its body.
488 \todo{Move and update this paragraph}
489 Note that this syntax is also used sometimes to indicate that a particular
490 expression has a particular type, even when no cast expression is
491 involved. This is then purely informational, since the only elements that
are explicitly typed in the Core language are the binder references and
493 cast expressions, the types of all other elements are determined at
498 The Core language in \small{GHC} allows adding \emph{notes}, which serve
499 as hints to the inliner or add custom (string) annotations to a core
500 expression. These shouldn't be generated normally, so these are not
501 handled in any way in the prototype.
505 \defref{type expression}
It is possible to use a Core type as a Core expression. For the actual
510 types supported by Core, see \in{section}[sec:prototype:coretypes]. This
511 \quote{lifting} of a type into the value domain is done to allow for
512 type abstractions and applications to be handled as normal lambda
513 abstractions and applications above. This means that a type expression
514 in Core can only ever occur in the argument position of an application,
and only if the function that is being applied expects a type
as its first argument. This happens for all polymorphic functions, for
517 example, the \lam{fst} function:
520 fst :: \forall a. \forall b. (a, b) -> a
521 fst = λtup.case tup of (,) a b -> a
523 fstint :: (Int, Int) -> Int
fstint = λtup.fst @Int @Int tup
The type of \lam{fst} has two universally quantified type variables. When
\lam{fst} is applied in \lam{fstint}, it is first applied to two types
(which are substituted for \lam{a} and \lam{b} in the type of \lam{fst}, so
the actual types of its argument and result can be found:
\lam{fst @Int @Int :: (Int, Int) -> Int}).
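
To summarize the expression forms described in this section, the following sketch models them as a Haskell datatype. This is a simplified illustration that follows the presentation in this chapter, not \small{GHC}'s actual definition: all names are chosen for this example, types are reduced to a minimum, and the non-recursive and recursive let are kept as separate constructors (so there is one constructor more than in \small{GHC}).

\starttyping
type Name = String

-- A strongly simplified stand-in for Core types.
data Type
  = TyVar Name        -- type variable or type constructor
  | TyApp Type Type   -- type application
  | ForAll Name Type  -- forall type
  deriving (Show, Eq)

data Literal = LitInt Integer | LitChar Char
  deriving (Show, Eq)

data AltCon = Default | DataCon Name
  deriving (Show, Eq)

-- An alternative: constructor, bound arguments, body.
type Alt = (AltCon, [Name], Expr)

data Expr
  = Var Name                    -- variable reference
  | Lit Literal                 -- literal
  | App Expr Expr               -- application
  | Lam Name Expr               -- lambda abstraction
  | Let Name Expr Expr          -- non-recursive let
  | LetRec [(Name, Expr)] Expr  -- recursive let
  | Case Expr Name [Alt]        -- case, with the scrutinee binder
  | Cast Expr Type              -- cast (target type only)
  | Note String Expr            -- note
  | Type Type                   -- type lifted into an expression
  deriving (Show, Eq)

-- The body of fstint: fst applied to two types and one value.
fstIntBody :: Expr
fstIntBody =
  App (App (App (Var "fst") (Type (TyVar "Int"))) (Type (TyVar "Int")))
      (Var "tup")

-- Collect the type arguments along the spine of an application.
typeArgs :: Expr -> [Type]
typeArgs (App f (Type t)) = typeArgs f ++ [t]
typeArgs (App f _)        = typeArgs f
typeArgs _                = []
\stoptyping

In this model, \hs{typeArgs fstIntBody} yields the two \hs{Int} types, which shows that type arguments are ordinary arguments in the application structure.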
534 \subsection[sec:prototype:coretypes]{Core type system}
535 Whereas the expression syntax of Core is very simple, its type system is
536 a bit more complicated. It turns out it is harder to \quote{desugar}
Haskell's complex type system into something simpler. Most of the
538 type system is thus very similar to that of Haskell.
540 We will slightly limit our view on Core's type system, since the more
541 complicated parts of it are only meant to support Haskell's (or rather,
542 \GHC's) type extensions, such as existential types, \small{GADT}s, type
543 families and other non-standard Haskell stuff which we don't (plan to)
546 In Core, every expression is typed. The translation to Core happens
547 after the typechecker, so types in Core are always correct as well
548 (though you could of course construct invalidly typed expressions).
550 Any type in core is one of the following:
552 \startdesc{A type variable}
557 This is a reference to a type defined elsewhere. This can either be a
558 polymorphic type (like the latter two \lam{t}'s in \lam{id :: \forall t.
559 t -> t}), or a type constructor (like \lam{Bool} in \lam{not :: Bool ->
560 Bool}). Like in Haskell, polymorphic type variables always
561 start with a lowercase letter, while type constructors always start
562 with an uppercase letter.
564 \todo{How to define (new) type constructors?}
566 A special case of a type constructor is the \emph{function type
567 constructor}, \lam{->}. This is a type constructor taking two arguments
568 (using application below). The function type constructor is commonly
569 written inline, so we write \lam{a -> b} when we really mean \lam{-> a
570 b}, the function type constructor applied to \lam{a} and \lam{b}.
572 Polymorphic type variables can only be defined by a lambda
573 abstraction, see the forall type below.
576 \startdesc{A type application}
This applies some type to another type. This is particularly used to
582 apply type variables (type constructors) to their arguments.
584 As mentioned above, applications of some type constructors have
585 special notation. In particular, these are applications of the
586 \emph{function type constructor} and \emph{tuple type constructors}:
595 \startdesc{The forall type}
597 id :: \forall a. a -> a
599 The forall type introduces polymorphism. It is the only way to
introduce new type variables, which are completely unconstrained (any
possible type can be assigned to them). Constraints can be added later
602 using predicate types, see below.
604 A forall type is always (and only) introduced by a type lambda
605 expression. For example, the Core translation of the
611 Here, the type of the binder \lam{x} is \lam{a}, referring to the
612 binder in the topmost lambda.
When using a value with a forall type, the actual type
used must be applied first. For example, the Haskell expression \hs{id
True} (the function \hs{id} applied to the data constructor \hs{True})
translates to the following Core:
Here, \lam{id} is first applied to the type to work with. Note that the type
then changes from \lam{id :: \forall a. a -> a} to \lam{id @Bool ::
Bool -> Bool}: the type variable \lam{a} has been
substituted with the actual type.
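
Recent versions of \small{GHC} allow similar explicit type applications to be written in Haskell source as well, through the \type{TypeApplications} language extension. The sketch below uses that extension purely to illustrate the notation; it is not something the prototype relies on, and the names \hs{idBool} and \hs{fstInt} are made up for this example.

\starttyping
{-# LANGUAGE TypeApplications #-}

-- id is first applied to the type Bool, then to the value True.
idBool :: Bool
idBool = id @Bool True

-- fst is applied to two types before it is applied to the tuple.
fstInt :: (Int, Int) -> Int
fstInt tup = fst @Int @Int tup
\stoptyping

Without the extension, the typechecker infers these type arguments and the desugarer makes them explicit in Core.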
628 In Haskell, forall types are usually not explicitly specified (The use
629 of a lowercase type variable implicitly introduces a forall type for
630 that variable). In fact, in standard Haskell there is no way to
631 explicitly specify forall types. Through a language extension, the
632 \hs{forall} keyword is available, but still optional for normal forall
633 types (it is needed for \emph{existentially quantified types}, which
634 Cλash does not support).
637 \startdesc{Predicate type}
show :: \forall a. Show a ⇒ a → String
642 \todo{Sidenote: type classes?}
644 A predicate type introduces a constraint on a type variable introduced
645 by a forall type (or type lambda). In the example above, the type
646 variable \lam{a} can only contain types that are an \emph{instance} of
647 the \emph{type class} \lam{Show}. \refdef{type class}
649 There are other sorts of predicate types, used for the type families
650 extension, which we will not discuss here.
652 A predicate type is introduced by a lambda abstraction. Unlike with
653 the forall type, this is a value lambda abstraction, that must be
654 applied to a value. We call this value a \emph{dictionary}.
656 Without going into the implementation details, a dictionary can be
seen as a lookup table of all the methods for a given (single) type class
658 instance. This means that all the dictionaries for the same type class
659 look the same (\eg contain methods with the same names). However,
660 dictionaries for different instances of the same class contain
661 different methods, of course.
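
The dictionary idea can be mimicked in plain Haskell by passing a record of methods explicitly. The following is only a sketch of the concept, not of how \small{GHC} represents dictionaries internally; the type \hs{ShowDict} and all function names are hypothetical.

\starttyping
-- A dictionary for a Show-like class with a single method.
newtype ShowDict a = ShowDict { showMethod :: a -> String }

-- One dictionary per instance: same structure, different methods.
dShowBool :: ShowDict Bool
dShowBool = ShowDict (\b -> if b then "True" else "False")

dShowInt :: ShowDict Int
dShowInt = ShowDict show

-- Instead of a class constraint, the function takes the dictionary
-- as an ordinary value argument and looks up the method in it.
showWith :: ShowDict a -> a -> String
showWith dict x = showMethod dict x

-- A function that needs the class simply passes the dictionary on.
showTwice :: ShowDict a -> a -> String
showTwice dict x = showWith dict x ++ showWith dict x
\stoptyping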
663 A dictionary is introduced by \small{GHC} whenever it encounters an
664 instance declaration. This dictionary, as well as the binder
665 introduced by a lambda that introduces a dictionary, have the
666 predicate type as their type. These binders are usually named starting
667 with a \lam{\$}. Usually the name of the type concerned is not
668 reflected in the name of the dictionary, but the name of the type
669 class is. The Haskell expression \hs{show True} thus becomes:
672 show @Bool \$dShow True
676 Using this set of types, all types in basic Haskell can be represented.
678 \todo{Overview of polymorphism with more examples (or move examples
681 \section[sec:prototype:statetype]{State annotations in Haskell}
682 As noted in \in{section}[sec:description:stateann], Cλash needs some
683 way to let the programmer explicitly specify which of a function's
684 arguments and which part of a function's result represent the
Using the Haskell type system, there are a few ways we can tackle this.
689 \subsection{Type synonyms}
690 Haskell provides type synonyms as a way to declare a new type that is
691 equal to an existing type (or rather, a new name for an existing type).
692 This allows both the original type and the synonym to be used
interchangeably in a Haskell program. This means no explicit conversion
694 is needed either. For example, a simple accumulator would become:
698 acc :: Word -> State Word -> (State Word, Word)
699 acc i s = let sum = s + i in (sum, sum)
This looks nice in Haskell, but turns out to be hard to implement. There
is no explicit conversion in Haskell, but none in Core either. This
means the type of a value might be shown as \hs{State Word} in some places,
but \hs{Word} in others (and this can even change due to
transformations). Since every binder has an explicit type associated
with it, the type of every function will be properly preserved and
could be used by the compiler to track down the statefulness of each value.
However, this makes the implementation a lot more complicated
than it currently is using \hs{newtypes}.
712 % Use \type instead of \hs here, since the latter breaks inside
714 \subsection{Type renaming (\type{newtype})}
715 Haskell also supports type renamings as a way to declare a new type that
716 has the same (runtime) representation as an existing type (but is in
717 fact a different type to the typechecker). With type renaming, an
718 explicit conversion between values of the two types is needed. The
719 accumulator would then become:
722 newtype State s = State s
723 acc :: Word -> State Word -> (State Word, Word)
724 acc i (State s) = let sum = s + i in (State sum, sum)
727 The \hs{newtype} line declares a new type \hs{State} that has one type
728 argument, \hs{s}. This type contains one \quote{constructor} \hs{State}
729 with a single argument of type \hs{s}. It is customary to name the
730 constructor the same as the type, which is allowed (since types can
never cause name collisions with values). The difference with the type
synonym example is in the explicit conversion between the \hs{State
Word} and \hs{Word} types by pattern matching and by explicitly using
the \hs{State} constructor.
736 This explicit conversion makes the \VHDL generation easier: Whenever we
737 remove (unpack) the \hs{State} type, this means we are accessing the
current state (\ie, accessing the register output). Whenever we are
adding (packing) the \hs{State} type, we are producing a new value for
the state (\ie, providing the register input).
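
The register interpretation can be made concrete by simulating the accumulator in plain Haskell. The sketch below is an illustration under assumptions: the \hs{simulate} helper is hypothetical and not part of the prototype described here. It feeds the state produced in one cycle back as the current state of the next cycle, just as a register would.

\starttyping
import Data.List (mapAccumL)
import Data.Word (Word)

newtype State s = State s

-- The accumulator: unpacking State reads the register output,
-- packing State provides the new register input.
acc :: Word -> State Word -> (State Word, Word)
acc i (State s) = let total = s + i in (State total, total)

-- Hypothetical helper: run a stateful function over a list of
-- inputs, one element per clock cycle, starting from an initial
-- register value.
simulate :: (i -> State s -> (State s, o)) -> s -> [i] -> [o]
simulate f initial inputs = snd (mapAccumL step (State initial) inputs)
  where
    step st i = f i st
\stoptyping

With an initial state of zero, \hs{simulate acc 0 [1, 2, 3]} produces the running sums of the inputs.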
742 When dealing with nested states (a stateful function that calls stateful
743 functions, which might call stateful functions, etc.) the state type
744 could quickly grow complex because of all the \hs{State} type constructors
745 needed. For example, consider the following state type (this is just the
746 state type, not the entire function type):
749 State (State Bit, State (State Word, Bit), Word)
752 We cannot leave all these \hs{State} type constructors out, since that
753 would change the type (unlike when using type synonyms). However, when
754 using type synonyms to hide away substates (see
755 \in{section}[sec:prototype:substatesynonyms] below), this
756 disadvantage should be limited.
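
As a small illustration of how verbose this becomes, the sketch below
pattern matches on the state type shown above to reach the two
components that are not wrapped in \hs{State}. The \hs{Bit} definition
and the name \hs{plainParts} are placeholders invented for this example.

\starttyping
import Data.Word (Word)

newtype State s = State s

-- Placeholder single-bit type (assumption for this sketch)
data Bit = High | Low deriving (Eq, Show)

type Nested = State (State Bit, State (State Word, Bit), Word)

-- Every State layer on the path must be matched explicitly, even
-- when the substates themselves are ignored
plainParts :: Nested -> (Bit, Word)
plainParts (State (_, State (_, b), w)) = (b, w)
\stoptyping
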
758 \subsubsection{Different input and output types}
759 An alternative could be to use different types for input and output
760 state (\ie current and updated state). The accumulator example would
761 then become something like:
764 newtype StateIn s = StateIn s
765 newtype StateOut s = StateOut s
acc :: Word -> StateIn Word -> (StateOut Word, Word)
acc i (StateIn s) = let sum = s + i in (StateOut sum, sum)
770 This could make the implementation easier and the hardware
descriptions less error-prone (you can no longer \quote{forget} to
772 unpack and repack a state variable and just return it directly, which
773 can be a problem in the current prototype). However, it also means we
774 need twice as many type synonyms to hide away substates, making this
approach a bit cumbersome. It also makes it harder to compare input
and output state types, possibly reducing the type safety of the
779 \subsection[sec:prototype:substatesynonyms]{Type synonyms for substates}
780 As noted above, when using nested (hierarchical) states, the state types
781 of the \quote{upper} functions (those that call other functions, which
call other functions, etc.) quickly become complicated. Also, when the
state type of one of the \quote{lower} functions changes, the state
types of all the upper functions change as well. If the state type for
785 each function is explicitly and completely specified, this means that a
786 lot of code needs updating whenever a state type changes.
788 To prevent this, it is recommended (but not enforced) to use a type
789 synonym for the state type of every function. Every function calling
790 other functions will then use the state type synonym of the called
791 functions in its own type, requiring no code changes when the state type
792 of a called function changes. This approach is used in
793 \in{example}[ex:AvgState] below. The \hs{AccState} and \hs{AvgState}
794 are examples of such state type synonyms.
796 \subsection{Chosen approach}
To keep the implementation simple, the current prototype uses the type
798 renaming approach, with a single type for both input and output
799 states. In the future, it might be worthwhile to revisit this
800 approach if more complicated flow analysis is implemented for
801 state variables. This analysis is needed to add proper error
802 checking anyway and might allow the use of type synonyms without
803 losing any expressivity.
805 \subsubsection{Example}
As an example of the chosen approach, \in{example}[ex:AvgState] shows a
simple averaging circuit. This circuit delegates the accumulation of its
inputs to a subcomponent, \hs{acc}, but keeps a count of the values
809 accumulated in its own state.\footnote{Currently, the prototype
810 is not able to compile this example, since the builtin function
811 for division has not been added.}
813 \startbuffer[AvgState]
814 -- The state type annotation
815 newtype State s = State s
817 -- The accumulator state type
818 type AccState = State Word
820 acc :: Word -> AccState -> (AccState, Word)
821 acc i (State s) = let sum = s + i in (State sum, sum)
823 -- The averaging circuit state type
824 type AvgState = State (AccState, Word)
825 -- The averaging circuit
826 avg :: Word -> AvgState -> (AvgState, Word)
827 avg i (State s) = (State s', o)
830 -- Pass our input through the accumulator, which outputs a sum
831 (accs', sum) = acc i accs
832 -- Increment the count (which will be our new state)
834 -- Compute the average
839 \placeexample[here][ex:AvgState]{Simple stateful averaging circuit.}
840 %\startcombination[2*1]
841 {\typebufferhs{AvgState}}%{Haskell description using function applications.}
842 % {\boxedgraphic{AvgState}}{The architecture described by the Haskell description.}
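
For readers who want to experiment with this description outside the
prototype, the following self-contained sketch of the same circuit can
be loaded into GHC. It uses integer division (\type{div}) in place of the
missing division builtin, which is an assumption on our part, and the
\hs{simulate} driver is a hypothetical helper that is not part of the
prototype.

\starttyping
import Data.Word (Word)

newtype State s = State s

type AccState = State Word

acc :: Word -> AccState -> (AccState, Word)
acc i (State s) = let total = s + i in (State total, total)

type AvgState = State (AccState, Word)

avg :: Word -> AvgState -> (AvgState, Word)
avg i (State s) = (State s', o)
  where
    (accs, count) = s
    -- Pass our input through the accumulator, which outputs a sum
    (accs', total) = acc i accs
    -- Increment the count (which will be our new state)
    count' = count + 1
    -- Assumption: integer division stands in for the missing builtin
    o = total `div` count'
    s' = (accs', count')

-- Hypothetical driver: feed inputs one per clock cycle, starting
-- from an explicitly given initial state
simulate :: (i -> s -> (s, o)) -> s -> [i] -> [o]
simulate _ _ [] = []
simulate f s (i : is) = let (s', o) = f i s in o : simulate f s' is
\stoptyping

Starting from an all-zero state,
\type{simulate avg (State (State 0, 0)) [4, 8, 0]} evaluates to
\type{[4, 6, 4]}.
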
846 \section{Implementing state}
Now that it is clear how to put state annotations in the Haskell source,
there is the question of how to implement this state translation. As
849 we've seen in \in{section}[sec:prototype:design], the translation to
850 \VHDL happens as a simple, final step in the compilation process.
851 This step works on a core expression in normal form. The specifics
852 of normal form will be explained in
853 \in{chapter}[chap:normalization], but the examples given should be
easy to understand using the definition of Core given above.
856 \startbuffer[AvgStateNormal]
859 -- Remove the State newtype
863 -- Add the State newtype again
864 spacked' = s' ▶ State Word
871 s = spacked ▶ (AccState, Word)
872 accs = case s of (accs, _) -> accs
873 count = case s of (_, count) -> count
875 accs' = case accres of (accs', _) -> accs'
876 sum = case accres of (_, sum) -> sum
880 spacked' = s' ▶ State (AccState, Word)
886 \placeexample[here][ex:AvgStateNormal]{Normalized version of \in{example}[ex:AvgState]}
887 {\typebufferlam{AvgStateNormal}}
889 \subsection[sec:prototype:statelimits]{State in normal form}
890 Before describing how to translate state from normal form to
\VHDL, we will first see how state handling looks in normal form.
What limitations are there on the use of state to guarantee that proper
893 \VHDL can be generated?
895 We will try to formulate a number of rules about what operations are
896 allowed with state variables. These rules apply to the normalized Core
897 representation, but will in practice apply to the original Haskell
898 hardware description as well. Ideally, these rules would become part
899 of the intended normal form definition \refdef{intended normal form
900 definition}, but this is not the case right now. This can cause some
901 problems, which are detailed in
902 \in{section}[sec:normalization:stateproblems].
In these rules we use the term \emph{state variable} to refer to any
905 variable that has a \lam{State} type. A \emph{state-containing
906 variable} is any variable whose type contains a \lam{State} type,
907 but is not one itself (like \lam{(AccState, Word)} in the example,
908 which is a tuple type, but contains \lam{AccState}, which is again
909 equal to \lam{State Word}).
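
These two definitions can be made concrete with a toy model of types.
The sketch below is only an illustration: the \hs{Ty} and \hs{Role}
types and the function names are invented here and do not correspond to
the Core types used by the prototype.

\starttyping
-- Toy model of types (assumption: not the prototype's Core types)
data Ty
  = Plain String   -- any type without state, e.g. Word
  | StateOf Ty     -- State t
  | Tuple [Ty]     -- (t1, ..., tn)
  deriving (Eq, Show)

data Role = StateVariable | StateContaining | Stateless
  deriving (Eq, Show)

-- Does a State type occur anywhere in this type?
mentionsState :: Ty -> Bool
mentionsState (Plain _) = False
mentionsState (StateOf _) = True
mentionsState (Tuple ts) = any mentionsState ts

-- A variable with a State type is a state variable; a variable whose
-- type merely contains a State type is state-containing
classify :: Ty -> Role
classify (StateOf _) = StateVariable
classify t
  | mentionsState t = StateContaining
  | otherwise = Stateless
\stoptyping

In this model, \type{(AccState, Word)} is written as
\type{Tuple [StateOf (Plain "Word"), Plain "Word"]} and is classified as
\type{StateContaining}.
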
911 We also use a distinction between \emph{input} and \emph{output
912 (state) variables} and \emph{substate variables}, which will be
913 defined in the rules themselves.
915 \startdesc{State variables can appear as an argument.}
917 avg = λi.λspacked. ...
920 Any lambda that binds a variable with a state type, creates a new
921 input state variable.
924 \startdesc{Input state variables can be unpacked.}
926 s = spacked ▶ (AccState, Word)
929 An input state variable may be unpacked using a cast operation. This
removes the \lam{State} type renaming and the result no longer has a
933 If the result of this unpacking does not have a state type and does
934 not contain state variables, there are no limitations on its use.
935 Otherwise if it does not have a state type but does contain
936 substates, we refer to it as a \emph{state-containing input
937 variable} and the limitations below apply. If it has a state type
938 itself, we refer to it as an \emph{input substate variable} and the
limitations below apply as well.
941 It may seem strange to consider a variable that still has a state
942 type directly after unpacking, but consider the case where a
943 function does not have any state of its own, but does call a single
944 stateful function. This means it must have a state argument that
945 contains just a substate. The function signature of such a function
949 type FooState = State AccState
952 Which is of course equivalent to \lam{State (State Word)}.
955 \startdesc{Variables can be extracted from state-containing input variables.}
957 accs = case s of (accs, _) -> accs
960 A state-containing input variable is typically a tuple containing
961 multiple elements (like the current function's state, substates or
962 more tuples containing substates). All of these can be extracted
963 from an input variable using an extractor case (or possibly
964 multiple, when the input variable is nested).
966 If the result has no state type and does not contain any state
967 variables either, there are no further limitations on its use. If
968 the result has no state type but does contain state variables we
969 refer to it as a \emph{state-containing input variable} and this
970 limitation keeps applying. If the variable has a state type itself,
971 we refer to it as an \emph{input substate variable} and below
974 \startdesc{Input substate variables can be passed to functions.}
977 accs' = case accres of (accs', _) -> accs'
980 An input substate variable can (only) be passed to a function.
981 Additionally, every input substate variable must be used in exactly
982 \emph{one} application, no more and no less.
984 The function result should contain exactly one state variable, which
985 can be extracted using (multiple) case statements. The extracted
state variable is referred to as the \emph{output substate}
988 The type of this output substate must be identical to the type of
989 the input substate passed to the function.
992 \startdesc{Variables can be inserted into a state-containing output variable.}
997 A function's output state is usually a tuple containing its own
998 updated state variables and all output substates. This result is
999 built up using any single-constructor algebraic datatype.
1001 The result of these expressions is referred to as a
\emph{state-containing output variable}, which is subject to these
\startdesc{State-containing output variables can be packed.}
1008 spacked' = s' ▶ State (AccState, Word)
As soon as all of a function's own updated state and output substate
1012 variables have been joined together, the resulting
1013 state-containing output variable can be packed into an output
1014 state variable. Packing is done by casting into a state type.
1017 \startdesc{Output state variables can appear as (part of) a function result.}
1026 When the output state is packed, it can be returned as a part
1027 of the function result. Nothing else can be done with this
1028 value (or any value that contains it).
1031 There is one final limitation that is hard to express in the above
1032 itemization. Whenever substates are extracted from the input state
1033 to be passed to functions, the corresponding output substates
1034 should be inserted into the output state in the same way. In other
1035 words, each pair of corresponding substates in the input and
output states should be passed to / returned from the same called
The prototype currently does not check many of the above
1040 conditions. This means that if the conditions are violated,
1041 sometimes a compile error is generated, but in other cases output
1042 can be generated that is not valid \VHDL or at the very least does
1043 not correspond to the input.
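
To illustrate why such checks matter, the sketch below shows a
description that the Haskell typechecker accepts, but that breaks the
rule that every input substate must be used in exactly one application.
The function \hs{badAvg} is a made-up example and has not been run
through the prototype.

\starttyping
import Data.Word (Word)

newtype State s = State s

type AccState = State Word

acc :: Word -> AccState -> (AccState, Word)
acc i (State s) = let total = s + i in (State total, total)

type BadState = State (AccState, Word)

-- Violation: the input substate accs is passed to acc twice, so it
-- is unclear which application owns the register. Only one of the
-- two output substates ends up in the output state.
badAvg :: Word -> BadState -> (BadState, Word)
badAvg i (State (accs, count)) = (State (accs', count + 1), t1 + t2)
  where
    (accs', t1) = acc i accs
    (_, t2) = acc (i + 1) accs
\stoptyping

As plain Haskell this function is perfectly well defined, which is
exactly why the violation has to be detected by the compiler rather
than by the typechecker.
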
1045 \subsection{Translating to \VHDL}
1046 As noted above, the basic approach when generating \VHDL for stateful
1047 functions is to generate a single register for every stateful function.
We search the normal form for the let binding that removes the
\lam{State} newtype (using a cast). We also find the let binding that
adds a \lam{State} type. These are connected to the output and the input
of the generated register respectively. This means that there can
1052 only be one let binding that adds and one that removes the \lam{State}
1053 type. It is easy to violate this constraint. This problem is detailed in
1054 \in{section}[sec:normalization:stateproblems].
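
The search for these two bindings can be pictured with a toy model of
let bindings. This is only a sketch under our own assumptions: the
\hs{Binding} and \hs{Register} types and the function \hs{findRegister}
are invented here and do not correspond to the prototype's actual data
structures.

\starttyping
-- Toy model of normal-form let bindings (assumption: invented for
-- this sketch, not GHC Core)
data Binding
  = Unpack String String   -- binder, argument: cast removing State
  | Pack String String     -- binder, argument: cast adding State
  | Other String           -- any other binding
  deriving (Eq, Show)

-- The register of a function: the signals holding the current and
-- the next state
data Register = Register { current :: String, next :: String }
  deriving (Eq, Show)

-- Nothing for a stateless function, an error unless there is exactly
-- one unpack and exactly one pack binding
findRegister :: [Binding] -> Either String (Maybe Register)
findRegister bs =
  case ([s | Unpack s _ <- bs], [s' | Pack _ s' <- bs]) of
    ([], []) -> Right Nothing
    ([s], [s']) -> Right (Just (Register s s'))
    _ -> Left "expected exactly one unpack and one pack binding"
\stoptyping

For the bindings
\type{[Unpack "s" "spacked", Other "count", Pack "spacked'" "s'"]}, this
yields a register whose output is \type{s} and whose input is \type{s'}.
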
1056 This approach seems simple enough, but will this also work for more
1057 complex stateful functions involving substates? Observe that any
1058 component of a function's state that is a substate, \ie passed on as
1059 the state of another function, should have no influence on the
1060 hardware generated for the calling function. Any state-specific
1061 \small{VHDL} for this component can be generated entirely within the
1062 called function. So, we can completely ignore substates when
1063 generating \VHDL for a function.
1065 From this observation, we might think to remove the substates from a
function's states altogether, and leave only the state components
1067 which are actual states of the current function. While doing this
1068 would not remove any information needed to generate \small{VHDL} from
1069 the function, it would cause the function definition to become invalid
1070 (since we won't have any substate to pass to the functions anymore).
1071 We could solve the syntactic problems by passing \type{undefined} for
1072 state variables, but that would still break the code on the semantic
1073 level (\ie, the function would no longer be semantically equivalent to
1074 the original input).
1076 To keep the function definition correct until the very end of the
1077 process, we will not deal with (sub)states until we get to the
1078 \small{VHDL} generation. Then, we are translating from Core to
1079 \small{VHDL}, and we can simply ignore substates, effectively removing
the substate components altogether.
1082 But, how will we know what exactly is a substate? Since any state
1083 argument or return value that represents state must be of the
1084 \type{State} type, we can look at the type of a value. However, we
1085 must be careful to ignore only \emph{substates}, and not a
1086 function's own state.
In \in{example}[ex:AvgStateNormal] above, we should generate a register
with its output connected to \lam{s} and its input connected
to \lam{s'}. However, \lam{s'} is built up from both \lam{accs'} and
1091 \lam{count'}, while only \lam{count'} should end up in the register.
1092 \lam{accs'} is a substate for the \lam{acc} function, for which a
1093 register will be created when generating \VHDL for the \lam{acc}
1096 Fortunately, the \lam{accs'} variable (and any other substate) has a
1097 property that we can easily check: It has a \lam{State} type
1098 annotation. This means that whenever \VHDL is generated for a tuple
1099 (or other algebraic type), we can simply leave out all elements that
1100 have a \lam{State} type. This will leave just the parts of the state
1101 that do not have a \lam{State} type themselves, like \lam{count'},
1102 which is exactly a function's own state. This approach also means that
1103 the state part of the result is automatically excluded when generating
1104 the output port, which is also required.
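
The idea of leaving out all elements with a \lam{State} type can be
sketched on a toy model of types. The \hs{Ty} type and the function
\hs{stripState} below are assumptions made for this illustration; the
prototype works on Core types instead.

\starttyping
import Data.Maybe (mapMaybe)

-- Toy model of types (assumption: not the prototype's Core types)
data Ty
  = Plain String   -- any type without state, e.g. Word
  | StateOf Ty     -- State t
  | Tuple [Ty]     -- (t1, ..., tn)
  deriving (Eq, Show)

-- Drop every State-typed element. Nothing means that no VHDL type
-- is left to generate for this value.
stripState :: Ty -> Maybe Ty
stripState (StateOf _) = Nothing
stripState (Plain n) = Just (Plain n)
stripState (Tuple ts) = Just (Tuple (mapMaybe stripState ts))
\stoptyping

Applied to the model of \type{(AccState, Word)}, that is
\type{Tuple [StateOf (Plain "Word"), Plain "Word"]}, this leaves
\type{Just (Tuple [Plain "Word"])}: a tuple that only holds the count.
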
1106 We can formalize this translation a bit, using the following
1110 \item A state unpack operation should not generate any \small{VHDL}.
The binder to which the unpacked state is bound should still be
declared; this signal will become the register and will hold the
1114 \item A state pack operation should not generate any \small{VHDL}.
1115 The binder to which the packed state is bound should not be
1116 declared. The binder that is packed is the signal that will hold the
1118 \item Any values of a State type should not be translated to
1119 \small{VHDL}. In particular, State elements should be removed from
1120 tuples (and other datatypes) and arguments with a state type should
\item To make the state actually work, a simple \small{VHDL} process
should be generated. This process updates the state at every
clock cycle, by assigning the new state to the current state. This
1125 will be recognized by synthesis tools as a register specification.
1128 When applying these rules to the description in
\in{example}[ex:AvgStateNormal], we are left with the description
1130 in \in{example}[ex:AvgStateRemoved]. All the parts that don't
1131 generate any \VHDL directly are crossed out, leaving just the
1132 actual flow of values in the final hardware.
avg = λi.--λspacked.--
1137 s = --spacked ▶ (AccState, Word)--
1138 --accs = case s of (accs, _) -> accs--
1139 count = case s of (--_,-- count) -> count
1140 accres = acc i --accs--
1141 --accs' = case accres of (accs', _) -> accs'--
1142 sum = case accres of (--_,-- sum) -> sum
1145 s' = (--accs',-- count')
1146 --spacked' = s' ▶ State (AccState, Word)--
1147 res = (--spacked',-- o)
If we really left out the crossed-out parts, we would get a slightly
weird program: There is a variable \lam{s} which has no value, and there
is a variable \lam{s'} that is never used. Together, these two will form
the state process of the function. \lam{s} contains the \quote{current} state,
\lam{s'} is assigned the \quote{next} state. So, at the end of each clock
1157 cycle, \lam{s'} should be assigned to \lam{s}.
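
This register behaviour can be modelled in plain Haskell by representing
every signal as a lazy list with one element per clock cycle. The sketch
below is our own illustration and not something the prototype generates:
the names \hs{register}, \hs{accCircuit} and \hs{avgCircuit}, the initial
value of zero and the use of \type{div} for the average are all
assumptions.

\starttyping
import Data.Word (Word)

-- A register: its output is its input delayed by one clock cycle
-- (assumption: the initial value is supplied explicitly)
register :: s -> [s] -> [s]
register initial nexts = initial : nexts

-- acc with its own register holding the running sum
accCircuit :: [Word] -> [Word]
accCircuit is = sums
  where
    s = register 0 sums        -- current state
    sums = zipWith (+) is s    -- next state, which is also the output

-- avg only registers the count; the substate lives inside accCircuit
avgCircuit :: [Word] -> [Word]
avgCircuit is = zipWith div sums counts'
  where
    sums = accCircuit is
    count = register 0 counts'   -- current state
    counts' = map (+ 1) count    -- next state
\stoptyping

Here \type{avgCircuit [4, 8, 0]} evaluates to \type{[4, 6, 4]}. Note that
\hs{avgCircuit} contains a register for the count only, just like the
description with the substate parts left out.
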
1159 As you can see, the definition of \lam{s'} is still present, since
it does not have a state type. The \lam{accs'} substate has been
1161 removed, leaving us just with the state of \lam{avg} itself.
As an illustration of the result of this translation,
\in{example}[ex:AccStateVHDL] and \in{example}[ex:AvgStateVHDL] show the \VHDL that is
generated from the examples in this section.
1167 \startbuffer[AvgStateVHDL]
1168 entity avgComponent_0 is
1169 port (\izAlE2\ : in \unsigned_31\;
1170 \foozAo1zAo12\ : out \(,)unsigned_31\;
1171 clock : in std_logic;
1172 resetn : in std_logic);
1173 end entity avgComponent_0;
1176 architecture structural of avgComponent_0 is
1177 signal \szAlG2\ : \(,)unsigned_31\;
1178 signal \countzAlW2\ : \unsigned_31\;
1179 signal \dszAm62\ : \(,)unsigned_31\;
1180 signal \sumzAmk3\ : \unsigned_31\;
1181 signal \reszAnCzAnM2\ : \unsigned_31\;
1182 signal \foozAnZzAnZ2\ : \unsigned_31\;
1183 signal \reszAnfzAnj3\ : \unsigned_31\;
1184 signal \s'zAmC2\ : \(,)unsigned_31\;
1186 \countzAlW2\ <= \szAlG2\.A;
1188 \comp_ins_dszAm62\ : entity accComponent_1
1189 port map (\izAob3\ => \izAlE2\,
1190 \foozAoBzAoB2\ => \dszAm62\,
1194 \sumzAmk3\ <= \dszAm62\.A;
1196 \reszAnCzAnM2\ <= to_unsigned(1, 32);
1198 \foozAnZzAnZ2\ <= \countzAlW2\ + \reszAnCzAnM2\;
1200 \reszAnfzAnj3\ <= \sumzAmk3\ * \foozAnZzAnZ2\;
1202 \s'zAmC2\.A <= \foozAnZzAnZ2\;
1204 \foozAo1zAo12\.A <= \reszAnfzAnj3\;
1206 state : process (clock, resetn)
1208 if resetn = '0' then
elsif rising_edge(clock) then
1210 \szAlG2\ <= \s'zAmC2\;
1213 end architecture structural;
1215 \startbuffer[AccStateVHDL]
1216 entity accComponent_1 is
1217 port (\izAob3\ : in \unsigned_31\;
1218 \foozAoBzAoB2\ : out \(,)unsigned_31\;
1219 clock : in std_logic;
1220 resetn : in std_logic);
1221 end entity accComponent_1;
1224 architecture structural of accComponent_1 is
1225 signal \szAod3\ : \unsigned_31\;
1226 signal \reszAonzAor3\ : \unsigned_31\;
1228 \reszAonzAor3\ <= \szAod3\ + \izAob3\;
1230 \foozAoBzAoB2\.A <= \reszAonzAor3\;
1232 state : process (clock, resetn)
1234 if resetn = '0' then
elsif rising_edge(clock) then
1236 \szAod3\ <= \reszAonzAor3\;
1239 end architecture structural;
1242 \placeexample[][ex:AccStateVHDL]{\VHDL generated for acc from \in{example}[ex:AvgState]}
1243 {\typebuffer[AccStateVHDL]}
1244 \placeexample[][ex:AvgStateVHDL]{\VHDL generated for avg from \in{example}[ex:AvgState]}
1245 {\typebuffer[AvgStateVHDL]}
1246 % \subsection{Initial state}
1247 % How to specify the initial state? Cannot be done inside a hardware
1248 % function, since the initial state is its own state argument for the first
1249 % call (unless you add an explicit, synchronous reset port).
1251 % External init state is natural for simulation.
1253 % External init state works for hardware generation as well.
1255 % Implementation issues: state splitting, linking input to output state,
1256 % checking usage constraints on state variables.
1258 % \todo{Implementation issues: Separate compilation, simplified core.}
1260 % vim: set sw=2 sts=2 expandtab: