Chapters/Normalization.tex

   1 \chapter{Normalization}
   2
   3 % A helper to print a single example in the half the page width. The example
   4 % text should be in a buffer whose name is given in an argument.
   5 %
   6 % The align=right option really does left-alignment, but without the program
   7 % will end up on a single line. The strut=no option prevents a bunch of empty
   8 % space at the start of the frame.
   9 \define[1]\example{
  10   \framed[offset=1mm,align=right,strut=no]{
  11     \setuptyping[option=LAM,style=sans,before=,after=]
  12     \typebuffer[#1]
  13     \setuptyping[option=none,style=\tttf]
  14   }
  15 }
  16
  17
  18 % A transformation example
  19 \definefloat[example][examples]
  20 \setupcaption[example][location=top] % Put captions on top
  21
  22 \define[3]\transexample{
  23   \placeexample[here]{#1}
  24   \startcombination[2*1]
  25     {\example{#2}}{Original program}
  26     {\example{#3}}{Transformed program}
  27   \stopcombination
  28 }
  29 %
  30 %\define[3]\transexampleh{
  31 %%  \placeexample[here]{#1}
  32 %%  \startcombination[1*2]
  33 %%    {\example{#2}}{Original program}
  34 %%    {\example{#3}}{Transformed program}
  35 %%  \stopcombination
  36 %}
  37
  38 The first step in the core to VHDL translation process, is normalization. We
  39 aim to bring the core description into a simpler form, which we can
  40 subsequently translate into VHDL easily. This normal form is needed because
  41 the full core language is more expressive than VHDL in some areas and because
  42 core can describe expressions that do not have a direct hardware
  43 interpretation.
  44
  45 TODO: Describe core properties not supported in VHDL, and describe how the
  46 VHDL we want to generate should look like.
  47
  48 \section{Goal}
  49 The transformations described here have a well-defined goal: To bring the
  50 program in a well-defined form that is directly translatable to hardware,
  51 while fully preserving the semantics of the program.
  52
  53 This {\em normal form} is again a Core program, but with a very specific
  54 structure. A function in normal form has nested lambda's at the top, which
  55 produce a let expression. This let expression binds every function application
  56 in the function and produces a simple identifier. Every bound value in
  57 the let expression is either a simple function application or a case
  58 expression to extract a single element from a tuple returned by a
  59 function.
  60
  61 An example of a program in canonical form would be:
  62
  63 \startlambda
  64   -- All arguments are an inital lambda
  65   λa.λd.λsp.
  66   -- There are nested let expressions at top level
  67   let
  68     -- Unpack the state by coercion
  69     s = sp :: (Word, Word)
  70     -- Extract both registers from the state
  71     r1 = case s of (fst, snd) -> fst
  72     r2 = case s of (fst, snd) -> snd
  73     -- Calling some other user-defined function.
  74     d' = foo d
  75     -- Conditional connections
  76     out = case a of
  77       High -> r1
  78       Low -> r2
  79     r1' = case a of
  80       High -> d
  81       Low -> r1
  82     r2' = case a of
  83       High -> r2
  84       Low -> d
  85     -- Packing a tuple
  86     s' = (,) r1' r2'
  87     -- Packing the state by coercion
  88     sp' = s' :: State (Word, Word)
  89     -- Pack our return value
  90     res = (,) sp' out
  91   in
  92     -- The actual result
  93     res
  94 \stoplambda
  95
  96 \startlambda
  97 \italic{normal} = \italic{lambda}
  98 \italic{lambda} = λvar.\italic{lambda} (representable(typeof(var)))
  99                 | \italic{toplet}
 100 \italic{toplet} = let \italic{binding} in \italic{toplet}
 101                 | letrec [\italic{binding}] in \italic{toplet}
 102                 | var (representable(typeof(var)), fvar(var))
 103 \italic{binding} = var = \italic{rhs} (representable(typeof(rhs)))
 104                  -- State packing and unpacking by coercion
 105                  | var0 = var1 :: State ty (fvar(var1))
 106                  | var0 = var1 :: ty (var0 :: State ty) (fvar(var1))
 107 \italic{rhs} = userapp
 108              | builtinapp
 109              -- Extractor case
 110              | case var of C a0 ... an -> ai (fvar(var))
 111              -- Selector case
 112              | case var of (fvar(var))
 113                 DEFAULT -> var0 (fvar(var0))
 114                 C w0 ... wn -> resvar (\forall{}i, wi \neq resvar, fvar(resvar))
 115 \italic{userapp} = \italic{userfunc}
 116                  | \italic{userapp} {userarg}
 117 \italic{userfunc} = var (tvar(var))
 118 \italic{userarg} = var (fvar(var))
 119 \italic{builtinapp} = \italic{builtinfunc}
 120                     | \italic{builtinapp} \italic{builtinarg}
 121 \italic{builtinfunc} = var (bvar(var))
 122 \italic{builtinarg} = \italic{coreexpr}
 123 \stoplambda
 124
 125 -- TODO: Define tvar, fvar, typeof, representable
 126 -- TODO: Limit builtinarg further
 127
 128 -- TODO: There can still be other casts around (which the code can handle,
 129 e.g., ignore), which still need to be documented here.
 130
 131 -- TODO: Note about the selector case. It just supports Bit and Bool
 132 currently, perhaps it should be generalized in the normal form?
 133
 134 When looking at such a program from a hardware perspective, the top level
 135 lambda's define the input ports. The value produced by the let expression is
 136 the output port. Most function applications bound by the let expression
 137 define a component instantiation, where the input and output ports are mapped
 138 to local signals or arguments. Some of the others use a builtin
 139 construction (\eg the \lam{case} statement) or call a builtin function
 140 (\eg \lam{add} or \lam{sub}). For these, a hardcoded VHDL translation is
 141 available.
 142
 143 \subsection{Normal definition}
 144 Formally, the normal form is a core program obeying the following
 145 constraints. TODO: Update this section, this is probably not completely
 146 accurate or relevant anymore.
 147
 148 \startitemize[R,inmargin]
 149 %\item All top level binds must have the form $\expr{\bind{fun}{lamexpr}}$.
 150 %$fun$ is an identifier that will be bound as a global identifier.
 151 %\item A $lamexpr$ has the form $\expr{\lam{arg}{lamexpr}}$ or
 152 %$\expr{letexpr}$. $arg$ is an identifier which will be bound as an $argument$.
 153 %\item[letexpr] A $letexpr$ has the form $\expr{\letexpr{letbinds}{retexpr}}$.
 154 %\item $letbinds$ is a list with elements of the form
 155 %$\expr{\bind{res}{appexpr}}$ or $\expr{\bind{res}{builtinexpr}}$, where $res$ is
 156 %an identifier that will be bound as local identifier. The type of the bound
 157 %value must be a $hardware\;type$.
 158 %\item[builtinexpr] A $builtinexpr$ is an expression that can be mapped to an
 159 %equivalent VHDL expression. Since there are many supported forms for this,
 160 %these are defined in a separate table.
 161 %\item An $appexpr$ has the form $\expr{fun}$ or $\expr{\app{appexpr}{x}}$,
 162 %where $fun$ is a global identifier and $x$ is a local identifier.
 163 %\item[retexpr] A $retexpr$ has the form $\expr{x}$ or $\expr{tupexpr}$, where $x$ is a local identifier that is bound as an $argument$ or $result$.  A $retexpr$ must
 164 %be of a $hardware\;type$.
 165 %\item A $tupexpr$ has the form $\expr{con}$ or $\expr{\app{tupexpr}{x}}$,
 166 %where $con$ is a tuple constructor ({\em e.g.} $(,)$ or $(,,,)$) and $x$ is
 167 %a local identifier.
 168 %\item A $hardware\;type$ is a type that can be directly translated to
 169 %hardware. This includes the types $Bit$, $SizedWord$, tuples containing
 170 %elements of $hardware\;type$s, and will include others. This explicitely
 171 %excludes function types.
 172 \stopitemize
 173
 174 TODO: Say something about uniqueness of identifiers
 175
 176 \subsection{Builtin expressions}
 177 A $builtinexpr$, as defined at \in[builtinexpr] can have any of the following forms.
 178
 179 \startitemize[m,inmargin]
 180 %\item
 181 %$tuple\_extract=\expr{\case{t}{\alt{\app{con}{x_0\;x_1\;..\;x_n}}{x_i}}}$,
 182 %where $t$ can be any local identifier, $con$ is a tuple constructor ({\em
 183 %e.g.} $(,)$ or $(,,,)$), $x_0$ to $x_n$ can be any identifier, and $x_i$ can
 184 %be any of $x_0$ to $x_n$. A case expression must have a $hardware\;type$.
 185 %\item TODO: Many more!
 186 \stopitemize
 187
 188 \section{Transform passes}
 189
 190 In this section we describe the actual transforms. Here we're using
 191 the core language in a notation that resembles lambda calculus.
 192
 193 Each of these transforms is meant to be applied to every (sub)expression
 194 in a program, for as long as it applies. Only when none of the
 195 expressions can be applied anymore, the program is in normal form. We
 196 hope to be able to prove that this form will obey all of the constraints
 197 defined above, but this has yet to happen (though it seems likely that
 198 it will).
 199
 200 Each of the transforms will be described informally first, explaining
 201 the need for and goal of the transform. Then, a formal definition is
 202 given, using a familiar syntax from the world of logic. Each transform
 203 is specified as a number of conditions (above the horizontal line) and a
 204 number of conclusions (below the horizontal line). The details of using
 205 this notation are still a bit fuzzy, so comments are welcom.
 206
 207 TODO: Formally describe the "apply to every (sub)expression" in terms of
 208 rules with full transformations in the conditions.
 209
 210 \subsection{η-abstraction}
 211 This transformation makes sure that all arguments of a function-typed
 212 expression are named, by introducing lambda expressions. When combined with
 213 β-reduction and function inlining below, all function-typed expressions should
 214 be lambda abstractions or global identifiers.
 215
 216 \starttrans
 217 E                 \lam{E :: * -> *}
 218 --------------    \lam{E} is not the first argument of an application.
 219 λx.E x            \lam{E} is not a lambda abstraction.
 220                   \lam{x} is a variable that does not occur free in \lam{E}.
 221 \stoptrans
 222
 223 \startbuffer[from]
 224 foo = λa -> case a of
 225   True -> λb.mul b b
 226   False -> id
 227 \stopbuffer
 228
 229 \startbuffer[to]
 230 foo = λa.λx -> (case a of
 231     True -> λb.mul b b
 232     False -> λy.id y) x
 233 \stopbuffer
 234
 235 \transexample{η-abstraction}{from}{to}
 236
 237 \subsection{Extended β-reduction}
 238 This transformation is meant to propagate application expressions downwards
 239 into expressions as far as possible. In lambda calculus, this reduction
 240 is known as β-reduction, but it is of course only defined for
 241 applications of lambda abstractions. We extend this reduction to also
 242 work for the rest of core (case and let expressions).
 243
 244 For let expressions:
 245 \starttrans
 246 let binds in E) M
 247 -----------------
 248 let binds in E M
 249 \stoptrans
 250
 251 For case statements:
 252 \starttrans
 253 (case x of
 254   p1 -> E1
 255   \vdots
 256   pn -> En) M
 257 -----------------
 258 case x of
 259   p1 -> E1 M
 260   \vdots
 261   pn -> En M
 262 \stoptrans
 263
 264 For lambda expressions:
 265 \starttrans
 266 (λx.E) M
 267 -----------------
 268 E[M/x]
 269 \stoptrans
 270
 271 % And an example
 272 \startbuffer[from]
 273 ( let a = (case x of
 274             True -> id
 275             False -> neg
 276           ) 1
 277       b = (let y = 3 in add y) 2
 278   in
 279     (λz.add 1 z)
 280 ) 3
 281 \stopbuffer
 282
 283 \startbuffer[to]
 284 let a = case x of
 285            True -> id 1
 286            False -> neg 1
 287     b = let y = 3 in add y 2
 288 in
 289   add 1 3
 290 \stopbuffer
 291
 292 \transexample{Extended β-reduction}{from}{to}
 293
 294 \subsection{Let derecursification}
 295 This transformation is meant to make lets non-recursive whenever possible.
 296 This might allow other optimizations to do their work better. TODO: Why is
 297 this needed exactly?
 298
 299 \subsection{Let flattening}
 300 This transformation puts nested lets in the same scope, by lifting the
 301 binding(s) of the inner let into a new let around the outer let. Eventually,
 302 this will cause all let bindings to appear in the same scope (they will all be
 303 in scope for the function return value).
 304
 305 Note that this transformation does not try to be smart when faced with
 306 recursive lets, it will just leave the lets recursive (possibly joining a
 307 recursive and non-recursive let into a single recursive let). The let
 308 rederursification transformation will do this instead.
 309
 310 \starttrans
 311 letnonrec x = (let bindings in M) in N
 312 ------------------------------------------
 313 let bindings in (letnonrec x = M) in N
 314 \stoptrans
 315
 316 \starttrans
 317 letrec
 318   \vdots
 319   x = (let bindings in M)
 320   \vdots
 321 in
 322   N
 323 ------------------------------------------
 324 letrec
 325   \vdots
 326   bindings
 327   x = M
 328   \vdots
 329 in
 330   N
 331 \stoptrans
 332
 333 \startbuffer[from]
 334 let
 335   a = letrec
 336     x = 1
 337     y = 2
 338   in
 339     x + y
 340 in
 341   letrec
 342     b = let c = 3 in a + c
 343     d = 4
 344   in
 345     d + b
 346 \stopbuffer
 347 \startbuffer[to]
 348 letrec
 349   x = 1
 350   y = 2
 351 in
 352   let
 353     a = x + y
 354   in
 355     letrec
 356       c = 3
 357       b = a + c
 358       d = 4
 359     in
 360       d + b
 361 \stopbuffer
 362
 363 \transexample{Let flattening}{from}{to}
 364
 365 \subsection{Empty let removal}
 366 This transformation is simple: It removes recursive lets that have no bindings
 367 (which usually occurs when let derecursification removes the last binding from
 368 it).
 369
 370 \starttrans
 371 letrec in M
 372 --------------
 373 M
 374 \stoptrans
 375
 376 \subsection{Simple let binding removal}
 377 This transformation inlines simple let bindings (\eg a = b).
 378
 379 This transformation is not needed to get into normal form, but makes the
 380 resulting VHDL a lot shorter.
 381
 382 \starttrans
 383 letnonrec
 384   a = b
 385 in
 386   M
 387 -----------------
 388 M[b/a]
 389 \stoptrans
 390
 391 \starttrans
 392 letrec
 393   \vdots
 394   a = b
 395   \vdots
 396 in
 397   M
 398 -----------------
 399 let
 400   \vdots [b/a]
 401   \vdots [b/a]
 402 in
 403   M[b/a]
 404 \stoptrans
 405
 406 \subsection{Unused let binding removal}
 407 This transformation removes let bindings that are never used. Usually,
 408 the desugarer introduces some unused let bindings.
 409
 410 This normalization pass should really be unneeded to get into normal form
 411 (since ununsed bindings are not forbidden by the normal form), but in practice
 412 the desugarer or simplifier emits some unused bindings that cannot be
 413 normalized (e.g., calls to a \type{PatError} (TODO: Check this name)). Also,
 414 this transformation makes the resulting VHDL a lot shorter.
 415
 416 \starttrans
 417 let a = E in M
 418 ----------------------------    \lam{a} does not occur free in \lam{M}
 419 M
 420 \stoptrans
 421
 422 \starttrans
 423 letrec
 424   \vdots
 425   a = E
 426   \vdots
 427 in
 428   M
 429 ----------------------------    \lam{a} does not occur free in \lam{M}
 430 letrec
 431   \vdots
 432   \vdots
 433 in
 434   M
 435 \stoptrans
 436
 437 \subsection{Non-representable binding inlining}
 438 This transform inlines let bindings that have a non-representable type. Since
 439 we can never generate a signal assignment for these bindings (we cannot
 440 declare a signal assignment with a non-representable type, for obvious
 441 reasons), we have no choice but to inline the binding to remove it.
 442
 443 If the binding is non-representable because it is a lambda abstraction, it is
 444 likely that it will inlined into an application and β-reduction will remove
 445 the lambda abstraction and turn it into a representable expression at the
 446 inline site. The same holds for partial applications, which can be turned into
 447 full applications by inlining.
 448
 449 Other cases of non-representable bindings we see in practice are primitive
 450 Haskell types. In most cases, these will not result in a valid normalized
 451 output, but then the input would have been invalid to start with. There is one
 452 exception to this: When a builtin function is applied to a non-representable
 453 expression, things might work out in some cases. For example, when you write a
 454 literal \hs{SizedInt} in Haskell, like \hs{1 :: SizedInt D8}, this results in
 455 the following core: \lam{fromInteger (smallInteger 10)}, where for example
 456 \lam{10 :: GHC.Prim.Int\#} and \lam{smallInteger 10 :: Integer} have
 457 non-representable types. TODO: This/these paragraph(s) should probably become a
 458 separate discussion somewhere else.
 459
 460 \starttrans
 461 letnonrec a = E in M
 462 --------------------------    \lam{E} has a non-representable type.
 463 M[E/a]
 464 \stoptrans
 465
 466 \starttrans
 467 letrec
 468   \vdots
 469   a = E
 470   \vdots
 471 in
 472   M
 473 --------------------------    \lam{E} has a non-representable type.
 474 letrec
 475   \vdots [E/a]
 476   \vdots [E/a]
 477 in
 478   M[E/a]
 479 \stoptrans
 480
 481 \startbuffer[from]
 482 letrec
 483   a = smallInteger 10
 484   inc = λa -> add a 1
 485   inc' = add 1
 486   x = fromInteger a
 487 in
 488   inc (inc' x)
 489 \stopbuffer
 490
 491 \startbuffer[to]
 492 letrec
 493   x = fromInteger (smallInteger 10)
 494 in
 495   (λa -> add a 1) (add 1 x)
 496 \stopbuffer
 497
 498 \transexample{Let flattening}{from}{to}
 499
 500 \subsection{Compiler generated top level binding inlining}
 501 TODO
 502
 503 \subsection{Scrutinee simplification}
 504 This transform ensures that the scrutinee of a case expression is always
 505 a simple variable reference.
 506
 507 \starttrans
 508 case E of
 509   alts
 510 -----------------        \lam{E} is not a local variable reference
 511 let x = E in
 512   case E of
 513     alts
 514 \stoptrans
 515
 516 \startbuffer[from]
 517 case (foo a) of
 518   True -> a
 519   False -> b
 520 \stopbuffer
 521
 522 \startbuffer[to]
 523 let x = foo a in
 524   case x of
 525     True -> a
 526     False -> b
 527 \stopbuffer
 528
 529 \transexample{Let flattening}{from}{to}
 530
 531
 532 \subsection{Case simplification}
 533 This transformation ensures that all case expressions become normal form. This
 534 means they will become one of:
 535 \startitemize
 536 \item An extractor case with a single alternative that picks a single field
 537 from a datatype, \eg \lam{case x of (a, b) -> a}.
 538 \item A selector case with multiple alternatives and only wild binders, that
 539 makes a choice between expressions based on the constructor of another
 540 expression, \eg \lam{case x of Low -> a; High -> b}.
 541 \stopitemize
 542
 543 \starttrans
 544 case E of
 545   C0 v0,0 ... v0,m -> E0
 546   \vdots
 547   Cn vn,0 ... vn,m -> En
 548 --------------------------------------------------- \forall i \forall j, 0 <= i <= n, 0 <= i < m (\lam{wi,j} is a wild (unused) binder)
 549 letnonrec
 550   v0,0 = case x of C0 v0,0 .. v0,m -> v0,0
 551   \vdots
 552   v0,m = case x of C0 v0,0 .. v0,m -> v0,m
 553   x0 = E0
 554   \dots
 555   vn,m = case x of Cn vn,0 .. vn,m -> vn,m
 556   xn = En
 557 in
 558   case E of
 559     C0 w0,0 ... w0,m -> x0
 560     \vdots
 561     Cn wn,0 ... wn,m -> xn
 562 \stoptrans
 563
 564 TODO: This transformation specified like this is complicated and misses
 565 conditions to prevent looping with itself. Perhaps we should split it here for
 566 discussion?
 567
 568 \startbuffer[from]
 569 case a of
 570   True -> add b 1
 571   False -> add b 2
 572 \stopbuffer
 573
 574 \startbuffer[to]
 575 letnonrec
 576   x0 = add b 1
 577   x1 = add b 2
 578 in
 579   case a of
 580     True -> x0
 581     False -> x1
 582 \stopbuffer
 583
 584 \transexample{Selector case simplification}{from}{to}
 585
 586 \startbuffer[from]
 587 case a of
 588   (,) b c -> add b c
 589 \stopbuffer
 590 \startbuffer[to]
 591 letnonrec
 592   b = case a of (,) b c -> b
 593   c = case a of (,) b c -> c
 594   x0 = add b c
 595 in
 596   case a of
 597     (,) w0 w1 -> x0
 598 \stopbuffer
 599
 600 \transexample{Extractor case simplification}{from}{to}
 601
 602 \subsection{Case removal}
 603 This transform removes any case statements with a single alternative and
 604 only wild binders.
 605
 606 These "useless" case statements are usually leftovers from case simplification
 607 on extractor case (see the previous example).
 608
 609 \starttrans
 610 case x of
 611   C v0 ... vm -> E
 612 ----------------------     \lam{\forall i, 0 <= i <= m} (\lam{vi} does not occur free in E)
 613 E
 614 \stoptrans
 615
 616 \startbuffer[from]
 617 case a of
 618   (,) w0 w1 -> x0
 619 \stopbuffer
 620
 621 \startbuffer[to]
 622 x0
 623 \stopbuffer
 624
 625 \transexample{Case removal}{from}{to}
 626
 627 \subsection{Argument simplification}
 628 The transforms in this section deal with simplifying application
 629 arguments into normal form. The goal here is to:
 630
 631 \startitemize
 632  \item Make all arguments of user-defined functions (\eg, of which
 633  we have a function body) simple variable references of a runtime
 634  representable type. This is needed, since these applications will be turned
 635  into component instantiations.
 636  \item Make all arguments of builtin functions one of:
 637    \startitemize
 638     \item A type argument.
 639     \item A dictionary argument.
 640     \item A type level expression.
 641     \item A variable reference of a runtime representable type.
 642     \item A variable reference or partial application of a function type.
 643    \stopitemize
 644 \stopitemize
 645
 646 When looking at the arguments of a user-defined function, we can
 647 divide them into two categories:
 648 \startitemize
 649   \item Arguments with a runtime representable type (\eg bits or vectors).
 650
 651         These arguments can be preserved in the program, since they can
 652         be translated to input ports later on.  However, since we can
 653         only connect signals to input ports, these arguments must be
 654         reduced to simple variables (for which signals will be
 655         produced). This is taken care of by the argument extraction
 656         transform.
 657   \item Non-runtime representable typed arguments.
 658
 659         These arguments cannot be preserved in the program, since we
 660         cannot represent them as input or output ports in the resulting
 661         VHDL. To remove them, we create a specialized version of the
 662         called function with these arguments filled in. This is done by
 663         the argument propagation transform.
 664
 665         Typically, these arguments are type and dictionary arguments that are
 666         used to make functions polymorphic. By propagating these arguments, we
 667         are essentially doing the same which GHC does when it specializes
 668         functions: Creating multiple variants of the same function, one for
 669         each type for which it is used. Other common non-representable
 670         arguments are functions, e.g. when calling a higher order function
 671         with another function or a lambda abstraction as an argument.
 672
 673         The reason for doing this is similar to the reasoning provided for
 674         the inlining of non-representable let bindings above. In fact, this
 675         argument propagation could be viewed as a form of cross-function
 676         inlining.
 677 \stopitemize
 678
 679 TODO: Check the following itemization.
 680
 681 When looking at the arguments of a builtin function, we can divide them
 682 into categories:
 683
 684 \startitemize
 685   \item Arguments with a runtime representable type.
 686
 687         As we have seen with user-defined functions, these arguments can
 688         always be reduced to a simple variable reference, by the
 689         argument extraction transform. Performing this transform for
 690         builtin functions as well, means that the translation of builtin
 691         functions can be limited to signal references, instead of
 692         needing to support all possible expressions.
 693
 694   \item Arguments with a function type.
 695
 696         These arguments are functions passed to higher order builtins,
 697         like \lam{map} and \lam{foldl}. Since implementing these
 698         functions for arbitrary function-typed expressions (\eg, lambda
 699         expressions) is rather comlex, we reduce these arguments to
 700         (partial applications of) global functions.
 701
 702         We can still support arbitrary expressions from the user code,
 703         by creating a new global function containing that expression.
 704         This way, we can simply replace the argument with a reference to
 705         that new function. However, since the expression can contain any
 706         number of free variables we also have to include partial
 707         applications in our normal form.
 708
 709         This category of arguments is handled by the function extraction
 710         transform.
 711   \item Other unrepresentable arguments.
 712
 713         These arguments can take a few different forms:
 714         \startdesc{Type arguments}
 715           In the core language, type arguments can only take a single
 716           form: A type wrapped in the Type constructor. Also, there is
 717           nothing that can be done with type expressions, except for
 718           applying functions to them, so we can simply leave type
 719           arguments as they are.
 720         \stopdesc
 721         \startdesc{Dictionary arguments}
 722           In the core language, dictionary arguments are used to find
 723           operations operating on one of the type arguments (mostly for
 724           finding class methods). Since we will not actually evaluatie
 725           the function body for builtin functions and can generate
 726           code for builtin functions by just looking at the type
 727           arguments, these arguments can be ignored and left as they
 728           are.
 729         \stopdesc
 730         \startdesc{Type level arguments}
 731           Sometimes, we want to pass a value to a builtin function, but
 732           we need to know the value at compile time. Additionally, the
 733           value has an impact on the type of the function. This is
 734           encoded using type-level values, where the actual value of the
 735           argument is not important, but the type encodes some integer,
 736           for example. Since the value is not important, the actual form
 737           of the expression does not matter either and we can leave
 738           these arguments as they are.
 739         \stopdesc
 740         \startdesc{Other arguments}
 741           Technically, there is still a wide array of arguments that can
 742           be passed, but does not fall into any of the above categories.
 743           However, none of the supported builtin functions requires such
 744           an argument. This leaves use with passing unsupported types to
 745           a function, such as calling \lam{head} on a list of functions.
 746
 747           In these cases, it would be impossible to generate hardware
 748           for such a function call anyway, so we can ignore these
 749           arguments.
 750
 751           The only way to generate hardware for builtin functions with
 752           arguments like these, is to expand the function call into an
 753           equivalent core expression (\eg, expand map into a series of
 754           function applications). But for now, we choose to simply not
 755           support expressions like these.
 756         \stopdesc
 757
 758         From the above, we can conclude that we can simply ignore these
 759         other unrepresentable arguments and focus on the first two
 760         categories instead.
 761 \stopitemize
 762
 763 \subsubsection{Argument simplification}
 764 This transform deals with arguments to functions that
 765 are of a runtime representable type. It ensures that they will all become
 766 references to global variables, or local signals in the resulting VHDL.
 767
 768 TODO: It seems we can map an expression to a port, not only a signal.
 769 Perhaps this makes this transformation not needed?
 770 TODO: Say something about dataconstructors (without arguments, like True
 771 or False), which are variable references of a runtime representable
 772 type, but do not result in a signal.
 773
 774 To reduce a complex expression to a simple variable reference, we create
 775 a new let expression around the application, which binds the complex
 776 expression to a new variable. The original function is then applied to
 777 this variable.
 778
 779 \starttrans
 780 M N
 781 --------------------    \lam{N} is of a representable type
 782 let x = N in M x        \lam{N} is not a local variable reference
 783 \stoptrans
 784
 785 \startbuffer[from]
 786 add (add a 1) 1
 787 \stopbuffer
 788
 789 \startbuffer[to]
 790 let x = add a 1 in add x 1
 791 \stopbuffer
 792
 793 \transexample{Argument extraction}{from}{to}
 794
 795 \subsubsection{Function extraction}
 796 This transform deals with function-typed arguments to builtin functions.
 797 Since these arguments cannot be propagated, we choose to extract them
 798 into a new global function instead.
 799
 800 Any free variables occuring in the extracted arguments will become
 801 parameters to the new global function. The original argument is replaced
 802 with a reference to the new function, applied to any free variables from
 803 the original argument.
 804
 805 This transformation is useful when applying higher order builtin functions
 806 like \hs{map} to a lambda abstraction, for example. In this case, the code
 807 that generates VHDL for \hs{map} only needs to handle top level functions and
 808 partial applications, not any other expression (such as lambda abstractions or
 809 even more complicated expressions).
 810
 811 \starttrans
 812 M N                     \lam{M} is a (partial aplication of a) builtin function.
 813 ---------------------   \lam{f0 ... fn} = free local variables of \lam{N}
 814 M x f0 ... fn           \lam{N :: a -> b}
 815 ~                       \lam{N} is not a (partial application of) a top level function
 816 x = λf0 ... λfn.N
 817 \stoptrans
 818
 819 \startbuffer[from]
 820 map (λa . add a b) xs
 821
 822 map (add b) ys
 823 \stopbuffer
 824
 825 \startbuffer[to]
 826 x0 = λb.λa.add a b
 827 ~
 828 map x0 xs
 829
 830 x1 = λb.add b
 831 map x1 ys
 832 \stopbuffer
 833
 834 \transexample{Function extraction}{from}{to}
 835
 836 \subsubsection{Argument propagation}
 837 This transform deals with arguments to user-defined functions that are
 838 not representable at runtime. This means these arguments cannot be
 839 preserved in the final form and most be {\em propagated}.
 840
 841 Propagation means to create a specialized version of the called
 842 function, with the propagated argument already filled in. As a simple
 843 example, in the following program:
 844
 845 \startlambda
 846 f = λa.λb.a + b
 847 inc = λa.f a 1
 848 \stoplambda
 849
 850 we could {\em propagate} the constant argument 1, with the following
 851 result:
 852
 853 \startlambda
 854 f' = λa.a + 1
 855 inc = λa.f' a
 856 \stoplambda
 857
 858 Special care must be taken when the to-be-propagated expression has any
 859 free variables. If this is the case, the original argument should not be
 860 removed alltogether, but replaced by all the free variables of the
 861 expression. In this way, the original expression can still be evaluated
 862 inside the new function. Also, this brings us closer to our goal: All
 863 these free variables will be simple variable references.
 864
 865 To prevent us from propagating the same argument over and over, a simple
 866 local variable reference is not propagated (since is has exactly one
 867 free variable, itself, we would only replace that argument with itself).
 868
 869 This shows that any free local variables that are not runtime representable
 870 cannot be brought into normal form by this transform. We rely on an
 871 inlining transformation to replace such a variable with an expression we
 872 can propagate again.
 873
 874 TODO: Move these definitions somewhere sensible.
 875
 876 Definition: A global variable is any variable that is bound at the
 877 top level of a program. A local variable is any other variable.
 878
 879 Definition: A hardware representable type is a type that we can generate
 880 a signal for in hardware. For example, a bit, a vector of bits, a 32 bit
 881 unsigned word, etc. Types that are not runtime representable notably
 882 include (but are not limited to): Types, dictionaries, functions.
 883
 884 Definition: A builtin function is a function for which a builtin
 885 hardware translation is available, because its actual definition is not
 886 translatable. A user-defined function is any other function.
 887
 888 \starttrans
 889 x = E
 890 ~
 891 x Y0 ... Yi ... Yn                               \lam{Yi} is not of a runtime representable type
 892 ---------------------------------------------    \lam{Yi} is not a local variable reference
 893 x' y0 ... yi-1 f0 ...  fm Yi+1 ... Yn            \lam{f0 ... fm} = free local vars of \lam{Yi}
 894 ~
 895 x' = λy0 ... yi-1 f0 ... fm yi+1 ... yn .
 896       E y0 ... yi-1 Yi yi+1 ... yn
 897
 898 \stoptrans
 899
 900 TODO: Example
 901
 902 \subsection{Cast propagation / simplification}
 903 This transform pushes casts down into the expression as far as possible. Since
 904 its exact role and need is not clear yet, this transformation is not yet
 905 specified.
 906
 907 \subsection{Return value simplification}
 908 This transformation ensures that the return value of a function is always a
 909 simple local variable reference.
 910
 911 Currently implemented using lambda simplification, let simplification, and
 912 top simplification. Should change into something like the following, which
 913 works only on the result of a function instead of any subexpression. This is
 914 achieved by the contexts, like \lam{x = E}, though this is strictly not
 915 correct (you could read this as "if there is any function \lam{x} that binds
 916 \lam{E}, any \lam{E} can be transformed, while we only mean the \lam{E} that
 917 is bound by \lam{x}. This might need some extra notes or something).
 918
 919 \starttrans
 920 x = E                            \lam{E} is representable
 921 ~                                \lam{E} is not a lambda abstraction
 922 E                                \lam{E} is not a let expression
 923 ---------------------------      \lam{E} is not a local variable reference
 924 let x = E in x
 925 \stoptrans
 926
 927 \starttrans
 928 x = λv0 ... λvn.E
 929 ~                                \lam{E} is representable
 930 E                                \lam{E} is not a let expression
 931 ---------------------------      \lam{E} is not a local variable reference
 932 let x = E in x
 933 \stoptrans
 934
 935 \starttrans
 936 x = λv0 ... λvn.let ... in E
 937 ~                                \lam{E} is representable
 938 E                                \lam{E} is not a local variable reference
 939 ---------------------------
 940 let x = E in x
 941 \stoptrans
 942
 943 \startbuffer[from]
 944 x = add 1 2
 945 \stopbuffer
 946
 947 \startbuffer[to]
 948 x = let x = add 1 2 in x
 949 \stopbuffer
 950
 951 \transexample{Return value simplification}{from}{to}
 952
 953 \subsection{Example sequence}
 954
 955 This section lists an example expression, with a sequence of transforms
 956 applied to it. The exact transforms given here probably don't exactly
 957 match the transforms given above anymore, but perhaps this can clarify
 958 the big picture a bit.
 959
 960 TODO: Update or remove this section.
 961
 962 \startlambda
 963   λx.
 964     let s = foo x
 965     in
 966       case s of
 967         (a, b) ->
 968           case a of
 969             High -> add
 970             Low -> let
 971               op' = case b of
 972                 High -> sub
 973                 Low  -> λc.λd.c
 974               in
 975                 λc.λd.op' d c
 976 \stoplambda
 977
 978 After top-level η-abstraction:
 979
 980 \startlambda
 981   λx.λc.λd.
 982     (let s = foo x
 983     in
 984       case s of
 985         (a, b) ->
 986           case a of
 987             High -> add
 988             Low -> let
 989               op' = case b of
 990                 High -> sub
 991                 Low  -> λc.λd.c
 992               in
 993                 λc.λd.op' d c
 994     ) c d
 995 \stoplambda
 996
 997 After (extended) β-reduction:
 998
 999 \startlambda
1000   λx.λc.λd.
1001     let s = foo x
1002     in
1003       case s of
1004         (a, b) ->
1005           case a of
1006             High -> add c d
1007             Low -> let
1008               op' = case b of
1009                 High -> sub
1010                 Low  -> λc.λd.c
1011               in
1012                 op' d c
1013 \stoplambda
1014
1015 After return value extraction:
1016
1017 \startlambda
1018   λx.λc.λd.
1019     let s = foo x
1020         r = case s of
1021               (a, b) ->
1022                 case a of
1023                   High -> add c d
1024                   Low -> let
1025                     op' = case b of
1026                       High -> sub
1027                       Low  -> λc.λd.c
1028                     in
1029                       op' d c
1030     in
1031       r
1032 \stoplambda
1033
1034 Scrutinee simplification does not apply.
1035
1036 After case binder wildening:
1037
1038 \startlambda
1039   λx.λc.λd.
1040     let s = foo x
1041         a = case s of (a, _) -> a
1042         b = case s of (_, b) -> b
1043         r = case s of (_, _) ->
1044               case a of
1045                 High -> add c d
1046                 Low -> let op' = case b of
1047                              High -> sub
1048                              Low  -> λc.λd.c
1049                        in
1050                          op' d c
1051     in
1052       r
1053 \stoplambda
1054
1055 After case value simplification
1056
1057 \startlambda
1058   λx.λc.λd.
1059     let s = foo x
1060         a = case s of (a, _) -> a
1061         b = case s of (_, b) -> b
1062         r = case s of (_, _) -> r'
1063         rh = add c d
1064         rl = let rll = λc.λd.c
1065                  op' = case b of
1066                    High -> sub
1067                    Low  -> rll
1068              in
1069                op' d c
1070         r' = case a of
1071                High -> rh
1072                Low -> rl
1073     in
1074       r
1075 \stoplambda
1076
1077 After let flattening:
1078
1079 \startlambda
1080   λx.λc.λd.
1081     let s = foo x
1082         a = case s of (a, _) -> a
1083         b = case s of (_, b) -> b
1084         r = case s of (_, _) -> r'
1085         rh = add c d
1086         rl = op' d c
1087         rll = λc.λd.c
1088         op' = case b of
1089           High -> sub
1090           Low  -> rll
1091         r' = case a of
1092                High -> rh
1093                Low -> rl
1094     in
1095       r
1096 \stoplambda
1097
1098 After function inlining:
1099
1100 \startlambda
1101   λx.λc.λd.
1102     let s = foo x
1103         a = case s of (a, _) -> a
1104         b = case s of (_, b) -> b
1105         r = case s of (_, _) -> r'
1106         rh = add c d
1107         rl = (case b of
1108           High -> sub
1109           Low  -> λc.λd.c) d c
1110         r' = case a of
1111           High -> rh
1112           Low -> rl
1113     in
1114       r
1115 \stoplambda
1116
1117 After (extended) β-reduction again:
1118
1119 \startlambda
1120   λx.λc.λd.
1121     let s = foo x
1122         a = case s of (a, _) -> a
1123         b = case s of (_, b) -> b
1124         r = case s of (_, _) -> r'
1125         rh = add c d
1126         rl = case b of
1127           High -> sub d c
1128           Low  -> d
1129         r' = case a of
1130           High -> rh
1131           Low -> rl
1132     in
1133       r
1134 \stoplambda
1135
1136 After case value simplification again:
1137
1138 \startlambda
1139   λx.λc.λd.
1140     let s = foo x
1141         a = case s of (a, _) -> a
1142         b = case s of (_, b) -> b
1143         r = case s of (_, _) -> r'
1144         rh = add c d
1145         rlh = sub d c
1146         rl = case b of
1147           High -> rlh
1148           Low  -> d
1149         r' = case a of
1150           High -> rh
1151           Low -> rl
1152     in
1153       r
1154 \stoplambda
1155
1156 After case removal:
1157
1158 \startlambda
1159   λx.λc.λd.
1160     let s = foo x
1161         a = case s of (a, _) -> a
1162         b = case s of (_, b) -> b
1163         r = r'
1164         rh = add c d
1165         rlh = sub d c
1166         rl = case b of
1167           High -> rlh
1168           Low  -> d
1169         r' = case a of
1170           High -> rh
1171           Low -> rl
1172     in
1173       r
1174 \stoplambda
1175
1176 After let bind removal:
1177
1178 \startlambda
1179   λx.λc.λd.
1180     let s = foo x
1181         a = case s of (a, _) -> a
1182         b = case s of (_, b) -> b
1183         rh = add c d
1184         rlh = sub d c
1185         rl = case b of
1186           High -> rlh
1187           Low  -> d
1188         r' = case a of
1189           High -> rh
1190           Low -> rl
1191     in
1192       r'
1193 \stoplambda
1194
1195 Application simplification is not applicable.