Normalization.tex

   1 \chapter{Normalization}
   2
   3 % A helper to print a single example in the half the page width. The example
   4 % text should be in a buffer whose name is given in an argument.
   5 %
   6 % The align=right option really does left-alignment, but without the program
   7 % will end up on a single line. The strut=no option prevents a bunch of empty
   8 % space at the start of the frame.
   9 \define[1]\example{
  10   \framed[offset=1mm,align=right,strut=no]{
  11     \setuptyping[option=LAM,style=sans,before=,after=]
  12     \typebuffer[#1]
  13     \setuptyping[option=none,style=\tttf]
  14   }
  15 }
  16
  17
  18 % A transformation example
  19 \definefloat[example][examples]
  20 \setupcaption[example][location=top] % Put captions on top
  21
  22 \define[3]\transexample{
  23   \placeexample[here]{#1}
  24   \startcombination[2*1]
  25     {\example{#2}}{Original program}
  26     {\example{#3}}{Transformed program}
  27   \stopcombination
  28 }
  29 %
  30 %\define[3]\transexampleh{
  31 %%  \placeexample[here]{#1}
  32 %%  \startcombination[1*2]
  33 %%    {\example{#2}}{Original program}
  34 %%    {\example{#3}}{Transformed program}
  35 %%  \stopcombination
  36 %}
  37
  38 The first step in the core to VHDL translation process, is normalization. We
  39 aim to bring the core description into a simpler form, which we can
  40 subsequently translate into VHDL easily. This normal form is needed because
  41 the full core language is more expressive than VHDL in some areas and because
  42 core can describe expressions that do not have a direct hardware
  43 interpretation.
  44
  45 TODO: Describe core properties not supported in VHDL, and describe how the
  46 VHDL we want to generate should look like.
  47
  48 \section{Goal}
  49 The transformations described here have a well-defined goal: To bring the
  50 program in a well-defined form that is directly translatable to hardware,
  51 while fully preserving the semantics of the program.
  52
  53 This {\em normal form} is again a Core program, but with a very specific
  54 structure. A function in normal form has nested lambda's at the top, which
  55 produce a number of nested let expressions. These let expressions binds a
  56 number of simple expressions in the function and produces a simple identifier.
  57 Every bound value in the let expression is either a simple function
  58 application, a case expression to extract a single element from a tuple
  59 returned by a function or a case expression to choose between two signals
  60 based on some other signal.
  61
  62 This structure is easy to translate to VHDL, since each top level lambda will
  63 be an input port, every bound value will become a concurrent statement (such
  64 as a component instantiation or conditional signal assignment) and the result
  65 variable will become the output port.
  66
  67 An example of a program in canonical form would be:
  68
  69 \startlambda
  70   -- All arguments are an inital lambda
  71   λa.λd.λsp.
  72   -- There are nested let expressions at top level
  73   let
  74     -- Unpack the state by coercion
  75     s = sp :: (Word, Word)
  76     -- Extract both registers from the state
  77     r1 = case s of (fst, snd) -> fst
  78     r2 = case s of (fst, snd) -> snd
  79     -- Calling some other user-defined function.
  80     d' = foo d
  81     -- Conditional connections
  82     out = case a of
  83       High -> r1
  84       Low -> r2
  85     r1' = case a of
  86       High -> d
  87       Low -> r1
  88     r2' = case a of
  89       High -> r2
  90       Low -> d
  91     -- Packing a tuple
  92     s' = (,) r1' r2'
  93     -- Packing the state by coercion
  94     sp' = s' :: State (Word, Word)
  95     -- Pack our return value
  96     res = (,) sp' out
  97   in
  98     -- The actual result
  99     res
 100 \stoplambda
 101
 102 \subsection{Definitions}
 103 In the following sections, we will be using a number of functions and
 104 notations, which we will define here.
 105
 106 \subsubsection{Transformations}
 107 The most important notation is the one for transformation, which looks like
 108 the following:
 109
 110 \starttrans
 111 context conditions
 112 ~
 113 from
 114 ------------------------            expression conditions
 115 to
 116 ~
 117 context additions
 118 \stoptrans
 119
 120 Here, we describe a transformation. The most import parts are \lam{from} and
 121 \lam{to}, which describe the Core expresssion that should be matched and the
 122 expression that it should be replaced with. This matching can occur anywhere
 123 in function that is being normalized, so it applies to any subexpression as
 124 well.
 125
 126 The \lam{expression conditions} list a number of conditions on the \lam{from}
 127 expression that must hold for the transformation to apply.
 128
 129 Furthermore, there is some way to look into the environment (\eg, other top
 130 level bindings).  The \lam{context conditions} part specifies any number of
 131 top level bindings that must be present for the transformation to apply.
 132 Usually, this lists a top level binding that binds an identfier that is also
 133 used in the \lam{from} expression, allowing us to "access" the value of a top
 134 level binding in the \lam{to} expression (\eg, for inlining).
 135
 136 Finally, there is a way to influence the environment. The \lam{context
 137 additions} part lists any number of new top level bindings that should be
 138 added.
 139
 140 If there are no \lam{context conditions} or \lam{context additions}, they can
 141 be left out alltogether, along with the separator \lam{~}.
 142
 143 TODO: Example
 144
 145 \subsubsection{Other concepts}
 146 A \emph{global variable} is any variable that is bound at the
 147 top level of a program, or an external module. A local variable is any other
 148 variable (\eg, variables local to a function, which can be bound by lambda
 149 abstractions, let expressions and case expressions).
 150
 151 A \emph{hardware representable} type is a type that we can generate
 152 a signal for in hardware. For example, a bit, a vector of bits, a 32 bit
 153 unsigned word, etc. Types that are not runtime representable notably
 154 include (but are not limited to): Types, dictionaries, functions.
 155
 156 A \emph{builtin function} is a function for which a builtin
 157 hardware translation is available, because its actual definition is not
 158 translatable. A user-defined function is any other function.
 159
 160 \subsubsection{Functions}
 161 Here, we define a number of functions that can be used below to concisely
 162 specify conditions.
 163
 164 \emph{gvar(expr)} is true when \emph{expr} is a variable that references a
 165 global variable. It is false when it references a local variable.
 166
 167 \emph{lvar(expr)} is the inverse of \emph{gvar}; it is true when \emph{expr}
 168 references a local variable, false when it references a global variable.
 169
 170 \emph{representable(expr)} or \emph{representable(var)} is true when
 171 \emph{expr} or \emph{var} has a type that is representable at runtime.
 172
 173 \subsection{Normal form definition}
 174 We can describe this normal form in a slightly more formal manner. The
 175 following EBNF-like description completely captures the intended structure
 176 (and generates a subset of GHC's core format).
 177
 178 Some clauses have an expression listed in parentheses. These are conditions
 179 that need to apply to the clause.
 180
 181 \startlambda
 182 \italic{normal} = \italic{lambda}
 183 \italic{lambda} = λvar.\italic{lambda} (representable(var))
 184                 | \italic{toplet}
 185 \italic{toplet} = let \italic{binding} in \italic{toplet}
 186                 | letrec [\italic{binding}] in \italic{toplet}
 187                 | var (representable(varvar))
 188 \italic{binding} = var = \italic{rhs} (representable(rhs))
 189                  -- State packing and unpacking by coercion
 190                  | var0 = var1 :: State ty (lvar(var1))
 191                  | var0 = var1 :: ty (var0 :: State ty) (lvar(var1))
 192 \italic{rhs} = userapp
 193              | builtinapp
 194              -- Extractor case
 195              | case var of C a0 ... an -> ai (lvar(var))
 196              -- Selector case
 197              | case var of (lvar(var))
 198                 DEFAULT -> var0 (lvar(var0))
 199                 C w0 ... wn -> resvar (\forall{}i, wi \neq resvar, lvar(resvar))
 200 \italic{userapp} = \italic{userfunc}
 201                  | \italic{userapp} {userarg}
 202 \italic{userfunc} = var (gvar(var))
 203 \italic{userarg} = var (lvar(var))
 204 \italic{builtinapp} = \italic{builtinfunc}
 205                     | \italic{builtinapp} \italic{builtinarg}
 206 \italic{builtinfunc} = var (bvar(var))
 207 \italic{builtinarg} = \italic{coreexpr}
 208 \stoplambda
 209
 210 -- TODO: Limit builtinarg further
 211
 212 -- TODO: There can still be other casts around (which the code can handle,
 213 e.g., ignore), which still need to be documented here.
 214
 215 -- TODO: Note about the selector case. It just supports Bit and Bool
 216 currently, perhaps it should be generalized in the normal form?
 217
 218 When looking at such a program from a hardware perspective, the top level
 219 lambda's define the input ports. The value produced by the let expression is
 220 the output port. Most function applications bound by the let expression
 221 define a component instantiation, where the input and output ports are mapped
 222 to local signals or arguments. Some of the others use a builtin
 223 construction (\eg the \lam{case} statement) or call a builtin function
 224 (\eg \lam{add} or \lam{sub}). For these, a hardcoded VHDL translation is
 225 available.
 226
 227 \section{Transform passes}
 228 In this section we describe the actual transforms. Here we're using
 229 the core language in a notation that resembles lambda calculus.
 230
 231 Each of these transforms is meant to be applied to every (sub)expression
 232 in a program, for as long as it applies. Only when none of the
 233 expressions can be applied anymore, the program is in normal form. We
 234 hope to be able to prove that this form will obey all of the constraints
 235 defined above, but this has yet to happen (though it seems likely that
 236 it will).
 237
 238 Each of the transforms will be described informally first, explaining
 239 the need for and goal of the transform. Then, a formal definition is
 240 given, using a familiar syntax from the world of logic. Each transform
 241 is specified as a number of conditions (above the horizontal line) and a
 242 number of conclusions (below the horizontal line). The details of using
 243 this notation are still a bit fuzzy, so comments are welcom.
 244
 245 TODO: Formally describe the "apply to every (sub)expression" in terms of
 246 rules with full transformations in the conditions.
 247
 248 \subsection{η-abstraction}
 249 This transformation makes sure that all arguments of a function-typed
 250 expression are named, by introducing lambda expressions. When combined with
 251 β-reduction and function inlining below, all function-typed expressions should
 252 be lambda abstractions or global identifiers.
 253
 254 \starttrans
 255 E                 \lam{E :: * -> *}
 256 --------------    \lam{E} is not the first argument of an application.
 257 λx.E x            \lam{E} is not a lambda abstraction.
 258                   \lam{x} is a variable that does not occur free in \lam{E}.
 259 \stoptrans
 260
 261 \startbuffer[from]
 262 foo = λa.case a of
 263   True -> λb.mul b b
 264   False -> id
 265 \stopbuffer
 266
 267 \startbuffer[to]
 268 foo = λa.λx.(case a of
 269     True -> λb.mul b b
 270     False -> λy.id y) x
 271 \stopbuffer
 272
 273 \transexample{η-abstraction}{from}{to}
 274
 275 \subsection{Extended β-reduction}
 276 This transformation is meant to propagate application expressions downwards
 277 into expressions as far as possible. In lambda calculus, this reduction
 278 is known as β-reduction, but it is of course only defined for
 279 applications of lambda abstractions. We extend this reduction to also
 280 work for the rest of core (case and let expressions).
 281
 282 For let expressions:
 283 \starttrans
 284 let binds in E) M
 285 -----------------
 286 let binds in E M
 287 \stoptrans
 288
 289 For case statements:
 290 \starttrans
 291 (case x of
 292   p1 -> E1
 293   \vdots
 294   pn -> En) M
 295 -----------------
 296 case x of
 297   p1 -> E1 M
 298   \vdots
 299   pn -> En M
 300 \stoptrans
 301
 302 For lambda expressions:
 303 \starttrans
 304 (λx.E) M
 305 -----------------
 306 E[M/x]
 307 \stoptrans
 308
 309 % And an example
 310 \startbuffer[from]
 311 ( let a = (case x of
 312             True -> id
 313             False -> neg
 314           ) 1
 315       b = (let y = 3 in add y) 2
 316   in
 317     (λz.add 1 z)
 318 ) 3
 319 \stopbuffer
 320
 321 \startbuffer[to]
 322 let a = case x of
 323            True -> id 1
 324            False -> neg 1
 325     b = let y = 3 in add y 2
 326 in
 327   add 1 3
 328 \stopbuffer
 329
 330 \transexample{Extended β-reduction}{from}{to}
 331
 332 \subsection{Let derecursification}
 333 This transformation is meant to make lets non-recursive whenever possible.
 334 This might allow other optimizations to do their work better. TODO: Why is
 335 this needed exactly?
 336
 337 \subsection{Let flattening}
 338 This transformation puts nested lets in the same scope, by lifting the
 339 binding(s) of the inner let into a new let around the outer let. Eventually,
 340 this will cause all let bindings to appear in the same scope (they will all be
 341 in scope for the function return value).
 342
 343 Note that this transformation does not try to be smart when faced with
 344 recursive lets, it will just leave the lets recursive (possibly joining a
 345 recursive and non-recursive let into a single recursive let). The let
 346 rederursification transformation will do this instead.
 347
 348 \starttrans
 349 letnonrec x = (let bindings in M) in N
 350 ------------------------------------------
 351 let bindings in (letnonrec x = M) in N
 352 \stoptrans
 353
 354 \starttrans
 355 letrec
 356   \vdots
 357   x = (let bindings in M)
 358   \vdots
 359 in
 360   N
 361 ------------------------------------------
 362 letrec
 363   \vdots
 364   bindings
 365   x = M
 366   \vdots
 367 in
 368   N
 369 \stoptrans
 370
 371 \startbuffer[from]
 372 let
 373   a = letrec
 374     x = 1
 375     y = 2
 376   in
 377     x + y
 378 in
 379   letrec
 380     b = let c = 3 in a + c
 381     d = 4
 382   in
 383     d + b
 384 \stopbuffer
 385 \startbuffer[to]
 386 letrec
 387   x = 1
 388   y = 2
 389 in
 390   let
 391     a = x + y
 392   in
 393     letrec
 394       c = 3
 395       b = a + c
 396       d = 4
 397     in
 398       d + b
 399 \stopbuffer
 400
 401 \transexample{Let flattening}{from}{to}
 402
 403 \subsection{Empty let removal}
 404 This transformation is simple: It removes recursive lets that have no bindings
 405 (which usually occurs when let derecursification removes the last binding from
 406 it).
 407
 408 \starttrans
 409 letrec in M
 410 --------------
 411 M
 412 \stoptrans
 413
 414 \subsection{Simple let binding removal}
 415 This transformation inlines simple let bindings (\eg a = b).
 416
 417 This transformation is not needed to get into normal form, but makes the
 418 resulting VHDL a lot shorter.
 419
 420 \starttrans
 421 letnonrec
 422   a = b
 423 in
 424   M
 425 -----------------
 426 M[b/a]
 427 \stoptrans
 428
 429 \starttrans
 430 letrec
 431   \vdots
 432   a = b
 433   \vdots
 434 in
 435   M
 436 -----------------
 437 let
 438   \vdots [b/a]
 439   \vdots [b/a]
 440 in
 441   M[b/a]
 442 \stoptrans
 443
 444 \subsection{Unused let binding removal}
 445 This transformation removes let bindings that are never used. Usually,
 446 the desugarer introduces some unused let bindings.
 447
 448 This normalization pass should really be unneeded to get into normal form
 449 (since ununsed bindings are not forbidden by the normal form), but in practice
 450 the desugarer or simplifier emits some unused bindings that cannot be
 451 normalized (e.g., calls to a \type{PatError} (TODO: Check this name)). Also,
 452 this transformation makes the resulting VHDL a lot shorter.
 453
 454 \starttrans
 455 let a = E in M
 456 ----------------------------    \lam{a} does not occur free in \lam{M}
 457 M
 458 \stoptrans
 459
 460 \starttrans
 461 letrec
 462   \vdots
 463   a = E
 464   \vdots
 465 in
 466   M
 467 ----------------------------    \lam{a} does not occur free in \lam{M}
 468 letrec
 469   \vdots
 470   \vdots
 471 in
 472   M
 473 \stoptrans
 474
 475 \subsection{Non-representable binding inlining}
 476 This transform inlines let bindings that have a non-representable type. Since
 477 we can never generate a signal assignment for these bindings (we cannot
 478 declare a signal assignment with a non-representable type, for obvious
 479 reasons), we have no choice but to inline the binding to remove it.
 480
 481 If the binding is non-representable because it is a lambda abstraction, it is
 482 likely that it will inlined into an application and β-reduction will remove
 483 the lambda abstraction and turn it into a representable expression at the
 484 inline site. The same holds for partial applications, which can be turned into
 485 full applications by inlining.
 486
 487 Other cases of non-representable bindings we see in practice are primitive
 488 Haskell types. In most cases, these will not result in a valid normalized
 489 output, but then the input would have been invalid to start with. There is one
 490 exception to this: When a builtin function is applied to a non-representable
 491 expression, things might work out in some cases. For example, when you write a
 492 literal \hs{SizedInt} in Haskell, like \hs{1 :: SizedInt D8}, this results in
 493 the following core: \lam{fromInteger (smallInteger 10)}, where for example
 494 \lam{10 :: GHC.Prim.Int\#} and \lam{smallInteger 10 :: Integer} have
 495 non-representable types. TODO: This/these paragraph(s) should probably become a
 496 separate discussion somewhere else.
 497
 498 \starttrans
 499 letnonrec a = E in M
 500 --------------------------    \lam{E} has a non-representable type.
 501 M[E/a]
 502 \stoptrans
 503
 504 \starttrans
 505 letrec
 506   \vdots
 507   a = E
 508   \vdots
 509 in
 510   M
 511 --------------------------    \lam{E} has a non-representable type.
 512 letrec
 513   \vdots [E/a]
 514   \vdots [E/a]
 515 in
 516   M[E/a]
 517 \stoptrans
 518
 519 \startbuffer[from]
 520 letrec
 521   a = smallInteger 10
 522   inc = λa -> add a 1
 523   inc' = add 1
 524   x = fromInteger a
 525 in
 526   inc (inc' x)
 527 \stopbuffer
 528
 529 \startbuffer[to]
 530 letrec
 531   x = fromInteger (smallInteger 10)
 532 in
 533   (λa -> add a 1) (add 1 x)
 534 \stopbuffer
 535
 536 \transexample{Let flattening}{from}{to}
 537
 538 \subsection{Compiler generated top level binding inlining}
 539 TODO
 540
 541 \subsection{Scrutinee simplification}
 542 This transform ensures that the scrutinee of a case expression is always
 543 a simple variable reference.
 544
 545 \starttrans
 546 case E of
 547   alts
 548 -----------------        \lam{E} is not a local variable reference
 549 let x = E in
 550   case E of
 551     alts
 552 \stoptrans
 553
 554 \startbuffer[from]
 555 case (foo a) of
 556   True -> a
 557   False -> b
 558 \stopbuffer
 559
 560 \startbuffer[to]
 561 let x = foo a in
 562   case x of
 563     True -> a
 564     False -> b
 565 \stopbuffer
 566
 567 \transexample{Let flattening}{from}{to}
 568
 569
 570 \subsection{Case simplification}
 571 This transformation ensures that all case expressions become normal form. This
 572 means they will become one of:
 573 \startitemize
 574 \item An extractor case with a single alternative that picks a single field
 575 from a datatype, \eg \lam{case x of (a, b) -> a}.
 576 \item A selector case with multiple alternatives and only wild binders, that
 577 makes a choice between expressions based on the constructor of another
 578 expression, \eg \lam{case x of Low -> a; High -> b}.
 579 \stopitemize
 580
 581 \starttrans
 582 case E of
 583   C0 v0,0 ... v0,m -> E0
 584   \vdots
 585   Cn vn,0 ... vn,m -> En
 586 --------------------------------------------------- \forall i \forall j, 0 <= i <= n, 0 <= i < m (\lam{wi,j} is a wild (unused) binder)
 587 letnonrec
 588   v0,0 = case x of C0 v0,0 .. v0,m -> v0,0
 589   \vdots
 590   v0,m = case x of C0 v0,0 .. v0,m -> v0,m
 591   x0 = E0
 592   \dots
 593   vn,m = case x of Cn vn,0 .. vn,m -> vn,m
 594   xn = En
 595 in
 596   case E of
 597     C0 w0,0 ... w0,m -> x0
 598     \vdots
 599     Cn wn,0 ... wn,m -> xn
 600 \stoptrans
 601
 602 TODO: This transformation specified like this is complicated and misses
 603 conditions to prevent looping with itself. Perhaps we should split it here for
 604 discussion?
 605
 606 \startbuffer[from]
 607 case a of
 608   True -> add b 1
 609   False -> add b 2
 610 \stopbuffer
 611
 612 \startbuffer[to]
 613 letnonrec
 614   x0 = add b 1
 615   x1 = add b 2
 616 in
 617   case a of
 618     True -> x0
 619     False -> x1
 620 \stopbuffer
 621
 622 \transexample{Selector case simplification}{from}{to}
 623
 624 \startbuffer[from]
 625 case a of
 626   (,) b c -> add b c
 627 \stopbuffer
 628 \startbuffer[to]
 629 letnonrec
 630   b = case a of (,) b c -> b
 631   c = case a of (,) b c -> c
 632   x0 = add b c
 633 in
 634   case a of
 635     (,) w0 w1 -> x0
 636 \stopbuffer
 637
 638 \transexample{Extractor case simplification}{from}{to}
 639
 640 \subsection{Case removal}
 641 This transform removes any case statements with a single alternative and
 642 only wild binders.
 643
 644 These "useless" case statements are usually leftovers from case simplification
 645 on extractor case (see the previous example).
 646
 647 \starttrans
 648 case x of
 649   C v0 ... vm -> E
 650 ----------------------     \lam{\forall i, 0 <= i <= m} (\lam{vi} does not occur free in E)
 651 E
 652 \stoptrans
 653
 654 \startbuffer[from]
 655 case a of
 656   (,) w0 w1 -> x0
 657 \stopbuffer
 658
 659 \startbuffer[to]
 660 x0
 661 \stopbuffer
 662
 663 \transexample{Case removal}{from}{to}
 664
 665 \subsection{Argument simplification}
 666 The transforms in this section deal with simplifying application
 667 arguments into normal form. The goal here is to:
 668
 669 \startitemize
 670  \item Make all arguments of user-defined functions (\eg, of which
 671  we have a function body) simple variable references of a runtime
 672  representable type. This is needed, since these applications will be turned
 673  into component instantiations.
 674  \item Make all arguments of builtin functions one of:
 675    \startitemize
 676     \item A type argument.
 677     \item A dictionary argument.
 678     \item A type level expression.
 679     \item A variable reference of a runtime representable type.
 680     \item A variable reference or partial application of a function type.
 681    \stopitemize
 682 \stopitemize
 683
 684 When looking at the arguments of a user-defined function, we can
 685 divide them into two categories:
 686 \startitemize
 687   \item Arguments with a runtime representable type (\eg bits or vectors).
 688
 689         These arguments can be preserved in the program, since they can
 690         be translated to input ports later on.  However, since we can
 691         only connect signals to input ports, these arguments must be
 692         reduced to simple variables (for which signals will be
 693         produced). This is taken care of by the argument extraction
 694         transform.
 695   \item Non-runtime representable typed arguments.
 696
 697         These arguments cannot be preserved in the program, since we
 698         cannot represent them as input or output ports in the resulting
 699         VHDL. To remove them, we create a specialized version of the
 700         called function with these arguments filled in. This is done by
 701         the argument propagation transform.
 702
 703         Typically, these arguments are type and dictionary arguments that are
 704         used to make functions polymorphic. By propagating these arguments, we
 705         are essentially doing the same which GHC does when it specializes
 706         functions: Creating multiple variants of the same function, one for
 707         each type for which it is used. Other common non-representable
 708         arguments are functions, e.g. when calling a higher order function
 709         with another function or a lambda abstraction as an argument.
 710
 711         The reason for doing this is similar to the reasoning provided for
 712         the inlining of non-representable let bindings above. In fact, this
 713         argument propagation could be viewed as a form of cross-function
 714         inlining.
 715 \stopitemize
 716
 717 TODO: Check the following itemization.
 718
 719 When looking at the arguments of a builtin function, we can divide them
 720 into categories:
 721
 722 \startitemize
 723   \item Arguments with a runtime representable type.
 724
 725         As we have seen with user-defined functions, these arguments can
 726         always be reduced to a simple variable reference, by the
 727         argument extraction transform. Performing this transform for
 728         builtin functions as well, means that the translation of builtin
 729         functions can be limited to signal references, instead of
 730         needing to support all possible expressions.
 731
 732   \item Arguments with a function type.
 733
 734         These arguments are functions passed to higher order builtins,
 735         like \lam{map} and \lam{foldl}. Since implementing these
 736         functions for arbitrary function-typed expressions (\eg, lambda
 737         expressions) is rather comlex, we reduce these arguments to
 738         (partial applications of) global functions.
 739
 740         We can still support arbitrary expressions from the user code,
 741         by creating a new global function containing that expression.
 742         This way, we can simply replace the argument with a reference to
 743         that new function. However, since the expression can contain any
 744         number of free variables we also have to include partial
 745         applications in our normal form.
 746
 747         This category of arguments is handled by the function extraction
 748         transform.
 749   \item Other unrepresentable arguments.
 750
 751         These arguments can take a few different forms:
 752         \startdesc{Type arguments}
 753           In the core language, type arguments can only take a single
 754           form: A type wrapped in the Type constructor. Also, there is
 755           nothing that can be done with type expressions, except for
 756           applying functions to them, so we can simply leave type
 757           arguments as they are.
 758         \stopdesc
 759         \startdesc{Dictionary arguments}
 760           In the core language, dictionary arguments are used to find
 761           operations operating on one of the type arguments (mostly for
 762           finding class methods). Since we will not actually evaluatie
 763           the function body for builtin functions and can generate
 764           code for builtin functions by just looking at the type
 765           arguments, these arguments can be ignored and left as they
 766           are.
 767         \stopdesc
 768         \startdesc{Type level arguments}
 769           Sometimes, we want to pass a value to a builtin function, but
 770           we need to know the value at compile time. Additionally, the
 771           value has an impact on the type of the function. This is
 772           encoded using type-level values, where the actual value of the
 773           argument is not important, but the type encodes some integer,
 774           for example. Since the value is not important, the actual form
 775           of the expression does not matter either and we can leave
 776           these arguments as they are.
 777         \stopdesc
 778         \startdesc{Other arguments}
 779           Technically, there is still a wide array of arguments that can
 780           be passed, but does not fall into any of the above categories.
 781           However, none of the supported builtin functions requires such
 782           an argument. This leaves use with passing unsupported types to
 783           a function, such as calling \lam{head} on a list of functions.
 784
 785           In these cases, it would be impossible to generate hardware
 786           for such a function call anyway, so we can ignore these
 787           arguments.
 788
 789           The only way to generate hardware for builtin functions with
 790           arguments like these, is to expand the function call into an
 791           equivalent core expression (\eg, expand map into a series of
 792           function applications). But for now, we choose to simply not
 793           support expressions like these.
 794         \stopdesc
 795
 796         From the above, we can conclude that we can simply ignore these
 797         other unrepresentable arguments and focus on the first two
 798         categories instead.
 799 \stopitemize
 800
 801 \subsubsection{Argument simplification}
 802 This transform deals with arguments to functions that
 803 are of a runtime representable type. It ensures that they will all become
 804 references to global variables, or local signals in the resulting VHDL.
 805
 806 TODO: It seems we can map an expression to a port, not only a signal.
 807 Perhaps this makes this transformation not needed?
 808 TODO: Say something about dataconstructors (without arguments, like True
 809 or False), which are variable references of a runtime representable
 810 type, but do not result in a signal.
 811
 812 To reduce a complex expression to a simple variable reference, we create
 813 a new let expression around the application, which binds the complex
 814 expression to a new variable. The original function is then applied to
 815 this variable.
 816
 817 \starttrans
 818 M N
 819 --------------------    \lam{N} is of a representable type
 820 let x = N in M x        \lam{N} is not a local variable reference
 821 \stoptrans
 822
 823 \startbuffer[from]
 824 add (add a 1) 1
 825 \stopbuffer
 826
 827 \startbuffer[to]
 828 let x = add a 1 in add x 1
 829 \stopbuffer
 830
 831 \transexample{Argument extraction}{from}{to}
 832
 833 \subsubsection{Function extraction}
 834 This transform deals with function-typed arguments to builtin functions.
 835 Since these arguments cannot be propagated, we choose to extract them
 836 into a new global function instead.
 837
 838 Any free variables occuring in the extracted arguments will become
 839 parameters to the new global function. The original argument is replaced
 840 with a reference to the new function, applied to any free variables from
 841 the original argument.
 842
 843 This transformation is useful when applying higher order builtin functions
 844 like \hs{map} to a lambda abstraction, for example. In this case, the code
 845 that generates VHDL for \hs{map} only needs to handle top level functions and
 846 partial applications, not any other expression (such as lambda abstractions or
 847 even more complicated expressions).
 848
 849 \starttrans
 850 M N                     \lam{M} is a (partial aplication of a) builtin function.
 851 ---------------------   \lam{f0 ... fn} = free local variables of \lam{N}
 852 M x f0 ... fn           \lam{N :: a -> b}
 853 ~                       \lam{N} is not a (partial application of) a top level function
 854 x = λf0 ... λfn.N
 855 \stoptrans
 856
 857 \startbuffer[from]
 858 map (λa . add a b) xs
 859
 860 map (add b) ys
 861 \stopbuffer
 862
 863 \startbuffer[to]
 864 x0 = λb.λa.add a b
 865 ~
 866 map x0 xs
 867
 868 x1 = λb.add b
 869 map x1 ys
 870 \stopbuffer
 871
 872 \transexample{Function extraction}{from}{to}
 873
 874 \subsubsection{Argument propagation}
 875 This transform deals with arguments to user-defined functions that are
 876 not representable at runtime. This means these arguments cannot be
 877 preserved in the final form and most be {\em propagated}.
 878
 879 Propagation means to create a specialized version of the called
 880 function, with the propagated argument already filled in. As a simple
 881 example, in the following program:
 882
 883 \startlambda
 884 f = λa.λb.a + b
 885 inc = λa.f a 1
 886 \stoplambda
 887
 888 we could {\em propagate} the constant argument 1, with the following
 889 result:
 890
 891 \startlambda
 892 f' = λa.a + 1
 893 inc = λa.f' a
 894 \stoplambda
 895
 896 Special care must be taken when the to-be-propagated expression has any
 897 free variables. If this is the case, the original argument should not be
 898 removed alltogether, but replaced by all the free variables of the
 899 expression. In this way, the original expression can still be evaluated
 900 inside the new function. Also, this brings us closer to our goal: All
 901 these free variables will be simple variable references.
 902
 903 To prevent us from propagating the same argument over and over, a simple
 904 local variable reference is not propagated (since is has exactly one
 905 free variable, itself, we would only replace that argument with itself).
 906
 907 This shows that any free local variables that are not runtime representable
 908 cannot be brought into normal form by this transform. We rely on an
 909 inlining transformation to replace such a variable with an expression we
 910 can propagate again.
 911
 912 \starttrans
 913 x = E
 914 ~
 915 x Y0 ... Yi ... Yn                               \lam{Yi} is not of a runtime representable type
 916 ---------------------------------------------    \lam{Yi} is not a local variable reference
 917 x' y0 ... yi-1 f0 ...  fm Yi+1 ... Yn            \lam{f0 ... fm} = free local vars of \lam{Yi}
 918 ~
 919 x' = λy0 ... yi-1 f0 ... fm yi+1 ... yn .
 920       E y0 ... yi-1 Yi yi+1 ... yn
 921
 922 \stoptrans
 923
 924 TODO: Example
 925
 926 \subsection{Cast propagation / simplification}
 927 This transform pushes casts down into the expression as far as possible. Since
 928 its exact role and need is not clear yet, this transformation is not yet
 929 specified.
 930
 931 \subsection{Return value simplification}
 932 This transformation ensures that the return value of a function is always a
 933 simple local variable reference.
 934
 935 Currently implemented using lambda simplification, let simplification, and
 936 top simplification. Should change into something like the following, which
 937 works only on the result of a function instead of any subexpression. This is
 938 achieved by the contexts, like \lam{x = E}, though this is strictly not
 939 correct (you could read this as "if there is any function \lam{x} that binds
 940 \lam{E}, any \lam{E} can be transformed, while we only mean the \lam{E} that
 941 is bound by \lam{x}. This might need some extra notes or something).
 942
 943 \starttrans
 944 x = E                            \lam{E} is representable
 945 ~                                \lam{E} is not a lambda abstraction
 946 E                                \lam{E} is not a let expression
 947 ---------------------------      \lam{E} is not a local variable reference
 948 let x = E in x
 949 \stoptrans
 950
 951 \starttrans
 952 x = λv0 ... λvn.E
 953 ~                                \lam{E} is representable
 954 E                                \lam{E} is not a let expression
 955 ---------------------------      \lam{E} is not a local variable reference
 956 let x = E in x
 957 \stoptrans
 958
 959 \starttrans
 960 x = λv0 ... λvn.let ... in E
 961 ~                                \lam{E} is representable
 962 E                                \lam{E} is not a local variable reference
 963 ---------------------------
 964 let x = E in x
 965 \stoptrans
 966
 967 \startbuffer[from]
 968 x = add 1 2
 969 \stopbuffer
 970
 971 \startbuffer[to]
 972 x = let x = add 1 2 in x
 973 \stopbuffer
 974
 975 \transexample{Return value simplification}{from}{to}