constructs. This step is described in a lot of detail at
\in{chapter}[chap:normalization].
+ \section{The Core language}
+ Most of the prototype deals with handling the program in the Core
+ language. In this section we will show what this language looks like and
+ how it works.
+
+ The Core language is a functional language that describes
+ \emph{expressions}. Every identifier used in Core is called a
+ \emph{binder}, since it is bound to a value somewhere. On the highest
+ level, a Core program is a collection of functions, each of which bind a
+ binder (the function name) to an expression (the function value, which has
+ a function type).
+
+ The Core language itself does not prescribe any program structure, only
+ expression structure. In the \small{GHC} compiler, the Haskell module
+ structure is used for the resulting Core code as well. Since this is not
+ so relevant for understanding the Core language or the Normalization
+ process, we'll only look at the Core expression language here.
+
+ Each Core expression consists of one of these possible expressions.
+
+ \startdesc{Variable reference}
+\startlambda
+a
+\stoplambda
+ This is a simple reference to a binder. It's written down as the
+ name of the binder that is being referred to, which should of course be
+ bound in a containing scope (including top level scope, so a reference
+ to a top level function is also a variable reference). Additionally,
+ constructors from algebraic datatypes also become variable references.
+
+ The value of this expression is the value bound to the given binder.
+
+ Each binder also carries around its type, but this is usually not shown
+ in the Core expressions. Occasionally, the type of an entire expression
+ or function is shown for clarity, but this is only informational. In
+ practice, the type of an expression is easily determined from the
+ structure of the expression and the types of the binders and occasional
+ cast expressions. This minimize the amount of bookkeeping needed to keep
+ the typing consistent.
+ \stopdesc
+ \startdesc{Literal}
+\startlambda
+10
+\stoplambda
+ This is a simple literal. Only primitive types are supported, like
+ chars, strings, ints and doubles. The types of these literals are the
+ \quote{primitive} versions, like \lam{Char\#} and \lam{Word\#}, not the
+ normal Haskell versions (but there are builtin conversion functions).
+ \stopdesc
+ \startdesc{Application}
+\startlambda
+func arg
+\stoplambda
+ This is simple function application. Each application consists of two
+ parts: The function part and the argument part. Applications are used
+ for normal function \quote{calls}, but also for applying type
+ abstractions and data constructors.
+
+ The value of an application is the value of the function part, with the
+ first argument binder bound to the argument part.
+ \stopdesc
+ \startdesc{Lambda abstraction}
+\startlambda
+λbndr.body
+\stoplambda
+ This is the basic lambda abstraction, as it occurs in labmda calculus.
+ It consists of a binder part and a body part. A lambda abstraction
+ creates a function, that can be applied to an argument.
+
+ Note that the body of a lambda abstraction extends all the way to the
+ end of the expression, or the closing bracket surrounding the lambda. In
+ other words, the lambda abstraction \quote{operator} has the lowest
+ priority of all.
+
+ The value of an application is the value of the body part, with the
+ binder bound to the value the entire lambda abstraction is applied to.
+ \stopdesc
+ \startdesc{Non-recursive let expression}
+\startlambda
+let bndr = value in body
+\stoplambda
+ A let expression allows you to bind a binder to some value, while
+ evaluating to some other value (where that binder is in scope). This
+ allows for sharing of subexpressions (you can use a binder twice) and
+ explicit \quote{naming} of arbitrary expressions. Note that the binder
+ is not in scope in the value bound to it, so it's not possible to make
+ recursive definitions with the normal form of the let expression (see
+ the recursive form below).
+
+ Even though this let expression is an extension on the basic lambda
+ calculus, it is easily translated to a lambda abstraction. The let
+ expression above would then become:
+
+\startlambda
+(λbndr.body) value
+\stoplambda
+
+ This notion might be useful for verifying certain properties on
+ transformations, since a lot of verification work has been done on
+ lambda calculus already.
+
+ The value of a let expression is the value of the body part, with the
+ binder bound to the value.
+ \stopdesc
+ \startdesc{Recursive let expression}
+\startlambda
+let
+ bndr1 = value1
+ \vdots
+ bndrn = valuen
+in
+ body
+\stoplambda
+
+ This is the recursive version of the let expression. In \small{GHC}'s
+ Core implementation, non-recursive and recursive lets are not so
+ distinct as we present them here, but this provides a clearer overview.
+
+ The main difference with the normal let expression is that each of the
+ binders is in scope in each of the values, in addition to the body. This
+ allows for self-recursive definitions or mutually recursive definitions.
+
+ It should also be possible to express a recursive let using normal
+ lambda calculus, if we use the \emph{least fixed-point operator},
+ \lam{Y}.
+ \stopdesc
+ \startdesc{Case expression}
+\startlambda
+ case scrut of bndr
+ DEFAULT -> defaultbody
+ C0 bndr0,0 ... bndr0,m -> body0
+ \vdots
+ Cn bndrn,0 ... bndrn,m -> bodyn
+\stoplambda
+
+TODO: Define WHNF
+
+ A case expression is the only way in Core to choose between values. A case
+ expression evaluates its scrutinee, which should have an algebraic
+ datatype, into weak head normal form (\small{WHNF}) and (optionally) binds
+ it to \lam{bndr}. It then chooses a body depending on the constructor of
+ its scrutinee. If none of the constructors match, the \lam{DEFAULT}
+ alternative is chosen.
+
+ Since we can only match the top level constructor, there can be no overlap
+ in the alternatives and thus order of alternatives is not relevant (though
+ the \lam{DEFAULT} alternative must appear first for implementation
+ efficiency).
+
+ Any arguments to the constructor in the scrutinee are bound to each of the
+ binders after the constructor and are in scope only in the corresponding
+ body.
+
+ To support strictness, the scrutinee is always evaluated into WHNF, even
+ when there is only a \lam{DEFAULT} alternative. This allows a strict
+ function argument to be written like:
+
+\startlambda
+function (case argument of arg
+ DEFAULT -> arg)
+\stoplambda
+
+ This seems to be the only use for the extra binder to which the scrutinee
+ is bound. When not using strictness annotations (which is rather pointless
+ in hardware descriptions), \small{GHC} seems to never generate any code
+ making use of this binder. The current prototype does not handle it
+ either, which probably means that code using it would break.
+
+ Note that these case statements are less powerful than the full Haskell
+ case statements. In particular, they do not support complex patterns like
+ in Haskell. Only the constructor of an expression can be matched, complex
+ patterns are implemented using multiple nested case expressions.
+
+ Case statements are also used for unpacking of algebraic datatypes, even
+ when there is only a single constructor. For examples, to add the elements
+ of a tuple, the following Core is generated:
+
+\startlambda
+sum = λtuple.case tuple of
+ (,) a b -> a + b
+\stoplambda
+
+ Here, there is only a single alternative (but no \lam{DEFAULT}
+ alternative, since the single alternative is already exhaustive). When
+ it's body is evaluated, the arguments to the tuple constructor \lam{(,)}
+ (\eg, the elements of the tuple) are bound to \lam{a} and \lam{b}.
+ \stopdesc
+ \startdesc{Cast expression}
+\startlambda
+body :: targettype
+\stoplambda
+ A cast expression allows you to change the type of an expression to an
+ equivalent type. Note that this is not meant to do any actual work, like
+ conversion of data from one format to another, or force a complete type
+ change. Instead, it is meant to change between different representations
+ of the same type, \eg switch between types that are provably equal (but
+ look different).
+
+ In our hardware descriptions, we typically see casts to change between a
+ Haskell newtype and its contained type, since those are effectively
+ different representations of the same type.
+
+ More complex are types that are proven to be equal by the typechecker,
+ but look different at first glance. To ensure that, once the typechecker
+ has proven equality, this information sticks around, explicit casts are
+ added. In our notation we only write the target type, but in reality a
+ cast expressions carries around a \emph{coercion}, which can be seen as a
+ proof of equality. TODO: Example
+
+ The value of a cast is the value of its body, unchanged. The type of this
+ value is equal to the target type, not the type of its body.
+
+ Note that this syntax is also used sometimes to indicate that a particular
+ expression has a particular type, even when no cast expression is
+ involved. This is then purely informational, since the only elements that
+ are explicitely typed in the Core language are the binder references and
+ cast expressions, the types of all other elements are determined at
+ runtime.
+ \stopdesc
+ \startdesc{Note}
+
+ The Core language in \small{GHC} allows adding \emph{notes}, which serve
+ as hints to the inliner or add custom (string) annotations to a core
+ expression. These shouldn't be generated normally, so these are not
+ handled in any way in the prototype.
+ \stopdesc
+ \startdesc{Type}
+\startlambda
+@type
+\stoplambda
+ It is possibly to use a Core type as a Core expression. This is done to
+ allow for type abstractions and applications to be handled as normal
+ lambda abstractions and applications above. This means that a type
+ expression in Core can only ever occur in the argument position of an
+ application, and only if the type of the function that is applied to
+ expects a type as the first argument. This happens for all polymorphic
+ functions, for example, the \lam{fst} function:
+
+\startlambda
+fst :: \forall a. \forall b. (a, b) -> a
+fst = λtup.case tup of (,) a b -> a
+
+fstint :: (Int, Int) -> Int
+fstint = λa.λb.fst @Int @Int a b
+\stoplambda
+
+ The type of \lam{fst} has two universally quantified type variables. When
+ \lam{fst} is applied in \lam{fstint}, it is first applied to two types.
+ (which are substitued for \lam{a} and \lam{b} in the type of \lam{fst}, so
+ the type of \lam{fst} actual type of arguments and result can be found:
+ \lam{fst @Int @Int :: (Int, Int) -> Int}).
+ \stopdesc
+
+ TODO: Core type system
- Core - description of the language (appendix?)
Implementation issues
Simplified core?