Add a few progress documents.

[matthijs/master-project/report.git] / progress-2009.01.14.txt
diff --git a/progress-2009.01.14.txt b/progress-2009.01.14.txt

new file mode 100644 (file)

index 0000000..cc70be2
--- /dev/null
+++ b/progress-2009.01.14.txt
@@ -0,0 +1,125 @@
+Language Design
+===============
+A central question is, what do we want our language to look like? Deciding to
+use haskell would be a first step, but that still leaves a lot of
+possibilities. However, why would we want to use Haskell?
+
+  * Lots of existing support: Language specification, compilers, libraries,
+    userbase. All of this reduces the workload to implement something.
+  * All of Haskell is usable in simulation / testing: When not compiling to
+    VHDL, but when doing functional tests, we can use all of Haskell's syntax,
+    libraries, functions, etc. even when we don't support all of it in our
+    compiler.
+  * Haskell is probably expressive enough to be able to express a lot of
+    different things and different approaches. By defining new (algebraic)
+    types and operators, we can extend this even further. Finally, we could
+    even use Template Haskell to do almost anything (The ForSyDe project uses
+    Template Haskell extensively).
+
+Typing
+------
+What kind of types will we be using in our programs? The base type should be
+some kind of bit type, but there are two main options for these:
+
+  * Use a single bit as a type. Calculations on these types represent the
+    calculations in a single time slice (ie, during one clock cycle). Any
+    state in the calculation is explicit, since the input state must be a
+    function argument and the output state part of the function's value.
+
+    Through the state values, these functions are closely tied to the clock
+    domain in use (or whatever other model is used for the time domain). Ie,
+    having different clock domains is non-trivial and will require
+    modification to the function.
+
+    Composition can be done by simply connection functions together. This does
+    require that the "state" of a function contains the states of each of the
+    function it calls. These are extracted from its state argument and passed
+    on to the callees, while the states returned are again inserted into the
+    resulting state of the outer function.
+
+    This approach is the closest to normal functional programming and shows
+    state very explicitly.
+
+  * Use a stream of bits as a type. Each value models a continuous (infinite)
+    stream of bits. Functions have streams as arguments and value and state
+    could probably be implicit in the calculation or could be made explicit
+    using delay elements (this is what Lava does).
+    
+    The primitive operations all operate on streams, which seems the
+    fundamental difference between this and the previous approach.
+    
+    This approach shows structure more explicitly when delay elements are
+    used, since the registers end up exactly there.
+
+It is probably possible to mix these approaches. In particular, the first
+approache eventually needs to do some I/O, which can be modeled as recursion
+over an infinite list of inputs resulting in an infinite list of outputs. This
+can be done at the top level always, but it is perhaps possible to move this
+recursion a bit downward and do composition based on stream functions instead
+of "single clock" functions. One problem that comes to mind is that one needs
+to guarantee that a function does not look into the future (ie, uses multiple
+input elements to generate one output element) and the state becomes a lot
+less explicit. This needs more thought.
+
+Implementation
+==============
+Assume we will want to translate a (subset of) haskell into something else.
+We will need to do parsing, typechecking, interpretation, etc. A few options
+for this are available:
+
+  * Doing everything manually, possibly using a parser generator.
+    Lot's of work. Requires us to write down the haskell grammar, create a
+    parser, do all kinds of funky type checking -> Too much work.
+  * Use the Language.Haskell library and write something in haskell. This
+    library contains a Lexer, Parser and AST, but still does not do any type
+    checking, desugaring and simplification. Still lots of work.
+  * Hack into an existing compiler. By writing a backend for an existing
+    compiler, we can save most of the haskell-specific work. This requires a
+    compiler that is modular enough to have its backend replaced by another
+    implementation. Also, the border between backend and frontend must be
+    suited for the work we would like to do. This approach would use the
+    original compiler's driver for the most part
+  * Use components from an existing compiler. By reusing selected components
+    and creating an own driver that uses these components, we can work with
+    the haskell source without doing all the work ourselves. This requires a
+    compiler that is very modular, so we can decide exactly which parts we
+    would like to use and which we don't need.
+
+After some looking around it seems GHC provides a consistent and (as of
+version 6.10) documented API into its internals This allows us to write
+
+    core <- compileToCoreSimplified "adder.hs"
+
+to compile the file `adder.hs` into a version in GHC's simplified CoreSyn
+representation. This is essentially an AST with type annotations, all
+syntactic sugar removed, redundant expressions simplified, etc. This seems to
+be a very useful form to work with. However, GHC provides other functions to
+access other parts of the compiler, so if we would like to access the source
+at another stage, or do some custom procesing in between, this is also
+possible (albeit a bit more complex than this).
+
+Problems
+--------
+Some problems encountered or expected when trying to do some initial
+translator hackup:
+
+  * Signal naming. In the generated VHDL the signals and ports need names.
+    These can of course be generated, but if we want to do some actual
+    debugging, we want useful signal names. For input ports generated from
+    function arguments, the binder name could perhaps be used, but that is not
+    always possible when patterns are more complex.
+
+    For output ports, a name could be deduced when the output value is always
+    computed in a where or let clause. However the simplification phase
+    removes these clauses where possible, so this information is lost.
+
+    Perhaps some way of explicitely labeling ports could be used.
+  * Type classes and universal qualification types are a powerful way of
+    making functions generic, which greatly increases their reusability.
+    However, this extra genericity makes the resulting Core representation
+    more complex. Since the actual types used are modeled as extra function
+    arguments, this shouldn't be too hard, but this is something that probably
+    needs implementation right away (because a lot of builtin haskell
+    constructs, such as tuples, use it).
+
+<!-- vim: set sw=2 sts=2 ts=8: -->