\newcommand{\fref}[1]{\cref{#1}}
\newcommand{\Fref}[1]{\Cref{#1}}
+\usepackage{epstopdf}
+
+\epstopdfDeclareGraphicsRule{.svg}{pdf}{.pdf}{rsvg-convert --format=pdf < #1 > \noexpand\OutputFile}
%include polycode.fmt
%include clash.fmt
% author names and affiliations
% use a multiple column layout for up to three different
% affiliations
-\author{\IEEEauthorblockN{Christiaan P.R. Baaij, Matthijs Kooijman, Jan Kuper, Marco E.T. Gerards, Bert Molenkamp, Sabih H. Gerez}
-\IEEEauthorblockA{University of Twente, Department of EEMCS\\
+\author{\IEEEauthorblockN{Christiaan P.R. Baaij, Matthijs Kooijman, Jan Kuper, Marco E.T. Gerards}%, Bert Molenkamp, Sabih H. Gerez}
+\IEEEauthorblockA{Computer Architecture for Embedded Systems (CAES)\\
+Department of EEMCS, University of Twente\\
P.O. Box 217, 7500 AE, Enschede, The Netherlands\\
c.p.r.baaij@@utwente.nl, matthijs@@stdin.nl, j.kuper@@utwente.nl}}
% \and
\begin{abstract}
%\boldmath
-The abstract goes here.
+\CLaSH\ is a functional hardware description language that borrows both its
+syntax and semantics from the functional programming language Haskell. Circuit
+descriptions can be translated to synthesizable VHDL using the prototype
+\CLaSH\ compiler. As the circuit descriptions are made in plain Haskell,
+simulations can also be compiled by a Haskell compiler.
+
+The use of polymorphism and higher-order functions allows a circuit designer to
+describe more abstract and general specifications than are possible in the
+traditional hardware description languages.
\end{abstract}
% IEEEtran.cls defaults to using nonbold math in the Abstract.
% This preserves the distinction between vectors and scalars. However,
\section{Introduction}
-Hardware description languages has allowed the productivity of hardware
+Hardware description languages have allowed the productivity of hardware
engineers to keep pace with the development of chip technology. Standard
hardware description languages, like \VHDL~\cite{VHDL2008} and
Verilog~\cite{Verilog}, allowed an engineer to describe circuits using a
and types that together form the language primitives of the domain specific
language. As a result of how the signals are modeled and abstracted, the
functions used to describe a circuit also build a large domain-specific
-datatype (hidden from the designer) which can be further processed by an
+datatype (hidden from the designer) which can then be processed further by an
embedded compiler. This compiler actually runs in the same environment as the
description; as a result compile-time and run-time become hard to define, as
the embedded compiler is usually compiled by the same Haskell compiler as the
capture certain language constructs, such as Haskell's choice elements
(if-constructs, case-constructs, pattern matching, etc.), which are not
available in the functional hardware description languages that are embedded
-in Haskell as a domain specific languages. As far as the authors know, such
+in Haskell as a domain specific language. As far as the authors know, such
extensive support for choice-elements is new in the domain of functional
hardware description languages. As the hardware descriptions are plain Haskell
functions, these descriptions can be compiled for simulation using an
-optimizing Haskell compiler such as the Glasgow Haskell Compiler (\GHC).
+optimizing Haskell compiler such as the Glasgow Haskell Compiler (\GHC)~\cite{ghc}.
Where descriptions in a conventional hardware description language have an
explicit clock for the purposes of state and synchronicity, the clock is implied
in this research. A developer describes the behavior of the hardware between
-clock cycles, as such, only synchronous systems can be described. Many
-functional hardware description model signals as a stream of all values over
-time; state is then modeled as a delay on this stream of values. The approach
-taken in this research is to make the current state of a circuit part of the
-input of the function and the updated state part of the output.
+clock cycles. The current abstraction of state and time limits the
+descriptions to synchronous hardware; there is, however, room within the
+language to eventually add a different abstraction mechanism that will allow
+for the modeling of asynchronous systems. Many functional hardware description
+languages model signals as a stream of all values over time; state is then
+modeled as a delay on this stream of values. The approach taken in this
+research is to make the current state of a circuit part of the input of the
+function and the updated state part of the output.
Like the standard hardware description languages, descriptions made in a
functional hardware description language must eventually be converted into a
-netlist. This research also features a prototype translator called \CLaSH\
-(pronounced: clash), which converts the Haskell code to equivalently behaving
-synthesizable \VHDL\ code, ready to be converted to an actual netlist format
-by any (optimizing) \VHDL\ synthesis tool.
+netlist. This research also features a prototype translator, which has the
+same name as the language: \CLaSH\footnote{C$\lambda$aSH: CAES Language for
+Synchronous Hardware} (pronounced: clash). This compiler converts the Haskell
+code to equivalently behaving synthesizable \VHDL\ code, ready to be converted
+to an actual netlist format by an (optimizing) \VHDL\ synthesis tool.
+
+Besides trivial circuits such as variants of both the FIR filter and the
+simple CPU shown in \Cref{sec:usecases}, the \CLaSH\ compiler has also been
+shown to work for non-trivial descriptions. \CLaSH\ has been able to
+successfully translate the functional description of a streaming reduction
+circuit~\cite{reductioncircuit} for floating point numbers.
\section{Hardware description in Haskell}
\end{code}
\begin{figure}
- \centerline{\includegraphics{mac}}
+ \centerline{\includegraphics{mac.svg}}
\caption{Combinatorial Multiply-Accumulate}
\label{img:mac-comb}
\end{figure}
\end{code}
\begin{figure}
- \centerline{\includegraphics{mac-nocurry}}
+ \centerline{\includegraphics{mac-nocurry.svg}}
\caption{Combinatorial Multiply-Accumulate (complex input)}
\label{img:mac-comb-nocurry}
\end{figure}
% against the constructors in the \hs{case} expressions.
We can see two versions of a contrived example below, the first
    using a \hs{case} construct and the other using an \hs{if-then-else}
-    constructs, in the code below. The example sums two values when they are
-    equal or non-equal (depending on the predicate given) and returns 0
-    otherwise. Both versions of the example roughly correspond to the same
-    netlist, which is depicted in \Cref{img:choice}.
+    construct.
\begin{code}
sumif pred a b = case pred of
\end{code}
\begin{figure}
- \centerline{\includegraphics{choice-case}}
+ \centerline{\includegraphics{choice-case.svg}}
\caption{Choice - sumif}
\label{img:choice}
\end{figure}
+
+ The example sums two values when they are equal or non-equal (depending on
+ the predicate given) and returns 0 otherwise. Both versions of the example
+ roughly correspond to the same netlist, which is depicted in
+ \Cref{img:choice}.
A slightly more complex (but very powerful) form of choice is pattern
matching. A function can be defined in multiple clauses, where each clause
(\Cref{img:choice}) as the earlier two versions of the example.
\begin{code}
- sumif Eq a b | a == b = a + b
- sumif Neq a b | a != b = a + b
- sumif _ _ _ = 0
+    sumif Eq a b    | a == b    = a + b
+                    | otherwise = 0
+    sumif Neq a b   | a /= b    = a + b
+                    | otherwise = 0
\end{code}
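
    The choice constructs discussed above can be tried out in plain Haskell.
    The following is a minimal sketch, not the authors' exact source: the
    type name \hs{Pred}, the function names \hs{sumifCase} and
    \hs{sumifGuard}, and the type signatures are assumptions made for
    illustration. It shows that the \hs{case}/\hs{if-then-else} version and
    the version with pattern matching and guards compute the same result.

```haskell
-- Hypothetical predicate type; the text only shows its constructors.
-- The constructor Eq lives in a different namespace than the class Eq.
data Pred = Eq | Neq

-- Version using a case expression combined with if-then-else.
sumifCase :: (Eq a, Num a) => Pred -> a -> a -> a
sumifCase p a b = case p of
  Eq  -> if a == b then a + b else 0
  Neq -> if a /= b then a + b else 0

-- Version using pattern matching on the predicate, with guards.
sumifGuard :: (Eq a, Num a) => Pred -> a -> a -> a
sumifGuard Eq  a b | a == b    = a + b
                   | otherwise = 0
sumifGuard Neq a b | a /= b    = a + b
                   | otherwise = 0
```

    Both functions return the sum only when the selected predicate holds, and
    0 in every other case, which mirrors the single multiplexer-style netlist
    described in the text.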
% \begin{figure}
% value.
\item[\bf{Multiple constructors with fields}]
Algebraic datatypes with multiple constructors, where at least
- one of these constructors has one or more fields are not
- currently supported.
+        one of these constructors has one or more fields, are currently not
+ supported.
\end{xlist}
\subsection{Polymorphism}
type classes, where a class definition provides the general interface of a
function, and class instances define the functionality for the specific
types. An example of such a type class is the \hs{Num} class, which
- contains all of Haskell's numerical operation. A developer can make use of
- this ad-hoc polymorphism by adding a constraint to a parametrically
+ contains all of Haskell's numerical operations. A developer can make use
+ of this ad-hoc polymorphism by adding a constraint to a parametrically
polymorphic type variable. Such a constraint indicates that the type
    variable can only be instantiated to a type whose members support the
overloaded functions associated with the type class.
for numerical operations, \hs{Eq} for the equality operators, and
\hs{Ord} for the comparison/order operators.
- \subsection{Higher order}
+ \subsection{Higher-order functions \& values}
    Another powerful abstraction mechanism in functional languages is
- the concept of \emph{higher order functions}, or \emph{functions as
+ the concept of \emph{higher-order functions}, or \emph{functions as
a first class value}. This allows a function to be treated as a
value and be passed around, even as the argument of another
- function. Let's clarify that with an example:
+ function. The following example should clarify this concept:
\begin{code}
- notList xs = map not xs
+ negVector xs = map not xs
\end{code}
- This defines a function \hs{notList}, with a single list of booleans
- \hs{xs} as an argument, which simply negates all of the booleans in
- the list. To do this, it uses the function \hs{map}, which takes
- \emph{another function} as its first argument and applies that other
- function to each element in the list, returning again a list of the
- results.
-
- As you can see, the \hs{map} function is a higher order function,
- since it takes another function as an argument. Also note that
- \hs{map} is again a polymorphic function: It does not pose any
- constraints on the type of elements in the list passed, other than
- that it must be the same as the type of the argument the passed
- function accepts. The type of elements in the resulting list is of
- course equal to the return type of the function passed (which need
- not be the same as the type of elements in the input list). Both of
- these can be readily seen from the type of \hs{map}:
-
- \begin{code}
- map :: (a -> b) -> [a] -> [b]
- \end{code}
-
- As an example from a common hardware design, let's look at the
- equation of a FIR filter.
-
- \begin{equation}
- y_t = \sum\nolimits_{i = 0}^{n - 1} {x_{t - i} \cdot h_i }
- \end{equation}
-
- A FIR filter multiplies fixed constants ($h$) with the current and
- a few previous input samples ($x$). Each of these multiplications
- are summed, to produce the result at time $t$.
-
- This is easily and directly implemented using higher order
- functions. Consider that the vector \hs{hs} contains the FIR
- coefficients and the vector \hs{xs} contains the current input sample
- in front and older samples behind. How \hs{xs} gets its value will be
- show in the next section about state.
+ The code above defines a function \hs{negVector}, which takes a vector of
+ booleans, and returns a vector where all the values are negated. It
+ achieves this by calling the \hs{map} function, and passing it
+ \emph{another function}, boolean negation, and the vector of booleans,
+ \hs{xs}. The \hs{map} function applies the negation function to all the
+ elements in the vector.
+
+ The \hs{map} function is called a higher-order function, since it takes
+ another function as an argument. Also note that \hs{map} is again a
+    parametrically polymorphic function: it does not pose any constraints on the
+ type of the vector elements, other than that it must be the same type as
+ the input type of the function passed to \hs{map}. The element type of the
+ resulting vector is equal to the return type of the function passed, which
+ need not necessarily be the same as the element type of the input vector.
+ All of these characteristics can readily be inferred from the type
+ signature belonging to \hs{map}:
\begin{code}
- fir ... = foldl1 (+) (zipwith (*) xs hs)
+ map :: (a -> b) -> [a|n] -> [b|n]
\end{code}
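
    As a rough illustration in plain Haskell, the sketch below uses ordinary
    lists as a stand-in for the fixed-length vectors of \CLaSH\ (an
    assumption made so the code runs with a standard Haskell compiler). The
    function \hs{toBits} is a hypothetical addition that shows the element
    type of the result following the return type of the passed function.

```haskell
-- Lists stand in for the fixed-length vectors used in the text.
negVector :: [Bool] -> [Bool]
negVector xs = map not xs

-- Hypothetical helper: the passed function has type Bool -> Int,
-- so map turns a list of Bool into a list of Int.
toBits :: [Bool] -> [Int]
toBits = map (\b -> if b then 1 else 0)
```

    In both cases \hs{map} itself is unchanged; only the function passed to
    it determines what happens to each element.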
- Here, the \hs{zipwith} function is very similar to the \hs{map}
- function: It takes a function two lists and then applies the
- function to each of the elements of the two lists pairwise
- (\emph{e.g.}, \hs{zipwith (+) [1, 2] [3, 4]} becomes
- \hs{[1 + 3, 2 + 4]}.
-
- The \hs{foldl1} function takes a function and a single list and applies the
- function to the first two elements of the list. It then applies to
- function to the result of the first application and the next element
- from the list. This continues until the end of the list is reached.
- The result of the \hs{foldl1} function is the result of the last
- application.
-
- As you can see, the \hs{zipwith (*)} function is just pairwise
- multiplication and the \hs{foldl1 (+)} function is just summation.
-
- To make the correspondence between the code and the equation even
- more obvious, we turn the list of input samples in the equation
- around. So, instead of having the the input sample received at time
- $t$ in $x_t$, $x_0$ now always stores the current sample, and $x_i$
- stores the $ith$ previous sample. This changes the equation to the
- following (Note that this is completely equivalent to the original
- equation, just with a different definition of $x$ that better suits
- the \hs{x} from the code):
-
- \begin{equation}
- y_t = \sum\nolimits_{i = 0}^{n - 1} {x_i \cdot h_i }
- \end{equation}
-
- So far, only functions have been used as higher order values. In
+ So far, only functions have been used as higher-order values. In
Haskell, there are two more ways to obtain a function-typed value:
partial application and lambda abstraction. Partial application
means that a function that takes multiple arguments can be applied
Here, the expression \hs{(+) 1} is the partial application of the
plus operator to the value \hs{1}, which is again a function that
- adds one to its argument.
-
- A labmda expression allows one to introduce an anonymous function
- in any expression. Consider the following expression, which again
- adds one to every element of a list:
+ adds one to its argument. A lambda expression allows one to introduce an
+ anonymous function in any expression. Consider the following expression,
+ which again adds one to every element of a vector:
\begin{code}
map (\x -> x + 1) xs
\end{code}
- Finally, higher order arguments are not limited to just builtin
+    Finally, higher-order arguments are not limited to just built-in
functions, but any function defined in \CLaSH\ can have function
arguments. This allows the hardware designer to use a powerful
abstraction mechanism in his designs and have an optimal amount of
\item when the function is called, it should not have observable
side-effects.
\end{inparaenum}
- This purity property is important for functional languages, since it
- enables all kinds of mathematical reasoning that could not be guaranteed
- correct for impure functions. Pure functions are as such a perfect match
- for a combinatorial circuit, where the output solely depends on the
- inputs. When a circuit has state however, it can no longer be simply
- described by a pure function. Simply removing the purity property is not a
- valid option, as the language would then lose many of it mathematical
- properties. In an effort to include the concept of state in pure
+ % This purity property is important for functional languages, since it
+ % enables all kinds of mathematical reasoning that could not be guaranteed
+ % correct for impure functions.
+    Pure functions are as such a perfect match for a combinatorial circuit,
+ where the output solely depends on the inputs. When a circuit has state
+ however, it can no longer be simply described by a pure function.
+ % Simply removing the purity property is not a valid option, as the
+    % language would then lose many of its mathematical properties.
+ In an effort to include the concept of state in pure
functions, the current value of the state is made an argument of the
- function; the updated state becomes part of the result. A simple example
- is adding an accumulator register to the earlier multiply-accumulate
- circuit, of which the resulting netlist can be seen in
+    function; the updated state becomes part of the result. In this sense,
+    descriptions made in \CLaSH\ describe the combinatorial parts of a Mealy
+    machine.
+
+ A simple example is adding an accumulator register to the earlier
+ multiply-accumulate circuit, of which the resulting netlist can be seen in
\Cref{img:mac-state}:
\begin{code}
- macS a b (State c) = (State c', outp)
+ macS (State c) a b = (State c', outp)
where
outp = mac a b c
c' = outp
\end{code}
\begin{figure}
- \centerline{\includegraphics{mac-state}}
+ \centerline{\includegraphics{mac-state.svg}}
\caption{Stateful Multiply-Accumulate}
\label{img:mac-state}
\end{figure}
- This approach makes the state of a circuit very explicit: which variables
- are part of the state is completely determined by the type signature. This
- approach to state is well suited to be used in combination with the
- existing code and language features, such as all the choice constructs, as
- state values are just normal values.
+ The \hs{State} keyword indicates which arguments are part of the current
+ state, and what part of the output is part of the updated state. This
+    aspect will also be reflected in the type signature of the function.
+ Abstracting the state of a circuit in this way makes it very explicit:
+ which variables are part of the state is completely determined by the
+ type signature. This approach to state is well suited to be used in
+ combination with the existing code and language features, such as all the
+ choice constructs, as state values are just normal values. We can simulate
+ stateful descriptions using the recursive \hs{run} function:
+
+ \begin{code}
+ run f s (i:inps) = o : (run f s' inps)
+ where
+ (s', o) = f s i
+ \end{code}
+
+    The \hs{run} function maps the function that a developer wants to
+    simulate over a list of inputs, passing the state to each new iteration. Each
+ value in the input list corresponds to exactly one cycle of the (implicit)
+ clock. The result of the simulation is a list of outputs for every clock
+ cycle. As both the \hs{run} function and the hardware description are
+ plain Haskell, the complete simulation can be compiled by an optimizing
+ Haskell compiler.
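
    A minimal, self-contained sketch of such a simulation is given below.
    Several details are assumptions made for illustration rather than taken
    from the \CLaSH\ sources: the \hs{State} wrapper is defined as a plain
    \hs{newtype}, \hs{mac} is assumed to compute \hs{a * b + c}, the two
    inputs of \hs{macS} are grouped in a tuple so that they fit the single
    input argument of \hs{run}, and \hs{run} is given an extra clause for the
    empty input list so that it terminates on finite input.

```haskell
-- Assumed definition: a wrapper that marks a value as circuit state.
newtype State s = State s

-- Assumed combinatorial multiply-accumulate.
mac :: Int -> Int -> Int -> Int
mac a b c = a * b + c

-- Stateful multiply-accumulate: current state in, updated state out.
macS :: State Int -> (Int, Int) -> (State Int, Int)
macS (State c) (a, b) = (State c', outp)
  where
    outp = mac a b c
    c'   = outp

-- One list element per clock cycle; the state is threaded through.
run :: (s -> i -> (s, o)) -> s -> [i] -> [o]
run _ _ []         = []           -- added clause: stop on finite input
run f s (i : inps) = o : run f s' inps
  where
    (s', o) = f s i

-- Three clock cycles, starting with the accumulator at 0.
simulated :: [Int]
simulated = run macS (State 0) [(1, 2), (3, 4), (5, 6)]
```

    With an initial accumulator value of 0, \hs{simulated} evaluates to
    \hs{[2, 14, 44]}: each output is the product of the current inputs added
    to the previous output.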
+
\section{\CLaSH\ prototype}
-foo\par bar
+The \CLaSH\ language as presented above can be translated to \VHDL\ using
+the prototype \CLaSH\ compiler. This compiler allows experimentation with
+the \CLaSH\ language and allows for running \CLaSH\ designs on actual FPGA
+hardware.
+
+\begin{figure}
+\centerline{\includegraphics{compilerpipeline.svg}}
+\caption{\CLaSH\ compiler pipeline}
+\label{img:compilerpipeline}
+\end{figure}
+
+The prototype heavily uses \GHC, the Glasgow Haskell Compiler.
+\Cref{img:compilerpipeline} shows the \CLaSH\ compiler pipeline. As you can
+see, the front-end is completely reused from \GHC, which allows the \CLaSH\
+prototype to support most of the Haskell language. The \GHC\ front-end
+produces the program in the \emph{Core} format, which is a very small,
+functional, typed language that is relatively easy to process.
+
+The second step in the compilation process is \emph{normalization}. This
+step runs a number of \emph{meaning preserving} transformations on the
+Core program, to bring it into a \emph{normal form}. This normal form
+has a number of restrictions that make the program similar to hardware.
+In particular, a program in normal form no longer has any polymorphism
+or higher-order functions.
+
+The final step is a simple translation to \VHDL.
+
+\section{Use cases}
+\label{sec:usecases}
+An example of a common hardware design where the use of higher-order
+functions leads to a very natural description is a FIR filter, which is
+basically the dot-product of two vectors:
+
+\begin{equation}
+y_t = \sum\nolimits_{i = 0}^{n - 1} {x_{t - i} \cdot h_i }
+\end{equation}
+
+A FIR filter multiplies fixed constants ($h$) with the current
+and a few previous input samples ($x$). The products
+are summed to produce the result at time $t$. The equation of a FIR
+filter is indeed equivalent to the equation of the dot-product, which is
+shown below:
+
+\begin{equation}
+\mathbf{x}\bullet\mathbf{y} = \sum\nolimits_{i = 0}^{n - 1} {x_i \cdot y_i }
+\end{equation}
+
+We can easily and directly implement the equation for the dot-product
+using higher-order functions:
+
+\begin{code}
+xs *+* ys = foldl1 (+) (zipWith (*) xs ys)
+\end{code}
+
+The \hs{zipWith} function is very similar to the \hs{map} function seen
+earlier: it takes a function and two vectors, and then applies the function to
+each of the elements in the two vectors pairwise (\emph{e.g.}, \hs{zipWith (*)
+[1, 2] [3, 4]} becomes \hs{[1 * 3, 2 * 4]} $\equiv$ \hs{[3,8]}).
+
+The \hs{foldl1} function takes a function and a single vector, and applies
+the function to the first two elements of the vector. It then applies the
+function to the result of the first application and the next element from
+the vector. This continues until the end of the vector is reached. The
+result of the \hs{foldl1} function is the result of the last application.
+As you can see, the \hs{zipWith (*)} function is just pairwise
+multiplication and the \hs{foldl1 (+)} function is just summation.
+
+Returning to the actual FIR filter, we will slightly change the
+equation belonging to it, so as to make the translation to code more obvious.
+What we will do is change the definition of the vector of input samples.
+So, instead of having the input sample received at time
+$t$ stored in $x_t$, $x_0$ now always stores the current sample, and $x_i$
+stores the $i$th previous sample. This changes the equation to the
+following (note that this is completely equivalent to the original
+equation, just with a different definition of $x$ that will better suit
+the transformation to code):
+
+\begin{equation}
+y_t = \sum\nolimits_{i = 0}^{n - 1} {x_i \cdot h_i }
+\end{equation}
+
+Consider that the vector \hs{hs} contains the FIR coefficients and the
+vector \hs{xs} contains the current input sample in front and older
+samples behind. The function that shifts the input samples is shown below:
+
+\begin{code}
+x >> xs = x +> init xs
+\end{code}
+
+Where the \hs{init} function returns all but the last element of a
+vector, and the concatenate operator ($\succ$) adds a new element to the
+left of a vector. The complete definition of the FIR filter then becomes:
+
+\begin{code}
+fir (State (xs,hs)) x = (State (x >> xs,hs), xs *+* hs)
+\end{code}
+
+The resulting netlist of a 4-tap FIR filter based on the above definition
+is depicted in \Cref{img:4tapfir}.
+
+\begin{figure}
+\centerline{\includegraphics{4tapfir.svg}}
+\caption{4-tap FIR Filter}
+\label{img:4tapfir}
+\end{figure}
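
The FIR description can also be exercised in plain Haskell. The sketch below
makes several assumptions for illustration: ordinary lists replace the
fixed-length vectors, the shift operator is given the hypothetical name
\hs{shiftInto} (to avoid clashing with the \hs{>>} operator from the
Prelude) and is assumed to discard the oldest sample, that is, the last
element, using \hs{init}; the \hs{State} wrapper and the \hs{run} function
(with an added clause for the empty input list) are defined locally. The
output is computed from the samples already stored in the state, following
the definition of \hs{fir} given above.

```haskell
-- Assumed definition: a wrapper that marks a value as circuit state.
newtype State s = State s

-- Dot product: pairwise multiplication followed by summation.
(*+*) :: Num a => [a] -> [a] -> a
xs *+* ys = foldl1 (+) (zipWith (*) xs ys)

-- Hypothetical name for the shift operator: put the new sample in
-- front and drop the oldest sample at the back.
shiftInto :: a -> [a] -> [a]
shiftInto x xs = x : init xs

-- FIR filter: the state holds the stored samples and the coefficients.
fir :: Num a => State ([a], [a]) -> a -> (State ([a], [a]), a)
fir (State (xs, hs)) x = (State (shiftInto x xs, hs), xs *+* hs)

-- One list element per clock cycle; the state is threaded through.
run :: (s -> i -> (s, o)) -> s -> [i] -> [o]
run _ _ []         = []
run f s (i : inps) = o : run f s' inps
  where
    (s', o) = f s i

-- Response of a 4-tap filter with coefficients 1..4 to a unit impulse.
impulseResponse :: [Int]
impulseResponse =
  run fir (State ([0, 0, 0, 0], [1, 2, 3, 4])) [1, 0, 0, 0, 0]
```

Feeding a unit impulse into this sketch yields \hs{[0, 1, 2, 3, 4]}: the
coefficients appear at the output one by one, delayed by one cycle because
the output is taken from the stored samples.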
\section{Related work}
Many functional hardware description languages have been developed over the
extension of Backus' \acro{FP} language to synchronous streams, designed
particularly for describing and reasoning about regular circuits. The
Ruby~\cite{Ruby} language uses relations, instead of functions, to describe
-circuits, and has a particular focus on layout. \acro{HML}~\cite{HML2} is a
-hardware modeling language based on the strict functional language
-\acro{ML}, and has support for polymorphic types and higher-order functions.
-Published work suggests that there is no direct simulation support for
-\acro{HML}, and that the translation to \VHDL\ is only partial.
+circuits, and has a particular focus on layout.
+
+\acro{HML}~\cite{HML2} is a hardware modeling language based on the strict
+functional language \acro{ML}, and has support for polymorphic types and
+higher-order functions. Published work suggests that there is no direct
+simulation support for \acro{HML}, but that a description in \acro{HML} has to
+be translated to \VHDL\ and that the translated description can then be
+simulated in a \VHDL\ simulator. Also, not all of the mentioned language
+features of \acro{HML} could be translated to hardware. The \CLaSH\ compiler,
+on the other hand, can correctly translate all of the language constructs
+mentioned in this paper to a netlist format.
Like this work, many functional hardware description languages have some sort
of foundation in the functional programming language Haskell.
Hawk~\cite{Hawk1} uses Haskell to describe system-level executable
specifications used to model the behavior of superscalar microprocessors. Hawk
specifications can be simulated, but there seems to be no support for
-automated circuit synthesis. The ForSyDe~\cite{ForSyDe2} system uses Haskell
-to specify abstract system models, which can (manually) be transformed into an
-implementation model using semantic preserving transformations. ForSyDe has
-several simulation and synthesis backends, though synthesis is restricted to
-the synchronous subset of the ForSyDe language.
+automated circuit synthesis.
+
+The ForSyDe~\cite{ForSyDe2} system uses Haskell to specify abstract system
+models, which can (manually) be transformed into an implementation model using
+semantic preserving transformations. A designer can model systems using
+heterogeneous models of computation, which include continuous time,
+synchronous and untimed models of computation. Using so-called domain
+interfaces a designer can simulate electronic systems which have both analog
+and digital parts. ForSyDe has several backends including simulation and
+automated synthesis, though automated synthesis is restricted to the
+synchronous model of computation within ForSyDe. Unlike \CLaSH, there is no
+support for the automated synthesis of descriptions that contain polymorphism
+or higher-order functions.
Lava~\cite{Lava} is a hardware description language that focuses on the
structural representation of hardware. Besides support for simulation and
generators when viewed from a synthesis viewpoint, in that the language
elements of Haskell, such as choice, can be used to guide the circuit
generation. If a developer wants to insert a choice element inside an actual
-circuit he will have to specify this explicitly as a component. In this
-respect \CLaSH\ differs from Lava, in that all the choice elements, such as
-case-statements and pattern matching, are synthesized to choice elements in the
-eventual circuit. As such, richer control structures can both be specified and
-synthesized in \CLaSH\ compared to any of the languages mentioned in this
-section.
+circuit he will have to explicitly instantiate a multiplexer-like component.
+
+In this respect \CLaSH\ differs from Lava, in that all the choice elements,
+such as case-statements and pattern matching, are synthesized to choice
+elements in the eventual circuit. As such, richer control structures can both
+be specified and synthesized in \CLaSH\ compared to any of the languages
+mentioned in this section.
The merits of polymorphic typing, combined with higher-order functions, are
now also recognized in the `mainstream' hardware description languages,
-exemplified by the new \VHDL-2008 standard~\cite{VHDL2008}. \VHDL-2008 has
-support to specify types as generics, thus allowing a developer to describe
+exemplified by the new \VHDL-2008 standard~\cite{VHDL2008}. \VHDL-2008 support for generics has been extended to types, allowing a developer to describe
polymorphic components. Note that those types still require an explicit
-generic map, whereas type-inference and type-specialization are implicit in
-\CLaSH.
+generic map, whereas types can be automatically inferred in \CLaSH.
% Wired~\cite{Wired},, T-Ruby~\cite{T-Ruby}, Hydra~\cite{Hydra}.
%
% http://www.michaelshell.org/tex/ieeetran/bibtex/
\bibliographystyle{IEEEtran}
% argument is your BibTeX string definitions and bibliography database(s)
-\bibliography{IEEEabrv,clash.bib}
+\bibliography{clash}
%
% <OR> manually copy in the resultant .bbl file
% set second argument of \begin to the number of references