I’ve been interested in approachces to digital logic that narrow the divide between hardware and software design.
From a small exercise in using Haskell to design a toy digital circuit, I want to present the possibility that hardware can be treated like abstract programs, meaning, dedicated hardware description languages can be treated as purely intermediate formats.
Instead of covering too much “why?”—and a limited “how?”—I’m hoping to give a minimal example of two things:
- Only Haskell code is read and written. Standard hardware description languages are relegated to a role akin to that of Assembly in the context of ‘traditional’ Haskell development.
- A straightforward mapping from Haskell to digital circuits. To demonstrate the simplicity of the output, the resulting circuit only uses the spartan (and readily breadboard-able!) 74-series ICs.
Background: Clash
Clash is a “functional hardware description language that borrows both its syntax and semantics from the functional programming language Haskell.” So, Clash programs represent digital circuits using [ADJ] Haskell. Clash compiles to, and can interoperate with, standard hardware description languages like (System)Verilog or VHDL.
Clash is not unique in bending or embedding a high-level language into a HDL:
- Amaranth (Python)
- Bluespec (Haskell inspired)
- Chisel (Scala)
- Hardcaml (OCaml)
- RustHDL (…Rust)
- Silver Oak (Coq)
“The way one of our engineers describes it, ‘Bluespec is to SystemVerilog what Rust is to Assembly. It is a much higher-level way of thinking about the system, using types in the compiler to actually generate a reliable system, a verifiable system.’”
–Bryan Cantrill
Implementation
As a running example, this post will write, test, and synthesize a Linear Feedback Shift Register (LFSR) for use as the core of a primitive pseudo-random number generator. LFSR design isn’t a focus, beyond being an application with interesting properties to test and verify1.
This write-up is generated from the Literate Haskell source code.
module LFSR where
import Clash.Prelude hiding ((!!), elem, iterate, length, map, take)
import Data.List (group, sort)
import Prelude ((!!), elem, iterate, length, map, take)
import Test.Hspec (describe, hspec, it)
import Test.Hspec.QuickCheck (prop)
import Test.QuickCheck
An 8-bit LFSR,
The XOR gates are “tapped” into intermediate bits of a shift register. This configuration generates a sequence of 255 pseudo-random bits before repeating (“maximal”).
In Clash, a function that takes an initial 8-bit state and calculates the resulting state,
lfsr :: BitVector 8 -> BitVector 8
= pack feedback ++# slice d7 d1 s
lfsr s where
= (s ! 0) `xor` (s ! 6) `xor` (s ! 5) `xor` (s ! 4) feedback
Implementation Details
feedback
is the calculated feedback bit, as a result of
all of the XORs.
(!)
: indexing intoBitVector
,(s ! m)
in more standard syntax would bes[m]
.
to construct the next 8-bit state, take the bottom seven bits and prepend the feedback bit.
pack
: for types that implementBitPack
, converts a value of that type into aBitVector
. Here packing aBit
returns aBitVector 1
.(++#)
: appendBitVector
s.slice
:slice d7 d1 s
is the bottom 7 bits ofs
with typeBitVector 7
. Something something type theory, so this requiresSNat
in place normal integer literals.
Testing
Before moving on to representations beyond Haskell, tests should verify properties of this function.
Using this function, for example to calculate the fourth state for an
initial value 0xdead
, looks like
lfsr (lfsr (lfsr (lfsr 0xdead)))
To facilitate testing, a helper function from an initial seed to an stream of pseudo-random bits:
lfsrList :: BitVector 8 -> [Bit]
= map msb . iterate lfsr lfsrList
Implementation Details
This is an infinite (lazily evaluated) list; where the nth element is the state after n transitions.
iterate lfsr seed = [seed, lfsr seed, lfsr (lfsr seed), ...]
Note this lazy infinite list feature of Haskell that is not synthesizable! Later, state will be ended by actual memory.
A specification for LFSR properties will act as a test corpus.
=
spec "LFSR properties" $ do describe
- a seed of zero is a fixed point
"zero_fix" $ lfsr 0 == 0 it
- a non-zero seed is not a fixed point, as a property over positive seeds:
"nonzero_nonfix" $
prop Positive seed) -> lfsr seed /= seed \(
- for any non-zero seed, this “maximal” 8-bit LFSR repeats with a period of 255 steps:
"periodicity" $
prop Positive seed) (Positive n) -> do
\(let trace = lfsrList seed
!! n == trace !! (n + 255) trace
- the output stream is psuedo-random by “runs”: in one period, the number of length 2 runs (same bit twice in a row) occurs twice as often as length 3 runs, etc. For an n bit LFSR, as n increases, the number of runs approaches 2^(n-1):
"runs" $
prop Positive seed) ->
\(length . group . take 255) (lfsrList seed) `elem` [127..129] (
(here it converges to within one of the expected value, depending on the initial seed.)
Running the spec,
$ clashi
clashi> :l LFSR.lhs
(cut)
clashi> hspec spec
LFSR properties
zero_fix
nonzero_nonfix
+++ OK, passed 100 tests.
periodicity
+++ OK, passed 100 tests.
runs
+++ OK, passed 100 tests.
Finished in 0.2829 seconds
4 examples, 0 failures
For the property-based tests, 100 test cases validate when contructed with the given constraints. This gives me enough confidence in this small example to move onto hardware representations.
Compiling to Hardware
While the implementation revolved around a lot of circuit-y terms, so far everything has been written unconstrained from hardware synthesis.
Similar to lfsrList
, but now handling state via a
hardware-oriented memory primitive register
:
lfsrRegister :: HiddenClockResetEnable dom => BitVector 8 -> Signal dom Bit
= msb <$> r
lfsrRegister seed where
= register seed (lfsr <$> r) r
Implementation Details
The Signal
and HiddenClockResetEnable dom
in the type signature are now concepts specific to digital circuits,
where multiple clocks be present for different components. Not important
here – HiddenClockResetEnable
expresses that any domain
“dom
” will work.
This is, in effect, the same function as lfsrList
:
lfsr
is made ‘stateful’ in lfsrRegister
with
memory (register
) compared to iterated applications in
lfsrList
.
Lastly, Clash requires an entry point topEntity
:
topEntity ::
Clock System ->
Signal System Bit ->
Signal System Bit
= ciphertext
topEntity clk plaintext where
= withClockResetEnable clk resetGen enableGen (lfsrRegister 0xff)
keystream = xor <$> plaintext <*> keystream ciphertext
For hardware more exciting than something like a randomly blinking
LED, a signal of input bits xor
-ed with the LFSR signal
acts as really basic cipher. The type signature represents the total
purpose of the circuit: a clock, reset, and enable signal combined with
an input signal of plaintext bits results in an output signal of
ciphered bits.
Back in the REPL, reloading and compiling,
clashi> :r
(cut)
clashi> :verilog
(cut)
Clash: Compiling LFSR.topEntity
Clash: Normalization took 0.060s
Clash: Netlist generation took 0.000s
Clash: Compiling LFSR.topEntity took 0.062s
Clash: Total compilation took 1.396s
With :verilog
, the Haskell topEntity
is now
compiled into HDL.
Yosys
Yosys the Open SYnthesis Suite does all the heavy lifting from here. First, what does the compiled Verilog look like2?
$ yosys
yosys> read_verilog LFSR.v
yosys> hierarchy; proc; flatten; opt; opt_clean -purge
(cut)
yosys> show -width
This makes some intuitive sense:
- the
register
became a$dff
(D flip-flop), - the
$xor
cells line up with the LFSR taps, - there’s a big feedback arrow going from the end back to the beginning,
- the top bit of
r
getsxor
-ed withplaintext
for theresult
.
Synthesis
Finally, synthesizing to logic gates:
$ git submodule add https://github.com/jleightcap/74xx-liberty
$ export TECHMAP_74XX=74xx-liberty
$ export TOP=topEntity
$ yosys -c "$(TECHMAP_74XX)"/synth_74.tcl LFSR.v -o LFSR_74.v
74xx-liberty
maps each primitive cell in a circuit (like $dff
or
$xor
) onto equivalent 74-series ICs3.
The result,
This does… not… make intuitive sense. Cleaning it up with some generic digital circuit symbols,
Oh hey! That’s pretty close to the abstract representation from the start.
Layout
This is a quick sketch at a realizable circuit. Wire this up on a breadboard, etch some copper, or fab a PCB:
Discussion
This is a small example neglecting the vast majority of complexity involved in hardware design. In reality for larger circuits, additional steps might include:
- demonstrate behavioral equivalence between the Haskell and HDL; port the test corpus to execute against the HDL
- electrical simulation of synthesized gates
- formal verification, SymbiYosys
- co-simulation against an independent emulator
Appendix
Generated Verilog
Source
/* AUTOMATICALLY GENERATED VERILOG-2001 SOURCE CODE.
** GENERATED BY CLASH 1.6.4. DO NOT MODIFY.
*/
`timescale 100fs/100fs
module topEntity
( // Inputs
input clk // clock
// Outputs
, output wire result
);
wire c$eta_bindCsr;
// LFSR.lhs:104:3-81
reg [7:0] r = 8'b11101111;
wire [7:0] result_1;
wire [7:0] c$app_arg;
wire [7:0] c$bv;
assign c$bv = (r);
assign result = c$bv[8-1] ;
// resetGen begin
// pragma translate_off
reg rst;
localparam reset_period = 100000 - 10 + (1 * 100000);
`ifndef VERILATOR
initial begin
#1 rst = 1 ;
#reset_period rst = 0 ;
end
`else
always begin
// The redundant (rst | ~ rst) is needed to ensure that this is
// calculated in every cycle by verilator. Without it, the reset will stop
// being updated and will be stuck as asserted forever.
rst = $c("this->reset_gen(",reset_period,",true)") & (rst | ~ rst);
end
`systemc_interface
CData reset_gen(vluint32_t reset_period, bool active_high)
static vluint32_t to_wait = reset_period;
static CData reset = active_high ? 1 : 0;
static bool finished = false;
if(!finished) {
if(to_wait == 0) {
reset = reset == 0 ? 1 : 0;
finished = true;
}
else {
to_wait = to_wait - 1;
}
}
return reset;
`verilog
`endif
assign c$eta_bindCsr = rst;
// pragma translate_on
// resetGen end
// register begin
always @(posedge clk or posedge c$eta_bindCsr) begin : r_register
if ( c$eta_bindCsr) begin
r <= 8'b11101111;
end else begin
r <= result_1;
end
end
// register end
assign result_1 = ((((((c$app_arg[64'sd0]) ^ (c$app_arg[64'sd2])) ^ (c$app_arg[64'sd3])) ^ (c$app_arg[64'sd4])))),(c$app_arg[7 : 1]);
assign c$app_arg = r;
endmodule
Footnotes
This example uses slightly modified functions from the Clash Examples (CC BY 4.0).↩︎
It actually “looks” like the listing in Generated Verilog, but this is written with the intent of treating the Verilog as an intermediate.↩︎