[Hardware] pipeline

Jecel Assumpcao Jr jecel at merlintec.com
Wed Mar 12 23:32:56 UTC 2008


Among the many interesting results mentioned in the first year report from
Viewpoints (http://vpri.org/pdf/steps_TR-2007-008.pdf) is a block
diagram of a RISC processor by Chuck Thacker (Xerox Alto, MS TabletPC
and others). It is very simple - 100 lines of Verilog is mentioned. Some
things that are not explained are easy to figure out, like bit 24
selecting between load constant and regular instructions. The register
set seems to be a flat 128-register file, like Knuth's MMIX or the
Philips Trimedia. Having tried to develop a compiler for the latter, I
must confess that I haven't figured out yet how to make good use of such
a resource.
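
To make the bit 24 remark concrete, here is a minimal Python sketch of
such a decode step. Only "bit 24 selects load constant" and "128
registers" come from the report as I read it; the 32-bit word size, the
position of the destination field and the position of the constant are
my own assumptions for illustration and may not match Thacker's actual
encoding.

```python
LC_BIT = 24                  # from the report: bit 24 selects load constant
CONST_MASK = (1 << 24) - 1   # assumption: the constant sits in bits 23..0
REG_BITS = 7                 # 128 registers need 7-bit register numbers


def decode(word):
    """Split a 32-bit instruction word into a dict (hypothetical layout)."""
    # Assumption: the destination register number is in bits 31..25.
    dest = (word >> 25) & ((1 << REG_BITS) - 1)
    if (word >> LC_BIT) & 1:
        # Load constant: the rest of the word is the literal value.
        return {"kind": "load_constant", "dest": dest,
                "value": word & CONST_MASK}
    # Regular instruction: the low bits hold the remaining fields,
    # which this sketch does not try to split any further.
    return {"kind": "regular", "dest": dest, "fields": word & CONST_MASK}
```

For example, decode((5 << 25) | (1 << 24) | 0x1234) describes loading
0x1234 into register 5 under these assumptions.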

One of the reasons why this design is so simple is that it isn't
pipelined in the traditional RISC style. Yet it does several things
independently (in parallel) and so can execute one instruction per clock
cycle. That clock won't be as fast as would be possible in a pipelined
design, but given the fast memories available in modern FPGAs the
difference won't be too bad.
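
A rough way to put numbers on this trade-off is the little Python model
below. It is only a sketch: the stage delays and the register overhead
are made-up parameters, and it ignores hazards, stalls and branch
penalties, all of which make the pipelined case look better than it
really is.

```python
def single_cycle_time(stage_delays_ns, n_instructions):
    """Total time when all the work of an instruction fits in one clock.

    The clock period must cover every stage delay in sequence.
    """
    return n_instructions * sum(stage_delays_ns)


def pipelined_time(stage_delays_ns, n_instructions, reg_overhead_ns=0.0):
    """Total time for an ideal pipeline with one stage per delay entry.

    The clock period is set by the slowest stage plus the (assumed)
    overhead of the pipeline registers; the extra cycles fill the pipe.
    """
    period = max(stage_delays_ns) + reg_overhead_ns
    return (n_instructions + len(stage_delays_ns) - 1) * period
```

With hypothetical delays of 2, 4 and 3 ns and 1 ns of register
overhead, ten instructions take 90 ns in the single clock design and
60 ns in the ideal pipeline. The shorter the slow stages (such as
memory access) become, the smaller that gap gets.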

There was an interesting series of articles by Jan Gray in "Circuit
Cellar" (http://www.fpgacpu.org/xsoc/cc.html) which I really recommend
to anyone who is interested in CPUs and FPGAs. He designed a 16-bit
pipelined RISC and adapted LCC to generate code for it. The FPGAs he
used in this project were older ones that didn't have any internal
memory, but when newer models became available he did a CPU that was
very similar but was not pipelined: http://www.fpgacpu.org/gr/index.html

None of my stack processors were pipelined (the older Tachyon design
was, but it hid the complexities by multiplexing four or more threads in
what is known as the "barrel execution model":
http://en.wikipedia.org/wiki/Barrel_processor). It seemed natural,
however, that RISC42 should have a five-stage pipeline. In fact, I took
advantage of the normally invisible bypass circuit in pipelined designs
to create the "cascade" instructions with register K and so reduce the
pain of a two-address instruction set.
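
For anyone not familiar with the barrel model, this Python sketch shows
why it hides the pipeline. It is a simplification and not the Tachyon
design itself: it assumes strict round-robin issue and that a result
becomes usable pipeline_depth cycles after its instruction was issued.

```python
def barrel_schedule(n_threads, n_cycles):
    """Return the thread id issued on each cycle (strict round-robin)."""
    return [cycle % n_threads for cycle in range(n_cycles)]


def needs_bypass(n_threads, pipeline_depth):
    """Tell whether a thread could see its own unfinished result.

    A thread issues again n_threads cycles after its previous
    instruction. Under the assumption above, that previous result is
    ready after pipeline_depth cycles, so forwarding is only needed
    when there are fewer threads than stages.
    """
    return n_threads < pipeline_depth
```

With four threads, barrel_schedule(4, 6) gives [0, 1, 2, 3, 0, 1], so
each thread only has one instruction in a four-stage pipeline at any
time and no bypass logic is required.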

I have not been very happy with the complexity due to the combination of
various features. RISC42 is supposed to be educational and not just
functional and fast. So it would be better for it to be a much simpler
single-clock design. That eliminates the K register, but R15 can fill
that role pretty well (even better since it allows task switching
between cascaded instructions). There are no programmer-visible changes
to the design except for the slower clock. FPGAs happen to be very good
for pipelined designs since they have a flip-flop for every logic block,
so this simplification won't save much space.

-- Jecel
P.S.: thanks to a tip from Reinout Heeck I seem to have fixed the
problem with my swiki (it was an "intrusion detection system" in the
ADSL modem)

