

SPREE Tutorial

Peter Yiannacouras

April 13, 2006


Processors on FPGAs

  • You all used FPGAs (ECE241)

    • Adders

    • 7-segment decoders

    • Etc.

  • We are putting whole microprocessors on them

    • We call these soft processors


Hard Versus Soft Processors

  • Soft Processor

    • Written in HDL (Verilog)

    • Programmed onto chip

  • Hard Processors

    • Made of transistors

    • Cost millions to make

    • Faster, smaller, less power


Processors and FPGA Systems

  • FPGAs are a common platform for digital systems

[Diagram: an FPGA system containing a soft processor, UART, memory interface, custom logic, and Ethernet]

  • The soft processor performs coordination and even computation

    • Better processors => less hardware to design

  • We aim to improve soft processors by customizing them


Our Research Problem

  • Soft processors have worse

    • Area

    • Speed

    • Power

  • But are

    • Flexible (use this to counteract the drawbacks above)

  • HOW??? Customize the processor’s architecture

    • e.g. Intel vs AMD

    • e.g. Motorola 68360 vs 68010

  • HOW????


Research Goals

  • Understand tradeoffs in soft processors

    • E.g. a hardware multiplier is big but can perform multiplies fast

  • Customize the processor to the application

    • E.g. bubble sort doesn’t use multiplies, so remove the hardware multiplier and save on area

  • We developed SPREE, software to help us do both
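
The bubble sort case can be sketched in a few lines. This is a minimal Python sketch, assuming the program is available as a list of MIPS mnemonics. The UNIT_FOR_OPCODE table, the unit names, and the instruction mix are hypothetical and only illustrate the idea; they are not taken from SPREE.

```python
# Hypothetical mapping from MIPS mnemonics to the optional hardware unit they need.
# Illustrative only; SPREE's real component names may differ.
UNIT_FOR_OPCODE = {
    "mult": "multiplier",
    "multu": "multiplier",
    "sll": "shifter",
    "srl": "shifter",
    "sra": "shifter",
}

def removable_units(mnemonics):
    """Return the optional units that no instruction in the program uses."""
    used = {UNIT_FOR_OPCODE[m] for m in mnemonics if m in UNIT_FOR_OPCODE}
    return sorted(set(UNIT_FOR_OPCODE.values()) - used)

# A made-up instruction mix for a sort loop: loads, compares, branches, one shift.
bubble_sort_ops = ["lw", "lw", "slt", "beq", "sw", "sw", "addiu", "sll", "bne"]
print(removable_units(bubble_sort_ops))
```

Any unit reported here is a candidate for removal to save area, which is the tradeoff the slide describes.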


SPREE System (Soft Processor Rapid Exploration Environment)

  • Input: Processor description (ISA and datapath)

  • SPREE System

    • Verify ISA against datapath

    • Datapath Instantiation

    • Control Generation

  • Output: Synthesizable Verilog


Input: Instruction Set Architecture (ISA) Description

  • Graph of Generic Operations (GENOPs)

    • Edges indicate flow of data

  • ISA currently fixed (subset of MIPS I)

[Diagram: GENOP graph for MIPS ADD (add rd, rs, rt) with nodes FETCH, RFREAD, RFREAD, ADD, RFWRITE]
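
The MIPS ADD example can be written down as a small data-flow graph. This is a minimal Python sketch, not SPREE's input format. The edges are an assumption (FETCH feeds both register reads, which feed ADD, which feeds RFWRITE), and the `_rs`/`_rt` suffixes are only there to tell the two RFREAD nodes apart.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Assumed GENOP graph for "add rd, rs, rt": each key lists the nodes it reads data from.
add_genops = {
    "FETCH": [],
    "RFREAD_rs": ["FETCH"],
    "RFREAD_rt": ["FETCH"],
    "ADD": ["RFREAD_rs", "RFREAD_rt"],
    "RFWRITE": ["ADD"],
}

def evaluation_order(graph):
    """Order the GENOPs so every node comes after the nodes it reads from."""
    return list(TopologicalSorter(graph).static_order())

order = evaluation_order(add_genops)
print(order)
```

The order starts with FETCH and ends with RFWRITE, following the flow of data along the edges.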


Input: Datapath Description

  • Interconnection of hand-coded components

    • Allows efficient synthesis

  • Described using C++

[Diagram: example datapaths assembled from SPREE Component Library RTL blocks: Ifetch, Reg File, ALU, Mul, Shifter, Data Mem, Write Back]


Component Selection

  • Select by name

    • Names looked up in library

      • Stored in cpugen/rtl_lib

RTLComponent *ifetch=new RTLComponent("ifetch");

RTLComponent *reg_file=new RTLComponent("reg_file");


Datapath Wiring Example

[Diagram: Ifetch (ports rd, rs, rt, offset), Regfile (ports a_reg, b_reg, dst, writedata, a_data, b_data), ALU (ports opA, opB, result)]

proc.addConnection(ifetch,"rs",reg_file,"a_reg");

proc.addConnection(ifetch,"rt",reg_file,"b_reg");
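
To see what such a list of connections amounts to, here is a toy Python model of a netlist. It is not the SPREE C++ API: the Processor class, add_connection, and drivers_of are hypothetical stand-ins, and the check against double-driven ports is an added assumption about what a wiring tool would want to catch.

```python
class Processor:
    """Toy stand-in for the SPREE processor object: it only records connections."""

    def __init__(self):
        self.connections = []

    def add_connection(self, src, src_port, dst, dst_port):
        # Refuse to drive one input port from two sources.
        for _, _, d, p in self.connections:
            if (d, p) == (dst, dst_port):
                raise ValueError(f"{dst}.{dst_port} is already driven")
        self.connections.append((src, src_port, dst, dst_port))

    def drivers_of(self, component):
        """Map each connected input port of a component to its (source, port) driver."""
        return {p: (s, sp) for s, sp, d, p in self.connections if d == component}

proc = Processor()
proc.add_connection("ifetch", "rs", "reg_file", "a_reg")
proc.add_connection("ifetch", "rt", "reg_file", "b_reg")
print(proc.drivers_of("reg_file"))
```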


SPREE System + Backend (Soft Processor Rapid Exploration Environment)

  • SPREE generator (spegen) turns the processor description into Verilog

  • Quartus II CAD software (specadflow) reports

    • 1. Area

    • 2. Clock Frequency

    • 3. Power

  • Benchmarks run on Mint, the MIPS simulator (simulator/run), and on Modelsim, the Verilog simulator (spebenchmark)

    • Compare traces

    • 4. Cycle Count


Walking through an Example (see README.txt)

  • Choose a pre-built processor

    • cpugen/src/arch lists all the processors

  • Let’s choose pipe3_serialshift

    • 3-stage pipeline with serial shifter


Using SPREE on a Processor

  • Generate, benchmark, synthesize

% spegen pipe3_serialshift ← Generates Verilog

% spebenchmark pipe3_serialshift ← Runs benchmarks

% specadflow pipe3_serialshift ← Synthesizes processor

% specompare pipe3_serialshift ← Displays results
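
When exploring several processors, the four steps can be scripted. This is a minimal Python sketch that only builds and prints the command lines, so it runs without SPREE installed. The flow_commands helper is hypothetical; the tool names and their single processor-name argument are taken from the commands above.

```python
import shlex

# The four tools in the order the slide lists them.
FLOW = ["spegen", "spebenchmark", "specadflow", "specompare"]

def flow_commands(processor):
    """Build the command line for each step of the flow for one processor."""
    return [[tool, processor] for tool in FLOW]

for argv in flow_commands("pipe3_serialshift"):
    # Print instead of executing, so the sketch runs without SPREE installed.
    # With SPREE on the PATH, subprocess.run(argv, check=True) would run each step
    # and stop at the first failure.
    print("%", shlex.join(argv))
```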


spegen – Generating Processors

  • Input: Processor description

  • Syntax: spegen <processor name>

  • Output:

    • A folder named after the processor

    • Hand-coded Verilog modules

    • system.v

      • Generated hookup and control

    • OUT.cpugen

      • stages per instruction

      • Hazard window/branch penalty

    • test_bench.v

      • test bench for Modelsim simulation


Benchmarking

  • Run programs on the processor

    • Measure time taken till completion

    • Verify functionality

  • Can do this without knowing anything about the benchmarks themselves


spebenchmark – Benchmarking

  • Input: Processor implementation

  • Syntax: spebenchmark <processor>

  • Output: (ideally)

    • Cycle counts of all benchmarks

    • Traces: /tmp/modelsim_trace.txt

******* Benchmarking pipe3_serialshift ********

Simulating bubble_sort ... Success! Cycle count=2994

Simulating crc ... Success! Cycle count=112750

Simulating des ... Success! Cycle count=5129

Simulating fft ... Success! Cycle count=5077

Simulating fir ... Success! Cycle count=1214

...


Benchmarking – under the hood

  • C source benchmarks → Compiler (gcc - MIPS) → Binary executable

  • Binary executable → Mint MIPS simulator (simulator/run) → Trace

  • Modelsim Verilog simulator (spebenchmark) runs the benchmark on the generated Verilog → Trace and cycle count

  • spebenchmark compares the two traces

  • Files shown: /tmp/modelsim_trace.txt, applications/<benchmark name>/mint, /tmp/modelsim_store_trace.txt


specompiler - Setup compiler

  • Choose the path to your compiler (prebuilt)

    • Default: /jayar/b/b0/yiannac/spe/compiler

      • GCC 3.3.3, software division

    • Another: /jayar/b/b0/yiannac/spe/compiler-softmul

      • GCC 3.3.3, software division and software multiplication

  • specompiler will:

    • Compile all benchmarks (and store binaries)

    • Simulate all benchmarks (and store traces)

% specompiler /jayar/b/b0/yiannac/spe/compiler-softmul

After this point, you can just run spebenchmark


spebenchmark - failure

  • Shows discrepancy between MINT and Modelsim

******* Benchmarking pipe3_serialshift ********

Simulating bubble_sort ... Error: Trace does not match, Cycle count=381

Discrepancy found at 6800000 ps

Modelsim: PC=04000064 | IR=24090001 | 05: 00000000

Mint: PC=040000b8 | IR=8c47004c | 07: 00000064

  • PC and IR: clues to where the error occurred

  • Last two fields: destination register and value being written
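
The comparison behind this report can be sketched as follows. This is a minimal Python sketch, not spebenchmark's implementation. It assumes each trace entry looks like the Modelsim and Mint lines in the report above; the actual layout of the trace files is not shown in the slides, and the function names are hypothetical.

```python
import re

# Assumed line layout, copied from the report: PC, IR, destination register, value.
LINE = re.compile(r"PC=([0-9a-f]{8}) \| IR=([0-9a-f]{8}) \| ([0-9a-f]{2}): ([0-9a-f]{8})")

def parse(line):
    """Split one trace line into (pc, ir, dest_reg, value) strings."""
    m = LINE.search(line)
    if m is None:
        raise ValueError(f"unrecognized trace line: {line!r}")
    return m.groups()

def first_discrepancy(modelsim_lines, mint_lines):
    """Return the index of the first mismatching entry, or None if the traces agree."""
    for i, (a, b) in enumerate(zip(modelsim_lines, mint_lines)):
        if parse(a) != parse(b):
            return i
    # One trace is a prefix of the other: the mismatch is where the shorter one ends.
    if len(modelsim_lines) != len(mint_lines):
        return min(len(modelsim_lines), len(mint_lines))
    return None

modelsim = ["PC=04000064 | IR=24090001 | 05: 00000000"]
mint = ["PC=040000b8 | IR=8c47004c | 07: 00000064"]
print(first_discrepancy(modelsim, mint))
```

The PC and IR of the first mismatching entry point at the instruction to inspect in the waveforms.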


spebenchmark - waveforms

  • Can see any signal within the processor

% sim_gui bubble_sort pipe3_serialshift


Modelsim

  • LEARN IT!!!

  • Quartus Simulator is vastly inferior, and even unusable for our purposes


The Testbench (test_bench.v)

  • What is it?

    • The stimulus and monitor for your circuit

  • SPREE generates it automatically

    • So it works right away

  • Handcoding your own processor means

    • You have to interface with the test bench

    • Once you have the testbench you can use spebenchmark


Manual Interfacing with the Testbench

  • Need only 6 wires

    • To track writes to register file and data mem

  • Wires from your soft processor to test_bench.v

    • regfile_we, regfile_dst, regfile_data

    • datamem_we, datamem_addr, datamem_data
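
What the test bench does with the six wires can be modelled in software. This is a minimal Python sketch, not the generated test_bench.v. It assumes regfile_we and datamem_we are active-high write enables sampled once per cycle, and the sample values are made up.

```python
def record_writes(cycles):
    """Turn per-cycle samples of the six wires into a list of observed writes."""
    trace = []
    for signals in cycles:
        if signals["regfile_we"]:
            trace.append(("reg", signals["regfile_dst"], signals["regfile_data"]))
        if signals["datamem_we"]:
            trace.append(("mem", signals["datamem_addr"], signals["datamem_data"]))
    return trace

# Made-up samples: one register write, one idle cycle, one store.
samples = [
    {"regfile_we": 1, "regfile_dst": 7, "regfile_data": 0x64,
     "datamem_we": 0, "datamem_addr": 0, "datamem_data": 0},
    {"regfile_we": 0, "regfile_dst": 0, "regfile_data": 0,
     "datamem_we": 0, "datamem_addr": 0, "datamem_data": 0},
    {"regfile_we": 0, "regfile_dst": 0, "regfile_data": 0,
     "datamem_we": 1, "datamem_addr": 0x1000, "datamem_data": 0x2A},
]
print(record_writes(samples))
```

A hand-coded processor only has to expose these six signals for its writes to be recorded and compared against the reference trace.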



specadflow – Synthesis

  • Input: Processor implementation

  • Syntax: specadflow <processor name>

  • Performs a “seed sweep”

    • Averages several runs, since results are noisy

    • Runs several instances of Quartus

    • Across several machines in parallel
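
The averaging step of a seed sweep can be sketched as follows. This is a minimal Python sketch, assuming a plain arithmetic mean over seeds; the slides do not say how specadflow combines the runs. The frequencies are made up and are not measured SPREE results.

```python
from statistics import mean, stdev

def summarize_sweep(fmax_by_seed):
    """Average clock frequency over seeds and report the run-to-run spread."""
    values = list(fmax_by_seed.values())
    return mean(values), stdev(values)

# Made-up clock frequencies (MHz) for five seeds; not measured SPREE results.
fmax_by_seed = {1: 74.9, 2: 76.3, 3: 75.1, 4: 77.0, 5: 75.6}
avg, spread = summarize_sweep(fmax_by_seed)
print(f"mean = {avg:.2f} MHz, stdev = {spread:.2f} MHz")
```

The spread shows why a single run is not enough: a difference between two processors that is smaller than the seed-to-seed variation may be noise.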


specadflow Output

  • Output:

    • Synthesis results (hidden)

    • Summary output

Started Tue 6:27PM, Waiting for processes:

10.0.0.61 10.0.0.57 10.0.0.56 10.0.0.55 10.0.0.54

10.0.0.51 Finished Tue 6:33PM

1081 ← Area (LEs or ALUTs)

75.7812 ← Clock Frequency (MHz)

0.99822 ← Estimated Energy/cycle dissipated (nJ/cycle)

... Waiting on eda writer
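
These numbers combine with a benchmark's cycle count to give wall-clock time and total energy. Here is a minimal worked example in Python, using the bubble_sort cycle count from the spebenchmark slide. Pairing it with this synthesis summary assumes both describe pipe3_serialshift, which the slides do not state. The function names are hypothetical.

```python
def runtime_us(cycles, clock_mhz):
    """Wall-clock time in microseconds: cycles divided by cycles per microsecond."""
    return cycles / clock_mhz

def energy_nj(cycles, nj_per_cycle):
    """Total energy in nanojoules for the whole run."""
    return cycles * nj_per_cycle

# Numbers taken from the slides; pairing them assumes both describe pipe3_serialshift.
cycles = 2994            # bubble_sort cycle count from spebenchmark
clock_mhz = 75.7812      # clock frequency from specadflow
nj_per_cycle = 0.99822   # estimated energy per cycle from specadflow

print(f"{runtime_us(cycles, clock_mhz):.2f} us, {energy_nj(cycles, nj_per_cycle):.1f} nJ")
```

This is why both cycle count and clock frequency matter: a design with a faster clock can still lose if it needs more cycles.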


Any Questions?

  • For technical support, ask me


EXTRAS


Setup/Install

  • Copy and unpack the SPREE tarball:

    • /jayar/b/b0/yiannac/spree.tar.gz

  • Build all the SPREE software

% cd spree

% make

  • Follow instructions in INSTALL.txt

  • If there are any errors, email me


SPREE Directory Structure

  • spree

    • applications: benchmarks (C source)

    • compiler: binutils, gcc, newlib

    • cpugen: the CPU generator + processor descriptions

    • simulator: MIPS simulator

    • quartus: synthesis

    • modelsim: Verilog simulator


Setup cluster

  • Choose the cluster you’re using

    • aenao – high performance, limited access

    • eecg – any eecg-connected machine

% specluster eecg

OR

% specluster aenao

  • Edit quartus/machines.txt

    • Put a list of 11 or so good eecg machines

