SPREE Tutorial Peter Yiannacouras April 13, 2006.


1 SPREE Tutorial Peter Yiannacouras April 13, 2006

2 Processors on FPGAs
- You have all used FPGAs (ECE241): adders, 7-segment decoders, etc.
- We are putting whole microprocessors on them
- We call these soft processors

3 Hard Versus Soft Processors
- Soft processor: written in HDL (Verilog), programmed onto the chip
- Hard processor: made of transistors, costs millions to make
- Hard processors are faster, smaller, and use less power

4 Processors and FPGA Systems
- FPGAs are a common platform for digital systems
- Example system on the FPGA: soft processor, memory interface, UART, Ethernet, custom logic
- The soft processor performs coordination and even computation
- Better processors => less hardware to design
- We aim to improve soft processors by customizing them

5 Our Research Problem
- Soft processors have worse area, speed, and power
- But they are flexible, and we can use that flexibility to counteract these costs
- How? Customize the processor’s architecture
  - e.g. Intel vs AMD
  - e.g. Motorola 68360 vs 68010
- How do we customize it?

6 Research Goals
1. Understand tradeoffs in soft processors
   - E.g. a hardware multiplier is big but performs multiplies fast
2. Customize the processor to the application
   - E.g. bubble sort doesn’t use multiplies, so remove the hardware multiplier and save area
We developed SPREE, software to help us do both
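The second goal can be illustrated with a small sketch. It is not SPREE code: `needsMultiplier` is a hypothetical helper, and it assumes the benchmark's instruction mix is already available as a list of mnemonics. It only checks for the two MIPS I multiply instructions, `mult` and `multu`.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper, not part of SPREE: reports whether a benchmark's
// instruction mix contains any instruction that needs a hardware multiplier.
bool needsMultiplier(const std::vector<std::string>& mnemonics) {
    // Multiply instructions defined by MIPS I.
    static const std::set<std::string> kMultiplyOps = {"mult", "multu"};
    for (const std::string& m : mnemonics) {
        assert(!m.empty());  // every entry should be a real mnemonic
        if (kMultiplyOps.count(m) > 0) {
            return true;
        }
    }
    return false;
}
```

If the function returns false for every benchmark of interest, the multiplier is a candidate for removal from the datapath. Whether that is safe also depends on how the compiler was configured (for example, software multiplication).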

7 SPREE System (Soft Processor Rapid Exploration Environment)
- Input: processor description (ISA + datapath)
- SPREE System:
  1. Verify ISA against datapath
  2. Datapath instantiation
  3. Control generation
- Output: synthesizable Verilog

8 Input: Instruction Set Architecture (ISA) Description
- Each instruction is a graph of Generic Operations (GENOPs)
- Edges indicate flow of data
- Example, MIPS ADD (add rd, rs, rt): FETCH, RFREAD, RFREAD, ADD, RFWRITE
- ISA currently fixed (subset of MIPS I)
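A GENOP graph like the one for MIPS ADD could be represented along the following lines. This is a sketch only: `GenopGraph` and `buildMipsAdd` are illustrative names, not SPREE's actual ISA description format, and the edge list is an assumed, simplified reading of the slide's figure.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Illustrative structure only; SPREE's own ISA description format may differ.
struct GenopGraph {
    std::vector<std::string> nodes;                          // GENOP name of each node
    std::vector<std::pair<std::size_t, std::size_t>> edges;  // (producer, consumer)

    std::size_t addNode(const std::string& genop) {
        nodes.push_back(genop);
        return nodes.size() - 1;
    }

    void addEdge(std::size_t from, std::size_t to) {
        assert(from < nodes.size() && to < nodes.size());
        edges.emplace_back(from, to);
    }

    // Counts the edges that deliver data into node `to`.
    std::size_t fanIn(std::size_t to) const {
        std::size_t count = 0;
        for (const auto& e : edges) {
            if (e.second == to) ++count;
        }
        return count;
    }
};

// MIPS "add rd, rs, rt" using the GENOP names from the slide.
// The edges are an assumption about how data flows between them.
GenopGraph buildMipsAdd() {
    GenopGraph g;
    const std::size_t fetch  = g.addNode("FETCH");
    const std::size_t readRs = g.addNode("RFREAD");
    const std::size_t readRt = g.addNode("RFREAD");
    const std::size_t add    = g.addNode("ADD");
    const std::size_t write  = g.addNode("RFWRITE");
    g.addEdge(fetch, readRs);  // rs field selects the first operand
    g.addEdge(fetch, readRt);  // rt field selects the second operand
    g.addEdge(readRs, add);
    g.addEdge(readRt, add);
    g.addEdge(add, write);     // sum is written back to the register file
    return g;
}
```

Here the ADD node has a fan-in of two (one per register read), which is the kind of property a tool can check against the ports available in the datapath.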

9 Input: Datapath Description
- Interconnection of hand-coded components
- Allows efficient synthesis
- Described using C++
- SPREE Component Library: Mul, Ifetch, Reg File, ALU, Write Back, Data Mem
- Example datapaths (figure):
  - Mul, Ifetch, Reg File, ALU, Write Back
  - Mul, Ifetch, Reg File, ALU, Shifter, Data Mem

10 Component Selection
- Select components by name
- Names are looked up in the library, stored in cpugen/rtl_lib

    RTLComponent *ifetch = new RTLComponent("ifetch");
    RTLComponent *reg_file = new RTLComponent("reg_file");

11 Datapath Wiring Example
- Ifetch ports: rd, rs, rt, offset
- Regfile ports: dst, a_reg, a_data, b_reg, b_data, writedata
- ALU ports: opA, opB, result

    proc.addConnection(ifetch, "rs", reg_file, "a_reg");
    proc.addConnection(ifetch, "rt", reg_file, "b_reg");
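To experiment with the wiring calls outside of SPREE, a self-contained stand-in can help. In the sketch below, `RTLComponent` and `Processor` are simplified mock types that only mimic the call shape shown on the slides; they are not SPREE's real classes. The `"alu"` library name and the two register-file-to-ALU connections are assumptions built from the port names in the figure.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stand-in for illustration only; SPREE's real RTLComponent lives in cpugen.
struct RTLComponent {
    std::string name;
    explicit RTLComponent(const std::string& n) : name(n) {}
};

struct Connection {
    std::string srcComp, srcPort, dstComp, dstPort;
};

// Stand-in for whatever type `proc` has in the slides.
struct Processor {
    std::vector<Connection> connections;

    void addConnection(const RTLComponent* src, const std::string& srcPort,
                       const RTLComponent* dst, const std::string& dstPort) {
        assert(src != nullptr && dst != nullptr);
        connections.push_back({src->name, srcPort, dst->name, dstPort});
    }

    // True if some connection drives the given input port.
    bool isDriven(const std::string& comp, const std::string& port) const {
        for (const Connection& c : connections) {
            if (c.dstComp == comp && c.dstPort == port) return true;
        }
        return false;
    }
};

Processor wireExample() {
    RTLComponent ifetch("ifetch");
    RTLComponent reg_file("reg_file");
    RTLComponent alu("alu");  // hypothetical library name
    Processor proc;
    proc.addConnection(&ifetch, "rs", &reg_file, "a_reg");  // from the slide
    proc.addConnection(&ifetch, "rt", &reg_file, "b_reg");  // from the slide
    proc.addConnection(&reg_file, "a_data", &alu, "opA");   // assumed wiring
    proc.addConnection(&reg_file, "b_data", &alu, "opB");   // assumed wiring
    return proc;
}
```

A helper such as `isDriven` makes it easy to spot inputs left unconnected, for example the register file's `writedata` port in this partial datapath.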

12 SPREE System + Backend (Soft Processor Rapid Exploration Environment)
- Processor description -> SPREE generator (spegen) -> Verilog
- Verilog -> Quartus II CAD software (specadflow) -> 1. Area, 2. Clock frequency, 3. Power
- Verilog + benchmarks -> Modelsim Verilog simulator (spebenchmark) -> 4. Cycle count
- Benchmarks -> Mint MIPS simulator (simulator/run) -> trace, compared against the Modelsim trace

13 Walking through an Example (see README.txt)
- Choose a pre-built processor; cpugen/src/arch lists all the processors
- Let’s choose pipe3_serialshift: a 3-stage pipeline with a serial shifter

14 Using SPREE on a Processor
Generate, benchmark, synthesize:

    % spegen pipe3_serialshift        ← Generates Verilog
    % spebenchmark pipe3_serialshift  ← Runs benchmarks
    % specadflow pipe3_serialshift    ← Synthesizes processor
    % specompare pipe3_serialshift    ← Displays results

15 spegen – Generating Processors
- Input: processor description
- Syntax: spegen
- Output: a folder named after the processor, containing
  - Hand-coded Verilog modules
  - system.v: generated hookup and control
  - OUT.cpugen: stages per instruction, hazard window/branch penalty
  - test_bench.v: test bench for Modelsim simulation

16 Benchmarking
- Run programs on the processor
- Measure time taken until completion
- Verify functionality
- Can do all this without knowing anything about the benchmarks themselves

17 spebenchmark – Benchmarking
- Input: processor implementation
- Syntax: spebenchmark
- Output: (ideally) cycle counts of all benchmarks
- Traces: /tmp/modelsim_trace.txt

    ******* Benchmarking pipe3_serialshift ********
    Simulating bubble_sort... Success! Cycle count=2994
    Simulating crc... Success! Cycle count=112750
    Simulating des... Success! Cycle count=5129
    Simulating fft... Success! Cycle count=5077
    Simulating fir... Success! Cycle count=1214
    ...

18 Benchmarking – under the hood
- C source benchmarks -> Compiler (gcc - MIPS) -> binary executable
- Binary executable -> Mint MIPS simulator (simulator/run) -> trace
- Binary executable + Verilog -> Modelsim Verilog simulator (spebenchmark) -> trace and cycle count
- spebenchmark compares the two traces
- Paths shown in the figure: /tmp/modelsim_trace.txt, /tmp/modelsim_store_trace.txt, applications/ /mint

19 specompiler - Setup compiler
- Choose the path to your (prebuilt) compiler
  - Default: /jayar/b/b0/yiannac/spe/compiler (GCC 3.3.3, software division)
  - Another: /jayar/b/b0/yiannac/spe/compiler-softmul (GCC 3.3.3, software division and software multiplication)
- specompiler will:
  1. Compile all benchmarks (and store binaries)
  2. Simulate all benchmarks (and store traces)

    % specompiler /jayar/b/b0/yiannac/spe/compiler-softmul

- After this point, you can just run spebenchmark

20 spebenchmark - failure
- Shows the discrepancy between MINT and Modelsim
- Gives clues to where the error occurred

    ******* Benchmarking pipe3_serialshift ********
    Simulating bubble_sort... Error: Trace does not match, Cycle count=381
    Discrepancy found at 6800000 ps
    Modelsim: PC=04000064 | IR=24090001 | 05: 00000000
    Mint:     PC=040000b8 | IR=8c47004c | 07: 00000064

- In the last field (e.g. 05: 00000000), the first number is the destination register and the second is the value being written
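The idea behind the trace check can be sketched as follows. This is not spebenchmark's actual implementation: `TraceEntry`, `sameEntry`, and `firstDiscrepancy` are hypothetical names, and the sketch assumes each trace has already been parsed into PC, IR, destination register, and written value, matching the fields in the error report above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One register-file write, as printed in the discrepancy report.
struct TraceEntry {
    std::uint32_t pc;
    std::uint32_t ir;
    unsigned destReg;
    std::uint32_t value;
};

bool sameEntry(const TraceEntry& a, const TraceEntry& b) {
    assert(a.destReg < 32 && b.destReg < 32);  // MIPS has 32 general-purpose registers
    return a.pc == b.pc && a.ir == b.ir &&
           a.destReg == b.destReg && a.value == b.value;
}

// Returns the index of the first entry where the traces disagree,
// or -1 if they are identical. A length mismatch counts as a
// discrepancy at the end of the shorter trace.
long firstDiscrepancy(const std::vector<TraceEntry>& modelsim,
                      const std::vector<TraceEntry>& mint) {
    const std::size_t common =
        modelsim.size() < mint.size() ? modelsim.size() : mint.size();
    for (std::size_t i = 0; i < common; ++i) {
        if (!sameEntry(modelsim[i], mint[i])) {
            return static_cast<long>(i);
        }
    }
    if (modelsim.size() != mint.size()) {
        return static_cast<long>(common);
    }
    return -1;
}
```

Reporting only the first mismatch is a deliberate choice in this sketch: once the processor diverges from the reference simulator, later entries are usually wrong as a consequence, so the first one is the useful clue.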

21 spebenchmark - waveforms
- Can see any signal within the processor

    % sim_gui bubble_sort pipe3_serialshift

22 Modelsim
- LEARN IT!!!
- The Quartus simulator is vastly inferior, and even unusable for our purposes

23 The Testbench (test_bench.v)
- What is it? The stimulus and monitor for your circuit
- SPREE generates it automatically, so it works right away
- Hand-coding your own processor means you have to interface with the test bench yourself
- Once you have the testbench you can use spebenchmark

24 Manual Interfacing with the Testbench
- test_bench.v needs only 6 wires from your soft processor
- They track writes to the register file and data memory
  - Register file: regfile_we, regfile_dst, regfile_data
  - Data memory: datamem_we, datamem_addr, datamem_data

25 SPREE System + Backend (Soft Processor Rapid Exploration Environment)
- Processor description -> SPREE generator (spegen) -> Verilog
- Verilog -> Quartus II CAD software (specadflow) -> 1. Area, 2. Clock frequency, 3. Power
- Verilog + benchmarks -> Modelsim Verilog simulator (spebenchmark) -> 4. Cycle count
- Benchmarks -> Mint MIPS simulator (simulator/run) -> trace, compared against the Modelsim trace

26 specadflow – Synthesis
- Input: processor implementation
- Syntax: specadflow
- Performs a “seed sweep”
  - Averages several runs, since results are noisy
  - Runs several instances of Quartus across several machines in parallel
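The averaging step of a seed sweep might look like the sketch below. It assumes a plain arithmetic mean over per-seed results; the slides do not say which kind of average specadflow uses. `SynthResult` and `averageRuns` are hypothetical names.

```cpp
#include <cassert>
#include <vector>

// Metrics reported for one synthesis run (one seed).
struct SynthResult {
    double areaLEs;           // area in LEs or ALUTs
    double clockMHz;          // clock frequency
    double energyNJPerCycle;  // estimated energy per cycle
};

// Arithmetic mean of each metric across all seeds (an assumed choice).
SynthResult averageRuns(const std::vector<SynthResult>& runs) {
    assert(!runs.empty());  // averaging zero runs is undefined
    SynthResult sum{0.0, 0.0, 0.0};
    for (const SynthResult& r : runs) {
        sum.areaLEs += r.areaLEs;
        sum.clockMHz += r.clockMHz;
        sum.energyNJPerCycle += r.energyNJPerCycle;
    }
    const double n = static_cast<double>(runs.size());
    return {sum.areaLEs / n, sum.clockMHz / n, sum.energyNJPerCycle / n};
}
```

Averaging over seeds reduces the chance that one lucky or unlucky placement decides a comparison between two processor variants.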

27 specadflow Output
- Output: synthesis results (hidden) and summary output

    Started Tue 6:27PM, Waiting for processes: 10.0.0.61 10.0.0.57 10.0.0.56 10.0.0.55 10.0.0.54 10.0.0.51
    Finished Tue 6:33PM
    1081 75.7812 0.99822...
    Waiting on eda writer

- The three summary numbers are, in order: area (LEs or ALUTs), clock frequency (MHz), estimated energy dissipated per cycle (nJ/cycle)
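Cycle counts from spebenchmark and the synthesis metrics combine into wall-clock time and total energy: time is cycles divided by clock frequency, and energy is cycles times energy per cycle. The helper names in this sketch are hypothetical and the code is not part of SPREE.

```cpp
#include <cassert>

// Cycles divided by MHz gives microseconds directly
// (cycles / (1e6 cycles per second) = seconds, times 1e6).
double executionTimeMicros(unsigned long cycles, double clockMHz) {
    assert(clockMHz > 0.0);
    return static_cast<double>(cycles) / clockMHz;
}

// Total energy in nanojoules for a run of the given length.
double totalEnergyNanojoules(unsigned long cycles, double energyNJPerCycle) {
    return static_cast<double>(cycles) * energyNJPerCycle;
}
```

For instance, if the 2994-cycle bubble_sort result and the 75.7812 MHz figure belonged to the same processor (the slides do not state this), the run would take roughly 39.5 microseconds. This is why a design with a lower cycle count is not automatically faster: a slower clock can cancel the gain.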

28 Any Questions?
- For technical support, ask me

29 EXTRAS

30 Setup/Install
- Copy and unpack the SPREE tarball: /jayar/b/b0/yiannac/spree.tar.gz
- Build all the SPREE software

    % cd spree
    % make

- Follow the instructions in INSTALL.txt
- If there are any errors, email me

31 SPREE Directory Structure
spree
- applications: benchmarks (C source)
- compiler: binutils, gcc, newlib
- cpugen: the cpu generator + processor descriptions
- modelsim: Verilog simulator
- quartus: synthesis
- simulator: MIPS simulator

32 Setup cluster
- Choose the cluster you’re using
  - aenao: high performance, limited access
  - eecg: any eecg-connected machine
    - Edit quartus/machines.txt and put in a list of 11 or so good eecg machines

    % specluster eecg
  OR
    % specluster aenao

