Design methods to facilitate rapid growth of SoCs

Slides:



Advertisements
Similar presentations
ECE Synthesis & Verification - Lecture 2 1 ECE 667 Spring 2011 ECE 667 Spring 2011 Synthesis and Verification of Digital Circuits High-Level (Architectural)
Advertisements

An EHR based methodology for Concurrency management Arvind (with Asif Khan) Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology.
March 2007http://csg.csail.mit.edu/arvindSemantics-1 Scheduling Primitives for Bluespec Arvind Computer Science & Artificial Intelligence Lab Massachusetts.
Behavioral Design Outline –Design Specification –Behavioral Design –Behavioral Specification –Hardware Description Languages –Behavioral Simulation –Behavioral.
Constructive Computer Architecture Tutorial 4: SMIPS on FPGA Andy Wright 6.S195 TA October 7, 2013http://csg.csail.mit.edu/6.s195T04-1.
Stmt FSM Richard S. Uhler Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology (based on a lecture prepared by Arvind)
February 21, 2007http://csg.csail.mit.edu/6.375/L07-1 Bluespec-4: Architectural exploration using IP lookup Arvind Computer Science & Artificial Intelligence.
March, 2007http://csg.csail.mit.edu/arvindIPlookup-1 IP Lookup Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology.
September 24, L08-1 IP Lookup: Some subtle concurrency issues Arvind Computer Science & Artificial Intelligence Lab.
IP Lookup: Some subtle concurrency issues Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology February 22, 2011L06-1.
IP Lookup: Some subtle concurrency issues Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology March 4, 2013
December 10, 2009 L29-1 The Semantics of Bluespec Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute.
Computer Architecture: A Constructive Approach Sequential Circuits Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology.
ASIC/FPGA design flow. FPGA Design Flow Detailed (RTL) Design Detailed (RTL) Design Ideas (Specifications) Design Ideas (Specifications) Device Programming.
February 22, 2005http://csg.csail.mit.edu/6.884/L07-1 Bluespec-1: Design Affects Everything Arvind Computer Science & Artificial Intelligence Lab Massachusetts.
Lecture 9. MIPS Processor Design – Instruction Fetch Prof. Taeweon Suh Computer Science Education Korea University 2010 R&E Computer System Education &
Copyright © Bluespec Inc Bluespec SystemVerilog™ Design Example A DMA Controller with a Socket Interface.
February 14, 2007L04-1http://csg.csail.mit.edu/6.375/ Bluespec-1: Design methods to facilitate rapid growth of SoCs Arvind Computer Science & Artificial.
September 3, 2009L02-1http://csg.csail.mit.edu/korea Introduction to Bluespec: A new methodology for designing Hardware Arvind Computer Science & Artificial.
Introduction to Bluespec: A new methodology for designing Hardware Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology.
March, 2007Intro-1http://csg.csail.mit.edu/arvind Design methods to facilitate rapid growth of SoCs Arvind Computer Science & Artificial Intelligence Lab.
March 6, 2006http://csg.csail.mit.edu/6.375/L10-1 Bluespec-4: Rule Scheduling and Synthesis Arvind Computer Science & Artificial Intelligence Lab Massachusetts.
Electrical and Computer Engineering University of Cyprus LAB 1: VHDL.
Constructive Computer Architecture: Guards Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology September 24, 2014.
September 22, 2009http://csg.csail.mit.edu/koreaL07-1 Asynchronous Pipelines: Concurrency Issues Arvind Computer Science & Artificial Intelligence Lab.
Constructive Computer Architecture Sequential Circuits Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology
Constructive Computer Architecture Sequential Circuits Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology September.
Introduction to Bluespec: A new methodology for designing Hardware Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology.
Constructive Computer Architecture Sequential Circuits - 2 Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology.
Modular Refinement Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology March 8,
October 22, 2009http://csg.csail.mit.edu/korea Modular Refinement Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology.
Introduction to Bluespec: A new methodology for designing Hardware Arvind Computer Science & Artificial Intelligence Lab. Massachusetts Institute of Technology.
Multiple Clock Domains (MCD) Arvind with Nirav Dave Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology.
Problem: design complexity advances in a pace that far exceeds the pace in which verification technology advances. More accurately: (verification complexity)
October 20, 2009L14-1http://csg.csail.mit.edu/korea Concurrency and Modularity Issues in Processor pipelines Arvind Computer Science & Artificial Intelligence.
February 28, 2005http://csg.csail.mit.edu/6.884/L09-1 Bluespec-3: Modules & Interfaces Arvind Computer Science & Artificial Intelligence Lab Massachusetts.
Introduction to Bluespec: A new methodology for designing Hardware
Introduction to Bluespec: A new methodology for designing Hardware
ASIC Design Methodology
Bluespec-6: Modeling Processors
Digital System Verification
Folded “Combinational” circuits
Sequential Circuits - 2 Constructive Computer Architecture Arvind
Sequential Circuits Constructive Computer Architecture Arvind
Sequential Circuits: Constructive Computer Architecture
Introduction to Bluespec: A new methodology for designing Hardware
Stmt FSM Arvind (with the help of Nirav Dave)
Pipelining combinational circuits
Multirule Systems and Concurrent Execution of Rules
Bluespec-1: Design Affects Everything
Constructive Computer Architecture: Guards
Sequential Circuits Constructive Computer Architecture Arvind
Constructive Computer Architecture Tutorial 7 Final Project Overview
Pipelining combinational circuits
Modular Refinement Arvind
Modular Refinement Arvind
Lecture 1.3 Hardware Description Languages (HDLs)
Modules with Guarded Interfaces
Pipelining combinational circuits
Sequential Circuits - 2 Constructive Computer Architecture Arvind
Introduction to Bluespec: A new methodology for designing Hardware
Stmt FSM Arvind (with the help of Nirav Dave)
Constructive Computer Architecture: Guards
GCD: A simple example to introduce Bluespec
Win with HDL Slide 4 System Level Design
Introduction to Bluespec: A new methodology for designing Hardware
The Verilog Hardware Description Language
Modular Refinement Arvind
Bluespec SystemVerilog™
CS295: Modern Systems Bluespec Introduction
Presentation transcript:

Design methods to facilitate rapid growth of SoCs Bluespec-1: Design methods to facilitate rapid growth of SoCs Arvind Computer Science & Artificial Intelligence Lab Massachusetts Institute of Technology February 14, 2007 http://csg.csail.mit.edu/6.375/

The biggest SoC drivers Explosive growth in markets for cell phones game boxes sensors and actuators Functionality and applications are constrained primarily by: - cost - power/energy constrains February 14, 2007 http://csg.csail.mit.edu/6.375/

Current Cellphone Architecture Comms. Processing Application Processing WLAN RF WCDMA/GSM RF Today’s chip becomes a block in tomorrow’s chip IP reuse is essential Hardware/software migration February 14, 2007 http://csg.csail.mit.edu/6.375/

An under appreciated fact If a functionality (e.g. H.264) is moved from a programmable device to a specialized hardware block, the power/energy savings are 100 to 1000 fold Power savings  more specialized hardware but our mind set Software is forgiving Hardware design is difficult, inflexible, brittle, error prone, ... February 14, 2007 http://csg.csail.mit.edu/6.375/

SoC Trajectory: multicores, heterogeneous, regular, ... IBM Cell Processor SoC Trajectory: multicores, heterogeneous, regular, ... On-chip memory banks Application-specific processing units General-purpose processors Structured on-chip networks Can we rapidly produce high-quality chips and surrounding systems and software? February 14, 2007 http://csg.csail.mit.edu/6.375/

Things to remember Design costs (hardware & software) dominate Within these costs verification and validation costs dominate IP reuse is essential to prevent design-team sizes from exploding design cost = number of engineers x time to design February 14, 2007 http://csg.csail.mit.edu/6.375/

Common quotes “Design is not a problem; design is easy” “Verification is a problem” “Timing closure is a problem” “Physical design is a problem” Almost complete reliance on post-design verification for quality Mind set February 14, 2007 http://csg.csail.mit.edu/6.375/

Through the early 1980s: and U.S. quality was… The U.S. auto industry Sought quality solely through post-build inspection Planned for defects and rework and U.S. quality was… Defect Make Inspect Rework February 14, 2007 http://csg.csail.mit.edu/6.375/

… less than world class Adding quality inspectors (“verification engineers”) and giving them better tools, was not the solution The Japanese auto industry showed the way “Zero defect” manufacturing February 14, 2007 http://csg.csail.mit.edu/6.375/

New mind set: Design affects everything! A good design methodology Can keep up with changing specs Permits architectural exploration Facilitates verification and debugging Eases changes for timing closure Eases changes for physical design Promotes reuse  It is essential to Design for Correctness February 14, 2007 http://csg.csail.mit.edu/6.375/

New ways of expressing behavior to reduce design complexity Decentralize complexity: Rule-based specifications (Guarded Atomic Actions) Lets you think one rule at a time Formalize composition: Modules with guarded interfaces Automatically manage and ensure the correctness of connectivity, i.e., correct-by-construction methodology Strong flavor of Unity Bluespec Smaller, simpler, clearer, more correct code February 14, 2007 http://csg.csail.mit.edu/6.375/

Reusing IP Blocks Example: Commercially available FIFO IP block data_in push_req_n pop_req_n clk rstn data_out full empty Example: Commercially available FIFO IP block These constraints are spread over many pages of the documentation... February 14, 2007 http://csg.csail.mit.edu/6.375/

Bluespec promotes composition through guarded interfaces Self-documenting interfaces; Automatic generation of logic to eliminate conflicts in use. theModuleA theFifo.enq(value1); theFifo.deq(); value2 = theFifo.first(); theFifo.enq(value3); value4 = theFifo.first(); Enqueue arbitration control theFifo Dequeue arbitration control n rdy enab enq not full not empty theModuleB deq FIFO n first February 14, 2007 http://csg.csail.mit.edu/6.375/

Bluespec What is it? Programming with Rules Synthesis of circuits 5-minute break to stretch you legs What is it? Programming with Rules Example GCD Synthesis of circuits Another Example: Multiplication Bluespec is available in two versions: BSV – Bluespec in System Verilog ESEPro – Bluespec in SystemC These lectures will use BSV syntax February 14, 2007 http://csg.csail.mit.edu/6.375/

Bluespec SystemVerilog (BSV) Power to express complex static structures and constraints Checked by the compiler “Micro-protocols” are managed by the compiler The necessary hardware for muxing and control is generated automatically and is correct by construction Easier to make changes while preserving correctness Smaller, simpler, clearer, more correct code not just simulation, synthesis as well February 14, 2007 http://csg.csail.mit.edu/6.375/

Bluespec: State and Rules organized into modules interface module All state (e.g., Registers, FIFOs, RAMs, ...) is explicit. Behavior is expressed in terms of atomic actions on the state: Rule: condition  action Rules can manipulate state in other modules only via their interfaces. February 14, 2007 http://csg.csail.mit.edu/6.375/

Programming with rules: A simple example Euclid’s algorithm for computing the Greatest Common Divisor (GCD): 15 6 9 6 subtract 3 6 subtract 6 3 swap 3 3 subtract 0 3 subtract answer: February 14, 2007 http://csg.csail.mit.edu/6.375/

GCD in BSV module mkGCD (I_GCD); Reg#(int) x <- mkRegU; y swap sub module mkGCD (I_GCD); Reg#(int) x <- mkRegU; Reg#(int) y <- mkReg(0); rule swap ((x > y) && (y != 0)); x <= y; y <= x; endrule rule subtract ((x <= y) && (y != 0)); y <= y – x; method Action start(int a, int b) if (y==0); x <= a; y <= b; endmethod method int result() if (y==0); return x; endmodule State typedef int Int#(32) Internal behavior External interface Assumes x /= 0 and y /= 0 February 14, 2007 http://csg.csail.mit.edu/6.375/

GCD Hardware Module implicit conditions interface I_GCD; rdy enab int start result module GCD y == 0 implicit conditions interface I_GCD; method Action start (int a, int b); method int result(); endinterface The module can easily be made polymorphic Many different implementations can provide the same interface: module mkGCD (I_GCD) February 14, 2007 http://csg.csail.mit.edu/6.375/

GCD: Another implementation module mkGCD (I_GCD); Reg#(int) x <- mkRegU; Reg#(int) y <- mkReg(0); rule swapANDsub ((x > y) && (y != 0)); x <= y; y <= x - y; endrule rule subtract ((x<=y) && (y!=0)); y <= y – x; method Action start(int a, int b) if (y==0); x <= a; y <= b; endmethod method int result() if (y==0); return x; endmodule Combine swap and subtract rule Does it compute faster ? February 14, 2007 http://csg.csail.mit.edu/6.375/

Bluespec SystemVerilog source Bluespec Tool flow Blueview Bluespec SystemVerilog source Bluespec Compiler C Bluesim Cycle Accurate Verilog 95 RTL RTL synthesis gates Verilog sim VCD output files Bluespec tools 3rd party tools Legend Debussy Visualization Place & Route Physical February 14, 2007 http://csg.csail.mit.edu/6.375/ Tapeout

Generated Verilog RTL: GCD module mkGCD(CLK,RST_N,start_a,start_b,EN_start,RDY_start, result,RDY_result); input CLK; input RST_N; // action method start input [31 : 0] start_a; input [31 : 0] start_b; input EN_start; output RDY_start; // value method result output [31 : 0] result; output RDY_result; // register x and y reg [31 : 0] x; wire [31 : 0] x$D_IN; wire x$EN; reg [31 : 0] y; wire [31 : 0] y$D_IN; wire y$EN; ... // rule RL_subtract assign WILL_FIRE_RL_subtract = x_SLE_y___d3 && !y_EQ_0___d10 ; // rule RL_swap assign WILL_FIRE_RL_swap = !x_SLE_y___d3 && !y_EQ_0___d10 ; February 14, 2007 http://csg.csail.mit.edu/6.375/

Generated Hardware start result x y en rdy next state values swap? sub x_en y_en x y > !(=0) swap? subtract? predicates x_en = y_en = February 14, 2007 http://csg.csail.mit.edu/6.375/

Generated Hardware Module x y en rdy start result start_en start_en x_en y_en x y > !(=0) swap? subtract? sub x_en = y_en = rdy = February 14, 2007 http://csg.csail.mit.edu/6.375/

GCD: A Simple Test Bench module mkTest (); Reg#(int) state <- mkReg(0); I_GCD gcd <- mkGCD(); rule go (state == 0); gcd.start (423, 142); state <= 1; endrule rule finish (state == 1); $display (“GCD of 423 & 142 =%d”,gcd.result()); state <= 2; endmodule Why do we need the state variable? February 14, 2007 http://csg.csail.mit.edu/6.375/

GCD: Test Bench Feeds all pairs (c1,c2) 1 < c1 < 7 module mkTest (); Reg#(int) state <- mkReg(0); Reg#(Int#(4)) c1 <- mkReg(1); Reg#(Int#(7)) c2 <- mkReg(1); I_GCD gcd <- mkGCD(); rule req (state==0); gcd.start(signExtend(c1), signExtend(c2)); state <= 1; endrule rule resp (state==1); $display (“GCD of %d & %d =%d”, c1, c2, gcd.result()); if (c1==7) begin c1 <= 1; c2 <= c2+1; state <= 0; end else c1 <= c1+1; if (c2 == 63) state <= 2; endmodule Feeds all pairs (c1,c2) 1 < c1 < 7 1 < c2 < 15 to GCD February 14, 2007 http://csg.csail.mit.edu/6.375/

GCD: Synthesis results Original (16 bits) Clock Period: 1.6 ns Area: 4240 mm2 Unrolled (16 bits) Clock Period: 1.65ns Area: 5944 mm2 Unrolled takes 31% fewer cycles on the testbench February 14, 2007 http://csg.csail.mit.edu/6.375/

One step of multiplication Multiplier Example Simple binary multiplication: 1001 0101 0000 0101101 // d = 4’d9 // r = 4’d5 // d << 0 (since r[0] == 1) // 0 << 1 (since r[1] == 0) // d << 2 (since r[2] == 1) // 0 << 3 (since r[3] == 0) // product (sum of above) = 45 x What does it look like in Bluespec? A module consists of: Rules – Are the fundumental means of expressing behavior in BSV Module interface – consists of Methods and Subinterfaces which describe micro protocols for interaction of the mentioned module with the outside world. d r product One step of multiplication February 14, 2007 http://csg.csail.mit.edu/6.375/

Multiplier in Bluespec module mkMult (I_mult); Reg#(Int#(32)) product <- mkReg(0); Reg#(Int#(16)) d <- mkReg(0); Reg#(Int#(16)) r <- mkReg(0); rule cycle endrule method Action start endmethod method Int#(32) result () endmodule February 14, 2007 http://csg.csail.mit.edu/6.375/

Summary Market forces are demanding a much greater variety of SoCs The design cost for SoCs has to be brought down dramatically by facilitating IP reuse High-level synthesis tools are essential for architectural exploration and IP development Bluespec is both high-level and synthesizable Next time: Combinational Circuits and Simple pipelines February 14, 2007 http://csg.csail.mit.edu/6.375/