Register-Transfer Level (RTL) Design

Slides:



Advertisements
Similar presentations
Lecture 13: Sequential Circuits
Advertisements

CPE 201 Digital Design Lecture 25: Register Transfer Level Design (2)
CS1Q Computer Systems Lecture 12 Simon Gay. Lecture 12CS1Q Computer Systems - Simon Gay 2 Design of Sequential Circuits The systematic design of sequential.
Lecture 23: Registers and Counters (2)
Registers and Counters
1 RTL Example: Video Compression – Sum of Absolute Differences Video is a series of frames (e.g., 30 per second) Most frames similar to previous frame.
Give qualifications of instructors: DAP
1 Counter with Parallel Load Up-counter that can be loaded with external value –Designed using 2x1 mux – ld input selects incremented value or external.
1 Introduction Sequential circuit –Output depends not just on present inputs (as in combinational circuit), but on past sequence of inputs Stores bits,
Digital Design - Sequential Logic Design Chapter 3 - Sequential Logic Design.
Sequential Circuits1 DIGITAL LOGIC DESIGN by Dr. Fenghui Yao Tennessee State University Department of Computer Science Nashville, TN.
Our First Real System.
CS 151 Digital Systems Design Lecture 37 Register Transfer Level
Digital Design Copyright © 2006 Frank Vahid 1 Digital Design Chapter 5: Register-Transfer Level (RTL) Design Slides to accompany the textbook Digital Design,
Sequential Logic Design Process A sequential circuit that controls Boolean outputs and a specific time- ordered behavior is called a controller. StepDescription.
Chapter 9: Hardware Description Languages
1 Register-Transfer Level (RTL) Design Recall –Chapter 2: Combinational Logic Design First step: Capture behavior (using equation or truth table) Remaining.
Digital Design Copyright © 2006 Frank Vahid 1 a b F InputsOutput a'b'a' b Converting among Representations Can convert from any representation.
Digital Design – Optimizations and Tradeoffs
1 EECS Components and Design Techniques for Digital Systems Lec 21 – RTL Design Optimization 11/16/2004 David Culler Electrical Engineering and Computer.
Contemporary Logic Design Arithmetic Circuits © R.H. Katz Lecture #24: Arithmetic Circuits -1 Arithmetic Circuits (Part II) Randy H. Katz University of.
Basic Register Typically, we store multi-bit items
1 Introduction Chapters 2 & 3: Introduced increasingly complex digital building blocks –Gates, multiplexors, decoders, basic registers, and controllers.
CS61C L15 Synchronous Digital Systems (1) Beamer, Summer 2007 © UCB Scott Beamer, Instructor inst.eecs.berkeley.edu/~cs61c CS61C : Machine Structures Lecture.
Introduction to Registers Being just logic, ALUs require all the inputs to be present at once. They have no memory. ALU AB FS.
Controller Design Five step controller design process.
Register-Transfer Level (RTL) Design The combination of a controller and datapath is known as a processor. The most common method of designing a processor.
Chapter 11: System Design Methodology Digital System Designs and Practices Using Verilog HDL and 2008, John Wiley11-1 Ders 9: RTL Design.
Digital Design – Register-Transfer Level (RTL) Design Chapter 5 - Register-Transfer Level (RTL) Design.
Chapter 5: Register-Transfer Level (RTL) Design
Sequential Circuit  It is a type of logic circuit whose output depends not only on the present value of its input signals but on the past history of its.
Counters and Registers
Lecture 21 Overview Counters Sequential logic design.
1 COMP541 State Machines Montek Singh Feb 8, 2012.
Rabie A. Ramadan Lecture 3
Introduction to Computer Engineering ECE/CS 252, Fall 2010 Prof. Mikko Lipasti Department of Electrical and Computer Engineering University of Wisconsin.
Introduction to Sequential Logic Design Finite State-Machine Design.
Introduction to State Machine
Chapter 3 Digital Logic Structures. Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display. 3-2 Complete Example.
Chapter 3 Digital Logic Structures. 3-2 Combinational vs. Sequential Combinational Circuit always gives the same output for a given set of inputs  ex:
CS 352 : Computer Organization and Design University of Wisconsin-Eau Claire Dan Ernst Latches & Flip-Flops.
EE204 L12-Single Cycle DP PerformanceHina Anwar Khan EE204 Computer Architecture Single Cycle Data path Performance.
Instructor: Alexander Stoytchev
DLD Lecture 26 Finite State Machine Design Procedure.
Digital Design 2e Copyright © 2010 Frank Vahid 1 How To Capture Desired Behavior as FSM List states –Give meaningful names, show initial state –Optionally.
Lecture 21: Registers and Counters (1)
ECE 274 Digital Logic RTL Design using Verilog Verilog for Digital Design Ch. 5.
Chapter 3 Digital Logic Structures. Copyright © The McGraw-Hill Companies, Inc. Permission required for reproduction or display. 3-2 Transistor: Building.
Lecture 20: Sequential Logic (5)
1 COMP541 Finite State Machines - 1 Montek Singh Sep 22, 2014.
LECTURE 4 Logic Design. LOGIC DESIGN We already know that the language of the machine is binary – that is, sequences of 1’s and 0’s. But why is this?
CS151 Introduction to Digital Design Chapter 5: Sequential Circuits 5-1 : Sequential Circuit Definition 5-2: Latches 1Created by: Ms.Amany AlSaleh.
Digital Design 2e Copyright © 2010 Frank Vahid 1 RTL Design Process: Create a datapath Sub-steps –HLSM data inputs/outputs  Datapath inputs/outputs. –HLSM.
Instructor: Alexander Stoytchev CprE 281: Digital Logic.
REGISTER TRANSFER LANGUAGE (RTL) INTRODUCTION TO REGISTER Registers1.
Verilog for Digital Design Copyright © 2007 Frank Vahid and Roman Lysecky 1 Verilog for Digital Design Chapter 5: RTL Design.
Copyright © 2001 Stephen A. Edwards All rights reserved Busses  Wires sometimes used as shared communication medium  Think “party-line telephone”  Bus.
1 COMP541 Sequential Logic – 2: Finite State Machines Montek Singh Feb 29, 2016.
Digital Design - Sequential Logic Design
COMP541 Sequential Logic – 2: Finite State Machines
Instructor: Alexander Stoytchev
CSE 370 – Winter Sequential Logic - 1
Introduction to Sequential Circuits
Lecture 17 Logistics Last lecture Today HW5 due on Wednesday
Guest Lecture by David Johnston
ECE 352 Digital System Fundamentals
Lecture 17 Logistics Last lecture Today HW5 due on Wednesday
EGR 2131 Unit 12 Synchronous Sequential Circuits
ECE 352 Digital System Fundamentals
Finite State Machine Continued
Presentation transcript:

Register-Transfer Level (RTL) Design Recall Chapter 2: Combinational Logic Design First step: Capture behavior (using equation or truth table) Remaining steps: Convert to circuit Chapter 3: Sequential Logic Design First step: Capture behavior (using FSM) RTL Design (the method for creating custom processors) First step: Capture behavior (using high-level state machine, to be introduced) Capture behavior Convert to circuit

RTL Design Method

Step 1: Laser-Based Distance Measurer Object of interest D 2D = T sec * 3*108 m/sec sensor laser T (in seconds) Example of how to create a high-level state machine to describe desired processor behavior Laser-based distance measurement – pulse laser, measure time T to sense reflection Laser light travels at speed of light, 3*108 m/sec Distance is thus D = T sec * 3*108 m/sec / 2

Step 1: Laser-Based Distance Measurer T (in seconds) Laser-based distance measurer 16 from button to display S L D B to laser from sensor laser sensor Inputs/outputs B: bit input, from button to begin measurement L: bit output, activates laser S: bit input, senses laser reflection D: 16-bit output, displays computed distance

Step 1: Laser-Based Distance Measurer 16 from button to display S L D B to laser from sensor Inputs: B , S (1 bit each) Outputs: L (bit), D (16 bits) S0 ? a L = 0 (laser off) D = 0 (distance = 0) Step 1: Create high-level state machine Begin by declaring inputs and outputs Create initial state, name it S0 Initialize laser to off (L=0) Initialize displayed distance to 0 (D=0)

Step 1: Laser-Based Distance Measurer 16 from button to display S L D B to laser from sensor Inputs: B, S (1 bit each) Outputs: L (bit), D (16 bits) S1 ? B’ (button not pressed) B (button pressed) a S0 S0 L = 0 D = 0 Add another state, call S1, that waits for a button press B’ – stay in S1, keep waiting B – go to a new state S2 Q: What should S2 do? A: Turn on the laser a

Step 1: Laser-Based Distance Measurer 16 from button to display S L D B to laser from sensor Inputs: B, S (1 bit each) Outputs: L (bit), D (16 bits) B’ S3 L = 0 (laser off) S0 S1 S2 B L = 0 L = 1 a D = 0 (laser on) Add a state S2 that turns on the laser (L=1) Then turn off laser (L=0) in a state S3 Q: What do next? A: Start timer, wait to sense reflection a

Step 1: Laser-Based Distance Measurer 16 f om but t on o displ a y S L D B o laser om sensor Inputs: B, S (1 bit each) Outputs: L (bit), D (16 bits) Local Registers: Dctr (16 bits) B’ S’ (no reflection) S (reflection) ? S0 S1 S2 S3 B L = 0 Dctr = 0 (reset cycle count) L = 1 L = 0 a D = 0 Dctr = Dctr + 1 (count cycles) Stay in S3 until sense reflection (S) To measure time, count cycles for which we are in S3 To count, declare local register Dctr Increment Dctr each cycle in S3 Initialize Dctr to 0 in S1. S2 would have been O.K. too

Step 1: Laser-Based Distance Measurer 16 f om but t on o displ a y S L D B o laser om sensor Inputs: B, S (1 bit each) Outputs: L (bit), D (16 bits) Local Registers: Dctr (16 bits) B’ S’ a D = Dctr / 2 (calculate D) S4 S0 S1 S2 S3 B S L = 0 Dctr = 0 L = 1 L=0 D = 0 Dctr = Dctr + 1 Once reflection detected (S), go to new state S4 Calculate distance Assuming clock frequency is 3x108, Dctr holds number of meters, so D=Dctr/2 After S4, go back to S1 to wait for button again

Step 2: Create a Datapath Datapath must Implement data storage Implement data computations Look at high-level state machine, do three substeps (a) Make data inputs/outputs be datapath inputs/outputs (b) Instantiate declared registers into the datapath (also instantiate a register for each data output) (c) Examine every state and transition, and instantiate datapath components and connections to implement any data computations Instantiate: to introduce a new component into a design.

Step 2: Laser-Based Distance Measurer Local Registers: Dctr (16 bits) S0 S1 S2 S3 L = 0 D = 0 L = 1 L=0 Dctr = Dctr + 1 Dctr = 0 B ‘ S D = Dctr / 2 (calculate D) S4 Inputs: B, S (1 bit each) Outputs: L (bit), D (16 bits) (a) Make data inputs/outputs be datapath inputs/outputs (b) Instantiate declared registers into the datapath (also instantiate a register for each data output) (c) Examine every state and transition, and instantiate datapath components and connections to implement any data computations load Q I D r eg: 16-bit e g is t er 16 D Q D c t r : 16-bit u p - ou n er clear clear c ou n t a D a tap a th D r eg_clr c tr_clr tr_c n t eg_ld

Step 2: Laser-Based Distance Measurer Local Registers: Dctr (16 bits) S0 S1 S2 S3 L = 0 D = 0 L = 1 L=0 Dctr = Dctr + 1 Dctr = 0 B ‘ S D = Dctr / 2 (calculate D) S4 Inputs: B, S (1 bit each) Outputs: L (bit), D (16 bits) (c) (continued) Examine every state and transition, and instantiate datapath components and connections to implement any data computations 16 >>1 a D a tap a th D r eg_clr D r eg_ld D c tr_clr clear clear I D c t r : 16-bit D r eg: 16-bit D c tr_c n t c ou n t load u p - c ou n t er r e g is t er Q Q 16 D

Step 3: Connecting the Datapath to a Controller 300 M H z Clock D B L S 16 to display from button Controller to laser from sensor Datapath Dreg_clr Dreg_ld Dctr_clr Dctr_cnt Laser-based distance measurer example Easy – just connect all control signals between controller and datapath clear count load Q I Dctr: 16-bit up-counter Dreg: 16-bit register 16 D Datapath Dreg_clr Dctr_clr Dctr_cnt Dreg_ld 16 >>1

Step 4: Deriving the Controller’s FSM Inputs: B, S (1 bit each) Outputs: L (bit), D (16 bits) Local Registers: Dctr (16 bits) S0 S1 S2 S3 L = 0 D = 0 L = 1 L=0 Dctr = Dctr + 1 Dctr = 0 B’ S’ B S D = Dctr / 2 (calculate D) S4 300 M H z Clock D B L S 16 t o displ a y f r om but on C o n oller o laser om sensor tap th eg_clr eg_ld c tr_clr tr_c S0 S1 S2 S3 L = 0 L = 1 B’ S’ B S S4 FSM has same structure as high-level state machine Inputs/outputs all bits now Replace data operations by bit operations using datapath Inputs: B, S Outputs: L, Dreg_clr, Dreg_ld, Dctr_clr, Dctr_cnt a Dreg_clr = 1 Dreg_ld = 0 Dctr_clr = 0 Dctr_cnt = 0 (laser off) (clear D reg) Dreg_clr = 0 Dreg_ld = 0 Dctr_clr = 1 Dctr_cnt = 0 (clear count) Dreg_clr = 0 Dreg_ld = 0 Dctr_clr = 0 Dctr_cnt = 0 (laser on) Dreg_clr = 0 Dreg_ld = 0 Dctr_clr = 0 Dctr_cnt = 1 (laser off) (count up) Dreg_clr = 0 Dreg_ld = 1 Dctr_clr = 0 Dctr_cnt = 0 (load D reg with Dctr/2) (stop counting)

Step 4: Deriving the Controller’s FSM B’ S’ B S S4 Dreg_clr = 1 Dreg_ld = 0 Dctr_clr = 0 Dctr_cnt = 0 (laser off) (clear D reg) Dreg_clr = 0 Dctr_clr = 1 (clear count) (laser on) Dctr_cnt = 1 (count up) Dreg_ld = 1 (load D reg with Dctr/2) (stop counting) Step 4: Deriving the Controller’s FSM S0 S1 S2 S3 L = 0 L = 1 B’ S’ B S (laser on) S4 Inputs: B, S Outputs: L, Dreg_clr, Dreg_ld, Dctr_clr, Dctr_cnt Dreg_clr = 1 (laser off) (clear D reg) Dctr_clr = 1 (clear count) Dctr_cnt = 1 (count up) Dreg_ld = 1 Dctr_cnt = 0 (load D reg with Dctr/2) (stop counting) Using shorthand of outputs not assigned implicitly assigned 0 a

Step 4 clear count load Q I Dctr: 16-bit up-counter Dreg: 16-bit register 16 D Datapath Dreg_clr Dctr_clr Dctr_cnt Dreg_ld 300 MHz Clock D B L S 16 to display from button Controller to laser from sensor Datapath Dreg_clr 16 >>1 Dreg_ld Dctr_clr Dctr_cnt S0 S1 S2 S3 L = 0 L = 1 B’ S’ B S (laser on) S4 Inputs: B, S Outputs: L, Dreg_clr, Dreg_ld, Dctr_clr, Dctr_cnt Dreg_clr = 1 (laser off) (clear D reg) Dctr_clr = 1 (clear count) Dctr_cnt = 1 (count up) Dreg_ld = 1 Dctr_cnt = 0 (load D reg with Dctr/2) (stop counting) Implement FSM as state register and logic (Ch3) to complete the design

RTL Example: Video Compression – Sum of Absolute Differences Only difference: ball moving Digitized frame 1 Frame 1 1 Mbyte ( a ) Digitized frame 2 1 Mbyte Frame 2 Digitized frame 1 Frame 1 1 Mbyte ( b ) Difference of 2 from 1 0.01 Mbyte Frame 2 Just send difference a Video is a series of frames (e.g., 30 per second) Most frames similar to previous frame Compression idea: just send difference from previous frame

RTL Example: Video Compression – Sum of Absolute Differences compare Assume each pixel is represented as 1 byte (actually, a color picture might have 3 bytes per pixel, for intensity of red, green, and blue components of pixel) Frame 1 Frame 2 Need to quickly determine whether two frames are similar enough to just send difference for second frame Compare corresponding 16x16 “blocks” Treat 16x16 block as 256-byte array Compute the absolute value of the difference of each array item Sum those differences – if above a threshold, send complete frame for second frame; if below, can use difference method (using another technique, not described)

RTL Example: Video Compression – Sum of Absolute Differences SAD 256-byte array A integer 256-byte array B sad go !(i<256) Want fast sum-of-absolute-differences (SAD) component When go=1, sums the differences of element pairs in arrays A and B, outputs that sum

RTL Example: Video Compression – Sum of Absolute Differences go SAD sad Inputs: A, B (256 byte memory); go (bit) Outputs: sad (32 bits) Local registers: sum, sad_reg (32 bits); i (9 bits) !go S0 go S4 sad_ r eg = sum S0: wait for go S1: initialize sum and index S2: check if done (i>=256) S3: add difference to sum, increment index S4: done, write to output sad_reg S1 sum = 0 i = 0 a S2 i<256 (i<256)’ !(i<256) S3 sum=sum+abs(A[i]-B[i]) i=i+1

RTL Example: Video Compression – Sum of Absolute Differences Inputs: A, B (256 byte memory); go (bit) AB_addr A_data B_data Outputs: sad (32 bits) Local registers: sum, sad_reg (32 bits); i (9 bits) i_lt_256 <256 8 8 9 S0 !go i_inc i – go i_clr sum = 0 S1 8 i = 0 sum_ld (i<256)’ 32 sum abs S2 sum_clr i<256 !(i<256) 8 32 32 sum=sum+abs(A[i]-B[i]) S3 sad_reg_ld i=i+1 + sad_reg !(i<256) (i_lt_256) S4 sad_ reg=sum 32 Datapath sad Step 2: Create datapath

RTL Example: Video Compression – Sum of Absolute Differences AB_addr A_data B_data go AB_ r d i_lt_256 <256 S0 go’ 8 8 9 go i_inc – sum=0 sum_ld=1; AB_rd=1 sad_reg_ld=1 i_inc=1 i_lt_256 i_clr=1 sum_clr=1 i S1 i_clr i=0 8 S2 sum_ld ? 32 i<256 sum abs sum_clr sum=sum+abs(A[i]-B[i]) S3 8 !(i<256) 32 32 i=i+1 sad_reg_ld S4 sad_reg=sum + sad_reg a !(i<256) (i_lt_256) !(i<256) (i_lt_256) 32 Controller sad Step 3: Connect to controller Step 4: Replace high-level state machine by FSM

RTL Example: Video Compression – Sum of Absolute Differences Comparing software and custom circuit SAD Circuit: Two states (S2 & S3) for each i, 256 i’s 512 clock cycles Software: Loop (for i = 1 to 256), but for each i, must move memory to local registers, subtract, compute absolute value, add to sum, increment i – say about 6 cycles per array item  256*6 = 1536 cycles Circuit is about 3 times (300%) faster (i<256)’ S2 i<256 sum=sum+abs(A[i]-B[i]) S3 i=i+1 !(i<256) !(i<256) (i_lt_256)

Control vs. Data Dominated RTL Design Designs often categorized as control-dominated or data-dominated Control-dominated design – Controller contains most of the complexity Data-dominated design – Datapath contains most of the complexity General, descriptive terms – no hard rule that separates the two types of designs Laser-based distance measurer – control dominated SAD circuit – mix of control and data Now let’s do a data dominated design

Data Dominated RTL Design Example: FIR Filter Filter concept Suppose X is data from a temperature sensor, and particular input sequence is 180, 180, 181, 240, 180, 181 (one per clock cycle) That 240 is probably wrong! Could be electrical noise Filter should remove such noise in its output Y Simple filter: Output average of last N values Small N: less filtering Large N: more filtering, but less sharp output X Y 12 digital filter 12 clk

Data Dominated RTL Design Example: FIR Filter “Finite Impulse Response” Simply a configurable weighted sum of past input values y(t) = c0*x(t) + c1*x(t-1) + c2*x(t-2) Above known as “3 tap” Tens of taps more common Very general filter – User sets the constants (c0, c1, c2) to define specific filter RTL design Step 1: Create high-level state machine But there really is none! Data dominated indeed. Go straight to step 2 X Y 12 digital filter 12 clk y(t) = c0*x(t) + c1*x(t-1) + c2*x(t-2)

Data Dominated RTL Design Example: FIR Filter Step 2: Create datapath Begin by creating chain of xt registers to hold past values of X 12 Y clk X digital filter y(t) = c0*x(t) + c1*x(t-1) + c2*x(t-2) Suppose sequence is: 180, 181, 240 180 181 240 180 181 180 a

Data Dominated RTL Design Example: FIR Filter Step 2: Create datapath (cont.) Instantiate registers for c0, c1, c2 Instantiate multipliers to compute c*x values 12 Y clk X digital filter y(t) = c0*x(t) + c1*x(t-1) + c2*x(t-2) 3-tap FIR filter x(t) c1 c0 c2 x( t -1) x( t -2) x t0 * x t1 x t2 X a clk Y

Data Dominated RTL Design Example: FIR Filter Step 2: Create datapath (cont.) Instantiate adders 12 Y clk X digital filter y(t) = c0*x(t) + c1*x(t-1) + c2*x(t-2) 3-tap FIR filter x(t) x( t -1) x( t -2) c0 c1 c2 x t0 x t1 x t2 X clk a * * * + Y

Data Dominated RTL Design Example: FIR Filter Step 2: Create datapath (cont.) Add circuitry to allow loading of particular c register 12 Y clk X digital filter y(t) = c0*x(t) + c1*x(t-1) + c2*x(t-2) CL 3-tap FIR filter e 3 Ca1 2x4 2 Ca0 1 C x(t) x(t-1) x(t-2) c0 c1 c2 xt0 xt1 xt2 a X clk * * * + + yreg Y

Data Dominated RTL Design Example: FIR Filter y(t) = c0*x(t) + c1*x(t-1) + c2*x(t-2) Step 3 & 4: Connect to controller, Create FSM No controller needed Extreme data-dominated example (Example of an extreme control-dominated design – an FSM, with no datapath) Comparing the FIR circuit to a software implementation Circuit Assume adder has 2-gate delay, multiplier has 20-gate delay Longest past goes through one multiplier and two adders 20 + 2 + 2 = 24-gate delay 100-tap filter, following design on previous slide, would have about a 34-gate delay: 1 multiplier and 7 adders on longest path Software 100-tap filter: 100 multiplications, 100 additions. Say 2 instructions per multiplication, 2 per addition. Say 10-gate delay per instruction. (100*2 + 100*2)*10 = 4000 gate delays Circuit is more than 100 times faster (10,000% faster).