© 2003-2009 Ran Ginosar. 048878 VLSI Architectures, Lecture 3 (S&F Ch. 5): Handshake Ckt Implementations.

Presentation transcript:

© Ran Ginosar Lecture 3: Handshake Ckt Implementations 1 VLSI Architectures Lecture 3 S&F Ch. 5: Handshake Ckt Implementations

© Ran Ginosar Lecture 3: Handshake Ckt Implementations 2 Implementations
– We only consider simple circuits
– More aggressive circuits will come later
– First, reminder on latches

© Ran Ginosar Lecture 3: Handshake Ckt Implementations 3 4- & 2-phase bundled data latches

© Ran Ginosar Lecture 3: Handshake Ckt Implementations 4 4-phase dual rail – many bits

© Ran Ginosar Lecture 3: Handshake Ckt Implementations 5 4-phase Fork, Join

© Ran Ginosar Lecture 3: Handshake Ckt Implementations 6 4-phase Bundled-data Mux

© Ran Ginosar Lecture 3: Handshake Ckt Implementations 7 4-phase Bundled-data Demux

© Ran Ginosar Lecture 3: Handshake Ckt Implementations 8 4-phase Merge

© Ran Ginosar Lecture 3: Handshake Ckt Implementations 9 4-phase Merge
– Mutually exclusive inputs. Guaranteed elsewhere! (more later..)
– Assume X active…
– …C-element sees input glitch
– Relative Timing: x-req↓ < z-ack↓ ⇒ simplify CEL
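A minimal behavioural sketch of the merge may help here. It assumes the common structure in which the output request is the OR of the two input requests and each input acknowledge comes from a C-element fed by that input's request and z-ack; the slide's figure is not in the transcript, so this structure and the names `CElement` and `merge_step` are assumptions for illustration only.

```python
class CElement:
    """Muller C-element: output follows the inputs when they agree, else holds."""

    def __init__(self):
        self.out = 0

    def update(self, a, b):
        if a == b:
            self.out = a
        return self.out


def merge_step(x_req, y_req, z_ack, cx, cy):
    """Evaluate the assumed merge once and return (z_req, x_ack, y_ack)."""
    z_req = x_req | y_req            # inputs are mutually exclusive, so OR suffices
    x_ack = cx.update(x_req, z_ack)  # ack goes back only to the requesting input
    y_ack = cy.update(y_req, z_ack)
    return z_req, x_ack, y_ack


cx, cy = CElement(), CElement()
trace = []
# One 4-phase handshake on x while y stays idle: (x_req, y_req, z_ack)
for x_req, y_req, z_ack in [(1, 0, 0), (1, 0, 1), (0, 0, 1), (0, 0, 0)]:
    trace.append(merge_step(x_req, y_req, z_ack, cx, cy))

for z_req, x_ack, y_ack in trace:
    print(z_req, x_ack, y_ack)
```

In this run the idle side's C-element sees z-ack rise and fall on one input (the "input glitch") while y_ack stays low throughout.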

© Ran Ginosar Lecture 3: Handshake Ckt Implementations 10 Asymmetric C Element
– Useful when we know the relative timing: b↓ < a↓ ⇒ only a↓ needed to pull up
– Only one pMOS - faster
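The effect of the timing assumption can be checked with a small behavioural comparison. This is a sketch only: it assumes the relative-timing condition is that b falls before a falls, and it models the asymmetric element at the logic level (output rises when a = b = 1, falls as soon as a = 0) rather than at transistor level. The function names are hypothetical.

```python
def c_element(prev, a, b):
    """Symmetric C-element: rise when a=b=1, fall when a=b=0, otherwise hold."""
    if a == 1 and b == 1:
        return 1
    if a == 0 and b == 0:
        return 0
    return prev


def asym_c_element(prev, a, b):
    """Assumed asymmetric variant: rise needs a=b=1, fall needs only a=0."""
    if a == 0:
        return 0
    if b == 1:
        return 1
    return prev


def run(element, inputs):
    """Apply a sequence of (a, b) input pairs, starting from output 0."""
    out, outs = 0, []
    for a, b in inputs:
        out = element(out, a, b)
        outs.append(out)
    return outs


# Respects the assumed ordering: b always falls before a falls.
ordered = [(0, 1), (1, 1), (1, 0), (0, 0), (1, 0), (0, 0)]
# Violates it: a falls while b is still high.
violating = [(0, 1), (1, 1), (0, 1)]

print(run(c_element, ordered) == run(asym_c_element, ordered))
print(run(c_element, violating), run(asym_c_element, violating))
```

When the ordering holds, the two elements are indistinguishable; when it is violated, the asymmetric element drops its output early, which is why the simplification is only safe under the relative-timing assumption.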

© Ran Ginosar Lecture 3: Handshake Ckt Implementations 11 2-phase Merge Try it at home… This is not an assignment!

© Ran Ginosar Lecture 3: Handshake Ckt Implementations 12 Mutual Exclusion: MUTEX

© Ran Ginosar Lecture 3: Handshake Ckt Implementations 13 Standard Gate MUTEXs
– Not fully guaranteed that outputs are mutually exclusive (M/E), but highly probable!
– Very low threshold

© Ran Ginosar Lecture 3: Handshake Ckt Implementations 14 Arbiter

© Ran Ginosar Lecture 3: Handshake Ckt Implementations 15 Arbitrating Merge

© Ran Ginosar Lecture 3: Handshake Ckt Implementations 16 Function Blocks
We said “transparent” but…
– Need a matched delay for bundled-data
– Need to generate completion for dual-rail
– Need to join inputs, fork outputs:

© Ran Ginosar Lecture 3: Handshake Ckt Implementations 17 Transparency Revisited Function blocks must not affect how the latches “shake hands” (except for timing)

© Ran Ginosar Lecture 3: Handshake Ckt Implementations 18 Indication Revisited
FB(req_out↑) means
– FB(req_in↑)
– Computation finished, data out ready
Simple “strong indication” for bundled data:
1: ALL DATA_IN VALID
2: REQ_IN↑
3: COMPUTE
4: ALL DATA_OUT VALID
5: REQ_OUT↑
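The five-step sequence can be sketched as a simple event-time model. This is an illustrative assumption, not the lecture's circuit: times are in arbitrary integer units, the function block is reduced to a worst-case logic delay, and the request path is reduced to a single matched delay. All names and parameters are hypothetical.

```python
def strong_indication_bundled(data_in_times, req_in_margin, logic_delay, matched_delay):
    """Return (data_out_valid, req_out) event times for one function block."""
    all_in_valid = max(data_in_times)            # 1: all DATA_IN valid
    req_in = all_in_valid + req_in_margin        # 2: REQ_IN rises after the data
    data_out_valid = all_in_valid + logic_delay  # 3-4: worst-case compute time
    req_out = req_in + matched_delay             # 5: REQ_OUT via the matched delay
    return data_out_valid, req_out


def bundling_ok(data_out_valid, req_out):
    """REQ_OUT must not rise before the output data is valid."""
    return req_out >= data_out_valid


good = strong_indication_bundled([0, 150, 70], 20, 300, 300)
bad = strong_indication_bundled([0, 150, 70], 20, 300, 200)
print(good, bundling_ok(*good))
print(bad, bundling_ok(*bad))
```

The second call shows what goes wrong when the matched delay is shorter than the logic it is meant to cover: the request overtakes the data.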

© Ran Ginosar Lecture 3: Handshake Ckt Implementations 19 Strong vs. Weak Indication
Strong Indication: All inputs must arrive before any output is allowed (“indicated”).
– Even if some outputs are ready earlier, there is no REQ_OUT, so they cannot be used.
– Implies worst-case latency
Weak Indication: Some outputs are allowed even before all inputs have arrived
– Only makes sense in dual-rail:

© Ran Ginosar Lecture 3: Handshake Ckt Implementations 20 Weak Indication
– No REQ on dual-rail: each bit is “self-indicating”
– May lead to faster circuits
Example chain of events:
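To make "self-indicating" concrete, here is a small sketch of dual-rail signalling. It assumes the usual 4-phase dual-rail convention in which each bit is a pair (t, f), with (0, 0) meaning empty, (1, 0) meaning logic 1 and (0, 1) meaning logic 0; the helper names are hypothetical and the bit arrival order is arbitrary.

```python
EMPTY = (0, 0)


def encode(bit):
    """Dual-rail (t, f) codeword for a Boolean value (assumed convention)."""
    return (1, 0) if bit else (0, 1)


def is_valid(wire):
    """A bit indicates its own validity: exactly one rail is high."""
    return wire in ((1, 0), (0, 1))


def completion(word):
    """Classify a multi-bit dual-rail word as 'valid', 'empty' or 'partial'."""
    if all(is_valid(w) for w in word):
        return "valid"
    if all(w == EMPTY for w in word):
        return "empty"
    return "partial"


word = [EMPTY, EMPTY, EMPTY]
states = [completion(word)]
for i, bit in enumerate([1, 0, 1]):  # bits arrive one at a time
    word[i] = encode(bit)
    states.append(completion(word))

print(states)
```

No separate request wire is needed: a consumer can use each bit as soon as that bit is valid, and the whole word is complete only when every bit is.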

© Ran Ginosar Lecture 3: Handshake Ckt Implementations 21 Composition of FBs
Legal composition:
– All inputs and outputs are connected
– No cycles
Legal composition of weakly indicating FBs is weakly indicating
Legal composition of strongly indicating FBs is strongly indicating

© Ran Ginosar Lecture 3: Handshake Ckt Implementations 22 Example: Ripple-carry

© Ran Ginosar Lecture 3: Handshake Ckt Implementations 23 Example: Ripple-carry
Full adder (a,b,c) = (s,d)
– s = a ⊕ b ⊕ c
– d = ab + ac + bc
Shortcuts for look-ahead (prop, gen, kill):
– p = a ⊕ b, s = p ⊕ c
– g = ab, d = g + pc, OR d' = k + pc'
– k = a' b'
Sometimes d can be made valid without waiting for c
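The propagate/generate/kill identities can be checked exhaustively, and the early carry can be illustrated with a dual-rail carry-in. This is a sketch under assumptions: the carry is modelled as a (t, f) pair with (0, 0) meaning "not yet arrived", and the function names are hypothetical.

```python
from itertools import product


def full_adder(a, b, c):
    """Reference full adder: sum s and carry-out d."""
    s = a ^ b ^ c
    d = (a & b) | (a & c) | (b & c)
    return s, d


def pgk(a, b):
    """Propagate, generate and kill; each depends only on a and b."""
    return a ^ b, a & b, (1 - a) & (1 - b)


# Check the shortcut equations against the reference on all 8 input combinations.
for a, b, c in product((0, 1), repeat=3):
    s, d = full_adder(a, b, c)
    p, g, k = pgk(a, b)
    assert s == p ^ c
    assert d == g | (p & c)
    assert 1 - d == k | (p & (1 - c))


def carry_dual_rail(a, b, c_rail):
    """Dual-rail carry-out (t, f) from a dual-rail carry-in c_rail = (t, f)."""
    p, g, k = pgk(a, b)
    ct, cf = c_rail
    return g | (p & ct), k | (p & cf)


print(carry_dual_rail(1, 1, (0, 0)))  # generate: carry-out valid before carry-in
print(carry_dual_rail(0, 0, (0, 0)))  # kill: carry-out valid before carry-in
print(carry_dual_rail(1, 0, (0, 0)))  # propagate: must wait for carry-in
```

Only the propagate case has to wait for c; the sum output always needs c, so the block as a whole still cannot have all outputs valid before all inputs have arrived.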

© Ran Ginosar Lecture 3: Handshake Ckt Implementations 24 Speculative / Strong Ripple Carry
– 16 bit ripple-carry adder, bundled-data
– Longest carry is 16 stages
– But if p8 = 0 then longest carry is 8 stages
– And if p12 = p8 = p4 = 0, then longest carry is 4 stages
If willing to trade area and power for speed:
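The bounds above can be sanity-checked with a short sketch. It rests on two assumptions that are not stated on the slide: bits are numbered 1 to 16, and the longest run of consecutive propagate bits is used as a proxy for the number of stages a carry can ripple through. Function names are hypothetical.

```python
def propagate_bits(a, b, width=16):
    """Per-bit propagate p = a XOR b; list index 0 holds bit 1."""
    return [((a ^ b) >> i) & 1 for i in range(width)]


def longest_propagate_run(p_bits):
    """Length of the longest run of consecutive 1s in p_bits."""
    best = run = 0
    for p in p_bits:
        run = run + 1 if p else 0
        best = max(best, run)
    return best


def speculative_bound(p_bits):
    """Pick a carry-length bound (4, 8 or 16) from p4, p8 and p12."""
    p4, p8, p12 = p_bits[3], p_bits[7], p_bits[11]
    if p4 == 0 and p8 == 0 and p12 == 0:
        return 4
    if p8 == 0:
        return 8
    return 16


p = propagate_bits(0x0F0F, 0x0000)
print(longest_propagate_run(p), speculative_bound(p))
```

In a bundled-data adder, a bound like this could select between a short and a long matched delay, which is where the extra area and power would go.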

© Ran Ginosar Lecture 3: Handshake Ckt Implementations 25 Speculative / Strong Ripple Carry

© Ran Ginosar Lecture 3: Handshake Ckt Implementations 26 ST-CL
Based on David, Ginosar, Yoeli, “An Efficient Implementation of Boolean Functions as Self-Timed Circuits,” IEEE Trans. Computers, Jan. 1992

© Ran Ginosar Lecture 3: Handshake Ckt Implementations 27 Dual-Rail DIMS PLA Notation

© Ran Ginosar Lecture 3: Handshake Ckt Implementations 28 Dual-Rail DIMS Adders Still slow: LF(V) = LF(E)

© Ran Ginosar Lecture 3: Handshake Ckt Implementations 29 Transistor Level DIMS
– Too many P transistors - slow
– Some N paths can be shared:

© Ran Ginosar Lecture 3: Handshake Ckt Implementations 30 Hybrid Adder
– Dual-rail carry (for flexible latency)
– Bundled-data data inputs and sum output (for lower area and power)
– Data-dependent data-forward (V) latency
– Constant empty-forward (E) latency

© Ran Ginosar Lecture 3: Handshake Ckt Implementations 31 Hybrid Adder Dual-rail Bundled-data

© Ran Ginosar Lecture 3: Handshake Ckt Implementations 32 Domino Logic Dual Rail Req Out: Either by (flexible) Completion Detection or by matched (worst case) delay

© Ran Ginosar Lecture 3: Handshake Ckt Implementations 33 Hybrid Adder: Sum Ckt

© Ran Ginosar Lecture 3: Handshake Ckt Implementations 34 Hybrid Adder: Two Carry Ckts
[Figure: two carry circuits; labels: Weak Indication, Strong Indication, KILL, GEN]

© Ran Ginosar Lecture 3: Handshake Ckt Implementations 35 Hybrid Adder: Two Carry Ckts
[Figure: carry chain; block labels: WEAK CARRY, STRONG CARRY, STRONG CARRY, WEAK CARRY, STRONG CARRY, STRONG CARRY, …, CD]
Slightly faster…