© 2002-2009 Ran Ginosar, Asynchronous Design and Synchronization. VLSI Architectures 048878, Lecture 2: Theoretical Aspects (S&F 2.5), Data Flow Structures.

Presentation transcript:

Slide 1: VLSI Architectures, Lecture 2
Theoretical Aspects (S&F 2.5)
Data Flow Structures (S&F 3)
Performance (S&F Ch. 4)
© Ran Ginosar, Asynchronous Design and Synchronization

Slide 2: Classification of Async Circuits
Self-timed (ST)
  – Requires some timing assumptions
Speed-independent (SI)
  – Zero (ideal) wire delay, arbitrary gate delay
Delay-insensitive (DI)
  – Arbitrary delays (gates and wires)
Quasi-delay-insensitive (QDI)
  – DI with the isochronic fork assumption
  – Theoretically equivalent to SI
SI and DI are mathematically provable

Slide 3: Speed Independence
A gate (Boolean function) is either:
  – Stable, or
  – Excited (its inputs have changed and its output must also change to satisfy the Boolean function)
A gate “fires” when its output changes
An excited gate eventually fires and becomes stable
SI means: the firing of one gate must never cause another excited gate to become stable without firing
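The SI condition above can be checked mechanically for a single circuit state. The following is a minimal sketch, not a full verifier: it inspects one state only rather than the whole reachable state space, and the two-gate circuit (`x = NOT a`, `y = a AND x`) and all function names are hypothetical illustrations that do not come from the slides.

```python
def excited(gates, state):
    """Return the names of gates whose output differs from their Boolean function."""
    return {
        name
        for name, (fn, inputs) in gates.items()
        if fn(*(state[i] for i in inputs)) != state[name]
    }


def fire(gates, state, name):
    """Return a new state in which gate `name` has fired (output updated)."""
    fn, inputs = gates[name]
    new_state = dict(state)
    new_state[name] = fn(*(state[i] for i in inputs))
    return new_state


def si_violations(gates, state):
    """List (fired, disabled) pairs: firing `fired` makes the excited gate
    `disabled` stable without it ever firing, which SI forbids."""
    before = excited(gates, state)
    violations = []
    for fired in sorted(before):
        after = excited(gates, fire(gates, state, fired))
        for other in sorted(before - {fired}):
            if other not in after:
                violations.append((fired, other))
    return violations


# Hypothetical circuit (assumption, not from the slides): x = NOT a, y = a AND x.
GATES = {
    "x": (lambda a: 1 - a, ["a"]),
    "y": (lambda a, x: a & x, ["a", "x"]),
}
# State just after input a rose from 0 to 1: both x and y are excited.
STATE = {"a": 1, "x": 1, "y": 0}

print(si_violations(GATES, STATE))  # [('x', 'y')]
```

In this state, firing `x` first removes the excitation of `y`, so the circuit is not speed-independent from this state: the outcome depends on which gate is faster.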

Slide 4: Data Flow Structures
Abstraction similar to sync RTL
  – Likewise, described by either schematics or HDL
Applies to all (3) handshake protocols, but we assume 4-phase
  – Alternating VALID / EMPTY tokens
Assume handshake latches and handshake-ignorant function blocks
  – Recall token flow rules

Slide 5: Abstract Pipeline
Valid (0 or 1, who cares) and Empty tokens
Transparent function blocks (don’t change token flow, only introduce some delays)
[Figure: pipeline holding E V V E E, with tokens and bubbles labeled]

Slide 6: Abstract Rings
3 stages, 1 bubble:
  – 3 steps for token round
  – 6 steps to cycle
[Figure: ring states V E V, V E E, V V E, E V E; token and bubble labeled]

Slide 7: Abstract Rings
4 stages, 2 bubbles:
  – How many steps to cycle?
An added latch did not change the function (unlike a sync pipe)
[Figure: ring states V E E V, V V E E, E V V E, V V E E]
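The ring behavior on the two Abstract Rings slides can be reproduced with a small simulation. This is a sketch under stated assumptions: each latch holds `V` or `E`, a latch is a bubble when it holds the same value as its successor, a bubble copies its predecessor when the predecessor holds a different value, and all enabled latches fire together in one step. The function names and the lock-step firing are illustrative choices, not something the slides specify.

```python
def step(ring):
    """Advance the ring by one step; every enabled latch fires at once."""
    n = len(ring)
    new = list(ring)
    for i in range(n):
        pred, cur, succ = ring[i - 1], ring[i], ring[(i + 1) % n]
        # A bubble (same value as its successor) takes a new token
        # from its predecessor when the predecessor holds a different value.
        if cur == succ and pred != cur:
            new[i] = pred
    return "".join(new)


def steps_to_cycle(ring, limit=1000):
    """Count steps until the ring returns to its initial state."""
    state = ring
    for count in range(1, limit + 1):
        nxt = step(state)
        if nxt == state:
            raise ValueError("deadlock: no latch can fire")
        state = nxt
        if state == ring:
            return count
    raise RuntimeError("no cycle found within limit")


print(steps_to_cycle("VEE"))   # 3 stages, 1 bubble
print(steps_to_cycle("VVEE"))  # 4 stages, 2 bubbles
```

Under these assumptions the 3-stage ring needs 6 steps to return to its initial state, matching the slide, and the 4-stage ring with 2 bubbles needs 4.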

Slide 8: Building Blocks
Latch
Source
Sink
Fork
Join (wait for all)
Merge (wait for one)
MUX (inputs 0, 1)
DEMUX (outputs 0, 1)
Function Block (Join; CL; Fork)
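The "wait for all" behavior of Join can be illustrated with a Muller C-element model. This is a sketch under an assumption: the slide does not say how Join is built, and the C-element is used here only because its output changes when all inputs agree and holds otherwise. The class name and interface are hypothetical.

```python
class CElement:
    """Behavioral model of a Muller C-element with any number of inputs."""

    def __init__(self, initial=0):
        self.out = initial

    def update(self, *inputs):
        """Set the output when all inputs agree; otherwise hold the old value."""
        if all(v == 1 for v in inputs):
            self.out = 1
        elif all(v == 0 for v in inputs):
            self.out = 0
        return self.out


join = CElement()
print(join.update(1, 0))  # 0: only one input has arrived, keep waiting
print(join.update(1, 1))  # 1: all inputs valid, output becomes valid
print(join.update(0, 1))  # 1: hold until every input has returned to 0
print(join.update(0, 0))  # 0: all inputs empty, output becomes empty
```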

Slides 9-21: Example
[Figure: example data-flow circuit. Slides 10 to 21 show its token states at steps t0 to t11; the states below are listed in the order extracted from the figures.]
t0:  E E E EV E E EE V
t1:  V E E EV E E EE V
t2:  V V E EV E E EV E
t3:  E V V EV E E VV E
t4:  E E V EE E E VE
t5:  E E V VE V E VE
t6:  E E E VE V V VE
t7:  E E E VV V V EE V
t8:  E E E EV E V EE
t9:  E E E EV E E EE
t10: E E E EV E E EE E
t11: E E E EV E E EE

Slide 22: Another Ring: Simple FSM
[Figure: ring around function block F; labels: Input, Output, Present State, Next State; latch states shown: E, V, E]

Slide 23: Another Ring: Iterative Computation
[Figure: ring around function block F with Input and Output; latch states shown: E, E, E]
Arbitrary piping also works:
[Figure: ring with function blocks F1, F2, F3]

Slide 24: Latches don’t foul the pipe!
Don’t try this with sync circuits!

Slide 25: IF statement
if <cond> then <true part> else <false part>
[Figure: blocks labeled COND, TRUE PART, FALSE PART, and two 0/1 select components]
Combinational logic or latches may be added

Slide 26: FOR statement
for <count> do <body>
[Figure: blocks labeled COUNT and BODY, two 0/1 select components, and latches marked E0]
Figure annotations:
  – “One handshake here”
  – “Results in COUNT handshakes here [1 × (COUNT-1) + 0]”
Warning: not all latches are shown
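The annotation [1 × (COUNT-1) + 0] describes the control token sequence that one COUNT handshake expands into. A minimal sketch follows, assuming that a control token of 1 recirculates the value for another pass and that the final 0 steers the result out; that reading, and the names `control_tokens` and `run_for`, are assumptions rather than something the slide states.

```python
def control_tokens(count):
    """Expand one COUNT value into COUNT control tokens: (COUNT-1) ones, then a zero."""
    if count < 1:
        raise ValueError("count must be at least 1")
    return [1] * (count - 1) + [0]


def run_for(count, body, value):
    """Apply `body` once per control token and report the handshake count."""
    handshakes = 0
    for ctl in control_tokens(count):
        value = body(value)
        handshakes += 1
        # Assumed meaning: ctl == 1 sends the value around the loop again,
        # ctl == 0 lets it leave the loop after this pass.
    return value, handshakes


print(control_tokens(4))                 # [1, 1, 1, 0]
print(run_for(3, lambda x: x + 2, 0))    # (6, 3)
```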

Slide 27: WHILE statement
while <cond> do <body>
[Figure: blocks labeled COND and BODY, two 0/1 select components, and latches marked E0]

Slide 28: Async GCD
input (a, b);
while a ≠ b do
    if a > b then a ← a-b;
    else b ← b-a;
output (a);

Slide 29: Async GCD
input (a, b);
while a ≠ b do
    if a > b then a ← a-b;
    else b ← b-a;
output (a);
[Figure: data-flow implementation; labels: A,B (input), GCD(A,B) (output), A>B, A-B, B-A, if, 0/1 select components, latches marked E and E0]
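The GCD pseudocode translates directly into a runnable function. The sketch below also counts loop iterations, since each iteration corresponds to one trip around the data-flow ring and the count depends on the input data. The function name, the iteration counter, and the restriction to positive integers are assumptions added for illustration.

```python
import math


def async_gcd(a, b):
    """GCD by repeated subtraction; returns (gcd, number of loop iterations)."""
    if a <= 0 or b <= 0:
        raise ValueError("a and b must be positive integers")
    iterations = 0
    while a != b:
        if a > b:
            a = a - b
        else:
            b = b - a
        iterations += 1
    return a, iterations


print(async_gcd(12, 18))  # (6, 2)
print(async_gcd(100, 1))  # (1, 99)
assert async_gcd(12, 18)[0] == math.gcd(12, 18)
```

The two calls finish after 2 and 99 iterations respectively, which shows how strongly the latency of this circuit depends on its operands.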

Slide 30: Performance
Sync performance analysis is simple:
  – Check all register-to-register paths
  – Static Timing Analysis
  – Dynamic simulations only check correctness, not performance
Async performance analysis is COMPLEX:
  – Many cycles
  – Data-dependent delays
  – Dependency on environment and initialization
  – Not guaranteed to have a solution
We will only consider simple examples…
  – Qualitative, then quantitative

Slide 31: FIFO Performance: 2N on 2N

Slide 32: FIFO Performance
2N=6 tokens (N Valid, N Empty)
2N=6 latches
2N=6 steps to move all tokens one step to the right
But is it the best we can do with 2N latches? Let’s try a fast sink.

Slide 33: FIFO Performance: Fast sink (N on 2N)
[Figure: successive FIFO states with numbered Valid tokens and Empty tokens spreading out toward the fast sink]

Slide 34: FIFO Performance
Fast sink:
  – Tokens spread out
  – Bubble every other stage
  – Only N tokens in 2N stages
  – One step to move every token to the right
Let’s try to add stages (same # of tokens)

Slide 35: FIFO Performance: 2N on 3N
3N=9 stages
2N tokens (N Valid, N Empty) + N bubbles
Only 2 steps to move every token one stage to the right
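The step counts quoted on the FIFO Performance slides can be reproduced with a simplified occupancy model. This is a sketch under several assumptions: each stage holds either a token (`T`) or a bubble (`B`), a token advances when the next stage was a bubble at the start of the step, the FIFO is closed into a ring so the token count stays fixed, and the full 2N-on-2N case is represented by adding the single bubble that the sink creates. The function name and the pattern strings are hypothetical, with N = 3.

```python
def steps_until_all_moved(pattern, limit=1000):
    """Count steps until every token has advanced at least one stage."""
    n = len(pattern)
    # Give each token a unique id (its starting index); bubbles are None.
    cells = [i if c == "T" else None for i, c in enumerate(pattern)]
    tokens = {c for c in cells if c is not None}
    if not tokens:
        raise ValueError("pattern contains no tokens")
    moved = set()
    for step in range(1, limit + 1):
        nxt = list(cells)
        progressed = False
        for i in range(n):
            j = (i + 1) % n
            # Decisions use the state at the start of the step.
            if cells[i] is not None and cells[j] is None:
                nxt[j] = cells[i]
                nxt[i] = None
                moved.add(cells[i])
                progressed = True
        if not progressed:
            raise ValueError("deadlock: no bubble available")
        cells = nxt
        if moved == tokens:
            return step
    raise RuntimeError("not all tokens moved within limit")


print(steps_until_all_moved("TTTTTTB"))    # 6 tokens, one bubble from the sink
print(steps_until_all_moved("TBTBTB"))     # N tokens on 2N stages
print(steps_until_all_moved("TTBTTBTTB"))  # 2N tokens on 3N stages
```

Under these assumptions the three cases take 6, 1, and 2 steps, in line with the figures given on the slides.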

Slide 36: Shift Register + Parallel Load
CTL=0 token:
  – Parallel load
  – Old values to sink latches
  – Valid din[0] to output
CTL=1 token:
  – Shift right
  – Valid token output
CTL=Empty token:
  – Shift right
  – Empty token output
Two performance issues:
  – Too few bubbles
  – High fanout on CTL
[Figure: shift register stages (latch pairs E, d) with inputs din[0] to din[3], output do, and control ctl; annotations: “Time”, “Large C-element for ACK”]
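The three CTL cases can be summarized in a value-level model. This is only a sketch: it collapses each stage's valid/empty token pair into a single stored value, so a CTL=Empty token simply emits an Empty token, and it fills vacated stages with Empty. Both choices, together with the class and method names, are assumptions and not part of the slides.

```python
EMPTY = "E"


class ShiftRegister:
    """Value-level model of the shift register with parallel load."""

    def __init__(self, stages=3):
        self.regs = [EMPTY] * stages

    def step(self, ctl, din=None):
        """Consume one CTL token and return the token that appears on `do`."""
        if ctl == EMPTY:
            # Empty control token: an Empty token appears at the output.
            return EMPTY
        if ctl == 0:
            # Parallel load: din[0] goes straight to the output,
            # din[1:] replace the old register contents.
            if din is None or len(din) != len(self.regs) + 1:
                raise ValueError("din must have one value per stage plus one")
            self.regs = list(din[1:])
            return din[0]
        if ctl == 1:
            # Shift right: the stage nearest the output is emitted.
            out = self.regs[0]
            self.regs = self.regs[1:] + [EMPTY]  # assumed fill value
            return out
        raise ValueError("ctl must be 0, 1, or EMPTY")


sr = ShiftRegister()
stream = [sr.step(0, ["d0", "d1", "d2", "d3"])]
for _ in range(3):
    stream.append(sr.step(EMPTY))
    stream.append(sr.step(1))
print(stream)  # ['d0', 'E', 'd1', 'E', 'd2', 'E', 'd3']
```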

Slide 37: Shift Register + Parallel Load
Buffers added in the CTL path
  – Solves both issues together
[Figure: the shift register before and after adding buffer latches (E, E, E) in the ctl path]

Slide 38: Parallel Load (CTL=0)
Enabled, move not shown
[Figure: register snapshots as the CTL=0 token advances along the ctl path (0 E E, then 0 0 E, …) and d1, d2, d3 are loaded into the stages]

Slide 39: din, CTL Empty (slow consumer)
[Figure: register snapshots while din and CTL return to Empty; output stream: E, do]

Slide 40: Slow Shift (CTL=1; CTL=E)
[Figure: register snapshots; the output stream grows from d1, E, do to E, d1, E, do]

Slide 41: Fast Shift (CTL=1;E;1;E…)
[Figure: register snapshots; the output stream grows from d1, E, do to E, d2, E, d1, E, do]

Slide 42: Shift Register Behavior
Dynamic, depends on the relative timing of the consumer and the shifter
Every stage has 2 tokens
Slow consumer: CTL has bubbles
Fast consumer: CTL has tokens
Nothing (V, E) moves without a CTL token