A Quasi-Delay-Insensitive Method to Overcome Transistor Variation

A Quasi-Delay-Insensitive Method to Overcome Transistor Variation Charlie Brej APT Group University of Manchester 22/06/2019 VLSI 2005

Overview: Synchronous Problems; Asynchronous Logic (Why? How?); Asynchronous Benefits (Delay Insensitivity, Early Output)

Problems: Communication. Communication horizon: “For a 60 nanometer process a signal can reach only 5% of the die’s length in a clock cycle” [D. Matzke, 1997]. Clock distributed using wave pipelining.

Can’t keep ramping up the clock. Intel pulls the plug on 4GHz Pentium 4. AMD and Intel using PR-based model numbers. New ranges run at a much slower clock rate. Higher concentration on parallel execution: Hyper-threading, multiple cores.

Problems: Performance. Cycle time components: Real Computation; Unbalanced Stages; Clock overheads; Clock Skew/Jitter; Transistor Variability; Timing Assumption overheads; Signal Integrity; Worst – Average case performance.

Clock! What is it good for? No arguing with the clock. 9am - 5pm. No excuses!

Bundled-Data: Request + Delay, Acknowledge. When you finish, do the next task. Flexitime.

Remove the Clock. Cycle time components: Real Computation; Unbalanced Stages; Clock overheads; Clock Skew/Jitter; Transistor Variability; Timing Assumption overheads; Signal Integrity; Worst – Average case performance.

How do you know when you are finished? Synchronous: estimate; global timing reference. Asynchronous (bundled-data): local delay elements. Asynchronous (delay-insensitive): when the data arrives; intrinsic.

Becoming Delay Insensitive. Dual-Rail: two wires; 00 – NULL, 01 – Zero, 10 – One (11 – Not used). Four-phase handshake: return to zero. Signals: R0, R1, Ack.
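
The dual-rail code and four-phase (return-to-zero) handshake on this slide can be sketched in a few lines of Python. This is only an illustration, not the circuit from the talk: it assumes the two-bit codes are written in the order R1 R0, and the function names (`encode`, `decode`, `word_complete`, `four_phase`) are made up for this example.

```python
from typing import List, Optional, Tuple

Rails = Tuple[int, int]  # (r1, r0); the wire order is an assumption

NULL: Rails = (0, 0)  # spacer: no data present


def encode(bit: int) -> Rails:
    """Map a Boolean value onto the two rails (01 = zero, 10 = one)."""
    if bit not in (0, 1):
        raise ValueError("bit must be 0 or 1")
    return (1, 0) if bit else (0, 1)


def decode(rails: Rails) -> Optional[int]:
    """Return 0 or 1 for valid data, None for NULL; reject the unused 11 code."""
    if rails == NULL:
        return None
    if rails == (0, 1):
        return 0
    if rails == (1, 0):
        return 1
    raise ValueError("11 is not a legal dual-rail code")


def word_complete(word: List[Rails]) -> bool:
    """Completion detection: the word is valid once no bit is still NULL."""
    return all(decode(r) is not None for r in word)


def four_phase(bit: int) -> List[Tuple[int, int, int]]:
    """List the (r1, r0, ack) states of one return-to-zero transfer."""
    r1, r0 = encode(bit)
    return [
        (0, 0, 0),    # idle: NULL on the rails, Ack low
        (r1, r0, 0),  # sender raises exactly one rail
        (r1, r0, 1),  # receiver sees valid data and raises Ack
        (0, 0, 1),    # sender returns the rails to NULL
        (0, 0, 0),    # receiver lowers Ack; ready for the next value
    ]
```

Because validity is carried by the data wires themselves, the receiver needs no clock or matched delay to know when a value has arrived; `word_complete` plays that role for a multi-bit word.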

Delay Insensitivity. No assumptions on speed of wires or gates. Environmental effects: heat, voltage supply. Manufacturing defects. Thin Film Transistor. Next generation process sizes.

Early Output Logic. Dual-Rail interfaces. Output generated as early as possible. Two early output cases: if either input is ‘0’ then the output is ‘0’.
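
As a rough illustration of the early-output rule for an AND gate, the sketch below models each dual-rail input as `0`, `1`, or `None` (NULL, not yet arrived). The names `early_and` and `strict_and` are hypothetical, and `strict_and` is included only as a contrast: a gate that waits for every input before producing anything.

```python
from typing import Optional

Value = Optional[int]  # 0, 1, or None when the input is still NULL


def early_and(a: Value, b: Value) -> Value:
    """AND that answers as soon as the result is determined."""
    if a == 0 or b == 0:
        return 0          # early cases: one '0' input decides the output
    if a == 1 and b == 1:
        return 1          # a '1' result needs both inputs
    return None           # not enough information yet


def strict_and(a: Value, b: Value) -> Value:
    """AND that waits for both inputs before producing any output."""
    if a is None or b is None:
        return None
    return a & b
```

With `a = 0` and `b` still missing, `early_and` already returns `0` while `strict_and` is still waiting; those are the two early cases (a ‘0’ on either input).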

Bit level pipelining. Forward completed parts of the result. Pace work. Don’t stall parts unless you have to.
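
One way to picture "forward completed parts of the result" is a ripple-carry adder, which is an example chosen here and not taken from the slides. The sketch assumes unit gate delays, all operand bits arriving at time 0, and a carry-out that is known early whenever the two operand bits are equal (both 0 kills the carry, both 1 generates it). The function name `sum_ready_times` is hypothetical.

```python
from typing import List


def sum_ready_times(a: int, b: int, width: int) -> List[int]:
    """Return, LSB first, the time at which each sum bit becomes valid."""
    times = []
    carry_in_ready = 0  # the carry into bit 0 is a constant, known at time 0
    for i in range(width):
        a_i = (a >> i) & 1
        b_i = (b >> i) & 1
        # The sum bit needs its carry-in, then one gate delay.
        times.append(carry_in_ready + 1)
        if a_i == b_i:
            # Early output: the carry-out does not depend on the carry-in.
            carry_in_ready = 1
        else:
            # Propagate case: the carry-out must wait for the carry-in.
            carry_in_ready = carry_in_ready + 1
    return times


print(sum_ready_times(0b1010, 0b1010, 4))  # [1, 2, 2, 2]
print(sum_ready_times(0b1111, 0b0000, 4))  # [1, 2, 3, 4]
```

Under these assumptions the low-order bits are usually valid well before the full word, so a following stage that consumes bits individually can start on them without waiting for the slowest carry chain.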

Early Output cases

Paper contribution. Still generates results with missing inputs. Isolates late inputs. Allows next data phase.

Remove Unnecessary Computation. Cycle time components: Real Computation; Unbalanced Stages; Clock overheads; Clock Skew/Jitter; Transistor Variability; Timing Assumption overheads; Signal Integrity; Worst – Average case performance; Unnecessary Computation/Delays.

Summary: Asynchronous, Delay Insensitive. Average case performance. Safe: no timing assumptions. Remove unnecessary computation.