1 Seminar on High-Speed Asynchronous Pipelines Montek Singh Thursdays 10-11, SN325.

Presentation transcript:

1 Seminar on High-Speed Asynchronous Pipelines Montek Singh Thursdays 10-11, SN325

2 Lecture 1: Introduction
• What is asynchronous design? Why do we want to study it?
• What is pipelining? How can it be used to design really fast hardware?

3 Introduction: Clocked Digital Design
Most current digital systems are synchronous:
• Clock: a global signal that paces operation of all components
Benefit of clocking: enables discrete-time representation
• all components operate exactly once per clock tick
• component outputs need to be ready by next clock tick
  - allows “glitchy” or incorrect outputs between clock ticks

4 Microelectronics Trends
Current and Future Trends: Significant Challenges
• Large-Scale “Systems-on-a-Chip” (SoC)
  - 100 Million ~ 1 Billion transistors/chip
• Very High Speeds
  - multiple GigaHertz clock rates
• Explosive Growth in Consumer Electronics
  - demand for ever-increasing functionality …
  - … with very low power consumption (limited battery life)
• Higher Portability/Modularity/Reusability
  - “plug ’n play” components, robust interfaces

5 Challenges to Clocked Design
Breakdown of Single-Clock Paradigm:
• Chip will be partitioned into multiple timing domains
Increasing Difficulties with Clocked Design:
• Clock distribution: will require significant designer effort
• Performance bottleneck: a single slow component
• Clock burns large fraction of chip power
• Fixed clock rate: poor match for
  - designing reusable components
  - interfacing with mixed-timing environments

6 What is Asynchronous Design?
• Digital design with no centralized clock
• Synchronization using local “handshaking”
[Figure: Synchronous System (Centralized Control), driven by a clock, versus Asynchronous System (Distributed Control), connected by handshaking interfaces]
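The local handshaking named on this slide can be illustrated with a small simulation. This is a minimal sketch, assuming a four-phase (return-to-zero) request/acknowledge protocol; the slides do not say which protocol is meant, and the function name and wire names `req` and `ack` are illustrative.

```python
def four_phase_transfer(items):
    """Simulate a sender passing items to a receiver over req/ack wires.

    Assumption: a four-phase (return-to-zero) protocol. The slides only
    say "local handshaking" and do not name a specific protocol.
    """
    req, ack = 0, 0
    received = []
    trace = []                      # (req, ack) after every wire change
    for item in items:
        data = item                 # sender places data on the bus
        req = 1                     # 1. sender raises request: data is valid
        trace.append((req, ack))
        received.append(data)       # receiver latches the data ...
        ack = 1                     # 2. ... and raises acknowledge
        trace.append((req, ack))
        req = 0                     # 3. sender withdraws the request
        trace.append((req, ack))
        ack = 0                     # 4. receiver withdraws the acknowledge
        trace.append((req, ack))
    return received, trace


received, trace = four_phase_transfer(["a", "b"])
print(received)    # ['a', 'b']
print(trace[:4])   # [(1, 0), (1, 1), (0, 1), (0, 0)]
```

No global clock appears anywhere in the exchange: each wire change is triggered only by the previous change on the other wire.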

7 Why Asynchronous Design?
• Higher Performance
  - May obtain “average-case” operation (not “worst-case”)
  - Avoids overheads of multi-GHz clock distribution
• Lower Power
  - No clock power expended
  - Inactive components consume negligible power
• Better Electromagnetic Compatibility
  - Smooth radiation spectra: no clock spikes
  - Much less interference with sensitive receivers [e.g., Philips pagers]
• Greater Flexibility/Modularity
  - Naturally adapt to varied environments
  - Supports reusable components
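The “average-case” versus “worst-case” point can be made concrete with a little arithmetic. The sketch below is illustrative only: the per-operation delays are made-up numbers, and handshake overhead is ignored, which flatters the asynchronous figure.

```python
def clocked_time(delays_ns):
    # A clocked design must set its period to the slowest operation,
    # so every operation costs the worst-case delay.
    return max(delays_ns) * len(delays_ns)


def async_time(delays_ns):
    # An asynchronous design can finish each operation as soon as it is
    # done, so the total is the sum of the actual delays.
    # Assumption: handshake overhead is ignored.
    return sum(delays_ns)


delays_ns = [2, 2, 3, 2, 8, 2]          # hypothetical data-dependent delays
print(clocked_time(delays_ns))           # 48
print(async_time(delays_ns))             # 19
```

The gap grows with the spread between typical and worst-case delay; when all operations take the same time, the two totals coincide.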

8 Challenges of Asynchronous Design
• Hazards: potential “glitches” on wires
  - no problem for clocked systems
  - communication must be hazard-free!
  - special design challenge = “hazard-free synthesis”
• Testability Issues:
  - absence of clock means no “single-stepping”
• Lack of Commercial CAD Tools:
  - chicken-and-egg problem
[Figure: clean signals versus hazardous signals, relative to the clock tick]
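A glitch of the kind this slide calls a hazard can be reproduced with a toy gate-level simulation. This is a minimal sketch, assuming a unit delay per gate and a two-input multiplexer, out = (a AND s) OR (b AND NOT s); the circuit and the delay model are illustrative choices, not taken from the slides.

```python
def simulate_mux(s_values, a=1, b=1):
    """Unit-delay simulation of out = (a AND s) OR (b AND NOT s).

    s_values[k] is the select input at time step k. Every gate output at
    step k is computed from the signal values at step k-1.
    """
    s = s_values[0]
    # Start from the settled state for the first input value.
    ns, t1, t2 = 1 - s, a & s, b & (1 - s)
    out = t1 | t2
    trace = [out]
    for s_next in s_values[1:]:
        # The right-hand side uses only values from the previous step.
        ns, t1, t2, out = 1 - s, a & s, b & ns, t1 | t2
        s = s_next
        trace.append(out)
    return trace


# With a = b = 1 the settled output is 1 both before and after s falls,
# yet the output briefly drops to 0 while the inverter catches up.
print(simulate_mux([1, 0, 0, 0, 0, 0]))   # [1, 1, 1, 0, 1, 1]
```

A clocked system that samples the output only after it has settled never sees the dip, whereas an asynchronous receiver watching the wire for transitions could mistake it for an event.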

9 Asynchronous Design: Past & Present
Async Design: In existence for 50 years, but …
… many recent technical advances:
• Hazard-Free Circuit Design:
  - several practical techniques for controllers [Stanford/Columbia]
• Design for Testability:
  - several test solutions, e.g. Philips Research
• Maturing Computer-Aided-Design (“CAD”) Tools:
  - software tools for automated design [Philips, Columbia, Manchester]
• Successful Fabricated Chips:
  - embedded processors, high-speed pipelines, consumer electronics…

10 Recent Commercial Interest
Several commercial asynchronous chips:
• Philips: asynchronous 80c51 microcontrollers
  - used in commercial pagers [1998] and cell phones [2000]
• Univ. of Manchester: async ARM processor [2000]
• Motorola: async divider in PowerPC chip [2000]
• HAL: async floating-point divider
  - in HAL-I and II processors [early 1990’s]
Recent experimental chips:
• IBM, Sun and Intel:
  - fast pipelines, arbiters, instruction-length decoder…
• IBM/Columbia Univ.: asynchronous digital FIR filter
Several recent startups:
• Theseus Logic, ADD, AmuCo…

11 Seminar Focus
Overall Goal: Asynchronous Design for Very High-Speed Systems
Focus: High-Throughput Pipelines
Motivation: Pipelining is at the heart of nearly all high-performance digital systems
Additional Benefits:
• Low power
• Interfacing with mixed systems
• Modular and scalable design

12 Background: Pipelining
What is Pipelining?: Breaking up a complex operation on a stream of data into simpler sequential operations
+ Throughput: significantly increased
– Latency: somewhat degraded
Throughput = #data items processed/second
[Figure: a “coarse-grain” pipeline (e.g. simple processor: fetch, decode, execute) and a “fine-grain” pipeline (e.g. pipelined adder), with storage elements (latches/registers) between stages]
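The throughput and latency trade-off stated on this slide can be worked through numerically. The sketch below assumes perfectly balanced stages and a fixed per-stage latch overhead; both assumptions and all the numbers are illustrative, not from the slides.

```python
def unpipelined(logic_delay_ns):
    """Return (latency in ns, throughput in items per ns) with no pipelining."""
    latency = logic_delay_ns
    return latency, 1 / latency


def pipelined(logic_delay_ns, stages, latch_overhead_ns):
    """Return (latency in ns, throughput in items per ns) for a pipeline.

    Assumptions: the logic splits evenly across the stages, and each
    stage adds a fixed storage-element (latch/register) overhead.
    """
    stage_delay = logic_delay_ns / stages + latch_overhead_ns
    latency = stages * stage_delay       # one item passes through every stage
    throughput = 1 / stage_delay         # a new item can enter every stage delay
    return latency, throughput


print(unpipelined(12))        # latency 12 ns, about 0.083 items/ns
print(pipelined(12, 4, 1))    # (16.0, 0.25): latency up, throughput tripled
```

With these numbers, splitting 12 ns of logic into 4 stages raises latency from 12 ns to 16 ns because of the latch overhead, while throughput rises from one item every 12 ns to one item every 4 ns.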

13 Seminar Focus (contd.)
Particular Focus: Extremely fine-grain pipelines
• “gate-level” pipelining = use narrowest possible stages
• each stage consists of only a single level of logic gates
  - some of the fastest existing digital pipelines to date
Application areas:
• multimedia hardware (graphics accelerators, video DSP’s, …)
  - naturally pipelined systems, throughput is critical
  - input is often “bursty”
• optical networking
  - serializing/deserializing FIFO’s
• genomic string matching?
  - KMP style string matching: variable skip lengths
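For readers unfamiliar with the “variable skip lengths” mentioned under genomic string matching, here is a software sketch of Knuth-Morris-Pratt (KMP) matching. It is a standard textbook formulation in Python, not the hardware design the seminar has in mind; the function names and the sample strings are illustrative.

```python
def failure_table(pattern):
    """table[i] = length of the longest proper prefix of pattern[:i+1]
    that is also a suffix of it."""
    table = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = table[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        table[i] = k
    return table


def kmp_search(text, pattern):
    """Return the start index of every occurrence of pattern in text."""
    if not pattern:
        return []
    table = failure_table(pattern)
    matches = []
    k = 0                                  # characters of pattern matched so far
    for i, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = table[k - 1]               # skip length depends on the data
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            matches.append(i - k + 1)
            k = table[k - 1]
    return matches


print(failure_table("ACACG"))              # [0, 0, 1, 2, 0]
print(kmp_search("ACACACGT", "ACACG"))     # [2]
```

On a mismatch the matcher falls back by an amount that depends on the data, so the work per input character varies. That data-dependent timing is presumably why the slide lists this as a candidate application.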

14 Homework Problem
Alice and Bob live on opposite sides of a wide river: Alice is supposed to send a message (say, a “Yes”/“No”) across to Bob around midnight. Both have flashlights, but neither owns a watch. What should they do? Suggest several strategies, and discuss pros and cons of each.