Emerging Technologies of Computation


1 Emerging Technologies of Computation
Montek Singh COMP Oct 27, 2011

2 Today: Basics of Asynchronous Design
Introduction to Asynchronous Design
  What is asynchronous design?
  Why do we want to do it?
Data Representation and Communication
  How is data represented in an asynchronous system?
  How is information exchanged?

3 Introduction: Clocked Digital Design
Most current digital systems are synchronous:
  Clock: a global signal that paces operation of all components
Benefit of clocking: enables discrete-time representation
  all components operate exactly once per clock tick
  component outputs need to be ready by next clock tick
  allows “glitchy” or incorrect outputs between clock ticks
[Figure: clock signal]

4 Microelectronics Trends
Current and Future Trends: Significant Challenges
Large-Scale “Systems-on-a-Chip” (SoC)
  100 million to 1 billion transistors/chip
Very High Speeds
  multiple-gigahertz clock rates
Explosive Growth in Consumer Electronics
  demand for ever-increasing functionality …
  … with very low power consumption (limited battery life)
Higher Portability/Modularity/Reusability
  “plug ’n play” components, robust interfaces

5 Challenges to Clocked Design
Breakdown of Single-Clock Paradigm:
  Chip will be partitioned into multiple timing domains
  challenge: gluing together multiple timing domains
  glue logic is susceptible to “metastability” (= incorrect values transferred) and latency overheads
Increasing Difficulties with Clocked Design:
  Clock distribution: requires significant designer effort
  Performance bottleneck: a single slow component
  Clock burns large fraction of chip power (~40-70%)
  Fixed clock rate: poor match for
    designing reusable components
    interfacing with mixed-timing environments

6 What is Asynchronous Design?
Digital design with no centralized clock
Synchronization using local “handshaking”
[Figure: Synchronous System (Centralized Control), driven by a clock, next to Asynchronous System (Distributed Control), connected by handshaking interfaces]

7 Why Asynchronous Design? (1)
Higher Performance
  May obtain “average-case” operation (not “worst-case”)
  not limited by slowest component
  Avoids overheads of multi-GHz clock distribution
Lower Power
  No clock power expended
  Inactive components consume negligible power
Better Electromagnetic Compatibility
  Smooth radiation spectra: no clock spikes
  Much less interference with sensitive receivers [e.g., Philips pagers, smartcards]
Greater Flexibility/Modularity
  Naturally adapts to variable-speed environments
  Supports reusable components

8 Why Asynchronous Design? (2)
The world already is mostly asynchronous!
Events at the level of (or in between) large-scale systems are asynchronous
  several seconds to several milliseconds
  e.g., PC-printer communication, keyboard inputs, network comm.
Events at the board level (or between chips) are often asynchronous
  milliseconds to 100 nanoseconds
  e.g., CPU-memory interface, interface with I/O subsystem (interrupts)
Events within a chip, at the level of functional units (e.g., adders, control logic), are currently mostly synchronous
  several nanoseconds to 100 picoseconds
Events at the level of a single logic gate are asynchronous
  10 picoseconds
Events at the quantum level are asynchronous
  picoseconds to femtoseconds
So, why bother with clocks at all?!
  make everything asynchronous → greater elegance and robustness

9 Challenges of Asynchronous Design
Hazards: potential “glitches” on a wire
  no problem for clocked systems
  communication must be hazard-free!
  special design challenge = “hazard-free synthesis”
  [Figure: clean signals versus hazardous signals, shown relative to the clock tick]
Testability Issues: absence of clock means no “single-stepping”
Lack of Commercial CAD Tools: chicken-and-egg problem

10 Asynchronous Design: Past & Present
Async Design: In existence for 50 years, but …
… many recent technical advances:
  Hazard-Free Circuit Design: several practical techniques for controllers [Stanford/Columbia]
  Design for Testability: several test solutions, e.g. Philips Research
  Maturing Computer-Aided-Design (“CAD”) Tools:
    software tools for automated design [Philips, Columbia, Manchester]
    recent DARPA program [Boeing, Philips, UNC, Columbia, …]
  Successful Fabricated Chips: embedded processors, high-speed pipelines, consumer electronics…

11 Recent Commercial Interest (1)
Several commercial asynchronous chips:
  Philips: asynchronous 80c51 microcontrollers, used in commercial pagers [1998] and smartcards [2001]
  Univ. of Manchester: async ARM processor [2000]
  Motorola: async divider in PowerPC chip [2000]
  HAL: async floating-point divider in HAL-I and II processors [early 1990’s]
Recent experimental chips:
  IBM, Sun and Intel: fast pipelines, arbiters, instruction-length decoder…
  IBM/Columbia/UNC: asynchronous digital FIR filter
Several recent startups:
  Handshake Solutions, Theseus Logic, Codetronix, Fulcrum, Silistix, …

12 Recent Commercial Interest (2)
Major DARPA program: ~$13M
Goals:
  commercial-strength automated CAD tool (= silicon compiler)
    direct translation from algorithms to chip layout
    capable of producing chips with 50M transistors or more
    rich suite of analysis and optimization tools
  demonstration chip
    Boeing application
    show dramatic improvements in: design time, power consumption, noise pollution, speed (?)
Team:
  led by Boeing
  async startups: Theseus, Handshake Solutions, Codetronix
  universities: UNC, Columbia, UW, OrSU

13 Data Representation and Communication

14 A 5-minute Homework Problem
Alice and Bob live on opposite sides of a wide river.
[Figure: Alice and Bob on opposite banks]
Alice is supposed to send a message (say, a “Yes”/“No”) across to Bob around midnight.
Both have flashlights, but neither owns a watch.
What should they do? Suggest several strategies, and discuss pros and cons of each.

15 Solution 1
Alice uses 2 lamps:
  1 to indicate that she is ready with the message, and
  1 for the message itself
Bob uses 1 lamp:
  to indicate that he has received the message
[Figure: Alice's lamps signal “ready” and “yes/no” to Bob; Bob's lamp signals “got it” back]

16 Solution 2
Alice uses 2 lamps:
  Green lamp to indicate “yes”
  Red lamp to indicate “no”
Bob uses 1 lamp:
  to indicate that he has received the message
[Figure: Alice's lamps signal “yes” or “no” to Bob; Bob's lamp signals “got it” back]

17 Solution 3
What if Alice and Bob could keep time?
Alice uses 1 lamp for the message:
  At 12 midnight: turns on lamp if message = “yes”
  At 12:01: turns lamp off
Bob needs no lamps!
  Takes down the message between 12 and 12:01
Pros: Fewer signals, less processing needed
Cons: Alice and Bob must keep their clocks closely synchronized
  If Bob’s watch is off by a minute, incorrect communication is possible
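
The failure mode of Solution 3 can be sketched in a few lines of Python. This is only an illustrative model, not part of the original slides: the function name bob_reads and the assumption that Bob looks at the lamp once, half a minute past midnight on his own watch, are made up for the example.

```python
def bob_reads(message_is_yes: bool, skew_minutes: float,
              sample_offset: float = 0.5) -> bool:
    """Return True if Bob records "yes".

    Times are in minutes after midnight on Alice's watch. Alice's lamp is
    on during [0, 1) only when the message is "yes". Bob looks once, at
    `sample_offset` minutes past midnight on HIS watch (assumed behaviour).
    """
    # When Bob's watch reads t, Alice's watch reads t + skew_minutes.
    look_time = sample_offset + skew_minutes
    lamp_on = message_is_yes and 0.0 <= look_time < 1.0
    return lamp_on


if __name__ == "__main__":
    for skew in (0.0, 0.4, 1.0):
        print(f"skew={skew} min, message=yes, Bob records yes: {bob_reads(True, skew)}")
```

With a skew of a full minute Bob looks after the lamp has been turned off, so a “yes” is recorded as a “no”; this is the synchronization cost listed under Cons.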

18 Homework!
Think of all scenarios in which Solution #1 can fail.
Are any of those scenarios a problem for Solution #2 as well?

19 Data Representation and Communication
How is data represented in an asynchronous system?
How is information exchanged?
  control signaling (handshake styles)

20 Data Encoding: “Bundled Data”
Single-rail “Bundled Datapath”: simplest approach, widely used
Features:
  datapath: 1 wire per bit (e.g. standard sync blocks)
  matched delay: produces delayed “done” signal
    worst-case delay: longer than slowest path
    done indicates valid data
[Figure: function block with data wires (bit 1, bit n, bit m) and a matched delay from “request” to “done”]
Practical style: can reuse sync components; small area
Fixed (worst-case) completion time
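
A minimal Python sketch of the bundled-data timing rule, assuming made-up path delays in nanoseconds and a hypothetical safety margin (neither comes from the slides): the matched delay must exceed the slowest datapath route, and “done” always arrives a fixed time after “request”.

```python
def matched_delay(path_delays_ns, margin_ns=0.5):
    """Pick a matched delay longer than the slowest datapath route.

    `path_delays_ns` and `margin_ns` are illustrative numbers only.
    """
    return max(path_delays_ns) + margin_ns


def done_time(request_time_ns, delay_ns):
    """'done' is simply 'request' pushed through the matched delay."""
    return request_time_ns + delay_ns


def data_valid_at_done(actual_delay_ns, delay_ns):
    """The bundling constraint: data must have settled before 'done' fires."""
    return actual_delay_ns <= delay_ns


if __name__ == "__main__":
    delays = [1.0, 2.5, 1.8]          # hypothetical routes through the block
    d = matched_delay(delays)
    print("matched delay:", d)
    print("done at:", done_time(10.0, d))
    print("fast input still waits the full delay:", data_valid_at_done(1.0, d))
```

Even when the data settles after 1.0 ns, “done” is not raised until the full matched delay has elapsed, which is the fixed (worst-case) completion time noted on the slide.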

21 Bundled Data: Completion Sensing
Delay Matching:
  either single worst-case delay
  or fine-grain delay
Speculative completion:
  choose delay “on the fly”
  start with shortest delay; increase as needed
[Figure: bank of delays between “request” and “done”, with a MUX driven by a delay selector]

22 Data Encoding: Dual-Rail
Dual-rail: uses 2 wires per data bit
[Figure: dual-rail wire pairs for bit 1, bit n, bit m]
Each Dual-Rail Pair:
  provides both data value and validity
  provides robust data-dependent completion
  needs completion detectors
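
One way to see how a wire pair carries both value and validity is the sketch below. The slides do not fix a particular code assignment; the mapping used here (both rails low = no data yet, true rail high = 1, false rail high = 0) is a commonly used convention and is an assumption of this example, as are the names encode_bit and decode_bit.

```python
from typing import Optional, Tuple

Rail = Tuple[int, int]      # (true_rail, false_rail)
SPACER: Rail = (0, 0)       # assumed "no valid data yet" state


def encode_bit(bit: int) -> Rail:
    """Drive exactly one rail of the pair high to carry a valid bit."""
    return (1, 0) if bit else (0, 1)


def decode_bit(pair: Rail) -> Optional[int]:
    """Return the bit value, or None while the pair is still empty."""
    if pair == (1, 0):
        return 1
    if pair == (0, 1):
        return 0
    if pair == SPACER:
        return None
    raise ValueError("both rails high is not a valid dual-rail codeword")


if __name__ == "__main__":
    word = [encode_bit(b) for b in (1, 0, 1)]
    print(word)
    print([decode_bit(p) for p in word])
    print(decode_bit(SPACER))
```

Because validity travels with each bit, a receiver can tell when a value has arrived without a separate timing assumption.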

23 Dual-Rail: Completion Sensing
Dual-Rail Completion Detector:
  combines dual-rail signals
  indicates when all bits are valid (or reset)
  OR together 2 rails per bit
  Merge results using a Muller “C-element”
C-element:
  if all inputs = 1, output → 1
  if all inputs = 0, output → 0
  else, maintain output value
[Figure: bit0, bit1, …, bitn each feed an OR gate; the OR outputs feed a C-element (C) that produces Done]
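
The C-element rule and the OR-then-merge structure described on this slide can be modelled behaviourally in Python. This is a sketch, not a circuit: the class and function names are invented for illustration, and each dual-rail bit is represented as a pair of 0/1 integers.

```python
class CElement:
    """Behavioural model of a Muller C-element with any number of inputs."""

    def __init__(self, initial: int = 0):
        self.out = initial

    def update(self, inputs) -> int:
        if all(v == 1 for v in inputs):
            self.out = 1            # all inputs high: output goes high
        elif all(v == 0 for v in inputs):
            self.out = 0            # all inputs low: output goes low
        # otherwise: hold the previous output value
        return self.out


def completion_detector(c: CElement, pairs) -> int:
    """OR the two rails of every bit, then merge the results with a C-element."""
    per_bit_valid = [t | f for (t, f) in pairs]
    return c.update(per_bit_valid)


if __name__ == "__main__":
    c = CElement()
    print(completion_detector(c, [(1, 0), (0, 0)]))  # one bit still empty
    print(completion_detector(c, [(1, 0), (0, 1)]))  # all bits valid
    print(completion_detector(c, [(0, 0), (0, 1)]))  # partially reset: holds
    print(completion_detector(c, [(0, 0), (0, 0)]))  # all bits reset
```

The hold behaviour is what lets Done stay high until every bit has reset, rather than dropping as soon as the first bit clears.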

24 Handshaking Styles: 4-phase
4-Phase: requires 4 events per handshake
[Figure: Request and Acknowledge waveforms, labelled: start event, done, get ready for next event, ready for next event]
“Level-sensitive” → simpler logic implementation
Overhead of “return-to-zero” (RTZ or resetting)
  extra events which do no useful computation
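
The four events of a 4-phase handshake can be listed explicitly with a small Python trace generator. It is a sketch under the usual assumption that both wires start low; the function name is hypothetical.

```python
def four_phase_handshake(n_transfers: int):
    """Return the sequence of (req, ack) levels for n handshakes."""
    req, ack = 0, 0
    trace = [(req, ack)]
    for _ in range(n_transfers):
        req = 1
        trace.append((req, ack))    # sender raises request: start event
        ack = 1
        trace.append((req, ack))    # receiver raises acknowledge: done
        req = 0
        trace.append((req, ack))    # sender returns request to zero
        ack = 0
        trace.append((req, ack))    # receiver returns to zero: ready for next
    return trace


if __name__ == "__main__":
    for state in four_phase_handshake(1):
        print(state)
```

Only the first two transitions of each cycle move the computation forward; the last two are the return-to-zero overhead.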

25 Handshaking Styles: 2-phase
2-Phase: requires 2 events per handshake
  a.k.a. transition signaling
[Figure: Request and Acknowledge waveforms for two consecutive events, labelled: start event, done, start next event]
Elegant: no return-to-zero
Slower logic implementation:
  logic primitives are inherently level-sensitive, not event-based (at least in CMOS)
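
In transition signaling every toggle of a wire is an event, regardless of direction. A small Python trace generator makes this concrete; it assumes both wires start low, and the function name is made up for the example.

```python
def two_phase_handshake(n_transfers: int):
    """Return the sequence of (req, ack) levels for n handshakes."""
    req, ack = 0, 0
    trace = [(req, ack)]
    for _ in range(n_transfers):
        req ^= 1
        trace.append((req, ack))    # any transition on req: start event
        ack ^= 1
        trace.append((req, ack))    # any transition on ack: done
    return trace


if __name__ == "__main__":
    for state in two_phase_handshake(2):
        print(state)
```

Two complete transfers pass through the same five wire states that a 4-phase handshake spends on a single transfer, which is the sense in which 2-phase avoids return-to-zero.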

26 Handshaking Styles: Pulse Mode
Pulse Mode: combines benefits of 2-phase and 4-phase
  use pulses to represent events
[Figure: Request pulses mark “start event” and “start next event”; Acknowledge pulses mark “event done” and “next event done”]
No return-to-zero (like 2-phase)
Level-based implementation (like 4-phase)
Need a timing constraint on pulse width

27 Handshaking Styles: Single-Track
Single-Track: combines req and ack onto single wire!
  one wire used for bidirectional communication
  sender raises, receiver lowers
[Figure: separate Request and Acknowledge wires (req, ack) compared with one combined “req + ack” wire]
Efficient protocol: no return-to-zero, level-based
Need aggressive low-level design techniques
  much effort to ensure reliability, satisfy timing constraints

28 Handshaking + Data Representation
Several combinations possible:
  dual-rail 4-phase, single-rail 4-phase, dual-rail 2-phase, and single-rail 2-phase
Example: dual-rail 4-phase
  dual-rail data: functions as an implicit “request”
  4-phase cycle: between acknowledge and implicit request
[Figure: blocks A and B connected by dual-rail data wires (bit 1, bit m) and an ack wire]

29 Other Data Representation Styles
Level-Encoded Dual-Rail (LEDR)
  2 wires per bit: “data” and “phase”
  exactly one wire per bit changes value
    if new value is different, “data” wire changes value
    else “phase” wire changes value
  [Figure: data and phase waveforms]
M-of-N Codes
  N wires used for a data word
  M wires (M <= N) change value
  Values of N and M have impact on…
    information transmitted, power consumed and logic complexity
Knuth codes, Huffman codes, …
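
The LEDR rule stated on this slide (change “data” if the new value differs, otherwise change “phase”) can be followed literally in a short Python sketch. The function name ledr_send and the all-zero starting state are assumptions made for the example.

```python
def ledr_send(state, new_bit: int):
    """Send one bit on a (data, phase) wire pair; returns the new wire state."""
    data, phase = state
    if new_bit != data:
        data = new_bit      # value differs: the "data" wire changes
    else:
        phase ^= 1          # same value again: the "phase" wire changes
    return (data, phase)


if __name__ == "__main__":
    state = (0, 0)          # assumed initial wire levels
    for bit in (1, 1, 0, 0):
        new_state = ledr_send(state, bit)
        changed = sum(a != b for a, b in zip(state, new_state))
        print(f"send {bit}: {state} -> {new_state} ({changed} wire changed)")
        state = new_state
```

Every transfer flips exactly one wire, so the receiver can treat any change on the pair as the arrival of a new value and read that value directly from the “data” wire.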

30 Which to use?
Depends on several performance parameters:
speed
  single-rail vs. dual-rail
    single-rail may be faster (if designed aggressively)
    dual-rail may be faster (if completion times vary widely)
  2-phase vs. 4-phase
    2-phase may be faster (if logic overhead is small)
    4-phase may be faster (if overhead of return-to-zero is small)
power consumption
  2-phase typically has fewer gate transitions (→ lower power)
amount of logic used (#gates/wires/pins → chip area)
  single-rail needs fewer gates/wires/pins
design and verification effort
  dual-rail, 1-of-N, M-of-N, Knuth codes…: delay-insensitive: robust in the presence of arbitrary delays
  single-rail: requires greater timing verification effort

31 Homework!
Suppose you are given N wires
  Which M-of-N encoding (i.e. what M) encodes most information?
Suppose you have to encode 4-bit values
  Which M-of-N encoding yields fewest wires?
Suppose you can switch at most 2 wires
  Which M-of-N encoding yields fewest wires for 4-bit values?
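
For exploring these questions, a few Python helpers may be useful. They assume the usual reading of an M-of-N code, in which a codeword is any choice of exactly M wires out of N, so the number of codewords is the binomial coefficient C(N, M); the helper names are hypothetical.

```python
from math import comb, log2


def codewords(n: int, m: int) -> int:
    """Number of distinct M-of-N codewords: choose which M of the N wires switch."""
    return comb(n, m)


def bits_encoded(n: int, m: int) -> float:
    """Information carried by one codeword, in bits."""
    return log2(comb(n, m))


def can_encode(n: int, m: int, value_bits: int) -> bool:
    """True if an M-of-N code has enough codewords for `value_bits`-bit values."""
    return comb(n, m) >= 2 ** value_bits


if __name__ == "__main__":
    n = 4
    for m in range(n + 1):
        print(f"{m}-of-{n}: {codewords(n, m)} codewords")
```

Tabulating codewords(n, m) for a few values of N, as in the loop above, is a quick way to check an answer to each part of the homework.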

