Published by Brianna Manning. Modified over 9 years ago.
1
MOUSETRAP: Ultra-High-Speed Transition-Signaling Asynchronous Pipelines. Montek Singh & Steven M. Nowick, Department of Computer Science, Columbia University, New York, NY 10027. 2001 IEEE.
2
Agenda: Review; Introduction; MOUSETRAP; Preliminary Experiment Results; Conclusions
3
Review. Synchronous pipelines: wave pipeline, clock-delayed domino, skew-tolerant domino, self-resetting circuits. Asynchronous pipelines: micropipeline, GasP, IPCMOS.
4
Benefits of asynchronous circuits: no clock-skew problem; low power consumption; higher speed (average case rather than worst case); fewer global timing issues; tolerance of variations in fabrication, temperature, etc.; low EMI and noise.
5
Low power consumption. On high-performance chips, clock power is a significant proportion of total power consumption. Gated clocks reduce the waste, but they make clock skew worse and incur some power cost of their own. All parts of a clocked circuit run at the same frequency.
6
Performance. A synchronous design must be toleranced for worst-case conditions: fabrication, temperature, voltage, data values, and clock skew. Asynchronous circuits self-adjust to the operating and data conditions.
7
Agenda: Review; Introduction; MOUSETRAP; Preliminary Experiment Results; Conclusions
8
Introduction: asynchronous design styles. Protocol: level signaling (four-phase) or transition signaling (two-phase). Logic: bundled data (e.g., single-rail) or self-timed (e.g., dual-rail).
9
Level signaling (four-phase): A sends data to B
(active phase)
Step 1: A puts data on the bus and sets req = 1.
Step 2: B takes data from the bus and sets ack = 1.
(return-to-zero phase)
Step 3: A sets req = 0.
Step 4: B sets ack = 0.
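The four steps above can be sketched as a small simulation. This is only an illustrative model: the function name `four_phase_transfer`, the `trace` list, and the single shared `bus` variable are assumptions made for the example, not part of the slides.

```python
def four_phase_transfer(items):
    """Simulate sender A and receiver B using a four-phase (return-to-zero) handshake.

    Returns the items B received and a trace of (req, ack) after each step.
    """
    req = ack = 0
    received, trace = [], []
    for item in items:
        # Step 1: A puts data on the bus and raises req.
        bus = item
        req = 1
        trace.append((req, ack))
        # Step 2: B takes the data and raises ack.
        received.append(bus)
        ack = 1
        trace.append((req, ack))
        # Step 3: A lowers req (return-to-zero phase, no data moves).
        req = 0
        trace.append((req, ack))
        # Step 4: B lowers ack; the channel is idle again.
        ack = 0
        trace.append((req, ack))
    return received, trace


received, trace = four_phase_transfer([7, 9])
```

Each item costs four signal transitions, two of which only return the wires to zero.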
10
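Transition signaling (two-phase): A sends data to B. Every transition on req or ack, rising or falling, is an event, so there is no return-to-zero phase.
Step 1: A puts data on the bus and toggles req (0 to 1).
Step 2: B takes data from the bus and toggles ack (0 to 1).
Step 3: A puts the next data on the bus and toggles req (1 to 0).
Step 4: B takes data from the bus and toggles ack (1 to 0).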
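A comparable sketch for the two-phase protocol, where toggling a wire in either direction signals an event. As before, `two_phase_transfer`, `trace`, and `bus` are hypothetical names chosen for the illustration.

```python
def two_phase_transfer(items):
    """Simulate sender A and receiver B using a two-phase (transition) handshake.

    Returns the items B received and a trace of (req, ack) after each event.
    """
    req = ack = 0
    received, trace = [], []
    for item in items:
        # A puts data on the bus and toggles req (rising or falling edge).
        bus = item
        req ^= 1
        trace.append((req, ack))
        # B takes the data and toggles ack to match req.
        received.append(bus)
        ack ^= 1
        trace.append((req, ack))
    return received, trace


received, trace = two_phase_transfer([3, 4])
```

Here two items are transferred with four transitions in total, whereas the four-phase protocol needs four transitions per item.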
11
Introduction: asynchronous design styles. Protocol: level signaling (four-phase) or transition signaling (two-phase). Logic: bundled data (e.g., single-rail) or self-timed (e.g., dual-rail).
12
C-element: Z_next = AB + Z(A + B). When A = 1 and B = 1, Z_next = 1. When A = 0 and B = 0, Z_next = 0. When A and B differ, Z holds its previous value.
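The next-state equation can be checked directly with a one-line model. The function name `c_element` and the use of 0/1 integers are assumptions of this sketch.

```python
def c_element(a, b, z):
    """Next state of a C-element: Z_next = A.B + Z.(A + B), with 0/1 inputs."""
    return (a & b) | (z & (a | b))


# Enumerate all input/state combinations as (a, b, z, z_next).
truth_table = [(a, b, z, c_element(a, b, z))
               for a in (0, 1) for b in (0, 1) for z in (0, 1)]
```

The output changes only when both inputs agree; whenever they differ, the previous value of Z is returned unchanged.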
13
Micropipeline: 4-phase latch FIFO (signals: req, ack)
14
Bundled-data
15
Self-timed: the logic generates its own completion-detection signal using delay-insensitive (DI) coding, e.g., dual-rail coding (two-phase coding): 00 -> invalid (no valid data); 01 -> 0; 10 -> 1; 11 -> not used.
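A minimal sketch of the dual-rail code listed above, with a completion detector that reports when every bit of a word carries valid data. The function names and the choice to write each pair as (first rail, second rail), so that `(1, 0)` means logic 1 and `(0, 1)` means logic 0, are assumptions of the example.

```python
def encode_bit(bit):
    """Encode one logic bit as a dual-rail pair: 1 -> (1, 0), 0 -> (0, 1)."""
    return (1, 0) if bit else (0, 1)


def decode_pair(pair):
    """Decode a dual-rail pair; return None while the pair is still 00 (no data)."""
    if pair == (0, 0):
        return None
    if pair == (1, 1):
        raise ValueError("11 is not a legal dual-rail codeword")
    return 1 if pair == (1, 0) else 0


def completion(word):
    """Completion detection: true once every pair has exactly one rail high."""
    return all(t ^ f for t, f in word)


word = [encode_bit(b) for b in (1, 0, 1)]
done = completion(word)                 # every bit is valid
partial = completion([(1, 0), (0, 0)])  # second bit has not arrived yet
```

Because validity is carried by the data wires themselves, no separately timed request signal is needed to tell the receiver when the word is complete.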
16
Self-timed (dual-rail coding)
17
Performance Comparison of Asynchronous Adders, Mark A. Franklin & Tienyo Pan
18
Agenda: Review; Introduction; MOUSETRAP; Preliminary Experiment Results; Conclusions
19
MOUSETRAP: Minimal-Overhead Ultrahigh-SpEed Transition-signaling Asynchronous Pipeline
20
MOUSETRAP FIFO: latch delay is 110 ps; XNOR delay is 65 ps
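As a rough worked example, the two delays above can be plugged into timing expressions of the form latency = t_latch + t_logic and cycle time = 2 * t_latch + t_logic + t_xnor. These expressions are an assumption here, since the timing slides in this transcript survive only as titles; the function name `mousetrap_timing` is likewise hypothetical.

```python
def mousetrap_timing(t_latch_ps, t_xnor_ps, t_logic_ps=0.0):
    """Estimate per-stage forward latency and cycle time in picoseconds.

    Assumed model (not stated on the slides):
      latency = t_latch + t_logic
      cycle   = 2 * t_latch + t_logic + t_xnor
    """
    latency = t_latch_ps + t_logic_ps
    cycle = 2 * t_latch_ps + t_logic_ps + t_xnor_ps
    return latency, cycle


# FIFO case: no logic between stages, delays taken from the slide.
latency_ps, cycle_ps = mousetrap_timing(t_latch_ps=110, t_xnor_ps=65)
throughput_ghz = 1000.0 / cycle_ps  # 285 ps per item, roughly 3.5 GHz
```

Under this model the cycle time is dominated by two latch delays, which is why small, fast latches matter for throughput.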
21
MOUSETRAP with logic (bundled data)
22
Bundled data. Bundled-data scheme: Req_N must arrive at stage N after the data inputs to that stage have stabilized. The request is therefore delayed to match the worst-case logic delay. In return, the datapath logic is allowed to have hazards.
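The bundling constraint amounts to a simple inequality: the delay on the request wire must be at least the slowest data path into the stage. A minimal check, assuming delays are given in picoseconds and using the hypothetical names `matched_delay_ok` and `margin_ps`:

```python
def matched_delay_ok(data_path_delays_ps, req_delay_ps, margin_ps=0.0):
    """Return True if req arrives no earlier than the slowest data bit plus a margin."""
    worst_case = max(data_path_delays_ps)
    return req_delay_ps >= worst_case + margin_ps


# Illustrative numbers only: three data paths through a stage's logic.
paths = [180.0, 240.0, 205.0]
safe = matched_delay_ok(paths, req_delay_ps=260.0, margin_ps=10.0)
unsafe = matched_delay_ok(paths, req_delay_ps=230.0)
```

The path delays here are invented for illustration; in practice they would come from timing analysis of the stage's logic block.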
23
Delay buffer options: an inverter chain; a chain of transmission gates; or a duplicate of the worst-case critical path, which gives a more accurate delay but costs more area.
24
Timing: forward latency
25
Timing: cycle time
26
Standard synchronous pipeline: forward latency and cycle time
27
MOUSETRAP: setup time
28
MOUSETRAP: hold time
29
Clocked-CMOS (C²MOS) logic
30
Benefits of C²MOS: smaller delay; smaller area; lower power consumption
31
MOUSETRAP with C²MOS: forward latency and cycle time
32
Handling wide datapaths: datapath partitioning; control kiting (buffer insertion)
33
Optimization: sliding door; changing the MOS transistors' width (lower )
34
Non-linear pipeline: fork
35
Non-linear pipeline: join
36
Experiment
0.25 μm TSMC process, 2.5 V, 300 K: a pass-gate implementation of an XNOR/XOR; a standard 6-transistor pass-gate dynamic D-latch.
0.6 μm HP process, 3.3 V, 300 K: a pass-gate implementation of an XNOR/XOR; a clocked-CMOS style latch.
10 stages, 16-bit datapath; pre-layout simulation (HSPICE).
37
Results
38
Conclusions: uses small, fast latches; low latch-controller overhead (XNOR); transition-signaling protocol (efficient and concurrent); no complex timing or design effort required; adapts to variable-speed environments (elasticity).
39
Comparison: IPCMOS (asynchronous interlocked pipelined CMOS): 3.3 to 4.5 GHz; IBM 0.18 μm; post-layout simulation