Published by Moshe Milk. Modified over 9 years ago.

Asynchronous Circuit Compilation
Dr. Doug Edwards, doug@cs.man.ac.uk
Southampton: Oct 99

Overview:
- Asynchronous circuits
- Advantages
- Asynchronous Design Paradigms
- Syntax Directed Compilation
  - Handshake Circuits
- Balsa
- Datapath Compilation
- Design Example - DMA Controller

Asynchronous (self-timed) Basics
- Synchronous circuits: a global clock separates system states
  - A time domain view of system activity.
- Asynchronous circuits: input changes separate system states
  - A sequence or trace domain view of system activity.

Why Asynchronous?
- Low Power
  - data-driven: power is only used to do useful work
  - zero power when idle with instant restart
- Low EMI
  - In a clocked circuit, all noise is correlated
  - Async circuits have “distributed” switching activity leading to uncorrelated EMI
- No clock distribution problems
- Composability/Modularity
  - facilitates IP reuse
- Average Case Performance
  - exploit the fact that worst-case often occurs infrequently

Timing Models
- Delay Insensitive (DI): delays in circuits & wires are arbitrary
- Quasi-Delay Insensitive (QDI): similar to DI but assuming isochronic forks
- Speed Independent (SI): wires have no delays, arbitrary gate delays
- Bounded Delay: single-sided timing constraints

Asynchronous Design Paradigms
- AFSMs - for fast controllers etc
  - Traditionally hard
    - hazards, races, state assignment problems
  - Research has led to new techniques
    - STG/Petri net based SI circuits
    - Burst-Mode circuits
- Macromodule-like for larger systems
  - micropipeline approach, handshake circuits

Asynchronous Control
- With no clock, some other means is required to co-ordinate control flow
- Use a request/acknowledge handshake
[Figure: a Sender with Req and Ack signals]

Signalling Protocols
- req & ack are abstractions: layer a signalling protocol on top of them
- Two common protocols
  - 2-phase (transition signalling, NRZ)
  - 4-phase (Return-to-Zero signalling)

Data Validity Models
- Self Timed
  - The validity of the data is encoded within the data itself: redundant coding
  - e.g. Dual Rail: each data bit requires two wires. 00 -> no data, 01 -> ‘0’, 10 -> ‘1’
- Bundled Data approach
  - conventional datapath
  - validity is assured by imposing timing constraints.

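The dual-rail code listed on this slide can be sketched in a few lines of Python. The function names and the choice to treat each two-digit codeword as a (first wire, second wire) pair are assumptions made for illustration; the slide does not name the wires.

```python
# Dual-rail codewords as on the slide: 00 -> no data, 01 -> '0', 10 -> '1'.
# The codeword 11 is not used by this code.
NO_DATA = (0, 0)

def encode_bit(bit):
    """Return the wire pair that carries one data bit."""
    return (1, 0) if bit else (0, 1)

def decode_bit(pair):
    """Return 0 or 1 for a valid codeword, None while no data is present."""
    if pair == NO_DATA:
        return None
    if pair == (0, 1):
        return 0
    if pair == (1, 0):
        return 1
    raise ValueError("illegal dual-rail codeword 11")

def encode_word(value, width):
    """Encode an integer as a list of wire pairs, least significant bit first."""
    return [encode_bit((value >> i) & 1) for i in range(width)]

def word_is_valid(pairs):
    """Completion detection: the word is valid once every bit carries a codeword."""
    return all(pair != NO_DATA for pair in pairs)
```

Because validity is carried by the data wires themselves, a receiver can wait on `word_is_valid` instead of relying on a timing assumption, which is the contrast with the bundled-data approach.
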
2-phase Protocol
- Events are transitions:
[Figure: Req and Ack waveforms with data-valid periods; each Req/Ack transition pair is one transaction]

4-phase protocol
- Signals are returned to initial state after each transaction
- Several possible interleavings of the signal transitions

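As a rough illustration of the difference between the two protocols, the sketch below lists the (req, ack) wire levels seen during n transactions. The function names are hypothetical, and the 4-phase trace shows only one of the possible interleavings (req up, ack up, req down, ack down).

```python
def two_phase(n):
    """Wire levels for n transactions; every transition is an event."""
    req = ack = 0
    trace = [(req, ack)]
    for _ in range(n):
        req ^= 1                      # request event: req toggles
        trace.append((req, ack))
        ack ^= 1                      # acknowledge event: ack toggles
        trace.append((req, ack))
    return trace

def four_phase(n):
    """Wire levels for n transactions; both wires return to zero each time."""
    trace = [(0, 0)]
    for _ in range(n):
        trace += [(1, 0), (1, 1), (0, 1), (0, 0)]
    return trace

def transitions(trace):
    """Number of signal transitions in a trace."""
    return len(trace) - 1
```

Under these assumptions a 2-phase transaction costs two transitions and leaves the wires in a new state, while a 4-phase transaction costs four and always ends at (0, 0).
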
Comparison of Approaches
- 2-phase/4-phase
  - 2-phase conceptually simpler (once an event mind-set is adopted)
  - 2-phase circuits slower & more complex
  - think 2-phase, build 4-phase
- Bundled-Data/Dual-rail
  - Current orthodoxy: bundled data is faster, lower power, smaller area, with tolerancing task no worse than for a clocked design

Current Approach
- QDI control
- Bounded-Delay (bundled-data) datapath
- 4-phase signalling
[Figure: Amulet3i]

Asynchronous HDLs
- Conventional programming languages lack 3 necessary constructs:
  - communication
  - parallelism/concurrency
  - sharing (of hardware)
- Conventional HDLs lack
  - adequate fine-grain concurrency
  - channel based communication primitives

Asynchronous HDLs – 2
- Tangram, Balsa
  - CSP based + data types + …
  - based on underlying formal semantics
    - guarantees correct composition rules
    - easier composition than in sync circuits???
  - transparent compilation
    - each production rule in the language translates to an intermediate handshake circuit
    - allows designer to infer circuit costs & performance from the program

Handshake Circuits - 1
- Circuits communicate along channels
- Channels connect ports at circuit interface
- Ports have:
  - Type
  - Direction
  - Sense

Handshake Circuits - 2
- Port type determines the number of data wires
  - no data wires == control only port!
- Port direction is input, output or control only
- Port sense
  - Active: initiates transfers
  - Passive: responds to requests

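The three port attributes can be captured in a small data model. This is a sketch only, not the representation used by any Balsa tool: the class `Port`, the function `can_connect`, and the rule that a channel joins exactly one active and one passive port of equal width are assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Port:
    name: str
    width: int       # port type: number of data wires (0 = control only)
    direction: str   # "input", "output" or "control"
    sense: str       # "active" initiates transfers, "passive" responds

def can_connect(p, q):
    """Assumed rule: one active end, one passive end, matching widths,
    and data flowing from an output port to an input port."""
    if p.width != q.width or {p.sense, q.sense} != {"active", "passive"}:
        return False
    if p.width == 0:
        return p.direction == q.direction == "control"
    return {p.direction, q.direction} == {"input", "output"}
```

For example, a 16-bit active output can be joined to a 16-bit passive input, but two active ports cannot share a channel under this rule because neither would respond to the other's request.
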
Micropipeline-Style Circuits: Push Circuits
[Figure, repeated for each step: circuit (cct) with a passive input port and an active output port, each carrying req, ack and data]
1. Circuit waits for data
2. Data arrives
3. Data validity signalled
4. Circuit accepts data
5. Circuit signals data taken
6. Circuit outputs data
7. Circuit signals validity
8. Receiver takes data

Micropipeline-Style Circuits:
- 4-phase protocol not detailed
- Previous circuit decoupled input and output
  - implies a latch inside the handshake circuit
- An alternative is for the input handshake to enclose the output handshake

Enclosed Handshake: Push Circuits
[Figure, repeated for each step: circuit (cct) with req, ack and data on both the input and the output side]
1. Data arrives
2. Data validity signalled
3. Circuit accepts data
4. Circuit outputs data
5. Circuit signals validity
6. Receiver takes data
7. Input handshake completes. No latch required

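The difference between a decoupled and an enclosed push circuit comes down to the order of four events. The sketch below abstracts each transfer to one request and one acknowledge per port, ignoring return-to-zero phases; the event names and helper functions are hypothetical.

```python
# Decoupled: the input is acknowledged before the output handshake starts,
# so the circuit has to hold the data itself.
DECOUPLED = ["in.req", "latch data", "in.ack", "out.req", "out.ack"]

# Enclosed: the output handshake runs inside the input handshake,
# so the sender keeps the data stable until the receiver has taken it.
ENCLOSED = ["in.req", "out.req", "out.ack", "in.ack"]

def input_encloses_output(trace):
    """True if the whole output handshake lies between in.req and in.ack."""
    pos = {event: index for index, event in enumerate(trace)}
    return pos["in.req"] < pos["out.req"] < pos["out.ack"] < pos["in.ack"]

def needs_latch(trace):
    """Storage is needed when the sender is released before delivery."""
    return not input_encloses_output(trace)
```

The enclosed ordering trades storage for concurrency: no latch is required, but the sender cannot proceed until the receiver has accepted the data.
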
Tangram Style Circuits: Pull Circuits
- active ported circuits / control driven
[Figure, repeated for each step: circuit (cct) with an active input port; req, ack and data on both sides]
1. Circuit demands data
2. Data is sent on demand
3. Data is accepted and can then be released

Balsa
- Language for synthesising large async circuits & systems
- CSP/OCCAM background
- Tangram-like
  - based on Tangram compilation function
  - compiles to a small (but expanding) set of handshake circuits
  - origins: ESPRIT EXACT project

Balsa Language Features
- Data types based on sequence of bits
  - Arrays and records are bit-based
  - Element extraction is by array slicing
  - Strict data typing
- Structural iteration
- Arrayed channels
- Parameterised & recursive functions
- Enclosed selection semantics
  - Allows passive ported circuits
  - Allows push (micropipeline-style) circuits
  - Allows unbuffered (latch-free) circuits
  - Can be considered a restricted form of Burns’ probe construct.

Balsa Source

Example: Single Place Buffer

    import [balsa.types.basic]         -- library mechanism
    public                             -- visibility
    type word is 16 bits               -- type declaration
    procedure buffer (input i : word;  -- procedure definition with
                      output o : word) is  -- channel declarations
    local variable x : word            -- implies latch
    begin
      loop                             -- repeat forever
        i -> x ;                       -- Input communication: read input channel into
                                       -- local variable x; ";" is sequential operation
        o <- x                         -- Output communication: output local variable x
                                       -- to output channel
      end
    end

Buffer Handshake Circuit: Single-place buffer
[Figure, repeated for each step: repeater (#) on the activation channel, sequencer (;), two transferrers (T), variable (x), and the channels i and o]
1. Repeater is activated
2. Sequencer handshakes to left transferrer
3. Transferrer requests data from environment
4. Data transferred to variable
5. Variable handshake completes
6. Transferrer handshake completes to environment
7. Transferrer handshake completes
8. Sequencer handshakes to right transferrer
9. Transferrer reads variable
10. Transferrer outputs to environment
11. Handshakes complete
12. Sequencer completes its input handshake
13. Repeater initiates another transfer, etc

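The control flow of the single-place buffer can be replayed in software. This is a behavioural sketch only: the function `run_buffer` and its log messages are hypothetical, the individual request and acknowledge wires are not modelled, and the repeater loop is bounded by the input list so that the code terminates.

```python
def run_buffer(inputs):
    """Replay the buffer's handshake components for a list of input values.
    Returns (outputs, log)."""
    log, outputs = [], []
    x = None                                    # the variable component
    for value in inputs:                        # repeater (#): run the body again
        log.append("sequencer -> left transferrer")
        log.append("left transferrer pulls from i")
        x = value                               # data transferred to the variable
        log.append("variable written")
        log.append("sequencer -> right transferrer")
        log.append("right transferrer reads variable")
        outputs.append(x)                       # data delivered on channel o
        log.append("right transferrer pushes to o")
    return outputs, log
```

Calling `run_buffer([1, 2, 3])` returns the inputs unchanged together with six log entries per transfer, which mirrors the strictly sequential behaviour imposed by the sequencer.
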
Example: 2-place buffer

    import [balsa.types.basic]
    import [buffer1a]                  -- reuse component
    public
    type word is 16 bits
    procedure buffer2c (input i : word; output o : word) is
    local channel c : word             -- internal channel connects two 1-place buffers
    begin
      buffer (i, c) || buffer (c, o)   -- parallel composition; buffers connected by
                                       -- common signal name
    end

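The effect of composing two single-place buffers through an internal channel can be modelled with a small state machine. The classes `Buffer1` and `Buffer2` are hypothetical stand-ins, and the model assumes that a communication on channel c happens as soon as the first stage holds a value and the second stage is empty.

```python
class Buffer1:
    """Single-place buffer: holds at most one value."""
    def __init__(self):
        self.slot = None

    def can_accept(self):
        return self.slot is None

    def put(self, value):
        assert self.can_accept()
        self.slot = value

    def take(self):
        value, self.slot = self.slot, None
        return value

class Buffer2:
    """Two Buffer1 stages joined by an internal channel c."""
    def __init__(self):
        self.first, self.second = Buffer1(), Buffer1()

    def _move(self):
        # Communication on c: first stage outputs, second stage inputs.
        if self.first.slot is not None and self.second.can_accept():
            self.second.put(self.first.take())

    def can_accept(self):
        self._move()
        return self.first.can_accept()

    def put(self, value):
        assert self.can_accept()
        self.first.put(value)
        self._move()

    def take(self):
        self._move()
        value = self.second.take()
        self._move()
        return value
```

The composed buffer accepts two values before it blocks and delivers them in order, which is the behaviour expected of a 2-place FIFO.
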
2-place Buffer Handshake Circuit
[Figure: two buffer components (B) under a par component, joined through a passivator on channel c; external channels i and o]
[Figure: the same circuit with each buffer expanded into its repeater (#), sequencer (;), transferrers (T) and variable (x)]

Peephole Optimisation
- Composition of handshake circuits leads to inefficiencies at circuit boundaries
- Straightforward peephole optimisations

Optimized 2-place Buffer Circuit
[Figure: the 2-place buffer after optimisation, with the remaining repeaters (#), sequencers (;), transferrers (T) and variables (x), and a control-only connection]

The Repeater
- “Formal” Definition:
  $\mathrm{REP}(a^{\circ}, b^{\bullet}) = (a^{\circ} : \#[b^{\bullet}])$
  - $\bullet$ denotes active port
  - $\circ$ denotes passive port
  - $\#$ denotes repeat
  - $:$ denotes handshake enclosure
- Expansion into 4-phase signal transitions:
  $\mathrm{REP}(a^{\circ}, b^{\bullet}) = (a^{\circ} : \#[b^{\bullet}]) = (a^{\circ} : \#[b^{\bullet\uparrow} ; b^{\bullet\downarrow}]) = (a_r{\uparrow} : \#[b_r{\uparrow} ; b_a{\uparrow} ; b_r{\downarrow} ; b_a{\downarrow}])$
[Figure: repeater circuit with signals $b_r$, $b_a$, $a_r$, $a_a$]

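The repeater's behaviour can be sketched as a generator of signal transitions. The sketch assumes the usual 4-phase order on the active port b (request up, acknowledge up, request down, acknowledge down) and that the activation on the passive port a is never acknowledged because the repetition does not terminate; the signal names and the `+`/`-` suffixes are notation chosen here.

```python
from itertools import islice

def repeater():
    """Yield the transitions of a 4-phase repeater in order, forever."""
    yield "a_r+"            # the environment activates the repeater
    while True:             # '#': repeat; a_a is never raised
        yield "b_r+"        # request on the active port
        yield "b_a+"        # acknowledge from the activated component
        yield "b_r-"        # return to zero
        yield "b_a-"
```

`islice(repeater(), 9)` gives the activation followed by two complete handshakes on b.
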
The Transferrer
- Several Implementations
  - simplest: wire-only
[Figure: wire-only transferrer with request/acknowledge signals on ports a, b and c, and data[n]]

Balsa Toolkit - 1
- balsa-c
  - The compiler for the language
- breeze2dot
  - Produces a postscript plot of the generated handshake circuits
- breezecost
  - Reports the cost of the compiled circuit in arbitrary units

Balsa Toolkit - 2
- breeze2lard
  - The interface to the LARD simulation environment.
    - balsa source is translated to LARD
    - simple test harness is generated
- balsa-md
  - An automatic makefile generation facility.
- balsa-mgr
  - A GUI project manager

Mod-16 Counter (all even)
[Figure]

Bundled-Data Datapaths
- Problems
  - random standard cell layout
    - mixed control + datapath
  - timing analysis required
  - robustness of design reduced
- Possible Solutions
  - DI codes
  - hybrid bundled + DI
  - simpler timing analysis

DI Codes
- Dual Rail (used in 1st Tangram system)
  - Can use standard cell approach without timing analysis
    - no need to distinguish between control & data
  - abandoned in favour of bundled-data
    - area cost in extra wires
    - area & time cost in completion detection
  - Tangram/Balsa generates push-pull pipelines with expensive synchronization

Generic Pipeline
- Passivators join compiled procedures
[Figure: two compiled blocks (B) joined through a passivator on channel c; external channels i and o]

Passivator Implementation
- Bundled Data
- Dual Rail
[Figure: bundled-data passivator with request/acknowledge signals on ports a and b and data[n]; dual-rail passivator built from C-elements, n bits wide (d0, d1, ..., d n-1), with an n-wide C-gate]

DI Code Synchronizations
- Expensive
  - need C-element synchronisation tree
- A partial solution (not always possible/desirable) is:
  - transform to push-style datapath
    - (not possible in Tangram, only Balsa)

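To see where the cost comes from, the sketch below models a Muller C-element and a tree of 2-input C-elements that joins n signals. The class names and the choice of 2-input elements are assumptions for illustration; a real implementation might use wider gates.

```python
class CElement:
    """Muller C-element: the output follows the inputs when they agree
    and holds its previous value when they differ."""
    def __init__(self):
        self.out = 0

    def update(self, a, b):
        if a == b:
            self.out = a
        return self.out

class CTree:
    """Tree of 2-input C-elements joining n signals (n >= 2)."""
    def __init__(self, n):
        self.levels = []
        while n > 1:
            self.levels.append([CElement() for _ in range(n // 2)])
            n = n // 2 + n % 2          # an odd signal passes to the next level

    def update(self, signals):
        for level in self.levels:
            paired = [c.update(signals[2 * i], signals[2 * i + 1])
                      for i, c in enumerate(level)]
            signals = paired + list(signals[2 * len(level):])
        return signals[0]

    def cost(self):
        """Number of C-elements in the tree."""
        return sum(len(level) for level in self.levels)
```

Joining n signals this way needs n - 1 C-elements and about log2(n) levels of delay, which is the area and time overhead the slide refers to.
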
Push Pipeline
[Figure: two compiled blocks (B) joined on channel c by a wires-only connector at a passive input port; external channels i and o]

Hybrid Solutions
- Use DI coding within bundled datapath framework
  - e.g. use dual-rail carry signals within a conventional adder
    - early completion easily detected
- Average-case performance
- Only applicable to a few datapath operations

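The benefit of early completion in an adder can be estimated with a short experiment. The sketch assumes a ripple-carry structure with a carry-in of 0 and a unit delay per bit stage; a stage whose two input bits are equal produces its carry-out locally, while a stage whose inputs differ must wait for the carry from below. The function names are hypothetical.

```python
import random

def completion_delay(a, b, width):
    """Stages until every carry of a + b is known."""
    worst = delay = 0
    for i in range(width):
        a_i, b_i = (a >> i) & 1, (b >> i) & 1
        # Equal inputs: kill or generate, carry-out known after one stage.
        # Different inputs: propagate, carry-out waits for the stage below.
        delay = 1 if a_i == b_i else delay + 1
        worst = max(worst, delay)
    return worst

def average_delay(width, samples=2000, seed=0):
    """Mean completion delay over random operand pairs."""
    rng = random.Random(seed)
    total = sum(
        completion_delay(rng.getrandbits(width), rng.getrandbits(width), width)
        for _ in range(samples)
    )
    return total / samples
```

The worst case (`completion_delay(0b1111, 0, 4)`) still ripples through every stage, but for random operands the average is far below the width, which is where the average-case performance comes from.
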
Simpler Timing Analysis
- Separate control and datapath
  - generate regular, compiled, datapath
    - area improvement over standard cell (because of regular layout)
    - generate matched delay paths (c.f. self-timed PLAs)
  - must be able to recognize datapath
    - difficult: control often contains datapath-like elements.
    - e.g. start at variables and work backwards...

Datapath meets Control
- Example: Balsa case statement
[Figure: case component; data “n” bits wide; true/complement lines: dual-rail expansion; 1 hot encoding]

Case Component
- input from datapath
  - dual-rail simplifies internal logic
- expansions parameterisable
- “encode” component is similar
  - opposite of case with true/false expansion

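A case component of this kind can be pictured as a true/complement expansion followed by a one-hot decode. The sketch below is a logical model only, with hypothetical function names; it says nothing about the handshake signalling or the gate-level structure that Balsa generates.

```python
def dual_rail_expand(value, width):
    """Return a (true, complement) line pair for each bit, LSB first."""
    return [((value >> i) & 1, 1 - ((value >> i) & 1)) for i in range(width)]

def case_decode(value, width):
    """One-hot decode: output k is the AND of the true line of every bit
    that is set in k and the complement line of every bit that is clear."""
    lines = dual_rail_expand(value, width)
    outputs = []
    for k in range(2 ** width):
        selected = [lines[i][0] if (k >> i) & 1 else lines[i][1]
                    for i in range(width)]
        outputs.append(int(all(selected)))
    return outputs
```

With both polarities of every bit available, each one-hot output is a single AND of existing lines with no inverters, which is one way to read the remark that dual-rail simplifies the internal logic.
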
Simpler Timing Analysis
- Tool support required
  - use existing (non-Balsa) tools if possible
  - automatically add matched paths/delays to synthesised datapaths
- Design own cells where appropriate
  - e.g. hybrid stages

Future Work
- Provide support for DI, hybrid and datapath-compiled datapaths
  - even with datapath compilation, some datapath would still be standard cell
    - e.g. instruction decoder (control heavy)
    - datapath in control
  - cost of connecting separate blocks in layout
- Test Design required (datapath heavy)

Tool Enhancement
- balsa-c
  - support for attribution to select compilation mechanisms / optimisation schemes
- breeze2lard
  - new models
- balsa-netlist:
  - new tech-mapping descriptions
  - interface to datapath compilers

AMULET3i
- Asynchronous macrocell
  - ARM compatible processor core
  - Full custom RAM
  - Compiled ROM
  - Balsa compiled DMA controller
  - Test I/F, synchronous and off-chip bus bridges
- Synchronous peripherals
  - Designed by commercial partner...

AMULET3 System
[Figure: block diagram with CPU / RAM, ROM, DMAC, Periph1, Sync bridge, MARBLE and SOCB]

DMA Local RAM Access
[Figure: the same block diagram, showing DMA access to local RAM]

DMA Peripheral Accesses
[Figure: the same block diagram, showing peripheral accesses and DMA requests]

Requirements / Specification
- 16 clients, 32 channels
- 3 channel types - complicated register structure
- Programmable client-to-channel (1-to-many) mapping
- Support synchronous requests
- Transfers mostly between synchronous clients

Controller Structure
[Figure]

Two Controller Descriptions
- Sequential (previous slides)
  - Very simple control flow
  - Requires two passes through register bank
  - Slow! Only memory decoupling helps
- Parallel (next slides)
  - Decouple TE actions from memory R/W with a new unit: Transfer Interface
  - Interrupt the register bank on end of transfer

“Parallel” Design
[Figure]

The Design
- 919 lines of Balsa describing register bank control, TE and TI.
- Custom register banks and Synchronous Peripheral Interface
- Miscellaneous glue standard cells
  - Register bank controllers
  - MARBLE interfaces
- Compass Design Automation CAD

Implementation Technology
- 0.35 µm, 3LM CMOS
- Standard cells from ARM Ltd.
- Locally designed complex gates and asynchronous elements/gates.
- Automated standard cell P&R
- Only “essential” and simple gate level optimisation (by hand)

Design Partitioning
[Figure, repeated with one region highlighted per slide]
- Marble BUS: outside of DMA controller
- Balsa synthesised standard cells
- Custom “regular” layout
- Hand designed standard cells

DMA Controller Floor-Plan
[Figure]