Download presentation
Presentation is loading. Please wait.
1
Handshake protocols for de-synchronization I. Blunno, J. Cortadella, A. Kondratyev, L. Lavagno, K. Lwin and C. Sotiriou Politecnico di Torino, Italy Universitat Politecnica de Catalunya, Barcelona, Spain Cadence Berkeley Lab, Berkeley, USA ICS-FORTH, Crete, Greece
2
Asynchronous for dummies I. Blunno, J. Cortadella, A. Kondratyev, L. Lavagno, K. Lwin and C. Sotiriou Politecnico di Torino, Italy Universitat Politecnica de Catalunya, Barcelona, Spain Cadence Berkeley Lab, Berkeley, USA ICS-FORTH, Crete, Greece
3
Outline What is de-synchronization ? What is de-synchronization ? Behavioral equivalence Behavioral equivalence 4-phase protocols for de-synchronization 4-phase protocols for de-synchronization Concurrency Concurrency Correctness Correctness An example An example
4
Synchronous CLK Asynchronous De-synchronize CLK
5
MS flip-flop Synchronous circuit CLK LLLL L L 00 0 0 11
6
De-synchronization LLLL L L 00 0 0 11 C C C C C C
7
C C C C C C Distributed controllers substitute the clock network The data path remains intact !
8
Design flow Think synchronous Think synchronous Design synchronous: one clock and edge-triggered flip-flops Design synchronous: one clock and edge-triggered flip-flops De-synchronize (automatically) De-synchronize (automatically) Run it asynchronously Run it asynchronously
9
Prior work Micropipelines (Sutherland, 1989) Micropipelines (Sutherland, 1989) Local generation of clocks Local generation of clocks Varshavsky et al., 1995 Kol and Ginosar, 1996 Theseus Logic (Ligthart et al., 2000) Theseus Logic (Ligthart et al., 2000) Commercial HDL synthesis tools Direct translation and special registers Phased logic (Linder and Harden, 1996) (Reese, Thornton, Traver, 2003) Phased logic (Linder and Harden, 1996) (Reese, Thornton, Traver, 2003) Conceptually similar Different handshake protocol (2 phase vs. 4 phase)
10
Automatic de-synchronization Devise an automatic method for de-synchronization Devise an automatic method for de-synchronization Identify a subclass of synchronous circuits suitable for de-synchronization Identify a subclass of synchronous circuits suitable for de-synchronization Formally prove correctness Formally prove correctness
11
Outline What is de-synchronization ? What is de-synchronization ? Behavioral equivalence Behavioral equivalence 4-phase protocols for de-synchronization 4-phase protocols for de-synchronization Concurrency Concurrency Correctness Correctness An example An example
14
Synchronous flow
25
De-synchronized flow
27
+
38
Flow equivalence [Guernic, Talpin, Lann, 2003]
39
A B
40
Flow equivalence CLK A 1 3 0 2 1 5 3 1 6 0 B 5 1 2 3 1 4 2 4 3 1 A 1 3 0 2 1 5 3 1 6 0 B 5 1 2 3 1 4 2 4 3 1 Synchronous behavior De-synchronized behavior
41
Flow equivalence CLK A 1 3 0 2 1 5 3 1 6 0 B 5 1 2 3 1 4 2 4 3 1 A 1 3 0 2 1 5 3 1 6 0 B 5 1 2 3 1 4 2 4 3 1 Synchronous behavior De-synchronized behavior
42
Outline What is de-synchronization ? What is de-synchronization ? Behavioral equivalence Behavioral equivalence 4-phase protocols for de-synchronization 4-phase protocols for de-synchronization Concurrency Concurrency Correctness Correctness An example An example
43
LLLL L L 00 0 0 11 C C C C C C
44
C C C C C C
45
C L
46
ABCD A+ B- C+ D- A- B+ C- D+ A+ B- C+ D- A- B+ C- D+ A+ B- C+ D- 0000 A latch cannot read another data item until the successor has captured the current one
47
ABCD A+ B- C+ D- A- B+ C- D+ A+ B- C+ D- A- B+ C- D+ A+ B- C+ D- 0100 A latch cannot read another data item until the successor has captured the current one
48
ABCD A+ B- C+ D- A- B+ C- D+ A+ B- C+ D- A- B+ C- D+ A+ B- C+ D- 0000 A latch cannot read another data item until the successor has captured the current one
49
ABCD A+ B- C+ D- A- B+ C- D+ A+ B- C+ D- A- B+ C- D+ A+ B- C+ D- 1000 A latch cannot read another data item until the successor has captured the current one
50
ABCD A+ B- C+ D- A- B+ C- D+ A+ B- C+ D- A- B+ C- D+ A+ B- C+ D- 0000
51
ABCD A- B+ C- D+ A+ B- C+ D- A- B+ C- D+ A+ B- C+ D- 0001
52
ABCD A- B+ C- D+ A+ B- C+ D- A- B+ C- D+ A+ B- C+ D- 0000
53
ABCD A- B+ C- D+ A+ B- C+ D- A- B+ C- D+ A+ B- C+ D- 0010 A latch cannot become opaque before having captured the data item from its predecessor
54
ABCD A+ B- C+ D- A- B+ C- D+ A+ B- C+ D- A- B+ C- D+ A+ B- C+ D- 0110 A latch cannot become opaque before having captured the data item from its predecessor
55
ABCD A+ B- C+ D- A- B+ C- D+ A+ B- C+ D- A- B+ C- D+ A+ B- C+ D- 0010 A latch cannot become opaque before having captured the data item from its predecessor
56
ABCD A+ B- C+ D- A- B+ C- D+ A+ B- C+ D- A- B+ C- D+ A+ B- C+ D- 0000 A latch cannot become opaque before having captured the data item from its predecessor
57
ABCD A+ B- C+ D- A- B+ C- D+ A+ B- C+ D- A- B+ C- D+ A+ B- C+ D- 0000
58
ABCD A+ B+ C+ D+ A- B- C- D- A B
59
Outline What is de-synchronization ? What is de-synchronization ? Behavioral equivalence Behavioral equivalence 4-phase protocols for de-synchronization 4-phase protocols for de-synchronization Concurrency Concurrency Correctness Correctness An example An example
60
A+ B+ A- B- Can we increase concurrency ? A+ B+ A- B- A+ B+ A- B- not flow-equivalent
61
A B A B data overrun A B data lost
62
A+ B+ A- B- Can we reduce concurrency ? How much ?
63
A+ B+ A- B- A+ B+ A- B- A+ B+ A- B- (8 states) (6 states) A+ B+ A- B- (5 states) A+ B+ A- B- A+ B+ A- B- (4 states)
64
A B A B A B A B A B A B fully decoupled (Furber & Day) simple 4-phase semi-decoupled (Furber & Day) non-overlapping GasP, IPCMOS de-synchronizationmodel
65
A+ B+ A- B- A+ B+ A- B- A+ B+ A- B- A+ B+ A- B- A+ B+ A- B- A+ B+ A- B- simple 4-phase non-overlapping semi-decoupled (Furber & Day) fully decoupled (Furber & Day) GasP, IPCMOS de-synchronizationmodel
66
AB cntrl Ri Ai Rx Ax Ro Ao Ri+ A- Rx+ B- Ro+ Ai+ Ax+ Ao+ Ri- A+ Rx- B+ Ro- Ai- Ax- Ao- (semi-decoupled 4-phase protocol)
67
AB cntrl Ri Ai Rx Ax Ro Ao A- B- A+ B+ (semi-decoupled 4-phase protocol)
68
AB cntrl Ri Ai Rx Ax Ro Ao A- B- A+ B+ (semi-decoupled 4-phase protocol)
69
AB cntrl Ri Ai Rx Ax Ro Ao A- B- A+ B+ (semi-decoupled 4-phase protocol)
70
AB cntrl Ri Ai Rx Ax Ro Ao A- B- A+ B+ (semi-decoupled 4-phase protocol)
71
AB cntrl Ri Ai Rx Ax Ro Ao A- B- A+ B+ (semi-decoupled 4-phase protocol)
72
AB cntrl Ri Ai Rx Ax Ro Ao A- B- A+ B+ (semi-decoupled 4-phase protocol)
73
A+ B+ A- B- A+ B+ A- B- A+ B+ A- B- A+ B+ A- B- A+ B+ A- B- A+ B+ A- B-
74
Outline What is de-synchronization ? What is de-synchronization ? Behavioral equivalence Behavioral equivalence 4-phase protocols for de-synchronization 4-phase protocols for de-synchronization Concurrency Concurrency Correctness Correctness An example An example
76
Which protocols are valid for de-synchronization ?
77
A+ B+ A- B- Theorem: the de-synchronization protocol preserves flow-equivalence Proof: by induction on the length of the traces Induction hypothesis: same latch values at reset Induction step: same values at cycle i same values at cycle i+1
78
A+ B+ A- B- A+ B+ A- B- A+ B+ A- B- A+ B+ A- B- A+ B+ A- B- A+ B+ A- B-
79
A+ B+ A- B- A+ B+ A- B- A+ B+ A- B- A+ B+ A- B- A+ B+ A- B- Theorem: any reduction in concurrency preserves flow-equivalence
80
Any hybrid approach preserves flow-equivalence ! Semi-decoupledSemi-decoupled Semi-decoupled non-overlappingnon-overlapping FullydecoupledFullydecoupled
81
ABCD A+ B+ C+ D+ A- B- C- D-
82
ABCD A+ B+ C+ D+ A- B- C- D- semi- decoupled non- overlapping fully decoupled Flow-equivalence is preserved, … but …
83
Liveness Preservation of flow-equivalence: all the generated traces are equivalent Preservation of flow-equivalence: all the generated traces are equivalent Are all traces generated ? (Is the marked graph live ?) Not always ! Are all traces generated ? (Is the marked graph live ?) Not always !
84
A+ B+ C+ D+ A- B- C- D- Liveness: all cycles have at least one token [Commoner 1971] Semi-decoupled 4-phase handshake protocol
85
A+ B+ C+ D+ A- B- C- D- Simple 4-phase handshake protocol
86
Results about liveness At least three latches in a ring are required with only one data token circulating [Muller 1962] At least three latches in a ring are required with only one data token circulating [Muller 1962] (this paper): any hybrid combination of protocols is live if the simple 4-phase protocol is not used any cycle has at least one token Theorem (this paper): any hybrid combination of protocols is live if the simple 4-phase protocol is not used Proof: any cycle has at least one token
87
A+ B+ A- B- A+ B+ A- B- A+ B+ A- B- A+ B+ A- B- A+ B+ A- B- A+ B+ A- B- simple 4-phase non-overlapping semi-decoupled (Furber & Day) fully decoupled (Furber & Day) GasP, IPCMOS de-synchronizationmodel
88
Outline What is de-synchronization ? What is de-synchronization ? Behavioral equivalence Behavioral equivalence 4-phase protocols for de-synchronization 4-phase protocols for de-synchronization Concurrency Concurrency Correctness Correctness An example An example
90
Async DLX block diagram
91
Synchronous RTL = Synchronous Desynchronized Cycle: 4.4ns Power: 70.9mW Area: 372,656 m Cycle: 4.45ns Power: 71.2mW Area: 378,058 m All numbers are after Placement & Routing All numbers are after Placement & Routing Total of 1500 flip-flops, 3000 latches Total of 1500 flip-flops, 3000 latches DE-SYNC design includes 5 controllers, each driving 2 clock trees DE-SYNC design includes 5 controllers, each driving 2 clock trees Power numbers include the clock tree Power numbers include the clock tree Technology: UCM/Virtual Silicon 0.18 µm Technology: UCM/Virtual Silicon 0.18 µm
92
De-synchronized DLX on FPGA (demo outside the conference room)
93
Discussion The de-synchronization model provides an abstraction of the timing behavior The de-synchronization model provides an abstraction of the timing behavior
94
[2,3] [1,2][8,9] [5,7] [3,5] [2,4] ABE F G C D [0,0][3,5] [5,7] [2,3] [2,4] [1,2] [8,9] [1,2] Timing analysis Exploration of the design space
95
A+ B+ A- B- A+ B+ A- B- A+ B+ A- B- A+ B+ A- B- A+ B+ A- B- A+ B+ A- B- simple 4-phase non-overlapping semi-decoupled (Furber & Day) fully decoupled (Furber & Day) GasP, IPCMOS de-synchronizationmodel
96
Conclusions EDA tools require a formal support (they must work for all circuits) EDA tools require a formal support (they must work for all circuits) A complete characterization of 4-phase protocols has been presented (partial order based on concurrency) A complete characterization of 4-phase protocols has been presented (partial order based on concurrency) Design flow developed at Cadence Berkeley Labs Design flow developed at Cadence Berkeley Labs Automated from gate netlist Static timing analysis to derive matched delays Constrained P&R to meet timing constraints
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.