1
1 ® Charles Dike Synchronization Ideas Charles E. Dike Intel Corporation
2
Introduction
Tutorial
Share some ideas about synchronization and metastability
Introduce NEW, IMPROVED theory on metastability
Charles Dike (cdike@ichips.intel.com)
3
Why and where synchronize?
Reduce latency between independent clock domains.
Asynchronous domain to synchronous clock.
Synchronous clock to an independent synchronous clock.
Benefit - higher performance in critical circuits.
[Figure labels: Asynchronous Circuit; Pausable Clock at 1.8 GHz; Synchronous Clock at 3.0 GHz; Synchronous Clock at 1.5 GHz]
4
Design Direction
[Figure: repeated MEM, FPU, ALU blocks; labels: 80s towards 100 MHz, 90s towards 1 GHz, 00s multi-GHz; VALUE ADDED]
5
Chip Area Networks
Late 00s multi-GHz
6
I believe...
We must be able to synchronize all domains to a PLL-controlled clock
Interconnect on chip will be asynchronous (GALS)
We need to minimize latency
There will be two basic synchronizer uses - near neighbor and the chip net
7
Topics of Discussion
Generic synchronizer of the type used in the TeraFlops computer
Simple synchronizer of the type used in StrongArm
The Myrinet pipeline synchronization scheme
Latest understanding of metastability
8
Generic Synchronizer
Handles self-timed to synchronous interfaces and vice versa
Supports synchronous to synchronous interfaces
Can handle streaming data
Adaptable to any speed range
Possibly used over the chip network
9
Two flop synch
[Figure: two D flip-flops, #1 and #2; signals: CLK, VALID]
10
Single latch synch
[Figure: an SR latch and two pairs of D flip-flops; signals: CLK1, CLK2, REQ, ACK, Write Valid, Read Valid; waveforms: LATCH OUTPUT, RECEIVER CLOCK, SENDER CLOCK]
11
Multi latch synch
[Figure: two stages, each with an SR latch and two pairs of D flip-flops; signals per stage: CLK1, CLK2, REQ, ACK, Write Valid, Read Valid]
12
General Case
[Figure: WRITE POINTER, READ POINTER, and STATUS REGISTER shown as 10-bit patterns (1000000000, 0000010000, 1111100000); SYNCHRONIZERS, LATENCY PADDING, EMPTY, FULL, SYNC, EN; signals: Write Clock, Write Enable, Read Clock]
13
empty case
[Figure: WRITE POINTER, READ POINTER, STATUS REGISTER, EMPTY, SYNCHRONIZER; resettable D flip-flops (D, Q, R, EN); signals: Write Pointer a, Write Pointer b, Read Pointer a, Read Pointer b, Read Clock, Write Clock, Write Enable]
14
General Case
[Figure: WRITE POINTER, READ POINTER, and STATUS REGISTER shown as 10-bit patterns (1000000000, 0000010000, 1111100000); SYNCHRONIZERS, LATENCY PADDING, EMPTY, FULL, SYNC, EN; signals: Write Clock, Write Enable, Read Clock]
15
Topics of Discussion
Generic synchronizer of the type used in the TeraFlops computer
Simple synchronizer of the type used in StrongArm processor
The Myrinet pipeline synchronization scheme
Latest understanding of metastability
16
Simple Synchronizer
Constrained by frequency ratio
Supports synchronous to synchronous interfaces
Does it support asynch to synch? Yes, with restrictions.
Possibly used in local neighbor synchronizers
17
Simple Synchronizer
[Figure: four D flip-flops and a divide by 2; signals: SLOW CLK, FAST CLK, SYNC; nodes A, A1, A2, A3; labels w, x, y, z; MI* = Metastable Immune]
18
timing1
[Figure: simple synchronizer (four D flip-flops, divide by 2, SLOW, FAST, SYNC, MI*; nodes A, A1, A2, A3) with waveforms over ticks 1 to 6: FAST CLOCK, SLOW CLOCK, A, A1, A2, A3, SYNC]
19
timing2
[Figure: simple synchronizer (four D flip-flops, divide by 2, SLOW, FAST, SYNC, MI*; nodes A, A1, A2, A3) with waveforms over ticks 1 to 6: FAST CLOCK, SYNC, SLOW CLOCK, CHEATER CLOCK]
20
timing3
[Figure: simple synchronizer (four D flip-flops, divide by 2, SLOW, FAST, SYNC, MI*; nodes A, A1, A2, A3) with waveforms over ticks 1 to 6: FAST CLOCK, SYNC, SLOW CLOCK, CHEATER CLOCK]
21
timing4
[Figure: synchronizer variants built from D flip-flop chains (divide by 2, SLOW, FAST, SYNC, MI*; nodes A, A1, A2, A3) with waveforms over ticks 1 to 6: FAST CLOCK, SYNC, SLOW CLOCK, SLOW CLOCK#]
22
transfers
[Figure: waveforms over ticks 1 to 6: FAST CLOCK, SYNC, SLOW CLOCK, CHEATER CLOCK; two circuits, FAST TO SLOW TRANSFER and SLOW TO FAST TRANSFER, each with two D flip-flops and signals SYNC, FAST CLOCK, SLOW CLOCK]
23
Topics of Discussion
Generic synchronizer of the type used in the TeraFlops computer
Simple synchronizer of the type used in StrongArm
The Myrinet pipeline synchronization scheme
Latest understanding of metastability
24
Pipeline Synchronizer
Supports synchronous to synchronous interfaces
Supports asynch to synch and vice versa
Possibly used in local neighbor synchronizers
Essentially a distributed FIFO and synchronizer
25
Pipeline Synchronizer
[Figure: three stages, each labeled S with ports Ri, Ai, Di, Ro, Ao, Do]
26
ME element
[Figure: ME element; signals: R0, R1, A0, A1, S, X, REQ]
27
Fifo element
[Figure: FIFO element with ports Ri, Ai, Di, Ro, Ao, Do, built from blocks labeled C on the Ri/Ai/Ro/Ao handshake, plus a Data path]
28
Async to sync
[Figure: three stages, each labeled S with ports Ri, Ai, Di, Ro, Ao, Do; the two sides are labeled Synchronous and Asynchronous]
29
Sync to async
[Figure: three stages with ports Ri, Ai, Di, Ro, Ao, Do and three blocks labeled S; the two sides are labeled Synchronous and Asynchronous]
30
Points to ponder #1
All synchronizing interfaces have one thing in common - a latching element that holds data while metastabilities are being resolved.
There is no way to avoid the latency which is required to resolve metastabilities.
To minimize latency the latching element characteristics can be improved.
We will be required to understand and use this knowledge.
This is the future of digital design.
31
Topics of Discussion
Generic synchronizer of the type used in the TeraFlops computer
Simple synchronizer of the type used in StrongArm
The Myrinet pipeline synchronization scheme
Latest understanding of metastability
32
Role of the Synchronizing Flop
Reorients incoming information to a clock edge
Its performance determines system failure rate or latency
33
Real Life
There is no magic bullet
There is a lot of misinformation on metastability around
To date many circuits have been overdesigned through planning and luck
Whenever a circuit fails because the frequency is too high, the ultimate cause of failure is metastability
There is no way to synchronize a signal faster than about the time it takes to pass a signal through six static gates
34
Metastability is....
[Figure: latch with SET and RESET inputs, OUT, NODE A, NODE B]
35
Technical terms
T_w (window size) - likelihood of entering a metastable state - in units of time
Tau (τ) - rate at which metastability resolves - in units of time
MTBF (Mean Time Between Failures)
$\mathrm{MTBF} = \dfrac{e^{t/\tau}}{T_w f_d f_c}$
Thermal noise: $V_n^2 = 4kT/C$
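As a rough illustration of the relation MTBF = e^(t/τ) / (T_w · f_d · f_c), the sketch below evaluates it in Python. The slide does not spell out f_d and f_c; they are assumed here to be the data transition rate and the sampling clock frequency. All numeric values are illustrative and are not measurements from the talk.

```python
import math

def mtbf_seconds(t_resolve, tau, t_w, f_data, f_clock):
    """Basic synchronizer model: MTBF = exp(t / tau) / (T_w * f_d * f_c)."""
    return math.exp(t_resolve / tau) / (t_w * f_data * f_clock)

# Illustrative values only (assumptions, not data from the slides).
TAU = 20e-12      # resolution time constant, seconds
T_W = 15e-12      # metastability window, seconds
F_CLOCK = 1.5e9   # sampling clock frequency, Hz
F_DATA = 150e6    # rate of asynchronous data transitions, Hz

for n_tau in (10, 20, 30):
    # Each extra tau of resolution time multiplies the MTBF by e.
    result = mtbf_seconds(n_tau * TAU, TAU, T_W, F_DATA, F_CLOCK)
    print(f"t = {n_tau} tau -> MTBF = {result:.3e} s")
```

The exponential term dominates: the window and the two frequencies only scale the result linearly, while each additional τ of waiting time improves it by a factor of e.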
36
Simple jamb latch
[Figure: jamb latch with DATA, CLOCK, and RESET inputs, OUT, NODE A, NODE B; plot axis: propagation delay time of data after clock]
37
Simple jamb latch
[Figure: jamb latch with DATA, CLOCK, and RESET inputs, OUT, NODE A, NODE B; plot axis: propagation delay time of data after clock; annotation: ~RC time constant]
38
Rough Histogram
[Figure: two histograms of propagation delay time of data after clock, the second on a log scale; labels: T_w; the slope is the τ]
$\mathrm{MTBF} = \dfrac{e^{t/\tau}}{T_w f_d f_c}$
39
Why is the theory a problem?
It assumes a uniform distribution of data about the clock
– What happens when data always violates the setup/hold window?
It is not detailed enough
– Doesn’t consider a deterministic region
– Doesn’t account for thermal noise
People tend to extrapolate the theory improperly
$\mathrm{MTBF} = \dfrac{e^{t/\tau}}{T_w f_d f_c}$
40
Overview of refined theory
Not everything past a normal propagation is a metastable event
The T_w window can’t be improved by input edge rates
T_w has a complex relationship to τ based on load
The MTBF formula needs to be modified due to non-uniform distribution of data about the clock input
41
Schematic
42
Simulation of a typical latching device
43
Test case
[Figure: test setup with a resettable D flip-flop (D, Q, R), PC, two DELAY blocks, PULSE GENERATOR #1, PULSE GENERATOR #2, TRIGGER INPUT, and a TEK 11801-B OSCILLOSCOPE]
44
Measuring real data
[Figure: measured data; label: advancing time]
45
Histogram
[Figure: histogram against time; labels: inflection point, 0.6 mV/0.1 ps]
46
Histogram
[Figure: histogram against time; labels: inflection point, 0.6 mV/0.1 ps]
47
Measured versus Basic
[Figure: measured data against the basic model; axis: propagation delay time of data after clock (log scale); labels: T_w, the slope is the τ, propagation delay, 0.6 mV/0.1 ps]
$\mathrm{MTBF} = \dfrac{e^{t/\tau}}{T_w f_d f_c}$
48
Simulated....
[Figure: simulation circuit with a Voltage Controlled Switch (R1 = 100 Ω or R1 = 100 MΩ) and a Battery]
49
Tau Simulated
$\tau = \dfrac{|t_1 - t_2|}{\ln(V_2 / V_1)}$
Where: V1 = voltage at time t1, V2 = voltage at time t2
[Figure: latch outputs at nodes 1 and 2, 0.0 to 1.5 volts over 1.0 to 1.4 ns; semilog difference between latch outputs, 10^-6 to 10^0 volts over 1.0 to 1.4 ns, with t1 and t2 marked]
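The two-point estimate τ = |t1 - t2| / ln(V2/V1) can be sketched in Python as below. This is a minimal illustration on synthetic data: the waveform, `TRUE_TAU`, and `V0` are assumptions made up for the example, not values from the simulation in the talk. The absolute value of the logarithm is used so that the order of the two sample points does not matter.

```python
import math

def tau_from_two_points(t1, v1, t2, v2):
    """Estimate tau from two samples of an exponentially diverging voltage."""
    return abs(t1 - t2) / abs(math.log(v2 / v1))

# Synthetic stand-in for the difference between the two latch outputs:
# V(t) = V0 * exp(t / tau). Both constants are assumed for illustration.
TRUE_TAU = 20e-12   # seconds
V0 = 1e-6           # volts

def v_diff(t):
    return V0 * math.exp(t / TRUE_TAU)

t1, t2 = 50e-12, 150e-12
estimate = tau_from_two_points(t1, v_diff(t1), t2, v_diff(t2))
print(f"estimated tau = {estimate * 1e12:.2f} ps")
```

On a semilog plot the same quantity is read from the straight-line portion of the difference curve, which is why the two sample times should both fall inside that region.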
50
$V_n^2 = 4kT/C = 4kTBR$
k = 1.38 x 10^-23 J/K
B = 1/τ = 5 x 10^10 Hz
R = ~400 Ω
T = 300 K
τ = 20 picoseconds
V_n = ~0.6 mV
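The arithmetic on this slide can be checked with a few lines of Python. The sketch assumes τ = RC, which is what makes 4kTBR with B = 1/τ equal to 4kT/C; the variable names are chosen for the example.

```python
import math

K_BOLTZMANN = 1.38e-23   # J/K, value used on the slide
TEMPERATURE = 300.0      # kelvin
TAU = 20e-12             # seconds
RESISTANCE = 400.0       # ohms (approximate value from the slide)

bandwidth = 1.0 / TAU             # B = 1/tau = 5e10 Hz
capacitance = TAU / RESISTANCE    # assumes tau = R * C

# Two equivalent forms of the mean-square noise voltage.
v_n_from_bandwidth = math.sqrt(4 * K_BOLTZMANN * TEMPERATURE * bandwidth * RESISTANCE)
v_n_from_capacitance = math.sqrt(4 * K_BOLTZMANN * TEMPERATURE / capacitance)

print(f"V_n = {v_n_from_bandwidth * 1e3:.3f} mV")
```

Both forms give a noise voltage a little under 0.6 mV, consistent with the rounded value quoted on the slide.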
51
Putting it all together
[Figure A: horizontal axis -50 to 250 (picoseconds); vertical log axis 0.18 fs to 1.80 ns; region labeled: normal]
52
Putting it all together
[Figure B: horizontal axis -50 to 250 (picoseconds); vertical log axis 0.18 fs to 1.80 ns; regions labeled: deterministic, ?]
53
Putting it all together
[Figure C: horizontal axis -50 to 250 (picoseconds); vertical log axis 0.18 fs to 1.80 ns, with a second scale from 1.80 V down to 1.80 µV; labels: deterministic, thermal noise point]
54
Putting it all together
[Figure D: horizontal axis -50 to 250 (picoseconds); vertical log axis 0.18 fs to 1.80 ns; labels: deterministic, true metastability, T=19 ps]
55
Putting it all together
[Figure E: horizontal axis -50 to 250 (picoseconds); vertical log axis 0.18 fs to 1.80 ns; labels: deterministic, true metastability, T_w = 15 ps, T=19 ps]
56
Worst case: $\mathrm{MTBF} = \dfrac{e^{(t - \mathrm{deter})/\tau}}{T_w f_d f_c}$
Simple case: $\mathrm{MTBF} = \dfrac{e^{t/\tau}}{T_w f_d f_c}$
Expected: $\mathrm{MTBF} = \dfrac{e^{(t - 0.5 \cdot \mathrm{deter})/\tau}}{T_w f_d f_c}$
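To see how much the deterministic region costs, the three MTBF variants on this slide (simple, expected, worst case) can be compared numerically. The sketch below is illustrative only: the `weight` parameter is a convenience introduced here to select the case (0, 0.5, or 1 times `deter`), and every numeric value, including the 100 ps deterministic region, is an assumption rather than a figure from the talk.

```python
import math

def mtbf(t, tau, t_w, f_d, f_c, deter=0.0, weight=0.0):
    """MTBF = exp((t - weight * deter) / tau) / (T_w * f_d * f_c)."""
    return math.exp((t - weight * deter) / tau) / (t_w * f_d * f_c)

# Illustrative values only (assumptions, not data from the slides).
TAU = 20e-12       # seconds
T_W = 15e-12       # seconds
F_D = 150e6        # data transition rate, Hz
F_C = 1.5e9        # clock frequency, Hz
DETER = 100e-12    # hypothetical width of the deterministic region, seconds
T = 35 * TAU       # total time allowed for resolution

simple = mtbf(T, TAU, T_W, F_D, F_C)
expected = mtbf(T, TAU, T_W, F_D, F_C, deter=DETER, weight=0.5)
worst = mtbf(T, TAU, T_W, F_D, F_C, deter=DETER, weight=1.0)

for name, value in (("simple", simple), ("expected", expected), ("worst", worst)):
    print(f"{name:8s} MTBF = {value:.3e} s")
```

Because the deterministic time is subtracted inside the exponent, the worst case is lower than the simple case by a factor of e^(deter/τ), so the penalty grows quickly when the deterministic region is several τ wide.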
57
Points to ponder #2
Jakov Seizovic postulated a “malicious” asynchronous signal: no matter how we position the sampling window, and no matter how small we make the sampling window, the asynchronous transition will appear in that window.
This case has to be assumed when interfacing to a signal of unknown probability distribution.
We know something about just how malicious a signal can be.
58
Exploring
59
Worst case bound
60
Not worst case bound
[Figure labels: < 0.1 ps; uniform distribution; 12 ps jitter]
61
Final comments
With the proper synchronizing device it may be possible to synchronize a signal within a single clock cycle. The constraints are:
– You require about 35 τ in order to get the MTBF out to about 1 century.
– Each typical static gate delay is equivalent to about 5 τ in a properly designed synchronizing flop.
– The metastability MTBF of a device should probably be an order of magnitude better than the mechanical MTBF.
– You must assume a ‘malicious’ input to the synchronizer. Nevertheless, this only adds about 5 τ to the delay.
– Standard flop designs are generally very poor synchronizers. Use a jamb structure. It has the best transconductance.
– You should never require more than two synchronizing flops in series.
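Taking the resolution-time budget on this slide in units of τ, the basic MTBF formula can be inverted to estimate how many τ a target MTBF requires: n = t/τ = ln(MTBF · T_w · f_d · f_c). The sketch below does this for a one-century target. The window and the two rates are assumed values chosen for illustration; the 5 τ per gate delay and the extra 5 τ for a malicious input are the rules of thumb stated above.

```python
import math

SECONDS_PER_CENTURY = 100 * 365.25 * 24 * 3600

def taus_required(target_mtbf_s, t_w, f_d, f_c):
    """Invert MTBF = exp(n) / (T_w * f_d * f_c) for n = t / tau."""
    return math.log(target_mtbf_s * t_w * f_d * f_c)

# Illustrative values only (assumptions, not data from the slides).
T_W = 15e-12    # metastability window, seconds
F_D = 100e6     # data transition rate, Hz
F_C = 1e9       # clock frequency, Hz

TAUS_PER_GATE_DELAY = 5.0    # rule of thumb from the slide
MALICIOUS_EXTRA_TAUS = 5.0   # rule of thumb from the slide

n = taus_required(SECONDS_PER_CENTURY, T_W, F_D, F_C)
n_malicious = n + MALICIOUS_EXTRA_TAUS

print(f"tau needed for 1 century: {n:.1f}")
print(f"with malicious input:     {n_malicious:.1f}")
print(f"equivalent gate delays:   {n_malicious / TAUS_PER_GATE_DELAY:.1f}")
```

With these assumed rates the requirement comes out in the mid-30s of τ, the same order as the figure quoted in the list. Because the dependence is logarithmic, a tenfold change in either rate moves the answer by only about 2.3 τ.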
62
Conclusion
There are several ways to communicate between independent domains
I believe more asynchronous domains will appear that are embedded within synchronous designs
– Latency must be reduced to maximize the use of asynchronous designs.
– This is a burden that asynch designers must bear
– We need to know the limitations of synchronization and metastability
Chip area networks are coming and they will open up opportunities for asynchronous design
63
References
T. Sakurai, “Optimization of CMOS Arbiter and Synchronizer Circuits with Submicrometer MOSFET’s,” IEEE J. Solid State Circuits, vol. 23, no. 4, pp. 901-906, Aug 1988.
L. Kleeman and A. Cantoni, “Metastable Behavior in Digital Systems,” IEEE Design & Test of Computers, pp. 4-19, Dec 1987.
I. E. Sutherland, “Micropipelines,” Turing Award Lecture, Communications of the ACM, 32(6), pp. 720-738, 1989.
J. N. Seizovic, “Pipeline Synchronization,” Proc. Int’l Symp. Advanced Research in Asynchronous Circuits and Systems, CS Press, 1994.
C. Dike and E. Burton, “Miller and Noise Effects in a Synchronizing Flip-Flop,” IEEE J. Solid State Circuits, vol. 34, no. 6, pp. 849-855, June 1999.
A. Van der Ziel, Noise in Measurements. New York: Wiley, 1976.
64
Overview of present theory
Everything past a normal propagation is considered a metastable event
A deterministic region doesn’t exist
T_w has no fixed relationship to τ
The MTBF formula assumes a uniform distribution of data about the clock input
$\mathrm{MTBF} = \dfrac{e^{t/\tau}}{T_w f_d f_c}$