Low Power Passive Equalizer Design for Computer-Memory Links


1 Low Power Passive Equalizer Design for Computer-Memory Links
Ling Zhang (1), Wenjian Yu (2), Yulei Zhang (1), Renshen Wang (1), Alina Deutsch (3), George A. Katopis (3), Daniel M. Dreps (3), James Buckwalter (1), Ernest Kuh (4), Chung-Kuan Cheng (1)
(1) UC San Diego, (2) Tsinghua Univ., (3) IBM Research Labs, (4) UC Berkeley

2 Outline
-Introduction
-The CPU-memory links in IBM P6 system
-Equalization structures and schemes
-Simulated annealing optimization flow
-Experimental results
-Conclusion

3 Introduction
-On-chip interconnects have Gbps data rates
  -M. Hashimoto et al., 2004: 40Gbps (simulation), 10mm, 45nm
  -M.P. Flynn et al., 2005: 14Gbps, 7.2mm, 180nm
-Package-level interconnects need improvements
  -High bandwidth: reducing inter-symbol interference (ISI)
  -Low power: using passive components
-Alleviate ISI: equalization
  -H.A. Affel, 1924: equalization of carrier transmissions
  -H.W. Bode, 1936: attenuation equalizer
  -E. Kuh, 1959: constant-R ladder
  -R. Sun et al., 2005: adaptive passive T-junction equalizer

4 IBM P6 system
-IBM introduced the P6 system in 2007.
-Dual-core microprocessor
-65nm SOI process
-Both speed and power are important for the P6 I/O circuitry and interconnect.
  -For high performance applications, it operates at over 5GHz.
  -For low power applications, it consumes less than 100W.

5 Structure of the CPU-Memory link in IBM P6 system
-The channel is a 20-inch long differential pair, operating at 6.4GHz.
-The model takes all the fan-out, connector, and via array discontinuities into account.

6 Eye diagram at each port (simulation)
INPUT TXPKG RXPKG OUTPUT

7 Our contributions
-We propose a set of passive equalizer schemes.
-We employ the schemes on the CPU-Memory link of the IBM P6 system and observe significant performance improvement with little power overhead.
-We compare and analyze the results of the different schemes.
-We demonstrate that the equalization approach is not sensitive to variations and crosstalk.

8 Equalization structures
(a) T-junction, (b) parallel RC, (c) series RL
-T-junction and RC can be applied at both driver and receiver sides; RL is used only at the receiver end.
-T-junction can be implemented on-chip or on-package.
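To illustrate why a parallel RC structure acts as an equalizer, the sketch below evaluates the gain of a parallel RC network placed in series with an ideal resistive load. This is a simplified model, not the channel model from the slides: the load value `z0_ohm`, the component values, and the function name `parallel_rc_gain` are all assumptions chosen for illustration.

```python
import math

def parallel_rc_gain(freq_hz, r_ohm, c_farad, z0_ohm=50.0):
    """Return |H(f)| for a parallel RC network in series with an ideal load.

    Assumed model (not from the slides): H = Z0 / (Z0 + Z_rc),
    where Z_rc = R / (1 + j*w*R*C) is R in parallel with C.
    """
    omega = 2.0 * math.pi * freq_hz
    z_rc = r_ohm / (1.0 + 1j * omega * r_ohm * c_farad)
    return abs(z0_ohm / (z0_ohm + z_rc))

# Illustrative component values only.
dc_gain = parallel_rc_gain(0.0, r_ohm=50.0, c_farad=2e-12)   # R fully in the path
hf_gain = parallel_rc_gain(50e9, r_ohm=50.0, c_farad=2e-12)  # C bypasses R

print(f"DC gain: {dc_gain:.3f}, high-frequency gain: {hf_gain:.3f}")
```

In this idealized model the network attenuates low frequencies by Z0 / (Z0 + R) and passes high frequencies almost unchanged, which is the high-pass shaping that counteracts the low-pass behavior of a lossy channel.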

9 The settings we use for each equalization component
Label   Component
S       RL                    NA    Infinity
P       RC                    10 ohm
Tmc     On-chip T-junction    Z0
Tmp     Off-chip T-junction
Tuc
Tup
M       No equalizer

Four different groups of equalization schemes we studied
          Driver                     Receiver
Group1    Match: M, Tmc, Tmp         Match: M, Tmc, Tmp
Group2    Un-match: P, Tuc, Tup      Match: M, Tmc, Tmp
Group3    Match: M, Tmc, Tmp         Un-match: P, Tuc, Tup, S
Group4    Un-match: P, Tuc, Tup      Un-match: P, Tuc, Tup, S

10 Optimization Flow Variables: Objective function (minimize): Constraints:
Simulated annealing method is used to find the optimal solution.
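The slide's variables, objective function, and constraints did not survive in the transcript, so the following is only a generic sketch of simulated annealing over bounded parameters. The cost function `toy_cost` is a placeholder (in the actual flow it would come from a circuit simulation of the link), the lower bounds of zero are assumed, and the cooling schedule and step size are illustrative choices rather than the authors' settings.

```python
import math
import random

def simulated_annealing(cost, bounds, steps=5000, t_start=1.0, t_end=1e-3, seed=0):
    """Minimize cost(x) over box-bounded variables with simulated annealing."""
    rng = random.Random(seed)
    x = [rng.uniform(lo, hi) for lo, hi in bounds]
    fx = cost(x)
    best, fbest = list(x), fx
    for k in range(steps):
        # Geometric cooling from t_start down to t_end.
        t = t_start * (t_end / t_start) ** (k / (steps - 1))
        # Perturb one randomly chosen variable and clamp it to its bounds.
        i = rng.randrange(len(x))
        lo, hi = bounds[i]
        cand = list(x)
        cand[i] = min(hi, max(lo, x[i] + rng.gauss(0.0, 0.1 * (hi - lo))))
        fc = cost(cand)
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if fc <= fx or rng.random() < math.exp((fx - fc) / t):
            x, fx = cand, fc
            if fx < fbest:
                best, fbest = list(x), fx
    return best, fbest

def toy_cost(params):
    """Placeholder cost with a known minimum; stands in for a simulated metric."""
    r_ohm, c_pf = params
    return ((r_ohm - 120.0) / 500.0) ** 2 + ((c_pf - 30.0) / 100.0) ** 2

# Upper bounds follow Experiment 1 (R <= 500 ohm, C <= 100 pF); lower bounds assumed.
bounds = [(0.0, 500.0), (0.0, 100.0)]
best, fbest = simulated_annealing(toy_cost, bounds)
print(best, fbest)
```

Accepting occasional uphill moves at high temperature lets the search escape local minima, which matters when the cost comes from eye-diagram metrics that are not convex in the component values.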

11 Experiment 1
-For all possible schemes:
  -Apply the optimization flow to find the optimal solution.
  -Upper bound of R=500 ohm
  -Upper bound of C=100pF
  -Upper bound of L=100nH
-Compare the results.

12
1. M+M: no eye
2. T+M, M+T: Veye is V, jitter is 22-23ps.
3. T+T: Veye is V, jitter is 12-16ps.

13 Summary of Group 1 Using Tmc or Tmp at both sides is better than using at one side only. Tmc and Tmp are equivalent when used at driver. At receiver, Tmp has smaller jitter than Tmc.

14 Transfer functions of Group 1
Curves: T**+Tmp, M+M, T**+M, M+T**, T**+Tmc

15 Step responses of Group 1
Curves: T**+Tmp, M+M, T**+M, M+T**, T**+Tmc

16

17 Summary of Group 2
Driver side:
-Tup is better than Tuc: larger eye and smaller jitter.
-Tuc is better than P: larger eye and smaller jitter.
Receiver side:
-M: largest jitter; Tup+M has the largest eye.
-Tmp: lowest cost function and lowest jitter. Veye is slightly smaller than with Tmc.

18 Transfer functions of Group 2

19 Transfer functions of Group 2

20 Step responses of Group 2

21 Step responses of Group 2

22

23 Summary of Group 3
Driver side:
-Tmc is the same as Tmp.
-M is worse than T: smaller eye opening and larger jitter.
Receiver side:
-P has the smallest eye opening.
-Tuc has the largest eye opening.
-Tup has the lowest jitter.
-S has the largest jitter.

24 Transfer functions of Group 3
Curves: M+S, Tmp+S, M+P, Tmc+S, Tmp+Tuc, Tmc+Tuc, M+Tup, Tmp+P, Tmc+P, Tmc+Tup, Tmp+Tup, M+Tuc

25 Transfer functions of Group 3
Curves: Tmc+S, Tmc+Tuc, Tmc+P, Tmc+Tup, Tmp+S, Tmp+Tuc, Tmp+P, Tmp+Tup, M+S, M+P, M+Tup, M+Tuc

26 Step responses of Group 3
Curves: Tmc+S, M+Tup, Tmp+Tuc, Tmp+S, Tmp+P, Tmc+P, Tmc+Tuc, Tmc+Tup, Tmp+Tup, M+S, M+P, M+Tuc

27 Step responses of Group 3
Curves: Tmc+S, Tmc+Tuc, Tmc+P, Tmc+Tup, Tmp+S, Tmp+Tuc, Tmp+P, Tmp+Tup, M+S, M+P, M+Tup, M+Tuc

28
-At the driver side, Tup is better than Tuc, and Tuc is better than P.
-At the receiver side, Tup is similar to Tuc; S has a larger eye and larger jitter.

29 Summary of Group 4
Driver side:
-Tup is better than Tuc: larger Veye and smaller jitter.
-P has the largest jitter. P+Tuc has the largest Veye, but the other schemes with P at the driver have the smallest Veye.
Receiver side:
-P has the smallest eye opening.
-S has the largest jitter.
-Tuc+Tup, Tup+Tup, Tuc+Tuc and Tup+Tuc are very similar.

30 Transfer functions of Group 4
Curves: P+S, P+P, Tuc+S, Tup+S, Tuc+Tuc, Tuc+Tup, Tup+Tuc, Tup+Tup, Tuc+P, P+Tuc, Tup+P, P+Tup

31 Transfer functions of Group 4
Curves: Tuc+S, Tuc+Tuc, Tuc+Tup, Tuc+P, Tup+S, Tup+Tuc, Tup+Tup, Tup+P, P+S, P+P, P+Tuc, P+Tup

32 Step responses of Group 4
Curves: P+S, P+P, Tuc+S, Tup+S, Tuc+Tuc, Tuc+Tup, Tup+Tuc, Tup+Tup, Tuc+P, P+Tuc, Tup+P, P+Tup

33 Step responses of Group 4
Curves: Tuc+S, Tuc+Tuc, Tuc+Tup, Tuc+P, Tup+S, Tup+Tuc, Tup+Tup, Tup+P, P+S, P+P, P+Tuc, P+Tup

34 Summary of experiment 1
-Schemes in Group 1 have lower jitter because of matching.
-Schemes in Group 4 have larger eye-opening due to reflections.
-When used at the receiver end, structure Tmc has slightly lower jitter than Tmp, and structure Tuc is very similar to Tup.
-When used at the receiver end, structure P has smaller eye-opening, while S has larger jitter.
-When used at the driver side, structure Tmc is very similar to Tmp, and structure Tup is slightly better than Tuc, with larger eye-opening and smaller jitter.

35 Experiment 2
-Equivalent/similar schemes are merged.
-Apply the optimization flow on each scheme.
-Size limits on L, C are enforced:
  -Upper bound of L=5nH
  -Upper bound of C=15pF
-Compare the results: power and eye diagram.

36 Choose representative schemes
-G1: M+Tmc (smallest jitter)

37 Choose representative schemes:
-M+P: (lowest power)
-P+Tuc: (largest eye-opening)
-Tup+S: (smallest cost function)

38 Eye diagrams at output
(a) Tup+S: Veye=0.37V, Jitter=19.3ps
(b) M+Tmc: Veye=0.19V, Jitter=18.9ps
(c) M+P: Veye=0.23V, Jitter=24.5ps
(d) P+Tuc: Veye=0.39V, Jitter=26.0ps
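As a quick cross-check of the labels attached to the representative schemes, the snippet below ranks the four reported output eye measurements. The numbers are the ones quoted on this slide; the dictionary layout and variable names are just an illustrative choice.

```python
# (Veye in volts, jitter in picoseconds) as reported for the output eye diagrams.
results = {
    "Tup+S": (0.37, 19.3),
    "M+Tmc": (0.19, 18.9),
    "M+P": (0.23, 24.5),
    "P+Tuc": (0.39, 26.0),
}

# Scheme with the widest vertical eye opening.
largest_eye = max(results, key=lambda scheme: results[scheme][0])
# Scheme with the least timing jitter.
smallest_jitter = min(results, key=lambda scheme: results[scheme][1])

print("Largest eye:", largest_eye)
print("Smallest jitter:", smallest_jitter)
```

The ranking agrees with the earlier labels: P+Tuc gives the largest eye-opening and M+Tmc the smallest jitter, while Tup+S sits close to the best value on both metrics.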

39 Transfer functions of selected schemes
Curves: M+M, P+Tuc, Tup+S, M+P, M+Tmc

40 Step responses of selected schemes
Curves: M+M, P+Tuc, M+P, Tup+S, M+Tmc

41 Step responses of selected schemes
Curves: M+M, P+Tuc, M+P, Tup+S, M+Tmc

42 Eye diagrams at input (a) Tup+S (b) M+Tmc (c) M+P (d) P+Tuc

43 Eye diagrams at TXPKG (a) Tup+S (b) M+Tmc (c) M+P (d) P+Tuc

44 Eye diagrams at RXPKG (a) Tup+S (b) M+Tmc (c) M+P (d) P+Tuc

45 Input impedances of selected schemes
Curves: M+Tmc, M+M, P+Tuc, Tup+S
-M+P has high Zin, hence low power.
-Tup+S has low Zin, hence high power.

46 Sensitivity comparison of selected schemes
Parameters are perturbed by

47 Eye diagrams at output with crosstalk
(a) Tup+S: Veye=0.33V, Jitter=21.4ps
(b) M+Tmc: Veye=0.19V, Jitter=19.4ps
(c) M+P: Veye=0.24V, Jitter=22.0ps
(d) P+Tuc: Veye=0.38V, Jitter=26.2ps
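To make the crosstalk comparison concrete, the following snippet computes how much each scheme's eye opening and jitter change between the output eye diagrams reported without crosstalk (slide 38) and with crosstalk (this slide). The values are copied from those two slides; the rounding and data layout are illustrative choices.

```python
# (Veye in volts, jitter in picoseconds) from the two output eye-diagram slides.
without_xtalk = {
    "Tup+S": (0.37, 19.3),
    "M+Tmc": (0.19, 18.9),
    "M+P": (0.23, 24.5),
    "P+Tuc": (0.39, 26.0),
}
with_xtalk = {
    "Tup+S": (0.33, 21.4),
    "M+Tmc": (0.19, 19.4),
    "M+P": (0.24, 22.0),
    "P+Tuc": (0.38, 26.2),
}

# Change caused by crosstalk, rounded to the precision of the reported data.
delta = {
    scheme: (
        round(with_xtalk[scheme][0] - without_xtalk[scheme][0], 2),
        round(with_xtalk[scheme][1] - without_xtalk[scheme][1], 1),
    )
    for scheme in without_xtalk
}

for scheme, (d_veye, d_jitter) in delta.items():
    print(f"{scheme}: dVeye={d_veye:+.2f} V, dJitter={d_jitter:+.1f} ps")
```

The largest reported change in eye opening is 0.04V (Tup+S), and no scheme's jitter shifts by more than 2.5ps in either direction.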

48 Conclusion
-Simple and effective passive equalizer schemes are proposed.
-An SA flow is used to optimize the equalizer parameters.
-For the CPU-Memory link of the IBM P6 system:
  -Without equalizer: eye is closed, power is 7.9mW
  -Largest eye after equalization: 0.39V with 26ps jitter, 8.8mW power
  -Smallest jitter after equalization: 19ps with 0.19V eye-opening, 7.9mW power
-Significant performance improvement can be seen with very little overhead on power.
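The power overhead implied by these figures can be worked out directly. The snippet below uses only the power numbers quoted in the conclusion; the helper name `overhead_percent` is an illustrative choice.

```python
def overhead_percent(p_equalized_mw, p_reference_mw):
    """Relative power increase of an equalized link over the reference, in percent."""
    return 100.0 * (p_equalized_mw - p_reference_mw) / p_reference_mw

P_NO_EQUALIZER_MW = 7.9      # eye closed
P_LARGEST_EYE_MW = 8.8       # 0.39V eye, 26ps jitter
P_SMALLEST_JITTER_MW = 7.9   # 0.19V eye, 19ps jitter

largest_eye_overhead = overhead_percent(P_LARGEST_EYE_MW, P_NO_EQUALIZER_MW)
smallest_jitter_overhead = overhead_percent(P_SMALLEST_JITTER_MW, P_NO_EQUALIZER_MW)

print(f"Largest-eye scheme: {largest_eye_overhead:.1f}% more power")
print(f"Smallest-jitter scheme: {smallest_jitter_overhead:.1f}% more power")
```

With the reported values, the largest-eye scheme costs about 11.4% more power than the unequalized link, and the smallest-jitter scheme opens the eye at the same reported power.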

