Published by Melvyn Fields. Modified over 6 years ago.
1
Low Power Passive Equalizer Design for Computer-Memory Links
Ling Zhang¹, Wenjian Yu², Yulei Zhang¹, Renshen Wang¹, Alina Deutsch³, George A. Katopis³, Daniel M. Dreps³, James Buckwalter¹, Ernest Kuh⁴, Chung-Kuan Cheng¹
¹UC San Diego, ²Tsinghua Univ., ³IBM Research Labs, ⁴UC Berkeley
2
Outline
- Introduction
- The CPU-memory links in IBM P6 system
- Equalization structures and schemes
- Simulated annealing optimization flow
- Experimental results
- Conclusion
3
Introduction
- On-chip interconnects have Gbps data rates:
  - M. Hashimoto et al., 2004: 40 Gbps (simulation), 10 mm, 45 nm
  - M. P. Flynn et al., 2005: 14 Gbps, 7.2 mm, 180 nm
- Package-level interconnects need improvement:
  - High bandwidth: reduce inter-symbol interference (ISI)
  - Low power: use passive components
- Equalization alleviates ISI:
  - H. A. Affel, 1924: equalization of carrier transmissions
  - H. W. Bode, 1936: attenuation equalizer
  - E. Kuh, 1959: constant-R ladder
  - R. Sun et al., 2005: adaptive passive T-junction equalizer
4
IBM P6 system
- IBM introduced the P6 system in 2007.
- Dual-core microprocessor
- 65 nm SOI process
- Both speed and power are important for the P6 I/O circuitry and interconnect:
  - For high-performance applications, it operates at over 5 GHz.
  - For low-power applications, it consumes less than 100 W.
5
Structure of the CPU-Memory link in IBM P6 system
- The channel is a 20-inch-long differential pair, operating at 6.4GHz.
- The model takes all the fan-out, connector, and via-array discontinuities into account.
6
Eye diagram at each port (simulation)
Panels: INPUT, TXPKG, RXPKG, OUTPUT
7
Our contributions
- We propose a set of passive equalizer schemes.
- We apply the schemes to the CPU-Memory link of the IBM P6 system and observe significant performance improvement with little power overhead.
- We compare and analyze the results of the different schemes.
- We demonstrate that the equalization approach is not sensitive to variations and crosstalk.
8
Equalization structures
(a) T-junction, (b) parallel RC, (c) series RL
- The T-junction and RC structures can be applied at both the driver and receiver sides; RL is used only at the receiver end.
- The T-junction can be implemented on-chip or on-package.
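A minimal sketch of why a parallel RC network can act as an equalizer. The slide does not show the topology, so this assumes the RC pair is inserted in series with the signal path and terminated by a resistive load Z0, which gives H(jw) = Z0 / (Z0 + R / (1 + jwRC)). The function name and all component values are hypothetical.

```python
import math

def parallel_rc_gain(freq_hz, r_ohm, c_farad, z0_ohm=50.0):
    """|H| of a parallel RC network in series with a resistive load z0_ohm.

    Assumed topology (not taken from the slides): source -> (R || C) -> load.
    """
    omega = 2.0 * math.pi * freq_hz
    # Impedance of R in parallel with C: R / (1 + j*omega*R*C)
    z_rc = r_ohm / complex(1.0, omega * r_ohm * c_farad)
    return abs(z0_ohm / (z0_ohm + z_rc))

# Hypothetical component values, chosen only for illustration.
R, C, Z0 = 100.0, 1e-12, 50.0
for f in (0.0, 1e9, 3.2e9, 10e9):
    print(f"{f / 1e9:5.1f} GHz  |H| = {parallel_rc_gain(f, R, C, Z0):.3f}")
```

Under this assumption the network attenuates low frequencies by Z0 / (Z0 + R) and passes high frequencies almost unattenuated, which counteracts the low-pass behavior of a lossy channel.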
9
The settings we use for each equalization component
label Component S RL NA Infinity P RC 10 ohm Tmc On-chip T-junction Z0 Tmp Off-chip T-junction Tuc Tup M No equalizer Four different groups of equalization schemes we studied Driver Receiver Group1 Match: M, Tmc, Tmp Group2 Un-match: P, Tuc, Tup Group3 Match: M, Tuc, Tup Un-match: P, Tuc, Tup, S Group4
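A short sketch that enumerates the driver+receiver schemes in each group. It assumes the driver and receiver sets that the per-group result slides list (matched: M, Tmc, Tmp; unmatched driver: P, Tuc, Tup; unmatched receiver: P, Tuc, Tup, S). The dictionary layout and function name are illustrative only.

```python
from itertools import product

MATCHED = ["M", "Tmc", "Tmp"]
UNMATCHED_DRIVER = ["P", "Tuc", "Tup"]
UNMATCHED_RECEIVER = ["P", "Tuc", "Tup", "S"]  # S (series RL) is receiver-only

# (driver options, receiver options) per group; an assumed reading of the slides.
GROUPS = {
    "Group 1": (MATCHED, MATCHED),
    "Group 2": (UNMATCHED_DRIVER, MATCHED),
    "Group 3": (MATCHED, UNMATCHED_RECEIVER),
    "Group 4": (UNMATCHED_DRIVER, UNMATCHED_RECEIVER),
}

def schemes(group):
    """Return every 'driver+receiver' label in the given group."""
    drivers, receivers = GROUPS[group]
    return [f"{d}+{r}" for d, r in product(drivers, receivers)]

for name in GROUPS:
    print(name, len(schemes(name)), schemes(name))
```

Groups 3 and 4 each produce twelve combinations, which matches the twelve curve labels on their transfer-function and step-response slides.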
10
Optimization Flow
Variables:
Objective function (minimize):
Constraints:
Simulated annealing is used to find the optimal solution.
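A generic simulated annealing sketch, not the authors' flow. The slide's variables, objective function, and constraints were not preserved, so `toy_cost` is a hypothetical placeholder; in the real flow the cost would come from a channel simulation. The bounds reuse the R, C, and L upper limits quoted for Experiment 1; the cooling schedule and step size are arbitrary assumptions.

```python
import math
import random

# Upper bounds quoted for Experiment 1; lower bounds of 0 are an assumption.
BOUNDS = {"R": (0.0, 500.0), "C": (0.0, 100e-12), "L": (0.0, 100e-9)}

def toy_cost(x):
    """Hypothetical placeholder cost: a bowl with an arbitrary minimum."""
    target = {"R": 120.0, "C": 20e-12, "L": 10e-9}
    return sum(((x[k] - target[k]) / (BOUNDS[k][1] - BOUNDS[k][0])) ** 2
               for k in BOUNDS)

def neighbor(x, rng, step=0.1):
    """Perturb one randomly chosen variable and clamp it to its bounds."""
    y = dict(x)
    k = rng.choice(sorted(BOUNDS))
    lo, hi = BOUNDS[k]
    y[k] = min(hi, max(lo, y[k] + rng.uniform(-step, step) * (hi - lo)))
    return y

def anneal(cost, rng, t_start=1.0, t_end=1e-4, alpha=0.95, moves_per_temp=50):
    """Minimize cost with a geometric cooling schedule."""
    x = {k: rng.uniform(lo, hi) for k, (lo, hi) in sorted(BOUNDS.items())}
    fx = cost(x)
    best, fbest = dict(x), fx
    t = t_start
    while t > t_end:
        for _ in range(moves_per_temp):
            y = neighbor(x, rng)
            fy = cost(y)
            # Metropolis rule: accept improvements, sometimes accept worse moves.
            if fy <= fx or rng.random() < math.exp((fx - fy) / t):
                x, fx = y, fy
                if fx < fbest:
                    best, fbest = dict(x), fx
        t *= alpha
    return best, fbest

best, fbest = anneal(toy_cost, random.Random(0))
print(best, fbest)
```

Accepting some worse moves at high temperature lets the search escape local minima, which matters when the real cost surface (eye opening and jitter versus component values) is not convex.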
11
Experiment 1
For all possible schemes:
- Apply the optimization flow to find the optimal solution, with these limits:
  - Upper bound of R = 500 ohm
  - Upper bound of C = 100 pF
  - Upper bound of L = 100 nH
- Compare the results.
12
1. M+M: no eye.
2. T+M, M+T: Veye is V, jitter is 22-23ps.
3. T+T: Veye is V, jitter is 12-16ps.
13
Summary of Group 1
- Using Tmc or Tmp at both sides is better than using it at one side only.
- Tmc and Tmp are equivalent when used at the driver.
- At the receiver, Tmp has smaller jitter than Tmc.
14
Transfer functions of Group 1
Curves: T**+Tmp; M+M; T**+M and M+T**; T**+Tmc
15
Step responses of Group 1
Curves: T**+Tmp; M+M; T**+M and M+T**; T**+Tmc
17
Summary of Group 2
Driver side:
- Tup is better than Tuc: larger eye and smaller jitter.
- Tuc is better than P: larger eye and smaller jitter.
Receiver side:
- M: largest jitter; Tup+M has the largest eye.
- Tmp: lowest cost function and lowest jitter; Veye is slightly smaller than with Tmc.
18
Transfer functions of Group 2
19
Transfer functions of Group 2
20
Step responses of Group 2
21
Step responses of Group 2
23
Summary of Group 3
Driver side:
- Tmc is the same as Tmp.
- M is worse than T: smaller eye opening and larger jitter.
Receiver side:
- P has the smallest eye opening.
- Tuc has the largest eye opening.
- Tup has the lowest jitter.
- S has the largest jitter.
24
Transfer functions of Group 3
Curves: M+S; Tmp+S; M+P; Tmc+S; Tmp+Tuc; Tmc+Tuc; M+Tup; Tmp+P; Tmc+P; Tmc+Tup; Tmp+Tup; M+Tuc
25
Transfer functions of Group 3
Curves: Tmc+S; Tmc+Tuc; Tmc+P; Tmc+Tup; Tmp+S; Tmp+Tuc; Tmp+P; Tmp+Tup; M+S; M+P; M+Tup; M+Tuc
26
Step responses of Group 3
Curves: Tmc+S; M+Tup; Tmp+Tuc; Tmp+S; Tmp+P; Tmc+P; Tmc+Tuc; Tmc+Tup; Tmp+Tup; M+S; M+P; M+Tuc
27
Step responses of Group 3
Curves: Tmc+S; Tmc+Tuc; Tmc+P; Tmc+Tup; Tmp+S; Tmp+Tuc; Tmp+P; Tmp+Tup; M+S; M+P; M+Tup; M+Tuc
28
- At the driver side, Tup is better than Tuc, and Tuc is better than P.
- At the receiver side, Tup is similar to Tuc; S has a larger eye and larger jitter.
29
Summary of Group 4
Driver side:
- Tup is better than Tuc: larger Veye and smaller jitter.
- P has the largest jitter; P+Tuc has the largest Veye, but the other P schemes have the smallest Veye.
Receiver side:
- P has the smallest eye opening.
- S has the largest jitter.
- Tuc+Tup, Tup+Tup, Tuc+Tuc and Tup+Tuc are very similar.
30
Transfer functions of Group 4
Curves: P+S; P+P; Tuc+S; Tup+S; Tuc+Tuc; Tuc+Tup; Tup+Tuc; Tup+Tup; Tuc+P; P+Tuc; Tup+P; P+Tup
31
Transfer functions of Group 4
Curves: Tuc+S; Tuc+Tuc; Tuc+Tup; Tuc+P; Tup+S; Tup+Tuc; Tup+Tup; Tup+P; P+S; P+P; P+Tuc; P+Tup
32
Step responses of Group 4
Curves: P+S; P+P; Tuc+S; Tup+S; Tuc+Tuc; Tuc+Tup; Tup+Tuc; Tup+Tup; Tuc+P; P+Tuc; Tup+P; P+Tup
33
Step responses of Group 4
Curves: Tuc+S; Tuc+Tuc; Tuc+Tup; Tuc+P; Tup+S; Tup+Tuc; Tup+Tup; Tup+P; P+S; P+P; P+Tuc; P+Tup
34
Summary of experiment 1
- Schemes in Group 1 have lower jitter because of matching.
- Schemes in Group 4 have larger eye-opening due to reflections.
- When used at the receiver end, structure Tmc has slightly lower jitter than Tmp, and structure Tuc is very similar to Tup.
- When used at the receiver end, structure P has smaller eye-opening, while S has larger jitter.
- When used at the driver side, structure Tmc is very similar to Tmp, and structure Tup is slightly better than Tuc, with larger eye-opening and smaller jitter.
35
Experiment 2
- Equivalent/similar schemes are merged.
- Apply the optimization flow to each scheme.
- Size limits on L and C are enforced:
  - Upper bound of L = 5 nH
  - Upper bound of C = 15 pF
- Compare the results, power, and eye diagrams.
36
Choose representative schemes
-G1: M+Tmc (smallest jitter)
37
Choose representative schemes:
- M+P (lowest power)
- P+Tuc (largest eye-opening)
- Tup+S (smallest cost function)
38
Eye diagrams at output
(a) Tup+S: Veye=0.37V, Jitter=19.3ps
(b) M+Tmc: Veye=0.19V, Jitter=18.9ps
(c) M+P: Veye=0.23V, Jitter=24.5ps
(d) P+Tuc: Veye=0.39V, Jitter=26.0ps
39
Transfer functions of selected schemes
Curves: M+M; P+Tuc; Tup+S; M+P; M+Tmc
40
Step responses of selected schemes
Curves: M+M; P+Tuc; M+P; Tup+S; M+Tmc
41
Step responses of selected schemes
Curves: M+M; P+Tuc; M+P; Tup+S; M+Tmc
42
Eye diagrams at input (a) Tup+S (b) M+Tmc (c) M+P (d) P+Tuc
43
Eye diagrams at TXPKG (a) Tup+S (b) M+Tmc (c) M+P (d) P+Tuc
44
Eye diagrams at RXPKG (a) Tup+S (b) M+Tmc (c) M+P (d) P+Tuc
45
Input impedances of selected schemes
Curves: M+Tmc; M+M; P+Tuc; Tup+S
- M+P has high Zin, and therefore low power.
- Tup+S has low Zin, and therefore high power.
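The link between input impedance and power can be illustrated with the standard relation P = Vrms^2 * Re(1/Zin) for an ideal voltage source driving an impedance Zin. This is a sketch under that idealized assumption; the function name and the numeric values are hypothetical and are not taken from the P6 link model.

```python
def average_power_w(v_rms, z_in):
    """Average power drawn from an ideal voltage source by impedance z_in.

    Uses P = V_rms^2 * Re(1 / Z_in); z_in may be real or complex (ohms).
    """
    admittance = 1.0 / complex(z_in)
    return (v_rms ** 2) * admittance.real

# Hypothetical values: the same drive voltage into a low and a high impedance.
for z in (50.0, 100.0, complex(100.0, 50.0)):
    print(z, average_power_w(0.5, z))
```

For a fixed drive voltage, a larger input impedance draws less current and therefore less power, which is the trend the slide notes for M+P versus Tup+S.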
46
Sensitivity comparison of selected schemes
Parameters are perturbed by
47
Eye diagrams at output with crosstalk
(a) Tup+S: Veye=0.33V, Jitter=21.4ps
(b) M+Tmc: Veye=0.19V, Jitter=19.4ps
(c) M+P: Veye=0.24V, Jitter=22.0ps
(d) P+Tuc: Veye=0.38V, Jitter=26.2ps
48
Conclusion
- Simple and effective passive equalizer schemes are proposed.
- The SA flow is used to optimize the equalizer parameters.
- For the CPU-Memory link of the IBM P6 system:
  - Without equalizer: eye is closed, power is 7.9mW
  - Largest eye after equalization: 0.39V with 26ps jitter, 8.8mW power
  - Smallest jitter after equalization: 19ps with 0.19V eye-opening, 7.9mW power
- Significant performance improvement can be seen with very little overhead on power.
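The power overhead quoted in the conclusion can be checked with a few lines of arithmetic. The figures below are copied from the conclusion slide; the dictionary keys and the function name are illustrative.

```python
BASELINE_MW = 7.9  # power without an equalizer, from the conclusion slide

# Results quoted in the conclusion.
results = {
    "largest eye": {"veye_v": 0.39, "jitter_ps": 26.0, "power_mw": 8.8},
    "smallest jitter": {"veye_v": 0.19, "jitter_ps": 19.0, "power_mw": 7.9},
}

def power_overhead_pct(power_mw, baseline_mw=BASELINE_MW):
    """Extra power relative to the unequalized link, in percent."""
    return 100.0 * (power_mw - baseline_mw) / baseline_mw

for name, r in results.items():
    print(f"{name}: {power_overhead_pct(r['power_mw']):.1f}% power overhead")
```

The largest-eye scheme costs roughly 11% more power than the unequalized link, while the smallest-jitter scheme adds none.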