Published by Dortha McLaughlin. Modified over 9 years ago.
2
Advanced Analysis, Design, and Measurement Techniques for Multi-Gb/s Data Links
Frank O’Mahony (frank.o’mahony@intel.com), Bryan Casper
Circuit Research Lab, Intel, Hillsboro, OR
3
2 Outline
Overview of I/O trends
System-level link modeling
–Worst-case data eye
–Statistical data eye
–Design example: 20Gb/s link
On-die measurement techniques
4
3 Chip-to-Chip Signaling Trends
Decade: Speeds: Transceiver features
1980’s: >10 Mb/s: Inverter out, inverter in
1990’s: >100 Mb/s: Termination; source-synchronous clk.
2000’s: >1 Gb/s: Pt-to-pt serial streams; pre-emphasis equalization
Future: >10 Gb/s: Adaptive equalization; advanced low-power clk.; alternate channel materials
[Channel models shown: lumped capacitance … transmission line, lossy transmission line]
[Link block diagram labels: transmit filter, h(t), channel noise, linear equalizer, sampler, slicer, CDR]
5
4 CMOS transceiver data rates
[Plot: link rate vs. year, annotated "Technology limited" and "Power/channel limitations". Courtesy of Prof. Ken Yang, UCLA]
6
5 Power Density Increases Exponentially!
[Plot: power density (W/cm², log scale from 1 to 1000) vs. process technology node (1.5 μm down to 0.07 μm) for i386, i486, Pentium®, Pentium® Pro, Pentium® II, Pentium® III and Pentium® 4, with a max power density envelope and reference levels for a hot plate, a nuclear reactor and a rocket nozzle]
7
6
8
7 Teraflops Research Chip
100 Million Transistors ● 80 Tiles ● 275 mm²
First tera-scale programmable silicon:
–Teraflops performance
–Tile design approach
–On-die mesh network
–Power-aware capability
Tera-scale many-core μP’s will drive aggregate I/O rates aggressively
9
8 Power efficiency and process technology
Process scaling enables lower-power data links
Channel characteristics can limit achievable power efficiency
Courtesy of Prof. Ken Yang, UCLA
10
9 I/O Data Rate and Power Efficiency
[Plot: power efficiency (mW/Gb/s, 0 to 60) vs. data rate (Gb/s, 0 to 20). Point labels: BNV ISSCC’06, J. Wong VLSI’03, Prete ISSCC’07, R. Palmer ISSCC’07. Value labels: 7.5, 11.7, 9.6, 2.2]
11
10 Designing power-efficient multi-Gb/s links
Accurate system-level link modeling
–Careful statistical accounting of all noise sources: ISI, crosstalk, voltage noise, and timing noise
Power-efficient I/O system implementation
–Design within the bandwidth of the process technology
–Better channel characteristics enable lower power
–Immunity to variation and to deterministic and random noise comes at a power cost
On-die calibration and measurement
–Calibration can significantly reduce power
–Measurement is necessary to close the modeling loop
12
11 System-level link modeling
1. Empirical calculation
–Use random data
2. Peak distortion analysis
–Analytical calculation of worst-case eye
3. Statistical ISI analysis
–Analytical calculation of BER eye
13
12 Traditional method of signaling analysis and validation
Most chip-to-chip signaling links considered in the past used simple binary NRZ modulation.
These links had a low symbol rate and little channel memory.
Transient simulation using a few random data vectors was sufficient to accurately characterize the eye.
14
13 Motivation for behavioral link analysis
Simulated eye can be optimistic
–Won’t capture worst-case ISI, especially for channels with long memory
Characterizes impact of deterministic and random noise sources
–For low bit error rates (BER), very unlikely noise conditions must be considered
Nearly exact statistical analysis reduces the need for excess design margins
Fast evaluation of various link architectures without designing complete circuits
–e.g. various equalizers can be traded off easily
15
14 Properties of a Linear Time-invariant System
Frequency response (e.g. S-parameters, S21)
FFT
Impulse response
Convolution
Superposition
16
15 LTI property: Convolution Tx symbol (mirror) Impulse response Pulse response
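The convolution property can be sketched in a few lines: the pulse response is the impulse response convolved with one transmitted symbol. This is only an illustration. The impulse response values, the rectangular one-UI symbol and the rate of 4 samples per unit interval (UI) are assumptions, not data from the presentation.

```python
def convolve(a, b):
    """Direct-form discrete convolution of two sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

# Hypothetical channel impulse response, sampled at 4 samples per UI.
SAMPLES_PER_UI = 4
impulse_response = [0.0, 0.1, 0.3, 0.25, 0.15, 0.1, 0.05, 0.03, 0.02]

# Assumed Tx symbol: a rectangular pulse exactly one UI wide.
tx_symbol = [1.0] * SAMPLES_PER_UI

# Pulse response = impulse response convolved with the Tx symbol.
pulse_response = convolve(impulse_response, tx_symbol)
print(pulse_response)
```

The pulse response is longer than the impulse response by one symbol width minus one sample, which is the smearing that later shows up as ISI in neighboring bit slots.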
17
16 LTI property: Superposition Tx symbol …000010000000… In Out Pulse response
18
17 LTI property: Superposition of symbols In Out Response to pattern 100111 Tx symbol … 000010011100 …
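Superposition of symbols can be illustrated as follows: the received waveform for a bit pattern is the sum of shifted copies of the pulse response, one copy per transmitted 1. The UI-spaced pulse response values here (precursor 2, cursor 16, postcursors -3, 4, 2, 2, 2) are an assumption modeled on the numeric example later in the deck, and a 0 is assumed to contribute nothing.

```python
def response_to_pattern(pulse, bits):
    """Sum one shifted copy of the UI-spaced pulse response per transmitted 1."""
    out = [0] * (len(bits) + len(pulse) - 1)
    for k, bit in enumerate(bits):
        if bit:
            for n, sample in enumerate(pulse):
                out[k + n] += sample
    return out

# Assumed UI-spaced pulse response: precursor, cursor, then five postcursors.
pulse = [2, 16, -3, 4, 2, 2, 2]

# Pattern from the slide title, first transmitted bit first.
bits = [1, 0, 0, 1, 1, 1]
rx = response_to_pattern(pulse, bits)
print(rx)
```

Because the system is linear, the same result is obtained by computing the response to each bit separately and adding the results.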
19
18 LTI property: Superposition of coupled symbols In Out FEXT Pulse response Tx symbol …000010000000…
20
19 LTI property: Superposition of coupled symbols
[Diagram: In, Out; FEXT response to Tx symbol …000011111100…]
21
20 LTI property: Superposition of coupled symbols
[Diagram: Out; FEXT response to Tx symbol …000011111100…]
22
21 LTI property: Superposition of coupled symbols
[Diagram: Out; insertion loss response to Tx symbol …000010011100…]
23
22 LTI property: Superposition of coupled symbols
[Diagram: Out; insertion loss response to Tx symbol …000010011100…, FEXT response to Tx symbol …000011111100…, and the composite response]
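The composite response can be sketched by adding the through-channel (insertion loss) response of the victim pattern to the FEXT response of the aggressor pattern. Both pulse responses below are hypothetical illustration values, and a 0 is assumed to contribute nothing.

```python
def response_to_pattern(pulse, bits):
    """Sum one shifted copy of the UI-spaced pulse response per transmitted 1."""
    out = [0] * (len(bits) + len(pulse) - 1)
    for k, bit in enumerate(bits):
        if bit:
            for n, sample in enumerate(pulse):
                out[k + n] += sample
    return out

def add_padded(a, b):
    """Add two waveforms sample by sample, zero-padding the shorter one."""
    length = max(len(a), len(b))
    a = a + [0] * (length - len(a))
    b = b + [0] * (length - len(b))
    return [x + y for x, y in zip(a, b)]

through_pulse = [2, 16, -3, 4, 2, 2, 2]  # assumed insertion loss pulse response
fext_pulse = [1, -2, 1]                  # assumed FEXT pulse response

victim_bits = [1, 0, 0, 1, 1, 1]
aggressor_bits = [1, 1, 1, 1, 1, 1]

through = response_to_pattern(through_pulse, victim_bits)
fext = response_to_pattern(fext_pulse, aggressor_bits)
composite = add_padded(through, fext)
print(composite)
```

With this FEXT pulse the crosstalk appears only at the aggressor's edges, since the taps sum to zero during a steady run of 1s.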
24
23 Worst-case eye calculation
Eye diagrams are generally calculated empirically
–Convolve random data with the pulse response of the channel
–Pulse response is derived by convolving the impulse response with the transmitted symbol
For eye diagrams to represent the worst case, a large set of random data must be used
–Low probability of hitting worst-case data transitions
–Computationally inefficient
An analytical method of producing the worst-case eye diagram exists
–Computationally efficient algorithm
25
24 Differential S Parameters
26
25 Eye diagram (100 bits @5Gb/s)
27
26 Eye diagram (1000 bits @5Gb/s)
[Plot legend: random data eye (100 bits); random data eye (1000 bits)]
28
27 Sample pulse response
[Plot labels: precursor, cursor, postcursor, ISI+, ISI-]
29
28 Step response 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1
30
29 0 1 1 0 1 0 0 1 0 0 0 0 0 Worst-case 0
31
30 0 1 0 1 1 0 0 0 0 0 Worst-case 1
32
31 How to find worst-case patterns
[Diagram: bit patterns 1 1 0 1 0 0 1 0 0 1 0 1 1 0, labeled Worst-case 0 and Worst-case 1]
33
32 Worst-case Received Voltage Difference (RVD)
[Pulse response sample labels: 16, -3, 4, 2, 2, 2]
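A minimal sketch of peak distortion analysis at one sampling instant, assuming unipolar signaling in which a transmitted 1 contributes its pulse-response sample and a 0 contributes nothing. The worst-case 1 is the cursor plus all negative ISI terms. The worst-case 0 is the sum of all positive ISI terms. The tap values follow the numeric example used elsewhere in the deck, and the function name is hypothetical.

```python
def peak_distortion(cursor, isi_taps):
    """Return worst-case levels and the interfering bit patterns that cause them."""
    # Lowest possible received 1: every negative ISI tap is excited.
    worst_one = cursor + sum(tap for tap in isi_taps if tap < 0)
    # Highest possible received 0: every positive ISI tap is excited.
    worst_zero = sum(tap for tap in isi_taps if tap > 0)
    pattern_for_worst_one = [1 if tap < 0 else 0 for tap in isi_taps]
    pattern_for_worst_zero = [1 if tap > 0 else 0 for tap in isi_taps]
    return worst_one, worst_zero, pattern_for_worst_one, pattern_for_worst_zero

cursor = 16
isi_taps = [2, -3, 4, 2, 2, 2]  # 1st precursor, then postcursors

w1, w0, p1, p0 = peak_distortion(cursor, isi_taps)
print("worst-case 1:", w1, "caused by interferers", p1)
print("worst-case 0:", w0, "caused by interferers", p0)
print("worst-case eye height:", w1 - w0)
```

The cost is one pass over the taps, in contrast to the large random data set an empirical eye needs before it hits the same pattern.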
34
33 5Gb/s Pulse Response
35
34 5Gb/s Response due to worst-case data pattern Worst-case 0 Worst-case 1
36
35 Worst-case data response Worst-case 1 Lone 1
37
36 5Gb/s WC eye shape Precursor Cursor Postcursor
38
37 WC eye vs random data eye WC eye shape 1000 symbols random data eye 100 symbols random data eye
39
38 What is a BER distribution eye?
[Plot: BER distribution eye, sample voltage (V) vs. sample time (sec); legend shows BER as 10^X. Marked sample time and sample reference: BER = 10^-10]
40
39 What is a BER distribution eye?
[Plot: BER distribution eye, sample voltage (V) vs. sample time (sec); legend shows BER as 10^X. Marked sample time and sample reference: BER = 10^-5]
41
40 What is a BER distribution eye?
[Plot: BER distribution eye, sample voltage (V) vs. sample time (sec); legend shows BER as 10^X. Marked sample time and sample reference: BER = 10^-1]
42
41 BER distribution vs Worst-case eye
[Plot: worst-case eye edges overlaid on the BER distribution eye; legend shows BER as 10^X]
43
42 BER distribution eye calculation
Calculation method is based on pulse response shape
Assumption: equal probability of 1 or 0
Determine probability density function (pdf) of ISI
–In contrast to determining peak value of ISI
More computationally intensive than peak distortion analysis
44
43 BER eye calculation example (no ISI)
[Pulse response: cursor 9, all other samples 0]
45
44 PDF of the cursor (when sending a 1)
[PDF of cursor for a 1: probability 1 at 9. PDF of ISI: probability 1 at 0]
46
45 PDF of a 1
[Convolve PDFs: PDF of cursor for a 1 (at 9) with PDF of ISI (at 0) gives PDF of a 1 (at 9)]
47
46 Cumulative Distribution Function (CDF) of a 1 PDF of a 1 9 9
48
47 BER distribution eye (when sampling a 1) 0 9 9 CDF of a 1 Legend (BER): 1 0
49
48 PDF of a 0 PDF of cursor for a 0 0 PDF of a 0 0 PDF of ISI 0 Convolve PDFs
50
49 Cumulative Distribution Function (CDF) of a 0 PDF of a 0 0 0
51
50 BER distribution eye (when sampling a 0) 0 9 Legend (BER): 1 0 0 CDF of a 0
52
51 BER distribution eye Legend (BER): 0.5 0 0 9 Reference BER=0.5 Reference BER=0 Reference BER=0 Reference BER=0.5 0 CDF of a 1 or 0 9 0 CDF of a 0 p=0.5 9 CDF of a 1 p=0.5
53
52 BER eye calculation example (w/ ISI) 16-34222
54
53 1st precursor ISI PDF
[PDF of 1st precursor ISI: 50% chance of a 0 (ISI = 0), 50% chance of a 1 (ISI = 2); p = 0.5 each]
55
54 1st postcursor ISI PDF
[PDF of 1st postcursor ISI: 50% chance of a 0 (ISI = 0), 50% chance of a 1 (ISI = -3); p = 0.5 each]
56
55 2nd postcursor ISI PDF
[PDF of 2nd postcursor ISI: 50% chance of a 0 (ISI = 0), 50% chance of a 1 (ISI = 4); p = 0.5 each]
And so on...
57
56 PDF all ISI
Convolve individual PDFs
–1st precursor (0, 2) convolved with 1st postcursor (0, -3): result at -3, -1, 0, 2 with p = 0.25 each
–That result convolved with 2nd postcursor (0, 4): result at -3, -1, 0, 1, 2, 3, 4, 6 with p = 0.125 each
And so on...
58
57 PDF all ISI
[Plot: PDF of all ISI over integer ISI values, annotated p = 1/64]
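The tap-by-tap convolution of ISI PDFs can be sketched as below. Each ISI tap contributes either 0 or its tap value with probability 0.5, and the PDFs of the individual taps are convolved together. The tap list matches the slide example as far as the slides show it (2, -3, 4). The remaining taps (2, 2, 2) are an assumption, and exact fractions are used so the probabilities can be checked by hand.

```python
from collections import defaultdict
from fractions import Fraction

def convolve_pdfs(pdf_a, pdf_b):
    """Convolve two discrete PDFs given as {value: probability} dicts."""
    out = defaultdict(Fraction)
    for value_a, prob_a in pdf_a.items():
        for value_b, prob_b in pdf_b.items():
            out[value_a + value_b] += prob_a * prob_b
    return dict(out)

def isi_pdf(isi_taps):
    """PDF of total ISI when each interfering bit is 0 or 1 with equal probability."""
    pdf = {0: Fraction(1)}
    for tap in isi_taps:
        tap_pdf = {0: Fraction(1, 2), tap: Fraction(1, 2)}
        pdf = convolve_pdfs(pdf, tap_pdf)
    return pdf

isi_taps = [2, -3, 4, 2, 2, 2]  # last three taps are assumed
total = isi_pdf(isi_taps)
for value in sorted(total):
    print(value, total[value])
```

The work grows with the number of taps times the number of distinct ISI values, rather than with the number of bit patterns, which is what keeps the statistical method tractable for channels with long memory.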
59
58 PDF of the cursor (when sending a 1)
[PDF of cursor for a 1: probability 1 at 16]
60
59 PDF of a 1
[PDF of cursor for a 1 (at 16) convolved with PDF of ISI gives PDF of a 1, centered around the cursor value of 16]
61
60 Cumulative Distribution Function (CDF) of a 1
[Plot: CDF of a 1, obtained from the PDF of a 1]
62
61 BER distribution eye (when sampling a 1) 16 22 CDF of a 1 Legend (BER): 1 0 Reference BER=0 Reference BER=0.3 Reference BER=0.9 Reference BER=1
63
62 BER distribution eye (when sampling a 0) 16 22 CDF of a 0 Legend (BER): 1 0 Reference BER=1 Reference BER=0.9 Reference BER=0.3 Reference BER=0
64
63 BER distribution eye Reference BER=0.5 Reference BER=0.25 Reference BER=0.25 Reference BER=0.5 Legend (BER): 0.5 0 CDF of a 0 or 1 BER=0 CDF of 1 CDF of 0
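As a rough sketch of how the two CDFs combine into a BER value at a given sampling reference: with 1s and 0s equally likely, the BER is half the probability that a received 1 falls below the reference plus half the probability that a received 0 falls above it. For brevity this sketch enumerates every interferer pattern instead of convolving PDFs, which gives the same distribution for a short tap list. The tap values beyond those shown on the slides are assumptions.

```python
from fractions import Fraction
from itertools import product

def level_pdf(cursor, isi_taps, bit):
    """PDF of the received level for a given bit, as {level: probability}."""
    pdf = {}
    weight = Fraction(1, 2 ** len(isi_taps))
    for pattern in product((0, 1), repeat=len(isi_taps)):
        level = bit * cursor + sum(b * tap for b, tap in zip(pattern, isi_taps))
        pdf[level] = pdf.get(level, Fraction(0)) + weight
    return pdf

def ber(reference, cursor, isi_taps):
    """BER at one sampling reference, assuming equal probability of 1 and 0."""
    ones = level_pdf(cursor, isi_taps, 1)
    zeros = level_pdf(cursor, isi_taps, 0)
    p_one_read_as_zero = sum(p for level, p in ones.items() if level < reference)
    p_zero_read_as_one = sum(p for level, p in zeros.items() if level > reference)
    return Fraction(1, 2) * (p_one_read_as_zero + p_zero_read_as_one)

cursor = 16
isi_taps = [2, -3, 4, 2, 2, 2]  # last three taps are assumed
for reference in (-5, 6, 12.5, 20, 30):
    print(reference, float(ber(reference, cursor, isi_taps)))
```

Sweeping the reference over voltage, and repeating at each sampling time, would fill in one column of the BER distribution eye at a time.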
65
64 Handling Tx jitter in link analysis
Jitter is amplified over lossy channels
–Byproduct of frequency-dependent delay and loss
–Must be accounted for in the analytical model
Discussion of these methods is beyond the scope of this presentation
The following primary authors have published techniques to analyze Tx jitter: Balamurugan, Hanumolu, Sanders, Stojanovic, Casper
[Diagram: Tx, lossy channel, Rx]
66
65 Signaling analysis summary
Accurate link analysis enables a balanced link design
–Low power
–High performance
Three types of link analysis
–Empirical: inexact, optimistic, time-consuming
–Peak distortion: uses LTI properties to find the worst-case eye; can be pessimistic
–Behavioral/statistical: exact channel modeling using LTI properties and behavioral models of circuit blocks
Tx jitter is a special case that must be handled for better behavioral accuracy
67
66 Design example: 20Gb/s data link “Bonneville”
Goals
–Achieve the highest-performance link using 90nm CMOS
»20Gb/s target across a desktop channel
»10Gb/s target across a server channel
–Power < 20mW/Gb/s
–Small area (300 μm by 300 μm for Rx and Tx)
–Forwarded and embedded clock architectures
68
67 Channel Insertion Loss
[Plot: insertion loss (0 dB to -80 dB) vs. frequency (0 GHz to 15 GHz); curves labeled μP/CS and Clean BP]
Microprocessor/Chipset channel: 7” FR4 microstrip; non-interleaved routing; FEXT only; Tx pad cap = 0.4pF; Rx pad cap = 0.1pF; sockets on Tx
[Diagram labels: CPU socket, chipset, microstrip]
69
68 Channel loss and Equalization
Channel loss distorts and attenuates the signal
Develop low-loss materials
Compensate for channel distortion: equalization
–Transmitter pre-emphasis
–Receiver linear equalizer
–Decision feedback equalizer
[Plot: Channel Response vs. Frequency, non-equalized and equalized; received magnitude (dB, -50 to 50) vs. frequency (0 to 9 GHz); curves: channel, equalizer (targeted filter response), equalized channel response]
70
69 Equalization overview – Rx DFE
[Diagram: DFE with summing nodes and four multiplier taps c1 to c4]
Non-linear
–DFE
Linear
–Continuous-time
Transversal filter
High-pass
–passive
–active
»capacitive degeneration
»L peaking
–Discrete-time
Rx ADC & FIR
Rx analog FIR
Tx pre-emphasis
71
70 Equalization overview – Tx pre-emphasis
[Diagram: data passes through Δ delay elements into a 6-bit DAC with tap coefficients C-1[5:0], C0[5:0], C1[5:0]]
[Equalization taxonomy repeated from slide 69]
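Tx pre-emphasis with three taps can be sketched as a symbol-spaced FIR filter, where C-1 weights the next symbol, C0 the current symbol and C1 the previous symbol. The tap values, the mapping of bits to +1/-1 and the function name are illustrative assumptions. The presentation only shows that each coefficient is a 6-bit value.

```python
def tx_pre_emphasis(bits, c_pre, c_main, c_post):
    """Apply a 3-tap symbol-spaced FIR to a bit stream mapped to +1/-1 symbols."""
    symbols = [1 if bit else -1 for bit in bits]
    out = []
    for n, current in enumerate(symbols):
        # Symbols outside the burst are treated as 0 (no drive).
        upcoming = symbols[n + 1] if n + 1 < len(symbols) else 0
        previous = symbols[n - 1] if n >= 1 else 0
        out.append(c_pre * upcoming + c_main * current + c_post * previous)
    return out

# Hypothetical tap weights in DAC codes; negative side taps boost transitions.
levels = tx_pre_emphasis([0, 0, 1, 1, 1, 0], c_pre=-4, c_main=48, c_post=-12)
print(levels)
```

In this example the first 1 after a run of 0s is driven at 56 while a 1 inside the run is driven at 32, so transitions are emphasized relative to low-frequency content.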
72
71 Equalization overview – CTLE
[Circuit diagram: CTLE stage with bias input]
[Equalization taxonomy repeated from slide 69]
73
72 [Chart: results at 10Gb/s, 20Gb/s and 30Gb/s for the μP/CS channel, comparing numbers of Tx FIR taps and DFE taps and the DFE tap start position, with a 1st-order CTLE and with no CTLE. Measured data using similar assumptions.]
74
73 Bonneville architecture
[Block diagram: TX with 4-tap LE (pre-emphasis), RX with 2nd-order CTLE, 5GHz clock, phase gen., 20Gb/s link]
Measurement results:
–20Gb/s across μP channel
–15Gb/s across server channel
–12mW/Gb/s power efficiency
–Measured data rate matched link modeling results within 10%
75
74 Data link measurements
Some data link blocks are straightforward to characterize with external measurement equipment
–Examples:
»Data Tx (50Ω)
»DC currents and voltages (averaging)
»Recovered data (after sampling), e.g. bit error rate tester (BERT)
Other measurements are extremely difficult to perform with external measurement equipment
–Examples:
»Clock jitter (>5GHz), especially high-frequency jitter
»Sampled data eye
»Data Rx sensitivity
Built-in self test (BIST) and self-calibration are required for high-volume testing of data links
–Examples:
»Automatic clock-data deskew
»Adaptive equalization
On-die measurement capability is nearly essential in multi-Gb/s data links
–Closes the loop for link design
–Enables BIST and calibration
76
75 Bonneville on-die measurement RXTX 5GHz clock 20Gb/s Phase gen. 4-tap LE (pre- emphasis) 2 nd -order CTLE
77
76 Bonneville on-die measurement RXTX 5GHz clock 20Gb/s Phase gen. 4-tap LE (pre- emphasis) 2 nd -order CTLE Error counter Offset Test Logic
78
77 On-die scope test capabilities 3 modes Characterize circuits
79
78 On-die scope test capabilities 3 modes Characterize circuits Waveform capture
80
79 On-die scope test capabilities 3 modes Characterize circuits BER eye diagrams Waveform capture
81
80 RX Input-referred Noise
[Diagram: RX comparator with V test (DC) input and offset control, PDF counter, test control]
Measurement:
–Sweep calibrated digital offset to generate CDF, counting 1’s and 0’s
–Generate noise CDF/PDF for Rx
82
81 RX Input-referred Noise RX PDF Counter Test Control Offset ctrl + - V test (DC) V test (DC) + noise All 1’s All 0’s Offset [V]
83
82 RX Input-referred Noise RX PDF Counter Test Control Offset ctrl + - V test (DC) Prob{‘1’} (CDF) Offset [V] V test (DC) + noise All 1’s All 0’s Offset [V]
84
83 IR Noise PDF RX Input-referred Noise RX PDF Counter Test Control Offset ctrl + - V test (DC) Offset [V] All 1’s All 0’s V test (DC) + noise Offset [V]
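The offset-sweep measurement can be sketched as post-processing on the counter data: the fraction of 1s at each offset gives Prob{‘1’}, and differencing adjacent points gives the probability that the input (V test plus noise) falls between two offsets. The count values and the function names below are made up for illustration and are not measured data.

```python
import math

def noise_pdf_from_sweep(offsets, ones_counts, samples_per_point):
    """Turn counts of 1s per offset into per-bin probabilities of the input level."""
    prob_one = [count / samples_per_point for count in ones_counts]
    centers, masses = [], []
    for i in range(len(offsets) - 1):
        centers.append((offsets[i] + offsets[i + 1]) / 2)
        # Prob{'1'} falls as the offset rises past the input level.
        masses.append(prob_one[i] - prob_one[i + 1])
    return centers, masses

def mean_and_sigma(centers, masses):
    """Mean and RMS spread of the binned distribution."""
    mean = sum(c * m for c, m in zip(centers, masses))
    variance = sum((c - mean) ** 2 * m for c, m in zip(centers, masses))
    return mean, math.sqrt(variance)

# Hypothetical sweep: offsets in mV, 100 samples taken at each offset.
offsets_mv = [0.0, 1.0, 2.0, 3.0, 4.0]
ones_counts = [100, 100, 50, 0, 0]

centers, masses = noise_pdf_from_sweep(offsets_mv, ones_counts, 100)
mean_mv, sigma_mv = mean_and_sigma(centers, masses)
print(centers, masses, mean_mv, sigma_mv)
```

The recovered mean corresponds to V test plus any comparator offset, and the spread to the input-referred noise. The resolution of both is limited by the offset step and by the number of samples per point.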
85
84 Rx input noise (no PSN, no offset) noise 1.3mV Det. noise 0mVp-p
86
85 Rx input noise (200MHz PSN, no offset) noise 1.1mV Det. noise ~1mVp-p
87
86 Rx input noise (200MHz PSN, 85mV offset) noise 1mV Det. noise 16mVp-p
88
87 Rx PSRR (200MHz) Noise floor
89
88 On-die waveform capture
Sample periodic signal:
–Voltage: Equivalent-time A/D using comparator offset
90
89 On-die waveform capture
Sample periodic signal:
–Voltage: Equivalent-time A/D using comparator offset
–Time: Equivalent-time A/D using interpolator offset
91
90 On-die waveform capture
Sample periodic signal:
–Voltage: Equivalent-time A/D using comparator offset
–Time: Equivalent-time A/D using interpolator offset
92
91 On-die waveform capture
Sample periodic signal:
–Voltage: Equivalent-time A/D using comparator offset
–Time: Equivalent-time A/D using interpolator offset
93
92 Wave capture, Rx eq Tx Rx Eq Rx
94
93 Wave capture, Tx eq Tx Eq Tx Rx
95
94 Wave capture, Tx+Rx eq Tx Eq Tx Rx Eq Rx
96
95 BER eye diagram Pass Fail Characterize BER at various sampling points: –Voltage: Vary comparator offset –Time: Vary interpolator offset
97
96 BER eye diagram Pass Fail Characterize BER at various sampling points: –Voltage: Vary comparator offset –Time: Vary interpolator offset
98
97 BER eye diagram # errors Characterize BER at various sampling points: –Voltage: Vary comparator offset –Time: Vary interpolator offset
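The BER eye measurement can be sketched as a two-dimensional sweep over interpolator phase and comparator offset, with an error count taken at each point. The error-counter function below is a hypothetical stand-in for the on-die error counter, with an arbitrary elliptical error-free region. The names and thresholds are assumptions.

```python
def measure_ber_eye(count_errors, phases, offsets, bits_per_point):
    """Return BER estimates: one row per offset, one column per phase."""
    return [[count_errors(phase, offset) / bits_per_point for phase in phases]
            for offset in offsets]

def pass_fail(ber_grid, max_ber):
    """Render the grid as text: '.' passes, 'X' fails."""
    return ["".join("." if ber <= max_ber else "X" for ber in row)
            for row in ber_grid]

BITS_PER_POINT = 1000

def fake_error_counter(phase, offset):
    """Hypothetical link: error-free inside an ellipse, 50% errors outside."""
    distance = (phase / 0.4) ** 2 + (offset / 0.05) ** 2
    return 0 if distance < 1.0 else BITS_PER_POINT // 2

phases = [-0.5, -0.25, 0.0, 0.25, 0.5]     # sampling phase in UI
offsets = [-0.06, -0.03, 0.0, 0.03, 0.06]  # comparator offset in volts

grid = measure_ber_eye(fake_error_counter, phases, offsets, BITS_PER_POINT)
eye = pass_fail(grid, max_ber=1e-3)
print("\n".join(eye))
```

With N bits checked per point, an error-free result only bounds the BER at roughly 1/N, so the number of bits per point sets the lowest BER contour the eye can resolve.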
99
98 Rx Equalization (CTLE)
Data rate: 17.5Gb/s; Channel: 7” desktop
[Diagram: Tx, Rx Eq, Rx]
100
99 Tx Equalization (Pre-emphasis)
Data rate: 17.5Gb/s; Channel: 7” desktop
[Diagram: Tx Eq, Tx, Rx]
101
100 Tx + Rx Equalization
Data rate: 17.5Gb/s; Channel: 7” desktop
[Diagram: Tx Eq, Tx, Rx Eq, Rx]
102
101 Tx + Rx Equalization, no Rx offset trim
Data rate: 17.5Gb/s; Channel: 7” desktop
[Diagram: Tx Eq, Tx, Rx Eq, Rx]
103
102 Tx + Rx Eq, 10% Tx PSN @ 200MHz
Data rate: 17.5Gb/s; Channel: 7” desktop
[Diagram: Tx Eq, Tx, Rx Eq, Rx]
104
103 Tx + Rx Eq, 10% Rx PSN @ 200MHz
Data rate: 17.5Gb/s; Channel: 7” desktop
[Diagram: Tx Eq, Tx, Rx Eq, Rx]
105
104 Measurement summary
On-die link measurements close the design loop and enable link self-test and adaptation
–Example: BER eye
On-die measurements can add significantly less noise than off-die measurements
–Example: Clock-data jitter measurement
However, calibration of the on-die circuits is still required for absolute accuracy
–Examples: Voltage offsets, phase interpolators
In some cases, such as where averaging is possible, off-die measurements are still very useful.
106
105 Overall summary
Tera-scale many-core μP’s will drive aggregate I/O rates aggressively
–Power budget will constrain link design space
Power efficiency depends strongly on process technology and channel
High-performance and low-power link design requires accurate system-level tools
–Tools are in place with areas for improvement
On-die link measurement capabilities close the design loop and enable link self-test and adaptation
Acknowledgements: Ganesh Balamurugan, James Jaussi, Joe Kennedy, Mozhgan Mansuri, Randy Mooney, Shekhar Borkar