1
ECE 679: Digital Systems Engineering
Patrick Chiang
Office Hours: 1-2PM, Mon-Thurs, GLSN 100
2
Class Introductions
Who am I?
Who are you?
3
Class Basics
4 Homeworks (20%) (groups of 2)
Midterm (40%)
Final Project (40%): 4-page IEEE report, 10-minute presentation (groups of 2)
Guest lecture (Dr. Frank O’Mahony), Intel Research Labs (May 4th)
Intel Field Trip (June 7th), TBD
Presentations of 1-2 best project reports
4
Class Homework
Skim Dally/Poulton “Digital Systems Engineering”, Chapter 3
Skim Overview Paper: includes running Stat Eye
Oregon State Matlab (eecs.oregonstate.edu/it)
Problem Set #1
rlc files: ~pchiang/hspice (rlc_spice_deck; rlc.rlc)
Spice models: ~pchiang/hspice/process_files/ (130nm to 22nm)
Simulator lang = spice
Spectre models: DEFINE gpdk090 /nfs/guille/analog/c/cdsmgr/process/gpdk090_v3.8/libs.cdb/gpdk090
8
What does this mean for analog designers?
Ever build an ADC? Ever wonder what to do with the digital bits?
Why does this clock rate not increase?
What really is this output doing? Where is it going?
Figure labels: Analog; Fs = 600MHz; MHz, 200MHz, 400MHz; Goes to Vector analyzer
10
Brief Summary
Introduction to the area
Why serial links are important
What are the current technology trends/limitations
11
4Gb/s Low Power, Area Efficient Serial Links
IBM Processor: CPU, Memory, from/to other subsystems (e.g. backplane)
High-speed I/Os: interconnection between different chips
Transmitter Equalization
Receiver Offset Cancellation
Figure labels: Router; Backplane (1m, FR4); Transmitter Output; Receiver Input; 4Gb/s Transmitter Output; 4Gb/s Transmitter Output, Equalized; 4Gb/s Transmitter Output, 1m; um Testchip; um Testchip
Notes: Organization of the channel, arrows from channel, plots…change image layout. Reall what you want to say on the slides.
Ming-Ju E. Lee, William J. Dally, John W. Poulton, Patrick Chiang, Stephen F. Greenwood. An 84-mW 4Gb/s Clock and Data Recovery Circuit for Serial Link Applications. VLSI Circuits Symposium, Kyoto, Japan, June 2001, pp
Ming-Ju E. Lee, William Dally, Patrick Chiang. Low-Power Area-Efficient High-Speed I/O Circuit Techniques. IEEE Journal of Solid-State Circuits, November 2000, Vol. 35, No. 11, pp
12
Scaling Serial Links: From 4Gb/s->20Gb/s
Thesis: Develop 20Gb/s Serial Link
Area: 500um x 500um
Power: 200mW/link
1 bit time = 1FO4
Focus on timing uncertainty, not channel…independent vector
Timing uncertainty becomes KEY issue
Eye diagrams (v vs. t): 4Gb/s, 250ps bit time; 20Gb/s, 50ps bit time
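The 250ps and 50ps figures on the eye diagrams are simply the bit times at 4Gb/s and 20Gb/s. A minimal sketch of that arithmetic follows, together with an illustration of why timing uncertainty matters more at the higher rate. The function names and the 10ps uncertainty are assumptions chosen for illustration, not values from the slides.

```python
def unit_interval_ps(data_rate_gbps: float) -> float:
    """Return the bit time (unit interval) in picoseconds for a rate in Gb/s."""
    return 1000.0 / data_rate_gbps


def fraction_of_bit_time(timing_uncertainty_ps: float, data_rate_gbps: float) -> float:
    """Return the share of one bit time consumed by a fixed timing uncertainty."""
    return timing_uncertainty_ps / unit_interval_ps(data_rate_gbps)


if __name__ == "__main__":
    assumed_uncertainty_ps = 10.0  # hypothetical value, for illustration only
    for rate in (4.0, 20.0):
        ui = unit_interval_ps(rate)
        share = fraction_of_bit_time(assumed_uncertainty_ps, rate)
        print(f"{rate:g} Gb/s: bit time {ui:g} ps, "
              f"{assumed_uncertainty_ps:g} ps uncertainty uses {share:.0%} of it")
```

The same absolute uncertainty takes five times more of the eye at 20Gb/s than at 4Gb/s, which is one way to read "timing uncertainty becomes KEY issue".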
13
Transmitter Block Diagram
No post-PLL clock buffers
Notes: Dotted lines around different circuit components (PLL, muxing, etc.). Clocks are differential clocks. Get rid of everything else, use red. Or change images…lose people on the insight, carry through. Simpler is better.
14
Test Chip
UMC 1.2V, 0.13um CMOS (single Vt)
Die size 700um x 1.15mm
50 Ohm pad termination using wafer probes
Floorplan labels: Test Interface; 10GHz PLL; PRBS Check; Test Structures; Phase Interpolators; RX DLL; TX; Clock Recovery; Transmitter Muxing; PRBS Gen; 700um; 1.1mm
Notes: Our test chip was fabricated in National Semiconductor’s quarter micron CMOS technology. The die is 2.6mm by 1.4mm and uses a 52-pin impedance controlled package donated by Vitesse Corporation. The active area of the transceiver circuits is 0.31mm^2.
15
PLL Measurements
Open Loop VCO Phase Noise @ 1MHz: -97dBc/Hz
10GHz Jitter (RMS): 0.97ps
10GHz Jitter (pk-pk): 8.0ps
PLL Power: 38.6mW
VCO Power: 6mW
Tuning Range
Jitter limited by 1.25GHz input reference clock: HP 8133A input clock (1.2ps RMS, 8.9ps pk-pk)
Figure labels: Power Spectrum; Q=10 Jitter; Q=5 Jitter; (c)
Notes: Change the cadence of talking…these are the important points. Too much stuff in slides, too heavy…line width, is 2-3 points.
16
Eye Diagram
Data Rate = 19.2Gb/s
Jitter: 2.2ps RMS, 15.6ps pk-pk
Voltage ripple caused by lack of current source at differential pair tail node
Notes: Don’t spend too much time on 19.2. Seen here are the phase step values across the entire range. The average phase resolution should be 15.6ps, so the interpolation steps shown are very accurate. Note that every 9th phase has a phase interpolation value lower than the average of 15.6ps, which is expected, since these are the redundant steps. You can also see that not every 9th phase value is consistently small. For example, phases 18 and 36 don’t show as small a phase step as phases 9 and 27. This is due to a layout error: asymmetric clock loading causes different capacitive coupling for different transitions (different phase differences due to different delay amounts in the DLL itself).
17
High Speed Transmitter Comparisons
“A 250mW Full-Rate 10Gb/s Transceiver Core in 90nm CMOS using a Tri-State Binary PD with 100ps Gated Digital Output”, T. Masuda et al., ISSCC 2007.
A full-rate 10Gb/s transceiver core employing a tri-state binary PD with 100ps gated digital output is implemented in a 90nm CMOS process. Direct drive from the VCO is utilized to eliminate the 10GHz clock buffer current. The RX exhibits a recovered jitter of 906fs (rms) and an input sensitivity of 5.9mV. The TX generates a jitter of 5mUI (rms). The chip consumes 250mW.
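The abstract quotes RX jitter in femtoseconds and TX jitter in milli-unit-intervals (mUI), so the two are easier to compare once converted to a common unit. A small sketch of the conversion, assuming the 10Gb/s data rate stated in the abstract (the helper names are hypothetical):

```python
def mui_to_ps(jitter_mui: float, data_rate_gbps: float) -> float:
    """Convert jitter in milli-UI to picoseconds at the given data rate."""
    ui_ps = 1000.0 / data_rate_gbps  # one unit interval in ps
    return jitter_mui * 1e-3 * ui_ps


def ps_to_mui(jitter_ps: float, data_rate_gbps: float) -> float:
    """Convert jitter in picoseconds to milli-UI at the given data rate."""
    ui_ps = 1000.0 / data_rate_gbps
    return jitter_ps / ui_ps * 1e3


if __name__ == "__main__":
    rate = 10.0  # Gb/s, from the abstract
    print(f"TX: 5 mUI rms = {mui_to_ps(5.0, rate):.2f} ps rms")
    print(f"RX: 906 fs rms = {ps_to_mui(0.906, rate):.2f} mUI rms")
```

At 10Gb/s one UI is 100ps, so the conversion is a factor of ten in either direction.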
18
Conventional Serial Link Receivers
Block diagram labels: In; Pre-Amp; Data 20Gb/s; Multiphase PLL; ck[0], ck[1], ck[2], ck[3]; D[0], D[1], D[2], D[3]
Conventional architectures also use multi-phase PLL
Static Phase Offset
Power Supply Sensitivity
Notes: Well, guess what…we have same problem at the receiver
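To make the static phase offset concern concrete, the sketch below computes where four clock phases would ideally sample a 20Gb/s stream and how a per-phase offset shifts those instants. It assumes each of ck[0] to ck[3] captures one bit per clock period (a 1:4 demultiplexing receiver), which the slide does not state explicitly. The offset values are made up for illustration.

```python
def clock_frequency_ghz(data_rate_gbps: float, n_phases: int) -> float:
    """Clock frequency if each phase samples one bit per clock period (assumed)."""
    return data_rate_gbps / n_phases


def sampling_instants_ps(data_rate_gbps, n_phases, static_offsets_ps=None):
    """Return sampling instants within one clock period, one per phase.

    Ideal instants are spaced by one bit time; static_offsets_ps adds a
    fixed error to each phase.
    """
    ui_ps = 1000.0 / data_rate_gbps
    if static_offsets_ps is None:
        static_offsets_ps = [0.0] * n_phases
    if len(static_offsets_ps) != n_phases:
        raise ValueError("need one offset per phase")
    return [k * ui_ps + static_offsets_ps[k] for k in range(n_phases)]


def worst_offset_in_ui(data_rate_gbps, static_offsets_ps):
    """Largest absolute static offset, expressed as a fraction of one bit time."""
    ui_ps = 1000.0 / data_rate_gbps
    return max(abs(o) for o in static_offsets_ps) / ui_ps


if __name__ == "__main__":
    offsets = [0.0, 5.0, -3.0, 2.0]  # hypothetical static offsets in ps
    print(clock_frequency_ghz(20.0, 4), "GHz per phase")
    print(sampling_instants_ps(20.0, 4))
    print(sampling_instants_ps(20.0, 4, offsets))
    print(f"worst offset = {worst_offset_in_ui(20.0, offsets):.0%} of a bit time")
```

Because the offsets are static, they do not average out: whichever phase is worst sets the timing margin for its bits.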
19
2nd Generation Transmitter
Equalizing path
2-tap equalizer implemented to compensate for channel losses
Achieve 50ps analog delay with CML buffers
Notes: Analog delay, but replica bias…
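A 2-tap transmit equalizer combines the current bit with a scaled copy of the previous bit; at 20Gb/s the one-bit delay is the 50ps mentioned above. The sketch below is a behavioral model only, assuming the common pre-emphasis form y[n] = x[n] - alpha * x[n-1]. The tap sign, the value of alpha, and the function names are assumptions, not details taken from the slide.

```python
def bits_to_symbols(bits):
    """Map bits 0/1 to differential symbols -1.0/+1.0."""
    return [1.0 if b else -1.0 for b in bits]


def two_tap_equalize(symbols, alpha):
    """Apply y[n] = x[n] - alpha * x[n-1], taking x[-1] as 0.

    alpha is the (assumed) weight of the one-bit-delayed tap.
    """
    out = []
    previous = 0.0
    for s in symbols:
        out.append(s - alpha * previous)
        previous = s
    return out


if __name__ == "__main__":
    bits = [0, 1, 1, 0]
    # Transitions are boosted, repeated bits are de-emphasized.
    print(two_tap_equalize(bits_to_symbols(bits), alpha=0.25))
```

This model is unnormalized: a real driver has a fixed peak swing, so both taps would be scaled to keep the largest output within that limit.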
20
Fabrication: Test Chip
ST Microelectronics 0.13um test chip
307mW / transceiver
0.46mm^2
20mV input sensitivity
Layout labels: um Test Chip; 450um; 350um; Transmitter; 500um; 600um; Receiver; First 0.13um
21
All Results Single-Ended
Eye diagram labels: 80mV; 20Gb/s Ideal Channel; 43ps; 33mV; 20Gb/s 10GHz; 37ps
22
Results (cont’d)
Eye diagram labels: 20Gb/s Ideal Channel with α=0.37; 72mV; 36.4ps; 62mV; 20Gb/s 10GHz with α=0.37; 35ps
23
Rationale for Multi-cores
Next generation computing: multi-core processing, i.e. multiple, parallel DSPs (i.e. MACs)
Why can’t we achieve faster frequencies?
Wire delays don’t scale like transistors
Power increases exponentially (when pushing process technology)
Timing margins degraded by: variability, power supply noise, digital crosstalk
NOTE: More independent threads require more memory bandwidth
Intel, 80 Cores, ISSCC 2007
24
Research: Explore Parallel Serial Links
Serial links also exhibit the same characteristics:
Channel losses get worse
Power consumption increases significantly with bandwidth
Timing precision limited by: static phase offset (process variation), power-supply induced jitter, interchannel crosstalk
Serial links also need to push for high amounts of parallelism
How is this different from conventional link design?
Channel equalization becomes more difficult
Adjacent channel crosstalk
Difficult channel estimation problem (power, flexibility, data-rate, equalizer design, channel, distance)
Amortize clock power for multiple links
Distributed resonant clocking of analog/mixed-signal front-ends
25
Problem of IO
2500 pins / 2 ≈ 1200 differential pairs
Assume 10Gb/s per link: 12Tb/s bandwidth
At 100mW per link (10mW per Gb/s): 120W
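The budget above can be checked with a few lines of arithmetic. This sketch assumes the rounded count of 1200 differential links and reads the 120W total as 100mW per 10Gb/s link, i.e. 10mW per Gb/s; at 100mW per Gb/s the same link count would dissipate ten times more. The function and parameter names are hypothetical.

```python
def io_budget(total_pins=2500, links=1200, rate_gbps=10.0, power_per_link_mw=100.0):
    """Return aggregate bandwidth, total I/O power and energy per bit.

    links is the rounded number of differential links used on the slide;
    it must not exceed total_pins // 2.
    """
    if links > total_pins // 2:
        raise ValueError("more differential links than pin pairs")
    bandwidth_tbps = links * rate_gbps / 1000.0
    power_w = links * power_per_link_mw / 1000.0
    # 1 mW per Gb/s is numerically equal to 1 pJ per bit.
    energy_pj_per_bit = power_per_link_mw / rate_gbps
    return {
        "bandwidth_tbps": bandwidth_tbps,
        "power_w": power_w,
        "energy_pj_per_bit": energy_pj_per_bit,
    }


if __name__ == "__main__":
    print(io_budget())
    # Reading the figure as 100 mW per Gb/s instead (1000 mW per 10 Gb/s link):
    print(io_budget(power_per_link_mw=1000.0))
```

Either way the total scales linearly with link count and per-link power, so energy per bit is the number that has to come down.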
27
Stateye
Playing: fun with Stat-Eye
Homework examples: 5Gb/s -> 10Gb/s; worse channels; worse timing jitter
28
Next Time
Telegrapher’s Equation
Reflection coefficients
Channel Models: skin effect, dielectric constant, vias