Joint Design-Time and Post-Silicon Optimization for Analog Circuits: A Case Study Using High-Speed Transmitter Yiyu Shi, Wei Yao, Lei He, and Sudhakar Pamarti Electrical Engineering Dept., UCLA Speakers: Yu-Hsiu Wu and Liang-Yun Lin
Outline Introduction Design-Time Optimization Post-Silicon Tuning and Joint Optimization Optimization Framework Experimental Results Conclusions
Problem Statement. Goal: maximize parametric yield for analog circuits. Reasons for concern: analog circuits are highly sensitive to process variation; process variation degrades parametric yield, and the problem worsens with technology scaling. Techniques for maximizing parametric yield: design-time optimization; post-silicon tuning.
Existing Work. Design-time optimization: system level (system-level and circuit-level co-design); device level (transistor sizing and layout optimization). Post-silicon tuning: tunable amplifier; programmable capacitor array for filter, ADC; transistor finger selection to reduce mismatch; many more adaptive designs for analog/mixed-signal circuits … This work: first yield-driven circuit design technique that considers post-silicon tuning together with design-time optimization.
Adaptive / Tunable Circuits. Tunable circuits use a negative feedback loop to compensate for process variation. The traditional corner-based design methodology makes sure the circuit satisfies the design spec in all process corners. Circuit tunability does not come for free. Yield-driven optimization is required to prevent over-design.
Joint Design-Time and Post-Silicon Optimization. Use high-speed link transmitter design as an example. Proposed goal: maximize yield, where yield is defined by BER, while satisfying power and area constraints. Optimization framework: build models for analog building blocks from SPICE, including $V_t$ variation and tuning circuit cost; use SPICE-characterized cells as building units; combine branch-and-bound and gradient-ascent algorithms to find the global optimum effectively.
High-Speed Serial Link Example Consider the transmitter pre-emphasis filter Combats inter-symbol interference (ISI) Plays an important role in system performance Consumes most power at transmitter
Worst Case Jitter and Noise Amplitude. Eye diagram: synchronized superposition of all possible realizations of the signal viewed within a particular signal interval. Width of the eye opening: defines the time interval over which the received signal can be sampled without error. Height of the eye opening: together with the amount of amplitude noise at a specified sampling time, defines the SNR of the received signal. [Ref] Wei Yao, Yiyu Shi, Lei He and Sudhakar Pamarti, "Worst Case Timing Jitter and Amplitude Noise in Differential Signaling", ISQED 2009.
Worst Case Jitter and Noise Amplitude. Apply an RLGC lossy transmission line model according to the differential microstrip line geometry. Represent the channel impairments and crosstalk through the transmission line time-domain response. Formula-based jitter and amplitude noise model combining the effects of ISI, crosstalk, and pre-emphasis. Mathematical programming algorithms directly find the worst-case jitter and amplitude noise. [Figure labels: $V_s$: Tx CML; $G_L + sC_L$: Rx loading and parasitics.]
Worst Case Jitter and Noise Amplitude. For given input patterns, the models are within 5% of SPICE simulation. They give more reliable worst-case jitter and noise than Monte Carlo simulations and are, at the same time, 150x faster. [Figure: unmatched termination and reflection response. Left: SPICE. Right: formula-based model.]
Transmission Environment. Channel: attenuation, dispersion. Reflection: impedance mismatch. Inter-symbol interference: band-limited channel. Crosstalk: capacitive or inductive coupling. Other random noise, e.g., circuit thermal noise.
Transmitter Design. Pre-emphasis filter: last stage of the pre-driver; pre-filters the pulse with the inverse of the channel. $a_i$: input symbol; $b_i$: transmitter output; $W_j$: filter coefficient. Other stages of the pre-driver: sized according to logical effort.
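The filter equation itself is not reproduced in this transcript. The sketch below assumes the common FIR form $b_i = \sum_j W_j a_{i-j}$, which matches the symbol definitions above; the tap values and the function name `pre_emphasis` are illustrative assumptions, not values from the talk.

```python
import numpy as np

def pre_emphasis(symbols, taps):
    """Apply an FIR pre-emphasis filter: b[i] = sum_j taps[j] * symbols[i - j].

    Symbols before the start of the sequence are treated as zero.
    """
    symbols = np.asarray(symbols, dtype=float)
    taps = np.asarray(taps, dtype=float)
    # Full convolution, truncated to the input length (causal filter).
    return np.convolve(symbols, taps)[: len(symbols)]

# Hypothetical 2-tap filter: main cursor 1.0, post-cursor -0.25.
a = np.array([1, 1, -1, 1, -1, -1], dtype=float)  # input symbols a_i (+/-1)
W = np.array([1.0, -0.25])                        # filter coefficients W_j
b = pre_emphasis(a, W)                            # transmitter output b_i
```

With a negative post-cursor tap, the output is boosted after each transition and reduced when consecutive symbols repeat, which is the usual pre-emphasis behavior.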
Transmitter Design (cont’d). The LMS algorithm is used to find the optimal filter coefficients for a given number of taps n. Large transistor parasitic capacitance exists and is considered part of the channel. Transistor sizing is done through parallel-connected unit cells. Unit cells α are pre-characterized through simulation. An output swing constraint is applied to ensure the correct operating region. This eliminates SPICE simulation during optimization.
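As an illustration of LMS coefficient adaptation, here is a minimal sketch under stated assumptions: the channel is a known short impulse response `p`, the desired received symbol equals the transmitted symbol, and the taps are adapted on the channel-filtered symbols (valid because a linear time-invariant filter and channel commute). The function name, step size, and channel values are hypothetical.

```python
import numpy as np

def lms_taps(a, p, n_taps, mu=0.01, n_passes=20):
    """Adapt n_taps FIR coefficients with the LMS update w += mu * err * x."""
    # Filter and channel commute, so adapt on the channel-filtered symbols.
    x = np.convolve(a, p)[: len(a)]
    w = np.zeros(n_taps)
    w[0] = 1.0  # start from a pass-through filter
    for _ in range(n_passes):
        for i in range(n_taps - 1, len(a)):
            window = x[i - n_taps + 1 : i + 1][::-1]  # x[i], x[i-1], ...
            err = a[i] - w @ window                   # desired minus actual
            w += mu * err * window                    # LMS gradient step
    return w

rng = np.random.default_rng(0)
a = rng.choice([-1.0, 1.0], size=2000)  # random binary symbols
p = np.array([1.0, 0.5])                # hypothetical channel with one ISI tap
W = lms_taps(a, p, n_taps=4)
```

For this channel the adapted taps approach the truncated inverse of `p`, so the second tap comes out negative and cancels most of the post-cursor ISI.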
Performance Metric. [Figure: link block diagram with transmission, modulation, channel, demodulation, and reception blocks.] Bit Error Rate (BER): $\mathrm{BER} = N_e / (R \times t)$, where $N_e$ = number of errors, $R$ = data rate, $t$ = test time; measuring it requires a very large number of transmitted bits $R \times t$. Error Vector Magnitude (EVM): the error in the received symbol, $e = V_1 - V_2$, taken between the symbol vectors $V_1$ and $V_2$ in the I/Q plane.
Performance Metric (cont’d). The relation between EVM and BER can be obtained through simulation: it is monotonic and highly correlated. EVM can be measured efficiently with far less data. $a_i$: input symbol; $b_j$: transmitter output; $p_j$: channel response; $r_i$: received data; $M$: total number of data ($M < 10^4$).
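The EVM expression is not reproduced in this transcript. The sketch below assumes the received data is the transmitter output convolved with the channel response, $r_i = \sum_j p_j b_{i-j}$, and uses a standard RMS definition of EVM normalized by the reference symbol power; both the normalization and the function names are assumptions.

```python
import numpy as np

def received(b, p):
    """Received data r_i: transmitter output b convolved with channel response p."""
    return np.convolve(b, p)[: len(b)]

def evm_rms(r, a):
    """RMS error vector magnitude of r against the reference symbols a."""
    r = np.asarray(r)
    a = np.asarray(a)
    return np.sqrt(np.mean(np.abs(r - a) ** 2) / np.mean(np.abs(a) ** 2))

a = np.array([1.0, -1.0, 1.0, 1.0])  # M = 4 reference symbols
p = np.array([1.0, 0.5])             # hypothetical channel response
r = received(a, p)                   # no pre-emphasis here: b = a
evm = evm_rms(r, a)
```

Because EVM averages a continuous error over M symbols, it yields a usable estimate from far fewer symbols than counting rare bit errors does.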
Process Variation. Threshold voltage variation: doping fluctuations; short-channel devices, where channel length variation also causes $V_{th}$ variation; becomes dominant in the next few technology generations. Pre-emphasis filter coefficients: implemented as CMOS current sources; $V_{th}$ variation induces drain current mismatch. Assuming 10% variation in $V_{th}$: 30% variation in power; BER varies by several orders of magnitude.
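To illustrate why a small $V_{th}$ spread produces a larger current spread, here is a minimal Monte Carlo sketch. It assumes a long-channel square-law current source, $I_D = \tfrac{1}{2} k (V_{GS} - V_{th})^2$, reads "10% variation" as a 3-sigma bound, and uses hypothetical bias values; it is not the authors' device model and does not attempt to reproduce the 30% power figure.

```python
import numpy as np

def drain_current(v_gs, v_th, k=1e-3):
    """Square-law drain current (assumed model), zero below threshold."""
    v_ov = np.maximum(v_gs - v_th, 0.0)  # overdrive voltage
    return 0.5 * k * v_ov ** 2

rng = np.random.default_rng(1)
v_th_nom = 0.30                          # hypothetical nominal Vth in volts
sigma = 0.10 * v_th_nom / 3.0            # assumption: 10% variation is 3-sigma
v_th = rng.normal(v_th_nom, sigma, size=10_000)

i_d = drain_current(0.60, v_th)          # hypothetical gate bias of 0.6 V
i_nom = drain_current(0.60, v_th_nom)
rel_spread = i_d.std() / i_nom           # relative sigma of the drain current
```

Under these assumptions the relative current spread is roughly twice the relative $V_{th}$ spread, since the sensitivity is $2/(V_{GS} - V_{th})$ per volt.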
Post-Silicon Tuning through DAC. A current-division DAC is commonly used to combat process variation. Two design parameters: LSB size (γ), the minimum step during digital-to-analog conversion; resolution (β), the number of bits used, which implies the tuning range to be covered.
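The interplay of the two DAC parameters can be sketched as follows, assuming an ideal unsigned DAC whose output is `code * lsb` with `code` between 0 and `2**bits - 1`. The function names and numeric values are hypothetical.

```python
def dac_range(lsb, bits):
    """Largest value an ideal unsigned DAC can produce."""
    return lsb * (2 ** bits - 1)

def tune_code(target, lsb, bits):
    """Digital code whose output is closest to target, clipped to the DAC range."""
    code = round(target / lsb)
    return max(0, min(code, 2 ** bits - 1))

lsb, bits = 0.05, 4                  # hypothetical LSB size and resolution
code = tune_code(0.37, lsb, bits)    # nearest step to the wanted correction
residual = abs(code * lsb - 0.37)    # error left after tuning
```

Inside the covered range the residual error is at most half an LSB, so a smaller LSB tightens the tuning while more bits are needed to keep the same range, which is where the area and power cost comes from.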
Power and Performance Variation. [Figure panels: (a) without tuning; (b) with tuning.] For the same design, both power and performance variations are reduced significantly. Tuning circuits bring extra costs in area and power. Find the optimal trade-off between performance and area/power.
Problem Formulation. [Equation: optimization problem statement], where e is a random variable.
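The formulation itself did not survive in this transcript. Based only on the goal stated earlier in the talk (maximize BER-defined yield subject to power and area constraints), one plausible way to write it is sketched below. The symbols $\mathrm{BER}_{\text{spec}}$, $P_{\max}$, and $A_{\max}$, the use of γ for the LSB size, and the reading of $e$ as the random process variation are all assumptions.

```latex
\begin{aligned}
\max_{\alpha,\,\beta,\,\gamma}\quad
  & Y = \Pr\!\left[\,\mathrm{BER}(\alpha,\beta,\gamma;\,e) \le \mathrm{BER}_{\text{spec}}\,\right] \\
\text{s.t.}\quad
  & P(\alpha,\beta,\gamma) \le P_{\max}, \\
  & A(\alpha,\beta,\gamma) \le A_{\max}
\end{aligned}
```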
Yield vs. Power and Area. Significant improvement can be expected. The solution space surface is rough and many local maxima exist. Discrete problem with non-convex objective and constraints. 3000 Monte Carlo runs over different unit cell design α, resolution β, and LSB size.
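A Monte Carlo yield estimate is simply the fraction of sampled circuits that meet the spec. The sketch below assumes the per-sample metric is log10(BER) and draws it from a made-up normal distribution as a stand-in for real circuit evaluation; only the sample count of 3000 comes from the slide.

```python
import numpy as np

def estimate_yield(metric_samples, spec):
    """Fraction of Monte Carlo samples whose metric meets the spec (metric <= spec)."""
    metric_samples = np.asarray(metric_samples, dtype=float)
    return float(np.mean(metric_samples <= spec))

rng = np.random.default_rng(2)
# Hypothetical stand-in for per-sample log10(BER) under process variation.
log_ber = rng.normal(loc=-13.0, scale=1.0, size=3000)
y = estimate_yield(log_ber, spec=-12.0)  # yield for a BER spec of 1e-12
```

In an optimization loop this estimate would be recomputed for each candidate design point, which is why a cheap per-sample performance model matters.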
Branch and Bound with Gradient Ascent Method. Basic idea: partition the solution space by α and γ; obtain an upper bound on the performance in each sub-region; discard the sub-region if the bound is worse than the current best solution. Use the gradient ascent method to find the local maxima, sequentially taking steps along the gradient direction. Bound estimation: remove the area and power constraints and take the resulting optimal performance as the upper bound; use the LMS algorithm to find the locally optimal coefficients, which gives the best feasible performance.
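The pruning idea can be sketched on a toy problem. This is not the authors' implementation: it branches on α only, uses a discrete hill climb over γ as the gradient-ascent step, and takes the bound from the unconstrained optimum, mirroring the "remove the constraints" idea. The objective `perf`, the constraint `feasible`, and the bound `h` are invented, and the hill climb is exact here only because `perf` is concave in γ for fixed α.

```python
import math

def hill_climb(a, gammas, perf, feasible):
    """Discrete gradient ascent over gamma for a fixed alpha."""
    idx = next((i for i, g in enumerate(gammas) if feasible(a, g)), None)
    if idx is None:
        return None  # no feasible gamma for this alpha
    while True:
        neighbors = [j for j in (idx - 1, idx + 1)
                     if 0 <= j < len(gammas) and feasible(a, gammas[j])]
        best = max(neighbors, key=lambda j: perf(a, gammas[j]), default=idx)
        if perf(a, gammas[best]) <= perf(a, gammas[idx]):
            return gammas[idx]  # no improving feasible neighbor
        idx = best

def branch_and_bound(alphas, gammas, perf, feasible, upper_bound):
    best_val, best_pt, pruned = -math.inf, None, 0
    # Visit the most promising branches first so pruning starts early.
    for a in sorted(alphas, key=upper_bound, reverse=True):
        if upper_bound(a) <= best_val:
            pruned += 1  # bound cannot beat the incumbent: discard branch
            continue
        g = hill_climb(a, gammas, perf, feasible)
        if g is not None and perf(a, g) > best_val:
            best_val, best_pt = perf(a, g), (a, g)
    return best_pt, best_val, pruned

def h(a):
    """Unconstrained best performance for alpha = a (multi-modal in a)."""
    return 10.0 - 0.5 * (a - 5) ** 2 + 2.0 * math.cos(2.0 * a)

def perf(a, g):
    return h(a) - (g - a) ** 2  # concave in g, peaks at g = a

def feasible(a, g):
    return a + g <= 8  # toy stand-in for the area/power budget

alphas, gammas = list(range(9)), list(range(9))
best_pt, best_val, pruned = branch_and_bound(alphas, gammas, perf, feasible, h)
```

On this toy problem most branches are discarded by the bound without running the inner search, and the result matches exhaustive enumeration.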
BER Distribution Comparison. Two extreme cases: (1) without tuning circuit: all resources are used for filter design; large variation is unavoidable; (2) one-tap filter: all resources are used for the DAC; extremely small variance but severe ISI. Manual design: assumes the LSB size is equal for each tap; a good balance between the two extreme cases. Our algorithm: provides a better solution.
Experimental Results. Yield comparison under different constraints: area, $V_t$ variation, power. The yield improves by up to 47%.
Conclusions. Using high-speed link transmitter design as an example, we propose to maximize BER yield subject to power and area constraints. Build models for analog building blocks from SPICE, including $V_t$ variation and the cost of the tuning circuit. Combine branch-and-bound and gradient-ascent algorithms to find the global optimum effectively. Experiments show that, compared to manual design, joint design-time and post-silicon optimization can improve the yield by up to 47%.
Thank you !