Presentation transcript:

EE201C SRAM Simulation
Stephen Govea, Javad Zandazad

Performance Constraints
- LTSpice != HSPICE: ignored transistor parameters made the provided constraints give 100% yield in our calculations
- Recalculated performance constraints:
  - Read: 155 mV nbit voltage at 10 ps
  - Write: evaluate the write condition at 6.5 ps
- Nominal performance (10k MC, Vth = 0.2607 V, Leff = 0.1e-6 m): read yield 46%, write yield 65%

SPICE simulation differences prevented us from using the provided performance constraints: at the nominal parameter points they produced a 100% read/write yield. Instead, we performed a 2000-sample Monte Carlo analysis with the nominal circuit parameters and captured the results in a pair of histogram plots. The first plot shows the 'crossing time', the time at which the nbit voltage exceeds the bit voltage and thus indicates 'write' success. The second plot shows the distribution of the nbit voltage at 10 ps. From these plots we selected new performance constraints: a 155 mV read constraint and a 6.5 ps write constraint.

MATLAB CalculatePerformanceConstraints(2000) output:
***** RESULTS *****
Mean voltage at 10ps after 2000 simulations (mV): 1.564609e+002
Voltage mode at 10ps (mV): 1.558103e+002
Voltage median at 10ps (mV): 1.564170e+002
Mean crossing time (ps): 6.499229e+000
Crossing time mode (ps): 6.787109e+000
Crossing time median (ps): 6.500000e+000
Elapsed time is 2068.060557 seconds.
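A minimal MATLAB sketch of the statistics step behind that output, assuming vectors v10 (nbit voltage at 10 ps, in mV) and tcross (bit/nbit crossing time, in ps) collected over the 2000 runs; the names here are illustrative, not the internals of CalculatePerformanceConstraints:

    % Illustrative only: summarize the MC data to choose the constraints.
    % Assumes v10 (mV at 10 ps) and tcross (ps) from the 2000 LTSpice runs.
    fprintf('Mean voltage at 10ps (mV): %e\n', mean(v10));
    fprintf('Voltage median at 10ps (mV): %e\n', median(v10));
    [counts, edges] = histcounts(v10, 50);   % mode estimated from histogram bins
    [~, k] = max(counts);
    fprintf('Voltage mode at 10ps (mV): %e\n', (edges(k) + edges(k+1)) / 2);
    fprintf('Mean crossing time (ps): %e\n', mean(tcross));
    fprintf('Crossing time median (ps): %e\n', median(tcross));
    read_constraint_mV  = 155;   % chosen near the voltage median
    write_constraint_ps = 6.5;   % chosen near the crossing-time median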

LTSpice Hack – .DATA
- Absence of HSPICE's .DATA feature made simulations run slowly (~2 sim/s)
- Used user-defined functions and a counter variable tm
  - Only one equation term is non-zero for each value of tm
  - tm increments from 1 to N
- Dramatic performance improvement (~5x)
- Necessitated linear interpolation for output values

One of our initial challenges was our lack of access to HSPICE and our reliance on LTSpice for circuit simulation. Beyond recalculating the performance constraints to fit our simulation environment, we were hindered by the absence of the .DATA functionality present in HSPICE. Our initial simulations ran very slowly, approximately one sample every 0.5 s, since every sample required a separate call to LTSpice from MATLAB. We solved this problem through a novel use of user-defined LTSpice functions. We created four functions with N terms, where N is the number of simulation samples and each term encodes the requested value of one parameter at one sample iteration. We also used the .STEP directive to create a counter variable tm running from 1 to N, which tells LTSpice to run the netlist N times. By gating each term of a user-defined function with a unique expression we could selectively pick out the appropriate value in each of the N runs. The gate combines two unit-step expressions: the first turns on at the desired iteration and the second at the following iteration, so their difference is 1 only during the desired iteration; multiplying that difference into the rest of the term leaves exactly one non-zero term per function per run.
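A minimal MATLAB sketch of how such a function definition can be generated (illustrative, not our production script); it relies on LTSpice's built-in unit step u(x), which is 1 for x > 0 and 0 otherwise:

    % Illustrative: build an LTSpice .func whose value equals vals(k)
    % when the .STEP counter tm equals k.
    function line = build_step_func(name, vals)
        N = numel(vals);
        terms = cell(1, N);
        for k = 1:N
            % (u(tm-(k-0.5)) - u(tm-(k+0.5))) is 1 only when tm == k
            terms{k} = sprintf('%g*(u(tm-%g)-u(tm-%g))', vals(k), k - 0.5, k + 0.5);
        end
        line = sprintf('.func %s() {%s}', name, strjoin(terms, ' + '));
    end

The netlist then carries '.step param tm 1 N 1', so a single LTSpice invocation sweeps tm and runs all N samples.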

Importance Sampling Method
Goal: focus sampling on the problem regions
Implementation:
- Divide the [-3σ, 3σ] parameter space into 8 regions
- Uniformly sample the given parameter space
- Assign the failed results to their parameter regions
- Distribute the remaining samples according to the relative number of failures in each region (sketched below)
- Convert the yield/power results back to the normal parameter distribution
- Include the uniform-sampling yield results
- Tune the algorithm by adjusting the number of uniform and total samples
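A minimal MATLAB sketch of the allocation step, assuming fail_count(1:8) holds the failures observed per region during the uniform phase (all names illustrative):

    % Illustrative: split the remaining budget across the 8 regions in
    % proportion to the failures seen during the uniform phase.
    n_total   = 350;                 % total sample budget
    n_uniform = 16;                  % uniform samples (see the tuning slides)
    fail_count = [0 3 1 0 5 2 0 1];  % example counts, one per region
    weights = fail_count / max(sum(fail_count), 1);   % guard against all-pass
    n_per_region = round((n_total - n_uniform) * weights);
    % Our reading of the "convert to normal distribution" step: each region's
    % results are re-weighted by that region's probability mass under the
    % normal parameter distribution before combining with the uniform-phase
    % results, so the final yield/power estimate is not biased by the
    % failure-focused sampling.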

Importance Sampling Tuning – Number of Uniform Samples
Monte Carlo baseline: 10,000 samples; R: 46.44%, W: 65.33%
Importance results: shaded regions mark superior performance
Diff metric: diff = abs(read_yield - 0.4644) + abs(write_yield - 0.6533);
Baseline MC run: parameters 0.2607, 0.2607, 0.1e-6, 0.1e-6 at 10k samples; output 0.4644, 0.6533, 6.5528e-6, 1.5704e-5, 2.125e-13 (read yield, write yield, three further metrics); 6170 seconds. Target constraints: 155 mV, 6.5 ps.
Narrative: We started by performing a 10k MC sampling at the nominal parameter values and calculating the read and write yield results; these served as the baseline for comparing our QMC Importance results. Our importance sampling implementation has two key tuning parameters; the first is the number of uniform samples evaluated before sampling the higher-failure regions. The graph illustrates the effect of changing the number of uniform samples on the overall yield difference from the baseline case; each iteration used 350 samples total, which we had previously identified as the minimum number of samples for the QMC simulation to converge. The two highest-performing and reasonably stable regions are highlighted. Surprisingly, relatively few uniform samples are required before the yield results approach the accuracy we saw with the QMC implementation, and the error increases dramatically once the number of uniform samples passes ~45. From these results, we tested our importance sampling method using 40 and 16 uniform samples. Overall, the 40-sample case performed rather poorly and did not offer a significant improvement over the QMC-only case. However, the 16-sample case proved somewhat more promising…
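The sweep behind the graph can be reproduced with a loop of this shape (a sketch; run_importance is a stand-in for our sampler, not a library call):

    % Sketch: sweep the number of uniform samples at a fixed 350-sample
    % budget and measure the deviation from the 10k MC baseline yields.
    n_values = 1:60;
    diffs = zeros(size(n_values));
    for i = 1:numel(n_values)
        [read_yield, write_yield] = run_importance(n_values(i), 350);
        diffs(i) = abs(read_yield - 0.4644) + abs(write_yield - 0.6533);
    end
    plot(n_values, diffs);
    xlabel('Uniform samples'); ylabel('Yield diff vs. baseline');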

Importance Sampling Tuning – Overall Number of Samples
Sanity check: the 16-uniform-sample case vs. the QMC-only method
As a sanity check, we compared the QMC-only method against the QMC Importance implementation with 16 uniform samples. As the graphs show, the yield values track each other closely and the power calculations are virtually identical.

Importance Sampling Tuning – Overall Number of Samples
- Increase accuracy: +2% at 300 samples, with the curves crossing at 250
- Reduce samples: a 35% reduction gives comparable accuracy (+2%)
The key

Simulated Annealing Method
- Randomly pick starting transistor parameters
- Simulate SRAM performance and calculate the score
- For T attempts:
  - Is the current best case good enough?
  - Return to the best case after 75% complete
  - Generate a new parameter set, simulate, and score it
  - Decide whether to accept the new parameter set:
    - If better than the current or stored best case, accept
    - Otherwise accept with a probability based on temperature
- Maintain current, best, and suggested parameter sets
- Return the best-performing parameter set (the loop is sketched below)
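A minimal MATLAB sketch of the loop, with random_start, simulate_and_score, and generate_neighbor standing in for our routines (names illustrative; higher score is better):

    % Illustrative annealing loop over the transistor parameters.
    max_num_attempts = 600;                       % tests used 600 or 200
    cur_params = random_start();                  % stand-in initializer
    cur_score  = simulate_and_score(cur_params);
    best_params = cur_params;  best_score = cur_score;
    for attempt = 1:max_num_attempts
        cur_temp = 1 - attempt / max_num_attempts;      % cooling schedule
        if attempt == round(0.75 * max_num_attempts)    % return to best case
            cur_params = best_params;  cur_score = best_score;
        end
        new_params = generate_neighbor(cur_params, cur_temp);
        new_score  = simulate_and_score(new_params);
        accept_probability = exp(new_score - cur_score) * cur_temp;
        if new_score > cur_score || rand < accept_probability
            cur_params = new_params;  cur_score = new_score;
        end
        if new_score > best_score                       % track stored best
            best_params = new_params;  best_score = new_score;
        end
    end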

Simulated Annealing Tuning
Generate neighbor parameters: used temperature and performance to adjust the neighbor range

    % Range shrinks as performance nears the threshold, floored at 0.2
    performance_ratio = max(0.2, 1 - (cur_perform(6) / performance_threshold));
    if (cur_temp > 0.25)
        change_percentage = (0.2 * performance_ratio * cur_temp) + 0.05;
    else
        change_percentage = (0.15 * performance_ratio * cur_temp) + 0.05;
    end

Current temperature: cur_temp = 1 - T / max_num_attempts
Calculate score: evenly weighted read & write yield

Simulated Annealing Tuning
Accept probability:

    % Better suggestions are accepted outright by the rule on the previous
    % slide; this probability governs the remaining cases.
    accept_probability = exp(suggest_perform(6) - cur_perform(6)) * cur_temp
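In use this feeds a Metropolis-style draw, with values above 1 always accepting (a sketch; suggest_params is our naming assumption alongside suggest_perform):

    % Illustrative acceptance test for a suggested parameter set
    if rand < accept_probability
        cur_params  = suggest_params;
        cur_perform = suggest_perform;
    end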

Simulated Annealing Tuning
Parameter trending: track the overall score against the parameter trend (sketched below)
- Maintain a window of size n (= 10) over the overall score and the parameters.
- Within that window, check whether the score is more often improving (i.e., getting smaller) and whether the parameters follow a consistent trend.
- If the score is improving, favor that parameter trend in the next window; otherwise steer away from the trend.
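A minimal MATLAB sketch of the window check, assuming score_hist (n-by-1) and param_hist (n-by-4, one column per parameter) hold the last n iterations (names illustrative):

    % Illustrative trend check over the last n = 10 iterations.
    n = 10;
    score_improving = sum(diff(score_hist) < 0) > n / 2;    % mostly getting smaller, per the slide
    param_trend = sign(median(diff(param_hist, 1, 1), 1));  % per-parameter drift direction
    if score_improving
        trend_bias = param_trend;      % favor the trend in the next window
    else
        trend_bias = -param_trend;     % steer away from the trend
    end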

Simulated Annealing Results

Test      Read %   Write %   Score    Attempts   Total Sim.   Comparison (score / attempts / sims vs. baseline)
Baseline  97.73%   97.47%    97.60%   2,592      1,036,800    -
1         86.33%   91.15%    88.74%   600        180,000      90.9% / 23.1% / 17.4%
2         87.45%   91.35%    89.40%   600        180,000      91.6%
3         84.64%   93.62%    89.13%   600        180,000      91.3%
4         87.51%   92.02%    89.77%   600        180,000      92.0%
5         87.10%   91.67%    89.39%   200        60,000       91.6% / 7.7% / 5.8%
6         87.75%   92.32%    90.04%   200        60,000       92.2%
7         87.95%   92.31%    90.13%   200        60,000       92.3%
8         86.35%   91.74%    89.05%   200        60,000       91.2%

Test #1: Random neighbor sampling with a 15% allowed range; QMC Importance testing method, 300 samples at each point. Best parameters: 0.324080, 0.209919, 9.96103e-08, 9.57181e-08. Best performance (read yield, write yield, three further metrics, overall score): 0.863272, 0.911540, 2.41610e-06, 2.73889e-06, 2.04861e-13, 0.887406. Found the "best" case around sample 200.

Test #2: Random neighbor sampling with a 20% allowed range; QMC Importance testing method, 300 samples at each point. Best parameters: 0.327712, 0.205155, 9.82275e-08, 9.55700e-08. Best performance: 0.874528, 0.913540, 2.42719e-06, 2.75108e-06, 2.04083e-13, 0.894034. Found the "best" case around sample 350.

Test #3: Moved to the best parameter values with 20% of the attempts left. Combined the performance ratio with a switch value: after 5 failed attempts to find better neighbor values, the switch removed performance as a factor in generating neighbors. The performance ratio was floored at 0.25 so it could not over-restrict the neighbor space even at high performance. Overall percentage was 20% plus a 2.5% offset to keep the change percentage above zero. 600 attempts. Best performance: 0.846394, 0.936161, 2.97755e-07, 7.85617e-07, 2.07652e-13, 0.891277. Best parameters: 0.307191, 0.210713, 9.86896e-08, 9.75103e-08.

Test #4: Same scheme as Test #3, but with a 15% overall percentage plus the 2.5% offset. 600 attempts. Best performance: 0.875139, 0.920193, 2.39930e-06, 2.83457e-06, 2.14035e-13, 0.897666. Best parameters: 0.320895, 0.205487, 1.01272e-07, 1.00605e-07.

Test #5: Moved to the best parameter values with 20% of the attempts left. Used the performance ratio and the current temperature to tune neighbor selection. Performance ratio floored at 20%; 20% overall range plus a 5% baseline. 200 attempts. Best performance: 0.871014, 0.916744, 2.34746e-06, 2.66498e-06, 2.07240e-13, 0.893879. Best parameters: 0.320204, 0.213544, 9.80795e-08, 9.74058e-08.

Test #6: Like Test #5, but with 15% overall sampling plus the 5% baseline. Best performance: 0.877499, 0.923155, 2.35122e-06, 2.66936e-06, 2.11099e-13, 0.900327. Best parameters: 0.321678, 0.206889, 1.03270e-07, 9.84990e-08.

Test #7:

Test #8: Moved the return-to-best point up to 75% of the way through the attempts and introduced a new neighbor function for that final period, reducing the overall percentage to 10% with a 2.5% baseline.