Advanced Computing and Information Systems laboratory

Device Variability Impact on Logic Gate Failure Rates
Erin Taylor and José Fortes
Department of Electrical and Computer Engineering
Advanced Computing and Information Systems Lab
University of Florida, Gainesville, FL, USA

Challenges of CMOS Scaling
Challenge: The thickness of SiO2 gate insulators is constrained by gate leakage.
Solution: Intel's high-k gate dielectric will enable scaling down to the 22nm node.
Challenge: It was assumed that spatial resolution smaller than the wavelength of light used in lithography was not possible.
Solution: New techniques, such as immersion and EUV lithography, have achieved resolutions far beyond this point.

Challenges of CMOS Scaling
Challenge: The inability to precisely control device fabrication at the nanoscale, as well as age-related device wear-out, will result in an increasing amount of variability among devices.
Solution: Very little can be done to prevent increases in variability, since it is an intrinsic consequence of scaling. We will be working against the Law of Large Numbers.

Device Variability
Variability has become a first-order limitation to continued scaling.
If you can't beat them, join them!
In order to tolerate variability we need to
1) Characterize the extent of variability
2) Translate variability into fault models and failure rates for logic gates in order to evaluate circuit reliability
3) Efficiently apply fault-tolerant techniques to designs
-OR-
Develop new computing methodologies that harness the probabilistic nature of devices to perform useful computation

Goals of Our Work
In this work, we
1) Characterize the extent of device variability from several key sources
2) Simulate the effects of this variability in end-of-the-roadmap CMOS and determine corresponding fault models and failure rates for logic gates
3) Review our Probabilistic Gate Model method for calculating circuit reliability, which can incorporate these realistic fault models and failure rates in order to evaluate effective fault-tolerant techniques

Characterizing Device Variability
Sources of variability
Manufacturing related:
- Fluctuations in dopant profiles
- Atomic-scale variations
Age related:
- Negative Bias Temperature Instability (NBTI)
- Hot-Carrier Injection effects (HCI)
The critical effect of these sources of variability is variation in device threshold voltage (V_th).

Characterizing Device Variability
Extrapolating manufacturing-related variability to end-of-the-roadmap CMOS using observations made by Bernstein & Mayergoyz:
- Dopant fluctuations (22nm, 16nm)
- Variation in oxide thickness (10% increase, 22nm → 16nm) [Ref: Bernstein et al.]
Manufacturing-related variability results in a V_th shift that is normally distributed with mean 0V.

Characterizing Device Variability
Age-related variability effects on V_th:
  HCI (nMOS)     50mV    55mV
  NBTI (pMOS)   -50mV    55mV
Combined effect of manufacturing- and age-related variability, assuming independence:
V_th shift normally distributed with mean ±50mV and a node-dependent spread (22nm, 16nm).

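The independence assumption on this slide has a simple consequence that can be sketched in code: when independent normal contributions are summed, their means add and their variances add. The function name and the numeric values below are hypothetical and chosen only for illustration; they are not the per-node figures from the slides.

```python
import math

def combine_independent_normals(components):
    """Combine independent normal V_th shifts given as (mean, std) pairs.

    For independent normal variables the means add and the variances add,
    so the combined standard deviation is the root of the summed variances.
    """
    mean = sum(m for m, _ in components)
    std = math.sqrt(sum(s * s for _, s in components))
    return mean, std

# Hypothetical illustrative values in volts (assumptions, not slide data).
manufacturing = (0.0, 0.080)  # zero-mean manufacturing-related shift
aging = (0.050, 0.060)        # age-related shift with a non-zero mean

mean, std = combine_independent_normals([manufacturing, aging])
print(f"combined mean = {mean * 1e3:.1f} mV, std = {std * 1e3:.1f} mV")
```

Note that the combined spread is smaller than the plain sum of the two standard deviations, which is why the independence assumption matters for the resulting failure rates.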
Simulating Effects of Variability
We simulated both 22nm and 16nm transistors using BSIM4 predictive MOSFET SPICE models.
22nm node transistors
- We used the Predictive Technology Model developed by the Nanoscale Integration and Modeling Group at Arizona State Univ.
- Effective gate length: 10nm
- Gate oxide thickness: 1.2nm
- V_dd: 0.9V
16nm node transistors
- We developed a model for 16nm node transistors based on ITRS predictions
- Effective gate length: 7nm
- Gate oxide thickness: 1nm
- V_dd: 0.8V

Simulating Effects of Variability
To study how variability translates into logic gate failures, we simulated the behavior of NAND and NOT gates using Cadence's Spectre device simulator.
V_th of the pMOS or nMOS transistors was varied in order to mimic the effect of device variability.

Results
[Figure: The output of an inverter gate versus V_th for a) nMOS and b) pMOS. Curves: 22nm, Input = 0; 22nm, Input = 1; 16nm, Input = 0; 16nm, Input = 1]

Results
[Figure: The output of a NAND gate versus V_th for a) nMOS and b) pMOS. Curves: 22nm, Input = 00; 22nm, Input = 10; 22nm, Input = 11; 16nm, Input = 00; 16nm, Input = 10; 16nm, Input = 11]

Results
- As V_th shifts in the positive direction, an output that should be at 0V rises → stuck-at-1 fault
- As V_th shifts in the negative direction, an output that should be at V_dd falls → stuck-at-0 fault
- In the worst case, the V_th shift may push the output beyond its noise margins and cause an error

Logic Gate Fault Models
  Fault Model   Direction of V_th shift   Likely Cause
  Stuck-at-1    + V_th shift              nMOS, since V_th shift has μ = 50mV
  Stuck-at-0    - V_th shift              pMOS, since V_th shift has μ = -50mV
  Inversion     ± V_th shift              Both pMOS and nMOS

Device Failure Rates
Calculating the probability of logic gate failure
1) Calculate the noise margins of the device
2) Determine the minimum V_th shift needed for the output to exceed the noise margin
3) Find the probability that the V_th shift equals or exceeds this value, using the V_th distributions determined previously

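Step 2 of the procedure above can be sketched as a search over simulated sweep data. The following is a minimal sketch, assuming the simulator output is available as paired lists of V_th shifts and gate output voltages; the function name, the linear interpolation between samples, and the sample data are all assumptions for illustration and do not come from the slides.

```python
def critical_shift(shifts, outputs, limit, rising=True):
    """Return the smallest-magnitude V_th shift whose output crosses `limit`.

    shifts  : V_th shifts in volts, ordered by increasing magnitude
    outputs : simulated gate output voltage for each shift, in volts
    limit   : noise-margin boundary for the output, in volts
    rising  : True if a fault means the output rises to `limit` or above
              (output should be 0V), False if it falls to `limit` or below
    Linear interpolation is used between the two bracketing samples.
    """
    for i, (s, v) in enumerate(zip(shifts, outputs)):
        crossed = v >= limit if rising else v <= limit
        if crossed:
            if i == 0:
                return s  # already outside the margin at the first sample
            s0, v0 = shifts[i - 1], outputs[i - 1]
            return s0 + (limit - v0) * (s - s0) / (v - v0)
    return None  # the sweep never leaves the noise margin

# Hypothetical pMOS sweep: the output should stay near V_dd but sags
# as the shift grows more negative (values are made up for illustration).
sweep_shifts = [0.0, -0.2, -0.4, -0.6]
sweep_outputs = [0.80, 0.78, 0.70, 0.40]
crit = critical_shift(sweep_shifts, sweep_outputs, limit=0.5, rising=False)
print(crit)
```

The returned value is the input to step 3, where it is compared against the V_th distribution.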
Device Failure Rates
Example: The probability of a stuck-at-0 fault in a 16nm inverter
- When the V_th shift equals -0.54V in the pMOS device, the output will be 0.48V, which exceeds noise margins
- Use the normal distribution of the V_th shift: μ = -50mV and σ = 100mV for 16nm pMOS
- Calculate: P(V_th shift ≤ -0.54V) = 5.07 × 10^-7

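The tail probability in this example can be reproduced approximately with the standard normal CDF. This is a minimal sketch using only the rounded inputs quoted on the slide (mean -50mV, standard deviation 100mV, critical shift -0.54V); the helper name is an assumption. With these rounded inputs the result is on the order of 10^-7, in line with the reported 5.07 × 10^-7, and a small difference from rounding is to be expected.

```python
import math

def normal_cdf(x, mu, sigma):
    """P(X <= x) for X ~ Normal(mu, sigma), via the complementary error function."""
    z = (x - mu) / sigma
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

mu = -0.050              # mean V_th shift for 16nm pMOS, in volts
sigma = 0.100            # standard deviation of the shift, in volts
critical_value = -0.54   # shift at which the output leaves its noise margin

# Stuck-at-0: the shift must be at least this negative, so use the lower tail.
p_stuck_at_0 = normal_cdf(critical_value, mu, sigma)
print(f"P(stuck-at-0) ~ {p_stuck_at_0:.2e}")
```

For a stuck-at-1 fault in an nMOS device the same helper would be used with the upper tail, `1 - normal_cdf(...)`.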
Evaluating Circuit Reliability
Probabilistic Gate Models (PGMs) are equations that describe the behavior of unreliable logic gates.
Each PGM assumes that
- A gate fails according to a particular fault model (stuck-at or inversion)
- The probability of such a fault is ε
Define: X = P(x=1), Y = P(y=1), Z = P(z=1)
PGM: Z = P(z=1 | no error) P(no error) + P(z=1 | error) P(error)
Example: NAND with inversion fault
  P(z=1 | no error) = (1-X)(1-Y),      P(no error) = 1-ε
  P(z=1 | error)    = 1-(1-X)(1-Y),    P(error)    = ε
  Z = (1-X)(1-Y)(1-ε) + (1-(1-X)(1-Y)) ε

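The general PGM form on this slide can be written as a small function. This is a sketch under stated assumptions: gate inputs are treated as independent, the function and parameter names are invented for illustration, and the stuck-at cases are my reading of the general form (output forced to 1 or 0 when the fault occurs). One point worth noting: for independent inputs, the error-free term (1-X)(1-Y) used in the slide's example is the probability that a NOR gate outputs 1, whereas a NAND gate gives 1 - XY, so the sketch provides both.

```python
def pgm_output(p_one_no_error, eps, fault="inversion"):
    """Z = P(z=1 | no error)(1 - eps) + P(z=1 | error) eps."""
    if fault == "inversion":
        p_one_error = 1.0 - p_one_no_error  # output is flipped on error
    elif fault == "stuck-at-1":
        p_one_error = 1.0                   # output forced to 1 on error
    elif fault == "stuck-at-0":
        p_one_error = 0.0                   # output forced to 0 on error
    else:
        raise ValueError(f"unknown fault model: {fault}")
    return p_one_no_error * (1.0 - eps) + p_one_error * eps

def nand_ideal(X, Y):
    """P(z=1) for an error-free NAND with independent inputs."""
    return 1.0 - X * Y

def nor_ideal(X, Y):
    """P(z=1) for an error-free NOR with independent inputs."""
    return (1.0 - X) * (1.0 - Y)

X, Y, eps = 0.5, 0.5, 0.01  # illustrative values only
z_nand = pgm_output(nand_ideal(X, Y), eps)
z_nor = pgm_output(nor_ideal(X, Y), eps)
print(z_nand, z_nor)
```

Keeping the error-free gate function separate from the fault model makes it straightforward to swap in the stuck-at models that the device simulations point to.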
Evaluating Circuit Reliability
PGM method for evaluating circuit reliability
- Apply the PGM equation topologically for each gate in the circuit, starting from the primary inputs and proceeding to the outputs
- Deal with signal dependencies created through fanouts by circuit transformations described in our previous work
Process Summary
Gate fault models and failure rates → PGM equations → Circuit reliability → Efficient application of fault-tolerant techniques

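The topological application of PGM equations can be sketched for a small netlist. This is a simplified illustration, not the authors' implementation: it assumes an inversion fault with the same probability `eps` at every gate, treats all signals as independent, and therefore does not include the fanout-handling circuit transformations mentioned above, so it is exact only for fanout-free circuits. The netlist format and all names are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Error-free P(output=1) for each gate type, assuming independent inputs.
IDEAL = {
    "NAND": lambda a, b: 1.0 - a * b,
    "NOT": lambda a: 1.0 - a,
}

def evaluate(netlist, input_probs, eps):
    """Propagate P(signal=1) from primary inputs to outputs.

    netlist     : {gate_output: (gate_type, (input_signal, ...))}
    input_probs : {primary_input: P(input=1)}
    eps         : inversion-fault probability applied to every gate
    """
    probs = dict(input_probs)
    graph = {out: ins for out, (_, ins) in netlist.items()}
    for node in TopologicalSorter(graph).static_order():
        if node in probs:
            continue  # primary input, probability already known
        kind, ins = netlist[node]
        ideal = IDEAL[kind](*(probs[i] for i in ins))
        # PGM with inversion fault: correct with 1-eps, flipped with eps.
        probs[node] = ideal * (1.0 - eps) + (1.0 - ideal) * eps
    return probs

# An AND function built from a NAND followed by a NOT (no fanout).
netlist = {
    "n1": ("NAND", ("a", "b")),
    "out": ("NOT", ("n1",)),
}
result = evaluate(netlist, {"a": 1.0, "b": 1.0}, eps=0.01)
print(result["out"])
```

With both inputs at 1 the correct output is 1, so `result["out"]` is the probability that this small circuit produces the right value for that input.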
Conclusion
In this work, we have
- Characterized device variability using current data on manufacturing- and age-related effects
- Determined fault models and failure rates for logic gates by simulating devices at the 22nm and 16nm nodes
- Provided a framework for calculating realistic measures of circuit reliability based on fault models and failure rates
Future Work
- Incorporate additional sources of variation
- Explore new computing methodologies that can use the probabilistic nature of devices to perform reliable computation

Questions
This work is supported by: NASA award No. NCC