Trace-Based Framework for Concurrent Development of Process and FPGA Architecture Considering Process Variation and Reliability


Trace-Based Framework for Concurrent Development of Process and FPGA Architecture Considering Process Variation and Reliability
Lerong Cheng (1), Yan Lin (1), Lei He (1), and Yu Cao (2)
(1) EE Department, UCLA
(2) EE Department, ASU
Address comments to

Outline
- Introduction
  - Review of existing work
  - Process models
- Concurrent development of process and architecture
  - Power and delay
  - Process variation
- Concurrent development for reliability
  - Device aging
  - Permanent soft error rate (SER)
  - Interaction between process variation and reliability
- Conclusion

Review of Previous Work
- Device and architecture co-optimization
  - Power and delay [Cheng DAC’05]
  - Process variation [Wong ICCAD’05]
  - Soft error rate [Lin ICCAD’07]

Limitation of Ptrace
- Ptrace requires a stable SPICE model which is able to consider all process corners
  - SPICE model is not available at the early stage of process development
- Circuit simulation for all process corners is time consuming
  - The accuracy of circuit simulation is not needed for quick architecture evaluation
- Does not handle realistic variation
  - Non-Gaussian variation sources
  - Spatial correlation
- Does not handle device aging

Extended Ptrace (Ptrace2)
[Block diagram]
- Input
  - Trace: circuit element statistics, critical path structure, switching activity
  - Process parameters
- PTrace2 blocks: transistor electrical characteristics; circuit level power and delay estimation; chip level power and delay estimation; variation analysis; reliability
- Output
  - Chip level: leakage power, dynamic power, delay
  - Reliability: soft error rate, device aging
  - Process variation: power distribution, delay distribution

Early-Stage Circuit Modeling
- ITRS MASTAR4 model [ITRS MASTAR4 2005]
- Inputs: Lgate, Tox, Nbulk, Xjext, W, Racc, T, Vdd
- Outputs: Ioff, Ion, Igon, Igoff, Cg, Cdiff


Circuit Level and Chip Level Power and Delay
- Circuit level power and delay
  - Inverter
  - Pass transistor driven by an inverter
- Chip level power and delay
  - Similar to the original Ptrace [Cheng DAC’05, Wong ICCAD’05]
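The step from transistor electrical characteristics to circuit level power and delay can be illustrated with a minimal sketch. It assumes a first-order CV/I delay estimate and a simple off-state leakage sum for an inverter; the slides do not give the actual circuit-level equations used in Ptrace2, so the formulas, the `DeviceParams` container, the function names, and the numeric values below are all assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeviceParams:
    """Transistor characteristics of the kind a device model outputs.

    Field names mirror the listed outputs (Ion, Ioff, Igoff, Cg, Cdiff);
    the container itself is hypothetical.
    """
    ion: float    # on current, A
    ioff: float   # subthreshold off current, A
    igoff: float  # gate leakage in the off state, A
    cg: float     # gate capacitance, F
    cdiff: float  # diffusion capacitance, F


def load_capacitance(dev: DeviceParams, fanout: int) -> float:
    """Own diffusion capacitance plus the gate capacitance of `fanout` loads."""
    return dev.cdiff + fanout * dev.cg


def inverter_delay(dev: DeviceParams, vdd: float, fanout: int = 1) -> float:
    """First-order delay estimate in seconds: C_load * Vdd / Ion (assumed model)."""
    return load_capacitance(dev, fanout) * vdd / dev.ion


def inverter_leakage_power(dev: DeviceParams, vdd: float) -> float:
    """Static power in watts: (Ioff + Igoff) * Vdd (assumed model)."""
    return (dev.ioff + dev.igoff) * vdd


def inverter_switching_energy(dev: DeviceParams, vdd: float, fanout: int = 1) -> float:
    """Energy in joules drawn from the supply per rising output transition: C_load * Vdd^2."""
    return load_capacitance(dev, fanout) * vdd ** 2


# Hypothetical values, chosen only to make the arithmetic easy to follow.
example = DeviceParams(ion=1e-3, ioff=1e-7, igoff=1e-8, cg=1e-15, cdiff=1e-15)
delay_s = inverter_delay(example, vdd=1.0, fanout=4)
leak_w = inverter_leakage_power(example, vdd=1.0)
```

Because these expressions are closed-form, one evaluation per device setting is cheap, which is the property a quick architecture evaluation needs when no stable SPICE model exists yet.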


Experimental Setting
- 20 MCNC benchmarks
  - Assume all 20 MCNC benchmarks are placed in the same chip
- ITRS high performance 32nm technology (HP32)
- Architecture
  - Cluster size N=6
  - LUT size K=7
  - Wire segment length W=4
- Device
  - Vdd = 1.0, 1.05, 1.1 V
  - Lgate = 31, 32, 33 nm
- Baseline: ITRS HP32

Delay and Power Tradeoff
- 3.1X energy span and 1.3X delay span within the search space

Power and Delay Optimization
- Device tuning reduces energy delay product by 29.4%

Device   Power (W)   Delay (ns)   Energy (nJ)   ED (nJ·ns)
HP
Min-ED                                          (-29.4%)
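The Min-ED selection can be sketched as an exhaustive search over the device settings, assuming energy is the product of power and delay (W × ns = nJ, which matches the units in the table header) and ED is energy times delay. The `toy_evaluate` function is a made-up stand-in for the chip level estimator; its coefficients are invented and its result does not reproduce the numbers reported in the slides.

```python
from itertools import product

# Device search space taken from the experimental setting.
VDD_VALUES = (1.0, 1.05, 1.1)   # V
LGATE_VALUES = (31, 32, 33)     # nm


def toy_evaluate(vdd, lgate_nm):
    """Hypothetical estimator returning (power in W, delay in ns).

    The formulas are invented placeholders, not the Ptrace2 models.
    """
    delay_ns = 4.0 * (lgate_nm / 32.0) / vdd
    power_w = 0.5 * vdd ** 3 * (32.0 / lgate_nm)
    return power_w, delay_ns


def energy_delay_product(power_w, delay_ns):
    """ED in nJ*ns, with energy taken as power * delay."""
    energy_nj = power_w * delay_ns
    return energy_nj * delay_ns


def min_ed_setting(evaluate):
    """Return the (Vdd, Lgate) pair with the smallest energy delay product."""
    return min(
        product(VDD_VALUES, LGATE_VALUES),
        key=lambda setting: energy_delay_product(*evaluate(*setting)),
    )


best_vdd, best_lgate = min_ed_setting(toy_evaluate)
```

With only nine device settings the search is trivial; the cost lies in the evaluator, which is why a fast trace-based estimate matters once architecture parameters are swept as well.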


Experimental Setting
- Variation sources
  - Doping density Nbulk: 3σg = 5% of nominal value, 3σr = 3% of nominal value
  - Gate channel length Lgate: 3σg = 0.8 nm, 3σr = 0.6 nm
- Simulation
  - Monte Carlo simulation with M = 10,000 samples
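A minimal sketch of how such a Monte Carlo run could draw gate lengths is given below. It assumes that σg is a global component shared by every element on a chip and σr is an independent per-element component, that both are Gaussian, and that there is no spatial correlation. The slides do not spell out these definitions, and the framework is said to handle non-Gaussian sources and spatial correlation, so this is a simplification. The function names and the number of elements per chip are hypothetical.

```python
import random
import statistics

LGATE_NOMINAL_NM = 32.0
SIGMA_G_NM = 0.8 / 3.0   # from 3*sigma_g = 0.8 nm
SIGMA_R_NM = 0.6 / 3.0   # from 3*sigma_r = 0.6 nm


def sample_chip_lgate(rng, n_elements):
    """Draw gate lengths for one chip: one shared shift plus per-element noise."""
    global_shift = rng.gauss(0.0, SIGMA_G_NM)
    return [
        LGATE_NOMINAL_NM + global_shift + rng.gauss(0.0, SIGMA_R_NM)
        for _ in range(n_elements)
    ]


def run_monte_carlo(n_samples=10_000, n_elements=8, seed=0):
    """Return n_samples chips, each a list of n_elements gate lengths in nm."""
    rng = random.Random(seed)  # fixed seed keeps the run reproducible
    return [sample_chip_lgate(rng, n_elements) for _ in range(n_samples)]


chips = run_monte_carlo(n_samples=2_000, n_elements=4, seed=1)
chip_means = [statistics.fmean(chip) for chip in chips]
spread_nm = statistics.stdev(chip_means)
```

Each sampled chip would then be passed to the power and delay estimator, and the collected results form the power and delay distributions.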

Power and Delay Distribution

Power and Delay Variation
- The Min-ED device setting significantly reduces leakage variation with a small increase in delay variation

Device   Leakage (mW)     Delay (ns)
         µ       σ        µ       σ
HP
Min-ED   34045 (-87%)             (+34%)


NBTI and HCI
- Negative-bias-temperature-instability (NBTI) effect increases the threshold voltage of PMOS [Wang DAC’06]
- Hot-carrier-injection (HCI) increases the threshold voltage of NMOS [Wang CICC’07]
- Inputs: Lgate, Tox, Nbulk, Xjext, W, Racc, T, Vdd
- Outputs: ΔVth (NBTI), ΔVth (HCI)

Vth Increase Caused by NBTI and HCI
- Vth increase is the most significant in the first year
- Device burn-in can be applied to reduce the impact of device aging
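Why the first year dominates, and why burn-in helps, can be seen in a sketch that assumes a power-law aging model ΔVth = A·t^n with a small exponent. The exponent n = 1/6 is a commonly quoted value for NBTI and is an assumption here; the slides do not state the aging equations. Treating burn-in as an equivalent amount of pre-aging time is also an assumption.

```python
def delta_vth(t_years, a=1.0, n=1.0 / 6.0):
    """Threshold voltage shift after t_years under an assumed power law a * t**n."""
    return a * t_years ** n


def first_year_fraction(lifetime_years=10.0, n=1.0 / 6.0):
    """Share of the lifetime shift that has already occurred after one year."""
    return delta_vth(1.0, n=n) / delta_vth(lifetime_years, n=n)


def shift_seen_in_field(t_use_years, t_burn_years=0.0, a=1.0, n=1.0 / 6.0):
    """Shift observed during use when burn-in acts as t_burn_years of pre-aging."""
    aged = delta_vth(t_use_years + t_burn_years, a=a, n=n)
    already_spent = delta_vth(t_burn_years, a=a, n=n) if t_burn_years > 0 else 0.0
    return aged - already_spent


fraction = first_year_fraction()                      # about 0.68 for n = 1/6
without_burn_in = shift_seen_in_field(10.0)
with_burn_in = shift_seen_in_field(10.0, t_burn_years=1.0)
```

Under this assumed model roughly two thirds of the 10-year shift has occurred by the end of year one, so spending that steep early part of the curve before shipment leaves a smaller shift to be seen in the field.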

Impact of Device Burn-in
- The high performance device setting is more sensitive to device aging
- Device aging leads to 8.5% delay degradation after 10 years
- Device burn-in reduces delay degradation from 8.5% to 5.5% after 10 years

         W/O Burn-in                                W/ Burn-in
         Current           10 years                 Current           10 years
Device   P (mW)   D (ns)   P (mW)     D (ns)        P (mW)   D (ns)   P (mW)     D (ns)
HP                         (-25.1%)   4.23 (+8.5%)                    (-10.0%)   4.25 (+5.5%)
Min-ED                     (-5.2%)    4.64 (+2.0%)                    (-1.9%)    4.65 (+1.1%)


Permanent Soft Error Rate
- Single-event upsets (SEUs) due to cosmic rays or high energy particles may affect configuration SRAMs in FPGAs and result in permanent soft errors
- Inputs: Lgate, Tox, Nbulk, Xjext, W, Racc, T, Vdd
- Outputs: SER

SER under Different Device Settings
- SER is similar for both device settings

Device   SER (FIT)
HP
Min-ED   (+1.6%)


Impact of Device Aging on Power and Delay Variation
- Device aging significantly reduces leakage variation and slightly increases delay variation

         σ Leakage (W)          σ Delay (ns)
Device   Current   10 years     Current   10 years
HP                 (-65.2%)               (+1.67%)
Min-ED             (-32.7%)               (+0.16%)

Impact of Device Aging and Process Variation on SER
- Neither device aging nor process variation has significant impact on permanent SER

                 Current    10 years   Variation
SRAM SER (FIT)   2.914E-5   +0.3%      -0.18% ~ +0.17%


Conclusion
- A trace-based framework has been developed to enable concurrent development of process and FPGA architecture
- Device tuning reduces the energy delay product by 29.4%
- Applying device burn-in reduces delay degradation from 8.5% to 5.5% after 10 years
- Device aging significantly reduces leakage variation but has almost negligible impact on delay variation
- Neither device aging nor process variation has significant impact on permanent SER