PROCEED: Pareto Optimization-based Circuit-level Evaluation Methodology for Emerging Devices Shaodi Wang, Andrew Pan, Chi-On Chui and Puneet Gupta Department.

Slides:



Advertisements
Similar presentations
Tunable Sensors for Process-Aware Voltage Scaling
Advertisements

OCV-Aware Top-Level Clock Tree Optimization
1 A Self-Tuning Cache Architecture for Embedded Systems Chuanjun Zhang*, Frank Vahid**, and Roman Lysecky *Dept. of Electrical Engineering Dept. of Computer.
Slides based on Kewal Saluja
NTHU-CS VLSI/CAD LAB TH EDA De-Shiuan Chiou Da-Cheng Juan Yu-Ting Chen Shih-Chieh Chang Department of CS, National Tsing Hua University, Taiwan Fine-Grained.
Keeping Hot Chips Cool Ruchir Puri, Leon Stok, Subhrajit Bhattacharya IBM T.J. Watson Research Center Yorktown Heights, NY Circuits R-US.
Xing Wei, Wai-Chung Tang, Yu-Liang Wu Department of Computer Science and Engineering The Chinese University of HongKong
Improving Placement under the Constant Delay Model Kolja Sulimma 1, Ingmar Neumann 1, Lukas Van Ginneken 2, Wolfgang Kunz 1 1 EE and IT Department University.
0 1 Width-dependent Statistical Leakage Modeling for Random Dopant Induced Threshold Voltage Shift Jie Gu, Sachin Sapatnekar, Chris Kim Department of Electrical.
Fall 06, Sep 19, 21 ELEC / Lecture 6 1 ELEC / (Fall 2005) Special Topics in Electrical Engineering Low-Power Design of Electronic.
CMOS Circuit Design for Minimum Dynamic Power and Highest Speed Tezaswi Raja, Dept. of ECE, Rutgers University Vishwani D. Agrawal, Dept. of ECE, Auburn.
Power-Aware Placement
9/08/05ELEC / Lecture 51 ELEC / (Fall 2005) Special Topics in Electrical Engineering Low-Power Design of Electronic Circuits.
Device Sizing Techniques for High Yield Minimum-Energy Subthreshold Circuits Dan Holcomb and Mervin John University of California, Berkeley EE241 Spring.
© Digital Integrated Circuits 2nd Inverter CMOS Inverter: Digital Workhorse  Best Figures of Merit in CMOS Family  Noise Immunity  Performance  Power/Buffer.
Institute of Digital and Computer Systems 1 Fabio Garzia / Finding Peak Performance in a Process23/06/2015 Chapter 5 Finding Peak Performance in a Process.
1 A Single-supply True Voltage Level Shifter Rajesh Garg Gagandeep Mallarapu Sunil P. Khatri Department of Electrical and Computer Engineering, Texas A&M.
From Compaq, ASP- DAC00. Power Consumption Power consumption is on the rise due to: - Higher integration levels (more devices & wires) - Rising clock.
NTHU-CS VLSI/CAD LAB TH EDA Student : Da-Cheng Juan Advisor : Shih-Chieh Chang Fine-Grained Sleep Transistor Sizing Algorithm for Leakage Power Minimization.
On-Line Adjustable Buffering for Runtime Power Reduction Andrew B. Kahng Ψ Sherief Reda † Puneet Sharma Ψ Ψ University of California, San Diego † Brown.
1 UCSD VLSI CAD Laboratory ISQED-2009 Revisiting the Linear Programming Framework for Leakage Power vs. Performance Optimization Kwangok Jeong, Andrew.
Effects of Global Interconnect Optimizations on Performance Estimation of Deep Sub-Micron Design Yu (Kevin) Cao 1, Chenming Hu 1, Xuejue Huang 1, Andrew.
Jan. 2007VLSI Design '071 Statistical Leakage and Timing Optimization for Submicron Process Variation Yuanlin Lu and Vishwani D. Agrawal ECE Dept. Auburn.
S. Reda EN160 SP’07 Design and Implementation of VLSI Systems (EN0160) Lecture 13: Power Dissipation Prof. Sherief Reda Division of Engineering, Brown.
Circuit Performance Variability Decomposition Michael Orshansky, Costas Spanos, and Chenming Hu Department of Electrical Engineering and Computer Sciences,
Fall 06, Sep 14 ELEC / Lecture 5 1 ELEC / (Fall 2006) Low-Power Design of Electronic Circuits (Formerly ELEC / )
Lecture 5 – Power Prof. Luke Theogarajan
Lecture 7: Power.
Selective Gate-Length Biasing for Cost-Effective Runtime Leakage Control Puneet Gupta 1 Andrew B. Kahng 1 Puneet Sharma 1 Dennis Sylvester 2 1 ECE Department,
UC San Diego Computer Engineering VLSI CAD Laboratory UC San Diego Computer Engineering VLSI CAD Laboratory UC San Diego Computer Engineering VLSI CAD.
Prof. John Nestor ECE Department Lafayette College Easton, Pennsylvania ECE VLSI Circuit Design Lecture 8 - Comb. Logic.
Lecture 21, Slide 1EECS40, Fall 2004Prof. White Lecture #21 OUTLINE –Sequential logic circuits –Fan-out –Propagation delay –CMOS power consumption Reading:
UC San Diego Computer Engineering VLSI CAD Laboratory UC San Diego Computer Engineering VLSI CAD Laboratory UC San Diego Computer Engineering VLSI CAD.
1 An Interconnect-Centric Approach to Cyclic Shifter Design David M. Harris Harvey Mudd College. Haikun Zhu, Yi Zhu C.-K. Cheng Harvey Mudd College.
A Methodology for Interconnect Dimension Determination By: Jeff Cobb Rajesh Garg Sunil P Khatri Department of Electrical and Computer Engineering, Texas.
Logic Optimization Mohammad Sharifkhani. Reading Textbook II, Chapters 5 and 6 (parts related to power and speed.) Following Papers: –Nose, Sakurai, 2000.
1 Delay Estimation Most digital designs have multiple data paths some of which are not critical. The critical path is defined as the path the offers the.
Accuracy-Configurable Adder for Approximate Arithmetic Designs
Review: CMOS Inverter: Dynamic
Power Reduction for FPGA using Multiple Vdd/Vth
An Efficient Algorithm for Dual-Voltage Design Without Need for Level-Conversion SSST 2012 Mridula Allani Intel Corporation, Austin, TX (Formerly.
Jia Yao and Vishwani D. Agrawal Department of Electrical and Computer Engineering Auburn University Auburn, AL 36830, USA Dual-Threshold Design of Sub-Threshold.
Washington State University
-1- UC San Diego / VLSI CAD Laboratory Construction of Realistic Gate Sizing Benchmarks With Known Optimal Solutions Andrew B. Kahng, Seokhyeong Kang VLSI.
Notices You have 18 more days to complete your final project!
Optimal digital circuit design Mohammad Sharifkhani.
Introduction to CMOS VLSI Design Lecture 5: Logical Effort GRECO-CIn-UFPE Harvey Mudd College Spring 2004.
Modern VLSI Design 4e: Chapter 3 Copyright  2008 Wayne Wolf Topics n Pseudo-nMOS gates. n DCVS logic. n Domino gates. n Design-for-yield. n Gates as IP.
NUMERICAL TECHNOLOGIES, INC. Assessing Technology tradeoffs for 65nm logic circuits D Pramanik, M Cote, K Beaudette Numerical Technologies Inc Valery Axelrad.
Eyecharts: Constructive Benchmarking of Gate Sizing Heuristics Puneet Gupta, University of California, Los Angeles Andrew B. Kahng, University of California,
Basics of Energy & Power Dissipation
QuickYield: An Efficient Global-Search Based Parametric Yield Estimation with Performance Constraints Fang Gong 1, Hao Yu 2, Yiyu Shi 1, Daesoo Kim 1,
Simultaneous Supply, Threshold and Width Optimization for Low-Power CMOS Circuits With an aside on System based shutdown. Gord Allan PhD Candidate ASIC.
FPGA-Based System Design: Chapter 2 Copyright  2004 Prentice Hall PTR Topics n Logic gate delay. n Logic gate power consumption. n Driving large loads.
EE201C : Stochastic Modeling of FinFET LER and Circuits Optimization based on Stochastic Modeling Shaodi Wang
Tae- Hyoung Kim, Hanyong Eom, John Keane Presented by Mandeep Singh
Z. Feng MTU EE4800 CMOS Digital IC Design & Analysis 6.1 EE4800 CMOS Digital IC Design & Analysis Lecture 6 Power Zhuo Feng.
Seok-jae, Lee VLSI Signal Processing Lab. Korea University
Joshua L. Garrett Digital Circuits Design GroupUniversity of California, Berkeley Compact DSM MOS Modeling for Energy/Delay Estimation Joshua Garrett,
-1- UC San Diego / VLSI CAD Laboratory Optimization of Overdrive Signoff Tuck-Boon Chan, Andrew B. Kahng, Jiajia Li and Siddhartha Nath Tuck-Boon Chan,
-1- Delay Uncertainty and Signal Criticality Driven Routing Channel Optimization for Advanced DRAM Products Samyoung Bang #, Kwangsoo Han ‡, Andrew B.
1 Timing Closure and the constant delay paradigm Problem: (timing closure problem) It has been difficult to get a circuit that meets delay requirements.
ELEC Digital Logic Circuits Fall 2015 Delay and Power Vishwani D. Agrawal James J. Danaher Professor Department of Electrical and Computer Engineering.
1 Hardware Reliability Margining for the Dark Silicon Era Liangzhen Lai and Puneet Gupta Department of Electrical Engineering University of California,
Proximity Optimization for Adaptive Circuit Design Ang Lu, Hao He, and Jiang Hu.
Unified Adaptivity Optimization of Clock and Logic Signals Shiyan Hu and Jiang Hu Dept of Electrical and Computer Engineering Texas A&M University.
Reading: Hambley Ch. 7; Rabaey et al. Sec. 5.2
Circuit Design Techniques for Low Power DSPs
Applications of GTX Y. Cao, X. Huang, A.B. Kahng, F. Koushanfar, H. Lu, S. Muddu, D. Stroobandt and D. Sylvester Abstract The GTX (GSRC Technology Extrapolation)
Chapter 3b Leakage Efficient Chip-Level Dual-Vdd Assignment with Time Slack Allocation for FPGA Power Reduction Prof. Lei He Electrical Engineering Department.
Presentation transcript:

PROCEED: Pareto Optimization-based Circuit-level Evaluation Methodology for Emerging Devices Shaodi Wang, Andrew Pan, Chi-On Chui and Puneet Gupta Department of Electrical Engineering, University of California, Los Angeles ASP-DAC

NanoCAD Lab Need for Device Evaluation Traditional CMOS technologies are reaching physical limits Many alternative emerging devices under investigation: TFET, CNT, Heterogeneous CMOS, etc.  need to be able quickly compare them to guide technology development How should we compare emerging devices ? −Comprehensive, systematic and automated comparison in context of how they are going to be used −Account for various design types and circuit-level optimizations −Fast and flexible evaluation framework −Cover the wide performance range (KHz to GHz) ASP-DAC 20142

NanoCAD Lab Prior Work Three classes of works –Devices level 1,6 : I on /I off, Subthreshold Slope (SS ), CV/I, CV 2 –Canonical circuit level: Simple circuits + Analytical model 4 based power-delay tradeoff –Full design flow: Library generation, Synthesis, Placement and Routing Existing evaluation benchmarks neglect how modern circuits really use devices  These can dramatically change the conclusions. −Circuit topology dependence (e.g., logic depth) −Design-time power optimization (Multi-V th and multiple gate sizes) −Runtime adaptive power management (DVFS, Gating) ASP-DAC L. Wei, et. al., IEDM, D. J. Frank, et. al., IBM J, M. Luisier, et. al., IEDM, 2011

NanoCAD Lab Outline PROCEED Methodology Example Experimental Results ASP-DAC 20144

NanoCAD Lab Proposed Framework: PROCEED ASP-DAC Canonical circuit construction Wire capacitance, resistance, and chip area Interconnect model Pareto optimizer Variation information Gate sizes Power management Logic depth histogram, Fan-out Delay and power Pareto curves Throughput power Pareto curves Device model, Activity V dd, V th, gate sizes constraints Ratio of throughput

NanoCAD Lab Canonical Circuit Construction ASP-DAC Utilize essential design information –Logic depth histogram, average number of transistors per gate, average fan-out, average interconnect load and chip area –Ignore detailed circuit design Little impact on device performance evaluation  Logic depth histogram  Simulation blocks (S i ) –Construct logic paths in corresponding bins  Single stage –Gate (Nand, XOR, etc.) –Buffer, interconnect, fan-out load  Tuning parameters –Gate sizes, V dd, V th

NanoCAD Lab Canonical Circuit Construction ASP-DAC Logic depth histogram is estimated using slack histogram Interconnect –Interconnect is proportional to the square-root of chip area 1 –Chip area is assumed to be linear to cell area –Cell area is modeled as a function of transistor width from DRE 2 1 J. A. Davis, et. al.,, IEEE TED, R. S. Ghaida and P. Gupta, IEEE Trans. CAD, 2012.

NanoCAD Lab Pareto Optimization: Overview ASP-DAC Objective functions are weighted sum of delay and power –Non-convex problem –Gradient descent used Delay and power are approximated with second order functions in trust region Trust region shrinks during optimization Logarithmic barrier is incorporated to confine parameter range Local models

NanoCAD Lab Pareto Optimization: Modeling ASP-DAC Build simulation-block-level delay and power models by utilizing circuit simulations results. Objective delay is the longest delay of all logic paths (constructed by simulation blocks) –Using high order norm to estimate max-delay function  This can make the objective function continuous for gradient calculation Objective power is the weighted total power

NanoCAD Lab Power Management: Overview ASP-DAC Modern circuits allow devices to operate in three modes: normal, power saving and sleep mode. –A 2 nd lower V dd is applied to devices in power saving mode –Device is turned off in sleeping mode Pick best 2 nd V dd and optimally divide time spent on each mode –Need input of the ratio of average throughput to peak throughput –Use polynomial models for delay and power of simulation block as a function of V dd Minimizing average power consumption: f 1 P 1 + f 2 P 2

NanoCAD Lab Validation of PROCEED ASP-DAC Model 3 is frequently used for device evaluation –Ignoring logic depth histogram and using analytical delay and power models is inaccurate Proposed methodology is 21X (average) more accurate –Efficient to evaluate devices with performance range from MHz to GHz 3 D. J. Frank, et. al., IBM J, 2006

NanoCAD Lab Outline PROCEED Methodology Example Experimental Results ASP-DAC

NanoCAD Lab Impact of Activity ASP-DAC Low activity circuit benefits low leakage devices –TFET better than SOI MOSFET for low-activity, low-performance circuits

NanoCAD Lab Impact of Circuit Topology ASP-DAC CortexM0 is more evenly distributed in LDH Power consumption in MIPS is dominated by short logic paths –More accommodating to low power devices ( TFETs)

NanoCAD Lab Impact of Power Management ASP-DAC Peak throughput cross-points shift higher as ratio of average to peak throughput decreases –This indicates TFET may be a better device for applications with wide dynamic range in performance needs

NanoCAD Lab Impact of Variation ASP-DAC Variation evaluation indicates that TFET suffers more from effective voltage drop –TFET and SOI are assumed to have peak 10% Vdd and 50mV Vth worst case variation –High Sub-threshold Slope (SS )devices are more sensitive to voltage drop

NanoCAD Lab Conclusion Proposed new methodology for evaluating emerging devices accounts for circuit topology, adaptivity, variability, and use context Proposed methodology is efficient and accurate for device evaluation over broad operating range. –Effective Pareto -based optimization heuristic –Accurate circuit simulation and device compact model Example comparison of TFET and SOI devices –TFET is better for low activity, low logic depth and high dynamic performance design –TFET is more sensitive to voltage drop and threshold voltage shifting Entire PROCEED source-code (MATLAB, C++) will be made available openly. ASP-DAC

NanoCAD Lab Q&A Thanks! ASP-DAC