SyCHOSys: Synchronous Circuit Hardware Orchestration System. Ronny Krashinsky, Seongmoo Heo, Michael Zhang, Krste Asanovic. MIT Laboratory for Computer Science.

Presentation transcript:

SyCHOSys: Synchronous Circuit Hardware Orchestration System
Ronny Krashinsky, Seongmoo Heo, Michael Zhang, Krste Asanovic
MIT Laboratory for Computer Science

Motivation
- Existing simulators: prohibitively slow or inaccurate
- Given a proposed processor architecture, we want to:
  - Simulate performance (cycle count)
  - Determine energy usage (Joules)
  - Investigate SW, compiler, and architecture changes

SyCHOSys
- SyCHOSys generates compiled cycle simulators
- Can optionally track energy usage:
  - Exploits the low power microprocessor design domain to obtain accurate transition-sensitive energy models
  - Factors out common transition counts
  - Uses fast bit-parallel transition counting
  - 7 orders of magnitude faster than SPICE (7% error)
  - 5 orders of magnitude faster than PowerMill
- SyCHOSys can accurately simulate the energy usage of a CPU circuit at speeds on the order of a billion cycles per day

Overview of talk
- SyCHOSys Framework
- Microprocessor Energy Modeling
- Energy Simulation in SyCHOSys
- Results
- Status & Future Work

SyCHOSys Framework
[Diagram: Structural Netlist -> Cycle Scheduler -> Scheduled Component Evaluations; these, together with Component Behavioral Methods and Additional Simulator Code, are compiled by gcc into the Cycle-based Simulator]

SyCHOSys Framework
Example design, GCD:

    GCD(x, y) {
      if (x < y) return GCD(y, x);
      else if (y != 0) return GCD(x - y, y);
      else return x;
    }

SyCHOSys Framework
Structural Netlist:

    X       { N-CLK FF_En }    (NextX.out, Ctrl.Xen);
    Y       { N-CLK FF_En }    (X.out, Ctrl.Yen);
    NextX   { Mux2 }           (Y.out, XSubY.out, Ctrl.XMuxSel);
    XSubY   { H-DYNAM Sub }    (X.out, Y.out);
    YZero   { H-DYNAM Zero }   (Y.out);
    YZeroL  { H-LATCH Latch }  (YZero.out);
    XLessYL { H-LATCH Latch }  (XSubY.signbit);
    Ctrl    { GCDCtrl }        (XLessYL.out, YZeroL.out);

SyCHOSys Framework
Component Behavioral Methods:

    // The bit-width template parameter and BitVec arguments are an assumption.
    template <int bits>
    class Mux2 {
    public:
      // Select input1 when select is set, otherwise input0.
      inline void Evaluate(BitVec<bits> input0, BitVec<bits> input1, BitVec<1> select) {
        if (select) out = input1;
        else out = input0;
      }
      BitVec<bits> out;
    };

SyCHOSys Framework
Scheduled Component Evaluations:

    void GCD::clock_rising() {}

    void GCD::clock_high() {
      YZero.Evaluate(Y.out);
      YZeroL.Evaluate(YZero.out);
      XSubY.Evaluate(X.out, Y.out);
      XLessYL.Evaluate(XSubY.signbit);
      Ctrl.Evaluate(XLessYL.out, YZeroL.out);
      NextX.Evaluate(Y.out, XSubY.out, Ctrl.XMuxSel);
    }

    void GCD::clock_falling() {
      // Y is evaluated first so that it reads X.out before X is updated.
      Y.Evaluate(X.out, Ctrl.Yen);
      X.Evaluate(NextX.out, Ctrl.Xen);
    }

    void GCD::clock_low() {
      YZero.Precharge();
      XSubY.Precharge();
      NextX.Evaluate(Y.out, XSubY.out, Ctrl.XMuxSel);
    }

SyCHOSys Framework
Additional Simulator Code:

    void gcd_clock_tick() {
      gcd->clock_rising();
      gcd->clock_high();
      gcd->clock_falling();
      gcd->clock_low();
    }
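The four-phase schedule shown on the framework slides can be exercised end to end with a small stand-alone sketch. Everything below is a hypothetical stand-in, not SyCHOSys source: the component structs use plain `int32_t` values instead of `BitVec`, the `GCDCtrl` logic (swap when X < Y, subtract while Y != 0, otherwise hold) is an assumption inferred from the GCD pseudocode, and `simulate_gcd` and the `done` flag are names invented for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the component library (not SyCHOSys code).
struct FFEn {   // flip-flop with enable
    std::int32_t out = 0;
    void Evaluate(std::int32_t d, bool en) { if (en) out = d; }
};
struct Mux2 {
    std::int32_t out = 0;
    void Evaluate(std::int32_t in0, std::int32_t in1, bool sel) { out = sel ? in1 : in0; }
};
struct Sub {    // dynamic subtractor: out = a - b, signbit set when a < b
    std::int32_t out = 0;
    bool signbit = false;
    void Evaluate(std::int32_t a, std::int32_t b) { out = a - b; signbit = out < 0; }
    void Precharge() { out = 0; signbit = false; }
};
struct Zero {   // dynamic zero detector
    bool out = false;
    void Evaluate(std::int32_t v) { out = (v == 0); }
    void Precharge() { out = false; }
};
struct Latch {
    bool out = false;
    void Evaluate(bool d) { out = d; }
};
struct GCDCtrl {  // assumed control logic, derived from the GCD pseudocode
    bool Xen = false, Yen = false, XMuxSel = false, done = false;
    void Evaluate(bool xLessY, bool yZero) {
        if (xLessY)      { Xen = true;  Yen = true;  XMuxSel = false; }  // swap X and Y
        else if (!yZero) { Xen = true;  Yen = false; XMuxSel = true;  }  // X <- X - Y
        else             { Xen = false; Yen = false; XMuxSel = false; }  // hold result
        done = !xLessY && yZero;
    }
};

struct GCD {
    FFEn X, Y;
    Mux2 NextX;
    Sub XSubY;
    Zero YZero;
    Latch YZeroL, XLessYL;
    GCDCtrl Ctrl;

    void clock_rising() {}
    void clock_high() {
        YZero.Evaluate(Y.out);
        YZeroL.Evaluate(YZero.out);
        XSubY.Evaluate(X.out, Y.out);
        XLessYL.Evaluate(XSubY.signbit);
        Ctrl.Evaluate(XLessYL.out, YZeroL.out);
        NextX.Evaluate(Y.out, XSubY.out, Ctrl.XMuxSel);
    }
    void clock_falling() {
        Y.Evaluate(X.out, Ctrl.Yen);      // reads X.out before X is updated
        X.Evaluate(NextX.out, Ctrl.Xen);
    }
    void clock_low() {
        YZero.Precharge();
        XSubY.Precharge();
        NextX.Evaluate(Y.out, XSubY.out, Ctrl.XMuxSel);
    }
    void clock_tick() { clock_rising(); clock_high(); clock_falling(); clock_low(); }
};

// Runs the datapath until the control logic reports completion.
// Optionally reports the number of simulated cycles.
inline std::int32_t simulate_gcd(std::int32_t x, std::int32_t y, int* cycles = nullptr) {
    assert(x >= 0 && y >= 0);  // the subtractive scheme assumes non-negative inputs
    GCD gcd;
    gcd.X.out = x;
    gcd.Y.out = y;
    int n = 0;
    do { gcd.clock_tick(); ++n; } while (!gcd.Ctrl.done);
    if (cycles) *cycles = n;
    return gcd.X.out;
}
```

Calling `simulate_gcd(12, 18)` steps the registers through the same swap and subtract sequence as the recursive pseudocode. Because every `Evaluate` body is visible to the compiler, an optimizing build can inline the whole schedule into straight-line code, which is the property the framework relies on.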

SyCHOSys Framework
gcc (optimizing compiler):
- Component evaluation calls are inlined

Energy Modeling
Power consumption in digital CMOS:
- Dynamic switching
- Short circuit current
- Leakage current

Energy Modeling
Power consumption in digital CMOS:
- Dynamic switching (around 90%): $a \, C_{load} \, V_{SWING} \, V_{DD} \, f$

    $a$           switching activity   data dependent
    $C_{load}$    load capacitance     varies dynamically
    $V_{SWING}$   voltage swing        fixed
    $V_{DD}$      supply voltage       fixed
    $f$           clock frequency      fixed

Energy Modeling
We simplify our task by taking advantage of our restricted domain of well-designed low power microprocessors.
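To make the dynamic switching expression concrete, here is a short worked example. The symbol $P_{dyn}$, the per-cycle energy $E_{cycle}$, and all numeric values are assumptions chosen for illustration; they are not taken from the talk. The activity $a$ is read here as the average number of charging transitions per cycle, which is the convention under which the expression carries no factor of 1/2.

$$
P_{dyn} = a \, C_{load} \, V_{SWING} \, V_{DD} \, f
\qquad\Longrightarrow\qquad
E_{cycle} = \frac{P_{dyn}}{f} = a \, C_{load} \, V_{SWING} \, V_{DD}
$$

With a hypothetical node where $a = 0.2$, $C_{load} = 50\,\mathrm{fF}$, $V_{SWING} = V_{DD} = 2.5\,\mathrm{V}$, and $f = 100\,\mathrm{MHz}$:

$$
E_{cycle} = 0.2 \times 50 \times 10^{-15} \times 2.5 \times 2.5 = 6.25 \times 10^{-14}\,\mathrm{J},
\qquad
P_{dyn} = E_{cycle} \, f = 6.25\,\mu\mathrm{W}
$$

Since the voltages and clock frequency are fixed, a simulator only has to track $a$ and $C_{load}$ per node; the rest is a constant multiplier.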

Microprocessor Energy
Energy usage in a microprocessor:
- Memory arrays
- Datapaths
- Control

Microprocessor Energy
Energy usage in a microprocessor:
- Memory arrays
  - Extremely regular
  - Calibrate models with several test cases: accounts for partial voltage swings, effective capacitance values, etc.
  - Estimate energy based on cycle-by-cycle address and data trace (3% error)
- Datapaths
- Control

Microprocessor Energy
Energy usage in a microprocessor:
- Memory arrays
- Datapaths
  - Determine $a$ and $C_{load}$ for every node
  - Effective $C_{load}$ is calculated statically
  - $a$ is determined based on simulation statistics
  - Optimizations for determining switching activity:
    - Factor out common transition counts
    - Fast bit-parallel transition counting
- Control
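One common way to count transitions bit-parallel is to XOR the previous and current values of a bus and count the set bits of the result in a single operation, rather than looping over individual wires. The sketch below illustrates that idea only; the slides do not show how SyCHOSys implements it, and the function names are hypothetical.

```cpp
#include <bitset>
#include <cstdint>

// Number of bits that differ between the previous and current bus value,
// i.e. the number of wires that toggled this cycle.
inline int count_transitions(std::uint64_t prev, std::uint64_t curr) {
    return static_cast<int>(std::bitset<64>(prev ^ curr).count());
}

// Number of wires that went from 0 to 1 (charging transitions only).
inline int count_rising(std::uint64_t prev, std::uint64_t curr) {
    return static_cast<int>(std::bitset<64>(~prev & curr).count());
}
```

For example, going from `0xA` (1010) to `0x6` (0110) toggles two wires, one of which rises. A count taken once on a net can be shared by every component attached to that net, which is one plausible reading of "factor out common transition counts".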

Effective Load Capacitance
- Gate and drain capacitance models
  - Characterized using FO4 delays and rise/fall times
[Diagram labels: X, C load, SPACE 2D extractor, MergeCap]

Microprocessor Energy
Energy usage in a microprocessor:
- Memory arrays
- Datapaths
- Control
  - Synthesized using automated tools: irregular, hard to model
  - Less than 10% of energy in simple RISC designs
  - Will become more important in low power designs
  - Can be modeled at the level of standard cell gates
  - Work in progress

SyCHO Energy Analysis
- Minimal statistics gathering during simulation
- Simple to add to SyCHOSys:
  - Structure of design is explicit
  - Values on all nets are cycle-accurate
  - Can incorporate arbitrary C++ code
Energy Statistics Gathering
- Nets: count transitions during simulation
  - Counters generated automatically
- Components: each component tracks arbitrary per-cycle internal statistics

SyCHO Energy Analysis
- Use simulation statistics and calculated capacitance values to compute energy
Energy Calculation
- Components: each component defines an internal energy calculation routine
  - Based on internal statistics, input and output switching frequencies, and internal capacitance values
- Nets: multiply switching frequency by capacitance
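The split between gathering statistics during simulation and calculating energy afterwards can be sketched for a single net as follows. This is a minimal illustration under stated assumptions, not the SyCHOSys implementation: the `Net` struct, its field names, and `net_energy_joules` are hypothetical, every bit of the net is assumed to see the same effective capacitance, and only 0 to 1 transitions are counted, each assumed to draw C x V_SWING x V_DD from the supply.

```cpp
#include <bitset>
#include <cstdint>

// Hypothetical net model: holds the current value and a transition counter
// that is updated as a side effect of driving a new value.
struct Net {
    double cap_farads;           // effective load capacitance per bit (assumed uniform)
    std::uint64_t value = 0;     // current value on the net
    std::uint64_t rising = 0;    // accumulated 0 -> 1 transitions

    explicit Net(double c) : cap_farads(c) {}

    // Statistics gathering: one XOR-free mask and a population count per update.
    void drive(std::uint64_t next) {
        rising += std::bitset<64>(~value & next).count();
        value = next;
    }
};

// Energy calculation, done once after simulation:
// charging transitions multiplied by capacitance and the fixed voltages.
inline double net_energy_joules(const Net& n, double v_swing, double v_dd) {
    return static_cast<double>(n.rising) * n.cap_farads * v_swing * v_dd;
}
```

Keeping the per-cycle work down to a counter update and deferring the floating-point multiplication to the end of the run is what keeps the simulation-time overhead small.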

Energy-Performance Model Evaluation
- Used GCD circuit as an example datapath
  - Various component types (flip-flops, latches, dynamic)
  - Small enough for SPICE simulation
  - Hand-designed layout (0.25 µm TSMC)

Simulation Speed

    Simulation model      Compiler / Simulation Engine   Simulation Speed (Hz)
    C-Behavioral          gcc -O3                        109,000,
    Verilog-Behavioral    vcs -O3 +2+state               544,
    Verilog-Structural    vcs -O3 +2+state               341,
    SyCHOSys-Structural   gcc -O3                        8,000,
    SyCHOSys-Energy       gcc -O3                        195,
    Extracted Layout      PowerMill                      0.73
    Extracted Layout      Star-Hspice                    0.01

All tests run on 333 MHz Sun Ultra-5 (Solaris 2.7)

Energy Simulation Results

Status & Future Work
Microprocessor simulation:
- Five-stage MIPS RISC including caches and exception handling
- Runs SPECint programs
- Most of energy modeling is complete
- Energy simulation of over 2000 nodes at 16 kHz
Future work:
- Short circuit & leakage current modeling
- Control logic modeling
- Take energy statistics per static program instruction
- Incorporate SyCHOSys into VLSI tool flow
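As a quick consistency check, the 16 kHz energy simulation rate can be converted to cycles per day. The only assumption is that the simulator runs continuously for 24 hours:

$$
16{,}000\ \tfrac{\text{cycles}}{\text{s}} \times 86{,}400\ \tfrac{\text{s}}{\text{day}} = 1.3824 \times 10^{9}\ \tfrac{\text{cycles}}{\text{day}}
$$

That is in line with the earlier statement that CPU energy simulation runs on the order of a billion cycles per day.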