Deepa Soman, HyunSuk Nam, Rekha Srinivasaraghavan, Shashank Sivakumar


Optimization of Power Reduction in FPGA Interconnect by Charge Recycling

Agenda
Day 1: Intro, Power Consumption Techniques, Power Reduction Techniques, Discussions
Day 2: Power Reduction Techniques (continued), Charge Recycling, Our Project, Discussions

Introduction
Motivation: logic flexibility and re-programmability.
Achilles' heel: longer wires, and power consumption 7-14X higher than ASICs.

Power Consumption
Dynamic power: power consumed while the inputs are active.
Static power: power consumed even when there is no circuit activity.
Dynamic power consumption is affected by switching activity, transistor capacitance, supply voltage, and frequency of operation.
Static power consumption is affected by thermal characteristics and by shrinking transistor size.
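The four factors listed for dynamic power are commonly combined in the first-order switching-power estimate P = alpha * C * Vdd^2 * f. The sketch below is a minimal illustration of that formula; the function name and all numeric values are assumptions chosen for the example, not figures from the slides.

```python
def dynamic_power(alpha, c_farads, vdd_volts, f_hz):
    """First-order switching power estimate P = alpha * C * Vdd^2 * f, in watts."""
    return alpha * c_farads * vdd_volts ** 2 * f_hz


# Hypothetical values, for illustration only.
p = dynamic_power(alpha=0.2, c_farads=1e-9, vdd_volts=1.0, f_hz=100e6)
print(f"{p * 1e3:.1f} mW")
```

With these assumed values the estimate is 20.0 mW. Because Vdd enters squared, supply voltage is the strongest of the four levers.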

Why Panic about Power?

Why Static Power??

Low Power Opportunities

Hardware Techniques
Voltage Scaling, Dual Vdd, Frequency Scaling, Clock Gating

Voltage Scaling
Selects the core voltage based on performance requirements.
How to choose? From timing analysis.
Types: 1) Static Voltage Scaling 2) Dynamic Voltage Scaling

1. Static Voltage Scaling
Uses a single selected core voltage.
Realized using an on-chip low-dropout regulator (LDO).
Voltage is controlled by the configuration bitstream.
0.8 V: minimum dynamic and leakage power.
1.0 V: overall highest performance.
[1] "A FPGA Prototype Design Emphasis on Low Power Technique", Xu, Jian
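As a rough worked example of why the 0.8 V setting saves dynamic power: if switching activity, capacitance, and frequency are held constant (an assumption made here, not a statement from the cited paper), dynamic power scales with the square of the supply voltage.

```python
def dynamic_power_ratio(v_low, v_high):
    """Ratio of dynamic power at v_low to that at v_high, all else held equal."""
    return (v_low / v_high) ** 2


ratio = dynamic_power_ratio(0.8, 1.0)
saving_percent = (1 - ratio) * 100
print(f"dynamic power saving from voltage alone: {saving_percent:.0f}%")
```

Under that assumption the quadratic term alone accounts for about a 36% dynamic power saving. Leakage also falls with voltage, which this sketch does not model.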

2. Dynamic Voltage Scaling
Provides different voltage levels.
Realized using a voltage control unit, which can be a level shifter or a DC-DC converter.
The DVS implementation uses a novel Logic Delay Measurement Circuit (LDMC) built from FPGA resources: to the first order, the reading produced by the LDMC tracks the critical path delay of the circuit to be operated under DVS.
Experiments show that with a closed-loop DVS system which keeps the LDMC reading above a threshold, no errors occur.
"Dynamic Voltage Scaling for Commercial FPGAs", C.T. Chow, L.S.M. Tsui, P.H.W.
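The closed-loop idea can be sketched as a small controller that lowers the supply while the LDMC reading stays comfortably above the threshold and raises it when the reading gets close. Everything in this sketch is an assumption made for illustration: the linear `ldmc_reading` model, the guard band, the step size, and the voltage limits are hypothetical and are not taken from the cited paper.

```python
def ldmc_reading(vdd_mv):
    """Hypothetical LDMC model: the reading falls linearly as Vdd drops."""
    return (vdd_mv - 600) / 10


def dvs_step(vdd_mv, threshold, guard=2, step_mv=10, v_min_mv=700, v_max_mv=1200):
    """One control iteration with a guard band above the threshold."""
    reading = ldmc_reading(vdd_mv)
    if reading < threshold + guard:
        # Too close to the threshold: raise the supply.
        return min(v_max_mv, vdd_mv + step_mv)
    if reading > threshold + 2 * guard:
        # Plenty of margin: lower the supply to save power.
        return max(v_min_mv, vdd_mv - step_mv)
    # Inside the guard band: hold the current voltage.
    return vdd_mv


threshold = 20
vdd_mv = 1200
trace = [vdd_mv]
for _ in range(60):
    vdd_mv = dvs_step(vdd_mv, threshold)
    trace.append(vdd_mv)
print(trace[-1], "mV")
```

With this assumed model the loop settles at 840 mV, and the reading never drops to the threshold along the way. The guard band is a design choice that trades some power saving for safety margin.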

Dual Supply Voltage (Vdd)
Separate voltage supplies for the configuration SRAM and the other elements.
Purpose: to support sleep mode.
Shuts down most logic except the SRAM, using an LDO.
"A Dual-VDD Low Power FPGA Architecture", A. Gayasen, K. Lee, N. Vijaykrishnan, M. Kandemir, M.J. Irwin, and T. Tuan

Performance
Static voltage scaling leads to nearly 53% power reduction; dynamic voltage scaling, up to 54%; dual Vdd, 14%.
Merits: SVS, simple hardware; DVS, self-adaptive; dual Vdd, eliminates the speed penalty.
Demerits: SVS, voltage is fixed; DVS, design complexity; dual Vdd, area overhead.
[1] "A FPGA Prototype Design Emphasis on Low Power Technique", Xu, Jian
[2] "A 90-nm Low-Power FPGA for Battery-Powered Applications", Tuan, Das, Steve, Sean

Frequency Scaling
f: frequency of switching.
Dynamic clock management implementations:
(a) The simplest dynamic clock management circuit is an open-loop implementation with a clock divider inserted into the desired paths.
(b) Skew can be compensated by introducing a Phase Locked Loop (PLL) into the circuitry. The simplest dynamically scaled structure is obtained by taking feedback from a point that does not change frequency.
(c) This scheme can successfully apply dynamic clock division. For dynamic multiplication, the signal in the feedback path must be divided. In the case of a large change in input frequency, the output of the PLL may take a long period to settle and regain a lock on the input signal.
Merits: can subsequently reduce voltage.
Demerits: increased latency.

Benefits of Frequency Scaling
As frequency decreases, power consumption also decreases.
"Dynamic Clock Management for Low Power Applications in FPGAs", Lan, Zilic
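The trade-off behind frequency scaling can be illustrated with a simple clock-divider model. This is a sketch under the usual first-order assumption that dynamic power is proportional to frequency at a fixed voltage; the function name and all numeric values are hypothetical.

```python
def divided_clock(f_in_hz, divider, cycles, alpha=0.2, c_farads=1e-9, vdd=1.0):
    """Return (power in W, latency in s, energy in J) for a task of `cycles` clock cycles."""
    f_out = f_in_hz / divider
    power = alpha * c_farads * vdd ** 2 * f_out
    latency = cycles / f_out
    return power, latency, power * latency


full = divided_clock(100e6, divider=1, cycles=1_000_000)
half = divided_clock(100e6, divider=2, cycles=1_000_000)
print("power ratio:", half[0] / full[0])
print("latency ratio:", half[1] / full[1])
```

Dividing the clock by two halves the power and doubles the latency, so in this model the energy per task is unchanged at a fixed voltage. The larger benefit comes when the slower clock allows the supply voltage to be lowered as well.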

Clock Gating
Controls the clock flow.
Purpose: to temporarily disable blocks.
Can be realized in hardware using clock enable signals; minimizes power dissipation in the clock circuits/network.
(a) A clock drives a number of flip-flops. The top two rows of flip-flops are connected to a clock enable signal, clkEnable, whereas the bottom row of flip-flops is not connected to any clock enable signal. The clock is driven by a global clock buffer.
(b) A new global clock buffer, called BUFGCE, is used. The input to this buffer is also clk; however, the clock enable of this buffer is connected to the flip-flops' enable signal clkEnable, and the clkEnable signal is disconnected from the flip-flops it was previously feeding.
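The difference between cases (a) and (b) can be sketched by counting how many clock edges reach flip-flop clock pins. This is a toy behavioural model, not a description of the Virtex-5 clock network; the function name, the flip-flop counts, and the enable trace are all assumptions made for illustration.

```python
def clock_edges_delivered(enable_trace, n_gated_ffs, n_free_ffs, gated_at_buffer):
    """Count rising clock edges that reach flip-flop clock pins over the trace."""
    edges = 0
    for enabled in enable_trace:
        # Flip-flops without an enable signal always see the clock.
        edges += n_free_ffs
        if gated_at_buffer:
            # Case (b): the buffer passes the clock only while clkEnable is high.
            edges += n_gated_ffs if enabled else 0
        else:
            # Case (a): enable is applied at the flip-flop, so its clock pin still toggles.
            edges += n_gated_ffs
    return edges


trace = [1, 0, 0, 1, 0, 0, 0, 1]  # hypothetical clkEnable value per cycle
baseline = clock_edges_delivered(trace, n_gated_ffs=8, n_free_ffs=4, gated_at_buffer=False)
gated = clock_edges_delivered(trace, n_gated_ffs=8, n_free_ffs=4, gated_at_buffer=True)
print(baseline, gated)
```

In this example the count drops from 96 to 56 edges. The lower clkEnable's duty cycle is, the more clock-network toggling is avoided.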

Clock Gating - Performance
Over 20% power reductions are observed for the DSP circuits.
Eliminates unnecessary toggling on outputs, gates of FFs, and clock signals.
industry-a, b, c, and d are DSP circuits, while the remaining circuits are collected from customers and are of unknown function.
Demerits: clock skew.
"Clock Power Reduction for Virtex-5 FPGAs", Wang, Gupta, Anderson

Software Techniques
System level: algorithm modification.
CAD tools: logic partitioning, mapping, clustering, placement and routing.

Low Power FFT Implementation
Architecture: matrix multiplication; a 1D array has lower power dissipation than a 2D array.
Module disabling: clock gating to disable modules, e.g. twiddle factor calculation; dynamic memory activation.
Multiple time-multiplexed pipeline; uP; parallel processing.
Algorithm: block matrix multiplication.
Time-multiplexers, instead of a routing network, are used for shuffling the intermediate data, thus reducing the burden of interconnection power for large FFT problem sizes.
As the number of pipeline stages increases, energy reduces: pipelining reduces the number of spurious glitches which, in turn, reduces dynamic power.
To reduce memory power, a method of dynamic memory activation is developed.
Cache-based approach.

FFT Implementation Results
17% to 26% power reduction.
"High throughput energy efficient multi-FFT architecture on FPGAs", Chen, Park, Prasanna

Energy Reduction Contributions of CAD Stages
Clustering contributes the major share.
"On the interaction between power aware FPGA CAD algorithms", Julien, Steven

Power Aware Clustering
Power-aware T-VPack.
How? The cost function is modified to include power.
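One plausible way to include power in a clustering cost function is to weight each net shared with the cluster by its switching activity, so that high-activity nets tend to be absorbed inside a cluster and stay on short local wires. The sketch below assumes that form; the function, its parameters, and the activity values are hypothetical and are not taken from the slides or from T-VPack's source.

```python
def attraction(block_nets, cluster_nets, criticality, activity, alpha=0.5, power_aware=True):
    """Attraction of a candidate block to a cluster (higher means pack it first)."""
    shared = block_nets & cluster_nets
    if power_aware:
        # Weight each shared net by its activity relative to the average activity.
        avg = sum(activity.values()) / len(activity)
        gain = sum(activity[n] / avg for n in shared)
    else:
        # Plain connectivity gain: every shared net counts equally.
        gain = len(shared)
    return alpha * criticality + (1 - alpha) * gain


activity = {"n1": 0.9, "n2": 0.1, "n3": 0.2}  # hypothetical switching activities
cluster = {"n1", "n2"}
block_a = {"n1", "n3"}  # shares the high-activity net n1
block_b = {"n2", "n3"}  # shares the low-activity net n2
print(attraction(block_a, cluster, 0.0, activity), attraction(block_b, cluster, 0.0, activity))
```

Without the power term both blocks look equally attractive. With it, the block sharing the high-activity net is preferred.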

Results: Power Aware Clustering
"Netlength Based Routability Driven Power Aware Clustering", Akoglu, Easwaran

Power Aware Placement
Problem addressed: power analysis of configurable switches is usually implemented during the routing and mapping stages, and has been largely ignored during the placement stage because of the inaccuracy associated with power estimation at a high level of the design process.
Proposed idea: "A Power-Aware Algorithm for the Design of Reconfigurable Hardware during High Level Placement". Models the number of switches used in the circuit and employs a simulated annealing algorithm to reduce the overall routing power.
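A minimal sketch of simulated annealing applied to placement with a power-oriented cost follows. It is not the cited algorithm: as a stand-in for the number of switches a net uses, the cost here is the activity-weighted Manhattan distance between the two blocks of each net, and the annealing schedule, block names, and activities are all assumptions.

```python
import math
import random


def cost(placement, nets):
    """Activity-weighted Manhattan distance, a proxy for switches used per net."""
    total = 0.0
    for a, b, activity in nets:
        (xa, ya), (xb, yb) = placement[a], placement[b]
        total += activity * (abs(xa - xb) + abs(ya - yb))
    return total


def anneal(placement, nets, seed=0, t_start=5.0, t_end=0.01, cooling=0.95, moves_per_t=50):
    """Swap-based simulated annealing; returns the best placement seen and its cost."""
    rng = random.Random(seed)
    current = dict(placement)
    current_cost = cost(current, nets)
    best, best_cost = dict(current), current_cost
    blocks = sorted(current)
    t = t_start
    while t > t_end:
        for _ in range(moves_per_t):
            a, b = rng.sample(blocks, 2)
            current[a], current[b] = current[b], current[a]
            new_cost = cost(current, nets)
            delta = new_cost - current_cost
            # Always accept improvements; accept worse moves with Boltzmann probability.
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                current_cost = new_cost
                if new_cost < best_cost:
                    best, best_cost = dict(current), new_cost
            else:
                current[a], current[b] = current[b], current[a]  # undo the swap
        t *= cooling
    return best, best_cost


initial = {"A": (0, 0), "B": (2, 2), "C": (0, 2), "D": (2, 0)}
nets = [("A", "B", 0.9), ("C", "D", 0.1)]  # (block, block, hypothetical activity)
best, best_cost = anneal(initial, nets)
print(cost(initial, nets), best_cost)
```

Because the best placement seen is tracked separately from the current one, the returned cost can never be worse than the starting cost.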

Results
"On the interaction between power aware FPGA CAD algorithms", Julien, Steven

Temperature Aware Routing
Leakage current increases exponentially with temperature.
Switching capacitance.
Needs knowledge of the spatial distribution of parameters.

Algorithm
Discourages the routing algorithm from forming connections that cross hotspot regions, by modifying the cost function.
Power savings range between 30% and 63%.
"A Temperature-Aware Placement and Routing targeting 3D FPGAs", Kostas, Soudris
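The effect of a hotspot penalty in the routing cost can be sketched with a shortest-path search over a small grid. This is an illustration under assumptions, not the cited paper's cost function: each step costs 1 plus `hotspot_weight` times a hypothetical temperature value of the cell being entered.

```python
import heapq


def route(grid_temp, src, dst, hotspot_weight=0.0):
    """Dijkstra over a grid; entering a cell costs 1 + hotspot_weight * its temperature."""
    rows, cols = len(grid_temp), len(grid_temp[0])
    dist = {src: 0.0}
    prev = {}
    heap = [(0.0, src)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == dst:
            break
        if d > dist[node]:
            continue  # stale heap entry
        r, c = node
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols:
                nd = d + 1.0 + hotspot_weight * grid_temp[nr][nc]
                if nd < dist.get((nr, nc), float("inf")):
                    dist[(nr, nc)] = nd
                    prev[(nr, nc)] = node
                    heapq.heappush(heap, (nd, (nr, nc)))
    # Walk back from the destination to recover the path.
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]


temps = [
    [0, 0, 0],
    [0, 9, 0],  # hypothetical hotspot in the centre cell
    [0, 0, 0],
]
plain = route(temps, (1, 0), (1, 2), hotspot_weight=0.0)
aware = route(temps, (1, 0), (1, 2), hotspot_weight=1.0)
print(plain, aware)
```

Without the penalty the route goes straight through the hotspot. With it, the router accepts a longer detour around the hot cell.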

Power-Aware FPGA Design Flow
Step 1 (RTL): power-based architectural (high-level) modelling; voltage scaling, dual Vdd, frequency scaling, clock gating.
Step 2 (CAD tools): power-aware packing or clustering, power-aware placement, power-aware routing.

Main/Baseline Paper
Problem addressed: power consumption in FPGAs is dominated by interconnect (62%).
Proposed idea: charge recycling for power reduction in FPGA interconnect.

Charge Recycling (CR)

Charge Recycling in FPGAs
How? "Unused routing resources" act as reservoirs.
Reduces the charge drawn from Vdd.
25% reduction in energy.
Figure labels: Unused/Reservoir; Unused w/o friends.
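The charge-sharing arithmetic behind a reservoir can be worked through with ideal capacitors. The sketch below assumes equal load and reservoir capacitance, an initially empty reservoir, and lossless sharing; these are simplifications chosen for the example and are not parameters from the baseline paper.

```python
def share(c_a, v_a, c_b, v_b):
    """Common voltage after two capacitors are connected (charge is conserved)."""
    return (c_a * v_a + c_b * v_b) / (c_a + c_b)


def supply_energy(c_load, v_start, vdd):
    """Energy drawn from the supply to charge c_load from v_start up to vdd."""
    return c_load * vdd * (vdd - v_start)


C, VDD = 1.0, 1.0  # normalized, hypothetical values

# A falling wire at Vdd first shares its charge with an empty reservoir.
v_reservoir = share(C, VDD, C, 0.0)
# Later, a rising wire at 0 V shares with the reservoir before the supply finishes the job.
v_precharge = share(C, 0.0, C, v_reservoir)

with_cr = supply_energy(C, v_precharge, VDD)
without_cr = supply_energy(C, 0.0, VDD)
print(v_reservoir, v_precharge, with_cr / without_cr)
```

In this idealized first cycle the rising wire is precharged to Vdd/4, so the supply provides 0.75 of the energy it would otherwise. The actual saving depends on the capacitance ratio and on the reservoir's starting voltage.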

CR-Capable FPGA Interconnect Analysis
Four components:
SRAM cells: produce the signals CR and TS, which control a switch's mode (normal, CR, tri-state).
Delay line: transition between VIN and DLOUT.
CR circuit: performs the charge sharing between the load and the reservoir.
Input stage.

Experiments/Methodology
VPR 6.0.
Baseline: island-style, unidirectional, Wilton (K=6, N=4).
Router: PathFinder, with cost function modification.
Post-routing: CR mode.
The VPR place/route tool helps in finding the % increase in area.

VPR Cost Function
Cost function: PathFinder.
Modified cost function.
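For orientation, the negotiated-congestion node cost in the original PathFinder formulation is (b_n + h_n) * p_n, where b_n is the base cost, h_n the accumulated history cost, and p_n the present-congestion penalty. The sketch below shows that baseline, followed by one hypothetical way a router cost could favour wires that have an unused neighbour available as a reservoir. The `cr_aware_cost` function and its `bonus` parameter are assumptions made for illustration and are not the modified cost function of the baseline paper.

```python
def present_penalty(occupancy, capacity, pres_fac):
    """Present-congestion factor: 1 when there is room, growing with overuse."""
    overuse = max(0, occupancy + 1 - capacity)
    return 1.0 + pres_fac * overuse


def pathfinder_cost(base, history, present):
    """Node cost in the original PathFinder form: (b_n + h_n) * p_n."""
    return (base + history) * present


def cr_aware_cost(base, history, present, has_free_neighbor, bonus=0.9):
    """Hypothetical variant: cheaper when an adjacent unused wire could act as a reservoir."""
    scale = bonus if has_free_neighbor else 1.0
    return pathfinder_cost(base, history, present) * scale


p = present_penalty(occupancy=2, capacity=1, pres_fac=0.5)
plain = pathfinder_cost(base=1.0, history=0.5, present=p)
favoured = cr_aware_cost(1.0, 0.5, p, has_free_neighbor=True)
print(plain, favoured)
```

Scaling the whole cost keeps the congestion negotiation intact while nudging the router toward CR-friendly wires. An additive term would be another reasonable choice.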

Post-Routing
A mixed integer linear program tries to maximize the number of nodes put into CR mode.
Constraint: the critical delay of the circuit.
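The optimization problem can be illustrated on a toy instance. Instead of a MILP solver, this sketch uses exhaustive search, which is only practical for a handful of nodes. The delay model (a fixed extra delay per CR-mode node, added along each path), the node names, and all numbers are assumptions made for the example.

```python
from itertools import combinations


def max_cr_nodes(extra_delay, paths, critical_delay):
    """Largest set of nodes that can be in CR mode with every path within critical_delay."""
    nodes = sorted(extra_delay)
    # Try the largest subsets first so the first feasible one is optimal.
    for k in range(len(nodes), -1, -1):
        for chosen in combinations(nodes, k):
            chosen_set = set(chosen)
            feasible = all(
                base + sum(extra_delay[n] for n in path_nodes if n in chosen_set) <= critical_delay
                for base, path_nodes in paths
            )
            if feasible:
                return chosen_set
    return set()


extra_delay = {"w1": 0.2, "w2": 0.2, "w3": 0.2, "w4": 0.2}  # hypothetical CR delay penalty
paths = [
    (1.0, ["w1", "w2"]),  # already critical: no slack for CR mode
    (0.7, ["w3"]),
    (0.5, ["w3", "w4"]),
]
selected = max_cr_nodes(extra_delay, paths, critical_delay=1.0)
print(sorted(selected))
```

Here only w3 and w4 can be switched to CR mode, because any CR node on the critical path would lengthen it beyond the constraint.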

Results
Dynamic power in the FPGA interconnect is reduced by up to ∼15-18.4%.

Results Continued…
The number of min-width transistors is used as the area metric.
Reductions in power savings are not directly proportional to the reduction in CR-capable switches (area).

What do we propose that is new?
Not all unused wires become friends.
Unused wires are connected to a constant voltage.
"URekha": unused wires are tri-stated for "further power savings!!"
~6% savings.

Thank you