Body Bias Grain Size Exploration for a Coarse Grained Reconfigurable Accelerator  Yusuke Matsushita, Hayate Okuhara, Koichiro Masuyama, Yu Fujita, Hideharu.

Slides:



Advertisements
Similar presentations
MICROWAVE FET Microwave FET : operates in the microwave frequencies
Advertisements

Barcelona Forum on Ph.D. Research in Communications, Electronics and Signal Processing 21st October 2010 Soft Errors Hardening Techniques in Nanometer.
Subthreshold SRAM Designs for Cryptography Security Computations Adnan Gutub The Second International Conference on Software Engineering and Computer Systems.
Cache Optimization for Coarse Grain Task Parallel Processing Using Inter-Array Padding K. Ishizaka, M. Obata, H. Kasahara Waseda University, Tokyo, Japan.
Timing Optimization. Optimization of Timing Three phases 1globally restructure to reduce the maximum level or longest path Ex: a ripple carry adder ==>
1 A Variation-tolerant Sub- threshold Design Approach Nikhil Jayakumar Sunil P. Khatri. Texas A&M University, College Station, TX.
A Self-adjusting Scheme to Determine Optimum RBB by Monitoring Leakage Currents Nikhil Jayakumar* Sandeep Dhar $ Sunil P. Khatri* $ National Semiconductor,
Institute of Digital and Computer Systems 1 Fabio Garzia / Finding Peak Performance in a Process23/06/2015 Chapter 5 Finding Peak Performance in a Process.
Threshold Voltage Assignment to Supply Voltage Islands in Core- based System-on-a-Chip Designs Project Proposal: Gall Gotfried Steven Beigelmacher 02/09/05.
Evaluation – Question 5 How did you attract/address the audience?
1 SEMICONDUCTORS Testing Bipolar Transistors. 2 SEMICONDUCTORS Bipolar transistors are solid state devices that are capable of operating for extremely.
Low-Power Wireless Sensor Networks
1 Memory Technology Comparison ParameterZ-RAMDRAMSRAM Size11.5x4x StructureSingle Transistor Transistor + Cap 6 Transistor Performance10.5x2x.
Architectural Support for Fine-Grained Parallelism on Multi-core Architectures Sanjeev Kumar, Corporate Technology Group, Intel Corporation Christopher.
Drowsy Caches: Simple Techniques for Reducing Leakage Power Authors: ARM Ltd Krisztián Flautner, Advanced Computer Architecture Lab, The University of.
Introduction to FinFet
Efficient Mapping onto Coarse-Grained Reconfigurable Architectures using Graph Drawing based Algorithm Jonghee Yoon, Aviral Shrivastava *, Minwook Ahn,
Reconfigurable Computing Using Content Addressable Memory (CAM) for Improved Performance and Resource Usage Group Members: Anderson Raid Marie Beltrao.
XIAOYU HU AANCHAL GUPTA Multi Threshold Technique for High Speed and Low Power Consumption CMOS Circuits.
Creating a poster is easier than you think.
October 2008 Integrated Predictive Simulation System for Earthquake and Tsunami Disaster CREST/Japan Science and Technology Agency (JST)
FaridehShiran Department of Electronics Carleton University, Ottawa, ON, Canada SmartReflex Power and Performance Management Technologies.
Introduction to CMOS VLSI Design Lecture 0: Introduction.
A novel approach to visualizing dark matter simulations
Power-Optimal Pipelining in Deep Submicron Technology
Improving Multi-Core Performance Using Mixed-Cell Cache Architecture
Other diodes Electronic device and circuit Active learning assignments
YASHWANT SINGH, D. BOOLCHANDANI
A Seminar presentation on
Office of Undergraduate Research
Lecture 10 Power Device (1)
Chapter 3 Fabrication, Layout, and Simulation.
Noise Analysis of CMOS Readouot Ciruits
Memory Segmentation to Exploit Sleep Mode Operation
MOSFET The MOSFET (Metal Oxide Semiconductor Field Effect Transistor) transistor is a semiconductor device which is widely used for switching and amplifying.
Fast and Robust Hashing for Database Operators
Three Important Concepts
Device Structure & Simulation
SEMICONDUCTORS.
IMAGE PROCESSING INTENSITY TRANSFORMATION AND SPATIAL FILTERING
Auburn University COMP8330/7330/7336 Advanced Parallel and Distributed Computing Communication Costs (cont.) Dr. Xiao.
Digital Image Processing
Department of Electrical & Computer Engineering
Low Power and High Speed Multi Threshold Voltage Interface Circuits
Using manipulatives in math
Fine-Grain CAM-Tag Cache Resizing Using Miss Tags
EMC problems of DSOI device and circuits
Pyramid Sketch: a Sketch Framework
Hyunchul Park, Kevin Fan, Manjunath Kudlur,Scott Mahlke
Overview of Power Semiconductor Switches
ELEC 6970: Low Power Design Class Project By: Sachin Dhingra
Three Important Concepts
Power Electronics Introduction Bipolar Transistor Power Amplifiers
DC motor.
University of Colorado at Boulder
PRESENTATION ON TRI GATE TRANSISTOR PREPARED BY: SANDEEP ( )
Timing Optimization.
Semiconductor devices and physics
Lecture 7: Power.
Comparison of Various Multipliers for Performance Issues
Lecture 7: Power.
Day 2 At A Glance So today will have the following: 1) HW Talk
Lecture 10 Power Device (1)
Scalable light field coding using weighted binary images
Overview of Power Semiconductor Switches
FPGA Based Single Phase Motor Control Using Multistep Sine PWM Author Name1, Author Name2., Author Name3, (BE-Stream Name) Under the Guidance Of Guide.
Hangzhou Dianzi University
Clustering-based Studies on the upgraded ITS of the Alice Experiment
Fault Mitigation of Switching Lattices under the Stuck-At Model
Presentation On Schottky Diode. Course Code:3208 Course Title : Microwave radar and satellite communication lab Presented By Salma Akter BKH F.
Presentation transcript:

Body Bias Grain Size Exploration for a Coarse Grained Reconfigurable Accelerator  Yusuke Matsushita, Hayate Okuhara, Koichiro Masuyama, Yu Fujita, Hideharu Amano (Keio Univ.) Thank you Mr.Chairman for the introduction. / My name is Yusuke Matsushita from Keio university./ Today, I will talk about a novel low power technique for a coarse grained reconfigurable accelerator./

Background : SOTB technology A silicon on insulator (SOI) technology SOTB technology can widely control the trade-off between delay and leakage power by body bias. → The leakage current and delay of the transistors can be controlled by using the body bias. LEAP developed a novel SOI CMOS technology called silicon on thin BOX (SOTB) Silicon on Insulator technology, SOI has received an attention as a low power semiconductor device. The leakage power and delay time of transistors with SOI can be controlled by changing their threshold level of the transistors with the body bias control./ Here, we focused a novel SOI CMOS technology called Silicon on thin BOX or SOTB. It was developed by Japanese Low-power electronics association & project or (LEAP), / one of Japanese national projects. It can widely change the leakage and delay by the body bias control. That is, when reverse bias is given, the leakage power is surpressed while the delay is increased. On the other hand, when forward bias is given, the delay can be decreased but the leakage power is increased. Reverse BodyBias Forward BodyBias delay increase× decrease○ leakage power Increase ×

Background : CMA-SOTB µ-controller Data memory Cool Mega Array(CMA)[1] is one of Coarse Grained Reconfigurable Accelerators (CGRA) CMA consists of Processing Element(PE)-Array µ-controller Data Memory In order to keep the balance between PE-Array operation speed and µ-controller memory management speed with body bias is used. PE-Array (12x8) PE ・・・ µ-controller Data memory Body Bias Cool Mega Array called CMA is one of Coarse Grained Reconfigurable Accelerator. CMA consists of Processing Element(PE)-Array as a computation part , a micro-controller as memory management part, and Data memory to store the data. Note that PE-array is pure combinatorial circuits. According to the arithmetic intensity of the target application, the balance between PE-Array operation speed and micro-controller memory management speed can be kept with body bias control. We proposed methods to find the best body bias voltage. [1]

Problem of Body Bias Control and proposed method Problem of conventional Body Bias Control Since the same body bias is supplied to all PEs, this can cause a waste of leakage power. Ex) PE3~8 are not used. But zero bias is given to this PEs. Body Bias Domain Partitioning If body bias can be done in finer granularity, leakage power can be further reduced. However, area overhead is increased. Body bias optimization with Genetic Algorithm is proposed. However, there is a problem in the conventional Body Bias Control method. In conventional body bias control in CMA-SOTB, since the same body bias is supplied to all PEs, this can cause a waste of leakage power. As an example, please take a look a part of the PE-Array assigned operation. Case (a) shows an example of conventional body bias control. White colored PE shows that zero bias is given, grey one shows it uses weak reverse bias and dark shows that the strong bias is given. In conventional body bias control, case (a), all PEs must use zero bias, even if no operation is assigned. For this problem, Body Bias Domain Partitioning should be done. Case (b) shows a case with a body bias domain size 1x2. Case (c) shows a case with a body bias domain size 1x1. when 1x2 domain size is used, they can receive strong weak bias. Using the finest grain, that is in case ( c), more number of PEs can receive reverse bias. Obviously, if body bias can be given in finer granularity, leakage power can be further reduced. However, for the well separation between each body bias domains , fine grain power domain increases area overhead. In order to find the best body bias voltage for various size of power domain, we developed an optimization method based on Genetic Algorithm.

Evaluation Image processing application (alpha, af, sepia, and gray) Considering small area overhead(6%) and a large power reduction(35% on average), 2x1 is the best domain size. In this simulation, application programs for simple image processing (alpha, af, sepia, and gray) were used. Please take a look at two figures. The graph on the left shows area over head at each body bias domain size. The overhead can be evaluated based on a real layout. On the other hand, figure on the right shows power reduction ratio at each domain division size. When the optimized body bias voltage is given, the leakage power is evaluated. Note that it shows the reduction ratio, so larger one is more effective. As the result, Considering small area overhead and a large power reduction, 2x1 is the best domain size. Let’s discuss in front of our poster if you are interested in our project. Thank you.