Juanjo Noguera, Xilinx Research Labs, Dublin, Ireland. Ahmed Al-Wattar. Irwin O. Kennedy, Alcatel-Lucent, Dublin, Ireland.



• We introduce a new approach to reducing FPGA power consumption.
• It exploits the time-varying nature of a system's environment.
• The implementation closely tracks environmental changes and adapts accordingly using partial reconfiguration.

• Partial reconfiguration (PR) allows part of the device to be reconfigured while the rest of the FPGA continues operating.
• Xilinx FPGAs have received multiple hardware enhancements to better support partial reconfiguration.

• Smaller units of reconfiguration granularity:
  ◦ From full-device-height reconfiguration frames in the Virtex-II and Virtex-II Pro families to frames 16 CLBs high in the Virtex-4 family.
• Increased bandwidth in the internal configuration access port (ICAP):
  ◦ From 50 Mbytes/s in the Virtex-II and Virtex-II Pro families to 400 Mbytes/s in the Virtex-4 family.
• Early Access Partial Reconfiguration (EAPR)
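The bandwidth figures above translate directly into reconfiguration latency. The following is a minimal sketch, assuming an idealized transfer at the full port bandwidth and a hypothetical partial bitstream size of 100,000 bytes (the slides do not state a bitstream size); the function name is illustrative only.

```python
def reconfig_time_ms(bitstream_bytes: int, bandwidth_mbytes_per_s: float) -> float:
    """Idealized time to load a partial bitstream, in milliseconds.

    Assumes the configuration port sustains its full bandwidth and
    that 1 Mbyte = 10**6 bytes. Real designs add controller overhead.
    """
    seconds = bitstream_bytes / (bandwidth_mbytes_per_s * 1e6)
    return seconds * 1e3


if __name__ == "__main__":
    size = 100_000  # hypothetical partial bitstream size in bytes
    for family, bw in (("Virtex-II / Virtex-II Pro", 50.0), ("Virtex-4", 400.0)):
        print(f"{family}: {reconfig_time_ms(size, bw):.3f} ms")
```

Under these assumptions the same bitstream loads eight times faster on Virtex-4, which matters when the implementation has to track a changing environment.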

• Traditionally, partial reconfiguration has been used to time-multiplex multiple mutually exclusive functions, reducing cost and static power consumption.
  ◦ This offers no benefit in applications where all functions are required on the FPGA at the same time.

• We propose using partial reconfiguration to time-multiplex different implementations of the same function.
  ◦ This reduces the FPGA's dynamic power consumption.
• By specializing the implementation to the current subset of requirements, we can reduce average power consumption.

• We have applied this idea of adapting the implementation for power savings to the networking application domain,
  ◦ using a forward error correction core (i.e., a Viterbi decoder).

• Most of the dynamic power dissipation in an FPGA fabric is due to the programmable interconnect and clocking resources.
• Reductions in power consumption have been reported from increasing the number of pipeline stages in an FPGA design.
• Several authors have proposed low-power implementations of the Viterbi decoder.

• A system's environment is the stimulus it receives from external sources,
  ◦ e.g. the number of users in a system, communication channel conditions, or total throughput.
  ◦ The number of users at a wireless base station changes throughout the day.
  ◦ The signal-to-noise ratio at a wireless base station changes with the location of the mobile phone.
  ◦ The mix of voice and data users on a cellular base station changes throughout the day.

• Cost of electricity
  ◦ Google warned that the cost of the electricity used to power its equipment could soon exceed the cost of the equipment itself.
• Reliability
  ◦ Average heat energy is the greatest determinant of the reliability of digital electronics.
• Thermal engineering
  ◦ Thermal engineering is concerned with removing excess heat energy from a system.

• Application-level partial reconfiguration
• Architecture-level partial reconfiguration
  ◦ e.g. changing the bit width of the data path or the number of pipeline stages in an arithmetic block implementation
• Device-level partial reconfiguration
  ◦ e.g. loading the unused function's FPGA area with the most power-efficient idle configuration, or directly controlling the FPGA clocking resources (i.e., clock buffers or DCM modules) from the configuration memory

• Forward error correction codes such as convolutional codes limit the effects of noise in digital communication.
• The Viterbi algorithm is used for decoding convolutional codes.
• It is widely applied in networking applications because of its good noise tolerance.
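To make the decoding step concrete, here is a minimal software sketch of hard-decision Viterbi decoding. It is not the Xilinx core: it assumes a rate-1/2 code with constraint length K = 3 and the textbook generator polynomials (7, 5) in octal, chosen only for illustration, and it terminates the trellis with K-1 zero tail bits. All names are hypothetical.

```python
from typing import List, Sequence

K = 3                        # constraint length (assumed for this sketch)
GENERATORS = (0b111, 0b101)  # (7, 5) octal, a common textbook choice
N_STATES = 1 << (K - 1)      # one trellis state per value of the K-1 stored bits


def parity(x: int) -> int:
    return bin(x).count("1") & 1


def encode(bits: Sequence[int]) -> List[int]:
    """Rate-1/2 convolutional encoder; appends K-1 zero tail bits."""
    state, out = 0, []
    for b in list(bits) + [0] * (K - 1):
        reg = (b << (K - 1)) | state           # newest bit sits in the MSB
        out.extend(parity(reg & g) for g in GENERATORS)
        state = reg >> 1                       # drop the oldest bit
    return out


def viterbi_decode(received: Sequence[int]) -> List[int]:
    """Hard-decision Viterbi decoder using Hamming-distance branch metrics."""
    inf = float("inf")
    metrics = [0.0] + [inf] * (N_STATES - 1)   # encoder starts in state 0
    paths: List[List[int]] = [[] for _ in range(N_STATES)]
    for t in range(len(received) // 2):
        r = received[2 * t: 2 * t + 2]
        new_metrics = [inf] * N_STATES
        new_paths: List[List[int]] = [[] for _ in range(N_STATES)]
        for state in range(N_STATES):
            if metrics[state] == inf:
                continue                       # state not reachable yet
            for b in (0, 1):
                reg = (b << (K - 1)) | state
                expected = [parity(reg & g) for g in GENERATORS]
                nxt = reg >> 1
                # add-compare-select: keep the better path into each state
                m = metrics[state] + sum(e != x for e, x in zip(expected, r))
                if m < new_metrics[nxt]:
                    new_metrics[nxt] = m
                    new_paths[nxt] = paths[state] + [b]
        metrics, paths = new_metrics, new_paths
    survivor = paths[0]                        # tail bits force the final state to 0
    return survivor[: len(survivor) - (K - 1)]


if __name__ == "__main__":
    message = [1, 0, 1, 1, 0, 0, 1, 0]
    codeword = encode(message)
    codeword[5] ^= 1                           # inject a single channel bit error
    print(viterbi_decode(codeword) == message)
```

The inner loop over states is the add-compare-select work that a hardware decoder either performs for all states in one clock cycle or spreads over several cycles.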

• We adapt the Viterbi decoder implementation in response to two kinds of change:
  ◦ changes in the signal-to-noise ratio
  ◦ changes in the required throughput
• Xilinx provides a Viterbi decoder core in Coregen.
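The adaptation to signal-to-noise ratio can be pictured as a small selection policy that decides which decoder variant should be loaded. This is only a sketch: the threshold table holds placeholder values, not measurements from the presentation, and the function names are hypothetical. It assumes that a smaller constraint length is cheaper to run and is acceptable whenever the channel is good enough.

```python
from typing import Sequence, Tuple

# Each entry: (minimum SNR in dB at which this K is assumed to meet the
# BER target, constraint length K).
# PLACEHOLDER values for illustration only; not measured data.
EXAMPLE_TABLE: Tuple[Tuple[float, int], ...] = ((6.0, 3), (4.0, 5), (2.0, 7))


def select_constraint_length(
    snr_db: float, table: Sequence[Tuple[float, int]] = EXAMPLE_TABLE
) -> int:
    """Pick the smallest K whose SNR requirement is currently satisfied.

    Falls back to the largest (most robust) K when the channel is worse
    than every threshold in the table.
    """
    usable = [k for min_snr, k in table if snr_db >= min_snr]
    if usable:
        return min(usable)
    return max(k for _, k in table)


def needs_reconfiguration(
    current_k: int, snr_db: float, table: Sequence[Tuple[float, int]] = EXAMPLE_TABLE
) -> bool:
    """True when a different decoder variant should be loaded."""
    return select_constraint_length(snr_db, table) != current_k


if __name__ == "__main__":
    for snr in (7.0, 4.5, 0.0):
        print(snr, select_constraint_length(snr))
```

A real controller would probably add hysteresis around each threshold so that small SNR fluctuations do not trigger repeated reconfigurations.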

• The design runs at 100 MHz.
• Dual-port memory blocks (32 Kbytes) are implemented using on-chip BRAMs.
• We connected a power supply with an integrated ammeter to power the FPGA's internal core.

• The Viterbi algorithm's constraint length (K) greatly impacts the decoder's bit error rate (BER) performance.
• We verified this assumption experimentally using three implementations of the parallel Viterbi decoder with different constraint lengths.
• The constraint length also has a significant impact on the number of FPGA resources used.
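One reason the resource count grows so quickly with K is the trellis size: a convolutional code with constraint length K has 2^(K-1) states, and a fully parallel decoder is commonly built with one add-compare-select (ACS) unit per state. The sketch below only tabulates that growth; the K values are illustrative and are not necessarily the three used in the experiment.

```python
def trellis_states(constraint_length: int) -> int:
    """Number of trellis states for a convolutional code with constraint length K."""
    if constraint_length < 1:
        raise ValueError("constraint length must be at least 1")
    return 1 << (constraint_length - 1)


if __name__ == "__main__":
    # Illustrative K values only.
    for k in (3, 5, 7, 9):
        print(f"K={k}: {trellis_states(k)} states (one ACS unit each if fully parallel)")
```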

• The Xilinx Viterbi core has a parameter that lets the user select between a serial and a parallel architecture.

• Power consumption measurements reveal that, for this example, the parallel architecture is more power-efficient than the serial architecture.
• At the sample points for the 8.3 Mbps throughput, we observe a difference of approximately 200 mW.

• Reducing the number of LUTs and routing resources required to implement a function effectively reduces its capacitance.
• Dynamic power consumption is also proportional to the switching activity of all nodes in the design.
• The serial architecture requires 12 clock cycles for each decoding process, while the parallel architecture requires only a single clock cycle.
  ◦ Serial: average power consumption of approximately 0.7 W, with peaks around 1 W.
  ◦ Parallel: average power consumption of approximately 0.5 W, with peaks of 2.5 W.
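The two proportionalities above are the usual first-order CMOS model, P = a · C · V² · f, where a is the switching activity, C the switched capacitance, V the supply voltage and f the clock frequency. The sketch below evaluates that model and converts the reported average powers into energy per decoded bit. It assumes that both averages (0.7 W serial, 0.5 W parallel) were taken at the 8.3 Mbps sample point, which the slides suggest but do not state outright; the function names are hypothetical.

```python
def dynamic_power_w(activity: float, capacitance_f: float,
                    vdd_v: float, freq_hz: float) -> float:
    """First-order dynamic power model: P = activity * C * Vdd^2 * f."""
    return activity * capacitance_f * vdd_v ** 2 * freq_hz


def energy_per_bit_nj(avg_power_w: float, throughput_mbps: float) -> float:
    """Average energy spent per decoded bit, in nanojoules."""
    return avg_power_w / (throughput_mbps * 1e6) * 1e9


if __name__ == "__main__":
    # Averages from the slide; the shared 8.3 Mbps operating point is an assumption.
    for name, power in (("serial", 0.7), ("parallel", 0.5)):
        print(f"{name}: {energy_per_bit_nj(power, 8.3):.1f} nJ/bit")
```

Read this way, the parallel decoder draws higher peaks but finishes each decode in one cycle instead of twelve, so it switches for less of the time and its average power is lower.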