FPGA: Real needs and limits

Presentation transcript:

FPGA: Real needs and limits Politecnico di Milano NECST Meeting Room 9 November, 2016 Antonio R. Miele Marco D. Santambrogio Politecnico di Milano

2 Motivations Reconfigurable systems provide interesting new features for hardware/software co-design and, more generally, for embedded system design, but they also introduce new problems in their implementation and management. This is particularly true for systems that implement self partial reconfiguration, such as Xilinx platforms. This talk presents the scenarios (e.g. flexibility, lack of resources…) in which reconfiguration can be effective, and also shows the drawbacks this feature introduces. We show that there are two kinds of limits, theoretical and physical, and try to highlight possible solutions to both.

3 Outline Some real needs Limits and drawbacks

4 What’s next Some real needs: Behavioral and structural flexibility; Performance enhancement; Fault tolerance. Limits and drawbacks

5 Behavioral and Structural flexibility Speed up the overall computation of the final system. Increasing need for behavioral flexibility in embedded systems design: support of new standards, e.g. in media processing; addition of new features. New applications too large to fit on the device all at once.

6 HW vs SW

7 HW vs SW

8 Digital Image Processing The Canny edge detector is used to detect the edges in a given input image of i [Kb]. 4 functionalities: Image smoothing → remove the noise; Gradient operator → highlight regions with high spatial derivative; Non-maximum suppression → reveal the edges; Hysteresis → remove false edges. Each functionality has to be executed using an input of j [Kb], with j ≤ i and x = i/j. Time analysis to identify a first partition into HW cores and SW cores: Non-maximum suppression implemented as a SW core; Image smoothing, Gradient operator and Hysteresis implemented as HW cores.
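
The time analysis on this slide can be sketched as a simple cost model. The sketch below assumes that x = i/j is the number of times each functionality is invoked (rounded up so a partial last chunk is still processed) and that stages run one after another without overlap. The function names and the per-chunk timings are hypothetical placeholders, not values from the talk.

```python
import math

# Stage names follow the slide; the mapping mirrors its first HW/SW partition.
STAGES = ("smoothing", "gradient", "non_max_suppression", "hysteresis")
FIRST_PARTITION = {
    "smoothing": "hw",
    "gradient": "hw",
    "non_max_suppression": "sw",
    "hysteresis": "hw",
}

# Placeholder per-chunk timings in ms (hypothetical, not measured values).
EXAMPLE_TIMES = {
    "smoothing": {"hw": 1.0, "sw": 6.0},
    "gradient": {"hw": 1.5, "sw": 8.0},
    "non_max_suppression": {"hw": 2.0, "sw": 2.5},
    "hysteresis": {"hw": 1.0, "sw": 5.0},
}


def chunk_count(i_kb, j_kb):
    """Number of invocations x = i / j, rounded up to cover a partial last chunk."""
    if not 0 < j_kb <= i_kb:
        raise ValueError("expected 0 < j <= i")
    return math.ceil(i_kb / j_kb)


def total_time_ms(i_kb, j_kb, per_chunk_ms, partition):
    """Total time if every stage runs once per chunk, with no overlap between stages."""
    x = chunk_count(i_kb, j_kb)
    return x * sum(per_chunk_ms[stage][partition[stage]] for stage in STAGES)
```

Comparing `total_time_ms` for different `partition` dictionaries is one way to rank candidate HW/SW splits before looking at resource requirements.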

9 DIP: Partial Reconfiguration Static side and IP-Core resource requirement analysis. Time analysis, resource requirements and reconfigurability evaluation: Hysteresis implemented as a SW core; Image smoothing and Gradient operator implemented as reconfigurable cores. Reconfiguration time (fa into fb): 368 ms.
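
A reconfiguration time of 368 ms only pays off if the hardware core is used often enough to recover it. The following is a minimal break-even sketch under the assumption that a core is reconfigured once and then called repeatedly; only the 368 ms figure comes from the slide, while the function names and the per-call timings used to exercise them are hypothetical.

```python
import math

RECONF_MS = 368.0  # reconfiguration time reported on the slide (fa into fb)


def break_even_calls(t_sw_ms, t_hw_ms, t_reconf_ms=RECONF_MS):
    """Smallest number of calls for which HW savings cover one reconfiguration."""
    saving = t_sw_ms - t_hw_ms
    if saving <= 0:
        return None  # hardware is never faster, so reconfiguring never pays off
    return math.ceil(t_reconf_ms / saving)


def reconfiguration_pays_off(n_calls, t_sw_ms, t_hw_ms, t_reconf_ms=RECONF_MS):
    """True if n_calls in hardware, plus one reconfiguration, beat pure software."""
    return t_reconf_ms + n_calls * t_hw_ms <= n_calls * t_sw_ms
```

For example, with a hypothetical 10 ms software call and a 2 ms hardware call, the core has to be invoked 46 times before the 368 ms overhead is recovered.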

10 Damage/Reliability SRAM-based FPGAs are particularly sensitive to radiation effects, not only in critical environments but also at terrestrial level. Alpha particles hitting devices cause temporary and permanent faults. Temporary faults can be modeled as: modification of the data being processed (user-memory corruption); modification of the functionality being performed (configuration-memory corruption). Embedded systems implemented on FPGAs need “robustness” to radiation, achieved by means of by-design fault tolerance.
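
The slide does not name a specific fault-tolerance technique. As an illustration only, the sketch below uses triple modular redundancy (TMR), a common by-design approach in which three replicas compute the same value and a voter masks a single faulty one. The helper names are hypothetical, and a flipped bit stands in for a temporary fault in user memory.

```python
def majority_vote(a, b, c):
    """Bitwise 2-out-of-3 majority over three replica outputs."""
    return (a & b) | (a & c) | (b & c)


def flip_bit(word, bit):
    """Model a single-event upset by inverting one bit of a word."""
    return word ^ (1 << bit)
```

A single corrupted replica is outvoted, but the same upset in two replicas is not, which is why TMR is usually combined with some form of repair of the faulty replica.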

11 What’s next Some real needs. Limits and drawbacks: Simulation and verification; Design flow support; Reconfiguration time overhead

12 Limits and Drawbacks Simulation and verification. Design flow: the need for a comprehensive framework that can guide designers through the whole implementation process is becoming stronger. Reconfiguration times heavily impact the final solution’s latency.

13 Simulation and Verification A new way of understanding simulation: simulation used to explore the design space to find the best architectural solution. Support for HW/SW codesign solutions, but no standard ways to verify the overall (reconfigurable) design. Unfocused tools for the verification of all the reconfiguration-related aspects: Xilinx ChipScope, JBits (no longer supported).

14 Design flow Dynamically reconfigurable embedded systems are attracting increasing interest from both the scientific and the industrial world. The need for a comprehensive framework that can guide designers through the whole implementation process is becoming stronger. There are several techniques to exploit partial reconfiguration, but there are few frameworks and tools to design dynamically reconfigurable systems: they don’t take into consideration both the HW and the SW side of the final architecture; they are not able to support different devices; they cannot be used to design systems for different architectural solutions.

15 SDx - Origin: Productivity Gap 19 billion transistors today.

16 SDx - Origin: Productivity Gap Normal mortals cannot easily program massively parallel systems. 19 billion transistors today.

17 SDx - Origin: Productivity gap from another angle (David Thomas, Imperial College, UK) NRE cost too steep. FPGAs provide large speed-up and power savings – at a price! Days or weeks to get an initial version working. Multiple optimisation and verification cycles to get high performance.

18 Innovation: Evolution of Design Environments (diagram axes: Time, Abstraction) ISE: RTL-based design entry with IP library (legacy). MicroBlaze, SDK, EDK: embedded CPU integration. Vivado HLS, SDNet (DSL PX): block stitching and manual integration in platform in RTL; raised abstraction for accelerators. SDSoC, SDNet, SDAccel: predefined methods for data transfer & automated implementation; simplified host integration & automated infrastructure creation.

Platform creation, monitoring & profiling, runtime OS, static and dynamic workload partitioning, cloud integration

23 Reconfiguration challenges Reconfiguration times heavily impact the final solution’s latency. Hiding reconfiguration time is not sufficient. Possible solutions: Trivial: bitstream size reduction. Complex: maximize the reuse of configured modules; reconfiguration hiding; alternative implementation (SW execution); relocation.

24 Tasks reuse Reconfiguration times heavily impact the final solution’s latency; therefore, do not only try to hide the reconfigurations, but also try to maximize the reuse of reconfigurable modules. Schedule length is on average at least 18.6% better than the shortest one and 19.7% better than the average.
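
The benefit of module reuse can be illustrated with a small simulation that counts how many reconfigurations a task sequence triggers. This is a sketch under stated assumptions, not the scheduler behind the figures above: tasks run in the given order, each task needs exactly one module, and when all regions are busy the least recently used module is evicted. The function name and the eviction policy are hypothetical.

```python
from collections import OrderedDict


def count_reconfigurations(task_modules, n_regions, reuse=True):
    """Count reconfigurations needed to run tasks in order on n_regions regions."""
    loaded = OrderedDict()  # module name -> None, ordered from least to most recently used
    reconfigs = 0
    for module in task_modules:
        if module in loaded:
            loaded.move_to_end(module)  # mark as most recently used
            if reuse:
                continue  # module is already configured: no reconfiguration
        else:
            if len(loaded) >= n_regions:
                loaded.popitem(last=False)  # evict the least recently used module
            loaded[module] = None
        reconfigs += 1
    return reconfigs
```

With the sequence `["fir", "fft", "fir", "aes", "fir", "fft"]` on two regions, reuse brings the count down from 6 reconfigurations to 4, and every avoided reconfiguration removes its full latency from the schedule.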

25 Reconfiguration hiding

26 Reconfiguration hiding

27 Alternative implementation (SW execution) Object code implemented as hardware components does not always guarantee the best performance… Cryptography architecture: 1 GPP running Linux; 2 reconfigurable regions; 2 cryptography services (AES and DES).
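
One way to read this slide is as a runtime decision: if the requested core is not already configured, running the service in software can be faster than reconfiguring a region first. The sketch below assumes the two-region AES/DES setup from the slide; the `Dispatcher` class, its fields and all timing values are hypothetical and only illustrate the comparison.

```python
from dataclasses import dataclass, field


@dataclass
class Dispatcher:
    """Chooses between a HW core and a SW routine for each service request."""

    t_sw_ms: dict  # software execution time per service
    t_hw_ms: dict  # hardware execution time per service
    t_reconf_ms: float  # time to load a core into a reconfigurable region
    regions: list = field(default_factory=lambda: [None, None])  # 2 regions

    def choose(self, service):
        if service in self.regions:
            return "hw"  # core already configured: no reconfiguration needed
        hw_total = self.t_reconf_ms + self.t_hw_ms[service]
        return "hw_after_reconfig" if hw_total < self.t_sw_ms[service] else "sw"
```

With AES already loaded, an AES request goes to hardware, while a DES request falls back to software whenever the reconfiguration time outweighs the hardware speed-up.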

28 Relocation: The Problem

29 Relocation: The Problem

30 Relocation: The Problem

31 Relocation: Scenario

32 Relocation: Motivation

33 Relocation: Motivation

34 Relocation: Motivation

Relocation: Rationale Bitstream relocation technique to: speed up the overall system execution; reduce the amount of memory used to store partial bitstreams; achieve preemptive core execution; assign bitstream placement at runtime.

36 Relocation: Virtual homogeneity

37 BiRF - Relocation management (figure: FPGA regions labeled Empty, A, B) Create an integrated HW/SW system to manage relocation (1D and 2D) in reconfigurable architectures: maintain information on FPGA status; decide how to efficiently allocate tasks; provide support for effective task allocation; perform bitstream relocation.
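
The bookkeeping part of such a relocation manager can be sketched for the 1D case. The class below is a hypothetical illustration, not BiRF itself: it models the FPGA as a row of columns, records which task owns each column, and uses a first-fit search to pick the start column a bitstream would be relocated to. Rewriting the frame addresses inside the partial bitstream, which is the actual relocation step, is outside the scope of the sketch.

```python
class Fabric1D:
    """Tracks column ownership on a 1D-reconfigurable fabric."""

    def __init__(self, n_columns):
        self.columns = [None] * n_columns  # None marks an empty column

    def find_slot(self, width):
        """Return the first start column with `width` free columns, or None."""
        if width < 1:
            raise ValueError("width must be at least 1")
        run = 0
        for idx, owner in enumerate(self.columns):
            run = run + 1 if owner is None else 0
            if run == width:
                return idx - width + 1
        return None

    def place(self, task, width):
        """Allocate `width` contiguous columns to `task`; return the start column."""
        start = self.find_slot(width)
        if start is not None:
            self.columns[start:start + width] = [task] * width
        return start

    def remove(self, task):
        """Free every column owned by `task`."""
        self.columns = [None if c == task else c for c in self.columns]
```

Because placement is decided at runtime, a single stored bitstream per core is enough in this model; the start column returned by `place` tells the relocation step where to move it.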

Questions…