Torino, Italy – June 25, 2013 NASA/ESA Conference on Adaptive Hardware and Systems (AHS-2013) C. Pilato R. Cattaneo, C. Pilato, M. Mastinu, M.D. Santambrogio.

Presentation transcript:

Slide 1: Runtime Adaptation on Dataflow HPC Platforms
Torino, Italy – June 25, 2013. NASA/ESA Conference on Adaptive Hardware and Systems (AHS-2013)
Presenter: C. Pilato
R. Cattaneo, C. Pilato, M. Mastinu, M.D. Santambrogio (Politecnico di Milano – Dip. di Elettronica, Informazione e Bioingegneria)
O. Kadlcek, O. Pell (Maxeler Technologies Ltd., London, UK)

Slide 2 (Christian Pilato – Politecnico di Milano): Context Definition
- The portion of the application that needs to be accelerated is usually implemented in hardware
- Resource limitations can become a bottleneck
- In some contexts, the HPC application should be able to adapt to its environment
- Partial dynamic reconfiguration is a well-known technique for changing behavior at run time while reusing the same logic across different tasks

Slide 3: Reconfigurable Computing
“Reconfigurable computing is intended to fill the gap between hardware and software, achieving potentially much higher performance than software, while maintaining a higher level of flexibility than hardware” (K. Compton and S. Hauck, Reconfigurable Computing: A Survey of Systems and Software, 2002)

Slide 4: Reasons Behind
- Some applications require performance that software cannot achieve
- Some applications need to be flexible, modifiable, and adaptable; traditional hardware cannot provide this
- Reconfigurable computing platforms can be altered after deployment, yielding a high-performance device able to meet resource, adaptability, and reliability constraints

Slide 5: Maxeler Architecture
- Maxeler systems are based on the interaction between a CPU and an FPGA
- Maxeler uses FPGAs only as devices devoted to hardware acceleration
Why not try to enhance the flexibility and performance of Maxeler platforms by exploiting some intrinsic characteristics of FPGAs?

Slide 6: Objectives
Rationale
- Dynamic Partial Reconfiguration is a technique that can address problems such as a lack of available resources, system adaptability, and reliability
- Maxeler architectures are very efficient for computation, but they do not support Dynamic Partial Reconfiguration
Goals
- Design a new tool flow that supports Dynamic Partial Reconfiguration in Maxeler architectures, to offer adaptivity in the HPC domain

Slide 7: Canny edge detector

Slide 8: FPGAs
- FPGAs are the reconfigurable devices employed in Maxeler systems
- FPGAs can be configured after deployment
- The FPGAs considered here belong to the Xilinx Virtex-6 family

Slide 9: Reconfiguration in FPGAs
Useful Definitions
- Full Bitstream
- Reconfigurable partitions
- Reconfigurable modules
- Partial Bitstream
- Configurations
(Figure labels: FPGA, Full bitstream)

Slide 10: Maxeler Architecture

Slide 11: Example application
(Figure labels: Manager, SLiC)

Slide 12: MaxCompiler flow
(Diagram labels: MaxIDE, Java compilation, VHDL, BIT file, Java runtime)

Slide 13: Preliminary Considerations
- Hierarchical design vs. flat design
- NGDBuild, MAP, PAR, and Bitgen are run once per configuration
- A PXML file is needed to drive the process

Slide 14: Proposed Approach
- Focusing on Kernels instead of the Manager
  - Kernels in the same Reconfigurable Block must have the same characteristics
  - In every Configuration, exactly one Kernel must be assigned to each Reconfigurable Block
  - The same Kernel cannot be placed in two different Reconfigurable Blocks
- Preserving the MaxCompiler/Xilinx tool flow structure as much as possible
- Masking the details from the designer
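
The three Kernel placement rules above can be checked mechanically. The Python sketch below is only an illustration and is not part of MaxCompiler: the function name `check_configurations`, the block and kernel names, and the reading of "same characteristics" as "same stream interface" are all assumptions.

```python
def check_configurations(blocks, kernels, configurations):
    """Return a list of rule violations (empty if the design is valid).

    blocks:         iterable of Reconfigurable Block names
    kernels:        dict kernel name -> interface signature (assumed meaning
                    of "same characteristics")
    configurations: dict configuration name -> {block name: kernel name}
    """
    errors = []
    home_block = {}       # kernel -> the only block it may occupy
    block_signature = {}  # block -> signature shared by all of its kernels
    expected = set(blocks)
    for cfg_name, assignment in configurations.items():
        # Rule: exactly one Kernel per Reconfigurable Block in every Configuration.
        if set(assignment) != expected:
            errors.append(f"{cfg_name}: every block needs exactly one kernel")
        for block, kernel in assignment.items():
            # Rule: the same Kernel cannot be placed in two different blocks.
            if home_block.setdefault(kernel, block) != block:
                errors.append(
                    f"{cfg_name}: {kernel} already belongs to {home_block[kernel]}")
            # Rule: Kernels sharing a block must have the same characteristics.
            sig = kernels[kernel]
            if block_signature.setdefault(block, sig) != sig:
                errors.append(
                    f"{cfg_name}: {kernel} does not match the interface of {block}")
    return errors


# Hypothetical design: two blocks, four kernels with identical interfaces.
kernels = {name: ("pixel_in", "pixel_out") for name in ("k1", "k2", "k3", "k4")}
valid = {"A": {"RB0": "k1", "RB1": "k2"}, "B": {"RB0": "k3", "RB1": "k4"}}
print(check_configurations(["RB0", "RB1"], kernels, valid))  # prints []
```

Returning a list of messages rather than raising on the first problem lets a designer see every violated rule in one pass.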

Slide 15: Reconfiguration on Kernels

Slide 16: User interface: DFE code
PRManagerMain...
Configuration A =...
Configuration B =...
build(A,B)
Reconfigurable Block = Reconfigurable Partition
Kernel = Reconfigurable Module

Slide 17: Considerations

Slide 18: User interface: Host code
(Figure labels: max_reconfig_partial_bitstream, DFE)

Slide 19: Case Study: Edge Detection
- Canny edge detection is applied to a video
- There are two Reconfigurable Blocks and a total of four filters
- Each filter is a Reconfigurable Module
- Initially, the first two filters are applied
- Then the device is partially reconfigured and the other two filters are applied
(Figure label: DFE)
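
The switch between the two filter pairs can be sketched as a comparison of two Configurations, with one partial bitstream loaded per Reconfigurable Block whose module changes. This Python model is an illustration only: the filter names, the block names, and the `load_partial` callback (a stand-in for the real host-side call, whose signature is not shown in the slides) are assumptions.

```python
def blocks_to_reconfigure(current, target):
    """Blocks whose Reconfigurable Module differs between two Configurations."""
    return sorted(block for block, kernel in target.items()
                  if current.get(block) != kernel)


def switch_configuration(current, target, load_partial):
    """Load one partial bitstream per changed block and return the new state."""
    for block in blocks_to_reconfigure(current, target):
        load_partial(block, target[block])  # hypothetical stand-in for the host call
    return dict(target)


# Two Reconfigurable Blocks and four filters, as in the case study.
config_a = {"RB0": "filter1", "RB1": "filter2"}
config_b = {"RB0": "filter3", "RB1": "filter4"}

loaded = []
state = switch_configuration(config_a, config_b,
                             lambda block, kernel: loaded.append((block, kernel)))
print(loaded)  # prints [('RB0', 'filter3'), ('RB1', 'filter4')]
```

In this case study both blocks change, so both partial bitstreams are loaded; if a Configuration kept a filter in place, that block would be skipped.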

Slide 20: MaxWorkstation
- The targeted platform is MaxWorkstation
- It contains an Intel i7 870 quad-core CPU with 16 GB RAM
- The Intel CPU is connected to the DFE via PCI Express
- The DFE is a MAX3 board (Xilinx Virtex-6) with 24 GB RAM

Slide 21: Experimental Results
- The methodology was applied to a video taken from “Mission Impossible”
- It is combined with a set of compiler extensions for the automatic code generation of the kernels
- The details are fully hidden from the designer
[VIDEO]

Slide 22: Conclusions and Future Work
- The proposed approach integrates Partial Dynamic Reconfiguration into a dataflow architecture
- The process is fully transparent to the designer
- Future work will focus on the current limitations:
  - Reconfigurable Area constraints can be specified only as multiples of clock regions
  - During the partial reconfiguration of some Reconfigurable Blocks, all the Kernels are held in reset

Slide 23: Questions?

Slide 24: Implementation: design flow
The build process is divided into four main stages

Slide 25: First build stage
- When the build process starts, MaxDC, XST and NGCBuild are run independently for each Reconfigurable Block and for the static part
- The result of this first stage is a large number of netlist files

Slide 26: Second build stage
- The second stage consists of running NGDBuild, MAP, PAR, pr_verify and Bitgen for each Configuration
- The PXML file is generated automatically
- The static part is implemented only in the first Configuration
- Each reconfigurable module is implemented only the first time it appears in a Configuration
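
The reuse policy of the second stage can be sketched as a small planner that decides, for each Configuration, which parts are implemented from scratch and which are reused from an earlier run. This is a Python model under assumptions: the function name `plan_second_stage`, the dictionary layout, and the block and kernel names are hypothetical, and no tool is actually invoked.

```python
TOOLS = ("NGDBuild", "MAP", "PAR", "pr_verify", "Bitgen")  # run per Configuration


def plan_second_stage(configurations):
    """Plan the per-Configuration runs.

    configurations: ordered list of (name, {block: kernel}) pairs.
    """
    implemented = set()  # (block, kernel) pairs already placed and routed
    plan = []
    for index, (name, assignment) in enumerate(configurations):
        modules = sorted(assignment.items())
        fresh = [m for m in modules if m not in implemented]
        reused = [m for m in modules if m in implemented]
        implemented.update(fresh)
        plan.append({
            "configuration": name,
            "tools": TOOLS,
            # The static part is implemented only in the first Configuration.
            "implement_static": index == 0,
            # Modules are implemented only the first time they appear.
            "implement_modules": fresh,
            "reuse_modules": reused,
        })
    return plan


# Hypothetical design in which Configuration B keeps kernel k2 in block RB1.
plan = plan_second_stage([
    ("A", {"RB0": "k1", "RB1": "k2"}),
    ("B", {"RB0": "k3", "RB1": "k2"}),
])
for step in plan:
    print(step["configuration"], step["implement_static"],
          step["implement_modules"], step["reuse_modules"])
```

Tracking modules as (block, kernel) pairs is a modelling choice; since a Kernel cannot be placed in two different blocks, the kernel name alone would identify a module equally well.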

Slide 27: Final stage
- Once the full bitstream and all the partial ones have been generated, they are encapsulated in the .max file
- The first Configuration passed to the build method is chosen as the “default” Configuration
- This means that its full bitstream is loaded into the CFPGA when the program starts