Berlin, Germany – January 21st, 2013 A2B: A Framework for Fast Prototyping of Reconfigurable Systems Christian Pilato, R. Cattaneo, G. Durelli, A.A.

Presentation transcript:

Berlin, Germany – January 21st, 2013
A2B: A Framework for Fast Prototyping of Reconfigurable Systems
Christian Pilato, R. Cattaneo, G. Durelli, A.A. Nacci, M.D. Santambrogio, D. Sciuto
Politecnico di Milano, Dipartimento di Elettronica, Informazione e Bioingegneria, Italy
Workshop on Reconfigurable Computing (WRC)

Motivations
- The design of reconfigurable systems is a difficult task
  - Interactions between the different phases have to be taken into account
- Decisions in the frontend phase may strongly affect the backend implementation
  - E.g., mapping onto reconfigurable regions and floorplacing of the tasks may generate low-quality solutions due to a wrong partitioning or assignment of implementations
- The optimal design methodology (and the number of its iterations) cannot be known in advance
A2B is an ongoing project at Politecnico di Milano to assist the design of such complex systems

Agenda
- Framework Overview
- How do we explore the design space?
- How do we generate a design solution?
- Conclusions and Future Work

Framework Overview
- Inputs:
  - Information about the target device (.XML)
  - Application source files (.C)
- Decision Making (Exploration):
  - Task graph generation
  - Library generation
  - Mapping, Scheduling, Floorplacing
  - Architectural modification
- Refinement (Evaluation):
  - Specification of the platform
  - Generation of the SW code
- Output:
  - Project files ready for synthesis with back-end tools

XML Exchange Format
- The entire project can be represented through an XML file
  - Architecture: components’ characteristics (e.g., reconfigurable regions), …
  - Applications: source code files and profiling information
  - Library: task implementations with their characterization (time, resources, ...)
  - Partitions: task graph, mapping and scheduling, …
- It allows a modular organization of the framework, as well as the sharing of information among the different phases
  - The phases can be applied in any order to progressively optimize the design
  - The designer can perform as many iterations as needed to refine the solution
- Specific details of the target architecture are taken into account only in the refinement phase (interactions with backend tools)
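
The role of the XML file as a shared store between phases can be illustrated with a minimal Python sketch. The tag and attribute names below (`project`, `component`, `implementation`, `mapping`, and so on) are hypothetical and chosen only for illustration; the slides do not show the actual A2B schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical project file; tag and attribute names are illustrative only
# and are NOT the actual A2B schema.
PROJECT_XML = """
<project>
  <architecture>
    <component id="cpu0" type="processor"/>
    <component id="rr0" type="reconfigurable_region"/>
  </architecture>
  <applications>
    <source file="main.c"/>
  </applications>
  <library>
    <implementation id="fir_sw" task="fir" target="processor" time="120"/>
    <implementation id="fir_hw" task="fir" target="reconfigurable_region" time="15"/>
  </library>
  <partitions/>
</project>
""".strip()


def load_library(root):
    """Group task implementations by the task they implement."""
    library = {}
    for impl in root.iter("implementation"):
        library.setdefault(impl.get("task"), []).append(
            {"id": impl.get("id"),
             "target": impl.get("target"),
             "time": int(impl.get("time"))}
        )
    return library


def record_mapping(root, task, impl_id, component):
    """Store a mapping decision in the file so that later phases can read it."""
    partitions = root.find("partitions")
    ET.SubElement(partitions, "mapping",
                  task=task, implementation=impl_id, component=component)


root = ET.fromstring(PROJECT_XML)
library = load_library(root)
fastest = min(library["fir"], key=lambda impl: impl["time"])
record_mapping(root, "fir", fastest["id"], "rr0")
```

In this sketch one phase reads the library section and another writes its decision into the partitions section, so the phases communicate only through the file, which is what lets them run in any order.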

Task Graph and Library Generation
- Application source code files can be analyzed to extract the task graphs
  - Profiling information can drive the generation of such solutions
- The task graph is then specified in the XML file as processing nodes connected by data transfers
  - Currently task graphs can be designed by hand; automated parallelization methodologies will be investigated in the future
  - Transformations can improve the description by splitting/merging the tasks
- LLVM-based compiler to extract the DFG of each task
  - Estimation of required resources (including bit-width analysis)
  - Interaction with HLS tools to obtain more accurate results
- Generated implementations are then stored in the XML file to offer opportunities to the mapping phase and information to the floorplacer
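
A task graph of processing nodes connected by data transfers, together with a merge transformation, might be sketched as follows. The `TaskGraph` class, its methods, and the task names are assumptions made for this example and do not come from A2B.

```python
from dataclasses import dataclass, field


@dataclass
class TaskGraph:
    """Hypothetical task graph: tasks plus data transfers between them."""
    tasks: set = field(default_factory=set)
    transfers: dict = field(default_factory=dict)  # (src, dst) -> bytes moved

    def add_transfer(self, src, dst, nbytes):
        self.tasks.update((src, dst))
        key = (src, dst)
        self.transfers[key] = self.transfers.get(key, 0) + nbytes

    def merge(self, a, b, merged):
        """Return a new graph in which tasks a and b are fused into `merged`."""
        out = TaskGraph(tasks=(self.tasks - {a, b}) | {merged})
        for (src, dst), nbytes in self.transfers.items():
            src = merged if src in (a, b) else src
            dst = merged if dst in (a, b) else dst
            if src != dst:  # a transfer between a and b becomes internal
                out.add_transfer(src, dst, nbytes)
        return out


graph = TaskGraph()
graph.add_transfer("read", "filter", 1024)
graph.add_transfer("filter", "scale", 1024)
graph.add_transfer("scale", "write", 1024)

fused = graph.merge("filter", "scale", "filter_scale")
```

Merging removes the transfer between the two fused tasks, which is the kind of effect a splitting/merging transformation trades against coarser task granularity. The sketch does not check whether a merge would introduce a cycle.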

Mapping, Scheduling and Floorplacing
- We generate one or more configurations where each task of the applications is analyzed and assigned to
  - An available and admissible implementation
  - A component of the architecture (e.g., processor or reconfigurable region)
- This supports
  - Hardware sharing: "share" implementations across different tasks
  - Task relocation: move a task implementation to another processing element at run-time
- Tasks assigned to the same reconfigurable region are analyzed to determine the region's constraints and resource requirements
  - Floorplacing of the regions determines their positions and the feasibility of the mapping
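
The assignment step can be sketched with a simple greedy mapper. This is not the mapping algorithm used in A2B, which the slides do not describe. The library contents, the component names, and the rule that a time-multiplexed region must fit its largest task are all assumptions made for this example, and scheduling and reconfiguration overheads are ignored.

```python
# Hypothetical library: task -> list of (impl_id, target_type, time, luts).
LIBRARY = {
    "fir": [("fir_sw", "processor", 120, 0), ("fir_hw", "region", 15, 800)],
    "fft": [("fft_sw", "processor", 300, 0), ("fft_hw", "region", 40, 1500)],
    "log": [("log_sw", "processor", 10, 0)],
}
# Hypothetical architecture: component id -> component type.
COMPONENTS = {"cpu0": "processor", "rr0": "region"}


def map_tasks(library, components):
    """Greedy mapping: give each task its fastest admissible implementation."""
    mapping = {}
    for task, impls in library.items():
        admissible = [
            (time, impl_id, comp, luts)
            for impl_id, target, time, luts in impls
            for comp, ctype in components.items()
            if ctype == target  # the implementation must match the component type
        ]
        time, impl_id, comp, luts = min(admissible)
        mapping[task] = (impl_id, comp, luts)
    return mapping


def region_requirements(mapping, components):
    """Assumed rule: a region hosts one task at a time, so it must fit the largest."""
    required = {}
    for impl_id, comp, luts in mapping.values():
        if components[comp] == "region":
            required[comp] = max(required.get(comp, 0), luts)
    return required


mapping = map_tasks(LIBRARY, COMPONENTS)
requirements = region_requirements(mapping, COMPONENTS)
```

The per-region requirement computed here is the kind of information a floorplacer would then use to decide whether the regions can actually be placed on the device.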

Architecture Exploration
- An additional step can be included to explore the target architecture
  - Adding/removing processing elements
  - Modifying their parameters
  - Determining the proper interconnection topology
- It can affect:
  - Task graph transformations and library generation
  - Mapping and floorplacing: modification of the computational resources (especially the number of reconfigurable regions)
- It allows a progressive refinement of the solution and a concurrent customization of both architecture and application
  - E.g., mapping and floorplacing can suggest which resources should be added
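
As a rough illustration of exploring the number of reconfigurable regions, the sketch below increases the region count until a deadline is met. The cost model (independent hardware tasks, longest-task-first list scheduling, no reconfiguration time) and the function names are assumptions for this example, not the exploration strategy used in A2B.

```python
def makespan(task_times, num_regions):
    """Longest-task-first list scheduling of independent tasks on the regions."""
    loads = [0] * num_regions
    for t in sorted(task_times, reverse=True):
        least_loaded = loads.index(min(loads))
        loads[least_loaded] += t
    return max(loads)


def explore_regions(task_times, max_regions, deadline):
    """Return the smallest region count meeting the deadline, with its makespan."""
    for k in range(1, max_regions + 1):
        m = makespan(task_times, k)
        if m <= deadline:
            return k, m
    return None  # no architecture within the limit meets the deadline


# Hypothetical execution times of four hardware tasks and a deadline of 60.
result = explore_regions([40, 15, 30, 25], max_regions=4, deadline=60)
```

With these numbers a single region gives a makespan of 110, while two regions give 55, so the exploration stops at two regions. In the flow described above, feedback of this kind from mapping would suggest which resources to add.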

Supported Platforms
- Virtex-5 XC5VLX110T (XUPV5 board)
  - Two XCF32P Platform Flash PROMs (32Mbyte each)
  - SystemACE™ Compact Flash configuration controller
  - 64-bit wide 256Mbyte DDR2 small outline DIMM (SODIMM)
- MPC Research Platform MaxWorkstation
  - Intel i7, 16GB RAM, 500GB HDD
  - Max3 dataflow engine (DFE)
  - Virtex 6 SX475T FPGA, 24GB memory
  - DFE connected to CPU via PCI Express gen2 x8
[Figure: block diagrams of the two platforms. XUPV5: reconfigurable area, CPU0, CPU1, DDR2 (256MB). MaxWorkstation: CPU with DRAM (16GB); MAX3 DFE with interface FPGA, compute FPGA and DRAM (24GB).]

Solution Refinement for the Target Platforms
[Figure: refinement flow from the .c and .xml inputs to the exec, bin and bit outputs. Generated artifacts: source code for CPU, DFGs for HW tasks, mapping configurations. FPGA-based embedded system: CPU compiler, DFG-C, HLS (C-VHDL), manual VHDL implementations, bitstream generation. MaxWorkstation: DFG-MaxJ, MaxIDE, HLS (MaxJ-VHDL), manual MaxJ implementations, bitstream generation.]
The code can always be further optimized by hand, e.g., glue code for data transfers

Graphical User Interface (GUI)
- Practical GUI to support the designer, to limit errors in the interactions with the XML, and to allow custom design methodologies

Conclusions and Future Work
- A2B is a modular framework to design reconfigurable systems
  - Easy to plug in alternative methods for each phase
  - Possibility to perform progressive refinement of both application and architecture
- A2B is becoming part of a larger project (ASAP – Advanced Synthesis of Applications and Platforms)
  - Refinement will also include the generation of SystemC TLM models of the target system for (co-)simulation and early validation
  - Closer interaction with actual synthesis (e.g., high-level synthesis)
  - Automated methodologies to accelerate the design
Our goal is to make them available to the community (open-source) as soon as possible (a.k.a. ASAP)!

Thank you!
Christian Pilato
Research partially funded by the European Community’s Seventh Framework Programme, FASTER project.