A Self Distributing Virtual Machine for FPGA Multicores. Klaus Waldschmidt, Professur für Technische Informatik, J. W. Goethe-University.

Presentation transcript:

A Self Distributing Virtual Machine for FPGA Multicores
Klaus Waldschmidt, Professur für Technische Informatik, J. W. Goethe-University, Frankfurt/Main, Germany
Dagstuhl, April 2008

Slide 2
Things that think: a definition originally presented by MIT. One or more processors provide the intelligence that is necessary for the smart behaviour.
Internet of things: embedded systems are more or less networked systems. Consequently, an Internet of things exists in addition to the well-known Internet of information.

Slide 3: Embedded systems and System-on-chips
Modern System-on-chips are becoming more and more complex.
Time to market is becoming more and more critical.
Robustness and trust in electronic systems will be a big challenge in future.
Power reduction for mobile applications is becoming more and more important.
A parallel, flexible, scalable, and generic architecture will be required in future.
[Figure: design flow with system specification, hardware/software partitioning, hardware synthesis, communication synthesis, and software compilation; next to it a reconfigurable system with input, output, observer, and controller inside its environment.]

Slide 4: FPGA software model
[Figure: a reconfigurable system with input, output, observer, and controller inside its environment, shown together with an FPGA that has its own in, out, observer, and controller.]

Slide 5: Multi-core FPGA (MP-SoC)
Multi-core FPGAs create a new kind of system realization, but there are still a lot of problems to solve: performance, reliability, flexibility, and power management.
Performance: algorithms and programming (software) model.
Reliability: increase of lifespan and robustness.
Flexibility: adaptivity and self-organization.
Power management: energy reduction for mobility.

Slide 6: Autonomous and organic behaviour of multi-core computing systems
[Figure: parallel computing, reconfigurable (dynamic) computing, adaptive computing, self-organization.]

Slide 7: Multi-core Systems based on FPGA
Unite several PEs to form a parallel system.
Increase the number of PEs if needed.
Use available space on the FPGA.
Implement special functionality on the FPGA.
Reconfigure at runtime.
What we need is a software model for FPGAs to make these features manageable.
[Figure: an FPGA populated with processing elements (PE, drawn as M/PR blocks) and custom HW functions (f_1, f_2, f_3, f_x, f_y).]

Slide 8: The Self Distributing Virtual Machine (SDVM)
The SDVM as a middleware between application and hardware.
The application is to be run on heterogeneous, distributed hardware.
The application runs transparently distributed on several sites.
The SDVM is a virtualization of a parallel, adaptive, and heterogeneous cluster.
Sites can join and leave the cluster without disturbing the execution.
[Figure: three layers. The application on top, the SDVM layer with its sites in the middle, and the FPGA layer at the bottom with core type A, core type B, and HW type X (f_y), connected by a network on chip (NoC): bus, mesh, crossbar, Clos Net, …]

Slide 9: Working principle of the SDVM
The SDVM uses the dataflow principle to automatically distribute applications and data.
Code is needed at execution time only, and thus code and params are moved separately.
The SDVM features a (virtual) global shared memory using the COMA principle.
The SDVM features distributed dynamic scheduling (work stealing principle).
Sites may vanish (their data is pushed out beforehand) or join (new sites automatically ask for work) at runtime.
Code fragments can be dynamically substituted by configware.
[Figure: an FPGA with several sites, each consisting of memory, processor, and reconfigurable hardware, connected by a NoC (bus, mesh, crossbar, Clos Net, …). Code and params travel between sites and are executed where they meet; one code fragment is replaced by configware; one site receives a shutdown.]
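The dataflow firing rule and the work stealing principle named on this slide can be illustrated with a small sketch. It is not the SDVM's implementation: the names `Frame`, `Site`, `supply`, `steal_from`, and `shutdown` are hypothetical, the sites are plain Python objects stepped in turn rather than concurrent cores, and the choice to run own work newest-first and stolen work oldest-first is a common work stealing convention assumed here, not something the slides state.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Frame:
    """A pending call that may fire once all of its parameters have arrived."""
    func: Callable[..., int]
    n_params: int
    params: dict = field(default_factory=dict)

    def supply(self, slot: int, value: int) -> bool:
        """Store one parameter; return True when the frame is ready to run."""
        self.params[slot] = value
        return len(self.params) == self.n_params

    def run(self) -> int:
        return self.func(*(self.params[i] for i in range(self.n_params)))


class Site:
    """Hypothetical site with a local queue of ready frames."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.ready: deque = deque()
        self.executed = 0

    def steal_from(self, cluster: List["Site"]) -> Optional[Frame]:
        """Ask the other sites for work; take the oldest ready frame found."""
        for other in cluster:
            if other is not self and other.ready:
                return other.ready.popleft()
        return None

    def step(self, cluster: List["Site"]) -> Optional[int]:
        """Run one local frame, or a stolen one if the local queue is empty."""
        frame = self.ready.pop() if self.ready else self.steal_from(cluster)
        if frame is None:
            return None
        self.executed += 1
        return frame.run()

    def shutdown(self, cluster: List["Site"]) -> None:
        """Push all pending work to another site, then leave the cluster.

        Assumes at least one other site remains (otherwise next() raises).
        """
        target = next(s for s in cluster if s is not self)
        target.ready.extend(self.ready)
        self.ready.clear()
        cluster.remove(self)


def demo():
    a, b = Site("a"), Site("b")
    cluster = [a, b]
    for i in range(4):
        frame = Frame(lambda x, y: x + y, n_params=2)
        frame.supply(0, i)            # first parameter arrives: not ready yet
        if frame.supply(1, 10):       # second parameter arrives: frame fires
            a.ready.append(frame)     # all work starts on site a
    results = []
    while any(s.ready for s in cluster):
        for s in list(cluster):
            r = s.step(cluster)
            if r is not None:
                results.append(r)
    return sorted(results), a.executed, b.executed
```

In `demo()`, all four frames start on site `a`; site `b` has nothing to do and steals, so each site ends up executing two frames. Calling `shutdown` on a site moves its remaining frames to another site before it disappears, which mirrors the "data is pushed out beforehand" remark above.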

Slide 10: System-Virtualization using the SDVM
1. FPGAs allow for parallel systems: multiple hardcores, multiple softcores, multiple custom function units.
2. FPGAs allow for heterogeneous systems: PowerPC hardcore, MicroBlaze softcore, custom function units.
3. Runtime-reconfigurable FPGAs make dynamic systems possible.
The adapted SDVM R will act as a virtual layer for dynamically reconfigurable multi-core FPGAs.
[Figure: an FPGA with M/PR blocks and custom functions f_1 and f_2; above it the application running on the SDVM R layer, which spans core type A and core type B connected by a NoC.]

Slide 11: SDVM R Objectives
1. Combine all PEs on the FPGA to create a parallel system.
2. Provide task mobility between all PEs even if they are heterogeneous.
3. Virtualize the I/O system to enable the execution of a task on an arbitrary PE.
4. Combine the distributed memory of each PE to form a virtually shared memory.
5. Manage the reconfiguration of the FPGA.
6. Adjust the number of active PEs at runtime.
7. Hide the actual number of PEs from the application to ease programming.
8. Provide dynamic scheduling as well as code and data distribution.
These features will be provided by the SDVM R software layer.
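Objective 4, a virtually shared memory built from the PEs' distributed memories, can be pictured with a toy model in the spirit of the COMA principle: an object has no fixed home and migrates to the site that accesses it. The sketch below is a strong simplification under stated assumptions. The class and method names (`AttractionMemory`, `SharedMemory`, `owner`) are hypothetical, there is no replication or coherence protocol, and a remote object is located by scanning all sites, which a real system would avoid.

```python
class AttractionMemory:
    """Hypothetical per-site store; all stores together form one address space."""

    def __init__(self, name):
        self.name = name
        self.local = {}


class SharedMemory:
    """Toy virtually shared memory: objects migrate to the accessing site."""

    def __init__(self, sites):
        self.sites = sites

    def write(self, site, address, value):
        # Remove any copy held elsewhere so exactly one site owns the object.
        for s in self.sites:
            s.local.pop(address, None)
        site.local[address] = value

    def read(self, site, address):
        if address in site.local:
            return site.local[address]          # local hit
        for other in self.sites:
            if address in other.local:
                # Miss: migrate the object to the requesting site.
                site.local[address] = other.local.pop(address)
                return site.local[address]
        raise KeyError(address)

    def owner(self, address):
        """Return the name of the site currently holding the object, or None."""
        for s in self.sites:
            if address in s.local:
                return s.name
        return None
```

A task that reads an address on site `b` after it was written on site `a` gets the value and pulls the object over, so later accesses on `b` are local. The application code only ever sees addresses, never site names, which is the point of objective 7 as well.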

Slide 12: Implementation architecture
The SDVM R is implemented as software running on each core.
Each core forms an independent site of the SDVM R cluster.
Custom function units will get attached to a core.
Custom function units as independent sites are planned.
[Figure: an FPGA with SDVM R sites on a PowerPC hardcore and on MicroBlaze softcores, one of them with an attached custom function unit f_y, plus a custom function unit f_y as a site of its own, all connected by a NoC (bus, mesh, crossbar, Clos Net, …).]

Slide 13: Partial reconfiguration
Custom function units can be reconfigured without changing the number of sites.
Reconfiguring a site:
1. The site to reconfigure drops out of the cluster.
2. Some other site controls the partial reconfiguration of the FPGA.
3. The SDVM R layer is started on the new softcore.
4. The new site joins the cluster.
[Figure: an FPGA with four SDVM R sites connected by a NoC (bus, mesh, crossbar, Clos Net, …): a PowerPC hardcore and MicroBlaze softcores with custom function units f_y and f_z; one softcore of type A is replaced by a softcore of type B.]
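The four-step sequence for reconfiguring a site can be written down as a short sketch. It only models the ordering of the steps: the cluster is a plain dictionary from site name to core type, the function name `reconfigure_site` is hypothetical, and the actual partial reconfiguration and the start of the SDVM R layer are represented by log entries, since the slides do not describe an API for either.

```python
def reconfigure_site(cluster: dict, site: str, new_core: str) -> list:
    """Replace the core behind `site` and return a log of the steps taken.

    `cluster` maps site names to core types and is modified in place.
    """
    if site not in cluster:
        raise KeyError(site)
    if len(cluster) < 2:
        # Step 2 needs some other site to control the reconfiguration.
        raise RuntimeError("another site must control the reconfiguration")

    log = []

    # 1. The site to reconfigure drops out of the cluster.
    old_core = cluster.pop(site)
    log.append(f"{site} ({old_core}) left the cluster")

    # 2. Some other site controls the partial reconfiguration (placeholder).
    controller = next(iter(cluster))
    log.append(f"{controller} reconfigures the region of {site} as {new_core}")

    # 3. The SDVM R layer is started on the new softcore (placeholder).
    log.append(f"SDVM R layer started on {site}")

    # 4. The new site joins the cluster.
    cluster[site] = new_core
    log.append(f"{site} ({new_core}) joined the cluster")
    return log
```

For example, with a cluster of `site0` (PowerPC hardcore) and `site1` (softcore type A), reconfiguring `site1` to softcore type B makes `site0` the controller, and the cluster has the same number of sites before and after. While step 2 runs, the cluster is one site smaller, which is why the remaining sites must be able to continue without it.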

Slide 14: SDVM Test bench
The test bench is a cluster consisting of four identical Intel PCs.
Each site simulates one core of the multi-core chip.
[Figure: sites 1 to 4 connected via Ethernet, corresponding to cores 1 to 4.]

Slide 15: Another application: Energy Management (contd.)
1. The parallelism of most applications changes dynamically.
2. The SDVM features autonomous scaling, dynamic workload distribution, and distributed dynamic scheduling.
The dynamic scheduling and workload distribution offer new degrees of freedom when choosing an energy management policy.
[Figure: possible EM-state transitions due to workload variation among the states HFM, LFM, SLEEP, and OFF.]
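One way to picture an energy management policy over the four states in the figure is a per-site rule that picks a state from the current workload. The sketch below is purely illustrative. The slides name the states HFM, LFM, SLEEP, and OFF but give no policy, so the function `next_state`, its thresholds, and the use of a ready-task count and an idle counter as inputs are all assumptions; HFM and LFM are treated here simply as a faster and a slower active state.

```python
from enum import Enum


class EMState(Enum):
    HFM = "HFM"      # assumed: faster active state
    LFM = "LFM"      # assumed: slower active state
    SLEEP = "SLEEP"
    OFF = "OFF"


def next_state(ready_tasks: int, idle_ticks: int,
               high_load: int = 4, sleep_after: int = 3,
               off_after: int = 10) -> EMState:
    """Pick an energy state for one site (hypothetical threshold policy).

    ready_tasks: number of tasks waiting on this site.
    idle_ticks:  scheduling rounds since the site last had work.
    """
    if ready_tasks >= high_load:
        return EMState.HFM          # plenty of parallelism: run fast
    if ready_tasks > 0:
        return EMState.LFM          # some work: run slowly
    if idle_ticks >= off_after:
        return EMState.OFF          # long idle: site leaves the cluster
    if idle_ticks >= sleep_after:
        return EMState.SLEEP        # short idle: sleep
    return EMState.LFM              # just became idle: stay slow for now
```

Because the scheduler can move work between sites, a policy like this has more options than on a fixed machine: work can be concentrated on a few sites so that others reach SLEEP or OFF, or spread out so that all sites stay in the slower state.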

Slide 16: Conclusion
The SDVM R …
is a virtualization layer for dynamically reconfigurable FPGAs,
separates the application from the number and type of cores,
exploits the parallelism and dynamic features of today's FPGAs.
For further information visit the SDVM's homepage at

Slide 17: Thank you for your attention!