1
FlexTiles: runtime mapping of hardware accelerators on 3D self-adaptive heterogeneous manycore
Olivier Sentieys, INRIA and University of Rennes
FETCH, Ottawa, January 9
CAIRN project-team Jan. 2014
2
The Story
• What is a heterogeneous multi-core?
• Coupling FPGA fabrics to many-core in 2D
• Then what brings 3D? Flexibility and better resource usage
• FlexTiles architecture
• Specific features for task migration
• Virtual Bit-Stream
• The case for heterogeneous fabrics
3
Domain-Specific System-on-chip (SoC) with Hardware Reconfiguration
• Dynamically adapt the hardware to the application
  • Energy efficiency
  • High performance
  • Flexibility
• Self-adapting devices
  • Continuously adapt to changing environments
• Other advantages
  • Error, fault and variation tolerance
  • Security against attacks

Complementary to general-purpose architectures, domain-specific SoCs are designed for a particular application domain (such as a wireless terminal or a set-top box). They classically include several processors (GPP, DSP) and many HW IP blocks so that energy efficiency and performance are maximized. Our aim is to propose new architectures and tools for these SoCs, with a particular emphasis on reconfigurable hardware. By adapting the hardware to the application, this flexible hardware exhibits a good trade-off between performance, energy and flexibility. Moreover, dynamic reconfiguration, meaning that the hardware can be reconfigured at run-time during execution, makes it possible to build self-adapting devices that continuously adapt their structure to the environment and to the application running on them. Finally, reconfigurable SoCs help with other constraints: hardware reconfiguration can be used to mitigate errors, temporary faults or process variation, and to increase protection against attacks when security matters.

[Figure: SoC from CEA with the DART reconfigurable architecture from IRISA/INRIA - CAIRN]
4
Heterogeneous Multicores
• Many cores on a single chip for both general-purpose and embedded computing
• Heterogeneous manycores to cope with energy and performance constraints

Another strong trend in our domain is the possibility of integrating, in the near future, thousands of cores on a single chip. This is true for both general-purpose and embedded computing architectures; CAIRN will continue to focus only on the second category. We foresee that these systems will be heterogeneous multicores, to cope with both energy and performance constraints. In our case, a heterogeneous multicore architecture is a regular multicore in which each basic core comprises not only a processor and memory but also some HW accelerators. We will continue to study the impact of reconfigurable (fine-grain or coarse-grain) accelerators in these multicores and their ability to support power management and fault tolerance.

[Figure labels: Core, Proc., Reconf. HW, M, HW IPs]
5
Multicores Coupled with Reconfig. HW
2D SoC, tightly coupled HW
[Figure labels: Proc., Reconf. HW, HW IP, M]
6
Multicores Coupled with Reconfig. HW
2D SoC, loosely coupled HW
[Figure labels: I/O, Configuration RAM, Configuration Controller, DSP, RAM, HW Accelerator #1, Core]
7
Can 3D Stacking Help? 3D-Stacked Reconfigurable Accelerators
• Improved performance
• Improved flexibility
• Improved resource usage

As just mentioned, we will particularly focus our research on how to embed efficient hardware accelerators with run-time reconfiguration into a multicore. To this end, we will propose and design new configuration structures so that dynamic reconfiguration becomes much more efficient and the reconfiguration layer can be virtualized, making it easier for the software interface to manage. In other words, a task should be movable inside the fabric very easily, without a new place and route. We will also study how such architectures can take advantage of 3D stacking. In the context of the FP7 FlexTiles project, we will propose 3D-stacked reconfigurable accelerators. The main advantage is a unified reconfigurable fabric linked to the multicore by a 3D network on chip. In that case, no predefined area is dedicated to one specific core, which is much more flexible than the 2D case.

[Figure labels: Core, reconfigurable layer, multicore layer]
8
What's new with 3D?
[Figure labels: 3D NI, Reconfiguration RAM, Reconfiguration Controller, RAM, DSP]
9
FP7 FlexTiles Project in a Nutshell
FlexTiles: Self-adaptive heterogeneous manycore based on Flexible Tiles
• Oct - Sept. 2014
• Partners: Thales, Sundance, ACE, UR1, CEA, KIT, TU/e, CSEM, RUB
• Provide a heterogeneous many-core architecture offering
  • Large flexibility
  • High performance, energy efficiency
  • Raised programming efficiency
  • Self-adaptation through virtualisation
10
FlexTiles Architecture Overview
• 3D-stacked heterogeneous manycore
  • General Purpose Processors (GPP), for flexibility and programming homogeneity
  • Accelerators, for computing efficiency
    • Digital Signal Processors (DSP)
    • Dedicated hardware accelerators on an embedded FPGA (eFPGA)
• Network on Chip (NoC): ANoC and Aethereal
• Reconfigurable layer with improved relocation and migration capabilities
• Virtualization layer to provide an abstraction of the manycore and self-adaptive services
• Tool-chain for parallelisation and compilation
11
FlexTiles Architecture Overview
• Physical nodes
  • GPP node
  • DSP node
  • DDR node
  • eFPGA accelerator
• A "Tile" associates
  • 1 master node
  • 1+ slave nodes
• A tile is a logical view for architecture programming

[Figure labels: GPP Node, AI, DSP, eFPGA Fabric, NI, NoC, Config. Ctrl., DDR Node, Tile, I/O, HW acc.; physical nodes in consideration; logical nodes from a functional point of view]
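The tile definition above (one master node plus one or more slave nodes, used as a logical view) can be illustrated with a small data structure. This is only a sketch: the class names, field names and node identifiers are hypothetical, and the slide does not say which node kinds may act as master, so no such restriction is enforced here.

```python
from dataclasses import dataclass
from enum import Enum


class NodeKind(Enum):
    """Physical node kinds listed on the slide."""
    GPP = "GPP node"
    DSP = "DSP node"
    DDR = "DDR node"
    EFPGA_ACC = "eFPGA accelerator"


@dataclass(frozen=True)
class Node:
    name: str          # hypothetical identifier, for illustration only
    kind: NodeKind


@dataclass
class Tile:
    """Logical view: exactly one master node and at least one slave node."""
    master: Node
    slaves: list

    def __post_init__(self):
        # The slide states "1+ slave nodes", so an empty slave list is rejected.
        if len(self.slaves) < 1:
            raise ValueError("a tile needs at least one slave node")


# Example: a GPP master driving a DSP and an eFPGA accelerator (assumed mix).
tile = Tile(
    master=Node("gpp0", NodeKind.GPP),
    slaves=[Node("dsp0", NodeKind.DSP), Node("acc0", NodeKind.EFPGA_ACC)],
)
```

Because the tile is a logical grouping rather than a physical one, the same physical node could in principle appear in a different tile after remapping; the sketch does not model that.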
12
FlexTiles Architecture Overview
13
Outline
• What is a heterogeneous multi-core?
• Coupling FPGA fabrics to many-core in 2D
• Then what brings 3D? Flexibility and better resource usage
• FlexTiles architecture
• Specific features for task migration
• Virtual Bit-Stream
• The case for heterogeneous fabrics
14
Task Allocation & Migration in FPGA
• Predefined reconfigurable regions
• Bit-stream depends on task location

[Figure labels: HW Accelerator #1 with BS #1; HW Accelerator #1 with BS #2]
15
HW Task Migration in eFPGA
[Figure labels: 3D NI, RAM, HW Accelerator #1 with BS #1, HW Accelerator #2 with BS #2]
16
Concept of Virtual Bit-Stream
• A task is synthesized, placed and routed into a Virtual Bit-Stream (VBS)
  • Independent of the task's physical location in the fabric
  • No predefined configuration domains
  • Easier resource sharing and distribution, simplified task migration

[Figure label: Quartus II]
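The location independence described above can be sketched as a bit-stream whose cells are addressed relative to the task origin and only bound to absolute fabric coordinates at load time. This is an illustrative model, not the FlexTiles VBS format: the `VirtualBitStream` class, the `relocate` function and the configuration words are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VirtualBitStream:
    """Hypothetical VBS: configuration words keyed by (dx, dy) offsets
    relative to the task origin, so nothing depends on the final location."""
    name: str
    width: int
    height: int
    cells: dict


def relocate(vbs, origin_x, origin_y, fabric_width, fabric_height):
    """Bind a VBS to absolute coordinates; raise if it does not fit."""
    fits_x = 0 <= origin_x and origin_x + vbs.width <= fabric_width
    fits_y = 0 <= origin_y and origin_y + vbs.height <= fabric_height
    if not (fits_x and fits_y):
        raise ValueError("task does not fit in the fabric at this origin")
    # Same configuration words, shifted by the chosen origin.
    return {
        (origin_x + dx, origin_y + dy): word
        for (dx, dy), word in vbs.cells.items()
    }


# Made-up two-cell task placed twice from the same VBS.
fir = VirtualBitStream("fir", width=2, height=1,
                       cells={(0, 0): 0b1010, (1, 0): 0b0110})
at_origin = relocate(fir, 0, 0, fabric_width=8, fabric_height=4)
migrated = relocate(fir, 5, 3, fabric_width=8, fabric_height=4)
```

Both placements come from a single stored description, which is the contrast with a scheme where each predefined region needs its own bit-stream for the same task.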
17
Virtual Bit-Stream: Example
• Hiding routing details
  • Full BS is 129 bits
  • Could be reduced by giving fewer details

[Figure labels: CLBIN[0], CLBIN[1], CLBIN[2], CLBIN[3], CLBOUT; 16, 17, 18]
18
Virtual Bit-Stream: Example
• Hiding routing details
  • List of I/O and connections

[Figure labels: 16, 17, 18; 20, 8, 1, 9, 5, 18]
19
Virtual Bit-Stream
• VBS generation principle can be extended to a set of routing resources
  • Smaller size in configuration memory
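The idea of storing a list of I/O and connections rather than every routing bit can be sketched with a simple fixed-width encoding. The slides do not give the actual VBS encoding, so the format below (each connection stored as a pair of pin indices), the function names and the pin numbers are assumptions chosen purely for illustration; they are not derived from the 129-bit figure quoted earlier.

```python
def index_width(num_pins):
    """Bits needed to address one of num_pins pins (at least 1)."""
    return max(1, (num_pins - 1).bit_length())


def encode(connections, num_pins):
    """Pack (source, sink) pin pairs into one integer.

    Returns (value, bit_length). Assumed format: pins are written most
    significant first, each on index_width(num_pins) bits.
    """
    width = index_width(num_pins)
    value = 0
    for src, dst in connections:
        for pin in (src, dst):
            if not 0 <= pin < num_pins:
                raise ValueError(f"pin {pin} out of range")
            value = (value << width) | pin
    return value, 2 * width * len(connections)


def decode(value, nbits, num_pins):
    """Inverse of encode: recover the list of (source, sink) pairs."""
    width = index_width(num_pins)
    mask = (1 << width) - 1
    pins = [(value >> shift) & mask
            for shift in range(nbits - width, -1, -width)]
    return list(zip(pins[0::2], pins[1::2]))


# Hypothetical block with 32 addressable pins and three connections.
connections = [(0, 5), (1, 6), (2, 7)]
packed, nbits = encode(connections, num_pins=32)
```

With this assumed format the cost grows with the number of connections actually used instead of the number of routing switches in the block, which is one way to read the "smaller size in configuration memory" claim.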
20
Results: BS Sizes on MCNC Benchmarks
21
Results: VBS Sizes on MCNC Benchmarks
22
eFPGA Architecture using VBS
• Reconfiguration controller generates the final BS at run-time

The VBS are held in external memory. A request from a supervision node results in either the VBS being loaded or a refusal. The controller then finalizes the VBS (routing); the relative placement of its elements is fixed.

[Figure labels: reconfiguration controller; external memory holding VBS 1, VBS 2, VBS 3, … VBS N; buffer memory; data; control; 1, 2]
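The load-or-refuse behaviour described above can be sketched as follows. This is a simplified model under stated assumptions: the fabric is a plain grid of cells, each VBS is reduced to a width and a height, free space is found with a first-fit scan, and the routing finalization step is not modelled. The class and method names are hypothetical.

```python
class ReconfigurationController:
    """Toy controller: places a stored VBS on a grid or refuses the request."""

    def __init__(self, width, height, external_memory):
        self.width = width
        self.height = height
        # Assumed external memory layout: VBS name -> (width, height).
        self.external_memory = external_memory
        self.occupied = set()

    def _is_free(self, x, y, w, h):
        return all((x + dx, y + dy) not in self.occupied
                   for dx in range(w) for dy in range(h))

    def request(self, name):
        """Handle a supervision-node request.

        Returns the chosen origin (x, y), or None when the request is
        refused (unknown VBS or no free area large enough).
        """
        if name not in self.external_memory:
            return None
        w, h = self.external_memory[name]
        for y in range(self.height - h + 1):
            for x in range(self.width - w + 1):
                if self._is_free(x, y, w, h):
                    # Relative placement inside the task stays fixed;
                    # only the origin is chosen here.
                    self.occupied.update((x + dx, y + dy)
                                         for dx in range(w)
                                         for dy in range(h))
                    return (x, y)
        return None


# Made-up task sizes on a 4x2 fabric.
ctrl = ReconfigurationController(
    width=4, height=2,
    external_memory={"fir": (2, 2), "fft": (2, 2), "big": (3, 2)},
)
first = ctrl.request("fir")
second = ctrl.request("fft")
third = ctrl.request("big")
```

In this run the first two tasks fill the fabric, so the third request is refused; a real controller would also have to finalize routing for the chosen origin before writing the configuration.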
23
Outline
• What is a heterogeneous multi-core?
• Coupling FPGA fabrics to many-core in 2D
• Then what brings 3D? Flexibility and better resource usage
• FlexTiles architecture
• Specific features for task migration
• Virtual Bit-Stream
• The case for heterogeneous fabrics
24
Task Placement & Migration
• Homogeneous case
  • No constraint on task placement
  • Regular routing architecture
• Cope with heterogeneity
  • RAM, DSP, 3D I/Os
  • Migration is limited
    • vertically, to the same column
    • to the next column containing the same complex blocks

[Figure legend: Logic Element (LE), Configured LE, Task]
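The horizontal part of the migration constraint above can be sketched as a pattern match over column types: a task may only move to an origin where the sequence of column kinds it covers is identical. The column layout, the type names and the function name are assumptions made for illustration; the real fabric layout is not given in the slides.

```python
def legal_column_origins(column_types, task_x, task_width):
    """Return every column origin whose block pattern matches the one
    the task currently occupies (the current origin included)."""
    signature = column_types[task_x:task_x + task_width]
    return [
        x for x in range(len(column_types) - task_width + 1)
        if column_types[x:x + task_width] == signature
    ]


# Hypothetical heterogeneous fabric: mostly logic elements (LE),
# with RAM and DSP columns in between.
hetero = ["LE", "RAM", "LE", "LE", "RAM", "LE", "DSP", "LE"]
hetero_targets = legal_column_origins(hetero, task_x=0, task_width=3)

# Homogeneous fabric: every origin is legal.
homo = ["LE"] * 4
homo_targets = legal_column_origins(homo, task_x=0, task_width=2)
```

In the heterogeneous example the task covering LE, RAM, LE can only move to the next position with the same pattern, while in the homogeneous case any origin that fits is acceptable.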
25
eFPGA: Complex blocks handling
• Heterogeneous block routing is abstracted from logic routing
• Long lines allow a trade-off between placement flexibility and routing complexity
• A two-level routing is performed at runtime:
  • Logic routing (as in the homogeneous case)
  • Heterogeneous block routing through long lines
26
eFPGA: Complex blocks handling
• Delay depends on final placement
  • Only the worst-case delay can be estimated offline
• Flexibility is still limited along the vertical axis
  • Multiple of block height
• The length of long lines and the number of long-line to routing-resource connections should be limited
  • Area overhead
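The vertical restriction above (moves by a multiple of the block height) can be sketched as below. The interpretation is an assumption: a task is taken to stay aligned with the complex blocks it uses, so its row origin may only change in steps of the block height. The function name and the numbers are hypothetical.

```python
def legal_row_origins(fabric_height, task_y, task_height, block_height):
    """Row origins reachable by shifting the task up or down by whole
    multiples of the complex-block height while staying inside the fabric."""
    if block_height <= 0:
        raise ValueError("block_height must be positive")
    first = task_y % block_height  # lowest origin with the same alignment
    last = fabric_height - task_height
    return list(range(first, last + 1, block_height))


# Made-up fabric of 12 rows, complex blocks 4 rows tall, task 3 rows tall.
rows = legal_row_origins(fabric_height=12, task_y=4,
                         task_height=3, block_height=4)
```

Compared with a homogeneous fabric, where any row origin that fits would be legal, only one origin in every `block_height` rows remains available under this assumption.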
27
Conclusion
FlexTiles: a self-adaptive heterogeneous multicore
• eFPGA layer 3D-stacked on the processor layer
• Flexible resource allocation/sharing
• Seamless task migration
  • Virtual Bit-Stream
28
Thanks! Questions?