Fault-Tolerant NoC-based Manycore system: Reconfiguration & Scheduling


1 Fault-Tolerant NoC-based Manycore system: Reconfiguration & Scheduling
ZHANG Jie CURE

2 Outline
Introduction, Reconfiguration, Scheduling, Summary

3 Introduction

4 Introduction: Manycore processor
Also known as a multicore processor or chip multiprocessor (CMP). Integrates a large number of cores on a single die. An architecture for parallel execution. E.g. the TILE64 processor and the Intel 80-core teraflops processor.

5 Network-on-Chip (NoC)
NoC is generally regarded as the most promising on-chip communication architecture. Alternatives compared: a shared bus (bad scalability) and point-to-point links (hardware overhead).

6 Hardware is not perfect.
Hard faults (permanent faults): manufacturing defects; wear-out faults and aging effects. Both cores and the NoC are fault-prone. Redundancy: core-level redundancy, e.g. GeForce 8800 (192-96).

7 Reconfiguration

8 Reconfiguration
Hardware failures: cannot be predicted; cause NoCs to become irregular and to differ from one another. Hardware designers’ concern: mitigate the performance degradation in the presence of faults by using redundancy. Software designers’ concern: how to optimize applications across diverse hardware platforms.

9 Solution
Virtual Topology (VT): a layer between SW and HW. Open question: how to choose the VT efficiently and effectively.
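
One way to picture a virtual topology is as a lookup table from logical mesh coordinates to fault-free physical cores. The sketch below is an illustration only, not the selection method from the slides: it assumes a physical mesh with a single spare column and repairs each row independently by skipping its faulty cores. The function name `build_virtual_topology` and the row-skipping policy are hypothetical.

```python
def build_virtual_topology(rows, phys_cols, faulty):
    """Map a logical rows x (phys_cols - 1) mesh onto fault-free physical cores.

    Assumption (not from the slides): one spare column, and each row is
    repaired independently by skipping its faulty cores left to right.
    Returns a dict {(logical_row, logical_col): (phys_row, phys_col)},
    or None if some row has more faults than spares.
    """
    logical_cols = phys_cols - 1
    mapping = {}
    for r in range(rows):
        # Physical columns in this row that still hold a working core.
        healthy = [c for c in range(phys_cols) if (r, c) not in faulty]
        if len(healthy) < logical_cols:
            return None  # this row cannot be repaired with a single spare
        for lc, pc in enumerate(healthy[:logical_cols]):
            mapping[(r, lc)] = (r, pc)
    return mapping


# A 3 x 4 physical mesh with one faulty core, exposed as a regular 3 x 3 mesh.
vt = build_virtual_topology(rows=3, phys_cols=4, faulty={(1, 1)})
```

Software sees a regular 3 x 3 mesh here, while logical neighbours (1, 0) and (1, 1) are actually two physical hops apart. Stretched links of this kind are what a VT selection policy would presumably try to keep short.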

10 Scheduling

11 Scheduling
Core scheduling & task scheduling. Core scheduling assigns the required number of cores to each job. Task scheduling determines the order in which incoming jobs are executed. Requirement: fast.

12 Traditional Solutions
Scheduling problem: given a sequence of jobs with diverse core requirements, minimize the execution time. Solution: contiguous scheduling, in which the cores assigned to a job are physically adjacent. Aim: keep as many cores active as possible. Problems: ignores the topology asymmetry of the NoC; ignores the performance asymmetry of the cores.
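
Contiguous scheduling can be sketched as a first-fit search for a free rectangular block of cores on a 2D mesh. This is a minimal illustration under assumptions, not the specific algorithm the slides refer to: the rectangular job shape, the row-major scan order, and the names `find_contiguous_block` and `allocate` are all hypothetical.

```python
def find_contiguous_block(free, height, width):
    """Return the top-left (row, col) of the first free height x width submesh.

    `free` is a 2D list of booleans (True = core is idle and usable).
    Scans in row-major order; returns None if no such block exists.
    """
    rows, cols = len(free), len(free[0])
    for r in range(rows - height + 1):
        for c in range(cols - width + 1):
            if all(free[r + dr][c + dc]
                   for dr in range(height) for dc in range(width)):
                return (r, c)
    return None


def allocate(free, height, width):
    """Mark the first fitting block as busy and return its core coordinates."""
    origin = find_contiguous_block(free, height, width)
    if origin is None:
        return None
    r, c = origin
    cores = [(r + dr, c + dc) for dr in range(height) for dc in range(width)]
    for rr, cc in cores:
        free[rr][cc] = False
    return cores


mesh = [[True] * 4 for _ in range(4)]
mesh[0][1] = False  # e.g. a faulty or already busy core
job_cores = allocate(mesh, 2, 2)
```

Note that the search treats every free core as identical and every adjacent pair as equally well connected, which is exactly the topology and performance asymmetry that this slide lists as ignored.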

13 Proposed Solution
Requirement: fast. NoC asymmetry: use the Virtual Topology to provide the best platform to the OS. Core asymmetry: the basic idea is to set a weight for each core, with a high weight for a high-performance core and a low weight for a low-performance core.
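
The weighting idea can be illustrated with a small selection routine that prefers high-weight cores. This is a sketch under assumptions rather than the scheduler proposed in the slides: the weight values, the convention that a faulty core has weight 0, and the name `pick_cores` are hypothetical.

```python
def pick_cores(weights, num_cores):
    """Choose the `num_cores` highest-weight cores.

    `weights` maps a core id to a non-negative weight; higher means faster.
    Assumption (not from the slides): faulty cores get weight 0 and are
    never selected, and ties are broken by core id for determinism.
    """
    usable = [(w, core) for core, w in weights.items() if w > 0]
    if len(usable) < num_cores:
        return None  # not enough working cores for this job
    # Sort by descending weight, then ascending core id.
    usable.sort(key=lambda wc: (-wc[0], wc[1]))
    return [core for _, core in usable[:num_cores]]


weights = {0: 1.0, 1: 0.6, 2: 0.0, 3: 0.9, 4: 0.6}
chosen = pick_cores(weights, 3)
```

A single sort keeps the decision cheap, in keeping with the stated requirement that scheduling be fast. A fuller version would also have to weigh core quality against placement on the NoC.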

14 Summary

15 Summary
Reliability and performance should be considered together. HW should provide both a reliable and a high-performance platform to the OS, while SW should use it properly to achieve full performance.

