Published by Clement Hawkins. Modified over 6 years ago.
1
Fault-Tolerant NoC-based Manycore system: Reconfiguration & Scheduling
ZHANG Jie CURE
2
Outline: Introduction, Reconfiguration, Scheduling, Summary
3
Introduction
4
Introduction: Manycore processor
Also known as a multicore processor or chip multiprocessor (CMP).
Integrates a large number of cores on a single die.
An architecture for parallel execution.
E.g. the TILE64 processor and the Intel 80-core teraflop processor.
5
Network-on-Chip (NoC)
NoC is generally regarded as the most promising on-chip communication architecture.
Shared bus: bad scalability.
Point-to-point: hardware overhead.
NoC
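As a rough illustration of the hardware-overhead remark above, the sketch below counts links for a fully connected point-to-point design and for a 2D mesh NoC. The 2D mesh is an assumption (the slides do not name a topology), and the function names are purely illustrative.

```python
def point_to_point_links(n_cores: int) -> int:
    """One dedicated link between every pair of cores: n(n-1)/2."""
    return n_cores * (n_cores - 1) // 2


def mesh_links(rows: int, cols: int) -> int:
    """Links between neighbouring routers in a rows x cols 2D mesh (assumed topology)."""
    horizontal = rows * (cols - 1)
    vertical = cols * (rows - 1)
    return horizontal + vertical


if __name__ == "__main__":
    # Print core count, point-to-point links, and mesh links for square layouts.
    for side in (2, 4, 8):
        n = side * side
        print(n, point_to_point_links(n), mesh_links(side, side))
```

Under these assumptions, the point-to-point link count grows quadratically with the number of cores, while the mesh link count grows roughly linearly.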
6
Hardware is not perfect.
Hard faults (permanent faults):
Manufacturing defects
Wear-out faults, aging effects
Both cores and NoC are fault-prone.
Redundancy: core-level redundancy, e.g. GeForce 8800 (192-96)
7
Reconfiguration
8
Reconfiguration
Hardware failures:
cannot be predicted;
cause NoCs to become irregular and to differ from one another.
Hardware designers’ concern: mitigate the performance degradation in the presence of faults by using redundancy.
Software designers’ concern: how to optimize applications across diverse hardware platforms.
9
Solution
Virtual Topology (VT): a layer between SW and HW.
Key question: how to choose the VT efficiently and effectively.
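To make the idea of a layer between SW and HW concrete, here is a minimal sketch of one possible mapping. It is not the selection method from the presentation. It assumes a physical mesh with spare columns and maps each virtual core in a row to the next healthy physical core in that same row. The name `build_virtual_topology` and its parameters are hypothetical.

```python
from typing import Dict, Optional, Set, Tuple

Coord = Tuple[int, int]  # (row, column)


def build_virtual_topology(
    rows: int, phys_cols: int, virt_cols: int, faulty: Set[Coord]
) -> Optional[Dict[Coord, Coord]]:
    """Map virtual (row, col) to physical (row, col), skipping faulty cores.

    Assumed scheme: within each row, healthy cores are used left to right,
    so spare columns absorb the faults. Returns None if some row has too
    few healthy cores for the requested virtual width.
    """
    mapping: Dict[Coord, Coord] = {}
    for r in range(rows):
        healthy = [c for c in range(phys_cols) if (r, c) not in faulty]
        if len(healthy) < virt_cols:
            return None
        for v, c in enumerate(healthy[:virt_cols]):
            mapping[(r, v)] = (r, c)
    return mapping
```

With such a mapping, software sees a regular `rows` x `virt_cols` grid regardless of where the faults are. Choosing a mapping that also preserves communication performance is the harder question the slide raises.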
10
Scheduling
11
Scheduling: core scheduling & task scheduling
Core scheduling assigns the required number of cores to each job.
Task scheduling determines the order in which incoming jobs are executed.
Requirement: fast.
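The two roles can be sketched together in a few lines. This is a minimal sketch that assumes a first-come-first-served order for task scheduling and a simple "take the next free cores" rule for core scheduling. Neither policy is stated in the slides, and `dispatch` is a hypothetical name.

```python
from collections import deque
from typing import Deque, Dict, List, Tuple


def dispatch(
    queue: Deque[Tuple[str, int]], free_cores: List[int]
) -> Dict[str, List[int]]:
    """Start queued jobs in arrival order while the head job still fits.

    queue holds (job_name, cores_required); free_cores holds idle core ids.
    Both arguments are updated in place. Returns job_name -> assigned cores.
    """
    started: Dict[str, List[int]] = {}
    # Task scheduling (assumed FCFS): only the job at the head may start.
    while queue and queue[0][1] <= len(free_cores):
        job, need = queue.popleft()
        # Core scheduling (assumed): hand out the first `need` idle cores.
        started[job] = [free_cores.pop(0) for _ in range(need)]
    return started
```

For example, with four free cores and a queue of jobs needing 2, 3, and 1 cores, only the first job starts: the second does not fit, and strict FCFS keeps the third waiting behind it.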
12
Traditional Solutions
Scheduling problem: given a sequence of jobs with diverse core requirements, minimize the execution time.
Solution: contiguous scheduling, in which the cores assigned to a job are physically adjacent.
Aim: keep as many cores active as possible.
Problems:
Ignores the topology asymmetry of the NoC.
Ignores the performance asymmetry of the cores.
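A minimal sketch of contiguous allocation follows, assuming that "physically adjacent" means a rectangular sub-mesh and that the first fitting block is taken. Both are assumptions, and `find_contiguous_block` is a hypothetical helper.

```python
from typing import List, Optional, Tuple


def find_contiguous_block(
    free: List[List[bool]], height: int, width: int
) -> Optional[Tuple[int, int]]:
    """Return the top-left (row, col) of the first free height x width block.

    free[r][c] is True when the core at (r, c) is idle. The mesh is scanned
    row by row; None is returned when no such block exists.
    """
    rows, cols = len(free), len(free[0])
    for r in range(rows - height + 1):
        for c in range(cols - width + 1):
            if all(
                free[r + dr][c + dc]
                for dr in range(height)
                for dc in range(width)
            ):
                return (r, c)
    return None
```

Note that this search treats every idle core as equivalent, which is exactly the limitation listed above: it accounts for neither NoC topology asymmetry nor core performance asymmetry.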
13
Proposed Solution
Requirement: fast.
NoC asymmetry: the Virtual Topology provides the best platform to the OS.
Core asymmetry: the basic idea is to set a weight for each core.
High weight for a high-performance core.
Low weight for a low-performance core.
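The weighting idea can be illustrated with a short sketch. It assumes the scheduler simply prefers the highest-weight idle cores, breaking ties by core id. The slides do not say how weights are computed or combined with the virtual topology, so `pick_cores` and its inputs are hypothetical.

```python
from typing import Dict, List, Optional, Set


def pick_cores(
    weights: Dict[int, float], free: Set[int], n: int
) -> Optional[List[int]]:
    """Choose the n idle cores with the highest weights.

    weights maps core id -> weight (higher means higher performance).
    Ties are broken by the lower core id. Returns None if fewer than
    n cores are idle.
    """
    if len(free) < n:
        return None
    ranked = sorted(free, key=lambda core: (-weights[core], core))
    return ranked[:n]
```

Sorting the idle cores is inexpensive for typical core counts, which fits the stated requirement that scheduling be fast.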
14
Summary
15
Summary
Reliability and performance should be considered together.
HW should provide the OS with a platform that is both reliable and high-performance, while SW should properly adapt to it to achieve full performance.