Fault-Tolerant NoC-based Manycore system: Reconfiguration & Scheduling

Fault-Tolerant NoC-based Manycore system: Reconfiguration & Scheduling. ZHANG Jie, CURE

Outline: Introduction, Reconfiguration, Scheduling, Summary

Introduction

Introduction. Manycore processor: also known as a multicore processor or chip multiprocessor (CMP). It integrates a number of cores on a single die and provides an architecture for parallel execution. Examples: the TILE64 processor and the Intel 80-core teraflop processor.

Network-on-Chip (NoC). NoC is generally regarded as the most promising on-chip communication architecture. The alternatives each have a drawback: a shared bus scales poorly, and point-to-point links incur hardware overhead.

Hardware is not perfect. Hard (permanent) faults come from manufacturing defects and from wear-out or aging effects. Both the cores and the NoC are fault-prone. Redundancy: core-level redundancy, e.g. GeForce 8800 (192-96).

Reconfiguration

Reconfiguration. Hardware failures cannot be predicted, and they cause NoCs to become irregular and to differ from each other. Hardware designers' concern: mitigate the performance degradation in the presence of faults by using the redundancy. Software designers' concern: how to optimize applications across diverse hardware platforms.

Solution. Virtual topology (VT): a layer between SW and HW. The question is how to choose the VT efficiently and effectively.
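To make the virtual-topology idea more concrete, here is a minimal Python sketch. It is not the method from the slides: it assumes a 2D physical mesh with a known set of faulty cores, maps a smaller virtual mesh onto the healthy cores in naive row-major order, and scores the result by how far apart virtually adjacent cores end up physically. The function names `build_virtual_topology` and `stretch`, and the scoring metric itself, are hypothetical.

```python
from itertools import product


def build_virtual_topology(rows, cols, faulty, v_rows, v_cols):
    """Map a v_rows x v_cols virtual mesh onto the healthy cores of a
    rows x cols physical mesh. Row-major assignment is an assumed,
    deliberately naive policy used only for illustration."""
    healthy = [p for p in product(range(rows), range(cols)) if p not in faulty]
    if len(healthy) < v_rows * v_cols:
        raise ValueError("not enough healthy cores for the virtual mesh")
    virtual = list(product(range(v_rows), range(v_cols)))
    # zip stops at the shorter list, so surplus healthy cores stay unused.
    return dict(zip(virtual, healthy))


def stretch(mapping):
    """Sum of physical Manhattan distances over virtually adjacent pairs.
    A lower value means the virtual mesh is closer to the physical one."""
    total = 0
    for (r, c), (pr, pc) in mapping.items():
        for neighbour in ((r + 1, c), (r, c + 1)):
            if neighbour in mapping:
                qr, qc = mapping[neighbour]
                total += abs(pr - qr) + abs(pc - qc)
    return total


# A 3x3 mesh with one faulty core leaves 8 healthy cores for a 2x4 virtual mesh.
vt = build_virtual_topology(3, 3, faulty={(1, 1)}, v_rows=2, v_cols=4)
```

A real selection procedure would compare several candidate mappings and keep the one with the best score; `stretch` is only one possible cost function.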

Scheduling

Scheduling. Core scheduling and task scheduling: core scheduling assigns the required number of cores to each job, and task scheduling determines the order in which incoming jobs are executed. Requirement: fast.

Traditional Solutions. Scheduling problem: given a sequence of jobs with diverse core requirements, minimize the execution time. Solution: contiguous scheduling, in which the cores assigned to a job are physically adjacent. Aim: keep as many cores active as possible. Problems: it ignores the topology asymmetry of the NoC and the performance asymmetry of the cores.
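Contiguous scheduling can be sketched as a first-fit search for a block of adjacent free cores. The Python below is an illustrative assumption, not the slides' algorithm: it models the chip as a 2D mesh of booleans, treats "physically adjacent" as a rectangular submesh, and uses the hypothetical helpers `find_contiguous_block` and `allocate`.

```python
def find_contiguous_block(free, height, width):
    """Return the top-left corner of the first height x width block of free
    cores, scanning row by row, or None if no such block exists.
    `free` is a list of rows, each a list of booleans."""
    rows, cols = len(free), len(free[0])
    for r in range(rows - height + 1):
        for c in range(cols - width + 1):
            if all(free[r + dr][c + dc]
                   for dr in range(height) for dc in range(width)):
                return (r, c)
    return None


def allocate(free, height, width):
    """Reserve a contiguous block for a job and return its core coordinates,
    or None if the job has to wait."""
    corner = find_contiguous_block(free, height, width)
    if corner is None:
        return None
    r, c = corner
    cores = [(r + dr, c + dc) for dr in range(height) for dc in range(width)]
    for rr, cc in cores:
        free[rr][cc] = False  # mark the core as in use
    return cores


mesh = [[True] * 4 for _ in range(4)]
mesh[0][1] = False  # a faulty (or busy) core
job_a = allocate(mesh, 2, 2)
job_b = allocate(mesh, 2, 2)
```

Note that this search treats every free core as equivalent, which is exactly the limitation listed above: it sees neither NoC topology asymmetry nor core performance asymmetry.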

Proposed Solution. Requirement: fast, and aware of both NoC asymmetry and core asymmetry. Virtual topology: provide the best platform to the OS. Core asymmetry: the basic idea is to set a weight for each core, with a high weight for a high-performance core and a low weight for a low-performance core.
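One simple way to use per-core weights is to prefer the highest-weight cores that are currently available. The sketch below assumes exactly that; the slides do not say how weights are derived or combined, so the weight values, the `busy` set, and the function `pick_cores` are all hypothetical.

```python
import heapq


def pick_cores(weights, busy, k):
    """Pick the k highest-weight cores that are not busy.
    `weights` maps a core id to its weight (higher means faster core).
    Returns None when fewer than k cores are available."""
    candidates = [(w, core) for core, w in weights.items() if core not in busy]
    if len(candidates) < k:
        return None
    best = heapq.nlargest(k, candidates)  # sorted from highest weight down
    return [core for _, core in best]


# Hypothetical weights: c0 is the fastest core, c3 the slowest.
weights = {"c0": 1.0, "c1": 0.6, "c2": 0.9, "c3": 0.3}
chosen = pick_cores(weights, busy={"c2"}, k=2)
```

A selection over k candidates with `heapq.nlargest` stays cheap for small k, which fits the stated requirement that scheduling be fast.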

Summary

Summary. Reliability and performance should be considered together. HW should provide a platform that is both reliable and high-performance to the OS, while SW should properly adapt to it to achieve full performance.