Namyoon Woo and Heon Y. Yeom

Presentation transcript:

K-Depth Look-ahead Task Scheduling in Network of Heterogeneous Processors Namyoon Woo and Heon Y. Yeom School of Computer Science and Engineering Seoul National University, Korea {nywoo, yeom}@dcslab.snu.ac.kr

Introduction (1)
Problem Definition
  Input
    Task precedence graph (directed weighted acyclic graph)
    Processor-network graph
  Objective
    Minimizing the overall task execution time.
    Satisfying the precedence order of tasks.
    Scheduling is done before run time.
  NP-complete problem
List Scheduling Heuristic
  It is known as a cost-effective heuristic.

Introduction (2): List Scheduling
[Figure: list scheduling example in three steps, (1) to (3), placing tasks T0 to T4 on processors P0 to P3 along a time axis]

Related Works (1)
  "Earliest Start Time" (EST): homogeneous processing
  "Heterogeneous Earliest Finish Time" (HEFT) [topcuoglu99HCS]: heterogeneous processing
[Figure: two schedules of tasks Tx, Ti, Ty, Tz on processors P0 to P2]
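The difference between choosing a processor by earliest start time and by earliest finish time can be illustrated with a small sketch. This is not the full HEFT algorithm from [topcuoglu99HCS] (no task ranking, no insertion policy) and it ignores communication costs; the function names, the processor-availability table, and the execution times are hypothetical values chosen for illustration.

```python
from typing import Dict, Tuple


def pick_processor_est(ready_time: float,
                       proc_available: Dict[str, float]) -> Tuple[str, float]:
    """Return (processor, start_time) with the earliest possible start."""
    best_proc, best_start = "", float("inf")
    for proc in sorted(proc_available):
        start = max(ready_time, proc_available[proc])
        if start < best_start:
            best_proc, best_start = proc, start
    return best_proc, best_start


def pick_processor_eft(ready_time: float,
                       proc_available: Dict[str, float],
                       exec_time: Dict[str, float]) -> Tuple[str, float]:
    """Return (processor, finish_time) with the earliest possible finish."""
    best_proc, best_finish = "", float("inf")
    for proc in sorted(proc_available):
        start = max(ready_time, proc_available[proc])
        finish = start + exec_time[proc]  # execution time differs per processor
        if finish < best_finish:
            best_proc, best_finish = proc, finish
    return best_proc, best_finish


# Hypothetical state: when each processor becomes free, and how long
# the task would run on each (heterogeneous) processor.
proc_available = {"P0": 4.0, "P1": 0.0, "P2": 2.0}
exec_time = {"P0": 1.0, "P1": 6.0, "P2": 5.0}

est_choice = pick_processor_est(1.0, proc_available)
eft_choice = pick_processor_eft(1.0, proc_available, exec_time)
```

In this toy case the start-time rule picks P1, where the task would finish at time 7, while the finish-time rule picks P0, where it finishes at time 5. On homogeneous processors the two rules would agree, since the execution time is the same everywhere.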

Related Works (2)
  "Bubble Scheduling and Allocation" (BSA) [kwok2000CC]
[Figure: three successive schedules of tasks Tx, Ti, Ty, Tz on processors P0 to P3, each marking a pivot processor]

Motivation (1)
  Heterogeneous network links
  "Successor's Expected Start Time" (SEST)
[Figure: task T0 with edges e1 to e5, and a partial schedule of Tx, Ti, Ty, Tz on processors P0 to P2]

Motivation (2)
  Clustering k successive tasks.
[Figure: task graph of T0 and T2 to T6 on processors P0 to P2, with the tasks within k-depth clustered into T']

k-Depth Look-ahead Heuristic
  i          ID of task
  x          ID of processor
  k          Predefined depth
  w'(i,x)    Heterogeneous execution time of task i on processor x
  h'x        Average network cost of processor x
  c(i)       Average weight of out-edges from task i
  SUCC(Ti)   The set of task i's successor tasks
  NB(Px)     The set of neighbor processors of processor x
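Three of the quantities in this notation follow directly from the two input graphs and can be sketched in a few lines. The edge and link dictionaries, the function names, and the convention that a task without out-edges has c(i) = 0 are assumptions made for this example, not details from the slides.

```python
from typing import Dict, Set, Tuple

# Hypothetical inputs: weighted task edges and weighted processor links.
task_edges: Dict[Tuple[str, str], float] = {
    ("T0", "T1"): 4.0,
    ("T0", "T2"): 2.0,
    ("T1", "T3"): 3.0,
}
links: Dict[Tuple[str, str], float] = {
    ("P0", "P1"): 1.0,
    ("P1", "P2"): 5.0,
}


def succ(task: str, edges: Dict[Tuple[str, str], float]) -> Set[str]:
    """SUCC(Ti): tasks reached by an out-edge of the given task."""
    return {dst for (src, dst) in edges if src == task}


def avg_out_weight(task: str, edges: Dict[Tuple[str, str], float]) -> float:
    """c(i): average weight of the task's out-edges (0.0 if it has none)."""
    weights = [w for (src, _), w in edges.items() if src == task]
    return sum(weights) / len(weights) if weights else 0.0


def neighbors(proc: str, net: Dict[Tuple[str, str], float]) -> Set[str]:
    """NB(Px): processors sharing an (undirected) link with the given one."""
    result: Set[str] = set()
    for a, b in net:
        if a == proc:
            result.add(b)
        elif b == proc:
            result.add(a)
    return result


succ_t0 = succ("T0", task_edges)           # successors of T0
c_t0 = avg_out_weight("T0", task_edges)    # (4.0 + 2.0) / 2
nb_p1 = neighbors("P1", links)             # processors adjacent to P1
```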

k-DLA Scheduling Heuristic
  List the tasks in the pre-defined order
  while the list is not empty do
    Select the first task Ti and remove it from the list.
    For all Px, calculate est(i,x) + ebl(i,x,k).
    Select the Px which gives the minimum value of the sum.
    Schedule Ti on Px
  end while
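The loop above can be sketched as a small greedy routine. The slides do not define est and ebl, so both are passed in as caller-supplied functions here; the toy est and ebl below (processor free time and plain execution cost) are stand-ins invented for the example and are not the definitions used by the authors. The on_assign callback is likewise a hypothetical hook for updating schedule state.

```python
from typing import Callable, Dict, Optional, Sequence


def k_dla_schedule(
    task_list: Sequence[str],
    processors: Sequence[str],
    est: Callable[[str, str], float],
    ebl: Callable[[str, str, int], float],
    k: int,
    on_assign: Optional[Callable[[str, str], None]] = None,
) -> Dict[str, str]:
    """Assign each task, in list order, to the processor minimizing est + ebl."""
    assignment: Dict[str, str] = {}
    pending = list(task_list)
    while pending:
        task = pending.pop(0)  # select the first task and remove it
        best = min(processors, key=lambda p: est(task, p) + ebl(task, p, k))
        assignment[task] = best
        if on_assign is not None:
            on_assign(task, best)  # let the caller update its schedule state
    return assignment


# Toy stand-ins (assumptions): est is the time the processor becomes free,
# ebl is just the task's execution cost on that processor.
cost = {
    "T0": {"P0": 2.0, "P1": 3.0},
    "T1": {"P0": 2.0, "P1": 3.0},
    "T2": {"P0": 4.0, "P1": 1.0},
}
proc_free = {"P0": 0.0, "P1": 0.0}


def toy_est(task: str, proc: str) -> float:
    return proc_free[proc]


def toy_ebl(task: str, proc: str, k: int) -> float:
    return cost[task][proc]


def record(task: str, proc: str) -> None:
    proc_free[proc] += cost[task][proc]


result = k_dla_schedule(["T0", "T1", "T2"], ["P0", "P1"],
                        toy_est, toy_ebl, k=1, on_assign=record)
```

With these toy costs, T0 goes to P0, and T1 and T2 go to P1. Keeping est and ebl as parameters leaves the loop independent of how the look-ahead term is actually computed.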

Experimental Environment
  Directed acyclic graphs
    Random graph
      # of tasks (t): 50~900
      # of edges: from 2t to 5t
    Real application
      Stencil / LU-Decomposition / Laplace Transform
      # of tasks: over 2000
  Processor network architecture
    16 nodes
    Ring / Mesh / Fully Connected Network
  Variables
    Heterogeneous Factor (HF): 5, 10, 20, 40
    Communication to Computation Ratio (CCR): 0.1, 1, 10.0
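A random task graph with t tasks and between 2t and 5t edges could be produced along the following lines. The slides do not say how the random graphs were generated, so this is only one plausible construction: edges always point from a lower task index to a higher one, which guarantees that the graph is acyclic. The function name and the seed are hypothetical.

```python
import random
from typing import List, Tuple


def random_dag(num_tasks: int, rng: random.Random) -> List[Tuple[int, int]]:
    """Return a sorted edge list (src, dst) with src < dst, so no cycles."""
    max_edges = num_tasks * (num_tasks - 1) // 2
    # Edge count drawn from [2t, 5t], capped by what the graph can hold.
    num_edges = min(rng.randint(2 * num_tasks, 5 * num_tasks), max_edges)
    edges = set()
    while len(edges) < num_edges:
        a, b = rng.sample(range(num_tasks), 2)  # two distinct task indices
        edges.add((min(a, b), max(a, b)))       # orient low -> high
    return sorted(edges)


rng = random.Random(0)  # fixed seed only to make the example repeatable
dag_edges = random_dag(50, rng)
```

For t = 50 this yields between 100 and 250 distinct edges. Edge and task weights, which the experiments vary through HF and CCR, are left out of the sketch.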

Metrics for the Performance Comparison
  Metrics
    Normalized Schedule Length (NSL)
      Schedule length / sum of the weights of tasks on the critical path
      NSL shows how close the scheduling result is to the optimum.
    Running Time
      The cost of the scheduling heuristic itself
    Used Processor
      The tendency or locality of task-processor mapping
  Heuristics
    BSA, HEFT, k-DLA (k = 1, 5, infinite)
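The NSL definition can be worked through on a small graph. The sketch below assumes one weight per task and takes the critical path to be the path with the largest total task weight, ignoring edge weights; how the task weight is chosen on heterogeneous processors is not stated on the slide. The graph, the weights, and the schedule length of 13.5 are made-up example values.

```python
from typing import Dict, List


def critical_path_weight(weights: Dict[str, float],
                         succ: Dict[str, List[str]]) -> float:
    """Largest sum of task weights along any path of the DAG."""
    memo: Dict[str, float] = {}

    def longest_from(task: str) -> float:
        # Heaviest path starting at `task` (recursion is fine for small graphs).
        if task not in memo:
            tail = max((longest_from(s) for s in succ.get(task, [])),
                       default=0.0)
            memo[task] = weights[task] + tail
        return memo[task]

    return max(longest_from(task) for task in weights)


def nsl(schedule_length: float,
        weights: Dict[str, float],
        succ: Dict[str, List[str]]) -> float:
    """Normalized schedule length: schedule length / critical-path weight."""
    return schedule_length / critical_path_weight(weights, succ)


# Diamond-shaped example graph: T0 -> {T1, T2} -> T3.
weights = {"T0": 2.0, "T1": 3.0, "T2": 1.0, "T3": 4.0}
succ = {"T0": ["T1", "T2"], "T1": ["T3"], "T2": ["T3"]}

cp = critical_path_weight(weights, succ)   # path T0, T1, T3
value = nsl(13.5, weights, succ)
```

Here the critical path T0, T1, T3 has weight 9, so a schedule of length 13.5 has an NSL of 1.5. Under these assumptions the NSL cannot drop below 1, because no schedule can finish before the tasks on the critical path have run in sequence.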

Results (1): Number of Tasks (CCR=1.0, HF=20)
[Figure: panels Ring, Mesh, Clique; series BSA, HEFT, 1-DLA, 5-DLA, ∞-DLA]

Results (2): CCR (n=500, HF=20)
[Figure: panels Ring, Mesh, Clique; series BSA, HEFT, 1-DLA, 5-DLA, ∞-DLA]

Results (3): Scheduling Time (CCR=1.0, HF=20)
[Figure: panels Ring, Mesh, Clique; series BSA, HEFT, 1-DLA, 5-DLA, ∞-DLA]

Results (4): # of scheduled processors
[Figure: panels Ring, Mesh, Clique; series BSA, HEFT, 1-DLA, 5-DLA, ∞-DLA]

Results (5): Conventional graph (CCR=1.0, HF=20)
[Figure: panels LU 64, Stencil, Laplace]

Analysis and Conclusions
  Parameter               Low       High
  CCR                     1-DLA     ∞-DLA (except in clique)
  HF
  Network Connectivity    ∞-DLA     1-DLA or HEFT
The DLA heuristic with a large k is suitable for a heterogeneous computing system where the network resource is expensive.
We can adjust the value of k according to the characteristics of a given computing system.