Static Process Scheduling (CSc 8320, Chapter 5.2) Yunmei Lu, 2011-10-03

Outline
- Definition and Goal
- Models
  - Precedence process model
  - Communication system model
- Future work
- References

What is Static Process Scheduling (SPS)?
Scheduling a set of partially ordered tasks on a non-preemptive multiprocessor system of identical processors so as to minimize the overall finishing time (makespan) [1].

Implications?
- The mapping of processes to processors is determined before execution.
- Process behavior (execution time, precedence relationships, and communication patterns) must be known before execution.
- Scheduling is non-preemptive: once started, a process stays on its processor until it completes.

Goal?
- Minimize the overall finish time (makespan) on a non-preemptive multiprocessor system of identical processors.
- Find a scheduling algorithm that best balances and overlaps computation and communication.
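The makespan objective can be made concrete with a small sketch (the task data below is hypothetical, not from the slides):

```python
# schedule: task -> (processor, start_time, execution_time)
# Hypothetical non-preemptive schedule on two processors.
schedule = {
    "A": ("p1", 0, 6),
    "B": ("p1", 6, 5),
    "C": ("p2", 0, 4),
    "D": ("p2", 4, 6),
}

def makespan(schedule):
    """Overall finishing time: the latest (start + execution time) over all tasks."""
    return max(start + exec_time for (_, start, exec_time) in schedule.values())

print(makespan(schedule))  # 11 (task B finishes last, at 6 + 5)
```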

Other Characteristics?
- Optimizing the makespan is NP-complete, so approximate or heuristic algorithms are needed.
- In the classical formulation, inter-processor communication is considered negligible; in a distributed system it is non-negligible.

Models?
- Precedence Process Model (PPM)
- Communication Process Model (CPM)

Precedence Process Model (PPM)
- A program is represented by a directed acyclic graph (DAG) (Figure a on the following slide), in which the precedence constraints among tasks are explicitly specified.
- The system is characterized by a communication system model giving the unit communication delays between processors (Figure b on the following slide).
- The communication cost between two tasks = the unit communication cost in the communication system graph multiplied by the message units on the DAG edge.

Example of DAG
In Figure a, each node denotes a task with a known execution time. An edge represents a precedence relationship between two tasks; the arrow indicates the order of execution, and the label gives the message units to be transferred. [Chow and Johnson 1997]

Precedence process and communication system models
Figure b is an example of a communication system model with three processors (p1, p2, p3). The unit communication costs are non-negligible for inter-processor communication and negligible (zero weight on the internal edge) for intra-processor communication. For example, the communication cost between A (on p1) and E (on p3) is 4 * 2 = 8. [Chow and Johnson 1997]
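The cost rule above can be sketched in a few lines. The unit costs and placement below are chosen to reproduce the slide's A-to-E example; they are illustrative, not taken from the book's full figure:

```python
# Unit communication costs between processors (symmetric; intra-processor is 0).
unit_cost = {("p1", "p2"): 1, ("p1", "p3"): 2, ("p2", "p3"): 3}

# DAG edges labeled with message units, e.g. A sends 4 message units to E.
message_units = {("A", "E"): 4}

# Task-to-processor assignment.
placement = {"A": "p1", "E": "p3"}

def comm_cost(t1, t2):
    """Cost = unit cost between the hosting processors * message units on the edge."""
    proc1, proc2 = placement[t1], placement[t2]
    if proc1 == proc2:                      # intra-processor: negligible
        return 0
    units = unit_cost.get((proc1, proc2), unit_cost.get((proc2, proc1)))
    return units * message_units[(t1, t2)]

print(comm_cost("A", "E"))  # 2 * 4 = 8, matching the slide's example
```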

Precedence Process Model Algorithms:
- List Scheduling (LS): a simple greedy heuristic in which no processor remains idle if some task is available that it could process; communication is not considered.
- Extended List Scheduling (ELS): the actual scheduling result of LS once communication is taken into account.
- Earliest Task First scheduling (ETF): the earliest schedulable task (with communication delay considered) is scheduled first.
[Chow and Johnson 1997]
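A minimal list-scheduling sketch, under LS's assumption that communication is free. The task graph, durations, and priority list below are hypothetical; each task goes to the earliest-free processor, no earlier than the finish of all its predecessors:

```python
duration = {"A": 2, "B": 3, "C": 4, "D": 2}
preds = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"]}
priority_list = ["A", "B", "C", "D"]      # topologically consistent order
processors = ["p1", "p2"]

def list_schedule(priority_list, duration, preds, processors):
    finish = {}                            # task -> finish time
    free_at = {p: 0 for p in processors}   # processor -> time it becomes free
    schedule = {}
    for task in priority_list:
        ready = max((finish[t] for t in preds[task]), default=0)
        proc = min(free_at, key=free_at.get)          # earliest-free processor
        start = max(free_at[proc], ready)
        finish[task] = start + duration[task]
        free_at[proc] = finish[task]
        schedule[task] = (proc, start)
    return schedule, max(finish.values())             # schedule and makespan

sched, ms = list_schedule(priority_list, duration, preds, processors)
print(ms)  # 8: A on p1 [0,2), B on p2 [2,5), C on p1 [2,6), D on p2 [6,8)
```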

Algorithms
- The critical path is the longest execution path in the DAG.
- Dashed lines represent waiting for communication.
[Chow and Johnson 1997]

Communication Process Model (CPM)
- The system is modeled by an undirected graph G: nodes represent processes, and the weight on an edge is the amount of communication between the two connected processes.
- There are no precedence constraints among processes.
- Processors are not identical (they differ in speed and hardware).
- Scheduling goal: maximize resource utilization and minimize inter-process communication.
[Chow and Johnson 1997]

Communication Process Model
The problem is to find an optimal assignment of m processes to P processors with respect to the objective function (the Module Allocation problem):
- P: a set of processors
- e_j(P_i): computation cost of executing process p_j on processor P_i
- c_{i,j}(p_i, p_j): communication overhead between processes p_i and p_j
A uniform communication speed between processors is assumed.
[Chow and Johnson 1997]
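The module-allocation objective can be sketched directly: total cost = execution cost of each process on its assigned processor + communication overhead for each process pair split across processors. All cost values below are hypothetical:

```python
exec_cost = {            # e_j(P_i): cost of running process j on processor i
    ("j1", "A"): 5, ("j1", "B"): 10,
    ("j2", "A"): 2, ("j2", "B"): 3,
}
comm = {("j1", "j2"): 4}  # c(j1, j2): inter-process communication overhead

def total_cost(assignment):
    """Objective of the Module Allocation problem for one assignment."""
    cost = sum(exec_cost[(proc, assignment[proc])] for proc in assignment)
    for (a, b), c in comm.items():
        if assignment[a] != assignment[b]:   # only cross-processor pairs pay
            cost += c
    return cost

print(total_cost({"j1": "A", "j2": "B"}))   # 5 + 3 + 4 = 12
print(total_cost({"j1": "A", "j2": "A"}))   # 5 + 2 + 0 = 7
```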

Communication Process Model
Stone's two-processor model achieves the minimum total execution and communication cost.
[Table (a), computation costs, with columns Process / Cost on A / Cost on B; the numeric entries were lost in extraction, with "infinity" marking a process that cannot run on a given processor.]
Figure (a) shows the execution time of each process on either processor; (b) shows the inter-process communication costs. [Chow and Johnson 1997]

How to map processes to processors?
- Partition the graph by drawing a line cutting through some edges, resulting in two disjoint subgraphs, one for each processor.
- The set of removed edges is the cut set.
- The cost of the cut set is the sum of the weights of its edges, which represents the total inter-processor communication cost.
- The cut-set cost is 0 if all processes are assigned to the same node, but such an assignment makes no sense.
- Computation constraints apply (no more than k processes per processor, distribute the load evenly, ...).
[Chow and Johnson 1997]
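For a small process set, the optimal two-processor partition can be found by brute force: enumerate every A/B assignment and keep the one minimizing execution cost plus cut-set cost. (Stone's actual method uses a max-flow/min-cut computation; this exhaustive sketch with hypothetical costs just illustrates the objective being cut.)

```python
from itertools import product

processes = ["j1", "j2", "j3"]
# Hypothetical per-processor execution costs and inter-process communication.
exec_cost = {"j1": {"A": 5, "B": 2}, "j2": {"A": 1, "B": 6}, "j3": {"A": 4, "B": 4}}
comm = {("j1", "j2"): 3, ("j2", "j3"): 1}

def best_assignment():
    best, best_cost = None, float("inf")
    for choice in product("AB", repeat=len(processes)):
        assign = dict(zip(processes, choice))
        cost = sum(exec_cost[p][assign[p]] for p in processes)
        # Cut-set cost: edges whose endpoints land on different processors.
        cost += sum(c for (a, b), c in comm.items() if assign[a] != assign[b])
        if cost < best_cost:
            best, best_cost = assign, cost
    return best, best_cost

assign, cost = best_assignment()
print(assign, cost)  # minimum total cost is 10
```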

How to map processes to processors?
The weight assigned to the edge between node A and process i is the cost of executing process i on B, so a minimum-cost cut yields the optimal assignment. [Chow and Johnson 1997]

Extension of Stone's two-processor model
- To generalize beyond two processors, Stone uses a repetitive approach based on the two-processor algorithm to solve n-processor problems.
- Treat (n-1) processors as one super-processor.
- The processors inside the super-processor are then broken down further, based on the results of the previous step.
[Chow and Johnson 1997]

Problems?
- Too complex.
- The objectives of minimizing computation cost and minimizing communication cost often conflict.
- Therefore, heuristic solutions are used instead.

Some heuristic solutions
- Separate the optimization of computation and communication into two independent phases.
- Merge processes with high inter-process interaction into clusters of processes.
- Processes in each cluster are then assigned to the processor that minimizes the computation cost.
[Chow and Johnson 1997]

Problem and Solution
Merging processes eliminates inter-processor communication but may impose a higher computation burden on a single processor and thus reduce concurrency.
Solution:
- Merge only processes whose communication cost is higher than a certain threshold C.
- Constrain the number of processes in a cluster, e.g. the total execution cost of the processes in a single cluster cannot exceed another threshold X.
[Chow and Johnson 1997]

Cluster of processes
- For C = 9, we get three clusters: (2,4), (1,6), and (3,5).
- Clusters (2,4) and (1,6) must be mapped to processors A and B.
- Cluster (3,5) can be assigned to either A or B, depending on whether the goal is to minimize computation cost or communication cost; assigning (3,5) to A has a lower communication cost but a higher computation cost.
- If we assign (3,5) to A, the total cost = 41 (computation cost = 17 on A and 14 on B; communication cost = 6 + 4 = 10).
[Chow and Johnson 1997]
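The threshold-clustering step can be sketched with a union-find: any pair of processes whose communication cost exceeds C is merged into one cluster. The edge weights below are hypothetical, chosen so that C = 9 reproduces the slide's clusters (2,4), (1,6), and (3,5):

```python
# Hypothetical inter-process communication costs between processes 1..6.
comm = {(1, 6): 12, (2, 4): 10, (3, 5): 11, (1, 2): 5, (4, 5): 3}
C = 9

def clusters(comm, C, n=6):
    parent = list(range(n + 1))          # union-find over processes 1..n

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]    # path halving
            x = parent[x]
        return x

    for (a, b), cost in comm.items():
        if cost > C:                     # merge heavily-communicating pairs
            parent[find(a)] = find(b)

    groups = {}
    for p in range(1, n + 1):
        groups.setdefault(find(p), set()).add(p)
    return sorted(map(frozenset, groups.values()), key=min)

print(clusters(comm, C))  # clusters {1, 6}, {2, 4}, {3, 5}
```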

Summary of static process scheduling
- Non-preemptive: once a process is assigned to a processor, it remains there until its execution has completed.
- Requires prior knowledge of execution times and communication behavior.
- The scheduling decision is centralized and non-adaptive, which is neither effective nor realistic.
- Finding the optimal solution is NP-hard, so heuristic algorithms are always used in practice.

Future work
With advances in processor and networking hardware, parallel processing can be accomplished on a wide spectrum of platforms. This diversity of platforms makes the scheduling problem even more complex and challenging. Scheduling algorithms for efficient parallel processing should consider the following aspects:

Cont...
- Performance: the scheduling algorithm should produce high-quality solutions.
- Time complexity: an important factor, since a slow algorithm compromises solution quality in practice; a fast algorithm is necessary for finding good solutions efficiently.
- Scalability: the algorithm must consistently give good performance even for large inputs; given more processors for a problem, it should produce solutions of comparable quality in a shorter time.
- Applicability: the algorithm must be applicable in practical environments, so it should take into account realistic assumptions about the program and multiprocessor models, such as arbitrary computation and communication weights.

Cont...
The above goals conflict with one another and thus pose a number of challenges to researchers. Several new ideas for combating these challenges:
- Genetic algorithms
- Randomization approaches
- Parallelization techniques
- Extending DAG scheduling to heterogeneous computing platforms

References
1. Randy Chow, Theodore Johnson. "Distributed Operating Systems & Algorithms". Addison Wesley.
2. Yu-Kwong Kwok, Ishfaq Ahmad. "Static scheduling algorithms for allocating directed task graphs to multiprocessors". ACM Computing Surveys, December 1999.
3. Sachi Gupta, Gaurav Agarwal, Vikas Kumar. "Task Scheduling in Multiprocessor System Using Genetic Algorithm". ICMLC.
4. Hongze Qiu, Wanli Zhou, Hailong Wang. "A Genetic Algorithm-based Approach to Flexible Job-shop Scheduling Problem". ICNC.
5. Xueyan Tang, Samuel T. Chanson. "Optimizing Static Job Scheduling in a Network of Heterogeneous Computers". ICPP, IEEE.

Thank you!