A SYSTEM PERFORMANCE MODEL
CSC 8320 Advanced Operating Systems
Georgia State University
Yuan Long
OUTLINE
- Overview
- Basic Theory
- Process Interaction Models
- Process Models
- System Performance Model
- Efficiency Loss
- Workload Distribution
- Processor-Pool and Workstation Queuing Models
- Comparison of Performance for Workload Sharing
- Recent Work
- Future Work
OVERVIEW
Why scheduling?
- Communication and synchronization facilities are essential system components for supporting concurrent execution of interacting processes.
- Before execution, processes must be scheduled and allocated resources.
OVERVIEW (CONT.)
What is the goal of scheduling?
- Enhance overall system performance metrics:
  - process completion time
  - processor utilization
- Achieve location and performance transparency in distributed systems.
OVERVIEW (CONT.)
Issues?
- The communication overhead cannot be ignored.
- The effect of the underlying architecture cannot be ignored.
- The dynamic behavior of the system must be addressed.
PROCESS INTERACTION MODEL
Example: four processes mapped to a two-processor multiple computer system.
- Precedence process model (directed acyclic graph, DAG)
- Communication process model
- Disjoint process model
PROCESS MODELS
Precedence process model
- Represents precedence relationships between processes.
- Goal: minimize the total completion time of the task (computation + communication).
[Figure: processes P1, P2, P3, P4 with communication overhead]
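The precedence model can be illustrated with a minimal sketch that computes the total completion time of a small DAG for a given placement. The function name `completion_time`, the execution times, the edge costs, and the rule that communication overhead is paid only between processes on different processors are all assumptions made for illustration; they are not taken from the slides.

```python
from collections import defaultdict

def completion_time(exec_time, edges, placement):
    """Total completion time of a precedence graph under a fixed placement.

    exec_time: {process: execution time}, keys listed in topological order (assumed).
    edges:     {(pred, succ): communication overhead}, paid only across processors.
    placement: {process: processor id}.
    """
    preds = defaultdict(list)
    for (u, v), cost in edges.items():
        preds[v].append((u, cost))

    finish = {}                    # finish time of each process
    proc_free = defaultdict(int)   # time at which each processor becomes free

    for p in exec_time:
        ready = 0
        for u, cost in preds[p]:
            delay = cost if placement[u] != placement[p] else 0
            ready = max(ready, finish[u] + delay)
        start = max(ready, proc_free[placement[p]])
        finish[p] = start + exec_time[p]
        proc_free[placement[p]] = finish[p]

    return max(finish.values())

# Hypothetical four-process example on at most two processors.
exec_time = {"P1": 2, "P2": 3, "P3": 2, "P4": 1}
edges = {("P1", "P2"): 1, ("P1", "P3"): 1, ("P2", "P4"): 1, ("P3", "P4"): 1}

single = {"P1": 0, "P2": 0, "P3": 0, "P4": 0}
split = {"P1": 0, "P2": 0, "P3": 1, "P4": 0}

print(completion_time(exec_time, edges, single))  # 8
print(completion_time(exec_time, edges, split))   # 7
```

In this example, running P3 on the second processor shortens the schedule even though two edges now pay communication overhead, which is the trade-off the precedence model is meant to expose.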
PROCESS MODELS
Communication process model
- Represents the need for communication between processes.
- Goal: optimize the total cost of communication and computation.
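As a rough sketch of the communication process model, the total cost of an assignment can be taken as the sum of per-processor execution costs plus the communication cost of every pair of processes placed on different processors. The cost tables, the function names, and the exhaustive search below are hypothetical and only practical for very small inputs.

```python
from itertools import product

def total_cost(assignment, exec_cost, comm_cost):
    """Computation cost plus communication cost across processors."""
    compute = sum(exec_cost[p][assignment[p]] for p in assignment)
    comm = sum(c for (a, b), c in comm_cost.items()
               if assignment[a] != assignment[b])
    return compute + comm

def best_assignment(exec_cost, comm_cost, n_procs=2):
    """Try every mapping of processes to processors and keep the cheapest."""
    procs = list(exec_cost)
    best = None
    for combo in product(range(n_procs), repeat=len(procs)):
        assignment = dict(zip(procs, combo))
        cost = total_cost(assignment, exec_cost, comm_cost)
        if best is None or cost < best[0]:
            best = (cost, assignment)
    return best

# Hypothetical data: exec_cost[p][k] is the cost of running p on processor k.
exec_cost = {"P1": [1, 7], "P2": [6, 1], "P3": [2, 3], "P4": [4, 2]}
comm_cost = {("P1", "P2"): 3, ("P2", "P3"): 4, ("P3", "P4"): 1, ("P1", "P4"): 2}

cost, assignment = best_assignment(exec_cost, comm_cost)
print(cost, assignment)  # 12 {'P1': 0, 'P2': 1, 'P3': 1, 'P4': 1}
```

Note that P3 runs slightly cheaper on processor 0 in isolation, but the search places it with P2 to avoid their communication cost.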
PROCESS MODELS
Disjoint process model
- Processes run independently and complete in finite time.
- Goal: maximize utilization of processors and minimize turnaround time of processes.
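SYSTEM PERFORMANCE MODEL
Speedup $S$ is a function of three factors, $S = F(\text{Algorithm}, \text{System}, \text{Schedule})$:
- the algorithm design
- the underlying system architecture
- the efficiency of the scheduling algorithm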
SYSTEM PERFORMANCE MODEL
$S$ can also be written as
$$S = \frac{OSPT}{CPT} = \frac{OSPT}{OCPT_{ideal}} \times \frac{OCPT_{ideal}}{CPT} = S_i \times S_d$$
- OSPT (optimal sequential processing time): the best time that can be achieved on a single processor using the best sequential algorithm.
- CPT (concurrent processing time): the actual time achieved on an n-processor system with the concurrent algorithm and the specific scheduling method being considered.
- $OCPT_{ideal}$ (optimal concurrent processing time on an ideal system): the best time that can be achieved with the concurrent algorithm being considered on an ideal n-processor system (no inter-processor communication overhead), scheduled by an optimal scheduling policy.
- $S_i$: the ideal speedup obtained by using a multiple processor system over the best sequential time.
- $S_d$: the degradation of the system due to the actual implementation compared to an ideal system.
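SYSTEM PERFORMANCE MODEL
$S_i$ can be rewritten as
$$S_i = \frac{RC}{RP} \times n, \qquad RP = \frac{\sum_{i=1}^{m} P_i}{OSPT}, \qquad RC = \frac{\sum_{i=1}^{m} P_i}{OCPT_{ideal} \times n}$$
- $n$: number of processors.
- $m$: number of tasks in the algorithm.
- $\sum_{i=1}^{m} P_i$: total computation of the concurrent algorithm.
- RP: relative processing requirement ($RP \ge 1$).
- RC: relative concurrency; $RC = 1$ means the best use of the processors.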
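To make RP and RC concrete, here is a minimal sketch, assuming the definitions in Chow and Johnson [6]: RP is the total computation of the concurrent algorithm divided by OSPT, and RC is that same total divided by $OCPT_{ideal} \times n$. The function name `ideal_speedup` and all numbers are hypothetical.

```python
def ideal_speedup(task_times, ospt, ocpt_ideal, n):
    """Return (RP, RC, S_i) for a concurrent algorithm on n processors.

    task_times: computation time P_i of each of the m tasks.
    ospt:       optimal sequential processing time.
    ocpt_ideal: optimal concurrent processing time on an ideal system.
    """
    total = sum(task_times)        # total computation of the concurrent algorithm
    rp = total / ospt              # relative processing requirement
    rc = total / (ocpt_ideal * n)  # relative concurrency
    s_i = rc / rp * n              # ideal speedup
    return rp, rc, s_i

# Hypothetical numbers: four tasks, two processors.
rp, rc, s_i = ideal_speedup([4, 4, 3, 3], ospt=12, ocpt_ideal=8, n=2)
print(rp, rc, s_i)
```

With these inputs the concurrent algorithm performs 14 units of work against 12 for the best sequential one, so RP is above 1, and the processors are busy for 14 of the 16 available processor-time units, so RC is below 1. The resulting $S_i$ agrees with $OSPT / OCPT_{ideal}$.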
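SYSTEM PERFORMANCE MODEL
$S_d$ can be rewritten as
$$S_d = \frac{1}{1 + \rho}, \qquad \rho = \frac{CPT - OCPT_{ideal}}{OCPT_{ideal}}$$
- $\rho$ is the efficiency loss: the ratio of the real system overhead due to all causes to the ideal optimal processing time.
- It has two parts: $\rho = \rho_{sched} + \rho_{syst}$.
Finally we get
$$S = \frac{RC}{RP} \times \frac{1}{1 + \rho} \times n$$
(the bigger the better).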
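A small numeric check of these relations, under the assumption that the efficiency loss is defined as $(CPT - OCPT_{ideal}) / OCPT_{ideal}$, as the description above suggests. The function name `speedup` and the timing values are hypothetical.

```python
def speedup(ospt, ocpt_ideal, cpt):
    """Return (rho, S_d, S) from the three processing times."""
    rho = (cpt - ocpt_ideal) / ocpt_ideal  # efficiency loss
    s_d = 1 / (1 + rho)                    # degradation factor, equals ocpt_ideal / cpt
    s_i = ospt / ocpt_ideal                # ideal speedup
    return rho, s_d, s_i * s_d

# Hypothetical times: sequential 12, ideal concurrent 8, actual concurrent 10.
rho, s_d, s = speedup(ospt=12, ocpt_ideal=8, cpt=10)
print(rho, s_d, s)
```

Here the actual system takes 25% longer than the ideal one, so the ideal speedup of 1.5 degrades to an overall speedup equal to OSPT / CPT.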
EFFICIENCY LOSS
How can the interdependence between scheduling and system factors be illustrated?
Notation:
                              Real system    Ideal system
  Multiple computer system    X'             X
  Scheduling policy           Y'             Y
The efficiency loss $\rho$ can be expressed as
[Equations: expressions for $\rho$, labeled "Ideal system" and "Non-ideal system"]
EFFICIENCY LOSS
The following figure demonstrates the decomposition of efficiency loss due to scheduling and system communication.
[Figure: decomposition of efficiency loss into scheduling and system communication components]
The impact of communication on system performance is significant and must be carefully addressed in the design of distributed scheduling algorithms.
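One way to picture the decomposition is to insert an intermediate measurement between the real run and the ideal optimum. The sketch below assumes such an intermediate value exists: the completion time of the actual scheduling policy on an ideal system with no communication overhead. That quantity, the function name `efficiency_loss`, and the numbers are assumptions for illustration, not notation from the slides.

```python
def efficiency_loss(cpt_real, cpt_ideal_system, ocpt_ideal):
    """Split the efficiency loss into a scheduling part and a system part.

    cpt_real:         time on the real system with the actual scheduling policy.
    cpt_ideal_system: time of the same policy on an ideal system (assumed measurable).
    ocpt_ideal:       time on an ideal system with an optimal scheduling policy.
    """
    rho_sched = (cpt_ideal_system - ocpt_ideal) / ocpt_ideal  # loss due to scheduling
    rho_syst = (cpt_real - cpt_ideal_system) / ocpt_ideal     # loss due to the system
    return rho_sched, rho_syst, rho_sched + rho_syst

# Hypothetical times: ideal optimum 8, actual policy on ideal system 9, real run 10.
rho_sched, rho_syst, rho = efficiency_loss(cpt_real=10, cpt_ideal_system=9, ocpt_ideal=8)
print(rho_sched, rho_syst, rho)  # 0.125 0.125 0.25
```

The two parts always add up to the overall loss because the intermediate term cancels, which is why the split depends on which intermediate system is chosen.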
WORKLOAD DISTRIBUTION
Performance can be further improved by workload distribution.
- Load sharing: static workload distribution
  - Dispatch processes to idle processors statically upon arrival.
  - Corresponds to the processor-pool model.
- Load balancing: dynamic workload distribution
  - Migrate processes dynamically from heavily loaded processors to lightly loaded processors.
  - Corresponds to the migration workstation model.
WORKLOAD DISTRIBUTION
Modeled by queuing theory: X/Y/c
- An arrival process X, a service time distribution Y, and c servers.
- $\lambda$: arrival rate; $\mu$: service rate; $\gamma$: migration rate.
- $\gamma$ depends on the channel bandwidth, the migration protocol, and the context and state information of the process being transferred.
PROCESSOR-POOL AND WORKSTATION QUEUING MODELS
[Figure: queuing models for static load sharing and dynamic load balancing]
M stands for a Markovian distribution.
COMPARISON OF PERFORMANCE FOR WORKLOAD SHARING
- $\gamma = 0$: M/M/1
- $\gamma = \infty$: M/M/2
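The two limiting cases can be compared with the standard closed-form mean turnaround times for M/M/1 and M/M/2 queues. The sketch below assumes two workstations, each with arrival rate λ and service rate μ: with no sharing each behaves as an independent M/M/1 queue, while full sharing behaves as one M/M/2 queue with arrival rate 2λ. The function names and the rates are hypothetical.

```python
def mm1_turnaround(lam, mu):
    """Mean turnaround time of an M/M/1 queue: 1 / (mu - lam)."""
    if lam >= mu:
        raise ValueError("queue is unstable: lam must be less than mu")
    return 1 / (mu - lam)

def mm2_turnaround(lam_total, mu):
    """Mean turnaround time of an M/M/2 queue with per-server rate mu."""
    rho = lam_total / (2 * mu)  # utilization per server
    if rho >= 1:
        raise ValueError("queue is unstable: lam_total must be less than 2 * mu")
    return 1 / (mu * (1 - rho ** 2))

lam, mu = 0.8, 1.0  # hypothetical rates per workstation
t_separate = mm1_turnaround(lam, mu)    # two independent M/M/1 queues
t_shared = mm2_turnaround(2 * lam, mu)  # one shared M/M/2 queue
print(t_separate, t_shared)
```

At this load the shared queue gives a noticeably shorter mean turnaround time, since a process never waits while the other server sits idle.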
RECENT WORK
Scheduling dynamic load balancing in parallel and distributed computers [2]
- Developing effective methods decreases total program execution time and optimizes processor utilization.
- Methods:
  - Simple scheduling method
  - Round Robin algorithm
  - Genetic algorithm (using a modified genetic algorithm)
RECENT WORK
A performance model for analyzing large-scale systems [3]
- Develops a performance model of DNS (Direct Numerical Simulation).
- Captures its key performance characteristics.
- Can be used to predict performance on existing as well as non-existing systems.
FUTURE WORK
Integrated Power and Performance Model [4]
- Predicts the optimal number of active processors for a given application.
- Can model the increase in power consumption that results from an increase in temperature.
- Differs from previous models, which may require:
  - measured execution times
  - hardware performance counters
  - or architectural simulation
FUTURE WORK
Scheduling in multi-processor systems based on PSO (particle swarm optimization) [5]
- Each swarm is modeled by particles in a multidimensional space.
- Every particle is specified by a position and a velocity and starts a search in the search space.
- Aims to minimize the maximum span and optimize the average utilization of all processors.
REFERENCES
[1] B. Veltman, Multiprocessor scheduling with communication delays.
[2] Javad Mohammadzadeh, Scheduling dynamic load-balancing in parallel and distributed computers using modified genetic algorithm with time dependent fitness function, 2009.
[3] Darren J. Kerbyson, A performance model of direct numerical simulation for analyzing large-scale systems, 2011.
[4] Sunpyo Hong, Integrated GPU Power and Performance Model, 2010.
[5] OmidReza Kiyarazm, A new method for scheduling load balancing in multi-processor systems based on PSO, 2011.
[6] Randy Chow, Theodore Johnson, Distributed Operating Systems & Algorithms, 1997.