IPDPS 2003, Nice, France Agent-Based Grid Load Balancing Using Performance-Driven Task Scheduling Junwei Cao (C&C Research Labs, NEC Europe Ltd., Germany) D. P. Spooner, S. A. Jarvis and G. R. Nudd (Dept. Computer Science, Univ. of Warwick, UK) S. Saini (NASA Ames Research Center, USA)
IPDPS 2003, Nice, France Outline Overview Performance Prediction Performance-Driven Task Scheduling Agent-Based Grid Load Balancing Experimental Results Conclusions
IPDPS 2003, Nice, France Overview Grid Users Grid Resources Global Grid Management (GGM): Agent-Based Load Balancing Local Grid Management (LGM): Performance-Driven Task Scheduling PACE Application Tools PACE Performance Evaluation Engine PACE Resource Tools
IPDPS 2003, Nice, France The PACE Toolkit Application Tools Resource Tools Evaluation Engine Source Code Analysis Object Editor Object Library PSL Compiler CPU Network (MPI, PVM) Cache (L1, L2) HMCL Compiler
IPDPS 2003, Nice, France LGM - FIFO Algorithm Processor 1 Processor 2 Processor 3 Processor 4 Processor 5 Processor 6 Processor 7 Processor 8 2 n -1
IPDPS 2003, Nice, France LGM - Genetic Algorithm Heuristic Evolutionary Near-optimal: Makespan Idletime Deadlines
IPDPS 2003, Nice, France LGM – System Implementation Communication Module PACE Evaluation Engine Task Management Performance-Drvien Task Scheduling Resource MonitoringTask Execution RequestsResultsService
IPDPS 2003, Nice, France GGM – Agent Methodology Agent structure Communication layer Decision-making layer Local management layer Agent hierarchy Service advertisement Service discovery Agent Capability Tables A AA AA Use r
IPDPS 2003, Nice, France GGM - Optimization Strategies A AA AA M Advertisement Data-push & data-pull Periodic & event-driven Discovery Local services Services in ACTs Lower or upper agents Optimisation Modeling Simulation User
IPDPS 2003, Nice, France GGM – System Implementation A AA AA M LLLLL Service information PACE resc models Makespan Request information Exec scripts PACE app model Deadline Matchmaking Estimation (FIFO) Deadline User
IPDPS 2003, Nice, France Load Balancing Metrics Total makespan Average advance time of task execution completions (required deadline - actual task completion time) Average processor utilisation rate (busy time / total makespan) Load balancing level (1 - mean square deviation of processor utilisation rates / average processor utilisation rate) Total number of network packages for both advertisement and discovery
IPDPS 2003, Nice, France Experiment Design S 1 (SGIOrigin2000, 16) S 2 (SGIOrigin2000, 16) S 4 (SunUltra10, 16) S 3 (SunUltra10, 16) S 5 (SunUltra5, 16) S 6 (SunUltra5, 16) S 12 (SunSPARCstati on2, 16) S 11 (SunSPARCstati on2, 16) S 8 (SunUltra1, 16) S 7 (SunUltra5, 16) S 10 (SunUltra1, 16) S 9 (SunUltra1, 16) Tasks: sweep3d fft improc closure jacobi memsort cpi
IPDPS 2003, Nice, France Experiment 1 FIFO
IPDPS 2003, Nice, France Experiment 1 FIFO
IPDPS 2003, Nice, France Experiment 2 GA
IPDPS 2003, Nice, France Experiment 3 GA
IPDPS 2003, Nice, France Task Execution Both GA and agents contribute towards the improvement in task executions.
IPDPS 2003, Nice, France Resource Utilisation Less powerful S11 & S12 benefit mainly from the GA. More powerful S1 & S2 benefit mainly from agents.
IPDPS 2003, Nice, France Load Balancing The GA contributes more to local grid load balancing. Agents contribute more to global grid load balancing.
IPDPS 2003, Nice, France Total Makespan The centralised pure data-pull can always achieve the best results Distributed agents with the hierarchical model can also achieve reasonably good results
IPDPS 2003, Nice, France Network Package The network overhead for the pure data-pull strategy to achieve better results is very high. Distributed agent- based service advertisement and discovery can scale well.
IPDPS 2003, Nice, France Conclusions Application performance prediction can be utilized for grid QoS support and resource scheduling. An multi-agent paradigm provides a clear high-level abstraction of grid management system. Distributed service advertisement and discovery strategies can be utilized to achieve grid load balancing.
IPDPS 2003, Nice, France Related Works Medical Simulation Services via the Grid G. Lonsdale, NEC Europe Ltd. Friday, Industrial Track II, IPDPS 2003 Performance Prediction and its use in Parallel and Distributed Computing Systems S. A. Jarvis, University of Warwick Saturday, PMEO Workshop, IPDPS 2003 GridFlow: Workflow Management for Grid Computing To appear in CCGrid 2003, Tokyo, Japan