DGSim: Comparing Grid Resource Management Architectures Through Trace-Based Simulation. Alexandru Iosup, Ozan Sonmez, and Dick Epema. Euro-Par 2008, Las Palmas, 27 August 2008.

Presentation transcript:

DGSim: Comparing Grid Resource Management Architectures Through Trace-Based Simulation
Alexandru Iosup, Ozan Sonmez, and Dick Epema
PDS Group, Delft University of Technology, The Netherlands
Euro-Par 2008, Las Palmas, 27 August

A Grid Research Toolbox
Hypothesis: (a) is better than (b).
DGSim: For scenario 1, …

The Problem with Grid Simulations
- Three decades of writing simulators in computer science → writing the simulator is not the problem
- The problem: getting from solution design to experimental results with an automated simulation tool
  - Experimental setup: tool to generate realistic experimental setups
  - Experiment support for grid resource management: tool to manage large numbers of related simulations
  - Performance: not the simulation time (decades of optimizations there); tool proved to work with large simulations (number of resources, workload size, etc.)

Outline
1. Problem Statement
2. The DGSim Framework
3. DGSim Validation
4. DGSim Examples
5. Future Work

The DGSim Framework: Name, Goal, and Challenges
- DGSim = Delft Grid Simulator
- Goal: simulate various grid resource management (GRM) architectures
  - Multi-cluster grids
  - Grids of grids (THE grid)
- Challenges
  - Many types of architectures
  - Generating and replaying grid workloads
  - Management of the simulations
    - Many repetitions of a simulation for statistical relevance
    - Simulations with many parameters
    - Managing results (e.g., analysis tools)
    - Enabling collaborative experiments
[Figure: two GRM architectures]
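The simulation-management challenge listed on this slide (many parameters, many repetitions per configuration, aggregated results) can be sketched as a small parameter sweep. This is an illustration only and not DGSim code: `run_simulation` is a hypothetical stand-in that returns a made-up metric, and the parameter names and the normal-approximation confidence interval are assumptions.

```python
import itertools
import random
import statistics


def run_simulation(architecture, load, seed):
    """Hypothetical stand-in for one simulation run; returns a synthetic metric."""
    rng = random.Random(seed)
    # The formula is invented; a real run would replay a workload trace.
    return rng.gauss(100.0 * load, 5.0)


def sweep(parameters, repetitions):
    """Run every parameter combination `repetitions` times and summarize it."""
    names = sorted(parameters)
    results = {}
    for values in itertools.product(*(parameters[name] for name in names)):
        point = dict(zip(names, values))
        # One distinct seed per repetition, so repetitions are independent.
        samples = [run_simulation(seed=rep, **point) for rep in range(repetitions)]
        mean = statistics.mean(samples)
        # 95% half-width under a normal approximation (an assumption).
        half_width = 1.96 * statistics.stdev(samples) / repetitions ** 0.5
        results[values] = (mean, half_width)
    return results


results = sweep(
    {"architecture": ["centralized", "decentralized"], "load": [0.5, 0.9]},
    repetitions=10,
)
for point, (mean, half_width) in sorted(results.items()):
    print(point, f"{mean:.1f} +/- {half_width:.1f}")
```

Keying results by the parameter tuple keeps each design point separate, which matters once the design space grows to the sizes reported later in the talk.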

The DGSim Framework: Overview
[Figure: framework overview; recoverable label: Discrete-Event Simulator]

The DGSim Framework: Model Details: Inter-Operation Architectures
- Independent
- Centralized
- Hierarchical
- Decentralized
- Hybrid hierarchical/decentralized

The DGSim Framework: Model Details: Resource Dynamics & Evolution
- Resource dynamics: short-term changes in resource availability status
- Resource evolution: long-term changes in number & … of resources
Reference: A. Iosup, M. Jan, O. Sonmez, and D.H.J. Epema, On the Dynamic Resource Availability in Grids, IEEE/ACM Grid, 2007.
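The distinction between dynamics and evolution can be made concrete with a small data model. The sketch below is an assumption about one possible representation, not the DGSim model: `Resource`, `added_at`, and `downtimes` are hypothetical names, with the join time standing for evolution and the downtime intervals standing for dynamics.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    # Evolution: the time at which the resource joins the grid.
    added_at: float
    # Dynamics: (start, end) intervals during which the resource is unavailable.
    downtimes: list = field(default_factory=list)

    def is_available(self, t):
        """True if the resource exists at time t and is not in a downtime."""
        if t < self.added_at:
            return False
        return not any(start <= t < end for start, end in self.downtimes)


def available_count(resources, t):
    """Number of resources that can accept work at time t."""
    return sum(r.is_available(t) for r in resources)


grid = [
    Resource(added_at=0, downtimes=[(10, 20)]),  # short-term outage
    Resource(added_at=0),                        # always available
    Resource(added_at=100),                      # added later (evolution)
]
print([available_count(grid, t) for t in (5, 15, 150)])
```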

The DGSim Framework: Workloads: Generation and Model(s)
- Workload models
  - Parallel jobs: adapting the Lublin-Feitelson model to grids
  - Bags-of-Tasks: groups of independent single-processor tasks
  - Validated with seven long-term grid traces
- Workload generation
  - Generate synthetic workload with realistic characteristics
  - Iterative workload generation: incur specified load on a grid
References:
A. Iosup, O.O. Sonmez, S. Anoep, D.H.J. Epema, The Performance of Bags-of-Tasks in Large-Scale Distributed Computing Systems, IEEE HPDC.
A. Iosup, D.H.J. Epema, T. Tannenbaum, M. Farrellee, M. Livny, Inter-Operating Grids through Delegated MatchMaking, ACM/IEEE SuperComputing.
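The slide does not say how iterative workload generation reaches the specified load. One plausible scheme, sketched below under stated assumptions, regenerates the workload while rescaling the arrival rate until the offered load is close to the target. The distributions (Poisson arrivals, exponential runtimes, job sizes of 1 to 8 processors), the load definition, and all function names are illustrative assumptions, not the models used by DGSim.

```python
import random


def generate_workload(arrival_rate, duration, rng):
    """Return (arrival, processors, runtime) tuples with arrival < duration."""
    jobs, t = [], 0.0
    while True:
        t += rng.expovariate(arrival_rate)       # Poisson arrivals (assumption)
        if t >= duration:
            return jobs
        runtime = rng.expovariate(1.0 / 600.0)   # mean 600 s (hypothetical)
        procs = rng.choice([1, 2, 4, 8])         # hypothetical job sizes
        jobs.append((t, procs, runtime))


def offered_load(jobs, total_procs, duration):
    """Requested processor-seconds divided by the capacity of the system."""
    demand = sum(procs * runtime for _, procs, runtime in jobs)
    return demand / (total_procs * duration)


def generate_for_load(target, total_procs, duration, tol=0.05, max_iter=50, seed=42):
    """Rescale the arrival rate until the load is within tol of the target."""
    rate = 0.01
    for _ in range(max_iter):
        # Same seed each time, so only the arrival rate changes per iteration.
        jobs = generate_workload(rate, duration, random.Random(seed))
        load = offered_load(jobs, total_procs, duration)
        if abs(load - target) <= tol * target:
            return jobs, load
        rate *= target / load if load > 0 else 2.0
    raise RuntimeError("target load not reached")


jobs, load = generate_for_load(target=0.7, total_procs=100, duration=7 * 86400)
print(len(jobs), round(load, 3))
```

Reusing the seed makes the load a nearly linear function of the arrival rate, so the multiplicative update settles in a few iterations.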

DGSim Validation: Functional Validation
- Functional validation (simple scenario)
  - Workload: 100 jobs of constant size 10,000, all arriving at t=0
  - System: grid scheduler over one 10-resource cluster; each resource processes 1 work unit/second; information delay = … s
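The scenario above is simple enough to check by hand, which is what makes it useful for functional validation. The sketch below is not DGSim code: it assumes that each job runs on exactly one resource, that jobs are served first-come first-served, and that the information delay can be ignored. Under those assumptions the jobs run in 10 waves of 10 jobs, each wave taking 10,000 s.

```python
import heapq


def simulate_fcfs(num_jobs, job_size, num_resources, speed):
    """Return (makespan, finish_times) for jobs that all arrive at t=0.

    Assumptions: one resource per job, FCFS order, no information delay.
    """
    # Min-heap holding the time at which each resource becomes free.
    free_at = [0.0] * num_resources
    heapq.heapify(free_at)
    finish_times = []
    for _ in range(num_jobs):
        start = heapq.heappop(free_at)      # earliest available resource
        finish = start + job_size / speed   # runtime = work / speed
        finish_times.append(finish)
        heapq.heappush(free_at, finish)
    return max(finish_times), finish_times


makespan, finish_times = simulate_fcfs(
    num_jobs=100, job_size=10_000, num_resources=10, speed=1.0
)
# Closed-form check: 100 jobs / 10 resources = 10 waves of 10,000 s each.
expected = (100 // 10) * (10_000 / 1.0)
print(makespan, expected)  # 100000.0 100000.0
```

A simulator whose output differs from the closed-form value in such a scenario has a functional bug, independent of how realistic its models are.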

DGSim Validation: Real vs. Simulated DAS-3 Multi-Cluster Grid
- Simulator setup
  - Application: synthetic parallel, communication-intensive (all-gather)
  - Measured: runtime for various configurations (co-allocation)
  - System: heterogeneous clusters, Koala co-allocating scheduler
  - Workload: 300 jobs, submitted over a period of 6 hours
  - All jobs submitted through central cluster gateways
- Results
  - Scheduling algorithm leads to similar results in real and simulated environments → can use simulator for analyzing scheduling trends
  - Under-estimation of waiting time (failures lead to more contention)

DGSim Examples: Sample 1/3
- Investigate mechanisms for inter-operating grids
  - New mechanism: DMM (Delegated MatchMaking)
- Trace-based performance evaluation through simulations
  - Real and model-based traces
  - Largest trace: 1.4M jobs
  - Simulate Grid’5000+DAS-2
- Explored a design space of over 1 million design points
Reference: A. Iosup, D.H.J. Epema, T. Tannenbaum, M. Farrellee, M. Livny, Inter-Operating Grids through Delegated MatchMaking, ACM/IEEE SuperComputing, 2007.

DGSim Examples: Sample 2/3
- What is the performance impact of the dynamic grid resource availability?
  - Four models for grid resource availability information
- Trace-based performance evaluation through simulations
  - Real traces
  - Simulate Grid’5000
- Result: KA = AMA > HMA >> SA

Resource availability   Availability information delay   Model
Static                  (none)                           SA
Dynamic                 On-time (0)                      KA
Dynamic                 Short period                     AMA
Dynamic                 Long period                      HMA

[Figure: Avg. Norm. G’put. [cpuseconds/day/proc] per model (SA, KA, AMA 60s, AMA 1h, HMA 1w, HMA 1mo, HMA Never). Goodput decreases with intervention delay.]
Reference: A. Iosup, M. Jan, O. Sonmez, and D.H.J. Epema, On the Dynamic Resource Availability in Grids, IEEE/ACM Grid.

DGSim Examples: Sample 3/3
- Analyze performance of bag-of-tasks scheduling algorithms
  - Information availability framework: Known, Unknown, Historical records
- Trace-based performance evaluation through simulations
  - Real and model-based traces
  - Simulate Grid’5000+DAS
- Evaluated 8 scheduling algorithms
- Explored a design space of over 2 million design points
[Table: algorithms by Task Information (K, H, U) and Resource Information (K, H, U); cell entries as extracted: ECT, FPLT; FPF; ECT-P; DFPLT, MQD; STFR; RR, WQR]
Reference: A. Iosup, O.O. Sonmez, S. Anoep, D.H.J. Epema, The Performance of Bags-of-Tasks in Large-Scale Distributed Computing Systems, IEEE HPDC.

Conclusion and Future Work
- The DGSim framework
  - Tool to generate realistic experimental setups
  - Tool to manage large numbers of grouped simulations
  - Tool proved to work with large simulations
- Validated underlying models and assumptions
  - Resource dynamics and evolution model
  - Workload model
- Comparing grid resource management architectures
  - Proven in various settings
- Future work
  - More scenarios
  - Library of ready-to-use scenarios

Thank you! Questions? Remarks? Observations?
Contact: [google “Iosup”]
Web sites:
- http:// : VL-e project
- http:// : PDS group articles & software