A Performance Study of Grid Workflow Engines


1 A Performance Study of Grid Workflow Engines
Alexandru Iosup and Dick Epema, PDS Group, Delft University of Technology, The Netherlands
Corina Stratan, Parallel and Distributed Systems Group, Politehnica University of Bucharest, Romania
IEEE/ACM Grid 2008, Tsukuba, JP

2 Why are Grid Workflows Interesting?
- Grids promise a reliable and easy-to-use computational infrastructure for e-Science.
- Full automation from experiment design to final result.
- Often, automation = workflows: jobs comprising inter-related computing and data-transfer tasks.

3 Why is the Performance of Real Grid Workflow Engines Interesting?
- For our users: Is this system suitable for its users? Are other systems better?
- For focusing on the right research problems: What are the interesting problems? System configuration? Which workflow characteristics? Other problems…
- For simulation studies: Unrealistic assumptions limit the applicability of results. How scalable are grid workflow engines (GWFEs)? What overheads do they have?

4 Problem: How to Assess the Performance of Grid Workflow Engines?
- What do we want to assess?
- Is testing in real environments appropriate?
- What performance metrics are important?
- What workflows to use?
Our goal is to develop and validate a methodology for assessing GWFEs.

5 Outline
1. Introduction
2. Methodology for Testing GWFEs
3. The Methodology in Practice
4. Conclusion and Future Work

6 2. Methodology for Testing GWFEs: What to Assess?
Traditional: raw performance metrics
1. Runtime, wait time, etc.
In addition, for grids (failure-prone, complex environments):
2. Overhead: What is the cost of using a GWFE?
3. Stability: Does the system behave consistently?
4. Scalability: Does the system support grid-size workloads?
5. Reliability: What is the impact of dynamic resource availability?

7 2. Methodology for Testing GWFEs: Is Testing in Real Environments Appropriate?
Our approach (novel):
- Testing complete grid middleware stacks in real grid environments.
Alternatives:
- Simulation [Ahmad & Kwok, JPDC’99]
- Mathematical analysis
- Testing GWFEs in isolation (think unit vs. integration testing)

8 2. Methodology for Testing GWFEs: What Performance Metrics are Important?
- Overhead components: Oi, Oa, Os, Ost, Of.
- Raw performance: makespan (MS), speed-up vs. single/infinite machine, …
- Stability: internal (MS IQR / median), overall (MS range / median).
- Scalability, reliability: see the article.
[Diagram labels: Grid Resource Manager, Grid Workflow Engine, Workflow Tasks]
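The two stability ratios on slide 8 can be illustrated with a short Python sketch. It assumes that the makespans come from repeated runs of the same workload, and that the interquartile range is computed with the default (exclusive) method of `statistics.quantiles`; the slides do not say which quartile method the authors used. The function name `stability_metrics` and the sample values are hypothetical.

```python
import statistics


def stability_metrics(makespans):
    """Return the two stability ratios named on the slide.

    internal = IQR(makespan) / median(makespan)
    overall  = range(makespan) / median(makespan)

    Assumption: the quartile method (Python's default 'exclusive')
    is our choice; the slides do not specify one.
    """
    if len(makespans) < 2:
        raise ValueError("need at least two makespan samples")
    median = statistics.median(makespans)
    if median <= 0:
        raise ValueError("median makespan must be positive")
    q1, _, q3 = statistics.quantiles(makespans, n=4)
    return {
        "internal": (q3 - q1) / median,
        "overall": (max(makespans) - min(makespans)) / median,
    }


if __name__ == "__main__":
    # Hypothetical makespans (seconds) from 10 runs; one run is an outlier.
    runs = [100, 102, 98, 101, 99, 100, 103, 97, 100, 150]
    print(stability_metrics(runs))
```

With these made-up samples, the single outlier barely moves the IQR-based ratio but dominates the range-based one, which is one way a system can appear internally stable while not being stable overall.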

9 2. Methodology for Testing GWFEs: What Workflows to Use?
No accepted workload; no real system traces.
Sources:
- related simulation work,
- the Standard Task Graph Set,
- our investigation of test workflows from 2 long-term grid traces [CG Symp.’08],
- our model of grid bags-of-tasks, validated with 7 long-term grid traces [HPDC’08].
[Figure axes: number of graph nodes, graph traversal height]
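The two workflow characteristics named on slide 9, number of graph nodes and graph traversal height, can be computed for a workflow DAG as in the following Python sketch. It assumes that "graph traversal height" means the number of levels in the DAG, that is, the number of tasks on the longest entry-to-exit path; the slides do not define the term. The function name and the dictionary representation (task mapped to its list of child tasks) are hypothetical.

```python
from collections import deque


def workflow_size_and_height(children):
    """Return (number of nodes, height) for a workflow DAG.

    `children` maps each task to the list of tasks that depend on it.
    Assumption: height = number of tasks on the longest dependency chain.
    """
    # Collect every task, including those that appear only as children.
    nodes = set(children)
    for kids in children.values():
        nodes.update(kids)

    # Count incoming dependencies per task.
    indegree = {n: 0 for n in nodes}
    for kids in children.values():
        for k in kids:
            indegree[k] += 1

    # Entry tasks sit on level 1; process tasks in topological order.
    level = {n: 1 for n in nodes if indegree[n] == 0}
    queue = deque(level)
    visited = 0
    while queue:
        n = queue.popleft()
        visited += 1
        for k in children.get(n, ()):
            level[k] = max(level.get(k, 0), level[n] + 1)
            indegree[k] -= 1
            if indegree[k] == 0:
                queue.append(k)

    if visited != len(nodes):
        raise ValueError("workflow graph contains a cycle")
    return len(nodes), max(level.values(), default=0)


if __name__ == "__main__":
    # Hypothetical diamond-shaped workflow: a -> (b, c) -> d
    diamond = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
    print(workflow_size_and_height(diamond))
```

A topological traversal is used so that each task's level is final before its children are processed, and so that cyclic inputs are rejected instead of looping.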

11 3. The Methodology in Practice (Selected Results): Experimental Setup
- Testing complete grid middleware stacks.
- Generic GWFE: a baseline GWFE implementation.
- 15 PCs, 2 GB RAM, 1 Gbps Ethernet.
- Tools: MonALISA, ServMark = DiPerF + GrenchMark.

12 3. The Methodology in Practice (Selected Results): Overhead: Impact of WL Size and Type
Setup: DAGMan, empty jobs, C-4 (left) / many (right).
- Oi >> Ost = Of.
- The internal state update is very important.
- S-1, S-3: many frequent updates lower system throughput.

13 3. The Methodology in Practice (Selected Results): Raw Perf.: Performance vs. Consumption
Karajan performs better than DAGMan, but quickly runs out of resources.
[Figure panels: Karajan, DAGMan]

14 3. The Methodology in Practice (Selected Results): Stability: Internal and Overall Stability
Setup: DAGMan, 10 independent runs, C-4, 10 WFs.
The system is:
- internally stable,
- overall not stable.
Need to react to system dynamics to favor under-served workflows.

16 Conclusion and Future Work
Methodology for testing Grid Workflow Engines:
- Goals
- Metrics
- Workflows
Testing grid middleware stacks, not GWFEs in isolation!
Analysis of two widely used GWFEs vs. a baseline GWFE.
Future work:
- Apply the method to more middleware stacks, in more environments.
- Design domain-specific workloads and assess the performance impact of the inter-domain differences (do different domains raise different challenges?).

17 Thank you! Questions? Remarks? Observations?
Help build our community’s Grid Workloads Archive:
Contact: [google “Iosup”]
Web site: PDS group articles & software
Have (workflow-based) grid traces?
Additional References
[HPDC’08] A. Iosup, O. Sonmez, S. Anoep, and D.H.J. Epema, The Performance of Bags-Of-Tasks in Large-Scale Distributed Computing Systems, In IEEE HPDC'08,
[CG Symp.’08] S. Ostermann, R. Prodan, T. Fahringer, and A. Iosup, On the characteristics of grid workflows, In CoreGRID Symp