University of Dortmund, June 30, 2015: On Grid Performance Evaluation using Synthetic Workloads (JSSPP 2006). Alexandru Iosup, Dick Epema, PDS Group, ST/EWI,

Presentation transcript:

On Grid Performance Evaluation using Synthetic Workloads
JSSPP 2006
Alexandru Iosup, Dick Epema (PDS Group, ST/EWI, TU Delft)
Carsten Franke, Alexander Papaspyrou, Lars Schley, Baiyi Song, and Ramin Yahyapour (UniDo)
University of Dortmund, June 30

Outline
  - A Brief Introduction to Grid Computing
  - On Grid Performance Evaluation
      - Experimental Environments
      - Performance Indicators
      - General Workload Modeling
      - Grid-Specific Workload Modeling
      - The GrenchMark Framework
  - Future Work
  - Conclusions

A Brief Introduction to Grid Computing
  Typical grid environment
    - Applications [!]: unitary, composite
    - Data
    - Resources: compute (clusters), storage (dedicated), network
    - Virtual Organizations, Projects
    - Groups, Users
  Grids vs. parallel production environments
    - Dynamic
    - Heterogeneous
    - Very large-scale (world)
    - No central administration
  → Most resource management problems are NP-hard

Experimental Environments: Real-World Testbeds
  Real-world testbeds
    - DAS, NorduGrid, Grid3/OSG, Grid’5000…
  Pros
    - True performance, also shows “it works!”
    - Infrastructure in place
  Cons
    - Time-intensive
    - Exclusive access (repeatability)
    - Controlled environment problem (limited scenarios)
    - Workload structure (little or no realistic data)
    - What to measure (new environment)

Experimental Environments: Simulated and Emulated Testbeds
  Simulated and emulated testbeds
    - GridSim, SimGrid, GangSim, MicroGrid …
    - Essentially a trade-off of precision vs. speed
  Pros
    - Exclusive access (repeatability)
    - Controlled environment (unlimited scenarios)
  Cons
    - Synthetic Grids: What to generate? How to generate? Clusters, Disks, Network, VOs, Groups, Users, Applications, etc.
    - Workload structure (little or no realistic data)
    - What to measure (new environment)
    - Validity of results (accuracy vs. time)

Grid Performance Evaluation: Current Practice
  Performance Indicators
    - Define my own metrics, or use U (utilization) and AWT/ART (average wait/response time), or both
  Workload Structure
    - Run my own workload, or use traces that are not validated by peer researchers; do not make comparisons!
    - Run benchmarks from typical parallel production environments
    - Mostly an “all users are created equal” assumption
  Need a common performance evaluation framework for Grid

Grid Performance Evaluation: Current Issues
  Performance Indicators
    - What should be the metrics for the new environment?
  Workload Structure
    - Which general aspects could be important?
    - Which Grid-specific aspects need to be addressed?
  Need a common performance evaluation framework for Grid

Performance Indicators
  Time-, Resource-, and System-Related Metrics
    - Traditional: utilization, A(W)RT, A(W)WT, A(W)SD
    - New: waste, fairness (or service quality reliability)
  Workload Completion and Failure Metrics
    - “In Grids, functionality may be even more important than performance”
    - Workload Completion (WC)
    - Task and Enabled Task Completion (TC, ETC)
    - System Failure Factor (SFF)
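As a rough illustration of the traditional indicators listed above, the sketch below computes utilization and the average wait time, response time, and slowdown from a list of job records, plus weighted variants. The `Job` record, the function name `traditional_metrics`, and the choice to weight by job area (processors × runtime) are assumptions made for this example; the slides do not define the weighting. Jobs are assumed to have a non-zero runtime.

```python
from dataclasses import dataclass


@dataclass
class Job:
    submit: float  # submission time (seconds)
    start: float   # time the job started running
    finish: float  # time the job completed
    procs: int     # number of processors used


def traditional_metrics(jobs, total_procs):
    """Return utilization and (weighted) average wait, response, slowdown."""
    # Observation window: from the first submission to the last completion.
    span = max(j.finish for j in jobs) - min(j.submit for j in jobs)
    # Job "area" = processors x runtime; used here as the weight (assumption).
    areas = [(j.finish - j.start) * j.procs for j in jobs]
    waits = [j.start - j.submit for j in jobs]
    resps = [j.finish - j.submit for j in jobs]
    # Slowdown = response time / runtime (runtime assumed > 0).
    slows = [r / (j.finish - j.start) for r, j in zip(resps, jobs)]
    total_area = sum(areas)

    def avg(xs):
        return sum(xs) / len(xs)

    def weighted_avg(xs):
        return sum(x * a for x, a in zip(xs, areas)) / total_area

    return {
        "utilization": total_area / (total_procs * span),
        "AWT": avg(waits), "AWWT": weighted_avg(waits),
        "ART": avg(resps), "AWRT": weighted_avg(resps),
        "ASD": avg(slows), "AWSD": weighted_avg(slows),
    }


# Two made-up jobs on a hypothetical 4-processor system.
example = traditional_metrics(
    [Job(submit=0, start=0, finish=10, procs=4),
     Job(submit=5, start=10, finish=20, procs=2)],
    total_procs=4,
)
```

The weighted variants give large jobs more influence, which is why they can differ noticeably from the plain averages when job sizes are skewed.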

General Aspects for Workload Modeling
  User/Group/VO model
    - Detailed modeling for top-5/10 users, then clustering (use squash area to group)
  Submission patterns
    - Yearly, monthly, weekly, daily
    - Do daily patterns exist? (Are Grids truly global?)
  Temporal patterns
    - Repeated submission (batches of jobs)
    - Job dependencies (composite applications common in Grid(?))
  Feedback
    - Empirical rules (don’t submit jobs when the system is busy). But: reactive submission tools, co-allocators, evolving applications, etc.
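One way to experiment with daily submission patterns and batch submission is to draw batch arrival times from a Poisson process whose rate varies over the day. The sketch below uses the standard thinning method for a non-homogeneous Poisson process. The sinusoidal rate, its peak hour, and the uniform batch-size distribution are all hypothetical choices for illustration and are not taken from the slides or from any trace.

```python
import math
import random


def daily_rate(t_hours, base=2.0, amplitude=1.5, peak_hour=14):
    """Hypothetical arrival rate (batches/hour) with a 24-hour cycle."""
    phase = 2 * math.pi * ((t_hours % 24) - peak_hour) / 24
    return base + amplitude * math.cos(phase)


def generate_batches(horizon_hours, seed=0, base=2.0, amplitude=1.5,
                     max_batch=5):
    """Return a list of (arrival_time_hours, jobs_in_batch) tuples."""
    rng = random.Random(seed)
    rate_max = base + amplitude  # upper bound on the rate, needed for thinning
    t, batches = 0.0, []
    while True:
        # Candidate arrival from a homogeneous process at the maximum rate.
        t += rng.expovariate(rate_max)
        if t >= horizon_hours:
            return batches
        # Keep the candidate with probability rate(t) / rate_max.
        if rng.random() < daily_rate(t, base, amplitude) / rate_max:
            batches.append((t, rng.randint(1, max_batch)))


two_days = generate_batches(horizon_hours=48, seed=1)
```

Setting `amplitude=0` removes the daily pattern, which gives a simple baseline for the question of whether daily patterns matter in a given grid.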

Grid-Specific Workload Modeling: Computation Management
  Processor co-allocation
    - Fixed, non-fixed, semi-fixed jobs
  Job flexibility and composition
    - Moldable, evolvable, flexible, etc.
    - Batches, workflows, other dependencies
  Other aspects
    - Background load: define top jobs (by consumption), model the rest as background load
    - Project stage

Grid-Specific Workload Modeling: Data and Network Management
  Clearly Defined I/O Requirements
    - Files, streams, others
    - Data location and size
    - Replicas: replica location
    - Other aspects: HDD occupancy
  Clearly Defined Network Requirements
    - Bandwidth, latency
    - Communication pattern
    - Special situations: dedicated paths, other QoS
    - Other aspects: background load

Grid-Specific Workload Modeling: Locality/Origin Management
  Job issuer and execution site
    - Not all VOs are created equal!
    - Two-level view: Which VO generates the next job? Within a VO, which user generates the next job?
    - Three-level view, multi-level view (Project, VO, Group, User)
  (Usage) Service Level Agreements
    - Use my system 50% for 7 days, or 20% for 30 days
    - Dedicated paths, other QoS
  Other aspects
    - Background load pertaining to same (u)SLA
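The two-level view can be sketched as two weighted random choices: first pick the VO that issues the next job, then pick a user within that VO. The VO names, user names, and weights below are invented for illustration; in practice they would be estimated from real grid traces.

```python
import random

# Hypothetical share of submitted jobs per VO.
VO_WEIGHTS = {"vo-physics": 0.7, "vo-bio": 0.3}

# Hypothetical share of each VO's jobs per user.
USER_WEIGHTS = {
    "vo-physics": {"alice": 0.8, "bob": 0.2},
    "vo-bio": {"carol": 1.0},
}


def next_job_origin(rng):
    """Return (vo, user) for the next job using the two-level view."""
    # Level 1: which VO generates the next job?
    vos = list(VO_WEIGHTS)
    vo = rng.choices(vos, weights=[VO_WEIGHTS[v] for v in vos])[0]
    # Level 2: within that VO, which user generates the next job?
    users = list(USER_WEIGHTS[vo])
    user = rng.choices(users, weights=[USER_WEIGHTS[vo][u] for u in users])[0]
    return vo, user


rng = random.Random(42)
origins = [next_job_origin(rng) for _ in range(1000)]
```

A three-level or multi-level view would add further nested choices (for example Project, then VO, then Group, then User) in the same style.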

Grid-Specific Workload Modeling: Failure Modeling
  Error level
    - Infrastructure
    - Middleware
    - Application
    - User
  Fault tolerance scheme for submitted jobs
    - Capture the system feedback in the model
  Other aspects
    - Cascading errors

Grid-Specific Workload Modeling: Economic Models
  Utility
    - Resource utility
    - Application utility
  Pricing policies
    - Time-dependent pricing: pay less during off-peak hours
    - Load-dependent pricing: pay less for unused resources
    - Package pricing: pay less for bundles of resources
    - Trust-building pricing: long-standing users pay less
  Other aspects
    - Available information
    - Penalty / user satisfaction
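To make the four pricing policies concrete, the sketch below expresses each as a discount factor and multiplies them into a single quote. Every threshold and discount value is a made-up placeholder, and combining the policies multiplicatively is an assumption of this example, not something the slides prescribe.

```python
def time_factor(hour):
    """Time-dependent pricing: hypothetical 50% discount off-peak."""
    return 0.5 if hour < 8 or hour >= 20 else 1.0


def load_factor(load):
    """Load-dependent pricing: cheaper when the system is idle.

    `load` is the current utilization in [0, 1]; the price scales linearly
    from 50% (idle) to 100% (fully loaded). Values are illustrative.
    """
    return 0.5 + 0.5 * load


def package_factor(units):
    """Package pricing: hypothetical 10% discount for 10 or more units."""
    return 0.9 if units >= 10 else 1.0


def trust_factor(years_as_user):
    """Trust-building pricing: 5% off per year, capped at 20% (assumed)."""
    return max(0.8, 1.0 - 0.05 * years_as_user)


def quote(base_price, units, hour, load, years_as_user):
    """Total price for `units` resource units under all four policies."""
    return (base_price * units
            * time_factor(hour)
            * load_factor(load)
            * package_factor(units)
            * trust_factor(years_as_user))
```

For instance, `quote(2.0, 10, hour=12, load=0.0, years_as_user=10)` applies the load, package, and trust discounts but not the off-peak one.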

GrenchMark: a Framework for Analyzing, Testing, and Comparing Grids
  What’s in a name?
    - grid benchmark → working towards a generic tool for the whole community: help standardize the testing procedures. It is too early for benchmarks, so we use synthetic grid workloads instead
  What’s it about?
    - A systematic approach to analyzing, testing, and comparing grid settings, based on synthetic workloads
    - A set of metrics for analyzing grid settings
    - A set of representative grid applications, both real and synthetic
    - Easy-to-use tools to create synthetic grid workloads
    - Flexible, extensible framework

GrenchMark Overview: Easy to Generate and Run Synthetic Workloads

… but More Complicated Than You Think
  Workload structure
    - User-defined and statistical models
    - Dynamic job arrival
    - Burstiness and self-similarity
    - Feedback, background load
    - Machine usage assumptions
    - Users, VOs
  Metrics
    - A(W) Run/Wait/Resp. Time
    - Efficiency, MakeSpan
    - Failure rate [!]
  (Grid) notions
    - Co-allocation, interactive jobs, malleable, moldable, …
  Measurement methods
    - Long workloads
    - Saturated / non-saturated system
    - Start-up, production, and cool-down scenarios
    - Scaling workload to system
  Applications
    - Synthetic
    - Real
  Workload definition language
    - Base language layer
    - Extended language layer
  Other
    - Can use the same workload for both simulations and real environments
  GrenchMark may become a vehicle for proving (performance indicators, workload modeling) research in dynamic, heterogeneous, very large-scale environments

GrenchMark: Iterative Research Roadmap

GrenchMark: Iterative Research Roadmap
  - Simple functional system: A. Iosup, J. Maassen, R.V. van Nieuwpoort, D.H.J. Epema, Synthetic Grid Workloads with Ibis, KOALA, and GrenchMark, CoreGRID IW, Nov 2005.

GrenchMark: Iterative Research Roadmap
  - Open-GrenchMark: Community Effort
  - This work
  - Complex extensible system: A. Iosup, D.H.J. Epema, GrenchMark: A Framework for Analyzing, Testing, and Comparing Grids, IEEE CCGrid'06, May 2006.

Take home message
  Performance Evaluation of Grid Systems
    - need a common performance evaluation framework for grids
    - need real grid traces (scheduling, accounting, monitoring, etc.)
    - need more research on workload modeling and performance indicators
  Performance indicators
    - failure metrics as important as traditional performance metrics
  Workload modeling
    - generic workload modeling needs validation based on real grid traces
    - computation/data/network management
    - locality/origin management
    - failure modeling
    - economic models
  GrenchMark
    - generic tool for the whole community
    - generates diverse grid workloads
    - easy-to-use, flexible, portable, extensible, …

Thank you! Questions? Remarks? Observations? All welcome!
GrenchMark

Representative Grid applications (3/4): Composite, DAG-based
  DAG-based applications
    - Real DAG
    - Chain of tools
    - Try to model real or predicted (use) cases
  [Figure: chain App1 > Linker1 > App2 > Final result. Legend: Input, Output, User task, Linker. Linker type: Identity (one task’s output = other’s input, unmodified). File labels: out_1-2.dat, param1.in, out_1-1.dat, huge-data.out, perf2.dat, param2.in, some-list.in, out2.res, l1p.dat, perf1.dat]
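The chain App1 > Linker1 > App2 from the figure can be sketched as a small DAG executed in dependency order. The example uses Python's standard `graphlib.TopologicalSorter` (Python 3.9 or later). The task bodies are placeholders invented for illustration; only the task names and the identity linker (one task's output passed unmodified as the next task's input) come from the slide.

```python
from graphlib import TopologicalSorter

# Each task maps to the set of tasks it depends on.
DEPENDENCIES = {
    "App1": set(),
    "Linker1": {"App1"},
    "App2": {"Linker1"},
}


def app1(inputs):
    # Placeholder user task: produces some output from no predecessors.
    return "result-of-app1"


def linker1(inputs):
    # Identity linker: forwards its predecessor's output unmodified.
    return inputs[0]


def app2(inputs):
    # Placeholder user task: consumes the linked output.
    return f"app2({inputs[0]})"


TASKS = {"App1": app1, "Linker1": linker1, "App2": app2}


def run_dag(dependencies, tasks):
    """Run every task after its predecessors; return (order, results)."""
    order = list(TopologicalSorter(dependencies).static_order())
    results = {}
    for name in order:
        # Gather predecessor outputs in a stable (sorted) order.
        inputs = [results[dep] for dep in sorted(dependencies[name])]
        results[name] = tasks[name](inputs)
    return order, results


order, results = run_dag(DEPENDENCIES, TASKS)
```

Because the graph is a simple chain, the execution order is fully determined; a wider DAG would allow independent tasks to be dispatched concurrently.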