VO-Ganglia Grid Simulator
Catalin Dumitrescu, Mike Wilde, Ian Foster
Computer Science Department, The University of Chicago


➢ Part I: The Grid-enabled Monitoring Tool
➢ Part II: From Monitoring to Simulation
➢ Part III: Features / Extended Model
➢ Shortcomings
➢ Future Work / Conclusions

➢ P2P Reporting
  ✗ implicit hierarchical infrastructures
➢ Interface with Other Monitoring Tools
  ✗ Nagios, MDS-2
➢ Grid/Globus-Specific Metrics
  ✗ gatekeeper information / cluster RM status
➢ Per-VO Monitoring Support
  ✗ collected metrics are both aggregated and VO-specific
➢ Resource Management
➢ Preference Specifications
➢ Usage Policy Enforcement


➢ Implemented Ideas
  ● VO-based metric reporting
  ● usage-policy metric incorporation
  ● distributed infrastructure for usage policies
➢ Time Spent on Development
  ● enhanced monitoring: ~3 months
  ● policy: ~6 months
  ● simulator: ~3 months
➢ Are There Other Alternatives?
  ● MonALISA
  ● standard Ganglia

➢ Consistently Suitable Grid Testbeds Are Difficult to Find
➢ Deployment Takes Time
➢ Computing Time Is Scarce in Production Environments
➢ What Do Some Well-Known Testbeds Offer Today?
  ● Grid3: many clusters with similar software and Globus
  ● PlanetLab: individual machines with similar characteristics

➢ CPU Management / Task Assignment Policies
➢ Disk Management / Space Assignment Policies
➢ Network Management / Maximum Capacity (so far)
➢ Usage Policy Specification Interface
➢ Data File Management (replica selection problem)
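As one illustration of what a usage-policy specification interface might accept, here is a hypothetical per-VO policy table with a validity check. The field names (bp/ep/be, mirroring the BP/EP/BE notation used in the scheduling pseudocode) and all values are assumptions for this sketch, not the simulator's actual format; Python stands in for the tool's Perl.

```python
# Hypothetical per-VO usage-policy specification of the kind the
# simulator's policy interface could consume. Field names mirror the
# BP (baseline), EP (entitled), BE (burst-extra) notation of the
# scheduling pseudocode; everything here is illustrative.
POLICY = {
    "ivdgl":   {"bp": 0.20, "ep": 0.40, "be": 0.10, "disk_gb": 500},
    "griphyn": {"bp": 0.30, "ep": 0.50, "be": 0.10, "disk_gb": 800},
}

def validate(policy):
    """Reject specifications that oversubscribe baseline CPU shares."""
    total_bp = sum(vo["bp"] for vo in policy.values())
    if total_bp > 1.0:
        raise ValueError(f"baseline shares sum to {total_bp:.2f} > 1.0")
    for name, vo in policy.items():
        if not vo["bp"] <= vo["ep"] <= 1.0:
            raise ValueError(f"{name}: need bp <= ep <= 1.0")
    return True

validate(POLICY)
```

A check like this catches the common misconfiguration where baseline shares across VOs add up to more than the whole resource.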

➢ Before:
  ✗ metric collection by means of specific collectors
➢ Now:
  ✗ special modules that generate metrics about different loads
  ✗ similar to a discrete simulator, but integrated with a real tool
➢ "How exactly?"
  ✗ periodic invocations (instead of monitoring collectors)
  ✗ state management for workloads, data-file migration, CPU and disk allocations, and network usage
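A minimal sketch of the periodic-invocation idea: small modules are fired at regular intervals and synthesize their metrics, the way a discrete-event simulator would, instead of collecting them from live resources. Python stands in for the tool's Perl, and every name here is illustrative rather than VO-Ganglia's API.

```python
import heapq

class LoadModule:
    """A metric source invoked every `period` ticks (illustrative)."""
    def __init__(self, name, period, generator):
        self.name, self.period, self.generator = name, period, generator

def run(modules, until):
    """Discrete-event loop: fire each module at multiples of its period."""
    events = [(m.period, i, m) for i, m in enumerate(modules)]
    heapq.heapify(events)
    samples = []
    while events:
        t, i, m = heapq.heappop(events)
        if t > until:
            break
        samples.append((t, m.name, m.generator(t)))
        heapq.heappush(events, (t + m.period, i, m))
    return samples

samples = run([LoadModule("cpu_load", 5, lambda t: 0.5),
               LoadModule("disk_used", 10, lambda t: 100 + t)], until=20)
```

The generated samples can then feed the same reporting and visualization path that real collector output would.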



➢ Idea: run several simulator instances on different machines, each configured to report to a set of specified neighbors
➢ Advantages:
  ✗ simple to connect several local simulators working on different data
  ✗ support for metric distribution and visualization
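The neighbor-reporting idea can be sketched as follows: each instance keeps a local metric view and pushes it to its configured neighbors, which merge it into their own. The class and method names are invented for illustration and abstract away the actual transport between machines.

```python
class SimInstance:
    """One simulator instance; merges neighbor reports into its view.
    Names and structure are illustrative, not the tool's real API."""
    def __init__(self, name, neighbors=()):
        self.name = name
        self.neighbors = list(neighbors)
        self.view = {}                     # "instance/metric" -> value

    def record(self, metric, value):
        """Record a locally produced metric under this instance's name."""
        self.view[f"{self.name}/{metric}"] = value

    def report(self):
        """Push this instance's current view to every configured neighbor."""
        for n in self.neighbors:
            n.view.update(self.view)

hub = SimInstance("hub")
site = SimInstance("siteB", neighbors=[hub])
site.record("cpu_load", 0.7)
site.report()                              # hub now sees siteB/cpu_load
```

Because metric keys are prefixed with the producing instance's name, several local simulators working on different data can be wired together without their views colliding.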

[...]

for each Gi with EPi, BPi, BEi do
    # Case 1: fill BPi + BEi
    if (Sum(BAj) == 0) & (BAi < BPi) & (Qi has jobs) then
        schedule a job from some Qi to the least loaded site
    # Case 2: BAi < BPi (resources available)
    else if (Sum(BAk) < TOTAL) & (BAi < BPi) & (Qi has jobs) then
        schedule a job from some Qi to the least loaded site
    # Case 3: fill EPi (resource contention)
    else if (Sum(BAk) == TOTAL) & (BAi < EPi) & (Qi exists) then
        if (exists j such that BAj >= EPj) then
            stop scheduling jobs for VOj
        # need to fill with extra jobs?
        if (BAi < EPi + BEi) then
            schedule a job from some Qi to the least loaded site
    # backfill when the entitlement is not met
    if (EAi < EPi) & (Qi has jobs) then
        schedule additional backfill jobs
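The loop above can be turned into runnable form. This Python sketch keeps the slide's BP/EP/BE/BA/Q notation but simplifies the logic: Cases 1 and 2 collapse into one spare-capacity check, and the "stop scheduling for VOs at or above EP" branch is folded into the capacity test. It is an approximation of the policy for illustration, not the simulator's code.

```python
# Runnable approximation of the scheduling loop above, in the slide's
# notation: bp = baseline, ep = entitled, be = burst extra, ba = busy
# allocations now, q = queued jobs. All structure here is assumed.
def schedule_step(vos, total):
    """Pick a VO allowed to start one job, or None if nothing fits."""
    busy = sum(v["ba"] for v in vos.values())
    for name, v in vos.items():
        if not v["q"]:                    # no queued jobs for this VO
            continue
        # Cases 1 and 2: spare capacity -- fill up to the baseline BP
        if busy < total and v["ba"] < v["bp"]:
            return name
        # Case 3: contention -- allow growth up to EP plus the burst BE
        if busy == total and v["ba"] < v["ep"] + v["be"]:
            return name
    return None

vos = {"vo1": {"bp": 2, "ep": 3, "be": 1, "ba": 0, "q": ["j1", "j2"]},
       "vo2": {"bp": 1, "ep": 2, "be": 0, "ba": 0, "q": []}}
chosen = schedule_step(vos, total=4)      # "vo1": idle cluster, below BP
```

Calling `schedule_step` repeatedly, updating `ba` and `q` after each placement, reproduces the fill-baseline-first, burst-under-contention behaviour the pseudocode describes.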

[Chart: per-VO allocations for VO1 and VO2, with levels at 20%, 60%, 80%, 90%, and 99%]


➢ RRD / Disk Access
➢ Perl / Interpreted-Language Speed
➢ Result Interpretation
➢ Result Validation in Real Contexts

➢ "What Is Next?"
  ✗ more work on resource usage-policy analysis
  ✗ "export" ideas from VO-Ganglia into real practice

➢ "Why Is VO-Ganglia So 'Cool' for Me?"
  ✗ some creative ideas
  ✗ easy to use
  ✗ "possibility to run on my laptop"
  ✗ provisioning tools for
    ✔ workload generation
    ✔ result formatting
➢ "Why Did I Invest More Than a Year in Developing It?"

?