1 Grid2003 Monitoring, Metrics, and Grid Cataloging System Leigh GRUNDHOEFER, Robert QUICK, John HICKS (Indiana University) Robert GARDNER, Marco MAMBELLI,

Slides:



Advertisements
Similar presentations
A Lightweight Platform for Integration of Mobile Devices into Pervasive Grids Stavros Isaiadis, Vladimir Getov University of Westminster, London {s.isaiadis,
Advertisements

USING THE GLOBUS TOOLKIT This summary by: Asad Samar / CALTECH/CMS Ben Segal / CERN-IT FULL INFO AT:
A Model for Grid User Management Rich Baker Dantong Yu Tomasz Wlodek Brookhaven National Lab.
Sergey Belov, LIT JINR 15 September, NEC’2011, Varna, Bulgaria.
October 2003 Iosif Legrand Iosif Legrand California Institute of Technology.
The new The new MONARC Simulation Framework Iosif Legrand  California Institute of Technology.
A tool to enable CMS Distributed Analysis
1 of 5 This document is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESS OR IMPLIED, IN THIS DOCUMENT. © 2007 Microsoft Corporation.
CERN IT Department CH-1211 Genève 23 Switzerland t Integrating Lemon Monitoring and Alarming System with the new CERN Agile Infrastructure.
December 1, 2004Rob Quick - iVDGL Grid Operations Center1 Grid Operations Rob Quick Grid Technologist Indiana University Open Science Grid.
Large scale data flow in local and GRID environment V.Kolosov, I.Korolko, S.Makarychev ITEP Moscow.
Sergey Belov, Tatiana Goloskokova, Vladimir Korenkov, Nikolay Kutovskiy, Danila Oleynik, Artem Petrosyan, Roman Semenov, Alexander Uzhinskiy LIT JINR The.
LHC Experiment Dashboard Main areas covered by the Experiment Dashboard: Data processing monitoring (job monitoring) Data transfer monitoring Site/service.
Grid Information Systems. Two grid information problems Two problems  Monitoring  Discovery We can use similar techniques for both.
CERN - IT Department CH-1211 Genève 23 Switzerland t Monitoring the ATLAS Distributed Data Management System Ricardo Rocha (CERN) on behalf.
Rsv-control Marco Mambelli – Site Coordination meeting October 1, 2009.
ATLAS Off-Grid sites (Tier-3) monitoring A. Petrosyan on behalf of the ATLAS collaboration GRID’2012, , JINR, Dubna.
1 Dynamic Application Installation (Case of CMS on OSG) Introduction CMS Software Installation Overview Software Installation Issues Validation Considerations.
OSG Services at Tier2 Centers Rob Gardner University of Chicago WLCG Tier2 Workshop CERN June 12-14, 2006.
OSG Middleware Roadmap Rob Gardner University of Chicago OSG / EGEE Operations Workshop CERN June 19-20, 2006.
Mark L. Green, Ph.D. Grid Computational Scientist Monitoring and Information Services Architecture and Deployment University at Buffalo The State University.
ACAT 2003 Iosif Legrand Iosif Legrand California Institute of Technology.
Ramiro Voicu December Design Considerations  Act as a true dynamic service and provide the necessary functionally to be used by any other services.
A. Sim, CRD, L B N L 1 OSG Applications Workshop 6/1/2005 OSG SRM/DRM Readiness and Plan Alex Sim / Jorge Rodriguez Scientific Data Management Group Computational.
CERN IT Department CH-1211 Genève 23 Switzerland t Internet Services Job Monitoring for the LHC experiments Irina Sidorova (CERN, JINR) on.
1 DIRAC – LHCb MC production system A.Tsaregorodtsev, CPPM, Marseille For the LHCb Data Management team CHEP, La Jolla 25 March 2003.
Overview of Monitoring and Information Systems in OSG MWGS08 - September 18, Chicago Marco Mambelli - University of Chicago
A monitoring tool for a GRID operation center Sergio Andreozzi (INFN CNAF), Sergio Fantinel (INFN Padova), David Rebatto (INFN Milano), Gennaro Tortone.
1 1 Service Composition for LHC Computing Grid Monitoring Beob Kyun Kim e-Science Division, KISTI
14 Aug 08DOE Review John Huth ATLAS Computing at Harvard John Huth.
Cracow Grid Workshop October 2009 Dipl.-Ing. (M.Sc.) Marcus Hilbrich Center for Information Services and High Performance.
And Tier 3 monitoring Tier 3 Ivan Kadochnikov LIT JINR
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks WMSMonitor: a tool to monitor gLite WMS/LB.
VO-Ganglia Grid Simulator Catalin Dumitrescu, Mike Wilde, Ian Foster Computer Science Department The University of Chicago.
What is SAM-Grid? Job Handling Data Handling Monitoring and Information.
SAN DIEGO SUPERCOMPUTER CENTER Inca TeraGrid Status Kate Ericson November 2, 2006.
Microsoft Virtual Academy. STANDARDIZATION SELF SERVICEAUTOMATION Give Customers of IT services the ability to identify, access and request services.
ESFRI & e-Infrastructure Collaborations, EGEE’09 Krzysztof Wrona September 21 st, 2009 European XFEL.
US LHC OSG Technology Roadmap May 4-5th, 2005 Welcome. Thank you to Deirdre for the arrangements.
OSG Integration Activity Report Rob Gardner Leigh Grundhoefer OSG Technical Meeting UCSD Dec 16, 2004.
6/23/2005 R. GARDNER OSG Baseline Services 1 OSG Baseline Services In my talk I’d like to discuss two questions:  What capabilities are we aiming for.
CHEP 2006, February 2006, Mumbai 1 LHCb use of batch systems A.Tsaregorodtsev, CPPM, Marseille HEPiX 2006, 4 April 2006, Rome.
Xrootd Monitoring and Control Harsh Arora CERN. Setting Up Service  Monalisa Service  Monalisa Repository  Test Xrootd Server  ApMon Module.
INFSO-RI Enabling Grids for E-sciencE ARDA Experiment Dashboard Ricardo Rocha (ARDA – CERN) on behalf of the Dashboard Team.
April 2003 Iosif Legrand MONitoring Agents using a Large Integrated Services Architecture Iosif Legrand California Institute of Technology.
PPDG February 2002 Iosif Legrand Monitoring systems requirements, Prototype tools and integration with other services Iosif Legrand California Institute.
XROOTD AND FEDERATED STORAGE MONITORING CURRENT STATUS AND ISSUES A.Petrosyan, D.Oleynik, J.Andreeva Creating federated data stores for the LHC CC-IN2P3,
3D Testing and Monitoring Lee Lueking LCG 3D Meeting Sept. 15, 2005.
GraDS MacroGrid Carl Kesselman USC/Information Sciences Institute.
Tier3 monitoring. Initial issues. Danila Oleynik. Artem Petrosyan. JINR.
1 CMS Software Installation, Bockjoo Kim, 23 Oct. 2008, T3 Workshop, Fermilab CMS Commissioning and First Data Stan Durkin The Ohio State University for.
MND review. Main directions of work  Development and support of the Experiment Dashboard Applications - Data management monitoring - Job processing monitoring.
Global ADC Job Monitoring Laura Sargsyan (YerPhI).
1 A Scalable Distributed Data Management System for ATLAS David Cameron CERN CHEP 2006 Mumbai, India.
03/09/2007http://pcalimonitor.cern.ch/1 Monitoring in ALICE Costin Grigoras 03/09/2007 WLCG Meeting, CHEP.
Enabling Grids for E-sciencE CMS/ARDA activity within the CMS distributed system Julia Andreeva, CERN On behalf of ARDA group CHEP06.
Gennaro Tortone, Sergio Fantinel – Bologna, LCG-EDT Monitoring Service DataTAG WP4 Monitoring Group DataTAG WP4 meeting Bologna –
VOX Project Status T. Levshina. 5/7/2003LCG SEC meetings2 Goals, team and collaborators Purpose: To facilitate the remote participation of US based physicists.
Open Science Grid OSG Resource and Service Validation and WLCG SAM Interoperability Rob Quick With Content from Arvind Gopu, James Casey, Ian Neilson,
D.Spiga, L.Servoli, L.Faina INFN & University of Perugia CRAB WorkFlow : CRAB: CMS Remote Analysis Builder A CMS specific tool written in python and developed.
DataTAG is a project funded by the European Union CERN, 8 May 2003 – n o 1 / 10 Grid Monitoring A conceptual introduction to GridICE Sergio Andreozzi
OSG Status and Rob Gardner University of Chicago US ATLAS Tier2 Meeting Harvard University, August 17-18, 2006.
DGAS Distributed Grid Accounting System INFN Workshop /05/1009, Palau Giuseppe Patania Andrea Guarise 6/18/20161.
1 DIRAC Project Status A.Tsaregorodtsev, CPPM-IN2P3-CNRS, Marseille 10 March, DIRAC Developer meeting.
Enabling Grids for E-sciencE Claudio Cherubino INFN DGAS (Distributed Grid Accounting System)
GridCat ITB Service Integration Plan Bockjoo Kim OSG Integration Workshop at UC Feb 15-17, 2005.
Open Science Grid Progress and Status
a VO-oriented perspective
DRM Deployment Readiness Plan
Supporting Grid Environments
Presentation transcript:

1 Grid2003 Monitoring, Metrics, and Grid Cataloging System Leigh GRUNDHOEFER, Robert QUICK, John HICKS (Indiana University) Robert GARDNER, Marco MAMBELLI, Andrew ZAHN (University of Chicago) Paul AVERY, Bockjoo KIM, Craig PRESCOTT, Jorge RODGRIGUEZ (University of Florida) Mark GREEN (University at Buffalo) Ian FISK, John WEIGAND (FERMI NATIONAL LABORATORY) Iosif LEGRAND (CERN) CHEP Interlaken, CH - Sept

Marco Mambelli, Bockjoo Kim Grid2003 Monitoring, Metrics, and Grid Cataloging System. 2 Integrated Data Monitoring and Analysis Application domain requirements  Accounting VO activity  Compare with ‘traditional batch production’ Grid environment constraints  Heterogeneity  Scalability  Limited resources (bandwidth, connectivity)  Small footprint (light, unprivileged, compatible)  Reliability Deployment  Multi-level architecture  Using existing tools  Extend and integrate  Develop new tools to fill gaps

Marco Mambelli, Bockjoo Kim Grid2003 Monitoring, Metrics, and Grid Cataloging System. 3 Example of metrics specification desired measures and short description  user category requiring it (Reporting, User, software Development)  time sampling requirement (Offline, Periodic, Event- based) Per-VO metrics:  Execution on more resources  Partial use of a resource

Marco Mambelli, Bockjoo Kim Grid2003 Monitoring, Metrics, and Grid Cataloging System. 4 Components of an Integrated Monitoring Framework Globus Meta Directory System (LDAP directory) MonALISA, Monitoring Agents in Large Integrated Service Architecture (Pub/Sub) MonALISA repository (WS/WAP) Ganglia performance monitoring (Multicast/Hierarchical) Job Monitoring System at the Advanced Center for Distributed Computing (non invasive archive) The Grid Site Status Cataloging System at iGOC (human/automatic managed DB)

Marco Mambelli, Bockjoo Kim Grid2003 Monitoring, Metrics, and Grid Cataloging System. 5 Grid Telemetry Information  Site list  Test result  Load  Jobs running  Jobs queued Heterogeneous Redundant

Marco Mambelli, Bockjoo Kim Grid2003 Monitoring, Metrics, and Grid Cataloging System. 6 Historic information analysis Processed information  Aggregation  Filters  Analysis Provided CPUs CPU.hours consumed US CMS DC04 CPU by VO

Marco Mambelli, Bockjoo Kim Grid2003 Monitoring, Metrics, and Grid Cataloging System. 7 Historic information analysis(2) Usage patterns: Effective grid storage usage Dedicated resource usage

Marco Mambelli, Bockjoo Kim Grid2003 Monitoring, Metrics, and Grid Cataloging System. 8 What is GridCat ? A Grid Site Status Cataloging System - A Web App. High level simple status map : Computing Resource Information Collector/Presenter  Static and dynamic information about all sites  Simple grid status presentation on the web Identifies site readiness A web application easy to develop and deploy Displays disk space and CPU slots Parallel information collecting, storing, and archiving among sites (BE) Web pages: In templated html+php+js(FE)

Marco Mambelli, Bockjoo Kim Grid2003 Monitoring, Metrics, and Grid Cataloging System. 9 Components Back-end  Grid Status Tests: GITS 1.4 (default/non-default ports)  Site $grid3, $tmp, $data, and $wntmp directory information  Site disk information (space used and space left)  Available CPU slots  Site status map generation (High Level Status Map)  Scripts can be used for complex grid failure mode analysis Front-end :  Displays default views  Choice of two maps  Monitoring status per site (MonAlisa, Ganglia, and MDS) Configuration-end (autoconf + Sitepop. Script)  Initial site population Collect site information Site physical/network/pixel address determination  Site parameter editing

Marco Mambelli, Bockjoo Kim Grid2003 Monitoring, Metrics, and Grid Cataloging System. 10 Flow of Operations GridCat-Runscripts Grid TestsCPU ChecksDisk Checks MonA Check OtherMDS CheckGanglia Chk Grid Env MySQL serverArchiving FE web server BE Grid Proxy for GridCatGlobus or VDT client

Marco Mambelli, Bockjoo Kim Grid2003 Monitoring, Metrics, and Grid Cataloging System. 11 BNL Tier1 US ATLAS DC2 Services Grid3 iGOC iGOC and Tier1 monitoring

Marco Mambelli, Bockjoo Kim Grid2003 Monitoring, Metrics, and Grid Cataloging System. 12 Gridcat In Action Facility--> Sites Grid test results clickables Status map on top Optional views: different information Dynamic CPU/Disk info Map : US-Korea map or Worlmap <- New release

Marco Mambelli, Bockjoo Kim Grid2003 Monitoring, Metrics, and Grid Cataloging System. 13 GridCat Site History Analysis GridCat collected DB since 08/19/04 Facility: >=1 sites Chronic problems can be identified easily New sites with grey column Operational/problem solving tool

Marco Mambelli, Bockjoo Kim Grid2003 Monitoring, Metrics, and Grid Cataloging System. 14 Gridcat DB Analysis Batch Jobs Running data disk used 08/19/ /22/04 GridCat Grid3DB

Marco Mambelli, Bockjoo Kim Grid2003 Monitoring, Metrics, and Grid Cataloging System. 15 Conclusions and Future Development Results Integrated diverse tools into a reliable monitoring infrastructure Collect, archive, and analyze user-defined metrics in an end-to-end grid computing system  resource usage by VO Testing in a production environment of considerable size  Grid3: 30 sites, ~3000 CPUs  feasibility of multi-VO Grids. Contributing to several applications Design and develop GridCat to display high level Grid status  Dynamic computing resource info monitoring at a glance  Map view to quickly identify and locate problems on the Grid  DB archiving for failure analyses  extensible for various site status cataloging purposes and operations Work in progress Increase scalability, reliability Integrate monitoring, reservation and scheduling