RSV: OSG Grid Monitoring and User Customizable Views Rob Quick, Arvind Gopu, and Soichi Hayashi High Performance Distributed Computing Location: Munich,

RSV: OSG Grid Monitoring and User Customizable Views
Rob Quick, Arvind Gopu, and Soichi Hayashi
High Performance Distributed Computing
Location: Munich, Germany
Date: June 10,
6/24/2016 HPDC - MLA

GOC Staff (Back: Fred, Jenny, Soichi, Tom, Kyle; Front: Arvind, Elizabeth, Chris, Rob)

What we’ll be covering…
• Goals of the RSV Project
• Local Structure and Initial Deployment
• Central Collection and WLCG SAM Interoperability
• Data Presentation
• Next Steps

Initial Goals of RSV
• Put equal monitoring into the hands of the local resource administrator
• Make a simple and flexible probe structure
• Provide independent scheduling and collection infrastructure (decoupled from the actual RSV probe test)
• Provide data to WLCG for Availability and Reliability calculations

Goals as RSV Matured
• Interact with local fabric monitoring
• Recruit ‘experts’ to create probes
• Make a flexible central display of collected data
• Improve WLCG transport reliability

RSV Client

Deployment
• Quick adoption by ATLAS and CMS
  ◦ Due to WLCG Availability and Reliability
• General OSG adoption outside of LCG-related resources is still slow
• Views of data outside of WLCG SAM and GridView were primitive
• Initial version had some reliability issues and was difficult to configure
  ◦ These have been addressed in RSV V2 or are being addressed in RSV V3

Central Collection
• Uses Gratia for transport and collection of probe results
  ◦ Mechanism that holds records until they can be transmitted, protecting against outages on either side
  ◦ Collection Database
• OSG Information Management DB
  ◦ Determines which records are from valid OSG resources
  ◦ Determines which OSG sites should publish to WLCG (changes left to the admin)
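The "hold records until they can be transmitted" behaviour above can be illustrated with a small store-and-forward buffer. This is only a sketch of the general technique, not Gratia's actual implementation: the class name `StoreAndForwardBuffer`, the `send` callback, and the record fields are all hypothetical.

```python
from collections import deque
from typing import Callable, Deque, Dict

Record = Dict[str, str]


class StoreAndForwardBuffer:
    """Hold probe records locally until the collector accepts them."""

    def __init__(self, send: Callable[[Record], bool]) -> None:
        # `send` is a hypothetical transport callback: it returns True when
        # the collector acknowledged the record, False when it is unreachable.
        self._send = send
        self._pending: Deque[Record] = deque()

    def submit(self, record: Record) -> None:
        """Queue a new record, then try to deliver everything queued."""
        self._pending.append(record)
        self.flush()

    def flush(self) -> int:
        """Deliver queued records in order; stop at the first failure."""
        delivered = 0
        while self._pending:
            if not self._send(self._pending[0]):
                break  # collector is down: keep the record for a later retry
            self._pending.popleft()
            delivered += 1
        return delivered

    def pending(self) -> int:
        """Number of records still waiting for delivery."""
        return len(self._pending)
```

A record is removed from the queue only after the transport reports success, so an outage on the collector side delays delivery without losing results. A real deployment would also persist the queue to disk to survive a restart of the sending side.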

WLCG SAM Interoperability
• Probe output based on specification set forth by Grid Monitoring Working Group
  ◦ Joint project by EGEE and OSG
• Uses Critical/Warning/Unknown/OK
  ◦ Allows use in existing fabric monitoring
• Transmitted via ActiveMQ to WLCG
Pic: James Casey
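The four status values are the same ones used by the Nagios plugin convention (exit codes 0 to 3), which is one reason probes of this kind can plug into existing fabric monitoring. The sketch below assumes that convention. The latency check, its thresholds, and the one-line output format are hypothetical illustrations, not the Grid Monitoring Working Group specification.

```python
from enum import IntEnum
from typing import Optional


class Status(IntEnum):
    # Numeric values follow the Nagios plugin exit-code convention.
    OK = 0
    WARNING = 1
    CRITICAL = 2
    UNKNOWN = 3


def classify_latency(seconds: Optional[float],
                     warn: float = 5.0,
                     crit: float = 30.0) -> Status:
    """Hypothetical check: map a measured response time to a status."""
    if seconds is None:
        return Status.UNKNOWN  # the measurement itself could not be made
    if seconds >= crit:
        return Status.CRITICAL
    if seconds >= warn:
        return Status.WARNING
    return Status.OK


def format_result(metric: str, status: Status, detail: str) -> str:
    """Illustrative one-line summary; the real output format is defined
    by the working group specification, not by this sketch."""
    return f"{metric} {status.name}: {detail}"
```

A probe written this way would print `format_result(...)` and then exit with `int(status)`, so a Nagios-style scheduler can interpret the result without knowing anything else about the probe.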

RSV Status in SAM

OSG Status to GridView

Data Presentation
Everybody gets so much information all day long that they lose their common sense.
--Gertrude Stein (1874 – 1946)
Now that we have all this useful information, it would be nice to do something with it. (Actually, it can be emotionally fulfilling just to get the information. This is usually only true, however, if you have the social life of a kumquat.)
--Unix Programmer's Manual
The revolution has to be customized.
--Scott Rosenberg

Goals of MyOSG Presentation Layer
• Consolidate data sources in OSG
• Provide data in ways that are useful to the users
• Do not make another “dashboard”
• Replace VORS monitoring
• Allow users to integrate the information into their normal daily workflow

MyOSG Status History

Drilldown on Issue

MyOSG Availability Graphs

MyOSG UWA Used with iGoogle

MyOSG UWA Used with Netvibes

MyOSG - Universal Widget API
• Programmatic access to all status information
• Allows you to create your own view of OSG status data and integrate it with your other web/desktop/dashboard mechanisms
  ◦ Netvibes, Google Personalized Homepage, Windows Vista, Apple Dashboard, Opera, iPhone (and other mobile devices)
• If you don’t use one of the above widget technologies, a simple XML format is also available
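As a rough illustration of consuming status data as XML, the sketch below filters a status document for resources that are not OK. The element and attribute names (`ResourceStatus`, `Resource`, `name`, `Status`) and the sample document are invented for this example and are not the actual MyOSG schema. Only the Python standard library is used.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

# Hypothetical sample document; the real MyOSG XML layout may differ.
SAMPLE = """<ResourceStatus>
  <Resource name="SITE_A"><Status>OK</Status></Resource>
  <Resource name="SITE_B"><Status>CRITICAL</Status></Resource>
  <Resource name="SITE_C"><Status>WARNING</Status></Resource>
</ResourceStatus>"""


def resources_needing_attention(xml_text: str) -> List[Tuple[str, str]]:
    """Return (resource name, status) pairs for every non-OK resource."""
    root = ET.fromstring(xml_text)
    flagged = []
    for resource in root.findall("Resource"):
        # A missing <Status> element is treated as UNKNOWN.
        status = resource.findtext("Status", default="UNKNOWN")
        if status != "OK":
            flagged.append((resource.get("name", ""), status))
    return flagged
```

A script along these lines could feed a personal page, a cron-driven e-mail, or another dashboard, which is the kind of integration into a normal daily workflow that the presentation layer aims for.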

RSV Phase III
• More probes / re-write some probes
  ◦ Security Probes
  ◦ Infrastructure Probes (VOMS, GUMS, BDII)
• Complete VORS replacement
• Improve stability
• Configuration / Updating / Re-running
• Unified Management Console
• Robot certificates
• Project Plan

Questions?