Automated Grid Monitoring for LHCb Experiment through HammerCloud
Bradley Dice, Valentina Mancinelli

Project Overview
- Use HammerCloud to:
  - Test LHCb data storage access
  - Ensure that new releases of user analysis programs function successfully
- Why?
  - Temporarily disable sites with unreliable storage
  - Prioritize bug fixing by the most common problems
  - Keep the science moving!
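The idea of temporarily disabling sites with unreliable storage can be illustrated with a minimal sketch. The slides do not say how HammerCloud makes this decision, so the input format, the function name `unreliable_sites`, and both thresholds below are assumptions chosen only for illustration.

```python
def unreliable_sites(results, max_failure_rate=0.5, min_jobs=5):
    """Return the sites whose storage-access failure rate exceeds a threshold.

    results: dict mapping site name -> (failed_jobs, total_jobs).
    max_failure_rate and min_jobs are hypothetical tuning values,
    not settings taken from HammerCloud.
    """
    flagged = []
    for site, (failed, total) in sorted(results.items()):
        # Skip sites with too few test jobs to judge reliably.
        if total < min_jobs:
            continue
        if failed / total > max_failure_rate:
            flagged.append(site)
    return flagged


# Made-up test results for three placeholder sites.
example = {
    "SITE-A": (1, 20),
    "SITE-B": (9, 10),
    "SITE-C": (2, 2),
}
print(unreliable_sites(example))
```

Requiring a minimum number of jobs keeps a site from being flagged on the basis of one or two unlucky test jobs.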

Work falls into three categories: Front End, Back End, and Grid Tests

Front End (User Interface)
- Shows a list of current and past tests and offers management tools
- Progress:
  - Added data visualizations to categorize errors and the sites they affect (figure on slide)
  - Cleaned up menu structures
  - Made job colors easier to understand
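The data behind a visualization of errors and the sites they affect could be prepared roughly as follows. This is a sketch under assumptions: the `(site, error_category)` record format, the category strings, and the function name `summarize_errors` are hypothetical and not taken from the HammerCloud code.

```python
from collections import Counter


def summarize_errors(failures):
    """Count failures per error category and per (site, category) pair.

    failures: list of (site, error_category) tuples, one per failed job.
    """
    by_category = Counter(category for _, category in failures)
    by_site_and_category = Counter(failures)
    return by_category, by_site_and_category


# Made-up failure records for two placeholder sites.
failures = [
    ("SITE-A", "file open timeout"),
    ("SITE-B", "file open timeout"),
    ("SITE-B", "file not found"),
    ("SITE-B", "file open timeout"),
]
by_category, by_site_and_category = summarize_errors(failures)

# The most frequent categories come first, which supports prioritizing fixes.
for category, count in by_category.most_common():
    print(category, count)
```

The per-pair counter can feed a site-by-category chart, while the per-category counter gives the ranking used to decide which problem to fix first.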

Back End (Test Manager)
- Interfaces between Ganga (to submit grid jobs) and Django (to display data)
- Progress:
  - HammerCloud sites automatically update to match the WLCG topology
  - Ganga jobs report back detailed information for analysis
  - The back end produces plots showing jobs by status: complete, running, scheduled, or failed (figure on slide)
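Keeping the list of HammerCloud sites in step with the WLCG topology amounts to comparing two sets of site names. The sketch below assumes both lists are already available as plain strings; how they are fetched and stored is not described in the slides, and `diff_sites` is a hypothetical helper.

```python
def diff_sites(known_sites, topology_sites):
    """Compare locally known sites with the current topology.

    Returns (to_add, to_remove): sites present only in the topology,
    and sites present only in the local list.
    """
    known = set(known_sites)
    topology = set(topology_sites)
    return sorted(topology - known), sorted(known - topology)


# Placeholder site names, not real grid sites.
to_add, to_remove = diff_sites(
    known_sites=["SITE-A", "SITE-B", "SITE-OLD"],
    topology_sites=["SITE-A", "SITE-B", "SITE-NEW"],
)
print("add:", to_add)
print("remove:", to_remove)
```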

Grid Tests (Getting Results)
- Detecting and classifying data access failures is the key purpose of HammerCloud
- Progress:
  - A postprocessor has to detect whether files were accessed locally or pulled from another site (failover)
  - Failover detection is presently difficult. Ongoing collaboration with the Ganga developers should help resolve this challenge.
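One possible way to tell local access from failover is to compare the host of each file URL a job opened with the domain of the site where the job ran. This is not the postprocessor described in the slides, which note that failover detection is still difficult. It is a sketch that assumes the opened URLs can be recovered from the job output and that a site's storage hosts share its domain. The function name `classify_access` and all hostnames are hypothetical.

```python
from urllib.parse import urlparse


def classify_access(site_domain, opened_urls):
    """Classify a job's data access as 'local', 'failover', or 'unknown'.

    site_domain: DNS domain assumed to cover the site's storage hosts.
    opened_urls: file URLs the job actually opened (assumed available).
    """
    if not opened_urls:
        return "unknown"
    site_domain = site_domain.lower()
    for url in opened_urls:
        host = urlparse(url).hostname or ""
        is_local = host == site_domain or host.endswith("." + site_domain)
        if not is_local:
            # At least one file came from a host outside the site's domain.
            return "failover"
    return "local"


print(classify_access(
    "example-site.org",
    ["root://se01.example-site.org//lhcb/data/file.dst"],
))
print(classify_access(
    "example-site.org",
    ["root://se.other-site.net//lhcb/data/file.dst"],
))
```

A domain comparison is only a heuristic: a site whose storage is hosted under a different domain would be misclassified, so a real implementation would more likely need an explicit mapping from storage elements to sites.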

Future Steps
- Retrieve more job information (metrics on CPU time, etc.)
- Provide grid site status information to RSS (Resource Status System)
- Create data visualizations requested by LHCb
- Document code in TWiki for future developers