ATLAS Off-Grid sites (Tier-3) monitoring A. Petrosyan on behalf of the ATLAS collaboration GRID’2012, 17.07.12, JINR, Dubna.

Goals of the project
– Provide a reasonable monitoring solution for 'off-grid' sites (unplugged, geographically close computing resources)
– Monitoring of the computing facilities of local groups with co-located storage systems (Tier-1+Tier-3, Tier-2+Tier-3)
– Present Tier-3 site activity at the global level
– Data transfer monitoring across the XRootD federation

Tier-3 site monitoring levels
– Monitoring of the local infrastructure for site administration
– Central system for monitoring of the VO activities at Tier-3 sites

Objectives of the local monitoring system at a Tier-3 site
– Detailed monitoring of the local fabric
– Monitoring of the batch system
– Monitoring of the job processing
– Monitoring of the mass storage system
– Monitoring of the VO computing activities on the local site

Objectives of the global Tier-3 monitoring
– Monitoring of the VO usage of Tier-3 resources in terms of data transfer, data access, and job processing
– Quality of the provided service, based on the job processing and data transfer monitoring metrics

Site monitoring
– Based on the Ganglia monitoring system
– Collects basic metrics using Ganglia sensors
– Plugin system for monitoring specific metrics (a module sketch follows this slide)
– PostgreSQL to aggregate data
– More details for each package at
– Monitoring modules available for Condor, Lustre, PBS, PROOF, XRootD; each has a plugin to deliver data to the global level
– Examples of UI for different systems at
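To illustrate the plugin mechanism, a minimal Ganglia gmond Python metric module could look like the sketch below. The metric name, the PostgreSQL database, and the table and column names are assumptions for illustration, not the actual T3Mon modules.

```python
# Minimal sketch of a gmond Python metric module (assumed names and schema).
# gmond calls metric_init() once and then the callback for each metric.
import psycopg2  # local PostgreSQL aggregation DB (assumed)

def proof_active_users(name):
    """Callback: return the number of active PROOF users (assumed query and schema)."""
    try:
        conn = psycopg2.connect("dbname=t3mon user=ganglia")
        cur = conn.cursor()
        cur.execute("SELECT count(DISTINCT username) FROM proof_sessions WHERE active")
        value = cur.fetchone()[0]
        conn.close()
        return int(value)
    except Exception:
        return 0  # never let a monitoring plugin crash gmond

def metric_init(params):
    # Descriptor format required by gmond's Python module interface
    return [{
        'name': 'proof_active_users',
        'call_back': proof_active_users,
        'time_max': 90,
        'value_type': 'uint',
        'units': 'users',
        'slope': 'both',
        'format': '%u',
        'description': 'Active PROOF users on the site (sketch)',
        'groups': 't3mon',
    }]

def metric_cleanup():
    pass
```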

Data flow for the site monitoring
– Common UI for the various data sources
– Small core with separate modules allows installing only the software that is needed
– Delivery to the global level can be switched off

Global monitoring
– Ganglia as executor
– MSG as the transport system
– Publisher on the local site: executed by gmond, communicates with the local DB and sends information to the MSG system (sketched after this slide)
– Backend: message consumer(s) at CERN; data popularity and job statistics presented via Dashboard
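A publisher of this kind could be sketched as follows, assuming the stomp.py library for the MSG/ActiveMQ transport; the broker address, credentials, queue name and payload fields are illustrative assumptions, not the production configuration.

```python
# Sketch of a site publisher run by gmond: take the latest summary prepared
# in the local DB and push it to the MSG broker over STOMP (stomp.py assumed).
# Broker, credentials, queue name and field names are illustrative assumptions.
import json
import time
import stomp

def publish(summary,
            broker=("msg-broker.example.org", 61613),   # assumed broker
            destination="/queue/t3mon.summary"):        # assumed queue
    conn = stomp.Connection([broker])
    conn.connect("t3mon", "secret", wait=True)
    conn.send(destination=destination, body=json.dumps(summary))
    conn.disconnect()

if __name__ == "__main__":
    # In the real setup the summary comes from the local aggregation DB;
    # here a hand-made record stands in for it.
    publish({
        "site": "JINR-T3",                 # assumed site name
        "timestamp": int(time.time()),
        "job_status": "ok",
        "processed_events": 120000,
        "bytes_read": 54 * 1024**3,
        "active_users": 7,
    })
```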

Data flow for the global monitoring

Data flow for PROOF and Condor
– PostgreSQL for data aggregation at the local site (an aggregation sketch follows this slide)
– Ganglia UI to present data popularity at the site level
– Ganglia gmond to execute the summary gathering
– Summary is delivered to Dashboard historical views once per hour
– Data sent to the global level:
– Job status: OK, stopped, aborted
– Site name
– Time of report
– Number of processed events
– Bytes read
– Number of active users
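As an illustration of the local aggregation step, the hourly summary could be produced by a query like the one below; the proof_jobs table, its columns and the connection string are assumptions, not the actual T3Mon schema.

```python
# Sketch of the hourly summary aggregation on the local PostgreSQL DB.
# Table and column names are assumed for illustration.
import psycopg2

SUMMARY_SQL = """
SELECT
    status,                          -- 'ok' / 'stopped' / 'aborted'
    sum(events)              AS processed_events,
    sum(bytes_read)          AS bytes_read,
    count(DISTINCT username) AS active_users
FROM proof_jobs
WHERE finished >= now() - interval '1 hour'
GROUP BY status
"""

def hourly_summary(site="JINR-T3"):          # assumed site name
    conn = psycopg2.connect("dbname=t3mon")
    cur = conn.cursor()
    cur.execute(SUMMARY_SQL)
    rows = [
        {"site": site, "job_status": status,
         "processed_events": events, "bytes_read": rbytes, "active_users": users}
        for status, events, rbytes, users in cur.fetchall()
    ]
    conn.close()
    return rows
```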

Data flow for XRootD
– Both the summary and the detailed event gatherers are implemented as Linux daemons
– Summary data goes directly to Ganglia (see the sketch after this slide)
– File transfer data can be stored in the local PostgreSQL and then presented via Ganglia
– Detailed data can be delivered to ActiveMQ directly
– Data sent to the global level:
– Source domain, host and IP address
– Destination domain, host and IP address
– User
– File, size
– Bytes read, written
– Time transfer started and finished
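For the summary path into Ganglia, one simple option is to publish values with the standard gmetric command-line tool, as in the sketch below; the metric name and the example value are assumptions.

```python
# Sketch: push an XRootD summary value into Ganglia via the gmetric CLI.
# Metric name and the way the value is obtained are assumptions for illustration.
import subprocess

def publish_to_ganglia(name, value, units, metric_type="uint32"):
    subprocess.check_call([
        "gmetric",
        "--name", name,
        "--value", str(value),
        "--type", metric_type,
        "--units", units,
    ])

# e.g. total bytes read reported by the XRootD summary stream in the last interval
publish_to_ganglia("xrootd_bytes_read", 123456789, "bytes")
```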

Tier-3 monitoring status
– The full development chain from Tier-3 site to Dashboard has been exercised
– Site-level presentation via Ganglia Web 2.0
– Global-level presentation of PROOF jobs via Dashboard Historical Views
– Tier-3 site to DQ2 popularity: formats agreed, data is being delivered, the consumer on the DQ2 side is in the testing stage
– T3Mon software was installed on pilot sites
– Distribution is available via our repository:
– We welcome more sites to try it and to send their feedback to our support list:

XRootD transfers monitoring
– Goal: present transfers between servers and sites in the federation via one UI
– Messages from XRootD servers are collected by the T3Mon UDP collector and then sent to ActiveMQ
– Data is stored in HBase
– Hadoop processing is used to prepare data summaries
– Web services for data export
– Dashboard transfer interface as UI

Data flow for the XRootD federation monitoring

T3Mon UDP messages collector
– Can be installed anywhere, implemented as a Linux daemon
– Extracts transfer info from several messages and composes a complete file transfer message
– Sends the complete transfer message to ActiveMQ
– Message includes (a collector sketch follows this slide):
– Source domain, host and IP address
– Destination domain, host and IP address
– User
– File, size
– Bytes read/written
– Time transfer started/finished
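A minimal sketch of such a collector is shown below, assuming stomp.py for the ActiveMQ side; the UDP port, queue name and the packet decoding are illustrative, and parse_packet() is a hypothetical helper, since real XRootD monitoring packets are binary and require proper decoding.

```python
# Minimal sketch of a UDP collector: receive XRootD monitoring packets,
# accumulate per-transfer information and send a complete transfer message
# to ActiveMQ once the transfer is finished.
import json
import socket
import stomp

BROKER = ("msg-broker.example.org", 61613)   # assumed broker
DESTINATION = "/queue/xrootd.transfers"      # assumed queue

def send_transfer(transfer):
    conn = stomp.Connection([BROKER])
    conn.connect("t3mon", "secret", wait=True)
    conn.send(destination=DESTINATION, body=json.dumps(transfer))
    conn.disconnect()

def run(port=9930):                           # assumed port
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("", port))
    open_transfers = {}                       # partial info keyed by (server, dictid)
    while True:
        data, addr = sock.recvfrom(65535)
        record = parse_packet(data, addr)     # hypothetical binary decoder
        key = (record["server"], record["dictid"])
        open_transfers.setdefault(key, {}).update(record)
        if "time_end" in open_transfers[key]: # transfer complete
            send_transfer(open_transfers.pop(key))

if __name__ == "__main__":
    run()
```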

AMQ2Hadoop collector
– Can be installed anywhere, implemented as a Linux daemon
– Listens to the ActiveMQ queue
– Extracts messages
– Inserts them into the HBase raw table
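A sketch of this collector, assuming stomp.py for the ActiveMQ side and happybase (the HBase Thrift client) for the storage side; broker, queue, table and column family names are assumptions.

```python
# Sketch of the AMQ2Hadoop collector: consume transfer messages from ActiveMQ
# and store them in the HBase raw table via the Thrift gateway (happybase).
import json
import time
import happybase
import stomp

class RawTableWriter(stomp.ConnectionListener):
    def __init__(self):
        self.hbase = happybase.Connection("hbase-thrift.example.org")  # assumed host
        self.table = self.hbase.table("xrootd_raw")                    # assumed table

    def on_message(self, frame):              # stomp.py >= 5 listener API
        msg = json.loads(frame.body)
        row_key = "%s-%s" % (msg["time_start"], msg["file"])
        self.table.put(row_key, {
            ("raw:%s" % k).encode(): str(v).encode() for k, v in msg.items()
        })

conn = stomp.Connection([("msg-broker.example.org", 61613)])
conn.set_listener("amq2hadoop", RawTableWriter())
conn.connect("t3mon", "secret", wait=True)
conn.subscribe(destination="/queue/xrootd.transfers", id="1", ack="auto")
while True:          # keep the daemon alive; messages are handled in the listener
    time.sleep(60)
```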

Hadoop processing
– Reads the raw table
– Prepares a data summary: 10-minute statistics with the structure:
– From
– To
– Sum of bytes read
– Sum of bytes written
– Number of files read
– Number of files written
– Inserts the summary data into the summary table
– MapReduce: we use Java; we are also working on enabling Pig routines
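The production job is a Java MapReduce program; the Python sketch below only illustrates the aggregation it performs, scanning the raw table via happybase and writing 10-minute bins into the summary table, with column names assumed from the earlier sketches.

```python
# Illustration of the 10-minute aggregation performed by the Java MapReduce job.
# This is NOT the production job: it scans the raw table directly via happybase.
from collections import defaultdict
import happybase

BIN = 600  # 10 minutes in seconds

def summarize():
    conn = happybase.Connection("hbase-thrift.example.org")   # assumed host
    raw, summary = conn.table("xrootd_raw"), conn.table("xrootd_summary")
    stats = defaultdict(lambda: {"bytes_read": 0, "bytes_written": 0,
                                 "files_read": 0, "files_written": 0})
    for _, row in raw.scan():
        start = int(row[b"raw:time_start"])
        key = (row[b"raw:domain_from"], row[b"raw:domain_to"], start // BIN * BIN)
        s = stats[key]
        s["bytes_read"] += int(row[b"raw:bytes_read"])
        s["bytes_written"] += int(row[b"raw:bytes_written"])
        # crude heuristic: count a file as read/written if any bytes moved
        s["files_read"] += 1 if int(row[b"raw:bytes_read"]) else 0
        s["files_written"] += 1 if int(row[b"raw:bytes_written"]) else 0
    for (src, dst, t), s in stats.items():
        row_key = b"%s|%s|%d" % (src, dst, t)
        summary.put(row_key, {("sum:%s" % k).encode(): str(v).encode()
                              for k, v in s.items()})

if __name__ == "__main__":
    summarize()
```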

Storage2UI data export
– Web service
– Extracts data from the storage
– Feeds the Dashboard XBrowse UI
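A minimal sketch of such an export service, using Flask and the summary table layout assumed above; the endpoint path, query parameters and JSON layout are assumptions, and the real XBrowse interface defines its own expected format.

```python
# Sketch of the Storage2UI export service: an HTTP endpoint that reads the
# summary table and returns the matching rows as JSON.
import happybase
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route("/transfers")
def transfers():
    date_from = int(request.args.get("from", 0))      # unix timestamps (assumed)
    date_to = int(request.args.get("to", 2**31))
    conn = happybase.Connection("hbase-thrift.example.org")   # assumed host
    rows = []
    for key, data in conn.table("xrootd_summary").scan():
        src, dst, ts = key.split(b"|")                 # row key layout assumed above
        if date_from <= int(ts) <= date_to:
            rows.append({
                "src": src.decode(), "dst": dst.decode(), "time": int(ts),
                "bytes_read": int(data[b"sum:bytes_read"]),
                "bytes_written": int(data[b"sum:bytes_written"]),
            })
    return jsonify(transfers=rows)

if __name__ == "__main__":
    app.run(port=8080)
```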

Status
– In the prototype stage:
– Hadoop processing is executed manually
– Simulated data
– UI: dev.jinr.ru/ui/#date.from= &date.interval=0&date.to= &grouping.dst=(host)&grouping.src=(host)
– We are ready to start testing on a real federation

Thank you for your attention