
1 ATLAS Off-Grid sites (Tier-3) monitoring
A. Petrosyan, on behalf of the ATLAS collaboration
GRID’2012, 17.07.2012, JINR, Dubna

2 Goals of the project
– Provide a reasonable monitoring solution for ‘off-grid’ sites (unplugged, geographically close computing resources)
– Monitor the computing facilities of local groups with collocated storage systems (Tier-1+Tier-3, Tier-2+Tier-3)
– Present Tier-3 site activity at the global level
– Monitor data transfers across the XRootD federation

3 Tier-3 site monitoring levels
– Monitoring of the local infrastructure, for site administration
– A central system for monitoring the VO activities at Tier-3 sites

4 Objectives of the local monitoring system at a Tier-3 site
– Detailed monitoring of the local fabric
– Monitoring of the batch system
– Monitoring of job processing
– Monitoring of the mass storage system
– Monitoring of the VO computing activities on the local site

5 Objectives of the global Tier-3 monitoring
– Monitoring of the VO usage of the Tier-3 resources in terms of data transfer, data access, and job processing
– Quality of the provided service, based on the job processing and data transfer monitoring metrics

6 Site monitoring
– Based on the Ganglia monitoring system
– Collects basic metrics using Ganglia sensors
– Plugin system for monitoring specific metrics (a minimal example module is sketched below)
– PostgreSQL to aggregate data
– More details for each package at https://svnweb.cern.ch/trac/t3mon/wiki/T3MONHome
– Monitoring modules available for Condor, Lustre, PBS, PROOF, XRootD; each has a plugin to deliver data to the global level
– Examples of the UI for different systems at http://vm01.jinr.ru/ganglia/
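
The site-level plugins hook into gmond through Ganglia's Python metric-module interface. As an illustration only (the actual T3Mon modules are more elaborate and are described on the Twiki above), a minimal module could look like this; the metric name and the way the batch-system value is obtained are hypothetical:

import subprocess

def running_jobs(name):
    # Callback invoked by gmond; returns the number of running PBS jobs.
    try:
        out = subprocess.check_output(['qstat', '-r'])  # assumes Torque/PBS client tools
        return max(len(out.decode().splitlines()) - 2, 0)  # roughly skip the qstat header
    except Exception:
        return 0

def metric_init(params):
    # Called once by gmond when the module is loaded; returns the metric descriptors.
    return [{
        'name': 't3mon_pbs_running_jobs',   # hypothetical metric name
        'call_back': running_jobs,
        'time_max': 90,
        'value_type': 'uint',
        'units': 'jobs',
        'slope': 'both',
        'format': '%u',
        'description': 'Running PBS jobs on this Tier-3 site',
        'groups': 't3mon',
    }]

def metric_cleanup():
    # Called by gmond on shutdown; nothing to clean up here.
    pass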

7 Data flow for the site monitoring
– Common UI for various data sources
– A small core with separate modules allows installing only the software that is needed
– Delivery to the global level can be switched off (see the configuration sketch below)
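
The delivery switch can be pictured as a single configuration flag that the publisher checks before sending anything to the global level. This is a hypothetical illustration; the real T3Mon configuration layout is documented on the Twiki above, and the file path and option names here are invented:

from configparser import ConfigParser

def global_delivery_enabled(path='/etc/t3mon/t3mon.conf'):  # hypothetical path
    cfg = ConfigParser()
    cfg.read(path)
    # Default to 'off' if the file, section or option is missing.
    return cfg.getboolean('global', 'deliver', fallback=False)

if __name__ == '__main__':
    if global_delivery_enabled():
        print('summary would be sent to the MSG system')
    else:
        print('global delivery switched off; data stays on the site')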

8 Global monitoring
– Ganglia as the executor
– MSG as the transmitting system
– Publisher on the local site: executed by gmond, communicates with the local DB and sends information to the MSG system (sketched below)
– Backend: consumer(s) of the messages at CERN; data popularity and job statistics are presented via Dashboard
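
To make the publisher role concrete, here is a sketch of a script that gmond could execute periodically: it reads the latest summary from the local PostgreSQL aggregation database and pushes it to the MSG broker over STOMP. The table layout, topic name, broker host and credentials are all hypothetical:

import json
import psycopg2
import stomp

def publish_summary():
    # Read the most recent job summary from the local aggregation DB.
    db = psycopg2.connect(dbname='t3mon', user='t3mon', host='localhost')
    cur = db.cursor()
    cur.execute("""
        SELECT site_name, jobs_ok, jobs_stopped, jobs_aborted,
               events_processed, bytes_read, active_users
        FROM job_summary
        ORDER BY report_time DESC LIMIT 1
    """)
    row = cur.fetchone()
    if row is None:
        return
    keys = ('site_name', 'jobs_ok', 'jobs_stopped', 'jobs_aborted',
            'events_processed', 'bytes_read', 'active_users')
    message = dict(zip(keys, row))

    # Publish the summary to the MSG (ActiveMQ) broker.
    conn = stomp.Connection([('msg.example.cern.ch', 61613)])  # hypothetical broker
    conn.connect('t3mon', 'secret', wait=True)
    conn.send(destination='/topic/t3mon.summary', body=json.dumps(message))
    conn.disconnect()

if __name__ == '__main__':
    publish_summary()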

9 Data flow for the global monitoring

10 Data flow for PROOF and Condor
– PostgreSQL for data aggregation on the local site
– Ganglia UI to present data popularity at the site level
– Ganglia gmond executes the summary gathering
– The summary is delivered to the Dashboard historical views once per hour
– Data sent to the global level (see the example record below):
  – Job status: OK, stopped, aborted
  – Site name
  – Time of report
  – Number of processed events
  – Bytes read
  – Number of active users
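
The hourly record can be pictured as a small flat structure with exactly these fields. A sketch, with hypothetical field names and example values (the JSON body is what the publisher would hand to the MSG system):

import json
import time

summary = {
    'site_name': 'JINR-T3',            # hypothetical site
    'report_time': int(time.time()),
    'jobs_ok': 120,
    'jobs_stopped': 3,
    'jobs_aborted': 1,
    'events_processed': 4500000,
    'bytes_read': 732 * 1024**3,
    'active_users': 7,
}
print(json.dumps(summary, indent=2))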

11 Data flow for XRootD
– Both the summary and the detailed-events gatherer are implemented as Linux daemons
– Summary data goes directly to Ganglia
– File-transfer data can be stored in a local PostgreSQL database and then presented via Ganglia (see the sketch below)
– Detailed data can be delivered to ActiveMQ directly
– Data sent to the global level:
  – Source domain, host and IP address
  – Destination domain, host and IP address
  – User
  – File name and size
  – Bytes read and written
  – Time the transfer started and finished
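
For the local PostgreSQL path, each detailed transfer can be stored as one row carrying the fields listed above. A sketch, with a hypothetical table layout and column names:

import psycopg2

def store_transfer(record):
    # record is a dict with the fields listed on this slide.
    db = psycopg2.connect(dbname='t3mon', user='t3mon', host='localhost')
    with db, db.cursor() as cur:
        cur.execute("""
            INSERT INTO xrootd_transfers
                (domain_from, host_from, ip_from,
                 domain_to,   host_to,   ip_to,
                 username, filename, filesize,
                 bytes_read, bytes_written, time_started, time_finished)
            VALUES (%(domain_from)s, %(host_from)s, %(ip_from)s,
                    %(domain_to)s,   %(host_to)s,   %(ip_to)s,
                    %(username)s, %(filename)s, %(filesize)s,
                    %(bytes_read)s, %(bytes_written)s,
                    %(time_started)s, %(time_finished)s)
        """, record)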

12 Tier-3 monitoring status
– The full chain, from the Tier-3 site to the Dashboard, has been developed
– Site-level presentation via Ganglia Web 2.0
– Global-level presentation of PROOF jobs via Dashboard Historical Views
– Tier-3 site to DQ2 popularity: message formats agreed, data is being delivered, the consumer on the DQ2 side is in the testing stage
– The T3Mon software has been installed on pilot sites
– The distribution is available via our repository: https://svnweb.cern.ch/trac/t3mon/wiki/YumConfigure
– We welcome more sites to try it and to send feedback to our support list: t3mon-jinr-@googlegroups.com

13 XRootD transfers monitoring
– Goal: present transfers between servers and sites in the federation via one UI
– Messages from the XRootD servers are collected by the T3Mon UDP collector and then sent to ActiveMQ
– Data is stored in HBase
– Hadoop processing is used to prepare data summaries
– Web services for data export
– Dashboard transfer interface as the UI

14 Data flow for the XRootD federation monitoring

15 T3Mon UDP message collector
– Can be installed anywhere; implemented as a Linux daemon
– Extracts transfer information from several messages and composes a complete file-transfer message
– Sends the complete transfer message to ActiveMQ (a daemon skeleton is sketched below)
– The message includes:
  – Source domain, host and IP address
  – Destination domain, host and IP address
  – User
  – File name and size
  – Bytes read/written
  – Time the transfer started/finished
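
A skeleton of such a daemon is shown below. For clarity the incoming fragments are treated as JSON; the real XRootD detailed-monitoring stream is a binary UDP format whose decoding is not reproduced here, and the port, broker and queue names are hypothetical:

import json
import socket
import stomp

UDP_PORT = 9930                      # hypothetical listening port
AMQ_DEST = '/queue/t3mon.xrootd'     # hypothetical ActiveMQ destination

def run_collector():
    conn = stomp.Connection([('amq.example.org', 61613)])  # hypothetical broker
    conn.connect('t3mon', 'secret', wait=True)

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(('', UDP_PORT))

    transfers = {}                   # partial records keyed by transfer id
    while True:
        data, _addr = sock.recvfrom(65535)
        fragment = json.loads(data)              # placeholder for the real decoding
        record = transfers.setdefault(fragment['transfer_id'], {})
        record.update(fragment)

        # Once the closing fragment has arrived, the record is complete:
        # send it to ActiveMQ and forget it.
        if 'time_finished' in record:
            conn.send(destination=AMQ_DEST, body=json.dumps(record))
            del transfers[fragment['transfer_id']]

if __name__ == '__main__':
    run_collector()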

16 AMQ2Hadoop collector
– Can be installed anywhere; implemented as a Linux daemon
– Listens to an ActiveMQ queue
– Extracts the messages
– Inserts them into the HBase raw table (sketched below)
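
A sketch of this consumer, assuming stomp.py (version 7 or later, frame-based listener API) and the happybase HBase client; queue, table, row-key and column names are hypothetical:

import json
import time
import happybase
import stomp

AMQ_DEST = '/queue/t3mon.xrootd'     # hypothetical queue, as in the collector sketch
RAW_TABLE = 'xrootd_raw'             # hypothetical HBase raw table

class RawTableWriter(stomp.ConnectionListener):
    def __init__(self, table):
        self.table = table

    def on_message(self, frame):
        msg = json.loads(frame.body)
        # Row key: source host plus start time keeps one host's transfers together.
        row_key = '{}-{}'.format(msg['host_from'], msg['time_started'])
        self.table.put(row_key, {
            'raw:json': frame.body,                    # keep the full message
            'raw:bytes_read': str(msg['bytes_read']),  # plus a few typed columns
        })

def main():
    hbase = happybase.Connection('hbase.example.org')   # hypothetical Thrift gateway
    conn = stomp.Connection([('amq.example.org', 61613)])
    conn.set_listener('raw-writer', RawTableWriter(hbase.table(RAW_TABLE)))
    conn.connect('t3mon', 'secret', wait=True)
    conn.subscribe(destination=AMQ_DEST, id='t3mon-raw', ack='auto')
    while True:                 # the listener does the work; just stay alive
        time.sleep(60)

if __name__ == '__main__':
    main()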

17 Hadoop processing
– Reads the raw table
– Prepares a data summary: 10-minute statistics as a structure:
  – From
  – To
  – Sum of bytes read
  – Sum of bytes written
  – Number of files read
  – Number of files written
– Inserts the summary data into the summary table
– MapReduce: we use Java; we are also working on enabling Pig routines (an equivalent aggregation is sketched in Python below)
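
The production job is written in Java; purely to keep the examples in one language, the same 10-minute aggregation can be sketched as a plain HBase scan in Python with happybase. Table, column and row-key conventions are hypothetical and the file counting is simplified:

import json
from collections import defaultdict

import happybase

def summarize(hbase_host='hbase.example.org'):
    conn = happybase.Connection(hbase_host)
    raw = conn.table('xrootd_raw')
    summary = conn.table('xrootd_summary')

    buckets = defaultdict(lambda: {'bytes_read': 0, 'bytes_written': 0,
                                   'files_read': 0, 'files_written': 0})
    for _key, data in raw.scan():
        msg = json.loads(data[b'raw:json'])
        bucket_start = int(msg['time_started']) // 600 * 600   # 10-minute bucket
        key = (msg['domain_from'], msg['domain_to'], bucket_start)
        b = buckets[key]
        b['bytes_read'] += msg['bytes_read']
        b['bytes_written'] += msg['bytes_written']
        b['files_read'] += 1 if msg['bytes_read'] else 0       # simplified counting
        b['files_written'] += 1 if msg['bytes_written'] else 0

    for (src, dst, start), b in buckets.items():
        summary.put('{}|{}|{}'.format(src, dst, start),
                    {'sum:' + name: str(value) for name, value in b.items()})

if __name__ == '__main__':
    summarize()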

18 Storage2UI data export
– Web service
– Extracts data from the storage
– Feeds the Dashboard XBrowse UI (a minimal sketch follows)
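
A minimal sketch of such a service using Flask and happybase; the exact JSON layout expected by the Dashboard XBrowse UI is not reproduced here, and the endpoint, table and field names are hypothetical:

from flask import Flask, jsonify, request
import happybase

app = Flask(__name__)

@app.route('/transfers')
def transfers():
    src_filter = request.args.get('src')                # optional ?src=<domain> filter
    conn = happybase.Connection('hbase.example.org')    # hypothetical Thrift gateway
    rows = []
    for key, data in conn.table('xrootd_summary').scan():
        src, dst, bucket = key.decode().split('|')
        if src_filter and src != src_filter:
            continue
        rows.append({
            'src': src,
            'dst': dst,
            'bucket_start': int(bucket),
            'bytes_read': int(data[b'sum:bytes_read']),
            'bytes_written': int(data[b'sum:bytes_written']),
        })
    return jsonify({'transfers': rows})

if __name__ == '__main__':
    app.run(port=8080)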

19 Status
– In the prototype stage:
  – Hadoop processing is executed manually
  – Simulated data
– UI: http://xrdfedmon-dev.jinr.ru/ui/#date.from=201206210000&date.interval=0&date.to=201206220000&grouping.dst=(host)&grouping.src=(host)
– We are ready to start testing on a real federation

20 Thanks for your attention

