Artem Petrosyan (JINR), Danila Oleynik (JINR), Julia Andreeva (CERN)


Tier3 Monitoring TF

T3MON proposal (1/3)
Finalized at the beginning of 2011. Registered as an ATLAS note: http://cdsweb.cern.ch/record/1336119
«T3MON-SITE»: software suite for local site monitoring, based on the Ganglia monitoring system
  Modules (plug-ins) for local resource management systems (LRMS) and storage systems
  Development of additional plug-ins for PROOF and xRootD
  Aggregation and transmission of summary data to central monitoring
«T3MON-GLOBAL»: information system for aggregating and visualizing data from distributed Tier3 sites at the global VO level
  Should be integrated with the current ATLAS monitoring system (Dashboard)
Work is divided into two streams: validation of standard components, and development.
ATLAS Software & Computing Workshop 05.04.11
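The slides do not describe the format of the summary data sent from a site to central monitoring. As a rough illustration only, the sketch below collapses per-node samples into one site-level record and serializes it as JSON. The field names, the site name, and the choice of JSON are assumptions made for this example, not part of the T3MON design; the transport itself is left out.

```python
import json
import time
from statistics import mean


def summarize_site(site_name, node_metrics, timestamp=None):
    """Collapse per-node metric samples into one summary record for a site.

    node_metrics maps a host name to a dict of metric name -> numeric value.
    All field names are hypothetical, not a T3MON schema.
    """
    values_by_metric = {}
    for metrics in node_metrics.values():
        for name, value in metrics.items():
            values_by_metric.setdefault(name, []).append(value)

    summary = {
        name: {"sum": sum(values), "avg": mean(values), "nodes": len(values)}
        for name, values in values_by_metric.items()
    }
    return {
        "site": site_name,
        "timestamp": int(timestamp if timestamp is not None else time.time()),
        "metrics": summary,
    }


def to_message(record):
    """Serialize a summary record; a publishing agent would send this string."""
    return json.dumps(record, sort_keys=True)


# Hypothetical samples from two worker nodes.
sample = {
    "wn01.example.org": {"running_jobs": 4, "cpu_load": 0.5},
    "wn02.example.org": {"running_jobs": 2, "cpu_load": 1.5},
}
record = summarize_site("EXAMPLE-T3", sample, timestamp=1300000000)
message = to_message(record)
```

Sending only the aggregated record keeps the volume delivered to the global level independent of the number of nodes at the site.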

T3MON proposal (2/3)
To validate T3MON-SITE for different T3 configurations, a working group at JINR was proposed.
Tasks:
  Deployment of a test cluster
  Installation of the batch systems and mass storage systems reported as being used at Tier3 sites during the T3 survey (various configurations)
  Installation and configuration of data file monitoring and inventory
  Installation and configuration of Ganglia for a specific cluster setup
  Installation and validation of the additional Ganglia plug-ins for monitoring metrics collection
  Preparation of installation and configuration instructions
  Participation in the xRootD federation project within ATLAS
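The plug-in code itself is not shown in the slides. As a minimal sketch of what a Ganglia plug-in for a batch system could look like, the module below follows gmond's Python module interface (`metric_init`, per-metric callbacks, `metric_cleanup`) and counts PBS jobs by parsing `qstat` output. The metric names, the group name, and the assumed `qstat` column layout are illustrative assumptions, not the T3MON implementation.

```python
import subprocess


def count_jobs_by_state(qstat_text):
    """Count jobs per state letter in plain `qstat` output.

    Assumes the default Torque/PBS layout with six columns per job row
    (Job id, Name, User, Time Use, S, Queue); the state is the fifth field.
    """
    counts = {}
    for line in qstat_text.splitlines():
        fields = line.split()
        # Header and separator lines do not have a one-letter state in field 5.
        if len(fields) != 6 or len(fields[4]) != 1 or not fields[4].isalpha():
            continue
        counts[fields[4]] = counts.get(fields[4], 0) + 1
    return counts


def _read_qstat():
    """Run `qstat`; return an empty string if it is missing or fails."""
    try:
        return subprocess.check_output(["qstat"], universal_newlines=True)
    except (OSError, subprocess.CalledProcessError):
        return ""


def jobs_running(name):
    """gmond callback: receives the metric name, returns its current value."""
    return count_jobs_by_state(_read_qstat()).get("R", 0)


def jobs_queued(name):
    """gmond callback for the number of queued jobs."""
    return count_jobs_by_state(_read_qstat()).get("Q", 0)


def _descriptor(name, call_back, description):
    """Build one metric descriptor; names and group are hypothetical."""
    return {
        "name": name,
        "call_back": call_back,
        "time_max": 90,
        "value_type": "uint",
        "units": "jobs",
        "slope": "both",
        "format": "%u",
        "description": description,
        "groups": "pbs",
    }


def metric_init(params):
    """Called once by gmond; returns the list of metric descriptors."""
    return [
        _descriptor("pbs_jobs_running", jobs_running, "Running PBS jobs"),
        _descriptor("pbs_jobs_queued", jobs_queued, "Queued PBS jobs"),
    ]


def metric_cleanup():
    """Called by gmond on shutdown; nothing to release here."""
    pass
```

Keeping the parsing in a separate function makes it possible to validate the plug-in against captured `qstat` output on a test cluster without a running gmond.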

T3MON proposal (3/3)
Milestones
«T3MON-SITE»
  Beginning of June 2011: first prototype
  Mid-July 2011 to beginning of September 2011: "Alpha" version
  September 2011: stable version
«T3MON-GLOBAL»
  Beginning of June 2011: complete the collection of system requirements
  August - September 2011: development and debugging of the publishing agents
  October to mid-November 2011: collecting data in the central repository; integration with the Dashboard monitoring systems
  Mid-December 2011: a pilot version, collecting additional information for implementation of the final version
  February 2012 - March 2012: a final version

Team at JINR
Involved: 4 specialists, 3 young employees, 2 software experts, several volunteers
Software
  Artem Petrosyan
  Danila Oleynik
  Sergey Belov
  Vladimir Vasilyev
Installation and validation
  Nikolay Kutovskiy
  Ignat Lensky, Ivan Kadochnikov, Anatoly Yakshov
Software experts
  Lucia Valova (PROOF cluster administrator)
  Pavel Dmitrienko (local monitoring system administrator/developer)

Testbed at JINR
Organized in February 2011
Multicore nodes
Virtualization
4 virtual clusters at the moment: PBS, xRootD, PROOF, OGE/SGE
3 clusters (PBS, xRootD, OGE/SGE) monitored by Ganglia

Status
Status table (per-cell marks not preserved)
  Columns: Software, Test cluster, Ganglia, Development, Documentation
  Rows: xRootD, PROOF, PBS (Torque), OGE/SGE, Condor, LSF, Lustre
  Legend: one mark for "done", "+" for "in progress"

Plans
Setting up development infrastructure at CERN:
  Development nodes
  Repository (SVN)
  Common development framework with other applications (Dashboard, DQ2)
  Twiki documentation
xRootD & PROOF plug-ins for Nagios (how to extend monitoring systems for sites which already use Nagios)
Installation & validation: Condor, Lustre
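The planned Nagios plug-ins are not specified in the slides. As a hedged sketch of the general shape such a check could take, the code below follows the usual Nagios plug-in convention: print one status line and return exit code 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN). It only tests whether an xRootD server accepts TCP connections; the default port 1094, the one-second warning threshold, and the message wording are assumptions made for this example.

```python
import socket
import sys
import time

# Standard Nagios plug-in exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def evaluate(connected, elapsed, warn_seconds, target):
    """Map a connection attempt to a (status code, status line) pair."""
    if not connected:
        return CRITICAL, "XROOTD CRITICAL - cannot connect to %s" % target
    if elapsed > warn_seconds:
        return WARNING, "XROOTD WARNING - %s answered in %.2fs" % (target, elapsed)
    return OK, "XROOTD OK - %s answered in %.2fs" % (target, elapsed)


def check_tcp(host, port, warn_seconds=1.0, timeout=5.0):
    """Try to open a TCP connection and time how long it takes."""
    target = "%s:%d" % (host, port)
    start = time.time()
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return evaluate(False, 0.0, warn_seconds, target)
    sock.close()
    return evaluate(True, time.time() - start, warn_seconds, target)


def main(argv):
    """Entry point: argv is [program, host, optional port]."""
    if len(argv) < 2:
        print("XROOTD UNKNOWN - usage: check_xrootd HOST [PORT]")
        return UNKNOWN
    try:
        port = int(argv[2]) if len(argv) > 2 else 1094  # assumed default port
    except ValueError:
        print("XROOTD UNKNOWN - port must be an integer")
        return UNKNOWN
    status, line = check_tcp(argv[1], port)
    print(line)
    return status
```

To use it as a plug-in, the script would end with `sys.exit(main(sys.argv))`, so that Nagios receives the status through the process exit code.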

Open issues
Monitoring hooks in Athena
Collecting more information about the list of metrics to be presented at the global level
Information about the delivery frequency to the global level

Summary
Proposal is prepared and issued
Working group is organized
Test infrastructure is set up at JINR
Documentation preparation is in progress
Development of plug-ins is in progress