INFSO-RI-508833 Enabling Grids for E-sciencE www.eu-egee.org Experience with monitoring of Prague T2 site Tomáš Kouba NEC 2007, Varna, Bulgaria 11.09.2007.

Presentation transcript:


Introduction

2 sites involved in the EGEE and WLCG projects in Prague:
– farm golias (praguelcg2)
    • about 150 nodes with 450 cores
    • CE, 2xSE, many WNs, PBSPro head node
– farm skurut (prague_cesnet_lcg2)
    • CE, SE, regional BDII
– core infrastructure for VOs voce and auger:
    • LFC catalogue, VOMS server, lcg RB, glite WMS

Flexible monitoring is crucial for reliability (see

Nagios I

Why
– de facto standard in monitoring
– open source
– easy to write new sensors
– static configuration is not a problem

Other competitors
– ganglia
– cacti
– zenoss (built on ZOPE)
– zabbix (rapid development, graphing features, less robust)
– moodss
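To illustrate why new sensors are easy to write, here is a minimal sketch of a Nagios-style check in Python. It is not one of the plugins used at the site. It relies only on the standard plugin convention: one line of output, optional performance data after a `|`, and an exit code of 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN). The function names and the default thresholds are assumptions made for this example.

```python
import os

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def classify_load(load1, warn, crit):
    """Map a 1-minute load average to a Nagios status code and an output line."""
    if load1 >= crit:
        status, label = CRITICAL, "CRITICAL"
    elif load1 >= warn:
        status, label = WARNING, "WARNING"
    else:
        status, label = OK, "OK"
    # Text after '|' is performance data in the form label=value;warn;crit.
    message = f"LOAD {label} - load1={load1:.2f} | load1={load1:.2f};{warn};{crit}"
    return status, message


def main(warn=4.0, crit=8.0):
    """Read the local load average, print one status line, return the exit code."""
    try:
        load1 = os.getloadavg()[0]
    except (OSError, AttributeError):
        # Load average is not available on this platform.
        print("LOAD UNKNOWN - cannot read load average")
        return UNKNOWN
    status, message = classify_load(load1, warn, crit)
    print(message)
    return status

# When installed as a plugin, the script would end with:
#     import sys; sys.exit(main())
```

Keeping the threshold logic in `classify_load` separate from the measurement makes the check testable without a running Nagios instance.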

Nagios II - addons

Addons:
– best source is
– nuvola (better HTML and CSS look for Nagios)
– nagiosql (web frontend for generating configuration from an SQL database)
– NagiosReport (developed locally; summarizes problems at the site, taking its information from the Nagios log files and status files)
– NagiosGrapher (generates graphs from the output of Nagios plugins)

NagiosReport example

Nagios summary generated at 09/11/ :10:02 in seconds.
================================================================================
Hosts in trouble
Hosts in downtime (not monitored): golias01, golias15, golias16, golias17, golias59, golias97, golias99
================================================================================
downtimes:
==========
golias01: Host is temporarily out of service because of missing parts. It may be repaired.
golias15: The machine is unavailable because it serves as a remote syslog server
golias16: The machine is temporarily unavailable because it was moved to another network - testing of a Cisco router
golias17: Host taken out of service for testing gLite 3.1 on SLC4
golias59: Taken out of service for spare parts
golias97: The node currently does not exist.
golias99: The node currently does not exist.
================================================================================
Hosts with problem occured in last 8 hours
================================================================================
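The source of NagiosReport is not shown in the slides. The following is only a rough sketch of how such a summary could be derived from a Nagios status file. It assumes the usual `status.dat` layout of `hoststatus { key=value ... }` blocks. The sample data and the function names are invented for illustration.

```python
import re

# Fabricated sample in the layout of a Nagios status file (hypothetical values).
SAMPLE_STATUS = """
hoststatus {
    host_name=golias01
    current_state=1
    scheduled_downtime_depth=1
    }

hoststatus {
    host_name=golias02
    current_state=0
    scheduled_downtime_depth=0
    }

hoststatus {
    host_name=golias03
    current_state=1
    scheduled_downtime_depth=0
    }
"""

# One match per hoststatus block; the group captures the key=value lines.
BLOCK_RE = re.compile(r"^\s*hoststatus\s*\{(.*?)^\s*\}", re.S | re.M)


def parse_hosts(text):
    """Return one dict of key/value strings per hoststatus block."""
    hosts = []
    for body in BLOCK_RE.findall(text):
        fields = {}
        for line in body.splitlines():
            key, sep, value = line.strip().partition("=")
            if sep:
                fields[key] = value
        hosts.append(fields)
    return hosts


def summarize(text):
    """Split non-OK hosts into (in_trouble, in_downtime) lists of host names."""
    in_trouble, in_downtime = [], []
    for host in parse_hosts(text):
        name = host.get("host_name", "?")
        if int(host.get("scheduled_downtime_depth", 0)) > 0:
            in_downtime.append(name)   # downtime is scheduled: not alarming
        elif int(host.get("current_state", 0)) != 0:
            in_trouble.append(name)    # state 0 means UP
    return in_trouble, in_downtime


trouble, downtime = summarize(SAMPLE_STATUS)
print("Hosts in trouble:", ", ".join(trouble))
print("Hosts in downtime (not monitored):", ", ".join(downtime))
```

Listing hosts in scheduled downtime separately mirrors the report above, where machines that are known to be out of service are shown but not counted as problems.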

Nagios III

Plugins
– default plugins (part of the default Nagios installation)
    • ping, disk, procs, load, tcp, swap, ldap, gentoo_glsa, gentoo_service_rc_all
– SRCE plugins (developed by Emir Imamagic)
    • cert, dpm, dpns, edg_broker, globus_gram2, gridftp, lfc, srm, srm_ping, voms
    • executed in an SLC3 UI installation in a chroot environment
– RAL plugins
    • lcg_same (by Chris Brew)
– locally developed
    • hpacucli, ups, jobs, gstat

Generating configuration I

– We have a hardware database for hardware management
– It was extended with service definitions and hardware-service relations
– Definition of WSDL interface to the database:

Generating configuration II

– Python client for generating the actual configuration
    • uses the ZSI Python SOAP bindings (the older bindings pySOAP, SOAPpy etc. were not sufficient)
– technically, the WSDL file is generated from a C header file (this is much easier for the developer than writing WSDL by hand)
– the project's home page is at www.nagiosexchange.com under the name SiteQuery
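The generation step can be sketched as follows. This is not the SiteQuery client: the SOAP query is replaced by an in-memory list so that the example is self-contained. The record layout, the addresses, the check commands and the template names `generic-host` and `generic-service` are assumptions. Only the `define host` / `define service` object syntax is standard Nagios.

```python
# Hypothetical records, standing in for what a query to the hardware
# database might return: one entry per host, with its services.
RECORDS = [
    {
        "name": "golias01",
        "address": "192.0.2.10",
        "services": [("PING", "check_ping"), ("LOAD", "check_load")],
    },
    {"name": "golias02", "address": "192.0.2.11", "services": []},
]

HOST_TEMPLATE = """define host {{
    use        {template}
    host_name  {name}
    address    {address}
}}
"""

SERVICE_TEMPLATE = """define service {{
    use                  {template}
    host_name            {name}
    service_description  {description}
    check_command        {command}
}}
"""


def render_config(records, host_template="generic-host",
                  service_template="generic-service"):
    """Render Nagios object definitions for every host and its services."""
    parts = []
    for rec in records:
        parts.append(HOST_TEMPLATE.format(
            template=host_template, name=rec["name"], address=rec["address"]))
        for description, command in rec.get("services", []):
            parts.append(SERVICE_TEMPLATE.format(
                template=service_template, name=rec["name"],
                description=description, command=command))
    return "\n".join(parts)


print(render_config(RECORDS))
```

Because the configuration is regenerated from the database, Nagios' static configuration stays a derived artifact rather than something edited by hand.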

Future work

– Represent dependencies in the database to limit the number of false alerts
– Create additional sensors for our hardware (air conditioning unit, diesel power unit)
– Coordinate our work with the monitoring group of LCG/HEPiX at
– Present another client for generating cfengine/quattor configuration