TSA1.4 Infrastructure for Grid Management, Tiziana Ferrari, EGI.eu, EGI-InSPIRE – SA1 Kickoff Meeting

Presentation transcript:

Goal
The purpose of this task is the deployment of the infrastructure for Grid management. It consists of a set of services and tools needed by the NGI/EIRO Operations Centres, regionally and/or centrally, for running the Grid software services, for Grid monitoring (including SLA and security monitoring), and for ongoing Grid management.

Internal O-N and O-E tasks
– O-E-1 GOCDB: 0.5 FTE, UK
– O-E-3 Monitoring infrastructure: 0.25 FTE CERN, 0.25 FTE GRNET
– O-E-4 Operations portal and dashboard: 0.25 FTE, FR
– O-E-12 Tools for network troubleshooting and monitoring: 0.25 FTE, IT
– O-N-1 Grid topology database
– O-N-3 Grid repositories (for operational tools)
– O-N-4 Operations portal and dashboard

O-E-1 GOCDB deployment: Current situation
[Diagram; labels: CENTRAL, GOCDB4, WSGUI, GOCDB module, REGION / NGI, Local users, GOCDB3, central users, EGI tools, central tools, Read/Write, Read only, GOCDBPI_v4, GOCDBPI]
Courtesy of G. Mathieu

GOCDB deployment: Wanted situation
[Diagram; labels: CENTRAL, GOCDB4, WSGUI, GOCDB module, REGION / NGI, Local users, INPUT, central users, EGI tools, central tools, Read/Write, Read only, GOCDBPI_v4]
Release timeline: first half of July, if well planned and well announced; the accounting portal still relies on GOCDB3.

O-E-3 Monitoring
Validation of Nagios instances
– Nagios migrated on May 26th:
  ROC: ITALY, UKI
  NGI: NGI_Greece
– Nagios migrated on June 1st:
  ROC Central Europe, ROC IGALC, ROC Latin America, ROC South Western Europe
Remaining instances will be migrated during June:
– ROC: AP, Canada, France, Germany/Switzerland, NE, Russia, SEE
– NGI: NGI_PL, NGI_France, NGI_BY, NGI_SK, NGI_SI, NGI_HR, NGI_CZ (currently running on CERN Nagios instances)
Courtesy of J. Casey, D. Collados

O-E-3 Monitoring (cont.)
Nagios-based availability/reliability reports compared to SAM reports
– Statistics comparable (small improvement with Nagios by its design)
SAM
– Proposed date for switching off: June 15
MyEGI portal deployment model:
– Central project instance (CERN) + NGI instances
Monitoring of monitoring
– Requested feedback and ideas for more services/probes to deploy (got some input from the ENOC)
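To illustrate the kind of figures such availability/reliability reports contain, here is a minimal sketch that derives both numbers from a series of equally long status samples. The formulas are an assumption based on commonly used definitions (availability counts time up over known time; reliability additionally discounts scheduled downtime); the slides do not state them, and the status labels and function name are hypothetical.

```python
from collections import Counter


def availability_reliability(samples):
    """Return (availability, reliability) for equally long status samples.

    Each sample is one of "OK", "DOWN", "SCHEDULED", "UNKNOWN"
    (hypothetical labels). Assumed definitions, not taken from the slides:
      availability = OK / (total - UNKNOWN)
      reliability  = OK / (total - UNKNOWN - SCHEDULED)
    A value of None means the denominator was zero.
    """
    counts = Counter(samples)
    known = len(samples) - counts["UNKNOWN"]
    unscheduled = known - counts["SCHEDULED"]
    availability = counts["OK"] / known if known else None
    reliability = counts["OK"] / unscheduled if unscheduled else None
    return availability, reliability


# 24 hourly samples: 20 up, 2 unscheduled down, 1 scheduled down, 1 unknown
day = ["OK"] * 20 + ["DOWN"] * 2 + ["SCHEDULED"] + ["UNKNOWN"]
availability, reliability = availability_reliability(day)
print(f"availability={availability:.3f} reliability={reliability:.3f}")
```

Under these assumed definitions, scheduled downtime lowers availability but not reliability, which is why the two figures are usually reported side by side.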

O-E-3 Monitoring: Central DBs status
Central Oracle DBs currently deployed at CERN:
– Aggregated Topology Provider (ATP)
– Metric Description Database (MDDB)
– Metric Results Store (MRS)
Evolution during Y1:
– Improve profiles management in MDDB
– Implement history functionality in ATP
– Integrate and deploy the three DBs into one single account
– Maintenance and bug fixing

O-E-3 Monitoring: Messaging
Currently: 3 sites with brokers + 1 broker for APEL accounting
Y1 evolution:
– It was an aim of the general broker network to support authorization as required by APEL
– APEL to migrate once that has been achieved
– Until then APEL will run one or more brokers for its own use, depending on STFC's view of the risks of a single point of failure

O-E-4 Operations portal and dashboard
2 central Web applications:
– Historical portal
– Recent portal: hosting the Operations Dashboard module
This module will be offered in a regional package (June 8th).
Other features will be migrated progressively to the new portal and integrated step by step into the regional package.
Courtesy of C. L’Orphelin

O-E-4 Central Instance of the dashboard: Architecture

O-E-4 Availability and failover
High availability context:
– Each configuration of Lavoisier is copied to SVN
– The MySQL database is backed up; restoring the backup takes 30 min
– The Web machine is hosted in a cluster
No automatic failover yet. The DNS switch and the replication of data will be studied during the 1st year.
The central instance could be used in case of trouble with the regional instances.

O-E-4: Migration plans
Migration of the remaining key features to Symfony and the new portal:
– VO ID Card
– Broadcast tool
– User tracking
– VO / Sites resources browser
Propose regional modules for these features where possible.

O-E-12 Network tools
DownCollector
– Polling tool reporting on the reachability of GOCDB services (tests on TCP ports)
– Central server running the probes, star-based architecture
– EGEE-III instance: migrated to GARR (Italy)
– Will be accessible through a new portal dedicated to the O-E-12 task, to be set up at http://eginet.garr.it
– High availability currently not provided (to be defined in Y1)
– Originally developed by IN2P3 CC-Lyon (EGEE SA2) -> GARR
Courtesy of M. Reale
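The polling idea behind a tool like DownCollector can be sketched in a few lines: one central process opens a TCP connection to each registered service port and records whether it succeeded. This is an illustrative sketch only, not DownCollector's actual code; the function names are hypothetical, and a real deployment would take its endpoint list from GOCDB rather than from a hard-coded list.

```python
import socket


def tcp_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False


def poll(endpoints, timeout=3.0):
    """Probe every (host, port) pair from one central place (star layout)."""
    return {
        (host, port): tcp_reachable(host, port, timeout)
        for host, port in endpoints
    }
```

Calling `poll([("host.example.org", 8443)])` (a placeholder endpoint) returns a dictionary mapping each endpoint to `True` or `False`. Because all probes run from a single server, a failure may reflect a problem near the poller rather than at the site, which is one reason high availability of the central instance matters.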

O-E-12 Network troubleshooting
perfSONAR-lite TroubleShooting Services (TSS)
– Started in EGEE-III, entirely designed by SA2
– Developments led by DFN/Erlangen
– Central server orchestrating on-demand end-to-end (e2e) measurements between light probes hosted by Grid sites:
  Bandwidth measurements
  DNS lookup
  Traceroute
  Port testing
  Ping
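The on-demand model can be pictured as a small dispatcher: a request names a measurement type and a target, and the matching handler is run. The sketch below is a simplified assumption of how such dispatch might look, not the perfSONAR-lite TSS implementation. It runs on a single machine, covers only two of the listed measurement types (DNS lookup and port testing), and all names in it are hypothetical.

```python
import socket
from dataclasses import dataclass


@dataclass
class Request:
    kind: str      # "dns" or "port" in this sketch
    target: str    # host name or IP address
    port: int = 0  # only used by the port test


def dns_lookup(req):
    """Resolve the target to a sorted list of unique addresses."""
    infos = socket.getaddrinfo(req.target, None)
    return sorted({info[4][0] for info in infos})


def port_test(req):
    """Report whether a TCP connection to target:port succeeds."""
    try:
        with socket.create_connection((req.target, req.port), timeout=3.0):
            return True
    except OSError:
        return False


# Registry of supported measurement types (hypothetical names).
HANDLERS = {"dns": dns_lookup, "port": port_test}


def run_measurement(req):
    """Dispatch one on-demand request to the matching handler."""
    try:
        handler = HANDLERS[req.kind]
    except KeyError:
        raise ValueError(f"unsupported measurement: {req.kind}") from None
    return handler(req)
```

In the architecture described above, the handler would run on a light probe at a Grid site and the request would come from the central server; keeping the handlers in a registry makes it straightforward to add the remaining measurement types.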

O-E-12 perfSONAR-lite TSS

O-E-12 perfSONAR-lite TSS: future
– An initial deployment strategy within EGI is required; O-E-12 testing and deployment campaigns in the next weeks
– Core development needed to further improve security related to available-bandwidth tests and to simplify AA
– DFN and CNRS are interested in being engaged in future development

Y1 Milestones and deliverables
– MS401 Operational Tools regionalisation status (INFN), PM1, in collaboration with TSA1.5
– Contribution to MSA406 “Deployment plan for the distribution of operational tools to the NGIs/EIROs” (see TSA1.3)
– Contribution to MSA404 “Operational Level Agreements (OLAs)” (see TSA1.8)

Short/medium term issues
– Migration to the final Nagios server layout, upgrade of the dashboard and gstat, phasing out of GOCDB3
– Is the current failover/HA of central operational tools sufficient?
– Measurement of availability/reliability of tools (central/regional MyEGI portals, dashboard, GGUS, regional helpdesk, central/regional monitoring infrastructure, ...)
– Contribution to the definition of OLAs concerning tools