EGEE-II INFSO-RI-031688 Enabling Grids for E-sciencE www.eu-egee.org EGEE and gLite are registered trademarks CYFRONET site report Marcin Radecki CYFRONET.

Slides:



Advertisements
Similar presentations
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Regional Operations Dashboard Workplan Cyril.
Advertisements

EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks EGEE Grid Infrastructure and Operations Maite.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks gLite Release Process Maria Alandes Pradillo.
08/11/908 WP2 e-NMR Grid deployment and operations Technical Review in Brussels, 8 th of December 2008 Marco Verlato.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks From ROCs to NGIs The pole1 and pole 2 people.
Enabling Grids for E-sciencE COD 19 meeting, Bologna Nordic ROD experiences Michaela Lechner COD-19, Bologna.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Romanian SA1 report Alexandru Stanciu ICI.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Simply monitor a grid site with Nagios J.
INFSO-RI Enabling Grids for E-sciencE SA1: Cookbook (DSA1.7) Ian Bird CERN 18 January 2006.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks What GGUS can do for you JRA1 All hands.
EGEE-II INFSO-RI Enabling Grids for E-sciencE Postmortem gLite postmortem an examination of a dead body to determine the cause.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks gLite IPv6 compliance project tests Further.
Site Report from KEK, Japan JP-KEK-CRC-01 and JP-KEK-CRC-02 Go Iwai, KEK/CRC Grid Operations Workshop – 2007 Kungliga Tekniska högskolan, Stockholm, Sweden.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks PPS All sites Meeting: Introduction & Agenda.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks ROD model assessment ROC UKI John Walsh.
Enabling Grids for E-sciencE SA1 EGEE-II INFSO-RI The Pre-Production Service in WLCG/EGEE A. Retico, N. Thackray CERN – Geneva, Switzerland PPS.
GridPP Deployment & Operations GridPP has built a Computing Grid of more than 5,000 CPUs, with equipment based at many of the particle physics centres.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks GStat 2.0 Joanna Huang (ASGC) Laurence Field.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks WMSMonitor: a tool to monitor gLite WMS/LB.
INFSO-RI Enabling Grids for E-sciencE SA1 and gLite: Test, Certification and Pre-production Nick Thackray SA1, CERN.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Next steps with EGEE EGEE training community.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Operations Automation Team James Casey EGEE’08.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Multi-level monitoring - an overview James.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Performance Improvements to BDII - Grid Information.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Grid Site Monitoring with Nagios E. Imamagic,
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks EGEE-EGI Grid Operations Transition Maite.
Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Usage of virtualization in gLite certification Andreas Unterkircher.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks The EGEE User Support Infrastructure Torsten.
EGEE-III INFSO-RI Enabling Grids for E-sciencE Antonio Retico CERN, Geneva 19 Jan 2009 PPS in EGEEIII: Some Points.
8 th CIC on Duty meeting Krakow /2006 Enabling Grids for E-sciencE Feedback from SEE first COD shift Emanoil Atanassov Todor Gurov.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE Site Architecture Resource Center Deployment Considerations MIMOS EGEE Tutorial.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Regional Dashboard Cyril L’Orphelin - CNRS/IN2P3.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Using GStat 2.0 for Information Validation.
INFSO-RI Enabling Grids for E-sciencE An overview of EGEE operations & support procedures Jules Wolfrat SARA.
INFSO-RI Enabling Grids for E-sciencE /10/20054th EGEE Conference - Pisa1 gLite Configuration and Deployment Models JRA1 Integration.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Resource Allocation in EGEEIII Overview &
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Communication tools between Grid Virtual.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Progress report from University of Cyprus.
EGEE-II INFSO-RI Enabling Grids for E-sciencE Operations procedures: summary for round table Maite Barroso OCC, CERN
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks CIC portal Requirements from users WLCG service.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Grid Monitoring Tools E. Imamagic, SRCE CE.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Deliverable DSA1.4 Jules Wolfrat ARM-9 –
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks The EGEE User Support Infrastructure Alistair.
CERN - IT Department CH-1211 Genève 23 Switzerland Operations procedures CERN Site Report Grid operations workshop Stockholm 13 June 2007.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Regional Nagios Emir Imamagic /SRCE EGEE’09,
INFSO-RI Enabling Grids for E-sciencE gLite Certification and Deployment Process Markus Schulz, SA1, CERN EGEE 1 st EU Review 9-11/02/2005.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Overview of Operations in EGEE-III Marcin.
Operations model Maite Barroso, CERN On behalf of EGEE operations WLCG Service Workshop 11/02/2006.
EGEE-II TCD 22 nd -25 th May 2007 Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Experiences with a distributed.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Grid Configuration Data or “What should be.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks EGEE Operations: Evolution of the Role of.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks What all NGIs need to do: Helpdesk / User.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks EGEE Operational Procedures (Contacts, procedures,
II EGEE conference Den Haag November, ROC-CIC status in Italy
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Grid is a Bazaar of Resource Providers and.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks ROC model assessment AP ROC ShuTing Liao.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks The Dashboard for Operations Cyril L’Orphelin.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks COD-16 (Transition to EGEE-III) Report to.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Operations automation team presentazione.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Towards an Information System Product Team.
Scuola Grid - Martina Franca, Thursday 08 November Il Sistema di Supporto INFNGrid & GGUS ( Global Grid User.
INFSO-RI Enabling Grids for E-sciencE GOCDB2 Matt Thorpe / Philippa Strange RAL, UK.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks IT ROC: Vision for EGEE III Tiziana Ferrari.
Enabling Grids for E-sciencE EGEE-II INFSO-RI ROC managers meeting at EGEE 2007 conference, Budapest, October 1, 2007 Admin Matters Vera Hanser.
EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Nagios Grid Monitor E. Imamagic, SRCE OAT.
Regional Operations Centres Core infrastructure Centres
NGI and Site Nagios Monitoring
Brief overview on GridICE and Ticketing System
Presentation transcript:

EGEE-II INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks CYFRONET site report Marcin Radecki CYFRONET

Enabling Grids for E-sciencE EGEE-II INFSO-RI Operations Workshop, Stockholm, CYFRONET - overview Site scale –150 dual CPU nodes,  40 CPUs ia64, Itanium2  260 CPUs ia32, Xeon Production and PPS service Core Services –Toplevel BDII, WMS, RB –LFC, VOMS for Gaussian, Turbomole VOs –Regional SAM instance Tools we use for site management –OS deployment: local RPM repository + network install –YAIM for site configuraion –Ganglia for monitoring

Enabling Grids for E-sciencE EGEE-II INFSO-RI Operations Workshop, Stockholm, Daily operations Tools used in daily grid operations –Monitoring  SAM, GStat  Regional Nagios instance notifications (Core Service arrives as SMS)‏ Smart test hierarchy – always report the right problem –Documentation  Wikis: GOC, CE ROC, SEE ROC, other  Other web pages –Support  lcgrollout  Regional 1st line support team: just need to answer tickets and cooperate

Enabling Grids for E-sciencE EGEE-II INFSO-RI Operations Workshop, Stockholm, Missing features Documentation: –Proper operations manuals for grid middleware (e.g. host certificate copies on WMS)‏ –Better information access – documentation too dispersed and hidden in local pages Deployment: –Grid middleware could be more similar to standard network services used in servers  Standard/consistent paths and files for logs, pids etc.  Standard/consistent logging format so logwatch, logcheck can use them  No hard-coded values in installation and configuration scripts  Only components needed for standard operation of a service in distribution Monitoring –SAM web interface improvements:  failures for all VOs for a given site on one page,  all sensors for a given site on one page,  all sensors for the whole region on one page –Some monitoring could be done at site to avoid failures that are outside of a site

Enabling Grids for E-sciencE EGEE-II INFSO-RI Operations Workshop, Stockholm, Most Frequent Failures Most frequent scheduled interventions –HW/SW upgrade, maintenance, reconfiguration Most frequent unscheduled interventions –HW/power problems, bug in middleware update (WMS, RB)‏ Percentage of real problems detected by COD –A few percents –Majority detected and reported by regional 1st line support team + regional monitoring system

Enabling Grids for E-sciencE EGEE-II INFSO-RI Operations Workshop, Stockholm, Seamless Deployment Deploying Core Services –Seamless toplevel BDII deployment ready: we use special setup for toplevel BDII – several machines, supports failover and maintenance works –Update instances one after another Ideas for seamless deployment –Sometimes services fails just after upgrade so need for thorough testing of releases by ROC –Own testbed and repository with updates only after proper testing, so sites can safely autoupdate.

Enabling Grids for E-sciencE EGEE-II INFSO-RI Operations Workshop, Stockholm, Communication Communication with users –Direct s, via GGUS or regional helpdesk Correlation of cross-site issues –Enough for the moment –Via regional mailing list, Broadcast tool, operations meetings Points to improve communication –Tickets and notifications with operational problems should be answered more timely esp. if Central Service is failing –Better web pages with cross links with some user friendly collaboration technologies for exchanging experience and knowledge. Ticketing system are often inhuman and too formal

Enabling Grids for E-sciencE EGEE-II INFSO-RI Operations Workshop, Stockholm, Usefulness of bodies –COD – problem notification body, makes sure problems are solved, takes care of monitoring tools –1st line support team:  assists site admins till problem solution or software bug report  very helpful for non-experiences admins  overlapping with COD: some cooperation between 1 st line support and COD team on-duty could be established –Operations meeting  tackles with unsolvable problems detected by 1st line support and operational problems with core services and middleware  to make a pressure on stuck GGUS tickets