System Management Working Group. Alessandra Forti, WLCG workshop, CERN, 23 January 2007.

Presentation transcript:

System Management Working Group
Alessandra Forti
WLCG workshop, CERN, 23 January 2007

Background
Ian Bird at the Fall 2006 HEPiX and at the WLCG Management Board:
- https://indico.fnal.gov/materialDisplay.py?contribId=34&sessionId=8&materialId=slides&confId=384
- http://indico.cern.ch/materialDisplay.py?contribId=s0t14&sessionId=s0&materialId=slides&confId=a063271
Groups have been created to set up a comprehensive monitoring framework to improve the robustness of grid sites:
- System Management WG: system management and fabric monitoring tools and cookbook
- Grid Services Monitoring WG: middleware monitoring and monitoring framework
- System Analysis WG: monitoring from the application side
GSWG and SAWG presentations will follow.

Mandate: Intro
- One of the problems observed (by EGEE and LCG) in providing a reliable grid service is the reliability of the local fabric services at participating sites.
- The SMWG should bring together the existing expertise in different areas of fabric management to build a common repository of tools and knowledge for the benefit of the HEP system managers' community.
- The idea is neither to present all possible tools nor to create new ones, but to recommend specific tools for specific problems according to the best practices already in use at sites.
- Although this group is proposed in order to help improve grid site reliability, the results should be useful to any site running similar local services.
- Two areas should be improved by the group: tools and documentation.

Mandate: Goals
- Improve the overall level of grid site reliability, focussing on improving system management practices and sharing expertise, experience and tools
- Provide a repository
  - Management tools
  - Fabric monitoring sensors
  - HOWTOs
- Provide site manager input to requirements on grid monitoring and management tools
- Propose existing tools to the grid monitoring working group as solutions to general problems
- Produce a Grid Site Fabric Management cook-book
  - Recommend basic tools to cover essential practices, including security management
  - Discover what the common problems for sites are and document how experienced sites solve them
  - Document a collation of best practices for grid sites
- Point out holes in existing documentation sets
- Identify training needs
  - To be addressed in a workshop or by EGEE, for example?

Preliminary list of areas and tools
System Management Areas
- Filesystems: ext(2,3), XFS, NFS, AFS, dCache, DPM
- Networking: interfaces, IPs, routers, gateways, NAT
- Databases: MySQL, Oracle, LDAP, gdbm
- Processes: system, users monitoring
- Servers: http, dhcp, dns, ldap, sendmail or other, sshd, (grid)ftp, rfio
- Batch systems: LSF, Torque, Maui, BQS, Sun Grid Engine, Condor
- Security: login access pool accounts, certificates management and monitoring, non-required services, ports list backups, monitoring (file systems, processes, networking), log files (grid services included)
Common Fabric Monitoring and Management Tools
- Monitoring: Ganglia, Nagios, Ntop, home grown, SAM, GridICE, Lemon
- Management: Cfengine, NPACI Rocks, Kickstart, Quattor
- Security: iptables, rootkit, tripwire, nmap, ndiff, tcpdump, syslog, yummit
- Grid Configuration: YAIM, Quattor

Mandate: Interaction with GSWG
- Some of the areas covered by this group overlap with those of the Grid Services Monitoring Working Group, particularly the local fabric monitoring area.
- The two groups are required to work in close contact, and the boundaries and division of responsibility should be discussed between them.
- The SMWG should act as a bridge between the system managers and the developers in the GSMWG, giving feedback on the monitoring tools and sensors used.
- It is important that work is not duplicated.

Group Organisation
Chairs:
- Alessandra Forti (University of Manchester)
- Michel Jouvin (LAL)
The group organisation is a big question mark at the moment, as it depends very much on the number of people and the quality (i.e. dedicated time) of participation.
- To be sustainable in the long term it has to be lightweight and loosely bound, i.e. people joining and leaving according to their availability. However, this might not be feasible at the beginning, when the initial structure has to be set up and a smaller core of dedicated people among the loosely bound is needed.

Further Information
Group mandate link:
- https://uimon.cern.ch/twiki/bin/view/LCG/SystemManagementWGMandate
Mailing list for the group:
If you want to contribute, contact:
It would be useful to know your areas of expertise.