CERN IT Department, CH-1211 Genève 23, Switzerland, www.cern.ch/it. Managing changes. Olof Bärring, WLCG 2009, 14th November 2008.

Presentation transcript:

Managing changes - 1
Managing changes
Olof Bärring
WLCG 2009, 14th November 2008

Managing changes - 2
To change or not …?
Is change really so bad?
– The paradigm for LHC start-up has been stability, stability, stability
– “It’s working! Don’t touch it”
– Truth is that everything changes…
    Configuration (s/w, h/w): every day
    Linux updates: every week
    Linux OS: every ~18-24 months
    Middleware: every now and then
Or is it just the change control that is bad?
– Assume that change is needed for improving something
    Functionality for end-users
    Service operation and stability
– It’s unavoidable: changes are here to stay, we just have to learn to live with them
Managing changes rather than avoiding them?
All the time!

Managing changes - 3
When? Deployment strategies
Baby-steps
– Trickle of changes one-by-one
– Each of which may be treated independently
– If something goes wrong, easy to roll back
Periodic scheduled
– Aggregation of changes
– Freeze, test and certify
– If something goes wrong, rollback may be difficult
Big-bang
– Basically the same as periodic scheduled changes though not necessarily ‘periodic’
– Accumulate changes for a long period, which may include major upgrades to more than one component

Managing changes - 4
When? Deployment strategies (list as on slide 3)
But… it usually only breaks after a while due to destructive interference of accumulated changes

Managing changes - 5
When? Deployment strategies (list as on slide 3)
But… lacks the virtue of establishing change as routine. Between two big-bangs there may be a universe

Managing changes - 6
When? Deployment strategies (list as on slide 3)
But… the goal for m/w provider should be to allow for revocable updates

Managing changes - 8
How? (ITIL) flow
Requested → Ready for evaluation → Ready for decision → Authorized → Scheduled → Implemented → Closed
Example: Set up new FTS
– Request OK as new release is certified
– High impact, low risk: experiments can test and migrate at their convenience
– VOs and other sites agree
– Request hardware
– Plan installation and announce
– Service ready, VOs test and migrate
– All VOs migrated, issue new RFC for closing SLC3 service
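The flow on this slide can be read as a simple linear state machine. The following is a minimal sketch under that assumption; the names `RfcState`, `Rfc` and `advance` are hypothetical and not taken from any ITIL tool, and rejection or rollback paths (not shown on the slide) are left out.

```python
from dataclasses import dataclass, field
from enum import Enum


class RfcState(Enum):
    """States of a Request For Change, in the order shown on the slide."""
    REQUESTED = "Requested"
    READY_FOR_EVALUATION = "Ready for evaluation"
    READY_FOR_DECISION = "Ready for decision"
    AUTHORIZED = "Authorized"
    SCHEDULED = "Scheduled"
    IMPLEMENTED = "Implemented"
    CLOSED = "Closed"


# Enum members iterate in definition order, which gives the linear flow.
FLOW = list(RfcState)


@dataclass
class Rfc:
    """Hypothetical RFC that records every state transition it goes through."""
    title: str
    state: RfcState = RfcState.REQUESTED
    history: list = field(default_factory=list)

    def advance(self, note: str) -> RfcState:
        """Move to the next state and log (old state, new state, note)."""
        index = FLOW.index(self.state)
        if index == len(FLOW) - 1:
            raise ValueError("RFC is already closed")
        new_state = FLOW[index + 1]
        self.history.append((self.state.value, new_state.value, note))
        self.state = new_state
        return new_state


rfc = Rfc("Set up new FTS")
rfc.advance("first transition, free-text note")
print(rfc.state.value, rfc.history)
```

Keeping the note with each transition is one way to make the workflow self-documenting; a real tracker would presumably also record who made the transition and when.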

Managing changes - 9
RFC record
Time: Nov :04:55
ID: RFC345879
Who: Joe
Change type: standard (pre-auth)
Why: fix lousy streaming video perf
What: net.ipv4.tcp_window_scaling = 1 → 0
Affected CIs: all diskservers
Tracking? RFC archive...
! CERN → T1, 12/11
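The fields of the RFC record above map naturally onto a small data structure that can be written to an archive. This is only a sketch: the class name `RfcRecord`, the JSON layout and the timestamp value are assumptions (the timestamp on the slide is truncated, so a placeholder is used).

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime


@dataclass(frozen=True)
class RfcRecord:
    """Hypothetical archive entry with the fields shown on the slide."""
    rfc_id: str
    time: datetime
    who: str
    change_type: str
    why: str
    what: str
    affected_cis: tuple

    def to_json(self) -> str:
        """Serialize the record to one JSON line for an RFC archive."""
        data = asdict(self)
        data["time"] = self.time.isoformat()  # datetime is not JSON-native
        return json.dumps(data, sort_keys=True)


record = RfcRecord(
    rfc_id="RFC345879",
    time=datetime(2008, 1, 1, 12, 0, 0),  # placeholder, not the slide's timestamp
    who="Joe",
    change_type="standard (pre-auth)",
    why="fix lousy streaming video perf",
    what="net.ipv4.tcp_window_scaling: 1 -> 0",
    affected_cis=("all diskservers",),
)
print(record.to_json())
```

Even a pre-authorized change like this one leaves a record, so that somebody investigating a later problem can find it in the archive.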

Managing changes - 10
Tracking
Process has to be lightweight and to a large part automated
– Ideally a workflow with predefined and self-documenting state transitions
    E.g. extract list of affected Configuration Items (nodes, devices, …)
    May require a deep level of site details
    Twiki may not be the most appropriate implementation
– Access to change tracker must be authenticated and secure
All changes are tracked, also standard (pre-authorized) changes
– If something starts to go wrong at Site A on Day X
    Anything changed at Site A on Day X?
    Anything changed at Site B-Z on Day X?
    Anything changed in the network on Day X?
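The three questions at the end of this slide are lookups in the change archive by day and location. A minimal sketch, assuming an in-memory list of records; the `Change` class, the `changes_on` helper and all sample data are hypothetical, and the network is modelled as just another "site" label.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Change:
    """Hypothetical tracked change; 'site' may also name the network."""
    rfc_id: str
    site: str
    day: date
    summary: str


def changes_on(changes, day, site=None):
    """Return the changes made on `day`, optionally restricted to one site."""
    return [
        c for c in changes
        if c.day == day and (site is None or c.site == site)
    ]


day_x = date(2008, 1, 15)  # made-up "Day X"
changes = [
    Change("RFC-A1", "Site A", day_x, "kernel parameter change"),
    Change("RFC-B1", "Site B", day_x, "middleware update"),
    Change("RFC-N1", "network", day_x, "router firmware upgrade"),
    Change("RFC-A2", "Site A", date(2008, 1, 16), "disk server added"),
]

print([c.rfc_id for c in changes_on(changes, day_x, site="Site A")])
print([c.rfc_id for c in changes_on(changes, day_x) if c.site != "Site A"])
```

The second query only works across sites if every site records its changes, including the standard ones, in a place the others can read.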

Managing changes - 11
Tracking (list as on slide 10)
Utopia but perhaps asymptotically?

Managing changes - 12
WLCG Operations role?
The grid-wide “Change Advisory Board” (CAB) when change impacts site availability?
– Review list of Requests For Change (RFCs) with grid-level impact
– Each change is classified by the site in terms of
    Impact: to site, to the grid, to a VO, …
    Risk: likelihood of failure, ability to roll back, plan B, …
– Authorize the change
    Stakeholders agree that site can go ahead with the planning for the change
Maintain list of types for ‘standard’ changes
– Pre-authorized changes, e.g.
    Linux upgrades
    Site configuration changes
    …
Emergency changes authorized by site
– WLCG operation group meets daily but not available for a 24/7 CAB role
ITIL → GrITIL
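The authorization paths listed on this slide could be expressed as a small routing rule. This is a sketch under assumptions: the `Route` names, the contents of `STANDARD_CHANGE_TYPES`, the order of the checks, and the "site only" fallback for changes without grid-level impact are illustrative choices, not something the slide specifies.

```python
from enum import Enum


class Route(Enum):
    """Hypothetical authorization paths for an RFC."""
    PRE_AUTHORIZED = "standard change, pre-authorized"
    SITE_EMERGENCY = "emergency change, authorized by site"
    GRID_CAB = "reviewed by grid-wide CAB"
    SITE_ONLY = "handled by the site's own process"


# Assumed list of 'standard' change types, based on the slide's examples.
STANDARD_CHANGE_TYPES = {"linux upgrade", "site configuration change"}


def route_rfc(change_type: str, grid_impact: bool, emergency: bool = False) -> Route:
    """Pick who authorizes a change, given its type, impact and urgency."""
    if change_type.lower() in STANDARD_CHANGE_TYPES:
        return Route.PRE_AUTHORIZED
    if emergency:
        # The daily operations meeting cannot act as a 24/7 CAB.
        return Route.SITE_EMERGENCY
    if grid_impact:
        return Route.GRID_CAB
    return Route.SITE_ONLY


print(route_rfc("Linux upgrade", grid_impact=False).value)
print(route_rfc("new FTS service", grid_impact=True).value)
```

Whichever path is taken, the change would still be recorded in the tracker; routing decides only who authorizes it.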

Managing changes - 13
Questions? Comments?