Published by Austin Barton. Modified over 9 years ago.
1
Customer Engagement Workshop: IT Service Continuity
Phoenix, Aston, 6th May 2015
Paul Gant, Head of BCM Assurance
David Davies, BCM Assurance Consultant
2
Agenda
11:00 Registration, refreshments and networking.
11:30 Why get fit, anyway?
11:50 Fictitious live incident.
12:10 Post incident review.
12:30 Steps to success.
12:50 Questions & answers.
13:00 Lunch, tours, event close.
13:30 BCM Assurance 1-2-1 sessions by appointment.
3
Why get fit, anyway?
4
Introducing BCM Assurance – your personal trainers
5
What if?
6
Real Recovery (Invocation) is like a Battle
YOUR ENEMIES: (Lack of) time. You can’t recover what you haven’t backed up. You can’t upgrade recovery technology during an invocation.
YOUR FRIENDS: Phoenix. Your preparation.
7
What does “Preparation” involve? It’s not just about the technology! But aren’t policies, analysis, plans and reports only there to satisfy the auditor? Is there any rhyme or reason to them?
8
IT Service Continuity Management: Priorities, Dependencies, Plans, Testing, Maintenance
9
1. What’s needed first? (Priorities)
10
2. What rests on what? (Dependencies)
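One way to make "what rests on what" concrete is to record the dependencies as a graph and derive a recovery order from it. The sketch below is illustrative only: the service names and their dependencies are hypothetical and do not come from the workshop.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical services. Each key rests on the services in its set.
dependencies = {
    "network": set(),
    "storage": {"network"},
    "directory": {"network"},
    "database": {"storage", "directory"},
    "order_system": {"database"},
}


def recovery_order(deps):
    """Return an order in which each service follows everything it rests on."""
    return list(TopologicalSorter(deps).static_order())


order = recovery_order(dependencies)
print(order)
```

If the dependencies contain a loop (A rests on B, B rests on A), `static_order()` raises `graphlib.CycleError`, which is itself a useful finding for the analysis.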
11
3. Make a plan (Plans)
12
4. See if it works (Testing)
13
5. Keep it up-to-date (Maintenance)
14
What goes wrong? Issues reported in the media
DATACOM co-location datacentre flood, Melbourne, Australia, March 2010: Heavy rain broke a ceiling panel and poured water into the data centre. Water damaged SANs, servers and routers. All equipment impacted by a 12-hour power outage.
Camera Corner / Connecting Point datacentre fire, Green Bay, Wisconsin, USA, 19th March 2008: Fire alarms but no fire suppression. 75 hosted servers destroyed. “10 day outage” reported, with 98% of services resumed by 1st April.
15
Phoenix Standby Reasons
16
Phoenix Invocation Reasons
18
The recurring dangers that we see
IT recovery requirements haven’t been agreed with the business (through a BIA).
IT recovery strategy isn’t joined up (i.e. a full end-to-end solution isn’t there).
Strategy isn’t supported by plans and isn’t tested rigorously enough (resulting in inefficiencies and failures during actual recovery).
19
Fictitious Live Incident (Why have a personal trainer to help you?)
20
Warehouse and second server room (ground floor)
Backup SAN and tapes
Offices and server room, 2nd (top) floor
CRITICAL SYSTEMS: Recovery Time Objective 24 hours; Recovery Point Objective 24 hours (disk to disk daily)
NON CRITICAL SYSTEMS: Recovery Time Objective 5 days; Recovery Point Objective 1 day (local tape) and 7 days (offsite tape)
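A Recovery Point Objective can be checked with simple arithmetic: the data at risk is whatever was written since the last good copy. The sketch below assumes the 24-hour RPO for critical systems from the slide and the scenario's 08:07 incident time. The dates and the backup completion times are hypothetical.

```python
from datetime import datetime, timedelta

RPO_CRITICAL = timedelta(hours=24)  # critical systems, disk to disk daily


def data_loss_window(last_good_copy: datetime, incident: datetime) -> timedelta:
    """Data written after the last good copy is what would be lost."""
    return incident - last_good_copy


def meets_rpo(last_good_copy: datetime, incident: datetime,
              rpo: timedelta = RPO_CRITICAL) -> bool:
    """True if the loss window is no longer than the RPO."""
    return data_loss_window(last_good_copy, incident) <= rpo


# Hypothetical dates and backup times, chosen only for illustration.
incident = datetime(2015, 5, 6, 8, 7)
last_night = datetime(2015, 5, 5, 22, 0)    # nightly copy completed
night_before = datetime(2015, 5, 4, 22, 0)  # used if last night's copy failed

print(data_loss_window(last_night, incident), meets_rpo(last_night, incident))
print(data_loss_window(night_before, incident), meets_rpo(night_before, incident))
```

Under these assumptions a single failed nightly copy stretches the loss window from 10 hours 7 minutes to 34 hours 7 minutes, which breaches a 24-hour RPO.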
21
Warehouse and second server room (ground floor)
Backup SAN and tapes
Offices and server room, 2nd (top) floor
1 Gbps
CRITICAL SYSTEMS: Recovery Time Objective 24 hours; Recovery Point Objective 24 hours (disk to disk daily)
NON CRITICAL SYSTEMS: Recovery Time Objective 5 days; Recovery Point Objective 1 day (local tape) and 7 days (offsite tape)
22
08:07 Fire
Warehouse and second server room (ground floor)
Backup SAN and tapes
Offices and server room, 2nd (top) floor
CRITICAL SYSTEMS: Recovery Time Objective 24 hours; Recovery Point Objective 24 hours (disk to disk daily)
NON CRITICAL SYSTEMS: Recovery Time Objective 5 days; Recovery Point Objective 1 day (local tape) and 7 days (offsite tape)
23
08:07 Fire. 12:15 Servers onsite.
Warehouse and second server room (ground floor)
Backup SAN and tapes
Offices and server room, 2nd (top) floor
CRITICAL SYSTEMS: Recovery Time Objective 24 hours; Recovery Point Objective 24 hours (disk to disk daily)
NON CRITICAL SYSTEMS: Recovery Time Objective 5 days; Recovery Point Objective 1 day (local tape) and 7 days (offsite tape)
24
08:07 Fire. 12:15 Servers onsite. 12:45 Exec Report.
Warehouse and second server room (ground floor)
Backup SAN and tapes
Offices and server room, 2nd (top) floor
CRITICAL SYSTEMS: Recovery Time Objective 24 hours; Recovery Point Objective 24 hours (disk to disk daily)
NON CRITICAL SYSTEMS: Recovery Time Objective 5 days; Recovery Point Objective 1 day (local tape) and 7 days (offsite tape)
25
08:07 Fire. 12:15 Servers onsite. 12:45 Exec Report. 13:15 Start recovery.
Warehouse and second server room (ground floor)
Backup SAN and tapes
Offices and server room, 2nd (top) floor
CRITICAL SYSTEMS: Recovery Time Objective 24 hours; Recovery Point Objective 24 hours (disk to disk daily)
NON CRITICAL SYSTEMS: Recovery Time Objective 5 days; Recovery Point Objective 1 day (local tape) and 7 days (offsite tape)
26
08:07 Fire. 12:15 Servers onsite. 12:45 Exec Report. 13:15 Start recovery.
27
08:07 Fire. 12:15 Servers onsite. 12:45 Exec Report. 13:15 Start recovery. 09:30 Server recovered?
28
08:07 Fire. 12:15 Servers onsite. 12:45 Exec Report. 13:15 Start recovery. 09:30 Server recovered? 11:45 Recovery stalled.
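The incident timeline can be compared against the 24-hour Recovery Time Objective for critical systems with a short calculation. The sketch below rests on two assumptions not stated on the slides: that the 09:30 and 11:45 entries fall on the day after the fire (they follow the 13:15 start of recovery), and that the RTO clock starts at the time of the fire. The calendar dates are arbitrary.

```python
from datetime import datetime, timedelta

RTO_CRITICAL = timedelta(hours=24)  # critical systems, from the scenario

# Arbitrary dates; only the times of day come from the slides.
# Assumption: 09:30 and 11:45 are on the day after the fire.
fire = datetime(2015, 5, 6, 8, 7)
events = [
    ("Servers onsite", datetime(2015, 5, 6, 12, 15)),
    ("Exec Report", datetime(2015, 5, 6, 12, 45)),
    ("Start recovery", datetime(2015, 5, 6, 13, 15)),
    ("Server recovered?", datetime(2015, 5, 7, 9, 30)),
    ("Recovery stalled", datetime(2015, 5, 7, 11, 45)),
]

# Assumption: the RTO is measured from the fire, not from the invocation.
deadline = fire + RTO_CRITICAL

for label, when in events:
    elapsed = when - fire
    status = "within RTO" if when <= deadline else "RTO missed"
    print(f"{when:%H:%M} {label}: {elapsed} after the fire ({status})")

# How far past the deadline the last recorded event falls.
overrun = events[-1][1] - deadline
print("Overrun at last event:", overrun)
```

Under these assumptions, recovery did not begin until 5 hours 8 minutes after the fire, and the final event falls 3 hours 38 minutes past the 24-hour deadline.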
29
Post Incident Review (What are the consequences of being unfit?)
30
Post Incident Review
What went well? (Where were they fit?)
What went badly? (Where were they unfit?)
What could the IT manager have done differently during the recovery?
What could the IT manager have done differently before the recovery?
31
IT Service Continuity Issues
Have you experienced any of the issues raised?
Difficulty in getting board engagement.
No business requirements for IT recovery (i.e. no BIA).
Single points of failure in key skill sets.
Lack of recovery documentation (perhaps no spare time to write it?).
Lack of formal testing and test reporting.
Any other issues?
32
The Barriers and Results What’s stopping you / stopped you from making changes? What would happen if changes aren’t made and you invoke? What would happen if you do make the changes?
33
Steps to Success (How to become IT service continuity fit.)
34
What if?
35
The Steps to Successful IT Service Continuity
1. Engagement and sponsorship at a strategic level.
2. Balance between the technology and ITSC management.
3. Do all of ITSC, and run it as a repeating programme.
36
1. Strategy: Talk the Language of the Business
38
1. Strategy: Engage with the Executive Team
Does the Executive Team know:
What are the impacts if IT fails?
What are the risks associated with IT failure?
What are the RTO and RPO of services, and what do these terms mean?
What is the recovery and hand back process?
39
2. Balance Technology with ITSC Management: Priorities, Dependencies, Plans, Testing, Maintenance
40
3. Do all of the Programme Steps, and Repeat
[Figure: BC Readiness against Time, with labels Trigger, PEAK, PEAK. Programme steps: Business Impact Analysis, IT Service Continuity Plan, IT Recovery Testing. Priorities, Dependencies, Plans, Testing, Maintenance.]
41
3. Do all of the Programme Steps, and Repeat
[Figure: BC Readiness against Time, with labels Trigger, PEAK, PEAK. Programme steps: Business Impact Analysis, IT Service Continuity Plan, IT Recovery Testing. Priorities, Dependencies, Plans, Testing, Maintenance.]
42
What if?
43
Trap 1: The Scope Trap
44
Trap 2: The Audit Trap
45
Trap 3: The Importance and Urgency Trap
46
Trap 4: The Gambler’s (or Optimist’s) Trap
47
Trap 5: The Hero Trap
48
Any Questions?
49
Thank you for participating. Lunch is now ready. Would you like a tour or a meeting?