Planning for a Plan: Disaster Recovery Preparation
Alex Schenck, Solutions Engineer, Zerto
Alex Schenck
- AS/BS InfoTech, NEIT
- MS Management, WPI
- IT professional since 2006
- IT Manager, SysAdmin, Tech Support, Sales Engineer
- VCP5/VCP6
- Lives in Cumberland, RI with Melissa
What’s your plan?
How do you handle…
- …power failures?
- …hardware failures?
- …network failures?
- …software/application failures?
- …PEBKAC?
- …the “smoking hole scenario?”
- …the unknown?
If your answer is…
- Restore from backups…
- Restore from storage snapshot…
- Restore from VM snapshot…
- Hope and pray
Your organization’s data is at risk AND You are at risk
Knowledge is power
- Cost of data loss (Recovery Point Objective): more = $$$
- Cost of downtime (Recovery Time Objective): more = $$$
[Figure: timeline with time-of-day ticks from 00:00 to 18:00, with cost shown as $]
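As a rough illustration of the two cost drivers on this slide, the sketch below estimates what a single incident might cost. The function name, the linear cost model, and every dollar figure are hypothetical assumptions for illustration, not numbers from the presentation.

```python
def incident_cost(downtime_hours, data_loss_hours,
                  downtime_cost_per_hour, data_loss_cost_per_hour):
    """Estimate the cost of one incident under a simple linear model.

    downtime_hours:  how long the service was unavailable
    data_loss_hours: how much work was lost, measured as the time since
                     the last usable recovery point
    The two per-hour rates are assumptions each organization must supply.
    """
    downtime_cost = downtime_hours * downtime_cost_per_hour
    data_loss_cost = data_loss_hours * data_loss_cost_per_hour
    return downtime_cost + data_loss_cost


# Hypothetical rates: $10,000 per hour of downtime, $5,000 per hour of lost data.
nightly_backup = incident_cost(8, 24, 10_000, 5_000)    # restore from last night's backup
replication = incident_cost(0.25, 0.01, 10_000, 5_000)  # near-continuous replication

print(f"Nightly backup: ${nightly_backup:,.0f}")
print(f"Replication:    ${replication:,.0f}")
```

Real costs are rarely linear (reputation damage, contractual penalties), so treat the result as a starting point for the conversation rather than a forecast.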
Start from Scratch
- Technology is a means to an end
  - Does it provide value to the organization?
  - Does the cost of implementing the solution outweigh the cost of a disaster?
- Where are your organization’s assets?
  - Single site
  - Colocation
  - CSP/MSP
  - Public cloud
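One hedged way to frame the cost question above is annualized loss expectancy, a common risk-assessment heuristic: expected yearly loss is the cost of one incident times the number of incidents expected per year. The function names and all figures below are hypothetical.

```python
def annualized_loss(single_incident_cost, incidents_per_year):
    """Expected yearly loss: cost of one incident times its yearly frequency."""
    return single_incident_cost * incidents_per_year


def solution_pays_off(loss_without, loss_with, annual_solution_cost):
    """True if the yearly reduction in expected loss exceeds the solution's yearly cost."""
    return (loss_without - loss_with) > annual_solution_cost


# Hypothetical inputs: one major outage every four years (0.25 per year).
loss_without_dr = annualized_loss(200_000, 0.25)
loss_with_dr = annualized_loss(2_550, 0.25)

print(solution_pays_off(loss_without_dr, loss_with_dr, annual_solution_cost=30_000))
```

The comparison is only as good as the incident frequency estimate, which is usually the hardest input to justify.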
Things to consider:
- What is the problem/deficiency?
- What is the desired impact of the new solution?
- Total Cost of Ownership
- Agility/Cloud Philosophy
Know Thyself
- Consider your workloads
- Determine tiers: not all data is created or treated equally
- Determine the “cutoff point” for backup versus disaster recovery
  - Stuff you can’t live without versus stuff you can
- What can the business afford?
So what do you pick?
- Synchronous replication (RPO 0 / RTO 0)
- Asynchronous replication (RPO very low / RTO very low)
- Storage snapshots
- Backup
- Archive
A note on why BC/DR
- Backups: from the moment a point-in-time backup is created, its value diminishes with time
  - Solves a different use case
RPO and RTO
- Recovery Point Objective: total amount of data you will lose between the latest iteration of “now” and the selected recovery checkpoint
  - BC/DR: seconds (Zerto!), minutes
  - Storage snapshots: hours
  - Backups: days
- Recovery Time Objective: total amount of time it takes to restore service to stakeholders
  - Varies with solution (Zerto = minutes!)
(Image from https://www.linkedin.com/pulse/20141027141721-157452529-what-s-your-rto-rpo-for-disaster-recovery)
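The two definitions above can be made concrete by measuring what a given incident actually cost in data and time. The sketch below is a minimal illustration: the objectives are targets, while these hypothetical functions compute the achieved values to compare against them. The timestamps are invented.

```python
from datetime import datetime, timedelta


def achieved_rpo(checkpoints, failure_time):
    """Time between the failure and the newest checkpoint taken before it.

    Everything written in that window is lost on recovery.
    """
    usable = [c for c in checkpoints if c <= failure_time]
    if not usable:
        raise ValueError("no checkpoint exists before the failure")
    return failure_time - max(usable)


def achieved_rto(failure_time, service_restored_time):
    """Time between the failure and service being available again."""
    return service_restored_time - failure_time


# Hypothetical incident: nightly backups at midnight, failure at 15:00,
# service restored six hours later.
backups = [datetime(2024, 1, 1, 0, 0), datetime(2024, 1, 2, 0, 0)]
failure = datetime(2024, 1, 2, 15, 0)
restored = failure + timedelta(hours=6)

print("Data lost:", achieved_rpo(backups, failure))   # 15:00:00
print("Downtime: ", achieved_rto(failure, restored))  # 6:00:00
```

With checkpoints taken every few seconds instead of nightly, the same functions would report a data-loss window of seconds, which is the difference the slide is pointing at.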
Tiering
- Tier 1: Highest priority, highest impact
  - Represents the greatest impact to the revenue, reputation, and health of your organization should it become unavailable
- Tier 2: Important, even crucial, but perhaps not as important as Tier 1
- Tier 3: Secondary or supporting in nature to Tier 1 or Tier 2
  - Ancillary servers
Service Level Agreements for Tiers
- How far back in the past should I be able to provide BC/DR functionality for a given workload?
- What is the highest RPO between checkpoints that the organization will tolerate?
- What is the longest time interval that a given workload may go without being tested for failover functionality?
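The three SLA questions above map naturally onto three limits per tier. The sketch below is one possible way to record them and check a workload against them; the class, the function, and every threshold are hypothetical and would need to be set by each organization.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class TierSLA:
    """The three SLA questions, expressed as limits for one tier."""
    history_window: timedelta      # how far back recovery must be possible
    max_rpo: timedelta             # largest tolerated gap between checkpoints
    max_test_interval: timedelta   # longest allowed time between failover tests


# Hypothetical values for illustration only.
SLAS = {
    1: TierSLA(timedelta(days=7), timedelta(seconds=30), timedelta(days=90)),
    2: TierSLA(timedelta(days=3), timedelta(minutes=15), timedelta(days=180)),
    3: TierSLA(timedelta(days=1), timedelta(hours=24), timedelta(days=365)),
}


def sla_violations(tier, history, worst_rpo, since_last_test):
    """Return a list of the SLA limits a workload currently breaks."""
    sla = SLAS[tier]
    problems = []
    if history < sla.history_window:
        problems.append("history window too short")
    if worst_rpo > sla.max_rpo:
        problems.append("RPO exceeds limit")
    if since_last_test > sla.max_test_interval:
        problems.append("failover test overdue")
    return problems


# A hypothetical Tier 1 workload that has not been tested in 120 days.
print(sla_violations(1, timedelta(days=7), timedelta(seconds=10), timedelta(days=120)))
```

Writing the limits down in one place, in whatever form, makes it easier to notice when a workload has drifted out of its tier's agreement.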
Putting the Pieces in Place
- How much of your environment do you want to be able to fail over at once?
  - Complete site recovery, or partial recoverability only?
- What budget and staffing do you have available to support a DR environment?
- Cloud friendliness, or on-prem only?
  - “Thick provisioned DR”
  - “Thin provisioned DR”
  - Elasticity into the cloud
Q&A