Published by Avice Hill. Modified over 6 years ago.
1
Common Solutions Group Policy Discussion
Disaster Recovery
Jack Duwe
September 20, 2002
These slides are designed to stimulate conversation on how your campuses are managing disaster prevention and disaster recovery planning. I will also discuss some of the measures taken at the University of Wisconsin, and particularly at DoIT.
2
Agenda
Level-setting, terminology
Survey results
Wisconsin plan
Discussion
12/27/2018
3
Definitions
Disaster prevention
Disaster recovery planning
Business continuity planning
Business resumption planning
Crisis management
Today’s focus is on DRP

[These are my definitions. They are in alignment with each other, although with somewhat different flavors.]

This discussion focuses on IT disaster prevention and recovery, rather than the full breadth of business continuity planning. For completeness, some information is provided on business continuity. However, at UW-Madison, little effort has been devoted to areas outside IT recovery.

General definitions of these terms are:
Disaster prevention: taking actions to avoid incidents that may create a disaster situation
Disaster recovery planning: generally, an IT-focused plan for restoring service following loss of IT function caused by a disaster
Business continuity planning: a business-focused approach to assuring an ability to conduct business in the event of an incident that might otherwise interrupt business
Business resumption planning: a business-focused plan for restoring service following loss of business and/or IT function caused by a disaster
Crisis management: a business and emergency management plan for responding to a crisis situation, designed to protect health and safety, manage communications, and restore business operations as soon as feasible
4
Scope Options
Data Center focus
Network focus
Campus-wide assets
System-wide assets

IT disaster prevention and recovery can be widely or narrowly scoped. One could focus strictly on the main data center, more broadly on the main distribution points of the network, or on the wide array of IT assets housed across the campus and in the campus network tunnels. Vendor equipment is often housed and managed on campus, so there should be a clear identification of who is responsible for recovering from damage to vendor equipment and of what agreements vendors may accept for emergency provision of equipment and restoration of service.
9/16/02
5
Scope Considerations: Threat Perspective
Regional (e.g., war, terrorism)
Local (e.g., flood, power loss)
Facility (e.g., fire, water pipe)
System (e.g., application server, SAN)
Component (e.g., disk drive)

Strategies for disaster prevention and recovery will vary widely depending on assumptions about what type of disaster may occur. It is valuable to identify the likely risks and to weigh those risks against the costs of reasonable measures to prevent their occurrence and against the associated costs of expediting recovery in these various scenarios.
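One way to make this weighing concrete is a simple annualized-loss comparison. The sketch below is only an illustration, assuming a basic model in which expected annual loss is the annual probability of an incident times its cost. The `Threat` class, the function names, and every figure are hypothetical and are not drawn from the CSG survey or from UW-Madison.

```python
from dataclasses import dataclass


@dataclass
class Threat:
    """One threat scenario. All values are hypothetical planning inputs."""
    name: str
    annual_probability: float  # assumed chance of the incident in a given year
    loss_per_incident: float   # assumed cost of one unmitigated incident
    mitigation_cost: float     # assumed annual cost of the preventive measure
    loss_reduction: float      # assumed fraction of the loss the measure avoids


def expected_annual_loss(threat: Threat) -> float:
    """Annual probability times cost per incident, with no mitigation."""
    return threat.annual_probability * threat.loss_per_incident


def net_benefit(threat: Threat) -> float:
    """Expected loss avoided per year minus the annual cost of the measure."""
    avoided = expected_annual_loss(threat) * threat.loss_reduction
    return avoided - threat.mitigation_cost


# Hypothetical figures, one at the component level and one at the regional level.
threats = [
    Threat("Component: disk drive failure", 0.5, 20_000, 3_000, 0.9),
    Threat("Regional: loss of the whole campus", 0.001, 5_000_000, 250_000, 0.8),
]

for t in threats:
    verdict = "worth considering" if net_benefit(t) > 0 else "hard to justify on cost alone"
    print(f"{t.name}: net annual benefit {net_benefit(t):,.0f} ({verdict})")
```

A model this simple ignores health and safety, reputation, and regulatory obligations, so a negative result is a prompt for discussion rather than a decision.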
6
Strategies for Prevention
Facility: access & environmental controls
Network: architectural redundancies
Power redundancies: UPS, dual grid/feed
Data storage: RAID, mirroring
System redundancies: server clustering
Alternate facility for failover/recovery
Backup/restore procedures

Facility-level prevention may include locked doors with access control systems, heat and water detection and alarming, fire suppression systems, etc. Redundant pathways for routing network traffic allow business operations to continue if equipment fails at one point in the network. Uninterruptible power supply units permit orderly shutdown of systems or extended operation of selected systems in the event of a power outage. Orderly shutdown reduces the time required to restore service once electricity is restored. Other types of redundant electrical power are more costly, but may be justified depending on anticipated risks. Data storage capabilities have dramatically improved in recent years, virtually eliminating problems from isolated disk crashes. The additional capability provided by remote data mirroring protects the data resource most effectively in the event of extensive physical damage, although this can be costly.
7
Hot Topics
Multiple Internet service providers
Network redundancy
Recovery facility options
  dual data center operations
  reciprocal agreements
  contract services
Cost differential of:
  rapid (failover) recovery preparedness
  business as usual

This slide is designed to introduce topics you may wish to discuss together:
Do you have more than one Internet provider? How critical is availability of the Internet service to your operations?
Do you have a suitable alternate facility on campus or at one of your satellite campuses? If not, where would you recover operations if the primary facility is incapacitated for an extended period?
Do you already have built-in server and/or SAN redundancies to assure up-time? If so, have you split them across multiple physical locations for the added physical protection this provides? How does any added expense measure up against co-location of these devices?
8
Brief CSG Survey
Five questions:
  Comprehensive, tested, DRP?
  A partial plan?
  An untested plan?
  A project under way to upgrade the plan?
  A campus-wide “business continuity plan”?
10 institutions responded
9
Comprehensive, tested, DRP?
40% said (basically) yes!
20% have comprehensive plans, but not yet fully tested
20% have tested plans for key administrative systems
20% are working on plans
10
Upgrade project under way?
100% said yes!
11
Business continuity plan?
The business of the institution
Beyond but including IT
20% said yes.
30% have “Crisis Response Plans”
30% indicate key departments are required to have DRPs
10% report planning is under way
National study: Less than 1/2 of existing business continuity plans meet objectives
12
Is this representative?
40% response rate
Any bias in self-reporting? Don’t know
Regardless, good models to follow!
13
Wisconsin/DoIT Focus
Scope: recovery from data center facility destruction
Recovery strategy: alternate facility on campus
Provide network redundancy and ESS mirroring
Locate half the servers at each site
Develop failover support
Only local recovery site at present (based on facility-only disaster assumption)
Most critical applications include Hospital, Health Alert Network, Instruction, Finance, Payroll

DoIT has taken significant measures to secure the data center and to protect it from fire and water damage. DoIT’s campus backbone already provides performance redundancies with three major nodes to route traffic. A major network redesign initiative will enhance the redundancies to better support failover capabilities if a major node is physically damaged. Failover of our SAN is already supported with full mirroring of ESS data storage at our on-campus recovery site. Our technologists are in the process of testing server failover as well. We plan to provide server failover initially for some of our most critical applications that support health and safety, after which we will expand server failover capability to many critical academic and business applications.
14
Wisconsin/Campus Focus
Crisis Management Plan
Provides administrative framework and response framework
Defines levels of response based on situation
Establishes policy group and operations group
Establishes procedures and meeting locations
Tested this summer: Conference of Mayors

UW-Madison administration has developed a crisis management plan to be managed at the executive level of the University. This slide shows the key components of the plan. DoIT operational experts, particularly in the voice communications group, are active participants in the ongoing support for this plan, providing the detailed technical specifications necessary to support emergency communications during a crisis.
15
What we are not doing
Business resumption planning
Hot site vendor contracts
Emergency generators
Recovery of equipment outside of data center
Recovery of DoIT offices, workstations, etc.

Business resumption planning is a Provost-level responsibility, not an IT responsibility. Our understanding is that there are no business resumption plans in place at this time, although Research and Sponsored Programs is developing a plan in response to a recent audit. Because DoIT has a recovery site on campus and the scope of our plan is restricted to data center loss, we have not taken the costly approach of using a hot site vendor. The UW Hospital, which is supported on its own mainframe and is hosted at our data center, is one possible candidate for a hot site solution. It is our understanding, however, that they are considering a reciprocal recovery arrangement with the State of Wisconsin’s Department of Electronic Government. Redundant Internet access and emergency generators capable of long-term power generation are considered too costly to pursue, although we are continuing to consider improvements in our Internet connectivity architecture. All other items on this slide are currently out of scope of our disaster recovery planning efforts, although they could be considered when the current efforts are completed.
16
Disaster Recovery Planning Discussion?
Are we paying enough attention?
Are we satisfied with our strategies?
Is the institution protected?