Chapter 10 Disaster Recovery and Data Integrity
A disaster-recovery plan looks at what disasters *could* hit and lays out a plan for responding to them. It lists the services to restore, the order in which to restore them, and how quickly each must be back. The network disaster-recovery plan is usually part of a larger, overall plan. It starts with understanding the basics:
- What disasters could afflict your site
- The likelihood that those disasters will strike
- The cost to your company if they do strike
- How quickly the various parts of your business need to be revived
Definition: A disaster is a catastrophic event that causes a massive outage affecting an entire building or site. It can be anything from a natural disaster, such as an earthquake, to the more common problem of a stray backhoe accidentally cutting your cable.
Risk Analysis
Risk analysis is a good candidate for external consultants: it is a specialized skill that is not used often. A large company may hire a consultant to perform the risk analysis and have an in-house person responsible for risk management.
Risk analysis involves determining what disasters may happen, the chances of those disasters, and the likely cost if a disaster of each type occurred. The company can then use that information to decide how much money is reasonable to spend on trying to mitigate the effects of each type of disaster.
Mitigation budget = (Probable cost of disaster - Probable cost after mitigation) × Risk of disaster
- Flood: ($10,000,000 - $x) × (1/1,000,000); if mitigation reduces the remaining cost x to $0, the budget is about $10
- Earthquake: ($60,000,000 - $x) × (1/3,000); if mitigation reduces the remaining cost x to $0, the budget is about $20,000
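The budget formula above can be sketched in a few lines of Python. This is only an illustration: the function name `mitigation_budget` is hypothetical, and the post-mitigation cost of $0 is an assumption chosen so that the flood and earthquake figures from the slide can be reproduced.

```python
def mitigation_budget(cost_of_disaster, cost_after_mitigation, risk):
    """Return (cost of disaster - cost after mitigation) * risk of disaster."""
    return (cost_of_disaster - cost_after_mitigation) * risk


# Figures from the flood and earthquake examples.
# Assumption: mitigation reduces the remaining cost to 0.
flood = mitigation_budget(10_000_000, 0, 1 / 1_000_000)
earthquake = mitigation_budget(60_000_000, 0, 1 / 3_000)

print(f"Flood mitigation budget:      ${flood:,.2f}")
print(f"Earthquake mitigation budget: ${earthquake:,.2f}")
```

If mitigation only partly reduces the loss, pass the expected remaining cost as the second argument; the reasonable budget shrinks accordingly.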
Legal Obligations: There may be contractual obligations the company must meet. These must be included in the risk analysis.
Damage Limitation: Some measures cost little or nothing:
- Raising racks in flood-prone areas
- Lightning rods and good grounding systems to protect against lightning
- Racks bolted to the floor to help mitigate earthquake damage
Some come at significant cost and can be afforded only by very large companies:
- Building your data center underground to protect against tornadoes or bombs
- Expensive mechanisms that allow racks to shake with an earthquake
Fire-prevention systems and UPSs
Preparation
Being prepared for a disaster means being able to restore the essential systems to working order in a timely manner, as defined by your legal obligations.
You need to arrange a source of replacement hardware in advance from companies that provide this service. You also need another site to which this equipment can be sent if the primary site cannot be used because of safety concerns, lack of power, or lack of connectivity. Make sure these companies know your needs and where to send the equipment.
Once you have your machines, you need to recreate your systems. Typically, you first rebuild the systems and then restore the data from backups stored off-site.
Data Integrity
Data integrity means ensuring your data is not altered by external sources. It can be corrupted maliciously by viruses or individuals, or inadvertently by individuals, bugs in programs, and undetected hardware malfunctions.
There are anecdotal methods to check for data corruption:
- Checking large files against “read-only” checksums
- Watching for large changes in a database that is expected to see only small changes
Industrial espionage and theft of intellectual property are not uncommon. A company may need to prove ownership of intellectual property, and its ability to accurately restore data as it existed on a certain date may be required in a court of law. For both disaster recovery and use as evidence in court, an administrator needs to know the data has not been tampered with.
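As a rough illustration of the checksum approach, the sketch below records a checksum for each file and later reports any file whose contents no longer match. The choice of SHA-256 and the helper names (`sha256_of`, `build_manifest`, `find_altered`) are assumptions made for this example, not something the slide prescribes. In practice the recorded manifest would be kept somewhere the monitored systems cannot modify, such as read-only media.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(paths):
    """Record a checksum for each file; store the result on read-only media."""
    return {str(p): sha256_of(p) for p in paths}


def find_altered(manifest):
    """Return the paths that are missing or whose checksum has changed."""
    altered = []
    for path, recorded in manifest.items():
        if not Path(path).is_file() or sha256_of(path) != recorded:
            altered.append(path)
    return altered
```

A mismatch shows only that a file changed, not why. Deciding whether the change was legitimate still requires comparing against backups or change records.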
Disaster Recovery and Data Integrity
The ultimate preparation is to have a fully redundant version of everything that can take over when the primary fails. Some companies do this, and some have stopped using the term “disaster recovery” in favor of “contingency planning” or “continuity planning.”
The next level of disaster planning is to have an alternate site that duplicates *some* of the critical services across both data centers. Then the only problem is getting people access to those services.
Security Disasters: A growing concern. A similar risk analysis can be performed on ways to protect data.
Media Relations: Have a professional public relations firm on retainer, or have a media plan: who will talk to the media, what kinds of things will and will not be said, and what the chain of command is if the designated decision makers aren’t available.