1
Maintaining Business Continuity After Internal and External Incidents Greg Schaffer, CISSP Director of Network Services Middle Tennessee State University
2
Copyright Greg Schaffer 2008. This work is the intellectual property of the author. Permission is granted for this material to be shared for non-commercial, educational purposes, provided that this copyright statement appears on the reproduced materials and notice is given that the copying is by permission of the author. To disseminate otherwise or to republish requires written permission from the author.
3
Our Story Begins Like Many… It was late in the afternoon one weekday when alarms suddenly sounded in the NOC. It was clear SOMETHING had happened, because connectivity was shattered across campus. Students could not access online classes, purchase orders could not be processed, email would not go through… BUSINESS DISCONTINUITY
4
Troubleshooting the Problem It was relatively easy to pinpoint what wasn't talking to what. The fact that many things could not reach many other things indicated that more than one device was affected. A check of the devices showed the problem was not the equipment but something at the physical layer. It was clear that this was going to take SOME TIME to fix!
5
Location, Location, Location The physical layer issue was traced to the site of the new stadium construction. However, there were no initial indications of anything wrong. When asked, the construction workers said they had not been digging…
6
BUT …they neglected to mention they had been pile driving rock to prepare a trench for a new water line. The concrete-encased conduits were damaged by the equipment. The area was excavated to reveal what we hoped was minimal damage…
7
Minimal Damage?!
9
Getting Services Up While the extent of the physical damage wasn't clear until the complete excavation the next morning, it was clear there was enough damage to assume the conduits would not be usable for replacement fiber optics. There were redundant fiber cables between data centers that took different routes across campus…
10
Forming the Plan …except for one portion, which happened to be the pulverized area! A plan was needed to restore communications…fast. The plan:
–access the manholes on either end of the damage and splice in new fibers there
–run the fibers temporarily along the road, and close the road to all traffic (which was planned anyway)
11
Finding Manhole Difficult
12
Eventually Circuits Back Up
13
But Almost Down Again!
–Graduation was that Saturday
–The road was opened for visitors
–Vehicles drove over the temporary fibers most of the day!
–The fibers held, but needless to say they would not be reused…
14
Post Mortem Eventually (nearly one month later) a manhole was constructed around the break, and new fibers were pulled through the repaired area and spliced. Despite "normal" controls ("Tennessee One Call", conduits encased in concrete, redundant fibers, etc.), "Bad Stuff" happened. Bad Stuff = Good Lessons
15
Operations Security Controls (CISSP CBK)
–Preventive
–Detective
–Corrective
–Directive
–Recovery
–Deterrent
–Compensating
16
Preventive/Detective
Failed:
–Tennessee One Call (dirt covered the markings)
–Hardened physical paths
Worked (but after the fact):
–Network monitoring (a minimal sketch follows this slide)
–Help desk reporting
–Documentation
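As an aside, the kind of detective control that caught this outage can be as simple as a reachability poller that alerts when a core device stops answering. The sketch below is a hypothetical illustration, not the monitoring system actually used at MTSU; the hostnames and polling interval are placeholders.

```python
#!/usr/bin/env python3
"""Minimal reachability monitor (illustrative sketch; hosts are hypothetical)."""
import subprocess
import time

# Hypothetical core devices on either side of the campus fiber path
DEVICES = ["core-switch-a.example.edu", "core-switch-b.example.edu"]

def is_reachable(host: str) -> bool:
    """Return True if a single ICMP echo request to the host succeeds."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "2", host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0

def monitor(interval_seconds: int = 30) -> None:
    """Poll each device and print an alert whenever its reachability changes."""
    last_state = {host: True for host in DEVICES}
    while True:
        for host in DEVICES:
            up = is_reachable(host)
            if up != last_state[host]:
                print(f"ALERT: {host} is now {'UP' if up else 'DOWN'}")
                last_state[host] = up
        time.sleep(interval_seconds)

if __name__ == "__main__":
    monitor()
```

In practice a NOC would use a full monitoring platform rather than a script like this, but the principle is the same: the control detects the failure after the fact, it does not prevent it.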
17
And Keep the Manhole Uncovered!
18
Corrective/Directive
Worked:
–Emergency web communications
–Temporary fiber construction (a temporary corrective control for business/mission continuity)
–Shovel
Failed:
–Blocking car and truck traffic
19
Recovery
–More of a longer-term approach to prevent the same occurrence
–Redundant fiber between data centers
–Must also consider separate building entrances
–Cost of solution vs. cost of downtime analysis (a rough worked example follows this slide)
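The cost comparison in the last bullet can be reduced to simple arithmetic: estimate the cost of an hour of downtime, the hours of outage expected without redundancy over the planning horizon, and compare the total against the one-time cost of a separate path. The figures below are hypothetical placeholders, not MTSU's actual costs.

```python
# Rough cost-of-downtime vs. cost-of-solution comparison.
# All figures are hypothetical placeholders for illustration only.

downtime_cost_per_hour = 15_000      # lost productivity, missed transactions, etc.
expected_outage_hours_per_year = 10  # estimated outage hours per year without redundancy
years_of_service = 10                # planning horizon for the fiber plant

redundant_path_cost = 120_000        # one-time cost of a fully separate route and entrance

expected_downtime_cost = (
    downtime_cost_per_hour * expected_outage_hours_per_year * years_of_service
)

print(f"Expected downtime cost over {years_of_service} years: ${expected_downtime_cost:,}")
print(f"Cost of redundant path: ${redundant_path_cost:,}")
print("Redundancy pays for itself" if redundant_path_cost < expected_downtime_cost
      else "Accepting the risk may be cheaper")
```

The point of the exercise is not precision but justification: if the expected downtime cost clearly exceeds the cost of the redundant path, the recovery control is easy to defend.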
21
Deterrent/Compensating
Worked:
–Penalty/insurance
–Temporary fiber run
–Cutting of ducts
–Creation of new manhole
22
Finally It ended up being a late night, hampered by many events. Our DR/BC plan did not specifically address this problem...NOR SHOULD IT HAVE. A good DR/BC plan is flexible and adaptive. The necessary resources were mobilized quickly based on the existing DR/BC plans. What could have been a very large disaster went down as an outage that lasted 10 hours.