Developing An Effective Business Continuity/Disaster Recovery Testing Program CBAG May 2017
Presenter: Tracy Hall, MBCP, IT Assurance Manager, Wolf & Company, P.C. Direct: 413-726-6884, thall@wolfandco.com
Testing and Disaster Preparedness How much have you tested? https://youtu.be/9yslB3BkDm8
Exercise/Discussion Write down the biggest challenge within your organization regarding Business Continuity Testing
Testing Mistakes to Avoid
A common mistake is skipping one of the following steps:
- Define the assumptions, scope and objectives of the test
- Develop a scenario for the test
- Develop and document the test process
- Alert other departments of the test
- Define team responsibilities in the test
- Ensure that all elements needed in the test (e.g., networks, databases, firewalls, load balancers, data, applications, hardware) have been prepared for the test
- Contact all relevant test participants
- Get approval for the test
- Complete an after-action report on the test results
- Update the DR plan based on test findings and lessons learned
- Brief management on test outcomes
- Schedule the next test
Why Testing?
- Regulatory Guidance: “Because we have to”
- Industry Best Practice
- Protecting Your Business/Assets
- Ensures that what you say can be done ACTUALLY can be done
- Practices response in a less stressful situation
Priority for Examiners
- Many companies have hundreds of pages of plans, but can they actually prove the plans work?
- Examiners want test results
FFIEC Guidance: Action Summary
Risk monitoring and testing is the final step in the cyclical business continuity planning process. Risk monitoring and testing ensures that the institution's business continuity planning process remains viable through the:
- Incorporation of the BIA and risk assessment into the BCP and testing program;
- Development of an enterprise-wide testing program;
- Assignment of roles and responsibilities for implementation of the testing program;
- Completion of annual, or more frequent, tests of the BCP;
- Evaluation of the testing program and the test results by senior management and the board;
- Assessment of the testing program and test results by an independent party; and
- Revision of the BCP and testing program based upon changes in business operations, audit and examination recommendations, and test results.
FFIEC Guidance: Principles
- Roles and responsibilities for implementation and evaluation of the testing program should be specifically defined;
- The BIA and risk assessment should serve as the foundation of the testing program, as well as the BCP that it validates;
- The breadth and depth of testing activities should be commensurate with the importance of the business process to the institution, as well as to critical financial markets;
- Enterprise-wide testing should be conducted at least annually, or more frequently, depending on changes in the operating environment;
- Testing should be viewed as a continuously evolving cycle, and institutions should work towards a more comprehensive and integrated program that incorporates the testing of various interdependencies;
- Institutions should demonstrate, through testing, that their business continuity arrangements have the ability to sustain the business until permanent operations are reestablished;
- The testing program should be reviewed by an independent party; and
- Test results should be compared against the BCP to identify any gaps between the testing program and business continuity guidelines, with notable revisions incorporated into the testing program or the BCP, as deemed necessary.
How do we get started?
Testing Schedule
- More frequent, dynamic testing
- Should be multi-year (3 years)
- Should be built off of the most current BIA (Business Impact Analysis)
- DO NOT SAY: “All we need is Core, Core is most critical”
Types of Testing
Many ways to achieve “testing”:
- Evacuation Drills
- Communication Drills
- Structured Walkthrough
- Simulation
- Tabletop
- Technology Recovery Test
Types of Testing: Evacuation Drills
- Fire Drills
- Floor Wardens
- Posted signs
- Meeting Places
- Accounting for personnel
Types of Testing: Communication Drills
- Call Trees
  - Has contact information been kept up to date?
  - Where are the bottlenecks?
- Automatic Notification Systems
  - Has contact information been kept up to date?
  - Feeds from HR?
  - Are notifications delivered properly?
  - What is response time?
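The questions above (delivery, response time, bottlenecks) can be answered with a few numbers after each drill. The following is a minimal sketch only; the contact names, the response times, and the `summarize_drill` helper are hypothetical and are not part of the presentation.

```python
from statistics import median

# Hypothetical drill results: (contact, minutes until acknowledgement).
# None means the contact never acknowledged the notification.
drill_results = [
    ("Branch Manager", 4),
    ("IT On-Call", 12),
    ("Facilities Lead", None),
    ("Operations Supervisor", 7),
]

def summarize_drill(results):
    """Return response rate, median response time, and contacts to follow up with."""
    answered = [minutes for _, minutes in results if minutes is not None]
    no_response = [name for name, minutes in results if minutes is None]
    return {
        "response_rate": len(answered) / len(results),
        "median_minutes": median(answered) if answered else None,
        "no_response": no_response,
    }

summary = summarize_drill(drill_results)
print(summary)
```

Contacts listed under `no_response` are the first place to check for stale contact information or a broken HR feed.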
Types of Testing: Structured Walkthrough
- Smaller groups
- Review plan details
- Roles and Responsibilities
- More of an understanding of what to do
Types of Testing: Tabletop / Simulation Testing
- Who is involved?
  - Experience
  - Authority
- Incorporates Scenarios
- Decision making in a structured environment
- Roles and Responsibilities
  - Technology
  - Business
- Tests entire timeline of an event
Types of Testing: Tabletop / Simulation Testing cont.
How to incorporate scenarios from the Risk Assessment:
- Loss of Building
- Loss of Technology
- Loss of People
Types of Testing: Technology Testing / Functional Test / Parallel Test
- Roles and Responsibilities
- Validates RTOs (Recovery Time Objectives) and MADs (Maximum Allowable Downtimes) for technologies that support business functions
- Incorporate business lines for transaction processing
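Validating RTOs and MADs comes down to comparing the recovery time measured during the test with the targets set in the BIA. The sketch below assumes hypothetical systems and hour values, and the three result labels are an illustration rather than a standard classification.

```python
from dataclasses import dataclass

@dataclass
class RecoveryResult:
    system: str
    rto_hours: float     # Recovery Time Objective from the BIA
    mad_hours: float     # Maximum Allowable Downtime from the BIA
    actual_hours: float  # recovery time measured during the test

def classify(result):
    """Compare the measured recovery time with the RTO and MAD."""
    if result.actual_hours <= result.rto_hours:
        return "within RTO"
    if result.actual_hours <= result.mad_hours:
        return "missed RTO, within MAD"
    return "exceeded MAD"

# Hypothetical test measurements
results = [
    RecoveryResult("Core banking", rto_hours=4, mad_hours=8, actual_hours=3.5),
    RecoveryResult("Online banking", rto_hours=8, mad_hours=24, actual_hours=10),
    RecoveryResult("Imaging", rto_hours=24, mad_hours=48, actual_hours=60),
]

for r in results:
    print(f"{r.system}: {classify(r)}")
```

Anything other than "within RTO" is a finding to document: either the recovery procedure needs work or the RTO in the BIA needs to be revisited with the business line.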
BIA vs. Risk Assessment: Business Impact Analysis (BIA)
- The process of identifying and prioritizing critical business functions and the resources required to support them, and assigning them predefined RTOs
- Determines RPOs (Recovery Point Objectives) for systems
- This exercise is considered POST outage
- Covers:
  - Business Functions
  - Departments
  - Technologies
BIA vs. Risk Assessment: BIA cont.
- Business Functions
  - Departments
  - Determine Criticality of Business Functions
  - Identify Dependent Technologies and Vendors
  - Alternate Procedures
- Departments
  - Assign Resources
  - Resources Recovery Timeframes
  - Special Recovery Instructions
  - Initial Steps Required for Recovery
- Technology
  - Assign RPOs
  - RTO and MAD for technologies
  - Technology questionnaire including dependencies
BIA vs. Risk Assessment: Risk Assessment
- The process of identifying the probability of specific threats affecting the organization and the impact on the organization if they were to occur
- This exercise is considered PRE outage
- Threat Assessment
- Control Assessment
BIA vs. Risk Assessment: Risk Assessment cont.
- Threat Assessment
  - Determine Probability and Impact Ratings
  - Details of impact
- Control Assessment
  - Link controls to threats they mitigate
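One common way to combine probability and impact ratings is to multiply them and rank the threats by the result. The sketch below assumes a 1 to 5 scale for both ratings, and the ratings and control names are hypothetical; the presentation does not prescribe a scoring formula.

```python
# Hypothetical ratings on an assumed 1 (low) to 5 (high) scale
threats = {
    "Loss of Building": {"probability": 2, "impact": 5},
    "Loss of Technology": {"probability": 4, "impact": 4},
    "Loss of People": {"probability": 3, "impact": 3},
}

# Hypothetical controls, each linked to the threats it mitigates
controls = {
    "Alternate work site": ["Loss of Building"],
    "Replicated DR environment": ["Loss of Technology"],
    "Cross-training": ["Loss of People", "Loss of Building"],
}

def rank_threats(threats):
    """Score each threat as probability x impact, highest score first."""
    scored = {name: r["probability"] * r["impact"] for name, r in threats.items()}
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)

def controls_for(threat, controls):
    """List the controls linked to a given threat."""
    return [name for name, mitigated in controls.items() if threat in mitigated]

ranking = rank_threats(threats)
for name, score in ranking:
    print(name, score, controls_for(name, controls))
```

The highest-ranked threats are natural candidates for the next tabletop scenario, and a threat with no linked controls is a gap worth raising.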
Why are they critical to testing?
- Business Impact Analysis:
  - Determines criticality of systems and other resources (BIA cannot stop at business function criticality!)
  - Business driven, NOT IT driven
- Risk Assessment:
  - Incorporates scenarios:
    - Facilities
    - Personnel
    - System
Testing Plan
- Should be multi-year, but no more than 3 years
- Rotate technologies of varying criticality
- Must include supporting infrastructure
- Build into RTOs
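A rotation like the one described above can be drafted mechanically from the BIA output and then adjusted by hand. The sketch below is one possible approach under stated assumptions: the technology names, the criticality tiers (1 = most critical), and the round-robin rule are all hypothetical, and supporting infrastructure is simply included in every year.

```python
def build_schedule(technologies, infrastructure, years=3):
    """Spread technologies across years so each year mixes criticality tiers."""
    if not 2 <= years <= 3:
        raise ValueError("plan should be multi-year but no more than 3 years")
    # Sort by tier, then deal out round-robin so no year gets only low tiers.
    ordered = sorted(technologies, key=lambda tech: tech[1])
    schedule = {year: list(infrastructure) for year in range(1, years + 1)}
    for index, (name, _tier) in enumerate(ordered):
        schedule[index % years + 1].append(name)
    return schedule

# Hypothetical BIA output: (technology, criticality tier)
technologies = [
    ("Core banking", 1),
    ("Online banking", 1),
    ("Wire transfer", 1),
    ("Email", 2),
    ("Imaging", 2),
    ("Intranet", 3),
]
infrastructure = ["Network/firewalls", "Directory services"]

schedule = build_schedule(technologies, infrastructure)
for year, items in schedule.items():
    print(f"Year {year}: {', '.join(items)}")
```

An institution may well decide that its most critical systems should be tested every year rather than rotated; the round-robin rule here only illustrates mixing tiers within each year.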
Testing Plan
Should incorporate:
- Roles and responsibilities
- A testing policy that includes testing strategies and test planning
- The execution, evaluation, independent assessment, and reporting of test results
- Updates to the BCP and testing program
Testing Plan: Roles and responsibilities
- The board and senior management, who are responsible for establishing and reviewing an enterprise-wide testing program
- Business line management, who has ownership and accountability for the testing of business operations
- IT management, who has ownership and accountability for testing recovery of the institution's information technology systems, infrastructure, and telecommunications
- Crisis management, who has ownership and accountability for testing the institution's event management processes
- Facilities management, who has ownership and accountability for testing the operational readiness of the institution's physical plant and equipment, environmental controls, and physical security
- The internal auditor (or other qualified independent party), who has the responsibility for evaluating the overall quality of the testing program and the test results
Testing Plan: Testing Policy
- Defines test plans/strategies of varying scopes and intensities, including their assumptions
- Changes with the business
- Incorporates the BIA and Risk Assessment results
- Defines key roles and responsibilities
- Includes TSPs (Technology Service Providers)
- Incorporates appropriate personnel from business lines
Testing Plan: The execution, evaluation, independent assessment, and reporting of test results
Once the tests are executed, test results should be properly documented and include the following, at a minimum:
- Test dates and locations
- An executive summary detailing a comparison between the test objectives and test results
- Material deviations from the test plans, including whether intended participation levels were achieved
- Problems identified during testing
- An evaluation by a qualified independent party
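The minimum contents listed above lend themselves to a simple completeness check before a report goes to management. This is a minimal sketch; the field names and the draft report are hypothetical and only mirror the bullets on this slide.

```python
# Field names are illustrative; each maps to one item in the minimum list above.
REQUIRED_FIELDS = [
    "test_dates",
    "locations",
    "executive_summary",
    "material_deviations",
    "problems_identified",
    "independent_evaluation",
]

def missing_fields(report):
    """Return the required fields that are absent or empty in a report."""
    return [field for field in REQUIRED_FIELDS if not report.get(field)]

# Hypothetical draft report that has not yet been independently evaluated
draft_report = {
    "test_dates": ["2017-05-06"],
    "locations": ["DR site"],
    "executive_summary": "Objectives met except for wire transfer recovery.",
    "material_deviations": "Two of ten planned participants were absent.",
    "problems_identified": ["Replication lag on the wire transfer database"],
    "independent_evaluation": "",
}

gaps = missing_fields(draft_report)
print(gaps)
```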
Testing Plan: Updates to BCP and Re-testing
- Update BCP accordingly
- Close gaps
- Re-test before next scheduled test once gaps are addressed
Building the Test Plan from an Effective BIA
Key BIA Reports:
- Business Impact Analysis Report
- Department Worksheet
- Technology Application Summary Sheet
- IT BIA Worksheet
Incorporating the Risk Assessment
Key Risk Assessment Reports:
- Threat Matrix Report
- Detailed Risk Assessment Report
Common DR Test Gaps
- Replication Inconsistencies
- “Rolling outage” difficult to emulate
- Missing network resources
- Production servers are not taken offline; DR site uses production environment
- Tampering Risk
- Data Corruption
- Point in time copies never tested
- Insufficient DR Site Resources
“FULL” Test or no?
- How can we achieve this without actually performing it?
- Documenting actual recovery time spent on each system per resource
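Recording recovery time per system and per resource during smaller tests makes it possible to estimate the effort of a full recovery without running one. The sketch below totals such records; the systems, roles, and hours are hypothetical.

```python
from collections import defaultdict

# Hypothetical time records from several smaller tests: (system, resource, hours)
entries = [
    ("Core banking", "DBA", 2.5),
    ("Core banking", "Network engineer", 1.0),
    ("Email", "Systems administrator", 1.5),
    ("Core banking", "DBA", 0.5),
]

def hours_by_system_and_resource(entries):
    """Total the recorded recovery hours for each system, split by resource."""
    totals = defaultdict(lambda: defaultdict(float))
    for system, resource, hours in entries:
        totals[system][resource] += hours
    return {system: dict(per_resource) for system, per_resource in totals.items()}

totals = hours_by_system_and_resource(entries)
for system, per_resource in totals.items():
    print(system, per_resource)
```

Totals per resource also show where one person is needed on several systems at once, which a full recovery would turn into a bottleneck.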
Test Scripts
- Scripts should be built as close as possible to the assumed real scenario
- Should incorporate all phases of the test
  - Technology
  - Business Lines
- Be as specific as possible about roles and responsibilities
Test Results/Logs
- There is no pass/fail!
- Incident Log should include:
  - Test script timeline with time stamps
  - Other important milestones
  - Follow-up/Action Items
- Important for:
  - Post mortem exercise/debrief
  - Insurance
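A time-stamped log makes it straightforward to reconstruct the timeline for the debrief. The following sketch computes the minutes elapsed from the start of the test to each milestone; the log entries and the timestamp format are hypothetical.

```python
from datetime import datetime

TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M"  # assumed format for this sketch

# Hypothetical incident log: (time stamp, milestone)
log = [
    ("2017-05-06 08:00", "Disaster declared (test start)"),
    ("2017-05-06 08:20", "Recovery team assembled"),
    ("2017-05-06 10:05", "Core system restored at DR site"),
    ("2017-05-06 11:30", "Business line completes test transactions"),
]

def elapsed_minutes(log):
    """Return (milestone, minutes since the first log entry) for each entry."""
    start = datetime.strptime(log[0][0], TIMESTAMP_FORMAT)
    timeline = []
    for stamp, milestone in log:
        delta = datetime.strptime(stamp, TIMESTAMP_FORMAT) - start
        timeline.append((milestone, int(delta.total_seconds() // 60)))
    return timeline

timeline = elapsed_minutes(log)
for milestone, minutes in timeline:
    print(f"+{minutes:>4} min  {milestone}")
```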
Updating the Testing Schedule
As a result of changes to the BIA & Risk Assessment:
- As technologies change
- As business functions change, get added, or removed
- As RTOs/RPOs change
- As new risks are defined
- As new controls are put in place
Thank You / Questions
Tracy Hall, MBCP, IT Assurance Manager, Wolf & Company, P.C. Direct: 413-726-6884, thall@wolfandco.com