Contingency Planning: Issues to think about!
Paul M Kane, Director
APTLD Members’ Meeting in Kuala Lumpur, 1-2 March 2010
APTLD Contingency Planning - 1st March

Blame management: not my fault!
Prevention measures
> These measures require planning ahead to stop an unwanted event from occurring.
Detection measures
> These controls are designed to detect or discover unwanted events.
Corrective measures
> These controls are designed to correct the fault or restore service after an event has occurred.
Prevention Management
How much of your operation is controlled internally?
> Do you practice "what if" scenarios? Merely thinking about and practicing scenarios will identify possible hardware, system or process flow failures, leading to operational improvements.
> Are multiple staff able and authorised to undertake each other’s roles? Staff go on holiday or work away from the office, so making sure multiple people can undertake a given task leads to process improvements and operational redundancy.
> Contingency Plans need to be developed and understood so that in the event of a disaster everyone knows what to do and how to do it. Hardware will fail, storm damage will occur, power supplies and transit carrier services will break, all outside of your control.
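The question about staff covering each other's roles can be checked mechanically. The following is a minimal sketch, assuming a hand-maintained skills matrix; the names, tasks and the `single_points_of_failure` helper are all hypothetical and only illustrate the idea of flagging tasks that depend on one person.

```python
from collections import defaultdict

# Hypothetical skills matrix: staff member -> tasks they are able AND authorised to perform.
AUTHORISED = {
    "alice": {"zone signing", "registry database restore"},
    "bob": {"zone signing", "BGP changes"},
    "carol": {"BGP changes", "customer communications"},
}

def single_points_of_failure(authorised, minimum=2):
    """Return tasks covered by fewer than `minimum` staff, with the people who cover them."""
    cover = defaultdict(set)
    for person, tasks in authorised.items():
        for task in tasks:
            cover[task].add(person)
    return {task: sorted(people) for task, people in cover.items() if len(people) < minimum}

if __name__ == "__main__":
    for task, people in sorted(single_points_of_failure(AUTHORISED).items()):
        print(f"{task}: only {', '.join(people)}")
```

A task that nobody is authorised to perform never appears in the matrix, so it will not be reported; the list of required tasks still has to come from the "what if" exercises.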
Prevention Management
How much of your operation is outsourced?
> Upstream ISP: do you use multiple carriers under SLAs?
> Are you using Provider Independent IP space, so it is easy to multi-home your service in different geographical locations for redundancy?
> Do you use multiple communication channels for dialogue with users: blog, Twitter, forums, etc.?
> Do you supplement your internally operated DNS resolution provision with external subcontractors using multiple application software? That is, not all of your DNS resolvers should be open source software such as BIND or NSD; diversity builds resiliency.
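One way to make the diversity question concrete is to count how concentrated your name server estate is by operator and by software. This is a rough sketch under stated assumptions: the inventory below is invented for illustration, and `diversity_report` and `largest_share` are hypothetical helper names, not part of any tool mentioned in the talk.

```python
from collections import Counter

# Hypothetical inventory: name server -> (operator, DNS software). Values are illustrative only.
NAMESERVERS = {
    "ns1.example.net": ("in-house", "BIND"),
    "ns2.example.net": ("in-house", "BIND"),
    "ns3.example.net": ("provider-a", "NSD"),
    "ns4.example.net": ("provider-b", "proprietary"),
}

def diversity_report(servers):
    """Count how many name servers share each operator and each software package."""
    operators = Counter(operator for operator, _ in servers.values())
    software = Counter(package for _, package in servers.values())
    return operators, software

def largest_share(counter):
    """Fraction of servers sharing the most common value (1.0 means no diversity at all)."""
    total = sum(counter.values())
    return max(counter.values()) / total if total else 0.0

if __name__ == "__main__":
    operators, software = diversity_report(NAMESERVERS)
    print(f"largest operator share: {largest_share(operators):.0%}")
    print(f"largest software share: {largest_share(software):.0%}")
```

A share close to 100% for either figure suggests that a single bug or a single supplier failure could take out most of the service at once.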
Spread the risk with outsourcing partners
With financial support from the Prevention, Preparedness and Consequence Management of Terrorism and other Security Related Risks Programme, European Commission - Directorate-General Justice, Freedom and Security
Detection Measures
Monitoring
> Do you use third party monitoring servers to check the availability of, and external access to, your services, such as BGP scans, POP, Web, EPP, DNS and ticketing systems?
> Do you periodically undertake port scans to make sure your services are secure from both internal and external attacks?
> Do you quality assess the software your staff have written and use?
> When an issue is raised, by a customer for example, do you have an operations team assigned to look into, and potentially resolve, the issue?
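The external availability checks asked about above can start very simply. The sketch below only tests whether a TCP connection can be opened; the host names are placeholders, the service list is an assumption, and `tcp_reachable` and `run_checks` are hypothetical helpers. To be meaningful it would need to run from a network outside your own.

```python
import socket

# Hypothetical services to probe; hosts are placeholders, ports are the usual ones
# for HTTPS (443), EPP over TCP (700) and DNS over TCP (53).
SERVICES = [
    ("web", "www.example.net", 443),
    ("epp", "epp.example.net", 700),
    ("dns-tcp", "ns1.example.net", 53),
]

def tcp_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers refused connections, timeouts and name resolution failures
        return False

def run_checks(services, probe=tcp_reachable):
    """Return {service name: reachable?} for each (name, host, port) entry."""
    return {name: probe(host, port) for name, host, port in services}

if __name__ == "__main__":
    for name, ok in run_checks(SERVICES).items():
        print(f"{name}: {'up' if ok else 'DOWN'}")
```

An open port shows reachability, not correctness: a fuller monitor would also send a real DNS query or EPP greeting and check the answer.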
Service delivery and external monitoring
[Map: active Anycast nodes (blue pins); nodes currently being installed (yellow pins)]
Command and Control Monitoring Locations: Chicago, USA; Buenos Aires, Argentina; London, UK; Tokyo, Japan
Offices and Data Centres
Quick Service Overview
Our TLD customers:
> In this region: SG, VN, IR, PH
> In other regions: EU (> 3m names), IT (> 2m names), PL, BE, FI, HU, LU, LT, LV and many more; total of 92 zones (many SLDs)
Raw stats:
> Yesterday, 23.8 million authoritative names
> 28/2/2010 ~54k updates
> 28/2/ NSEC or NSEC3 signed zones
> ~6 billion queries / day (28/2/ ,973,886,722)
> 100% SLA
> NSEC and NSEC3 supported
> Free IPv4 and IPv6 address allocated
Corrective Measures
Practice, then practice some more!
> What is your organisation’s process for activating a Plan and notifying recovery personnel?
> Do you test the recovery plan on a regular basis? Do the back-up systems and back-up data stores (including off-site) recover correctly?
> When back-up power or bandwidth is required, is service disruption avoided?
> If failure cannot be contained, do you have an effective and rehearsed communications strategy, telling users the status and what action, if any, is required to restore their service?
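Whether back-up data stores "recover correctly" can be tested rather than assumed. As a minimal sketch, assuming a back-up has already been restored into a scratch directory, the code below compares SHA-256 checksums of every original file against its restored copy; the directory layout and the `verify_restore` helper are hypothetical.

```python
import hashlib
from pathlib import Path

def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_restore(source_dir, restored_dir):
    """Compare every file under source_dir with its restored copy.

    Returns a sorted list of relative paths that are missing or differ in content.
    """
    source_dir, restored_dir = Path(source_dir), Path(restored_dir)
    problems = []
    for original in sorted(p for p in source_dir.rglob("*") if p.is_file()):
        relative = original.relative_to(source_dir)
        restored = restored_dir / relative
        if not restored.is_file() or sha256_of(original) != sha256_of(restored):
            problems.append(str(relative))
    return problems
```

Calling `verify_restore("/srv/registry", "/tmp/restore-test")` (both paths are placeholders) after each scheduled restore drill gives an empty list when everything came back intact. On live data, files that changed since the back-up was taken will also be reported, so the comparison is best made against a snapshot.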
Summary
> Impact Assessment of multiple scenarios
> Develop your Contingency Plan
> Test your Contingency Plan
> Personnel Training
> Maintain the Contingency Plan
> Be sure to be able to blame someone else!
Thank you
?
Paul.Kane AT CommunityDNS.net