Catastrophic Hardware Failure Recovery with Exchange Server 2003 Eileen Brown IT Evangelist Microsoft UK


Topics
- What’s new in Exchange 2003 and Windows 2003
- Disaster Recovery Questionnaire
- Active Directory Overview and Disaster Recovery
- Exchange 2003 Overview and Disaster Recovery
- Database Disaster Recovery
- SP1 update

What’s New In Exchange 2003
- Database snapshot through Volume Shadow Copy Services
- Recovery Storage Group
- RPC/HTTP support for Outlook 2003
- IPSec support between front-ends and back-end clusters
- IIS6 runs in Dedicated Mode
- Clustering

What’s New In Exchange 2003
- Linked Value Replication Support
- Enhanced Management Tools
- InetOrgPerson-enabled mailboxes
- Native Mode Dependencies
- Exchange Store automatically ignores and removes zombie ACE entries
- Query-based Distribution Groups (QDGs)

Windows Server 2003 AD
- Less traffic between replicas
- Create branch office replica from media
- Active Directory improvements
- Backup through Volume Shadow Copy Service
- Linked Value Replication
  - When in forest and domain functional level 2
- Group Membership replication improvements
- Inter site replication topology generator
- Domain rename (without Exchange installed)

Disaster Recovery Questionnaire
- Is your data stored off-site?
- Time to reconstruct your business data?
- What is the cost per hour of server downtime?
- What is critical data?
- Have your configurations been mapped?
  - ExchDump.exe and Clustool.exe
- What training is required?
- Who else knows how to recover your data?

Disaster Recovery Active Directory Recovery

Active Directory Database
- Ntds.dit – the database
- Edbxxxxx.log – transaction logs
- Edb.chk – checkpoint file
- Res1.log and Res2.log – reserved log files
- Logs are of fixed size (10 MB for AD)
- Three categories of directory data are replicated between domain controllers:
  - Domain data (accounts…)
  - Configuration data (list of domains…)
  - Schema data (definition of all objects…)

Active Directory Backup
- System State Components:
  - System Start-up Files (boot files)
  - System registry
  - Class registration database of COM+
  - SYSVOL
- What Is A Good Backup?
  - System State, system disk contents, and the SYSVOL folder
  - Consider the tombstone age set in Active Directory
    - Default is 60 days
    - If the backup data is older than the tombstone lifetime, the restore is disallowed
  - Backup data from a DC can only be used to restore that DC
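
The tombstone rule above can be sketched as a simple age check. This is only an illustration: the function name and the sample dates are hypothetical, and the 60-day figure is the default quoted on the slide, so check the value actually configured in your forest.

```python
from datetime import date, timedelta

# Default tombstone lifetime quoted on the slide; your forest may differ.
DEFAULT_TOMBSTONE_DAYS = 60


def backup_is_restorable(backup_date: date, today: date,
                         tombstone_days: int = DEFAULT_TOMBSTONE_DAYS) -> bool:
    """Return True if the backup is not older than the tombstone lifetime."""
    age = today - backup_date
    return age <= timedelta(days=tombstone_days)


if __name__ == "__main__":
    today = date(2004, 6, 1)
    for taken in (date(2004, 5, 1), date(2004, 3, 1)):
        verdict = "usable" if backup_is_restorable(taken, today) else "too old"
        print(taken.isoformat(), verdict)
```

A check like this could run as part of a backup audit so that stale System State backups are flagged before anyone relies on them.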

Types Of Disaster
- Determine the type of disaster
  - Database corruption
  - Damaged disks
  - DC hardware failure
  - Software failure – server cannot boot
  - Data corruption
  - Accidentally deleted object from directory
- Methods to restore Windows 2003 DC:
  - Re-installation
  - Backup

Restore Through Re-Installation
- New DC receives the same name as failed DC:
  - Remove the ntdsDSA object of the failed DC using ntdsutil
- Use ntdsutil “metadata cleanup” command
  - connect to the remote DC
  - remove orphaned DC
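
As a rough sketch of the metadata cleanup above, the helper below only assembles the sequence of commands an administrator would type at the interactive ntdsutil prompts; it does not run anything. The function name, the server name and the numeric selections are hypothetical, since the real indices come from the `list` output shown at each step.

```python
from typing import List


def metadata_cleanup_commands(healthy_dc: str, site: int,
                              domain: int, server: int) -> List[str]:
    """Build the ordered ntdsutil prompts for removing an orphaned DC.

    The site/domain/server indices are placeholders: read the real ones
    from the preceding "list ..." output before selecting.
    """
    return [
        "metadata cleanup",
        "connections",
        f"connect to server {healthy_dc}",  # a surviving DC, not the failed one
        "quit",
        "select operation target",
        "list sites",
        f"select site {site}",
        "list domains in site",
        f"select domain {domain}",
        "list servers for domain in site",
        f"select server {server}",          # the failed (orphaned) DC
        "quit",
        "remove selected server",
        "quit",
        "quit",
    ]


if __name__ == "__main__":
    for line in metadata_cleanup_commands("dc02", 0, 0, 1):
        print(line)
```

Printing the sequence as a checklist keeps the destructive step (`remove selected server`) under manual control.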

Removing An Orphan ntdsDSA Object

Restore From Backup
- Non-Authoritative Restore
  - Default method for the restoration of Active Directory
  - DC is then updated using normal replication techniques
- Authoritative Restore
  - ntdsutil

- Follow non-authoritative restore before initiation
- Version number of the object attributes is incremented
- Can cover the entire directory, a subtree, or an individual object
- Used when human error is involved
  - Accidentally deleted a number of objects which cannot be recreated easily
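
Why incrementing the version number makes the restored data win can be sketched with a toy model of replication conflict resolution. The assumptions here are mine, not the slide's: that conflicts are resolved by comparing version first, then timestamp, then originating server GUID, and that the increment is 100,000 per day since the backup. All names are hypothetical.

```python
from dataclasses import dataclass

# Assumed increment per day since the backup; treat it as illustrative.
VERSION_BUMP_PER_DAY = 100_000


@dataclass(frozen=True)
class AttributeStamp:
    version: int
    timestamp: int   # originating write time, in seconds
    dsa_guid: str    # originating domain controller


def winner(a: AttributeStamp, b: AttributeStamp) -> AttributeStamp:
    """Pick the stamp that replication would keep (assumed tie-break order)."""
    return max(a, b, key=lambda s: (s.version, s.timestamp, s.dsa_guid))


def mark_authoritative(stamp: AttributeStamp, days_since_backup: int) -> AttributeStamp:
    """Raise the version so the restored value outranks newer changes."""
    bump = VERSION_BUMP_PER_DAY * max(days_since_backup, 1)
    return AttributeStamp(stamp.version + bump, stamp.timestamp, stamp.dsa_guid)


if __name__ == "__main__":
    restored = AttributeStamp(version=3, timestamp=1_000, dsa_guid="dc01")
    live = AttributeStamp(version=7, timestamp=2_000, dsa_guid="dc02")
    print(winner(restored, live))                          # live data wins
    print(winner(mark_authoritative(restored, 2), live))   # restored data wins
```

Without the increment, a plain (non-authoritative) restore is simply overwritten by the newer change on the other DCs, which is why it cannot undo an accidental deletion.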

Authoritative Restore Considerations
- Impact on Group Membership
  - Significant issue: possible loss of group membership information
  - Depends upon which objects replicate first
    - The User object OR
    - The Group object
- Impact on Trust relationship
- Password Reset

Recovering A Global Catalog Server
- Restore from backup or:
  - Add additional GC
  - Create branch office replica from media - dcpromo /adv
- Restore GC onto different hardware - issues
  - Different HALs
  - Incompatible Boot.ini file
  - Different network or video cards

AD Domain Rename (Rendom.exe)
- Forest mapping tool – Rendom.exe /list
- Forest constraints - you cannot:
  - Change which domain is the forest root domain.
  - Drop domains from or add domains to the forest.
  - Reuse a domain name.
- Domain rename operation is not supported if Exchange 2000 is deployed
  - Detected by domain rename tool.
- Now supported with Exchange 2003/SP1 and the XDR-Fixup.exe tool

AD Forest Recovery

Call Microsoft PSS first!!!
- Restoring forest DCs
- Steps involved:
  - Recover the forest root domain first
  - Recover the remaining domains in the forest
- ONE DC from backup
- DCs through reinstallation

AD Forest Recovery - High Level Steps
- Identify single DC for restore
- Shut down ALL DCs
- Recover first DC in root domain
  1. Primary SYSVOL restore, disable GC flag
  2. Configure DNS
  3. Raise value of RID pool by 100,000 (cn=RID Manager$,cn=System,dc=)
  4. Seize all (FSMO) roles (ntdsutil)
  5. Clean metadata of ALL DCs in the root (ntdsutil)
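
Step 3 is easy to get wrong by hand, so here is a small arithmetic sketch. It assumes the value being raised is the `rIDAvailablePool` attribute on the RID Manager$ object, a 64-bit integer whose high 32 bits hold the ceiling and whose low 32 bits hold the next RID to allocate; the function names and the sample value are illustrative.

```python
from typing import Tuple

LOW_MASK = 0xFFFFFFFF  # low 32 bits: next RID to be allocated


def split_rid_pool(value: int) -> Tuple[int, int]:
    """Split the 64-bit value into (ceiling, next_rid)."""
    return value >> 32, value & LOW_MASK


def raise_rid_pool(value: int, increment: int = 100_000) -> int:
    """Return the new 64-bit value with the low part raised by `increment`."""
    ceiling, next_rid = split_rid_pool(value)
    new_next = next_rid + increment
    if new_next > ceiling:
        raise ValueError("increment would exceed the RID ceiling")
    return (ceiling << 32) | new_next


if __name__ == "__main__":
    current = 4611686014132422708   # sample value, low part = 2100
    print(split_rid_pool(current))
    print(raise_rid_pool(current))
```

Raising the low part ensures the recovered forest never reissues RIDs that were handed out after the backup was taken.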

AD Forest Recovery - High Level Steps
- Recover FIRST DC in the root domain (cont.)
  6. Delete server and computer objects of all other DCs
  7. Reset the computer account of the DC twice (netdom)
  8. Reset the krbtgt password twice (ADUC)
  9. Reset the trust password twice (netdom)
- Restore FIRST DC in each of the remaining domains
  - Primary SYSVOL restore for domain
  - Same steps as previously (domain wide)
  - Enable GC flag
- DO FRESH BACKUP
- Install other DCs using dcpromo

AD Forest Recovery - High Level Steps
- White paper: N-US/forestrecovery.exe
- AD Fast recovery (VSS) – white paper available

Exchange 2003 Disaster Recovery

Disaster Recovery
- Issues
  - Resource intensive
  - Excessive downtime for clients
  - Maintenance of recovery forest/server
- Goals
  - Reduce resource requirements
  - Perceived client availability
  - Product supported/Integrated functionality

Where Is Exchange Information Stored?
- Registry settings and metabase
  - System state backup
- AD Directory Objects store “Recipient” information
  - Users, Groups, and Contacts
  - Replicated to GCs
  - Most Exchange information placed on existing objects is replicated between Global Catalogs
- AD Configuration
  - Exchange System Objects
  - Public Folder Directory entries
  - Active Directory Connector (ADC) settings

Levels Of Disaster Recovery
- Restoring mailboxes
  - Recovery Storage Group / Separate server / 3rd party backup utility
- Restoring one or more Exchange databases
  - Backup software
- Restoring multiple databases - single storage group
  - Backup software
- Complete disaster - full server recoveries

Disaster Recovery: What You Need
- Similar replacement hardware
- Windows 2003/Exchange 2003 install CDs, hotfixes
- System state backups
  - IIS Metabase, AD, Registry, Certificate Server
- Exchange database backups
  - Event ID 8019 (VSS/System State simultaneous backup fails)
- System/application drive backups

Disaster Recovery: Steps To Recover
1. Reconfigure hardware and drives
2. Reinstall operating system
3. Restore drive backups
4. Restore System state
5. Install Exchange: Setup /DisasterRecovery
6. Restore Exchange databases

Move Exchange To New Hardware (Exchange 2003 = GC)
- If server is a domain controller:
  - Deletion of computer account / NTDS Settings Object
  - DCPROMO /FORCEREMOVAL – “NEW”
- Keeping the same server name
  - Take existing Exchange 2003 computer offline
  - Reset existing Exchange 2003 computer account
  - Bring the new computer online using same name
  - Log on using Exchange 2003 Full Administrator account
  - Exchange 2003 Setup /disasterrecovery
  - Mount stores - check client connectivity and mail flow.

Using Exchange 2003 Stand-By Recovery Server
- What you need
  - System State backup
  - C:\Windows folder backup
  - Exchange 2003 database backups
- Steps to recover
  - Start stand-by server
  - Restore %SystemRoot% folder and System State
  - Run Exchange 2003 setup in disaster recovery mode
  - Restore databases
- Recovery Using Images
  - Drive signature issue prevents logon after recovery
  - Fix using Q and Q223188

Alternate Server Restores (Before)
[Diagram: a production server in the Active Directory forest and a restore server in a separate forest; arrows labelled Copy, Copy and Restore; mailbox data moved as .pst]

Recovery Storage Group
- RSG per Server/ Information Store
- Restore mailbox DBs from same SG
- Restore SG/DBs from same AG
- User mailboxes remain disconnected
- Only MAPI protocol supported
- Restores default into RSG
- Active/Passive: one restore storage group per EVS
- ONE recovery storage group per cluster supported

Benefits
- Do not need to maintain restore forest/server
- No changes to backup processes
- Utilise existing resources
- Minimises downtime without data loss
- Access to disconnected mailboxes (ExMerge)
- Allows single item/mailbox restore!!!
- WARNING: If a database already exists in the RSG, all additional databases must be from the same storage group as the first database
- WARNING: If an E2K database is restored to an EX2003 RSG, the database will be upgraded to the EX2003 version and cannot be mounted back on the E2K box.

Recovery Of Other Exchange 2003 Services
- Connectors
  - Lotus Notes
  - Novell GroupWise
  - Exchange Calendar Connector
- Custom OWA
- Clusters
  - Volume Mount Points
  - Majority Node Set (MNS) Clusters
  - Resource Kit clusdiag tool

Exchange 2003 Connectors
- Lotus Notes connector
  - Back up the Lotus Notes client software
  - Configuration file Notes.ini
  - Back up \conndata and its subdirectories
- Novell GroupWise connector
  - Ensure that GroupWise Gateway Network Service can be restored correctly
  - Back up \conndata and its subdirectories
- Install Exchange 2003 in disaster recovery mode using setup /disasterrecovery
- Clean directories from temporary work files *.seq and *.ck before you start connector

Custom OWA Themes
- Outlook Web Access in Exchange 2003 supports the concept of 'Themes'.
- To create your own: OWA themes, Custom Form for logon – logon.asp
  - STEP 1: On the front-end and back-end, create a directory in …\exchweb\themes, e.g. called "XBOXtheme"
  - STEP 2: Copy or create new versions of 9 images
  - STEP 3: Edit the CSS file to come up with your own colors and styles.
  - STEP 4: On the back-end, add reg key …

Custom Xbox OWA Theme

Exchange 2003 Clustering
- What to back up
  - Cluster Administrative software
  - Quorum
  - System State
  - Exchange 2003 Server
- Cluster Disaster Recovery types
  - Recover shared disk resource (Clusdb – Chkxxx.tmp Q224999)
  - Restore Quorum Resource
  - Replace a damaged node
  - Restore an entire Exchange 2003 cluster
- Majority Node Set (MNS) Cluster, ASR for cluster
- Windows 2000 to Windows Server 2003 rolling upgrades supported
- Support for Mount Points

ASR For Clusters
- Automated System Recovery (ASR) can completely restore a cluster in a variety of scenarios, including:
  - damaged or missing system files
  - complete OS reinstallation due to hardware failure
  - a damaged Cluster database
  - changed disk signatures (including shared)

Removing orphaned Exchange Server
- Active Directory Sites and Services snap-in
  - Services: Microsoft Exchange: organisation_name: Administrative Groups: Servers
  - Delete same named server object
- If cluster is gone you cannot delete Exchange Virtual Server resources from AD
  - Bind to DC using LDP: Configuration\Services\Microsoft Exchange\Organization\Administrative Group\Servers
  - Right click: Delete orphan EVS entries
- No option of Disaster Recovery Setup for EVS

Database Disaster Recovery

Logical Versus Physical Corruption
- Three layers of corruption that can occur
  - Page level
  - ESE level
  - Store level
- To remove corruption
  - Restore an uncorrupted backup of the database
  - Repair the database
  - Expunge the corrupted pages from the database
  - Salvage data and generate a new database

File Header Information
- Signatures for crossmatching logs and databases
- Database state
- Log files needed
- Backup information (last time done, type done)
- Shadow Headers
  - Database header changes are not logged
- Crash Recovery
  - Data locations
  - In a crash, all cache buffer data is lost
  - Transaction log data can reconstruct the cache
  - Damaged database files?
  - Damaged log files?

Errors 1018 and 1019
- Error 1018: JET_errReadVerifyFailure
  - Bad checksum / Wrong page number
  - Hardware / Firmware
  - File system corruption
- How serious are 1018 Errors?
  - During normal operation (somewhat serious)
  - During startup (likely fatal)
  - During backup (may be minor)
- Error 1019: JET_errPageNotInitialized
- What causes Error 1019?
  - Special case of error 1018 (page is replaced with zeroes)
  - Bad page links
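
The two failure modes behind error 1018 (bad checksum, wrong page number) and the zeroed-page case of 1019 can be illustrated with a toy page format. This is a sketch only: the header layout is invented, and `zlib.crc32` stands in for the real ESE checksum, which works differently.

```python
import struct
import zlib

PAGE_SIZE = 4096
HEADER = struct.Struct("<II")          # invented layout: checksum, page number
JET_errReadVerifyFailure = -1018
JET_errPageNotInitialized = -1019


def _checksum(page_number: int, body: bytes) -> int:
    # Stand-in checksum covering the page number and the page body.
    return zlib.crc32(struct.pack("<I", page_number) + body)


def make_page(page_number: int, payload: bytes) -> bytes:
    """Build a toy page: header followed by zero-padded payload."""
    body_size = PAGE_SIZE - HEADER.size
    if len(payload) > body_size:
        raise ValueError("payload too large for one page")
    body = payload.ljust(body_size, b"\0")
    return HEADER.pack(_checksum(page_number, body), page_number) + body


def verify_page(page: bytes, expected_number: int) -> int:
    """Return 0 if the page is sound, else the matching error code."""
    if not any(page):                       # page replaced with zeroes
        return JET_errPageNotInitialized
    checksum, number = HEADER.unpack_from(page)
    body = page[HEADER.size:]
    if number != expected_number:           # right data, wrong location
        return JET_errReadVerifyFailure
    if _checksum(number, body) != checksum:  # contents damaged
        return JET_errReadVerifyFailure
    return 0


if __name__ == "__main__":
    page = make_page(5, b"mailbox data")
    print(verify_page(page, 5), verify_page(page, 6), verify_page(bytes(PAGE_SIZE), 5))
```

The point of the sketch is that the database engine only reports the damage; the cause sits below it in the hardware, firmware or file system, as the slide lists.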

Errors 1022 and 1216
- Error 1022: JET_errDiskIO
  - Disk I/O failure
  - File damage or truncation
  - File locked by another process
  - Anti-virus software
- Error 1216 (Q296843)
  - Files in the database's running set are missing or have been replaced
  - When storage group starts, system analyses header information
- If logs are missing:
  - Restore the database from backup
  - Repair the database by using ESEUTIL /P followed by ESEUTIL /D and ISINTEG -fix
  - Q – more details
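
The repair path above is order-sensitive, so a sketch that only builds the three command lines in sequence may help. Nothing is executed here; the database path and server name are hypothetical, and the `isinteg` switches shown (`-s`, `-fix`, `-test alltests`) are the commonly documented form rather than anything stated on the slide. Repair with `/P` can discard data, so restoring from backup remains the first choice.

```python
from typing import List


def repair_commands(db_path: str, server: str) -> List[List[str]]:
    """Return the repair, defragment and integrity-fix commands in order."""
    return [
        ["eseutil", "/p", db_path],                              # hard repair
        ["eseutil", "/d", db_path],                              # offline defragmentation
        ["isinteg", "-s", server, "-fix", "-test", "alltests"],  # store-level fix
    ]


if __name__ == "__main__":
    # Hypothetical path and server name, for illustration only.
    for cmd in repair_commands(r"D:\Exchsrvr\mdbdata\priv1.edb", "EXCH01"):
        print(" ".join(cmd))
```

Returning argument lists rather than one shell string keeps paths with spaces intact if the commands are later handed to a process launcher.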

Best Practices

Best Practices - Hardware
- Protect databases with RAID 5, RAID 0+1 or RAID 10
- Protect logs with mirroring (separate volume)
- Keep disks < 50% full
- Deploy fastest possible backup solutions
- Never use circular logging for mailbox storage groups
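
The "keep disks below 50% full" guideline is easy to monitor. A minimal sketch using only the Python standard library follows; the function names and the path checked are assumptions, and in practice you would point it at the database and log volumes.

```python
import shutil


def volume_fill_ratio(path: str) -> float:
    """Return the used fraction (0.0 to 1.0) of the volume holding `path`."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total


def over_threshold(path: str, threshold: float = 0.5) -> bool:
    """True if the volume is at or above the fill threshold."""
    return volume_fill_ratio(path) >= threshold


if __name__ == "__main__":
    path = "."  # hypothetical: replace with the database or log volume
    print(f"{path}: {volume_fill_ratio(path):.0%} used, "
          f"over 50%: {over_threshold(path)}")
```

The headroom matters during recovery: offline defragmentation and restores into a Recovery Storage Group both need free space roughly the size of the database.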

Volume Shadow Copy Service
- Requestor - Third Party Vendor Integration Module for Exchange
- Writer - Microsoft Exchange
  - Freeze write operations to ensure data consistency
- VSS HW Provider
- [Diagram: Exchange Storage Group – Data and Log Source and Clone Targets (SG-1 Data Source, SG-1 Data Clone, SG-1 Log Source, SG-1 Log Clone)]
- Replica Based Recovery

Microsoft IT Design Goals
- MB mailboxes (4000 per EVS)
- 1.2 IOPS per mailbox peak + 20% buffer
- 16 ms read, 3 ms write response at peak load
- Scalable clustered Exchange configuration
- Deploy transportable VSS functionality
- Windows Server 2003 RTM
- Exchange Server 2003 RTM/SP1
- 72 GB 10K disks for production data and log
- 146 GB 10K disks for clones
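
The sizing figures above combine into a simple per-EVS estimate. This sketch assumes the 20% buffer is applied multiplicatively on top of the peak figure; the function name is illustrative.

```python
def peak_iops(mailboxes: int, iops_per_mailbox: float = 1.2,
              buffer: float = 0.20) -> float:
    """Estimate the IOPS a server must sustain: peak load plus headroom."""
    return mailboxes * iops_per_mailbox * (1 + buffer)


if __name__ == "__main__":
    # 4000 mailboxes per EVS, as on the slide.
    print(round(peak_iops(4000)))
```

For 4000 mailboxes this works out to 4000 × 1.2 × 1.2 = 5760 IOPS per EVS, which is the number the disk layout has to deliver within the stated read and write response times.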

Best Practices - Conclusion
- Develop backup and restore strategies
- Use rendom.exe /list, ExchDump.exe and ExDeploy to map the environment
- Give a dedicated administrator backup and restore responsibilities
- Keep three copies of the backup media.
- Back up an entire volume
- Perform a trial restoration periodically
- Test recovery procedures.
- Verify that Windows startup disks function correctly.
- Use an uninterruptible power supply (UPS) on the computers running Windows Server
- Test your disaster recovery plan.
- Test restores from daily, weekly, and monthly backup media.

SP1 Enhancements

Service Pack 1
- Self-heal removes 40% of 1018 errors
- No need for ExMerge – RSG Wizard
  - Right click on Storage Group to merge data
- Change of database format
  - Database is not upgraded at once – only pages that are changed get upgraded
- Support for iSCSI and NAS devices
- Once SP1 has been used, you must continue with SP1

SP1 – Self-Healing
- Error: the most familiar (and dreaded) Exchange database error
- New Error Correcting Code (ECC) algorithm
- Exchange automatically corrects damage to the database file if caused by a "bit flip"
- 40% of damaged Exchange pages are caused by bit flips
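
How a single "bit flip" can be located and undone is easiest to see in a toy scheme. The sketch below is not Exchange's ECC algorithm: it stores the XOR of the positions of all set bits as a locator, plus a CRC32 to confirm the repair, and every name in it is hypothetical.

```python
import zlib
from typing import Optional, Tuple


def locator(data: bytes) -> int:
    """XOR of (bit position + 1) over every set bit in the data."""
    acc = 0
    for byte_index, byte in enumerate(data):
        for bit in range(8):
            if byte & (1 << bit):
                acc ^= byte_index * 8 + bit + 1
    return acc


def protect(data: bytes) -> Tuple[int, int]:
    """Compute the values stored alongside the page when it is written."""
    return locator(data), zlib.crc32(data)


def heal(data: bytes, stored_locator: int, stored_crc: int) -> Optional[bytes]:
    """Return repaired data, or None if the damage is not a single bit flip."""
    if zlib.crc32(data) == stored_crc:
        return data                                  # nothing to fix
    # A single flipped bit changes the locator by exactly (position + 1).
    position = (locator(data) ^ stored_locator) - 1
    if not 0 <= position < len(data) * 8:
        return None
    fixed = bytearray(data)
    fixed[position // 8] ^= 1 << (position % 8)
    repaired = bytes(fixed)
    return repaired if zlib.crc32(repaired) == stored_crc else None


if __name__ == "__main__":
    page = bytes(range(64))
    loc, crc = protect(page)
    damaged = bytearray(page)
    damaged[10] ^= 0x04                              # simulate one bit flip
    print(heal(bytes(damaged), loc, crc) == page)
```

Damage wider than one bit fails the final CRC check and is reported rather than "repaired", which mirrors the slide's point that only the bit-flip share of damaged pages can be healed automatically.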

Upgrading DB to SP1 tips
- If databases with 1018 errors are upgraded to SP1:
  - ECC information needed to correct an error does not exist on previously written database pages
- To upgrade all pages in a database at once
  - Take the database offline and defragment
  - Eseutil /D [db_filename]
- Will not be able to replay logs
- Make new full backup and mark old sets invalid
- Transaction log replays

Restoring SP1 DB
- You can restore a pre-SP1 backup set to an SP1 server
- Not the other way round
  - Pre-SP1 Exchange can't understand the ECC data on the page
  - The page is interpreted as damaged

SP1 Dialtone
- SP1 enhances Recovery Storage Group interface with automated merge of data between:
  - Dialtone database
  - Last backup

DIALTONE MERGE SP1

Conclusion
- Review your disaster recovery plan when upgrading / deploying Exchange 2000/2003
- Backup all data needed for full recovery
- Verify disaster recovery and restore plans through drills
- Read Exchange 2003 mailbox and disaster recovery whitepapers regularly
- Audit your Best Practices
- Request Microsoft PSS Operations Assessment

© 2004 Microsoft Corporation. All rights reserved. This presentation is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESS OR IMPLIED, IN THIS SUMMARY.