Published by Russell Osborne. Modified over 8 years ago.
2
EXCHANGE SERVER 2010 HIGH AVAILABILITY DEEP DIVE
Scott Schnoll, Principal Technical Writer, Microsoft Corporation
SESSION CODE: EXL407
(c) 2011 Microsoft. All rights reserved.
3
Agenda
► Exchange Server 2010 High Availability Deep Dive
  – Database Availability Group Networks
  – Active Manager
  – Best Copy Selection
  – Datacenter Activation Coordination Mode
4
Exchange Server 2010 High Availability Deep Dive: Database Availability Group Networks
5
DAG Networks
► A DAG network is a collection of one or more subnets
► There are two types of DAG networks
  – MAPI network: connects DAG members to network resources (Active Directory, other Exchange servers, DNS, etc.)
    • Registered in DNS / DNS configured
    • Uses default gateway
    • Client for Microsoft Networks / File and Print Sharing enabled
  – Replication network: used by continuous replication (log shipping and seeding)
    • Not registered in DNS / DNS not configured
    • Typically no default gateway
    • Client for Microsoft Networks / File and Print Sharing disabled
6
DAG Networks
► All DAGs must have:
  – Exactly one MAPI network
  – Zero or more Replication networks
    • Separate network(s) on separate subnet(s)
    • When there are multiple replication networks, LRU (least recently used) determines which one is used
► DAG networks are created automatically when a Mailbox server is added to the DAG
  – Based on the cluster's enumeration of networks
    • Cluster enumeration is based on subnet
    • One cluster network is created for each subnet
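The slide only says that LRU picks among multiple replication networks. The following is a minimal sketch of a least-recently-used picker, not Exchange's implementation; the class name and the network names `ReplNet01` and `ReplNet02` are hypothetical.

```python
from collections import OrderedDict


class ReplicationNetworkPicker:
    """Hypothetical LRU picker; illustrates the idea only."""

    def __init__(self, network_names):
        # Key order doubles as recency order: the first key is the
        # least recently used network.
        self._networks = OrderedDict((name, None) for name in network_names)

    def pick(self):
        if not self._networks:
            raise ValueError("no replication networks configured")
        name = next(iter(self._networks))   # least recently used
        self._networks.move_to_end(name)    # now the most recently used
        return name


picker = ReplicationNetworkPicker(["ReplNet01", "ReplNet02"])
choices = [picker.pick() for _ in range(4)]
print(choices)
```

With two networks, successive picks alternate between them, so the replication load is spread rather than pinned to one network.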
7
DAG Networks
► Maximum round-trip latency between all DAG members must be 500 ms or less
  – Regardless of the latency of the solution, customers should validate that the network between all DAG members can satisfy the data protection and availability goals of the deployment
  – You may need to increase the number of databases or decrease the number of mailboxes per database to achieve the desired goals
8
DAG Networks

| Server / Network | IP Address / Subnet Bits | Default Gateway |
|---|---|---|
| EX1 – MAPI | 192.168.0.15/24 | 192.168.0.1 |
| EX1 – REPLICATION | 10.0.0.15/24 | N/A |
| EX2 – MAPI | 192.168.0.16/24 | 192.168.0.1 |
| EX2 – REPLICATION | 10.0.0.16/24 | N/A |

| Name | Subnet(s) | Interface(s) | MAPI Access Enabled | Replication Enabled |
|---|---|---|---|---|
| DAGNetwork01 | 192.168.0.0/24 | EX1 (192.168.0.15), EX2 (192.168.0.16) | True | True |
| DAGNetwork02 | 10.0.0.0/24 | EX1 (10.0.0.15), EX2 (10.0.0.16) | False | True |
9
DAG Networks

| Server / Network | IP Address / Subnet Bits | Default Gateway |
|---|---|---|
| EX1 – MAPI | 192.168.0.15/24 | 192.168.0.1 |
| EX1 – REPLICATION | 10.0.0.15/24 | N/A |
| EX2 – MAPI | 192.168.1.15/24 | 192.168.1.1 |
| EX2 – REPLICATION | 10.0.1.15/24 | N/A |

| Name | Subnet(s) | Interface(s) | MAPI Access Enabled | Replication Enabled |
|---|---|---|---|---|
| DAGNetwork01 | 192.168.0.0/24 | EX1 (192.168.0.15) | True | True |
| DAGNetwork02 | 10.0.0.0/24 | EX1 (10.0.0.15) | False | True |
| DAGNetwork03 | 192.168.1.0/24 | EX2 (192.168.1.15) | True | True |
| DAGNetwork04 | 10.0.1.0/24 | EX2 (10.0.1.15) | False | True |
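To see why four DAG networks appear when each server sits on its own pair of subnets, the "one network per subnet" rule can be sketched in Python. This is an illustration only: the grouping function and the `DAGNetworkNN` numbering by first appearance are assumptions, not the cluster's actual enumeration code. The interface data is taken from the example above.

```python
import ipaddress
from collections import defaultdict

# (server, address/prefix) pairs from the multi-subnet example.
interfaces = [
    ("EX1", "192.168.0.15/24"),
    ("EX1", "10.0.0.15/24"),
    ("EX2", "192.168.1.15/24"),
    ("EX2", "10.0.1.15/24"),
]


def enumerate_networks(ifaces):
    """Group interfaces by the subnet they belong to (one network per subnet)."""
    by_subnet = defaultdict(list)
    for server, cidr in ifaces:
        iface = ipaddress.ip_interface(cidr)
        by_subnet[iface.network].append((server, str(iface.ip)))
    return dict(by_subnet)


networks = enumerate_networks(interfaces)
for index, (subnet, members) in enumerate(networks.items(), start=1):
    # Numbering by order of first appearance is an assumption for display.
    print(f"DAGNetwork{index:02d}  {subnet}  {members}")
```

Because no two interfaces share a subnet, every interface lands in its own network, which is the situation the collapse step is meant to fix.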
10
DAG Networks
► Collapse the four single-subnet networks (DAGNetwork01 through DAGNetwork04) into two DAG networks and disable replication for the MAPI network:

    Set-DatabaseAvailabilityGroupNetwork -Identity DAG2\DAGNetwork01 -Subnets 192.168.0.0,192.168.1.0 -ReplicationEnabled:$false
    Set-DatabaseAvailabilityGroupNetwork -Identity DAG2\DAGNetwork02 -Subnets 10.0.0.0,10.0.1.0
    Remove-DatabaseAvailabilityGroupNetwork -Identity DAG2\DAGNetwork03
    Remove-DatabaseAvailabilityGroupNetwork -Identity DAG2\DAGNetwork04
11
DAG Networks
► Result after collapsing the subnets into two DAG networks and disabling replication for the MAPI network:

| Name | Subnet(s) | Interface(s) | MAPI Access Enabled | Replication Enabled |
|---|---|---|---|---|
| DAGNetwork01 | 192.168.0.0/24, 192.168.1.0/24 | EX1 (192.168.0.15), EX2 (192.168.1.15) | True | False |
| DAGNetwork02 | 10.0.0.0/24, 10.0.1.0/24 | EX1 (10.0.0.15), EX2 (10.0.1.15) | False | True |
12
DAG Networks
► Automatic detection occurs only when members are added to the DAG
  – If networks are added after a member is added, you must perform discovery:
    Set-DatabaseAvailabilityGroup -DiscoverNetworks
► DAG network configuration is persisted in the cluster registry
  – HKLM\Cluster\Exchange\DAG Network
► DAG networks include built-in encryption and compression
  – Encryption: Kerberos SSP EncryptMessage/DecryptMessage APIs
  – Compression: Microsoft XPRESS, based on the LZ77 algorithm
13
DAG Networks
► Block cross-network communication to minimize heartbeat traffic
[Diagram: Subnet 1 through Subnet 4, with the paths between them marked Blocked or Allowed]
14
DAG Networks
► If using iSCSI storage, configure the DAG and the cluster to ignore iSCSI networks
  1. Set-DatabaseAvailabilityGroupNetwork -Identity -ReplicationEnabled:$false -IgnoreNetwork:$true
  2. Cluster network /prop Role=0
15
DAG Networks
► When a DAG spans multiple subnets, you need an IP address on the MAPI network for each subnet
► Use DHCP in site resilience configurations to assign IP addresses to the Replication network
  – Enables delivery of the static routes that are typically required
  – If using static IP addresses, use netsh to configure static routes
► Configure a DNS TTL on service access connection records that is consistent with your SLA, for example about 5 minutes for a one-hour RTO
16
Exchange Server 2010 High Availability Deep Dive: Active Manager
17
Active Manager
► What are the three Active Manager roles?
  – Standalone
  – PAM (Primary Active Manager)
  – SAM (Standby Active Manager)
► Role state transitions are logged in the Microsoft-Exchange-HighAvailability/Operational event log (Crimson Channel)
18
Active Manager Functionality
► Mount and dismount databases
► Provide database availability information
► Provide an interface for administrative tasks
► Monitor for failures
► Maintain database and server state information
19
AutoMount on DAG Members
► In a DAG, all AutoMount operations are coordinated through the PAM
► AutoMount operations occur:
  – When the first server in the DAG is initialized
  – When the ownership of the PAM role is changed
20
AutoMount on DAG Members
► Checks msExchMasterServerOrAvailabilityGroup to determine all databases hosted on the DAG
► Checks whether the database can be mounted on startup
  – If msExchEDBOffline is TRUE, stop processing
  – If msExchEDBOffline is FALSE, proceed with processing
21
AutoMount on DAG Members
► Checks persistent database information stored in the cluster registry
► Determines whether the database is mounted on another DAG member
  – If the database is mounted on another server, take no action
  – If the database is not mounted on another server, proceed
22
AutoMount on DAG Members
► Checks AdminDismount in the cluster registry:
  – If AdminDismount is TRUE, take no action
  – If AdminDismount is FALSE, proceed
► Checks persistent database state information in the cluster registry for the server on which the database was last mounted
  – If that server is available, issue a mount request to the Information Store on that server
  – If that server is not available or the property is not set, issue a mount request to the next server in the sorted list
23
AutoMount on DAG Members
► If the AutoMount operation succeeds:
  – Update persistent database state information stored in the cluster database
  – Propagate the information to all other DAG members
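The AutoMount checks on the preceding slides form a short decision chain. A minimal sketch in Python follows, assuming the Active Directory and cluster registry values have already been read into a plain record; the `DatabaseState` type, its field names, and the "first available server in the sorted list" fallback are illustrative assumptions, not Active Manager's code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DatabaseState:
    edb_offline: bool               # msExchEDBOffline
    mounted_on: Optional[str]       # server where the database is mounted, if any
    admin_dismount: bool            # AdminDismount in the cluster registry
    last_mounted_on: Optional[str]  # server on which it was last mounted


def automount_target(db, available_servers, sorted_servers):
    """Return the server that should receive the mount request, or None."""
    if db.edb_offline:
        return None          # not allowed to mount on startup
    if db.mounted_on is not None:
        return None          # already mounted on a DAG member
    if db.admin_dismount:
        return None          # an administrator dismounted it on purpose
    if db.last_mounted_on in available_servers:
        return db.last_mounted_on
    # Last server unavailable or not recorded: fall back to the sorted list.
    for server in sorted_servers:
        if server in available_servers:
            return server
    return None


db = DatabaseState(edb_offline=False, mounted_on=None,
                   admin_dismount=False, last_mounted_on="EX1")
print(automount_target(db, {"EX2", "EX3"}, ["EX1", "EX2", "EX3"]))
```

Ordering the checks this way means the cheaper "do nothing" conditions are ruled out before any server is contacted.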
24
Mount / Dismount Database Copy
► Mount Database
  – An administrator action invoked through a task
  – The last part of a move operation
► Dismount Database
  – An administrator action invoked through a task
  – The first part of a move operation
25
Mount Database – DAG Member
► Initiate RPC to a member of the DAG
  – If the server contacted is not the PAM, the task is referred to the PAM
  – If the server is the PAM, continue with no referral
► Checks msExchMasterServerOrAvailabilityGroup to ensure the database is hosted in the DAG
  – If the database is hosted in the DAG, proceed
  – If the database is not hosted in the DAG, error out
26
Mount Database – DAG Member
► Checks whether the database is already mounted
  – If already mounted, the task fails
  – If not already mounted, the task continues
► PAM invokes a callback
  – This invokes a pre-check for the database mount operation
  – Persistent database state is updated to show that the mount is Initiated
27
Mount Database – DAG Member
► PAM invokes an RPC call to the Information Store to mount the database
  – If the mount fails, the task fails
  – If the mount succeeds, the task completes successfully
► Persistent database state is updated to record the results of the operation and propagated to other members
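The mount sequence above can be summarized as a single function. This is a sketch under stated assumptions: the state strings (`MountInitiated`, `Mounted`, `MountFailed`), the `MountError` exception, and the `store_mount` callable standing in for the RPC to the Information Store are all hypothetical names chosen for illustration.

```python
class MountError(Exception):
    """Raised when the mount task fails in this sketch."""


def mount_database(db_name, contacted, pam, dag_databases, state, store_mount):
    """Walk the mount steps; return the name of the server that handled the task."""
    # Step 1: a non-PAM member refers the task to the PAM.
    handler = contacted if contacted == pam else pam
    # Step 2: the database must be hosted in the DAG.
    if db_name not in dag_databases:
        raise MountError(f"{db_name} is not hosted in the DAG")
    # Step 3: an already mounted database fails the task.
    if state.get(db_name) == "Mounted":
        raise MountError(f"{db_name} is already mounted")
    # Step 4: record that the mount has been initiated.
    state[db_name] = "MountInitiated"
    # Step 5: ask the Information Store to mount, then record the result.
    succeeded = store_mount(db_name)
    state[db_name] = "Mounted" if succeeded else "MountFailed"
    if not succeeded:
        raise MountError(f"mount of {db_name} failed")
    return handler


state = {}
handled_by = mount_database("DB1", contacted="EX2", pam="EX1",
                            dag_databases={"DB1"}, state=state,
                            store_mount=lambda name: True)
print(handled_by, state)
```

In the example call, EX2 is contacted but EX1 holds the PAM role, so the task is handled by EX1.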
28
Dismount Database – DAG Member
► Task initiates a call to the PAM or is referred to the PAM
► PAM checks that the msExchMasterServerOrAvailabilityGroup value matches the DAG
► PAM verifies that the database is mounted in the DAG by checking persistent database state information stored in the registry
  – If the database is mounted, the task proceeds
  – If the database is dismounted, the task fails
29
Dismount Database – DAG Member
► PAM updates persistent state information in the cluster database to show state Initiated
► PAM makes an RPC call to the Information Store on the DAG member and invokes the dismount
  – If the dismount operation succeeds, persistent database state information stored in the cluster database is updated
  – If the dismount operation fails, the task fails
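The dismount path mirrors the mount path. A minimal sketch, assuming the same kind of in-memory stand-ins: `db_owner` represents the msExchMasterServerOrAvailabilityGroup value, `store_dismount` represents the RPC to the Information Store, and the state strings and `DismountError` are hypothetical names.

```python
class DismountError(Exception):
    """Raised when the dismount task fails in this sketch."""


def dismount_database(db_name, dag_name, db_owner, state, store_dismount):
    # The database must belong to this DAG.
    if db_owner != dag_name:
        raise DismountError(f"{db_name} is not owned by {dag_name}")
    # The persistent state must show the database as mounted.
    if state.get(db_name) != "Mounted":
        raise DismountError(f"{db_name} is not mounted")
    # Record that the dismount has been initiated.
    state[db_name] = "DismountInitiated"
    if not store_dismount(db_name):
        # The slides only say the task fails here; the state is left as is.
        raise DismountError(f"dismount of {db_name} failed")
    state[db_name] = "Dismounted"


state = {"DB1": "Mounted"}
dismount_database("DB1", "DAG1", "DAG1", state, lambda name: True)
print(state)
```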
30
Auto Dismount – DAG Member
► Occurs when a DAG loses quorum
► All DAG members are running (but may not be participating in the cluster)
► Databases are dismounted as quickly as possible to avoid split brain
  – The Information Store service is terminated
31
Auto Dismount – DAG Member
► The dismount operation should attempt to update database state information in the cluster database
► This is the only case where a database operation occurs on a server other than the PAM
32
Active Manager – Move Database
► Move Database
  – An administrator action invoked by a task
  – An automatic operation initiated by the PAM (failover)
► Begins with a Dismount operation and ends with a Mount operation
33
Exchange Server 2010 High Availability Deep Dive: Best Copy Selection
34
Best Copy Selection
► The process of finding the best copy of an individual database to activate, given a list of potential copies for activation and their status
► Active Manager selects the “best” copy to become the new active copy when the existing active copy fails or when an administrator performs a targetless switchover
35
Best Copy Selection – RTM
► Sorts copies by copy queue length to minimize data loss, using activation preference as a secondary sort key if necessary
► Selects from the sorted list based on which set of criteria is met by each copy
► Attempt Copy Last Logs (ACLL) runs and attempts to copy missing log files from the previous active copy
36
Best Copy Selection – SP1
► Sorts copies by activation preference when the auto database mount dial is set to Lossless
  – Otherwise, sorts copies by copy queue length, with activation preference used as a secondary sort key if necessary
► Selects from the sorted list based on which set of criteria is met by each copy
► Attempt Copy Last Logs (ACLL) runs and attempts to copy missing log files from the previous active copy
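The difference between the RTM and SP1 sort order is easy to see in code. The sketch below is an illustration, not Active Manager's implementation; the `Copy` type and the function name are hypothetical, and the sample values are the three passive copies used in the deck's four-copy example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    name: str
    activation_preference: int
    copy_queue_length: int


def sort_copies(copies, lossless_dial, sp1=True):
    """Order candidate copies the way the slides describe for RTM and SP1."""
    if sp1 and lossless_dial:
        # SP1 with the mount dial set to Lossless: activation preference only.
        return sorted(copies, key=lambda c: c.activation_preference)
    # RTM, or SP1 with any other dial setting: copy queue length first,
    # activation preference as the tie-breaker.
    return sorted(copies, key=lambda c: (c.copy_queue_length,
                                         c.activation_preference))


copies = [
    Copy("Server2\\DB1", activation_preference=2, copy_queue_length=4),
    Copy("Server3\\DB1", activation_preference=3, copy_queue_length=2),
    Copy("Server4\\DB1", activation_preference=4, copy_queue_length=10),
]
rtm_order = [c.name for c in sort_copies(copies, lossless_dial=False, sp1=False)]
sp1_order = [c.name for c in sort_copies(copies, lossless_dial=True, sp1=True)]
print(rtm_order)
print(sp1_order)
```

Under RTM rules Server3\DB1 comes first because it has the shortest copy queue; under SP1 with a Lossless dial Server2\DB1 comes first because it has the lowest activation preference value.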
37
Best Copy Selection
► Is the database mountable?
  – Is the copy queue length <= AutoDatabaseMountDial?
    • If yes, the database is marked as the current active copy and a mount request is issued
    • If not, the next best database is tried (if one is available)
► During best copy selection, any servers that are unreachable or “activation blocked” are ignored
38
Best Copy Selection

| Criteria | Copy Queue Length | Replay Queue Length | Content Index Status |
|---|---|---|---|
| 1 | < 10 logs | < 50 logs | Healthy |
| 2 | < 10 logs | < 50 logs | Crawling |
| 3 | N/A | < 50 logs | Healthy |
| 4 | N/A | < 50 logs | Crawling |
| 5 | N/A | < 50 logs | N/A |
| 6 | < 10 logs | N/A | Healthy |
| 7 | < 10 logs | N/A | Crawling |
| 8 | N/A | N/A | Healthy |
| 9 | N/A | N/A | Crawling |
| 10 | Any database copy with a status of Healthy, DisconnectedAndHealthy, DisconnectedAndResynchronizing, or SeedingSource | | |
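The criteria table can be read as an ordered list of progressively looser filters. The sketch below encodes it in Python and walks the sorted copies against each set in turn. It is one plausible reading of the table, not the product's code: the dictionary keys, function names, and the "try set 1 against every copy before moving to set 2" loop order are assumptions.

```python
# (max copy queue length, max replay queue length, content index status);
# None means the column is N/A for that criteria set.
CRITERIA = [
    (10, 50, "Healthy"),
    (10, 50, "Crawling"),
    (None, 50, "Healthy"),
    (None, 50, "Crawling"),
    (None, 50, None),
    (10, None, "Healthy"),
    (10, None, "Crawling"),
    (None, None, "Healthy"),
    (None, None, "Crawling"),
]
FALLBACK_STATUSES = {
    "Healthy", "DisconnectedAndHealthy",
    "DisconnectedAndResynchronizing", "SeedingSource",
}


def meets(copy, criteria):
    max_cql, max_rql, ci = criteria
    return ((max_cql is None or copy["cql"] < max_cql)
            and (max_rql is None or copy["rql"] < max_rql)
            and (ci is None or copy["ci"] == ci))


def select_copy(sorted_copies):
    """Return (copy name, criteria set number) for the first match, or None."""
    for number, criteria in enumerate(CRITERIA, start=1):
        for copy in sorted_copies:
            if meets(copy, criteria):
                return copy["name"], number
    for copy in sorted_copies:          # criteria set 10: status only
        if copy["status"] in FALLBACK_STATUSES:
            return copy["name"], 10
    return None


server4 = {"name": "Server4", "cql": 10, "rql": 0,
           "ci": "Crawling", "status": "Healthy"}
print(select_copy([server4]))
```

A copy with a copy queue length of exactly 10 and a crawling content index fails sets 1 through 3 and is first accepted by set 4.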
39
Best Copy Selection – RTM
► Four copies of DB1
► DB1 currently active on Server1
[Diagram: copies of DB1 on Server1 through Server4, with one copy marked X]

| Database Copy | Activation Preference | Copy Queue Length | Replay Queue Length | CI State | Database State |
|---|---|---|---|---|---|
| Server2\DB1 | 2 | 4 | 0 | Healthy | Healthy |
| Server3\DB1 | 3 | 2 | 2 | Healthy | DiscAndHealthy |
| Server4\DB1 | 4 | 10 | 0 | Crawling | Healthy |
40
Best Copy Selection – RTM
► Sort the list of available copies by Copy Queue Length (using Activation Preference as a secondary sort key if necessary):
  – Server3\DB1
  – Server2\DB1
  – Server4\DB1
41
Best Copy Selection – RTM
► Only two copies meet the first set of criteria for activation (CQL < 10; RQL < 50; CI = Healthy):
  – Server3\DB1 (lowest copy queue length, tried first)
  – Server2\DB1
► Server4\DB1 does not meet the first set of criteria (CQL = 10; CI = Crawling)
42
Best Copy Selection – SP1
► Four copies of DB1
► DB1 currently active on Server1
► Auto database mount dial set to Lossless
[Diagram: copies of DB1 on Server1 through Server4, with one copy marked X]

| Database Copy | Activation Preference | Copy Queue Length | Replay Queue Length | CI State | Database State |
|---|---|---|---|---|---|
| Server2\DB1 | 2 | 4 | 0 | Healthy | Healthy |
| Server3\DB1 | 3 | 2 | 2 | Healthy | DiscAndHealthy |
| Server4\DB1 | 4 | 10 | 0 | Crawling | Healthy |
43
Best Copy Selection – SP1
► Sort the list of available copies by Activation Preference:
  – Server2\DB1
  – Server3\DB1
  – Server4\DB1
44
Best Copy Selection – SP1
► Sort the list of available copies by Activation Preference:
  – Server2\DB1 (lowest preference value, tried first)
  – Server3\DB1
  – Server4\DB1
45
Best Copy Selection
► After Active Manager determines the best copy to activate
  – The Replication service on the target server attempts to copy missing log files from the source (ACLL)
    • If successful, the database mounts with zero data loss
    • If unsuccessful (lossy failure), the database mounts based on the AutoDatabaseMountDial setting
    • If the data loss is outside the dial setting, the next copy is tried
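The interaction between ACLL and the mount dial can be sketched as a loop over the sorted candidates. This is a simplified model, not the Replication service's logic. The numeric thresholds for the dial settings (0, 6, and 12 lost logs) are stated here as an assumption and should be checked against the product documentation; `attempt_copy_last_logs` is a hypothetical callable that returns how many logs are still missing after ACLL runs.

```python
# Assumed maximum number of lost logs tolerated for each dial setting.
MOUNT_DIAL = {"Lossless": 0, "GoodAvailability": 6, "BestAvailability": 12}


def activate_best_copy(sorted_copies, dial, attempt_copy_last_logs):
    """Return (copy name, logs lost) for the first mountable copy, or None."""
    limit = MOUNT_DIAL[dial]
    for name, copy_queue_length in sorted_copies:
        # ACLL tries to pull the missing logs from the previous active copy.
        missing = attempt_copy_last_logs(name, copy_queue_length)
        if missing <= limit:
            return name, missing   # zero missing logs means no data loss
        # Loss is outside the dial setting: try the next copy.
    return None


candidates = [("Server3", 2), ("Server2", 4)]
acll_fails = lambda name, cql: cql   # source unreachable, nothing copied
print(activate_best_copy(candidates, "GoodAvailability", acll_fails))
print(activate_best_copy(candidates, "Lossless", acll_fails))
```

When ACLL cannot reach the source, a Lossless dial rejects every copy that is behind, while a looser dial accepts the first copy whose loss falls within its limit.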
46
Best Copy Selection
► If an activated database copy is mounted:
  – It will generate new log files (using the same log generation sequence)
  – Transport Dumpster requests will be initiated for the mounted database to recover lost messages
  – When the original server or database recovers, it will run through divergence detection and either perform an incremental resync or require a full reseed
47
Exchange Server 2010 High Availability Deep Dive: Datacenter Activation Coordination Mode
48
Datacenter Activation Coordination Mode
► DAC mode is a property of a DAG
► Acts as an application-level form of quorum
  – Controls whether or not a Mailbox server attempts to mount its active databases on startup
  – Designed to prevent multiple copies of the same database from mounting on different members due to loss of network (split brain)
► Also enables use of the Site Resilience tasks
  – Stop-DatabaseAvailabilityGroup
  – Restore-DatabaseAvailabilityGroup
  – Start-DatabaseAvailabilityGroup
49
Datacenter Activation Coordination Mode
► RTM: DAC mode is for DAGs with three or more members that are extended to two Active Directory sites
  – Don’t enable it for two-member DAGs where each member is in a different AD site, or for DAGs where all members are in the same AD site
► SP1: DAC mode can be enabled for all DAGs
► If using Third Party Replication (TPR) mode, check with your vendor for guidance on DAC mode
50
Datacenter Activation Coordination Mode
► Uses the Datacenter Activation Coordination Protocol (DACP)
► A bit in memory (in MSExchangeRepl.exe) set to either:
  – 0 = can’t mount
  – 1 = can mount
51
Datacenter Activation Coordination Mode
► Active Manager startup sequence
  – The DACP bit is set to 0
  – The DAG member communicates with the other DAG members it can reach to determine the current value of their DACP bits
    • If the starting DAG member can communicate with all other members on the StartedServers list, its DACP bit switches to 1
    • If the starting DAG member can communicate with another member, and that other member’s DACP bit is set to 1, the starting member’s DACP bit switches to 1
    • If the starting DAG member can communicate with another member, and that other member’s DACP bit is set to 0, the starting member’s DACP bit remains at 0
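The three startup rules reduce to a small function. The sketch below is illustrative only; the function and parameter names are hypothetical, and `reachable_bits` stands for whatever the starting member learned from the members it could contact.

```python
def dacp_bit_on_startup(self_name, started_servers, reachable_bits):
    """Return the starting member's DACP bit (0 = can't mount, 1 = can mount).

    reachable_bits maps each reachable member to its current DACP bit.
    """
    others = set(started_servers) - {self_name}
    # Rule 1: every other member on the StartedServers list is reachable.
    if others.issubset(reachable_bits):
        return 1
    # Rule 2: at least one reachable member already has its bit set to 1.
    if any(bit == 1 for bit in reachable_bits.values()):
        return 1
    # Rule 3: only members with bit 0 (or no members) are reachable.
    return 0


started = ["EX1", "EX2", "EX3", "EX4"]
# EX1 restarts after a datacenter power loss and can reach only EX2,
# which has also just restarted and still has its bit at 0.
print(dacp_bit_on_startup("EX1", started, {"EX2": 0}))
```

In the example, two members that restart in isolation both stay at 0, so neither mounts databases that may already be active elsewhere.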
52
Datacenter Activation Coordination Mode
55
Resources
Exchange Team Blog - http://aka.ms/ehlo
Exchange 2010 Documentation - http://aka.ms/ex2010docs
My Blog - http://aka.ms/schnoll
Twitter: @schnoll
57
© 2010 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.
58
Resources
www.msteched.com/Australia - Sessions On-Demand & Community
http://technet.microsoft.com/en-au - Resources for IT Professionals
http://msdn.microsoft.com/en-au - Resources for Developers
www.microsoft.com/australia/learning - Microsoft Certification & Training Resources