Published by Augustus Kelley. Modified over 9 years ago.
2
Copyright © Microsoft Corporation
3
DAG Architecture
4
Active Directory lookup
Replay RPC server wrapper
TPR API manager
Copy status lookup
Remote data provider wrapper
Support API manager
Replay core manager
VssWriter
Server locator manager
Seed manager
Active Manager
Health state tracker
Autoreseed manager
Active Manager RPC server wrapper
Disk reclaimer manager
Failure item manager
15
Witness Server Placement
19
Deployment Scenario: Single DAG deployed in a single datacenter
Recommendations: Locate witness server in the same datacenter as DAG members

Deployment Scenario: Single DAG deployed across two datacenters; no additional locations available
Recommendations: Locate witness server in primary datacenter

Deployment Scenario: Multiple DAGs deployed in a single datacenter
Recommendations: Locate witness server in the same datacenter as DAG members. Additional options include:
- Using the same witness server for multiple DAGs
- Using a DAG member to act as a witness server for a different DAG

Deployment Scenario: Multiple DAGs deployed across two datacenters
Recommendations: Locate witness server in the same datacenter as DAG members. Additional options include:
- Using the same witness server for multiple DAGs
- Using a DAG member to act as a witness server for a different DAG

Deployment Scenario: Single or Multiple DAGs deployed across more than two datacenters
Recommendations: Locate the witness server in the datacenter where you want the majority of quorum votes to exist
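The placement recommendations come down to where the majority of quorum votes ends up. The following is a minimal sketch of that arithmetic, not how the cluster service implements it. It assumes a static majority model in which the witness contributes a vote only when the DAG has an even number of members, and it ignores dynamic quorum. The function names and site names are hypothetical.

```python
def has_quorum(votes_reachable: int, total_votes: int) -> bool:
    """Majority rule: strictly more than half of all votes must be reachable."""
    return votes_reachable > total_votes // 2


def survives_loss_of(lost_site: str,
                     members_per_site: dict[str, int],
                     witness_site: str) -> bool:
    """Return True if the DAG keeps quorum after losing every vote in lost_site.

    Assumption: the witness votes only when the member count is even.
    """
    total_members = sum(members_per_site.values())
    witness_vote = 1 if total_members % 2 == 0 else 0
    total_votes = total_members + witness_vote

    # Votes still reachable: members outside the lost site, plus the
    # witness if it does not live in the lost site.
    remaining = total_members - members_per_site.get(lost_site, 0)
    if witness_vote and witness_site != lost_site:
        remaining += witness_vote
    return has_quorum(remaining, total_votes)


two_sites = {"primary": 2, "secondary": 2}
print(survives_loss_of("secondary", two_sites, witness_site="primary"))  # True
print(survives_loss_of("primary", two_sites, witness_site="primary"))    # False
print(survives_loss_of("primary", two_sites, witness_site="third"))      # True
```

With the witness in the primary datacenter, losing the secondary datacenter leaves three of five votes, while losing the primary leaves only two. Placing the witness in a third location lets either datacenter fail without losing the majority, which is the case covered by the last scenario above.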
22
Dynamic Quorum
31
Name DynamicWeight NodeWeight State
---- ------------- ---------- -----
EX1              1          1    Up
34
DAG Member Maintenance
38
Managed Availability
39
Bringing the learnings from the service to the enterprise
Monitoring based on the end user’s experience
Protect the user’s experience through recovery-oriented computing
41
If you can’t measure it, you cannot manage it
Customer Touch Points
Availability: Can I access the service?
Latency: How is my experience?
Errors: Am I able to accomplish what I want?
42
- OWA send
- OWA failure
- OWA fast recovery
- OWA verified as healthy

- OWA send
- OWA failure
- OWA fast recovery
- Failover server’s databases
- OWA verified as healthy
- Server becomes “good” failover target (again)

[Diagram: LB, CAS-1, CAS-2, and a DAG of MBX-1, MBX-2, and MBX-3 running OWA and hosting copies of DB1 and DB2]
“Stuff breaks and the experience does not”
43
System Level Checks
1. Mailbox Self Test (e.g. OWA MST) [detection 5m]
2. Protocol Self Test (e.g. OWA PST) [detection 20 secs]
3. Proxy Self Test (e.g. OWA PrST) [detection 20 secs]
End User Experience Level Checks
4. Customer Touch Point – CTP (e.g. OWA CTP) [detection 20m]
44
PROBES
The key goal is to measure the customer’s perception of the service
These are typically synthetic end-to-end customer transactions
CHECKS
The key goal is to measure actual customer traffic and become aware when customers are experiencing issues
These are typically implemented as performance counters where thresholds can be set to detect spikes in customer failures
NOTIFY
The key goal is to take action immediately based on a critical event
These are typically exceptions or conditions that can be detected without a large sample set
45
Monitors query the data collected by the probes and determine whether an action needs to occur based on a rule set
Depending on the rule, a monitor can escalate or initiate a responder
Monitors can be Healthy, Degraded, Unhealthy, Repairing, Disabled, or Unavailable
A monitor also defines the time from failure at which a responder is executed
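How a monitor turns probe samples into a state can be sketched as follows. This is an illustration only, not the Managed Availability implementation: the rule used here (a run of consecutive failed samples trips the monitor) and the `failures_to_trip` parameter are hypothetical, as is treating an empty sample set as Unavailable. The state names are the ones listed on the slide.

```python
from enum import Enum


class MonitorState(Enum):
    HEALTHY = "Healthy"
    DEGRADED = "Degraded"
    UNHEALTHY = "Unhealthy"
    REPAIRING = "Repairing"
    DISABLED = "Disabled"
    UNAVAILABLE = "Unavailable"


def evaluate(samples: list[bool], failures_to_trip: int = 3) -> MonitorState:
    """Apply a hypothetical rule to probe samples (oldest first, True = success).

    The monitor trips only when the most recent `failures_to_trip`
    samples all failed; a single success in that window resets it.
    """
    if not samples:
        # Assumption: no probe data at all is reported as Unavailable.
        return MonitorState.UNAVAILABLE
    recent = samples[-failures_to_trip:]
    if len(recent) == failures_to_trip and not any(recent):
        return MonitorState.UNHEALTHY
    return MonitorState.HEALTHY


print(evaluate([True, False, False, False]).value)  # Unhealthy
print(evaluate([False, False, True]).value)         # Healthy
```

A real rule set would also decide which responder to start and when; this sketch stops at the state decision.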
46
A responder is a “plug-in” that executes a response to an alert generated by a monitor
There are several types of responders
Restart Responder – Terminates and restarts service
Reset AppPool Responder – Cycles IIS application pool
Failover Responder – Takes a MBX server out of service
Bugcheck Responder – Initiates a bugcheck of the server
Offline Responder – Takes a protocol on a machine out of service
Online Responder – Places a machine back into service
Escalate Responder – Escalates an issue
Specialized Component Responders
Built-in sequencing mechanism to control recovery actions
47
[Diagram: Sampling, Detection, Recovery]
Sampling: Probe, Probe Definition, Probe Results (Samples)
Detection: Monitor, Monitor Definition, Monitor Results (Alerts)
Recovery: Responder, Responder Definition, Responder Results (Responses)
Notification Item
Monitor States and Named Times: Healthy, T1, T2, T3 (00:00:00, 00:00:10, 00:00:30)
Sequenced HA Responder Pipeline Example: Restart Responder, Reset AppPool Responder, Failover Responder, Bugcheck Responder, Offline Responder, Escalate Responder
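The idea of a sequenced responder pipeline keyed to named times can be sketched as below. The offsets 0, 10, and 30 seconds are taken from the timeline on this slide, but the slide does not survive well enough to show which responder is bound to which named time, so the pairing here is an assumption for illustration, as are the names `PIPELINE` and `responders_due`.

```python
from datetime import timedelta

# Hypothetical pairing of named times to responders; only the three
# offsets come from the slide.
PIPELINE = [
    (timedelta(seconds=0), "Restart Responder"),    # T1 (assumed)
    (timedelta(seconds=10), "Failover Responder"),  # T2 (assumed)
    (timedelta(seconds=30), "Escalate Responder"),  # T3 (assumed)
]


def responders_due(unhealthy_for: timedelta) -> list[str]:
    """Return, in order, every responder whose named time has been reached."""
    return [name for offset, name in PIPELINE if unhealthy_for >= offset]


print(responders_due(timedelta(seconds=5)))
# ['Restart Responder']
print(responders_due(timedelta(seconds=45)))
# ['Restart Responder', 'Failover Responder', 'Escalate Responder']
```

Ordering the pipeline from least to most disruptive means a cheap recovery action gets a chance to work before a failover or escalation is attempted.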
51
Scott Schnoll
Microsoft Corporation
scott.schnoll@microsoft.com
http://aka.ms/schnoll
Twitter: @schnoll