Elden Christensen Senior Program Manager Lead Microsoft Session Code: SVR319.

2 Elden Christensen Senior Program Manager Lead Microsoft Session Code: SVR319

3 Session Objectives And Takeaways
Session Objective(s):
- Understanding the need and benefit of multi-site clusters
- What to consider as you plan, design, and deploy your first multi-site cluster
- Windows Server Failover Clustering is a great solution for not only high availability, but also disaster recovery

4 Multi-Site Clustering Introduction Networking Storage Quorum Workloads

5 Is my Cluster Resilient to Site Failures?
Site A: all nodes are in the same physical location.
But what if there is a catastrophic event? Fire, flood, earthquake…

6 Multi-Site Clusters for DR
Extends a cluster from being a High Availability solution to also being a Disaster Recovery solution.
- A node is moved to a physically separate site (Site B)
- Applications are failed over to a separate physical location

7 Benefits of a Multi-Site Cluster
- Protects against loss of an entire datacenter
- Automates failover
  - Reduced downtime
  - Lower complexity disaster recovery plan
- Reduces administrative overhead
  - Automatically synchronize application and cluster changes
  - Easier to keep consistent than standalone servers
The primary reason DR solutions fail is dependence on people.

8 Multi-Site Clustering Introduction Networking Storage Quorum Workloads

9 Network Considerations
Network Options:
1. Stretch VLANs across sites
2. Cluster nodes can reside in different subnets
Diagram labels: Site A; Site B; Public Network; Separate Network; 10.10.10.1; 20.20.20.1; 30.30.30.1; 40.40.40.1

10 Stretching the Network
- Longer distance traditionally means greater network latency
- Too many missed health checks can cause false failover
- Heartbeating is fully configurable:
  - SameSubnetDelay (default = 1 second): frequency at which heartbeats are sent
  - SameSubnetThreshold (default = 5 heartbeats): missed heartbeats before an interface is considered down
  - CrossSubnetDelay (default = 1 second): frequency at which heartbeats are sent to nodes on dissimilar subnets
  - CrossSubnetThreshold (default = 5 heartbeats): missed heartbeats before an interface to nodes on dissimilar subnets is considered down
- Command Line: Cluster.exe /prop
- PowerShell (R2): Get-Cluster | fl *
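To make the interaction between the delay and threshold settings concrete, here is a minimal sketch that estimates how long a silent interface goes unnoticed: roughly the heartbeat delay multiplied by the missed-heartbeat threshold. The function name is hypothetical, and the sketch assumes the delay is supplied in milliseconds; it is an illustration of the arithmetic, not a cluster API.

```python
def detection_time_seconds(delay_ms: int = 1000, threshold: int = 5) -> float:
    """Approximate time before an interface is considered down.

    delay_ms  -- interval between heartbeats (assumed to be in milliseconds)
    threshold -- number of consecutive missed heartbeats tolerated
    """
    if delay_ms <= 0 or threshold <= 0:
        raise ValueError("delay_ms and threshold must be positive")
    return delay_ms * threshold / 1000.0


# Defaults quoted on the slide: 1 second x 5 heartbeats.
print(detection_time_seconds())          # 5.0
# Hypothetical relaxed values for a high-latency link between sites.
print(detection_time_seconds(2000, 10))  # 20.0
```

Raising either value makes the cluster more tolerant of WAN latency, at the cost of reacting more slowly to a real failure.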

11 Security over the WAN
Encrypt inter-node (intra-cluster) traffic:
- 0 = clear text
- 1 = signed (default)
- 2 = encrypted
Diagram labels: Site A; Site B; 10.10.10.1; 20.20.20.1; 30.30.30.1; 40.40.40.1

12 Enhanced Dependencies – OR
Network Name resource stays up if either IP Address Resource A OR IP Address Resource B is up.
Diagram: Network Name resource depends on (IP Address Resource A OR IP Address Resource B)
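The OR dependency can be sketched as a simple boolean rule. The function names and the dictionary layout below are hypothetical and only model the logic described on the slide; the AND variant is included for contrast.

```python
from typing import Mapping


def or_dependency_met(ip_online: Mapping[str, bool]) -> bool:
    """OR dependency: satisfied when at least one IP Address resource is online."""
    return any(ip_online.values())


def and_dependency_met(ip_online: Mapping[str, bool]) -> bool:
    """AND dependency, for contrast: every IP Address resource must be online."""
    return all(ip_online.values())


# After a cross-subnet failover only the IP for the new site's subnet is online.
state = {"IP Address Resource A": False, "IP Address Resource B": True}
print(or_dependency_met(state))   # True  -> Network Name can stay up
print(and_dependency_met(state))  # False -> Network Name could not come online
```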

13 Client Reconnect Considerations
- Nodes in dissimilar subnets
- Failover changes resource’s IP Address
- Clients need that new IP Address from DNS to reconnect
Diagram labels: Site A (10.10.10.111); Site B (20.20.20.222); DNS Server 1; DNS Server 2; DNS Replication; Record Created (FS = 10.10.10.111); Record Updated (FS = 20.20.20.222); Record Obtained

14 Solution #1: Configure NN Setting
- RegisterAllProvidersIP (default = 0 for FALSE)
  - Determines if all IP Addresses for a Network Name will be registered by DNS
  - TRUE (1): IP Addresses can be online or offline and will still be registered
  - Ensure application is set to try all IP Addresses, so clients can connect quicker
- HostRecordTTL (default = 1200 seconds)
  - Controls time the DNS record lives on client for a cluster network name
  - Shorter TTL: DNS records for clients updated sooner
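When all IP Addresses are registered, the client application has to try each one in turn. A minimal sketch of that client-side loop follows; `connect_first_available` and the `try_connect` callback are hypothetical names, and the callback stands in for whatever real connection attempt the application makes.

```python
from typing import Callable, Iterable, Optional


def connect_first_available(
    addresses: Iterable[str],
    try_connect: Callable[[str], bool],
) -> Optional[str]:
    """Return the first address that accepts a connection, or None if none do."""
    for address in addresses:
        if try_connect(address):
            return address
    return None


# Addresses as returned by DNS for the network name (values from the slides).
registered = ["10.10.10.111", "20.20.20.222"]
# Simulated state after failover to Site B: only the Site B address answers.
online = {"20.20.20.222"}

print(connect_first_available(registered, lambda a: a in online))  # 20.20.20.222
```

In a real client each failed attempt costs a connection timeout, which is why the slide pairs this setting with an application that tries all addresses promptly.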

15 Solution #2: Prefer Local Failover
- Local failover for higher availability
  - No change in IP Address
- Cross-site failover for disaster recovery
Diagram labels: Site A (10.10.10.111); Site B (20.20.20.222); DNS Server 1; DNS Server 2; FS = 10.10.10.111

16 Solution #3: Stretch VLANs
Deploying a VLAN minimizes client reconnection times.
Diagram labels: Site A; Site B; VLAN (10.10.10.111); DNS Server 1; DNS Server 2; FS = 10.10.10.111

17 Solution #4: Abstraction in Device
- Network device uses 3rd IP
- 3rd IP is the one registered in DNS & used by client
Example: http://www.cisco.com/en/US/docs/solutions/Enterprise/Data_Center/App_Networking/extmsftw2k8vistacisco.pdf
Diagram labels: Site A (10.10.10.111); Site B (20.20.20.222); network device (30.30.30.30); DNS Server 1; DNS Server 2; FS = 30.30.30.30

18 This is generic guidance… If you have other creative ideas, that’s ok!

19 Multi-Site Clustering Introduction Networking Storage Quorum Workloads

20 Storage in Multi-Site Clusters
Different than local clusters:
- Multiple storage arrays, independent per site
- Nodes commonly access own site storage
- No “true” shared disk visible to all nodes

21 Storage Considerations
Need a data replication mechanism between sites.
Changes are made on Site A and replicated to the replica at Site B.

22 Replication Options Replication levels: Hardware storage-based replication Software host-based replication Application-based replication

23 Synchronous Replication
Host receives “write complete” response from the storage after the data is successfully written on both storage devices.
Diagram labels: Write Request; Primary Storage; Replication; Secondary Storage; Acknowledgement; Write Complete

24 Asynchronous Replication
Host receives “write complete” response from the storage after the data is successfully written to the primary storage device.
Diagram labels: Write Request; Primary Storage; Write Complete; Replication; Secondary Storage

25 Synchronous vs. Asynchronous
Synchronous | Asynchronous
No data loss | Potential data loss on hard failures
Requires high bandwidth/low latency connection | Enough bandwidth to keep up with data replication
Stretches over shorter distances | Stretches over longer distances
Write latencies impact application performance | No significant impact on application performance
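The difference in when the host sees “write complete” can be illustrated with a toy model. The `ReplicatedVolume` class and its method names are hypothetical; real arrays and replication products are far more involved, so treat this only as a sketch of the acknowledgement ordering and the resulting exposure to data loss.

```python
class ReplicatedVolume:
    """Toy model of a primary array replicating to a secondary array."""

    def __init__(self, synchronous: bool) -> None:
        self.synchronous = synchronous
        self.primary: list[str] = []
        self.secondary: list[str] = []
        self.pending: list[str] = []  # written locally, not yet replicated

    def write(self, block: str) -> str:
        self.primary.append(block)
        if self.synchronous:
            # Synchronous: replicate before acknowledging the host.
            self.secondary.append(block)
        else:
            # Asynchronous: acknowledge now, replicate later.
            self.pending.append(block)
        return "write complete"

    def replicate_pending(self) -> None:
        """Ship queued blocks to the secondary (asynchronous mode only)."""
        self.secondary.extend(self.pending)
        self.pending.clear()

    def blocks_lost_if_primary_fails(self) -> int:
        return len(self.primary) - len(self.secondary)


sync_vol = ReplicatedVolume(synchronous=True)
async_vol = ReplicatedVolume(synchronous=False)
for block in ("a", "b", "c"):
    sync_vol.write(block)
    async_vol.write(block)

print(sync_vol.blocks_lost_if_primary_fails())   # 0
print(async_vol.blocks_lost_if_primary_fails())  # 3
async_vol.replicate_pending()
print(async_vol.blocks_lost_if_primary_fails())  # 0
```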

26 Storage Resource Dependencies
Diagram labels: Resource Group; Workload Resource (example File Server); Network Name Resource; IP Address Resources*; Disk Resource; Custom Resource
- Group determines smallest unit of failover
- Establishes start order timing
- Ensures node is communicating with local storage and array state
- Ensures application comes online after replication is complete

27 Cluster Validation and Replication
Multi-Site clusters are not required to pass the Storage tests to be supported.
Validation Guide and Policy: http://go.microsoft.com/fwlink/?LinkID=119949

28 HP’s Multi-Site Implementation & Demo Matthias Popp Architect HP

29 HP's Multi-Site Implementation: CLX for Windows
- All Physical Disk resources of one Resource Group (VM) depend on a CLX resource
- Very smooth integration
Diagram labels: Virtual Machine; VM Config File; Physical Disk; HP CLX

30 HP Cluster Extension – What’s new?
- Support for Hyper-V Live Migration across disk arrays
- Support for Windows 2008 R2
- Support for Windows Hyper-V Server 2008 R2
- TT337AAE – HP StorageWorks Cluster Extension EVA for Windows e-LTU
- There is no change to current CLX product pricing
- XP Cluster Extension does not yet support Live Migration (planned for 2010)

31 Live Migration with Storage Failover
Diagram labels: Host 1; Host 2; HP EVA Storage; storage based remote replication
1. Initiate Live Migration
2. Create VM on target node
3. Copy memory pages from source server to target server via Ethernet
4. Check disk array for replication link and disk pair states
5. Final state transfer
   - Pause virtual machine
   - Move storage connectivity from source server to target server
   - Change storage replication direction
6. Run new VM on target server; delete VM on source server

32 HP Storage for Virtualization
Hyper-V Live Migration between Replicated Disk Arrays
- End-user transparent app migration across data centers; across servers and storage
- Zero Downtime Array Load Balancing (IOPS, cache utilization, response times, power consumption, etc.)
- Zero Downtime Maintenance
  - Firmware/HBA/Server updates without user interruption
  - Plan maintenance without the need to check for downtimes
- Follow the sun/moon data center access model
  - Move the app/VM closest to the users or closest to the cheapest power source
- Failover, failback, Quick and Live Migration using the same management software
  - No need to learn x different tools and their limitations

33 EVA CLX with Exchange 2010 Live Migration

34 Hyper-V Geo Cluster with Exchange

35 Hyper-V Geo Cluster with Exchange
Automatically re-direct storage replication during Live Migration

37 Additional HP Resources
- HP website for Hyper-V: www.hp.com/go/hyper-v
- HP and Microsoft Frontline Partnership website: www.hp.com/go/microsoft
- HP website for Windows Server 2008 R2: www.hp.com/go/ws2008r2
- HP website for management tools: www.hp.com/go/insight
- HP OS Support Matrix: www.hp.com/go/osssupport
- Information on HP ProLiant Network Adapter Teaming for Hyper-V: http://h20000.www2.hp.com/bc/docs/support/SupportManual/c01663264/c01663264.pdf
- Technical overview on HP ProLiant Network Adapter Teaming: http://h20000.www2.hp.com/bc/docs/support/SupportManual/c01415139/c01415139.pdf?jumpid=reg_R1002_USEN
- Whitepaper: Disaster Tolerant Virtualization Architecture with HP StorageWorks Cluster Extension and Microsoft Hyper-V™: http://h20195.www2.hp.com/V2/getdocument.aspx?docname=4AA2-6905ENW.pdf

38 Multi-Site Clustering Introduction Networking Storage Quorum Workloads

39 Quorum Overview
- Vote: majority is greater than 50%
- Possible voters: Nodes (1 each) + 1 Witness (Disk or File Share)
4 Quorum Types:
- Disk only (not recommended)
- Node and Disk majority
- Node majority
- Node and File Share majority
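The “greater than 50%” rule translates into a small piece of arithmetic. The helper names below are hypothetical; the sketch only counts votes as described on the slide (one per node, plus at most one witness).

```python
def votes_needed(total_votes: int) -> int:
    """Smallest number of votes that is strictly greater than 50% of the total."""
    if total_votes < 1:
        raise ValueError("a cluster needs at least one vote")
    return total_votes // 2 + 1


def has_quorum(votes_present: int, total_votes: int) -> bool:
    return votes_present >= votes_needed(total_votes)


print(votes_needed(5))   # 3  (5 nodes, Node Majority)
print(votes_needed(4))   # 3  (4 nodes, no witness: a 2/2 split leaves no majority)
print(has_quorum(2, 4))  # False
print(has_quorum(3, 5))  # True  (e.g. 2 nodes + witness out of 4 nodes + witness)
```

The 4-vote case shows why an even number of nodes is usually paired with a witness: the extra vote breaks the tie.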

40 Replicated Disk Witness
- A witness is a decision maker when nodes lose network connectivity
- When a witness is not a single decision maker, problems occur
- Do not use in multi-site clusters unless directed by vendor
Diagram labels: Replicated Storage from vendor; Vote?

41 Node Majority
Cross site network connectivity broken!
5 Node Cluster: Majority = 3; majority in Primary Site (Site A)
- Site A nodes: Can I communicate with majority of the nodes in the cluster? Yes, then Stay Up
- Site B nodes: Can I communicate with majority of the nodes in the cluster? No, drop out of Cluster Membership

42 Node Majority
Disaster at Site A: We are down!
5 Node Cluster: Majority = 3; majority in Primary Site (Site A)
- Site B nodes: Can I communicate with majority of the nodes in the cluster? No, drop out of Cluster Membership
- Need to force quorum manually

43 Forcing Quorum
- Always understand why quorum was lost
- Used to bring cluster online without quorum
- Cluster starts in a special “forced” state
- Once majority achieved, no more “forced” state
- Command Line: net start clussvc /fixquorum (or /fq)
- PowerShell (R2): Start-ClusterNode -FixQuorum (or -fq)

44 Multi-Site With File Share Witness
Complete resiliency and automatic recovery from the loss of any 1 site.
Diagram labels: Site A; Site B; Site C; Replicated Storage; WAN; File Share Witness (\\Foo\Cluster1)

45 Multi-Site With File Share Witness
Complete resiliency and automatic recovery from the loss of connection between sites.
- Nodes that reach the witness: Can I communicate with majority of the nodes (+FSW) in the cluster? Yes, then Stay Up
- Nodes that do not: Can I communicate with majority of the nodes in the cluster? No (lock failed), drop out of Cluster Membership
Diagram labels: Site A; Site B; Site C; WAN; Replicated Storage; File Share Witness (\\Foo\Cluster1)
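The site-loss scenarios from the quorum slides can be compared with a short simulation. The vote layouts are assumptions made for illustration (3 + 2 nodes for Node Majority; 2 + 2 nodes with the file share witness at a third site), the function name is hypothetical, and the sketch assumes the surviving sites can still reach each other.

```python
from typing import Mapping


def survivors_keep_quorum(votes_per_site: Mapping[str, int], failed_site: str) -> bool:
    """True if the votes outside the failed site are more than 50% of all votes."""
    total = sum(votes_per_site.values())
    surviving = total - votes_per_site[failed_site]
    return surviving * 2 > total


# Node Majority, 5 nodes, majority in the primary site (Site A).
node_majority = {"Site A": 3, "Site B": 2}
# Node and File Share Majority, 4 nodes plus a witness vote at Site C.
with_fsw = {"Site A": 2, "Site B": 2, "Site C": 1}

print(survivors_keep_quorum(node_majority, "Site A"))  # False -> force quorum manually
print(survivors_keep_quorum(node_majority, "Site B"))  # True
print(all(survivors_keep_quorum(with_fsw, s) for s in with_fsw))  # True
```

With the witness at a third site, the loss of any single site still leaves 3 of 5 votes or better, which matches the “loss of any 1 site” claim.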

46 FSW Considerations
- Simple Windows File Server
- Single file server can serve as a witness for multiple clusters
  - Each cluster requires its own share
  - Can be clustered in a second cluster
- Recommended to be at 3rd separate site so that there is no single point of failure
- FSW cannot be on a node in the same cluster

47 Quorum Model Summary
- No Majority: Disk Only
  - Not Recommended
  - Use as directed by vendor
- Node and Disk Majority
  - Use as directed by vendor
- Node Majority
  - Odd number of nodes
  - More nodes in primary site
- Node and File Share Majority
  - Even number of nodes
  - Best availability solution: FSW in 3rd site
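The odd/even guidance in the summary can be written as a small decision helper. This is a sketch of the slide's rule of thumb only: the function name is hypothetical, and the disk-based models are deliberately left out because the slide defers those to the storage vendor.

```python
def suggest_quorum_model(node_count: int) -> str:
    """Map node count to the quorum model suggested in the summary above."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    if node_count % 2 == 1:
        # Odd number of nodes: place more nodes in the primary site.
        return "Node Majority"
    # Even number of nodes: add a file share witness, ideally in a 3rd site.
    return "Node and File Share Majority"


for nodes in (4, 5):
    print(nodes, "->", suggest_quorum_model(nodes))
```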

48 Multi-Site Clustering Introduction Networking Storage Quorum Workloads

49 Hyper-V in a Multi-Site Cluster
Area | Considerations
Network | On cross-subnet failover, if the guest uses DHCP, then the IP is updated automatically; if it uses a statically configured IP, then the admin needs to configure the new IP. Use of a VLAN is preferred with live migration between sites
Storage | 3rd party replication solution required; configuration with CSV (explained next)
Quorum | No special considerations
Links: http://technet.microsoft.com/en-us/library/dd197488.aspx

50 CSV in a Multi-Site Cluster
Architectural assumptions collide…
- Replication solutions assume only 1 array accessed at a time
- CSV assumes all nodes can concurrently access the LUN
- CSV is not required for Live Migration
- Talk to your storage vendor for their support story
- CSV requires VLANs
Diagram labels: VHD; Nodes in Primary Site; Nodes in Disaster Recovery Site; Read-Only; Read/Write; Replication; VM attempts to access replica

51 SQL in a Multi-Site Cluster
Area | Considerations
Network | SQL does not support OR dependency; need to stretch VLAN between sites
Storage | No special considerations; 3rd party replication solution required
Quorum | No special considerations
Links: http://technet.microsoft.com/en-us/library/ms189134.aspx http://technet.microsoft.com/en-us/library/ms178128.aspx

52 Exchange in a Multi-Site Cluster
Area | Considerations
Network | No VLAN needed; change HostRecordTTL from 20 minutes to 5 minutes; CCR supports 2 nodes, one per site
Storage | Exchange CCR provides application-based replication
Quorum | File share witness on the Hub Transport server on primary site
Links: http://technet.microsoft.com/en-us/library/bb124721.aspx http://technet.microsoft.com/en-us/library/aa998848.aspx

53 Session Summary
- Multi-Site Failover Clustering has many benefits
- Redundancy is needed everywhere
- Understand your replication needs
- Compare VLANs with multiple subnets
- Plan quorum model & nodes before deployment
- Follow the checklist and best practices

54 Resources
- www.microsoft.com/teched: Sessions On-Demand & Community
- http://microsoft.com/technet: Resources for IT Professionals
- http://microsoft.com/msdn: Resources for Developers
- www.microsoft.com/learning: Microsoft Certification & Training Resources

55 Related Content
Breakout Sessions
- SVR208 Gaining Higher Availability with Windows Server 2008 R2 Failover Clustering
- SVR319 Multi-Site Clustering with Windows Server 2008 R2
- DAT312 All You Needed to Know about Microsoft SQL Server 2008 Failover Clustering
- UNC307 Microsoft Exchange Server 2010 High Availability
- SVR211 The Challenges of Building and Managing a Scalable and Highly Available Windows Server 2008 R2 Virtualisation Solution
- SVR314 From Zero to Live Migration. How to Set Up a Live Migration
Demo Sessions
- SVR01-DEMO Free Live Migration and High Availability with Microsoft Hyper-V Server 2008 R2
Hands-on Labs
- UNC12-HOL Microsoft Exchange Server 2010 High Availability and Storage Scenarios

56 Multi-Site Clustering Content
- Design guide: http://technet.microsoft.com/en-us/library/dd197430.aspx
- Deployment guide/checklist: http://technet.microsoft.com/en-us/library/dd197546.aspx

57 Complete an evaluation on CommNet and enter to win an Xbox 360 Elite!

58 © 2009 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.

