Elden Christensen Senior Program Manager Lead Microsoft Session Code: SVR319.

Presentation transcript:

Session Objectives and Takeaways
Session objective(s):
- Understand the need for and benefits of multi-site clusters
- Know what to consider as you plan, design, and deploy your first multi-site cluster
- Windows Server Failover Clustering is a great solution not only for high availability, but also for disaster recovery

Multi-Site Clustering: Introduction, Networking, Storage, Quorum, Workloads

Is My Cluster Resilient to Site Failures?
- Diagram: Site A, with the cluster in the same physical location
- But what if there is a catastrophic event? Fire, flood, earthquake …

Multi-Site Clusters for DR
- Extends a cluster from being a high availability solution to also being a disaster recovery solution
- A node is moved to a physically separate site
- Applications are failed over to a separate physical location
- Diagram labels: Site A, Site B

Benefits of a Multi-Site Cluster
- Protects against loss of an entire datacenter
- Automates failover
- Reduced downtime
- Lower-complexity disaster recovery plan
- Reduces administrative overhead
- Automatically synchronizes application and cluster changes
- Easier to keep consistent than standalone servers
- The primary reason DR solutions fail is dependence on people

Multi-Site Clustering: Introduction, Networking, Storage, Quorum, Workloads

Network Considerations
Network options:
1. Stretch VLANs across sites
2. Cluster nodes can reside in different subnets
Diagram labels: Site A, Site B, Public Network, Separate Network

Stretching the Network
- Longer distance traditionally means greater network latency
- Too many missed health checks can cause false failover
- Heartbeating is fully configurable
  - SameSubnetDelay (default = 1 second): frequency at which heartbeats are sent
  - SameSubnetThreshold (default = 5 heartbeats): missed heartbeats before an interface is considered down
  - CrossSubnetDelay (default = 1 second): frequency at which heartbeats are sent to nodes on dissimilar subnets
  - CrossSubnetThreshold (default = 5 heartbeats): missed heartbeats before an interface is considered down to nodes on dissimilar subnets
- Command line: Cluster.exe /prop
- PowerShell (R2): Get-Cluster | fl *
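
The relationship between the delay and threshold settings can be sketched in Python. This is a simplified model, not the cluster service's actual algorithm: it assumes an interface is considered down once the threshold number of consecutive heartbeats is missed. The class name, function names, and the "relaxed" values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HeartbeatSettings:
    """Heartbeat tuning values; defaults follow the ones quoted on the slide."""
    delay_seconds: float = 1.0  # SameSubnetDelay / CrossSubnetDelay
    threshold: int = 5          # SameSubnetThreshold / CrossSubnetThreshold

    def detection_time_seconds(self) -> float:
        # Assumption: the interface is declared down after `threshold`
        # consecutive heartbeats, sent every `delay_seconds`, are missed.
        return self.delay_seconds * self.threshold


def tolerates_outage(settings: HeartbeatSettings, outage_seconds: float) -> bool:
    """True if a transient outage ends before the interface is considered down."""
    return outage_seconds < settings.detection_time_seconds()


if __name__ == "__main__":
    defaults = HeartbeatSettings()
    relaxed = HeartbeatSettings(delay_seconds=2.0, threshold=10)  # hypothetical tuning
    print("default detection time:", defaults.detection_time_seconds(), "s")
    print("relaxed detection time:", relaxed.detection_time_seconds(), "s")
    print("8 s WAN blip tolerated (default):", tolerates_outage(defaults, 8.0))
    print("8 s WAN blip tolerated (relaxed):", tolerates_outage(relaxed, 8.0))
```

Under this model, relaxing the values trades slower failure detection for fewer false failovers on a high-latency link.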

Security over the WAN
- Encrypt inter-node traffic
  - 0 = clear text
  - 1 = signed (default)
  - 2 = encrypted
- Diagram labels: Site A, Site B

Enhanced Dependencies: OR
- The Network Name resource stays up if either IP Address Resource A OR IP Address Resource B is up
- Diagram: Network Name resource depends on IP Address Resource A OR IP Address Resource B
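
As a rough illustration of the OR rule above, the following Python sketch evaluates a dependency over a set of IP address resource states. The function names are hypothetical and this is not how the cluster service represents dependencies internally; the AND variant is included only for contrast.

```python
from typing import Dict


def or_dependency_satisfied(ip_resources_online: Dict[str, bool]) -> bool:
    """OR dependency: satisfied when at least one IP address resource is online."""
    return any(ip_resources_online.values())


def and_dependency_satisfied(ip_resources_online: Dict[str, bool]) -> bool:
    """AND dependency, for contrast: every IP address resource must be online."""
    return all(ip_resources_online.values())


if __name__ == "__main__":
    # After a cross-subnet failover only the IP for the local subnet comes online.
    states = {"IP Address Resource A": False, "IP Address Resource B": True}
    print("Network Name stays up with OR: ", or_dependency_satisfied(states))
    print("Network Name stays up with AND:", and_dependency_satisfied(states))
```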

Client Reconnect Considerations
- Nodes are in dissimilar subnets
- Failover changes the resource's IP address
- Clients need the new IP address from DNS to reconnect
- Diagram labels: DNS Server 1, DNS Server 2, DNS Replication, Record Created, Record Updated, Record Obtained, FS =, Site A, Site B

Solution #1: Configure Network Name (NN) Settings
- RegisterAllProvidersIP (default = 0 for FALSE)
  - Determines whether all IP addresses for a Network Name are registered in DNS
  - TRUE (1): IP addresses can be online or offline and will still be registered
  - Ensure the application is set to try all IP addresses, so clients can connect more quickly
- HostRecordTTL (default = 1200 seconds)
  - Controls how long the DNS record for a cluster network name lives on the client
  - Shorter TTL: DNS records on clients are updated sooner
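
Both settings can be illustrated with a short Python sketch. It assumes a client that has received every registered IP address and simply tries them in order, and it treats the TTL as the worst-case time a client keeps a stale record. The function names, the example addresses, and the reachability callback are hypothetical.

```python
from typing import Callable, Iterable, Optional


def first_reachable(addresses: Iterable[str],
                    is_reachable: Callable[[str], bool]) -> Optional[str]:
    """Try each registered address in turn and return the first one that answers."""
    for address in addresses:
        if is_reachable(address):
            return address
    return None


def worst_case_stale_minutes(host_record_ttl_seconds: int) -> float:
    """Upper bound, in minutes, on how long a client may cache an old record."""
    return host_record_ttl_seconds / 60


if __name__ == "__main__":
    # Hypothetical addresses: one per site, both registered in DNS.
    registered = ["10.0.0.5", "10.1.0.5"]
    # After a cross-site failover only the second address is online.
    online = {"10.1.0.5"}
    print("client connects to:", first_reachable(registered, lambda a: a in online))
    print("default TTL (1200 s):", worst_case_stale_minutes(1200), "minutes")
    print("shorter TTL (300 s): ", worst_case_stale_minutes(300), "minutes")
```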

Solution #2: Prefer Local Failover
- Local failover for higher availability
  - No change in IP address
- Cross-site failover for disaster recovery
- Diagram labels: DNS Server 1, DNS Server 2, FS =, Site A, Site B

Solution #3: Stretch VLANs
- Deploying a VLAN minimizes client reconnection times
- Diagram labels: DNS Server 1, DNS Server 2, FS =, Site A, Site B, VLAN

Solution #4: Abstraction in Device
- Network device uses a 3rd IP
- The 3rd IP is the one registered in DNS and used by the client
- Example: r/App_Networking/extmsftw2k8vistacisco.pdf
- Diagram labels: DNS Server 1, DNS Server 2, FS =, Site A, Site B

This is generic guidance… If you have other creative ideas, that’s ok!

Multi-Site Clustering: Introduction, Networking, Storage, Quorum, Workloads

Storage in Multi-Site Clusters
Different than local clusters:
- Multiple storage arrays, independent per site
- Nodes commonly access their own site's storage
- No "true" shared disk visible to all nodes
- Diagram labels: Site A, Site B

Storage Considerations
- Need a data replication mechanism between sites
- Changes are made on Site A and replicated to Site B
- Diagram labels: Site A, Site B, Replica

Replication Options
Replication levels:
- Hardware storage-based replication
- Software host-based replication
- Application-based replication

Synchronous Replication
- Host receives the "write complete" response from the storage after the data is successfully written on both storage devices
- Diagram labels: Write Request, Primary Storage, Replication, Secondary Storage, Acknowledgement, Write Complete

Asynchronous Replication
- Host receives the "write complete" response from the storage after the data is successfully written to the primary storage device
- Diagram labels: Write Request, Primary Storage, Write Complete, Replication, Secondary Storage

Synchronous vs. Asynchronous
Synchronous | Asynchronous
No data loss | Potential data loss on hard failures
Requires high bandwidth/low latency connection | Enough bandwidth to keep up with data replication
Stretches over shorter distances | Stretches over longer distances
Write latencies impact application performance | No significant impact on application performance
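
The difference in when "write complete" is returned, and what that means on a hard failure, can be sketched with a toy Python model. It is an illustration only and assumes in-memory lists standing in for the two arrays; the class and method names are hypothetical and do not correspond to any vendor's replication product.

```python
from collections import deque
from typing import Deque, List


class ReplicatedVolume:
    """Toy model of a primary/secondary storage pair."""

    def __init__(self, synchronous: bool) -> None:
        self.synchronous = synchronous
        self.primary: List[str] = []
        self.secondary: List[str] = []
        self._pending: Deque[str] = deque()  # written locally, not yet replicated

    def write(self, block: str) -> str:
        self.primary.append(block)
        if self.synchronous:
            # Synchronous: the secondary is updated before the host is answered.
            self.secondary.append(block)
        else:
            # Asynchronous: the host is answered now, replication happens later.
            self._pending.append(block)
        return "write complete"

    def replicate_pending(self, max_blocks: int) -> None:
        """Ship up to `max_blocks` queued writes to the secondary."""
        for _ in range(min(max_blocks, len(self._pending))):
            self.secondary.append(self._pending.popleft())

    def blocks_lost_if_primary_fails(self) -> int:
        """Writes acknowledged to the host but absent from the secondary."""
        return len(self._pending)


if __name__ == "__main__":
    for mode in (True, False):
        volume = ReplicatedVolume(synchronous=mode)
        for block in ("a", "b", "c"):
            volume.write(block)
        volume.replicate_pending(1)
        label = "synchronous" if mode else "asynchronous"
        print(label, "blocks lost on hard failure:", volume.blocks_lost_if_primary_fails())
```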

Storage Resource Dependencies
- Diagram: a Resource Group containing a Workload Resource (example: File Server), a Network Name Resource, IP Address Resources*, a Disk Resource, and a Custom Resource
- Group determines the smallest unit of failover
- Establishes start order timing
- Ensures the node is communicating with local storage and array state
- Ensures the application comes online after replication is complete

Cluster Validation and Replication
- Multi-site clusters are not required to pass the Storage tests to be supported
- Validation Guide and Policy: link/?LinkID=119949

HP’s Multi-Site Implementation & Demo Matthias Popp Architect HP

HP's Multi-Site Implementation: CLX for Windows
- All Physical Disk resources of one Resource Group (VM) depend on a CLX resource
- Very smooth integration
- Diagram labels: Virtual Machine, VM Config File, Physical Disk, HP CLX

HP Cluster Extension: What's new?
- Support for Hyper-V Live Migration across disk arrays
- Support for Windows 2008 R2
- Support for Windows Hyper-V Server 2008 R2
- TT337AAE: HP StorageWorks Cluster Extension EVA for Windows e-LTU
- There is no change to current CLX product pricing
- XP Cluster Extension does not yet support Live Migration; planned for 2010

Live Migration with Storage Failover
1. Initiate Live Migration
2. Create VM on target node
3. Copy memory pages from source server to target server via Ethernet
4. Check disk array for replication link and disk pair states
5. Final state transfer
   - Pause virtual machine
   - Move storage connectivity from source server to target server
   - Change storage replication direction
6. Run new VM on target server; delete VM on source server
Diagram labels: Host 1, Host 2, HP EVA Storage, storage-based remote replication
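
The sequence on this slide can be sketched as a small Python routine. It is a simplified illustration, not HP CLX behavior: the class and function names are hypothetical, and the decision to abort when the replication link is unhealthy is an assumption, since the slide only says the link and disk pair states are checked.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DiskPair:
    """Replicated disk pair; replication runs from source_array to target_array."""
    source_array: str
    target_array: str
    link_up: bool = True

    def reverse_replication(self) -> None:
        self.source_array, self.target_array = self.target_array, self.source_array


def live_migrate(vm: str, source_host: str, target_host: str, pair: DiskPair) -> List[str]:
    """Return the ordered list of steps performed for one migration attempt."""
    steps = [
        f"create {vm} on {target_host}",
        f"copy memory pages {source_host} -> {target_host}",
    ]
    if not pair.link_up:
        # Assumption: an unhealthy link stops the migration before the VM is paused.
        steps.append("abort: replication link or disk pair not healthy")
        return steps
    steps.append("replication link and disk pair states ok")
    steps.append(f"pause {vm}")
    steps.append(f"move storage connectivity {source_host} -> {target_host}")
    pair.reverse_replication()
    steps.append(f"replication now {pair.source_array} -> {pair.target_array}")
    steps.append(f"run {vm} on {target_host}; delete {vm} on {source_host}")
    return steps


if __name__ == "__main__":
    pair = DiskPair(source_array="array-site-a", target_array="array-site-b")
    for step in live_migrate("vm1", "host1", "host2", pair):
        print(step)
```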

HP Storage for Virtualization
Hyper-V Live Migration between Replicated Disk Arrays
- End-user transparent app migration across data centers, across servers and storage
- Zero Downtime Array Load Balancing (IOPS, cache utilization, response times, power consumption, etc.)
- Zero Downtime Maintenance
  - Firmware/HBA/Server updates without user interruption
  - Plan maintenance without the need to check for downtimes
- Follow the sun/moon data center access model
  - Move the app/VM closest to the users or closest to the cheapest power source
- Failover, failback, Quick and Live Migration using the same management software
  - No need to learn x different tools and their limitations

EVA CLX with Exchange 2010 Live Migration

Hyper-V Geo Cluster with Exchange

Hyper-V Geo Cluster with Exchange
- Automatically redirect storage replication during Live Migration

Additional HP Resources
- HP website for Hyper-V
- HP and Microsoft Frontline Partnership website
- HP website for Windows Server 2008 R2
- HP website for management tools
- HP OS Support Matrix
- Information on HP ProLiant Network Adapter Teaming for Hyper-V: pdf
- Technical overview on HP ProLiant Network Adapter Teaming: pdf?jumpid=reg_R1002_USEN
- Whitepaper: Disaster Tolerant Virtualization Architecture with HP StorageWorks Cluster Extension and Microsoft Hyper-V™: ENW.pdf

Multi-Site Clustering: Introduction, Networking, Storage, Quorum, Workloads

Quorum Overview
- Vote: majority is greater than 50%
- Possible voters: nodes (1 each) + 1 witness (disk or file share)
- 4 quorum types:
  - Disk only (not recommended)
  - Node and Disk majority
  - Node majority
  - Node and File Share majority
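
The "greater than 50%" rule can be worked through in a few lines of Python. This is a minimal sketch of the vote arithmetic only, assuming one vote per node and at most one witness vote as stated on the slide; the function names are hypothetical.

```python
def votes_required(node_count: int, has_witness: bool = False) -> int:
    """Smallest number of votes that is strictly more than half of all votes."""
    total_votes = node_count + (1 if has_witness else 0)
    return total_votes // 2 + 1


def has_quorum(votes_present: int, node_count: int, has_witness: bool = False) -> bool:
    """True if the votes that can see each other form a majority."""
    return votes_present >= votes_required(node_count, has_witness)


if __name__ == "__main__":
    print("5 nodes, no witness:", votes_required(5), "votes needed")
    print("4 nodes, no witness:", votes_required(4), "votes needed")
    print("4 nodes + witness:  ", votes_required(4, has_witness=True), "votes needed")
    print("2 of 4 nodes alone keep quorum:", has_quorum(2, 4))
    print("2 of 4 nodes + witness keep quorum:", has_quorum(3, 4, has_witness=True))
```

With an even number of nodes and no witness, an even split leaves neither half with a majority, which is why a witness vote is added in that case.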

Replicated Disk Witness
- A witness is a decision maker when nodes lose network connectivity
- When a witness is not a single decision maker, problems occur
- Do not use in multi-site clusters unless directed by vendor
- Diagram labels: Replicated Storage from vendor, Vote (?)

Node Majority
- 5-node cluster: majority = 3
- Majority in the primary site
- Cross-site network connectivity broken!
- Each node in the majority partition asks: can I communicate with a majority of the nodes in the cluster? Yes, then stay up
- Each node in the minority partition asks: can I communicate with a majority of the nodes in the cluster? No, drop out of cluster membership
- Diagram labels: Site A, Site B

Node Majority: Disaster at Site 1
- 5-node cluster: majority = 3
- Majority in the primary site
- Disaster at Site 1: "We are down!"
- Each surviving node asks: can I communicate with a majority of the nodes in the cluster? No, drop out of cluster membership
- Need to force quorum manually
- Diagram labels: Site A, Site B
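
The two Node Majority scenarios can be checked with a short Python sketch that asks, for each site, whether the remaining nodes still hold a majority if that site is lost. It assumes one vote per node and no witness; the function name and site labels are hypothetical.

```python
from typing import Dict


def survives_site_loss(nodes_per_site: Dict[str, int]) -> Dict[str, bool]:
    """For each site, report whether the cluster keeps quorum if that site is lost."""
    total_nodes = sum(nodes_per_site.values())
    majority = total_nodes // 2 + 1
    return {
        site: (total_nodes - count) >= majority
        for site, count in nodes_per_site.items()
    }


if __name__ == "__main__":
    # 5-node cluster with the majority (3 nodes) in the primary site.
    layout = {"Site A": 3, "Site B": 2}
    for site, survives in survives_site_loss(layout).items():
        outcome = "cluster stays up" if survives else "quorum lost, must be forced manually"
        print(f"loss of {site}: {outcome}")
```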

Forcing Quorum
- Always understand why quorum was lost
- Used to bring the cluster online without quorum
- Cluster starts in a special "forced" state
- Once a majority is achieved, the cluster leaves the "forced" state
- Command line: net start clussvc /fixquorum (or /fq)
- PowerShell (R2): Start-ClusterNode -FixQuorum (or -fq)

Multi-Site With File Share Witness
- Complete resiliency and automatic recovery from the loss of any 1 site
- Diagram labels: Site A, Site B, Site C, WAN, Replicated Storage, File Share Witness (\\Foo\Cluster1)

Multi-Site With File Share Witness
- Complete resiliency and automatic recovery from the loss of connection between sites
- Nodes that lock the witness ask: can I communicate with a majority of the nodes (+FSW) in the cluster? Yes, then stay up
- Nodes that fail to lock the witness ask: can I communicate with a majority of the nodes in the cluster? No (lock failed), drop out of cluster membership
- Diagram labels: Site A, Site B, Site C, WAN, Replicated Storage, File Share Witness (\\Foo\Cluster1)
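
The tie-break described above can be sketched in Python. This is a simplified model under stated assumptions: the witness grants its single vote to the first partition that locks it, and every partition attempts the lock. The class and function names are hypothetical and do not reflect how the cluster service implements the file share lock.

```python
from typing import Optional


class FileShareWitness:
    """Toy witness: the first partition to lock it holds the extra vote."""

    def __init__(self) -> None:
        self._owner: Optional[str] = None

    def try_lock(self, partition: str) -> bool:
        if self._owner is None:
            self._owner = partition
        return self._owner == partition


def partition_stays_up(partition: str, nodes_in_partition: int,
                       total_nodes: int, witness: FileShareWitness) -> bool:
    """A partition stays up if its node votes plus any witness vote are a majority."""
    total_votes = total_nodes + 1  # every node plus the witness
    votes = nodes_in_partition + (1 if witness.try_lock(partition) else 0)
    return votes > total_votes / 2


if __name__ == "__main__":
    # 4-node cluster, 2 nodes per site, WAN link between the sites is broken.
    witness = FileShareWitness()
    print("Site A stays up:", partition_stays_up("Site A", 2, 4, witness))
    print("Site B stays up:", partition_stays_up("Site B", 2, 4, witness))
```

Because only one partition can hold the lock, at most one side of an even split reaches a majority.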

FSW Considerations
- Simple Windows file server
- A single file server can serve as a witness for multiple clusters
  - Each cluster requires its own share
  - Can be clustered in a second cluster
- Recommended to be at a 3rd, separate site so that there is no single point of failure
- The FSW cannot be on a node in the same cluster

Quorum Model Summary
- No Majority: Disk Only
  - Not recommended
  - Use as directed by vendor
- Node and Disk Majority
  - Use as directed by vendor
- Node Majority
  - Odd number of nodes
  - More nodes in primary site
- Node and File Share Majority
  - Even number of nodes
  - Best availability solution: FSW in 3rd site
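
The odd/even guidance in this summary can be written down as a tiny Python helper. It covers only the two models the slide recommends without vendor direction, and the function name is hypothetical.

```python
def suggested_quorum_model(node_count: int) -> str:
    """Map the node count to the quorum model suggested in the summary."""
    if node_count < 1:
        raise ValueError("a cluster needs at least one node")
    if node_count % 2 == 1:
        # Odd number of nodes: the nodes alone can always form a majority.
        return "Node Majority"
    # Even number of nodes: a file share witness supplies the tie-breaking vote.
    return "Node and File Share Majority"


if __name__ == "__main__":
    for count in (4, 5):
        print(count, "nodes:", suggested_quorum_model(count))
```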

Multi-Site Clustering: Introduction, Networking, Storage, Quorum, Workloads

Hyper-V in a Multi-Site Cluster
Considerations by area:
Network
- On cross-subnet failover, if the guest is …
  - DHCP, then the IP is updated automatically
  - Statically configured IP, then the admin needs to configure the new IP
- Use of a VLAN is preferred with live migration between sites
Storage
- 3rd-party replication solution required
- Configuration with CSV (explained next)
Quorum
- No special considerations
Links:

CSV in a Multi-Site Cluster
- Architectural assumptions collide …
  - Replication solutions assume only 1 array is accessed at a time
  - CSV assumes all nodes can concurrently access the LUN
- CSV is not required for Live Migration
- Talk to your storage vendor for their support story
- CSV requires VLANs
- Diagram: nodes in the primary site have read/write access to the VHD, replication keeps a read-only replica for the nodes in the disaster recovery site, and a VM attempts to access the replica

SQL in a Multi-Site Cluster
Considerations by area:
Network
- SQL does not support OR dependency
- Need to stretch a VLAN between sites
Storage
- No special considerations
- 3rd-party replication solution required
Quorum
- No special considerations
Links:

Exchange in a Multi-Site Cluster
Considerations by area:
Network
- No VLAN needed
- Change HostRecordTTL from 20 minutes to 5 minutes
- CCR supports 2 nodes, one per site
Storage
- Exchange CCR provides application-based replication
Quorum
- File share witness on the Hub Transport server on the primary site
Links:

Session Summary
- Multi-Site Failover Clustering has many benefits
- Redundancy is needed everywhere
- Understand your replication needs
- Compare VLANs with multiple subnets
- Plan quorum model and nodes before deployment
- Follow the checklist and best practices

Resources
- Sessions On-Demand & Community
- Resources for IT Professionals
- Resources for Developers
- Microsoft Certification & Training Resources

Related Content
Breakout Sessions
- SVR208 Gaining Higher Availability with Windows Server 2008 R2 Failover Clustering
- SVR319 Multi-Site Clustering with Windows Server 2008 R2
- DAT312 All You Needed to Know about Microsoft SQL Server 2008 Failover Clustering
- UNC307 Microsoft Exchange Server 2010 High Availability
- SVR211 The Challenges of Building and Managing a Scalable and Highly Available Windows Server 2008 R2 Virtualisation Solution
- SVR314 From Zero to Live Migration. How to Set Up a Live Migration
Demo Sessions
- SVR01-DEMO Free Live Migration and High Availability with Microsoft Hyper-V Server 2008 R2
Hands-on Labs
- UNC12-HOL Microsoft Exchange Server 2010 High Availability and Storage Scenarios

Multi-Site Clustering Content
- Design guide:
- Deployment guide/checklist:

Complete an evaluation on CommNet and enter to win an Xbox 360 Elite!

© 2009 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.