Clustering: The Next Wave in PC Computing
Cluster Concepts 101
This section covers clusters in general; we'll get to Microsoft's Wolfpack cluster implementation in the next section.
Why Learn About Clusters?
- Today clusters are a niche Unix market, but Microsoft will bring clusters to the masses.
- Last October, Microsoft announced NT clusters, SCO announced UnixWare clusters, Sun announced Solaris/Intel clusters, and Novell announced Wolf Mountain clusters.
- In 1998, 2M Intel servers will ship, 100K of them in clusters; in 2001, 3M Intel servers will ship, 1M of them in clusters (IDC's forecast).
- Clusters will be a huge market, and RAID is essential to clusters.
What Are Clusters?
A group of independent systems that:
- function as a single system,
- appear to users as a single system,
- and are managed as a single system.
Clusters are "virtual servers".
Why Clusters?
#1. Clusters improve system availability (the primary value in Wolfpack-I clusters).
#2. Clusters enable application scaling.
#3. Clusters simplify system management.
#4. Clusters (with Intel servers) are cheap.
Why Clusters - #1
#1. Clusters improve system availability.
- When a networked server fails, the service it provided is down.
- When a clustered server fails, the service it provided "fails over" and downtime is avoided.
(Diagram: separate networked mail and Internet servers vs. a clustered pair serving mail and Internet together.)
Why Clusters - #2
#2. Clusters enable application scaling.
- With networked SMP servers, application scaling is limited to a single server.
- With clusters, applications scale across multiple SMP servers (typically up to 16 servers).
Why Clusters - #3
#3. Clusters simplify system management.
- Clusters present a Single System Image: the cluster looks like a single server to management applications.
- Hence, clusters reduce system management costs.
(Diagram: three management domains collapse into one management domain.)
Why Clusters - #4
#4. Clusters (with Intel servers) are cheap.
- Essentially no additional hardware costs.
- Microsoft charges an extra $3K per node:
  Windows NT Server: $1,000
  Windows NT Server, Enterprise Edition: $4,000
- Note: proprietary Unix cluster software costs $10K to $25K per node.
An Analogy to RAID
- RAID makes disks fault tolerant; clusters make servers fault tolerant.
- RAID increases I/O performance; clusters increase compute performance.
- RAID makes disks easier to manage; clusters make servers easier to manage.
Two Flavors of Clusters
#1. High availability clusters
- Microsoft's Wolfpack 1
- Compaq's Recovery Server
#2. Load balancing clusters (a.k.a. parallel application clusters)
- Microsoft's Wolfpack 2
- Digital's VAXClusters
Note: load balancing clusters are a superset of high availability clusters.
High Availability Clusters
- Two-node clusters (node = server).
- During normal operations, both servers do useful work.
- Failover: when a node fails, applications fail over to the surviving node, which assumes the workload of both nodes.
(Diagram: mail on one node and web on the other; after failover, one node runs mail and web.)
High Availability Clusters (cont'd)
- Failback: when the failed node is returned to service, the applications fail back to it.
(Diagram: after failback, mail and web again run on separate nodes.)
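As a minimal sketch of the failover/failback behavior on the two slides above: each service has a preferred home node, moves to the survivor when its home dies, and returns when the home is repaired. The class, node, and service names below are hypothetical illustrations, not MSCS objects or APIs.

```python
# Minimal two-node failover/failback sketch. Names are hypothetical.

class Cluster:
    def __init__(self):
        self.nodes = {"node-a": True, "node-b": True}    # True = alive
        self.home = {"mail": "node-a", "web": "node-b"}  # preferred owner
        self.owner = dict(self.home)                     # current owner

    def fail(self, node):
        """Node dies: its services fail over to the surviving node."""
        self.nodes[node] = False
        survivor = next(n for n, up in self.nodes.items() if up)
        for svc, owner in self.owner.items():
            if owner == node:
                self.owner[svc] = survivor

    def repair(self, node):
        """Node returns to service: services whose home this is fail back."""
        self.nodes[node] = True
        for svc, home in self.home.items():
            if home == node:
                self.owner[svc] = home

c = Cluster()
c.fail("node-a")     # mail fails over; node-b now runs mail and web
c.repair("node-a")   # mail fails back to node-a
print(c.owner)       # {'mail': 'node-a', 'web': 'node-b'}
```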
Load Balancing Clusters
- Multi-node clusters (two or more nodes).
- Load balancing clusters typically run a single application (e.g. a database) distributed across all nodes.
- Cluster capacity is increased by adding nodes, but as with SMP servers, scaling is less than linear (a rough sketch of this follows below).
(Diagram: cluster throughput grows from 3,000 TPM to 3,600 TPM as nodes are added.)
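To make the "less than linear" point concrete, here is a small arithmetic sketch in which each added node contributes a diminishing share of a single node's throughput. The 0.8 efficiency factor is an assumption chosen for illustration, not a measured Wolfpack or TPC-C number.

```python
# Sub-linear scaling sketch: each added node contributes less than a full
# node's worth of throughput. The efficiency factor is an assumption.

def cluster_tpm(per_node_tpm, nodes, efficiency=0.8):
    total = per_node_tpm
    for n in range(1, nodes):
        total += per_node_tpm * (efficiency ** n)
    return total

print(cluster_tpm(3000, 1))   # 3000
print(cluster_tpm(3000, 2))   # 5400.0 -- not 6000: scaling is < linear
```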
Load Balancing Clusters (cont'd)
- The cluster rebalances the workload when a node dies.
- If different apps are running on each server, they fail over to the least busy server or as directed by predefined failover policies (sketched below).
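The sketch below illustrates the target-selection rule just described: honor a predefined failover policy if one exists, otherwise pick the least busy surviving node. The policy table, node names, and load numbers are hypothetical.

```python
# Choosing a failover target: policy first, then least-busy survivor.

def failover_target(app, failed_node, load, policy):
    survivors = {n: busy for n, busy in load.items() if n != failed_node}
    preferred = policy.get(app)
    if preferred in survivors:
        return preferred                       # predefined policy wins
    return min(survivors, key=survivors.get)   # otherwise least busy

load = {"node-1": 0.70, "node-2": 0.30, "node-3": 0.55}
policy = {"db": "node-3"}                      # db is pinned to node-3

print(failover_target("db",  "node-1", load, policy))   # node-3 (policy)
print(failover_target("web", "node-1", load, policy))   # node-2 (least busy)
```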
Two Cluster Models
#1. "Shared nothing" model -- Microsoft's Wolfpack cluster
#2. "Shared disk" model -- VAXClusters
#1. "Shared Nothing" Model
- At any moment in time, each disk is owned and addressable by only one server.
- The "shared nothing" terminology is confusing: access to the disks is shared (they sit on the same bus), but at any moment in time the disks themselves are not shared.
(Diagram: two servers on a common bus to a RAID array.)
#1. "Shared Nothing" Model (cont'd)
- When a server fails, the disks that it owns "fail over" to the surviving server, transparently to the clients.
(Diagram: disk ownership moving to the surviving server.)
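A minimal sketch of the shared-nothing idea on these two slides: disks hang off a shared bus, but at any moment each disk has exactly one owning node, and ownership of a failed node's disks moves to the survivor. Node and disk names are hypothetical.

```python
# "Shared nothing" sketch: one owner per disk at any moment in time.

disk_owner = {"disk-1": "node-a", "disk-2": "node-a", "disk-3": "node-b"}

def write(node, disk, data):
    if disk_owner[disk] != node:
        raise PermissionError(f"{node} does not own {disk}")
    print(f"{node} wrote {data!r} to {disk}")

def fail_over_disks(failed, survivor):
    # Ownership of the failed node's disks moves to the surviving node.
    for disk, owner in disk_owner.items():
        if owner == failed:
            disk_owner[disk] = survivor

write("node-a", "disk-1", "ok")        # allowed: node-a owns disk-1
fail_over_disks("node-a", "node-b")    # node-a dies
write("node-b", "disk-1", "ok again")  # now allowed: ownership moved
```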
#2. "Shared Disk" Model
- Disks are not owned by individual servers but are shared by all servers.
- At any moment in time, any server can access any disk.
- A Distributed Lock Manager arbitrates disk access so that apps on different servers don't step on one another (and corrupt data).
(Diagram: all servers attached to a shared RAID array.)
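The sketch below shows the arbitration idea only: a lock must be held before touching a block, so two nodes cannot update it at once. A real Distributed Lock Manager coordinates over the cluster interconnect; here it is reduced to a single in-process lock table with hypothetical names.

```python
# Shared-disk sketch: a (drastically simplified) lock manager serializes
# access to disk blocks so concurrent nodes don't corrupt data.

import threading

class LockManager:
    def __init__(self):
        self._locks = {}                 # resource -> owning node
        self._mutex = threading.Lock()

    def acquire(self, node, resource):
        with self._mutex:
            if self._locks.get(resource) not in (None, node):
                return False             # another node holds the lock
            self._locks[resource] = node
            return True

    def release(self, node, resource):
        with self._mutex:
            if self._locks.get(resource) == node:
                del self._locks[resource]

dlm = LockManager()
print(dlm.acquire("node-a", "disk-1:block-42"))  # True
print(dlm.acquire("node-b", "disk-1:block-42"))  # False: must wait
dlm.release("node-a", "disk-1:block-42")
print(dlm.acquire("node-b", "disk-1:block-42"))  # True
```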
Cluster Interconnect
This section is about how servers are tied together and how disks are physically connected to the cluster.
Cluster Interconnect
- Clustered servers always have a client network interconnect, typically Ethernet, to talk to users.
- They also have at least one cluster interconnect to talk to the other nodes and to the disks.
(Diagram: client network, cluster interconnect, HBA, RAID array.)
Cluster Interconnects (cont'd)
Alternatively, they can have two cluster interconnects:
- one for nodes to talk to each other, the "heartbeat interconnect" (typically Ethernet), and
- one for nodes to talk to disks, the "shared disk interconnect" (typically SCSI or Fibre Channel).
(Diagram: heartbeat interconnect and shared disk interconnect, with a NIC and an HBA per node and a shared RAID array.)
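As a small illustration of the wiring described on these two slides, the sketch below models a node's three connections: client network, heartbeat, and shared-disk path. The class and device names are placeholders, not anything defined by MSCS or the slides.

```python
# Sketch of per-node cluster wiring. Field and device names are placeholders.

from dataclasses import dataclass

@dataclass
class NodeWiring:
    client_nic: str     # talks to users (typically Ethernet)
    heartbeat_nic: str  # talks to the other node(s)
    disk_hba: str       # talks to shared disks (SCSI or Fibre Channel)

node_a = NodeWiring(client_nic="eth0", heartbeat_nic="eth1", disk_hba="scsi0")
node_b = NodeWiring(client_nic="eth0", heartbeat_nic="eth1", disk_hba="scsi0")
print(node_a, node_b, sep="\n")
```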
Microsoft's Wolfpack Clusters
Clusters Are Not New
- Clusters have been around since 1985, and most UNIX systems are clustered.
- What's new is Microsoft clusters:
  - Code-named "Wolfpack", named Microsoft Cluster Server (MSCS).
  - MSCS is the software that provides clustering and is part of Windows NT, Enterprise Server.
Microsoft Cluster Rollout
Wolfpack-I
- In Windows NT, Enterprise Server 4.0 (NT/E 4.0) [which also includes Transaction Server and Reliable Message Queue].
- Two-node "failover cluster".
- Shipped October 1997.
Wolfpack-II
- In Windows NT, Enterprise Server 5.0 (NT/E 5.0).
- "N"-node (probably up to 16) "load balancing cluster".
- Beta in 1998, shipping in 1999.
MSCS (NT/E 4.0) Overview
- Two-node "failover" cluster.
- "Shared nothing" model: at any moment in time, each disk is owned and addressable by only one server.
- Two cluster interconnects:
  - "Heartbeat" cluster interconnect: Ethernet.
  - Shared disk interconnect: SCSI (any flavor) or Fibre Channel (SCSI protocol over Fibre Channel).
- Each node has a "private system disk" (boot disk).
MSCS (NT/E 4.0) Topologies
#1. Host-based (PCI) RAID arrays
#2. External RAID arrays
NT Cluster with Host-Based RAID Array
Each node has:
- an Ethernet NIC for the heartbeat,
- a private system disk (generally on an HBA), and
- a PCI-based RAID controller (SCSI or Fibre).
Nodes share access to the data disks but do not share data.
(Diagram: shared disk interconnect, heartbeat interconnect, RAID controllers, HBA, NIC.)
NT Cluster with SCSI External RAID Array
Each node has:
- an Ethernet NIC for the heartbeat, and
- a multi-channel HBA connecting the boot disk and the external array.
A shared external RAID controller sits on the SCSI bus (DAC SX).
(Diagram: shared disk interconnect, heartbeat interconnect, HBA, NIC.)
NT Cluster with Fibre External RAID Array
- DAC SF or DAC FL (SCSI to the disks)
- DAC FF (Fibre to the disks)
(Diagram: shared disk interconnect, heartbeat interconnect, HBA, NIC.)
MSCS -- A Few of the Details
Cluster Interconnect & Heartbeats
Cluster interconnect
- A private Ethernet between the nodes, used to transmit "I'm alive" heartbeat messages.
Heartbeat messages
- When a node stops getting heartbeats, it assumes the other node has died and initiates failover.
- In some failure modes both nodes stop getting heartbeats (a NIC dies, or someone trips over the cluster cable). Both nodes are still alive, but each thinks the other is dead: the "split brain" syndrome. Both nodes initiate failover. Who wins?
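A minimal sketch of the heartbeat test described above: a node declares its peer dead after missing several consecutive heartbeats. If the heartbeat path itself breaks, both nodes reach that conclusion at once (split brain), and the quorum disk on the next slides decides who survives. The interval and miss limit below are assumptions for illustration, not MSCS values.

```python
# Heartbeat sketch: declare the peer dead after too long without a beat.

import time

MISSED_LIMIT = 3      # heartbeats missed before declaring the peer dead
INTERVAL = 1.0        # seconds between expected heartbeats (assumed)

def peer_is_dead(last_heartbeat, now=None):
    now = now if now is not None else time.time()
    return (now - last_heartbeat) > MISSED_LIMIT * INTERVAL

last = time.time() - 5.0   # pretend the last heartbeat arrived 5 s ago
if peer_is_dead(last):
    print("peer presumed dead -> start failover (or quorum arbitration)")
```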
Quorum Disk
- A special cluster resource that stores the cluster log.
- When a node joins a cluster, it attempts to reserve the quorum disk:
  - If the quorum disk does not have an owner, the node takes ownership and forms a cluster.
  - If the quorum disk has an owner, the node joins the cluster.
(Diagram: disk interconnect, cluster heartbeat interconnect, HBA, with the quorum disk highlighted.)
Quorum Disk (cont'd)
If the nodes cannot communicate (no heartbeats), only one is allowed to continue operating. They use the quorum disk to decide which one lives:
- Each node waits, then tries to reserve the quorum disk.
- The last owner waits the shortest time, so if it's still alive it will retake ownership of the quorum disk.
- When the other node attempts to reserve the quorum disk, it will find that it's already owned.
- The node that doesn't own the quorum disk then fails over.
This is called the Challenge/Defense protocol.
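The sketch below captures the challenge/defense idea only: both nodes race to reserve the quorum disk, the previous owner is given the shorter wait, and the loser backs off. It is a thread race in one process with illustrative delays; the real protocol arbitrates with reservations on the physical quorum disk as the slide describes.

```python
# Challenge/defense sketch: the previous owner waits less, so a live
# defender wins the quorum disk and the challenger fails over.

import threading
import time

class QuorumDisk:
    def __init__(self):
        self._lock = threading.Lock()
        self.owner = None

    def try_reserve(self, node):
        with self._lock:
            if self.owner is None:
                self.owner = node
                return True
            return self.owner == node

def arbitrate(disk, node, is_previous_owner, results):
    time.sleep(0.1 if is_previous_owner else 0.5)   # defender waits less
    results[node] = disk.try_reserve(node)

disk, results = QuorumDisk(), {}
a = threading.Thread(target=arbitrate, args=(disk, "node-a", True, results))
b = threading.Thread(target=arbitrate, args=(disk, "node-b", False, results))
a.start(); b.start(); a.join(); b.join()
print(results)   # {'node-a': True, 'node-b': False} -> node-b fails over
```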
Microsoft Cluster Server (MSCS): Objects
- MSCS defines many objects, but only two matter here: resources and groups.
- Resources: applications, data files, disks, IP addresses, ...
- Groups: an application and its related resources, such as its data on disks.
Microsoft Cluster Server (MSCS) (cont'd)
- When a server dies, groups fail over.
- When a server is repaired and returned to service, groups fail back.
- Since data on disks is included in groups, disks fail over and fail back with them.
(Diagram: Mail and Web groups, each containing its resources, on the two nodes.)
Groups Failover
- Groups are the entities that fail over, and they take their disks with them.
(Diagram: both the Mail and Web groups, with their resources, now on the surviving node.)
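To tie the last three slides together, here is a small sketch of the resource/group model: a group bundles an application with the resources it needs (an IP address, disks), and the whole group changes owner as a unit on failover. The dictionaries and names are hypothetical, not the MSCS API.

```python
# Sketch of MSCS-style groups: resources move with their group on failover.

mail_group = {
    "name": "Mail",
    "resources": ["mail-app", "ip 10.0.0.5", "disk-1"],
    "owner": "node-a",
}
web_group = {
    "name": "Web",
    "resources": ["web-app", "ip 10.0.0.6", "disk-2"],
    "owner": "node-b",
}

def fail_over(groups, failed_node, survivor):
    for group in groups:
        if group["owner"] == failed_node:
            group["owner"] = survivor    # app, IP, and disks move together

fail_over([mail_group, web_group], "node-a", "node-b")
print(mail_group["owner"], web_group["owner"])   # node-b node-b
```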
Microsoft Cluster Certification
Two levels of certification:
- Cluster component certification
  - HBAs and RAID controllers must be certified.
  - When they pass, they're listed on the Microsoft web site (www.microsoft.com/hwtest/hcl/) and are eligible for inclusion in cluster system certification.
- Cluster system certification
  - A complete two-node cluster.
  - When it passes, it's listed on the Microsoft web site and will be supported by Microsoft.
Each certification takes 30 - 60 days.
Mylex NT Cluster Solutions
Target Markets
(Diagram: Mylex products mapped to market segments, from the commodity PC market (including mobile), PC-based workstations, and performance desktop PCs up through entry-level, mid-range, and enterprise servers; products shown include AcceleRAID 150, 200, and 250, eXtremeRAID 1100, DAC-PJ, DAC-PG, DAC-SX, DAC-SF, DAC-FL, and DAC-FF.)
Internal vs. External RAID Positioning
Internal RAID
- Lower cost solution.
- Higher performance in read-intensive applications.
- Proven TPC-C performance enhances cluster performance.
External RAID
- Higher performance in write-intensive applications (the write-back cache is turned off in PCI RAID controllers).
- Higher connectivity: attach more disk drives.
- Greater footprint flexibility (until PCI RAID implements Fibre).
Why We're Better -- External RAID
Robust active-active Fibre implementation
- Shipping active-active for over a year; it works in NT (certified) and Unix environments.
- Fibre on the back end is coming soon.
Mirrored cache architecture
- Without a mirrored cache, data is inaccessible or dropped on the floor when a controller fails -- unless you turn off the write-back cache, which degrades write performance by 5x to 30x.
Four to six disk channels
- I/O bandwidth and capacity scaling.
Dual Fibre host ports
- NT expects to access data over pre-configured paths; if it doesn't find the data over the expected path, I/Os don't complete and applications fail.
SX Active/Active Duplex
(Diagram: two nodes with HBAs, dual DAC SX controllers, an Ultra SCSI disk interconnect, and the cluster interconnect.)
SF (or FL) Active/Active Duplex -- Single FC Array Interconnect
(Diagram: two nodes with FC HBAs connected to dual DAC SF controllers over a single Fibre Channel array interconnect.)
SF (or FL) Active/Active Duplex -- Dual FC Array Interconnect
(Diagram: two nodes with FC HBAs connected to dual DAC SF controllers over dual Fibre Channel array interconnects, with an FC disk interconnect.)
FF Active/Active Duplex -- Single FC Array Interconnect
(Diagram: two nodes with FC HBAs connected to dual DAC FF controllers over a single Fibre Channel array interconnect.)
FF Active/Active Duplex -- Dual FC Array Interconnect
(Diagram: two nodes with FC HBAs connected to dual DAC FF controllers over dual Fibre Channel array interconnects.)
Why We'll Be Better -- Internal RAID
- Deliver auto-rebuild.
- Deliver RAID expansion:
  - MORE-I: add logical units on-line.
  - MORE-II: add or expand logical units on-line.
- Deliver RAID level migration: 0 -> 1, 1 -> 0, 0 -> 5, 5 -> 0, 1 -> 5, 5 -> 1.
- And (of course) award-winning performance.
NT Cluster with Host-Based RAID Array (eXtremeRAID)
Nodes have:
- an Ethernet NIC for the heartbeat,
- private system disks (on an HBA), and
- a PCI-based eXtremeRAID controller.
(Diagram: shared disk interconnect, heartbeat interconnect, eXtremeRAID controllers, HBA, NIC.)
Why eXtremeRAID & DAC960PJ
Clusters typically have four or fewer processors. eXtremeRAID and the DAC960PJ:
- offer a less expensive, integrated RAID solution,
- can combine clustered and non-clustered applications in the same enclosure, and
- use today's readily available hardware.
TPC-C Performance for Clusters: DAC960PJ
- Two external Ultra channels at 40 MB/s each.
- Three internal Ultra channels at 40 MB/s each.
- 32-bit PCI bus between the controller and the server, providing burst data transfer rates up to 132 MB/s.
- 66 MHz i960 processor off-loads RAID management from the host CPU.
eXtremeRAID: Blazing Clusters
eXtremeRAID achieves a breakthrough in RAID technology, eliminates storage bottlenecks, and delivers scalable performance for NT clusters.
- 64-bit PCI bus doubles the data bandwidth between the controller and the server, providing burst data transfer rates up to 266 MB/s.
- Three Ultra2 SCSI LVD channels at 80 MB/s each, for up to 42 shared storage devices and connectivity up to 12 meters.
- 233 MHz StrongARM RISC processor off-loads RAID management from the host CPU.
- Mylex's new firmware is optimized for performance and manageability.
- Supports up to 42 drives per cluster, as much as 810 GB of capacity per controller; performance increases as you add drives.
(Board diagram: 233 MHz RISC CPU, NVRAM, memory module with BBU, SCSI-PCI bridge, channel connectors, LEDs, and serial port.)
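The 132 MB/s and 266 MB/s burst figures on the last two slides follow from bus width times bus clock. The sketch below works the arithmetic, assuming the standard 33 MHz PCI clock (33.33 MHz nominal), which the slides imply but do not state.

```python
# PCI burst rate is roughly bus width (bytes per transfer) x bus clock.

PCI_CLOCK_MHZ = 33.33   # assumed conventional PCI clock

def pci_burst_mb_per_s(bus_width_bits, clock_mhz=PCI_CLOCK_MHZ):
    return bus_width_bits / 8 * clock_mhz   # bytes/transfer x Mtransfers/s

print(round(pci_burst_mb_per_s(32)))   # ~133 -> the quoted 132 MB/s
print(round(pci_burst_mb_per_s(64)))   # ~267 -> the quoted 266 MB/s
```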
eXtremeRAID 1100 NT Clusters
Nodes have:
- an Ethernet NIC for the heartbeat,
- private system disks (on an HBA), and
- a PCI-based eXtremeRAID controller.
Nodes share access to the data disks but do not share data.
(Diagram: three shared Ultra2 interconnects, heartbeat interconnect, HBA, NIC, eXtremeRAID controllers.)
Cluster Support Plans
Internal RAID
- Windows NT 4.0: 1998
- Windows NT 5.0: 1999
- Novell Orion: Q4 98
- SCO: TBD
- Sun: TBD
External RAID
- Windows NT 4.0: 1998
- Windows NT 5.0: 1999
- Novell Orion: TBD
- SCO: TBD
Plans for NT Cluster Certification
Microsoft clustering (submission dates):
- DAC SX: completed (simplex)
- DAC SF: completed (simplex)
- DAC SX: July (duplex)
- DAC SF: July (duplex)
- DAC FL: August (simplex)
- DAC FL: August (duplex)
- DAC960PJ: Q4 '99
- eXtremeRAID 1164: Q4 '99
- AcceleRAID: Q4 '99
What RAID Arrays Are Right for Clusters?
- eXtremeRAID 1100
- AcceleRAID 200
- AcceleRAID 250
- DAC SF
- DAC FL
- DAC FF