Clustering Technology For Scalability
Jim Gray
Microsoft Research
Cluster: Shared What?
• Shared Memory Multiprocessor
  – Multiple processors, one memory
  – all devices are local
  – DEC, SGI, Sun, Sequent nodes
  – easy to program, not commodity
• Shared Disk Cluster
  – an array of nodes
  – all share common disks
  – VAXcluster + Oracle
• Shared Nothing Cluster
  – each device local to a node
  – ownership may change
  – Tandem, SP2, Wolfpack
The Answer: BOTH SMP and Cluster?
• Grow Up with SMP: 4xP6 is now standard
• Grow Out with Cluster: cluster has inexpensive parts
Cluster of PCs
Clusters being built
• Teradata: 500 nodes (50k$/slice)
• Tandem, VMScluster: 150 nodes (100k$/slice)
• Intel: 9,000 nodes, 55M$ (6k$/slice)
• IBM: m$ (200k$/slice)
• PC clusters (bare handed) at dozens of nodes: web servers (msn, PointCast,…), DB servers
• KEY TECHNOLOGY HERE IS THE APPS.
  – Apps distribute data
  – Apps distribute execution
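The per-slice price quoted for the Intel system can be checked from the totals on the slide. The following is a minimal sketch of that arithmetic; the function name `cost_per_slice` is hypothetical, and it assumes the 9,000 figure is the node count.

```python
def cost_per_slice(total_dollars: float, nodes: int) -> float:
    """Return the average price of one node ("slice") in dollars."""
    if nodes <= 0:
        raise ValueError("nodes must be positive")
    return total_dollars / nodes


# Intel figures from the slide: 55M$ for 9,000 nodes (node count assumed).
intel = cost_per_slice(55_000_000, 9_000)
print(f"Intel: about {intel / 1000:.1f}k$ per slice")  # about 6.1k$
```

The result, roughly 6.1k$, agrees with the 6k$/slice shown on the slide.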
So, What's New?
• When slices cost 50k$, you buy 10 or
  When slices cost 5k$ you buy 100 or
  Manageability, programmability, usability become key issues (total cost of ownership).
• PCs are MUCH easier to use and program
So, What's New?
• PCs create virtuous cycle
[Figure: two cycles. Vicious Cycle: New MPP & New OS, New App, No Customers! Virtuous Cycle: Standard OS & Hardware, Apps, Customers.]
Virtuous Cycle: Standards allow progress and investment protection
What is Wolfpack?
• A consortium of 60 HW & SW vendors (everybody who is anybody)
• A set of APIs for clustering and fault tolerance
• An enhancement to NT Server (in beta test)
• Key concepts
  – System: a particular node
  – Cluster: a collection of systems working together
  – Resource: a hardware or software module
  – Resource dependency: one resource needs another
  – Resource group: fails over as a unit; dependencies do not cross group boundaries
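The resource, dependency, and resource-group concepts can be illustrated with a small model. This is a sketch only, not the Wolfpack API: the classes `Resource` and `ResourceGroup` and the methods `validate` and `online_order` are hypothetical names. It enforces the rule that dependencies do not cross group boundaries and computes an order in which each resource comes online after the resources it needs.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Resource:
    name: str
    depends_on: List[str] = field(default_factory=list)


@dataclass
class ResourceGroup:
    name: str
    resources: Dict[str, Resource] = field(default_factory=dict)

    def add(self, resource: Resource) -> None:
        self.resources[resource.name] = resource

    def validate(self) -> None:
        """Raise ValueError if a dependency points outside this group."""
        for res in self.resources.values():
            for dep in res.depends_on:
                if dep not in self.resources:
                    raise ValueError(
                        f"{res.name} depends on {dep}, outside group {self.name}"
                    )

    def online_order(self) -> List[str]:
        """Return names ordered so each resource follows its dependencies."""
        self.validate()
        order: List[str] = []
        done = set()

        def visit(name: str, path: tuple) -> None:
            if name in done:
                return
            if name in path:
                raise ValueError(f"dependency cycle at {name}")
            for dep in self.resources[name].depends_on:
                visit(dep, path + (name,))
            done.add(name)
            order.append(name)

        for name in self.resources:
            visit(name, ())
        return order


# Hypothetical group: a database that needs a disk and an IP address.
group = ResourceGroup("sql")
group.add(Resource("database", ["disk", "ip"]))
group.add(Resource("disk"))
group.add(Resource("ip"))
print(group.online_order())  # ['disk', 'ip', 'database']
```

Because every dependency resolves inside the group, the whole group can move to another node without leaving a resource stranded.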
What is Wolfpack?
[Figure: Wolfpack architecture. Labels: Cluster Management Tools; Cluster API DLL; RPC; Cluster Service (Database Manager, Event Processor, Node Manager, Failover Manager, Resource Manager, Communication Manager, Global Update Manager); Other Nodes; Resource Monitors; Resource Management Interface (Open, Online, IsAlive, LooksAlive, Offline, Close); Physical Resource DLL; Logical Resource DLL; App Resource DLL; Cluster Aware App; Non Aware App.]
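The six entry points named for the Resource Management Interface (Open, Online, IsAlive, LooksAlive, Offline, Close) suggest a resource life cycle that a monitor drives. The sketch below is a hypothetical Python stand-in, not the actual interface. It assumes LooksAlive is a cheap check run on every poll and IsAlive a more thorough check run less often; the slide does not state this. `DemoService` and `poll` are illustrative names.

```python
from abc import ABC, abstractmethod


class Resource(ABC):
    """Hypothetical stand-in for the six entry points named on the slide."""

    @abstractmethod
    def open(self) -> None: ...

    @abstractmethod
    def online(self) -> None: ...

    @abstractmethod
    def looks_alive(self) -> bool: ...

    @abstractmethod
    def is_alive(self) -> bool: ...

    @abstractmethod
    def offline(self) -> None: ...

    @abstractmethod
    def close(self) -> None: ...


class DemoService(Resource):
    """Toy resource whose failure only the thorough check can see."""

    def __init__(self) -> None:
        self.state = "closed"
        self.broken = False

    def open(self) -> None:
        self.state = "offline"

    def online(self) -> None:
        self.state = "online"

    def looks_alive(self) -> bool:
        # Cheap check (assumed): only confirms the resource was brought online.
        return self.state == "online"

    def is_alive(self) -> bool:
        # Thorough check (assumed): also notices internal breakage.
        return self.state == "online" and not self.broken

    def offline(self) -> None:
        self.state = "offline"

    def close(self) -> None:
        self.state = "closed"


def poll(resource: Resource, ticks: int, thorough_every: int = 5):
    """Return the first tick at which a check fails, or None if all pass."""
    for tick in range(1, ticks + 1):
        if tick % thorough_every == 0:
            healthy = resource.is_alive()
        else:
            healthy = resource.looks_alive()
        if not healthy:
            return tick
    return None


svc = DemoService()
svc.open()
svc.online()
svc.broken = True  # simulate a failure the cheap check cannot see
first_failure = poll(svc, ticks=10)
print(first_failure)  # 5
```

In this toy run the cheap check passes on ticks 1 to 4 and the thorough check catches the failure on tick 5, which is the point at which a failover manager would be told to act.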
Cluster Advantages
• Clients and Servers made from the same stuff.
  – Inexpensive: Built with commodity components
• Fault tolerance:
  – Spare modules mask failures
• Modular growth
  – grow by adding small modules
• Parallel data search
  – use multiple processors and disks
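Parallel data search can be sketched as one scan per partition with the matches merged afterwards. This is an illustration under assumptions: each in-memory list stands in for the data held on one node's disks, and `search_partition` and `parallel_search` are hypothetical names. Threads show the structure only; in CPython they do not give CPU parallelism for pure-Python work, so a real cluster would run each scan on its own node.

```python
from concurrent.futures import ThreadPoolExecutor


def search_partition(partition, predicate):
    """Scan one partition and return the rows that match."""
    return [row for row in partition if predicate(row)]


def parallel_search(partitions, predicate):
    """Scan all partitions concurrently and merge matches in partition order."""
    workers = max(1, len(partitions))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        per_partition = pool.map(
            lambda part: search_partition(part, predicate), partitions
        )
        return [row for matches in per_partition for row in matches]


# Three partitions standing in for three nodes (illustrative data).
partitions = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
evens = parallel_search(partitions, lambda n: n % 2 == 0)
print(evens)  # [2, 4, 6, 8]
```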
Single System Image: Is It Important?
• Yes, if you don't have it you fail
  – parallel MPPs vs Tandem, Teradata, VAXcluster.
• NUMA & Cluster:
  – some things are farther away.
  – Must program in parallel to utilize multiple CPUs, disks, wires
• OS, DBMS, TP monitor, Web Server, ORB give transparency: load balance data and programs.
• Administrator, Programmer, User
  – do not want to know about program & data location
What Happens When a Component Fails?
• Redundant disk or path: configure around it.
• Non-redundant software: restart.
• Non-redundant hardware: migrate software to surviving nodes.
• Fault detection: 1 ms to 10 sec.
• Failover.1 sec to 1 min.
• This is standard in Tandem, Teradata, VMScluster
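The three responses on this slide can be written as a lookup table, with fault detection sketched as a timeout. This is a minimal illustration: the names `POLICY`, `respond`, and `failed_nodes` are hypothetical, and detection by missed heartbeats is an assumption, since the slide gives only the detection time range.

```python
# (component kind, is it redundant?) -> response taken from the slide.
POLICY = {
    ("disk or path", True): "configure around it",
    ("software", False): "restart",
    ("hardware", False): "migrate software to surviving nodes",
}


def respond(kind: str, redundant: bool) -> str:
    """Return the slide's response; raise KeyError for cases it does not cover."""
    return POLICY[(kind, redundant)]


def failed_nodes(last_heartbeat: dict, now: float, timeout: float) -> list:
    """Return nodes whose last heartbeat is older than `timeout` seconds.

    Heartbeat-based detection is an assumption made for this sketch.
    """
    return sorted(
        node for node, seen in last_heartbeat.items() if now - seen > timeout
    )


# Illustrative timestamps in seconds; node "b" has been silent for 12 s.
heartbeats = {"a": 99.5, "b": 88.0, "c": 99.9}
down = failed_nodes(heartbeats, now=100.0, timeout=10.0)
print(down)                         # ['b']
print(respond("hardware", False))   # migrate software to surviving nodes
```

A shorter timeout detects failures sooner but raises the risk of declaring a slow node dead.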
What are Support Costs?
• Cluster lowers support costs by
  – masking failures (instant repair via spare modules)
  – allowing online maintenance and upgrades.
• Commodity parts are much cheaper
  – 10$/MIPS vs 10,000$/MIPS
  – 1k$/OS vs 30k$/month/OS
• Modern OSs are easier to install, configure, manage
  – GUI
  – Self-tuning
  – Online and task-based help
  – Built-in wizards