1 金仲達 (Chung-Ta King), Department of Computer Science, National Tsing Hua University king@cs.nthu.edu.tw Cluster Computing: An Introduction

2 1 Clusters Have Arrived

3 2 What is a Cluster? o A collection of independent computer systems working together as if a single system o Coupled through a scalable, high bandwidth, low latency interconnect o The nodes can exist in a single cabinet or be separated and connected via a network o Faster, closer connection than a network (LAN) o Looser connection than a symmetric multiprocessor

4 3 Outline o Motivations of Cluster Computing o Cluster Classifications o Cluster Architecture & its Components o Cluster Middleware o Representative Cluster Systems o Task Forces on Cluster o Resources and Conclusions

5 4 Motivations of Cluster Computing

6 5 How to Run Applications Faster? o There are three ways to improve performance: m Work harder m Work smarter m Get help o Computer analogy m Use faster hardware: e.g. reduce the time per instruction (clock cycle) m Optimized algorithms and techniques m Multiple computers to solve the problem => techniques of parallel processing are mature and can be exploited commercially

7 6 Motivation for Using Clusters o Performance of workstations and PCs is rapidly improving o Communications bandwidth between computers is increasing o Vast numbers of under-utilized workstations with a huge number of unused processor cycles o Organizations are reluctant to buy large, high performance computers, due to the high cost and short useful life span

8 7 Motivation for Using Clusters o Workstation clusters are thus a cheap and readily available approach to high performance computing m Clusters are easier to integrate into existing networks m Development tools for workstations are mature (see the threads sketch below) l Threads, PVM, MPI, DSM, C, C++, Java, etc. o Use of clusters as a distributed compute resource is cost effective --- it allows incremental growth of the system m Individual node performance can be improved by adding additional resources (new memory blocks/disks) m New nodes can be added or nodes can be removed m Clusters of Clusters and Metacomputing
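To illustrate how mature these workstation development tools are, here is a minimal POSIX threads sketch in C (an added example, not from the original slides; the array size and thread count are arbitrary placeholders): each thread sums part of an array, the same divide-the-work pattern that PVM or MPI programs apply across cluster nodes.

    /* Minimal POSIX threads sketch: divide a sum across worker threads. */
    #include <pthread.h>
    #include <stdio.h>

    #define NTHREADS 4
    #define N 1000

    static double data[N];
    static double partial[NTHREADS];   /* one partial sum per worker */

    static void *worker(void *arg) {
        long id = (long)arg;
        long chunk = N / NTHREADS;
        double s = 0.0;
        for (long i = id * chunk; i < (id + 1) * chunk; i++)
            s += data[i];
        partial[id] = s;
        return NULL;
    }

    int main(void) {
        pthread_t tid[NTHREADS];
        for (long i = 0; i < N; i++) data[i] = 1.0;
        for (long t = 0; t < NTHREADS; t++)
            pthread_create(&tid[t], NULL, worker, (void *)t);
        double total = 0.0;
        for (long t = 0; t < NTHREADS; t++) {
            pthread_join(tid[t], NULL);
            total += partial[t];        /* combine after each worker finishes */
        }
        printf("sum = %.1f (expected %d)\n", total, N);
        return 0;
    }

Compile with cc -pthread; PVM and MPI extend the same idea from threads on one node to processes spread across cluster nodes.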

9 8 Key Benefits of Clusters o High performance: running cluster-enabled programs o Scalability: add servers to the cluster, CPUs to an SMP node, or more clusters to the network as the need arises o High throughput o System availability (HA): inherent high system availability due to the redundancy of hardware, operating systems, and applications o Cost effectiveness

10 9 Why Cluster Now?

11 10 Hardware and Software Trends o Important advances have taken place in the last five years m Network performance increased with reduced cost m Workstation performance improved l Average number of transistors on a chip grows 40% per year l Clock frequency growth rate is about 30% per year l Expect 700-MHz processors with 100M transistors in early 2000 m Availability of powerful and stable operating systems (Linux, FreeBSD) with source code access

12 11 Why Clusters NOW? o Clusters gained momentum when three technologies converged: m Very high performance microprocessors l workstation performance = yesterday's supercomputers m High speed communication m Standard tools for parallel/distributed computing & their growing popularity o Time to market => performance o Internet services: huge demand for scalable, available, dedicated internet servers m big I/O, big compute

13 12 Efficient Communication o The key enabling technology: from killer micro to killer switch m Single chip building block for scalable networks l high bandwidth l low latency l very reliable m Challenges for clusters l greater routing delay and less than complete reliability l constraints on where the network connects into the node l UNIX has a rigid device and scheduling interface

14 13 Putting Them Together... o Building block = complete computers (HW & SW) shipped in 100,000s: Killer micro, Killer DRAM, Killer disk, Killer OS, Killer packaging, Killer investment o Leverage billion-$-per-year investment o Interconnecting building blocks => Killer Net m High bandwidth m Low latency m Reliable m Commodity (ATM, Gigabit Ethernet, Myrinet)

15 14 Windows of Opportunity o The resources available in an average cluster offer a number of research opportunities, such as m Parallel processing: use multiple computers to build MPP/DSM-like systems for parallel computing m Network RAM: use the memory associated with each workstation as an aggregate DRAM cache m Software RAID: use the arrays of workstation disks to provide cheap, highly available, and scalable file storage m Multipath communication: use the multiple networks for parallel data transfer between nodes

16 15 Windows of Opportunity o Most high-end scalable WWW servers are clusters m end services (data, web, enhanced information services, reliability) o Network mediation services are also cluster-based m Inktomi traffic server, etc. m Clustered proxy caches, clustered firewalls, etc. m => These object web applications are increasingly compute intensive m => These applications are an increasing part of “scientific computing”

17 16 Classification of Cluster Computers

18 17 Clusters Classification 1 o Based on Focus (in Market) m High performance (HP) clusters l Grand challenge applications m High availability (HA) clusters l Mission critical applications

19 18 HA Clusters

20 19 Clusters Classification 2 o Based on Workstation/PC Ownership m Dedicated clusters m Non-dedicated clusters l Adaptive parallel computing l Can be used for CPU cycle stealing

21 20 Clusters Classification 3 o Based on Node Architecture m Clusters of PCs (CoPs) m Clusters of Workstations (COWs) m Clusters of SMPs (CLUMPs)

22 21 Clusters Classification 4 o Based on Node Components Architecture & Configuration: m Homogeneous clusters l All nodes have similar configuration m Heterogeneous clusters l Nodes based on different processors and running different OS

23 22 Clusters Classification 5 o Based on Levels of Clustering: m Group clusters (# nodes: 2-99) l A set of dedicated/non-dedicated computers, mainly connected by a SAN such as Myrinet m Departmental clusters (# nodes: 99-999) m Organizational clusters (# nodes: many 100s) m Internet-wide clusters = Global clusters (# nodes: 1000s to many millions) l Metacomputing

24 23 Clusters and Their Commodity Components

25 24 Cluster Computer Architecture

26 25 Cluster Components...1a Nodes o Multiple high performance components: m PCs m Workstations m SMPs (CLUMPS) m Distributed HPC systems leading to Metacomputing o They can be based on different architectures and run different OSes

27 26 Cluster Components...1b Processors o There are many (CISC/RISC/VLIW/Vector..) m Intel: Pentiums, Xeon, Merced…. m Sun: SPARC, ULTRASPARC m HP PA m IBM RS6000/PowerPC m SGI MIPS m Digital Alphas o Integrating memory, processing and networking into a single chip m IRAM (CPU & Mem): (http://iram.cs.berkeley.edu) m Alpha 21364 (CPU, Memory Controller, NI)

28 27 Cluster Components…2 OS o State of the art OS: m Tend to be modular: can easily be extended and new subsystems can be added without modifying the underlying OS structure m Multithreading has added a new dimension to parallel processing m Popular OS used on nodes of clusters: l Linux (Beowulf) l Microsoft NT (Illinois HPVM) l SUN Solaris (Berkeley NOW) l IBM AIX (IBM SP2) l …..

29 28 Cluster Components…3 High Performance Networks o Ethernet (10Mbps) o Fast Ethernet (100Mbps) o Gigabit Ethernet (1Gbps) o SCI (Dolphin, ~12 usec MPI latency) o ATM o Myrinet (1.2Gbps) o Digital Memory Channel o FDDI

30 29 Cluster Components…4 Network Interfaces o Dedicated processing power and storage embedded in the network interface o An I/O card today o Tomorrow on chip? [Figure: a Sun Ultra 170 node with a Myricom NIC on its S-Bus I/O bus (50 MB/s) connecting to the Myricom network (160 MB/s)]

31 30 Cluster Components…4 Network Interfaces o Network interface card m Myrinet has a NIC m User-level access support: VIA m Alpha 21364 processor integrates processing, memory controller, and network interface into a single chip

32 31 Cluster Components…5 Communication Software o Traditional OS-supported facilities (heavyweight due to protocol processing) m Sockets (TCP/IP), Pipes, etc. o Lightweight protocols (user-level): minimal interface into the OS m Users transmit into and receive from the network directly, without OS intervention m Communication protection domains established by the interface card and OS m Treat message loss as an infrequent case m Active Messages (Berkeley), Fast Messages (UIUC),...
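As a concrete, added illustration (not from the original slides) of the traditional heavyweight path, the C sketch below sends one message over a TCP socket; every byte passes through the kernel protocol stack, which is the protocol-processing overhead the slide refers to. The peer address 192.168.1.2 and port 5000 are placeholders.

    /* Minimal TCP client sketch: the "heavyweight" kernel-mediated path. */
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>
    #include <arpa/inet.h>
    #include <sys/socket.h>

    int main(void) {
        int fd = socket(AF_INET, SOCK_STREAM, 0);            /* TCP socket */
        struct sockaddr_in peer;
        memset(&peer, 0, sizeof peer);
        peer.sin_family = AF_INET;
        peer.sin_port = htons(5000);                         /* placeholder port */
        inet_pton(AF_INET, "192.168.1.2", &peer.sin_addr);   /* placeholder node address */

        if (connect(fd, (struct sockaddr *)&peer, sizeof peer) < 0) {
            perror("connect");
            return 1;
        }
        const char msg[] = "hello, cluster node";
        write(fd, msg, sizeof msg);   /* data is copied into the kernel, then onto the wire */
        close(fd);
        return 0;
    }

User-level schemes such as Active Messages or Fast Messages avoid this system-call and copy overhead by letting the application exchange data with the network interface directly.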

33 32 Cluster Components…6a Cluster Middleware o Resides between the OS and applications and offers an infrastructure for supporting: m Single System Image (SSI) m System Availability (SA) o SSI makes a collection of computers appear as a single machine (globalized view of system resources) o SA supports checkpointing and process migration, etc.

34 33 Cluster Components…6b Middleware Components o Hardware m DEC Memory Channel, DSM (Alewife, DASH), SMP techniques o OS/gluing layers m Solaris MC, Unixware, GLUnix o Applications and Subsystems m System management and electronic forms m Runtime systems (software DSM, PFS, etc.) m Resource management and scheduling (RMS): l CODINE, LSF, PBS, NQS, etc.

35 34 Cluster Components…7a Programming Environments o Threads (PCs, SMPs, NOW,..) m POSIX Threads m Java Threads o MPI m Linux, NT, on many Supercomputers o PVM o Software DSMs (Shmem)
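A minimal MPI example in C (added for illustration, not part of the original slides): rank 0 sends one integer to rank 1, showing the message-passing style these programming environments provide on clusters.

    /* Minimal MPI sketch: point-to-point message between two ranks. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv) {
        int rank, size;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this process's id */
        MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total number of processes */

        if (rank == 0 && size > 1) {
            int msg = 42;
            MPI_Send(&msg, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {
            int msg;
            MPI_Status status;
            MPI_Recv(&msg, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);
            printf("rank 1 received %d from rank 0\n", msg);
        }
        MPI_Finalize();
        return 0;
    }

Such a program is typically compiled with mpicc and launched across the cluster nodes with something like mpirun -np 2.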

36 35 Cluster Components…7b Development Tools? o Compilers m C/C++/Java o RAD (rapid application development) tools: GUI-based tools for parallel processing modeling o Debuggers o Performance monitoring and analysis tools o Visualization tools

37 36 Cluster Components…8 Applications o Sequential o Parallel/distributed (cluster-aware applications) m Grand challenge applications l Weather Forecasting l Quantum Chemistry l Molecular Biology Modeling l Engineering Analysis (CAD/CAM) l ………………. m Web servers, data-mining

38 37 Cluster Middleware and Single System Image

39 38 Middleware Design Goals o Complete transparency m Let users see a single cluster system l Single entry point, ftp, telnet, software loading... o Scalable performance m Easy growth of cluster l no change of API and automatic load distribution o Enhanced availability m Automatic recovery from failures l Employ checkpointing and fault tolerant technologies m Handle consistency of data when replicated..

40 39 Single System Image (SSI) o A single system image is the illusion, created by software or hardware, that a collection of computers appears as a single computing resource o Benefits: m Use of system resources transparently m Improved reliability and higher availability m Simplified system management m Reduction in the risk of operator errors m Users need not be aware of the underlying system architecture to use these machines effectively

41 40 Desired SSI Services o Single entry point m telnet cluster.my_institute.edu m telnet node1.cluster.my_institute.edu o Single file hierarchy: AFS, Solaris MC Proxy o Single control point: manage from a single GUI o Single virtual networking o Single memory space - DSM o Single job management: GLUnix, Codine, LSF o Single user interface: like a workstation/PC windowing environment

42 41 SSI Levels o Single system support can exist at different levels within a system, and one level can be built on top of another: m Application and Subsystem Level m Operating System Kernel Level m Hardware Level

43 42 SSI at Application and Subsystem Level
Level: application | Examples: cluster batch system, system management | Boundary: an application | Importance: what a user wants
Level: subsystem | Examples: distributed DB, OSF DME, Lotus Notes, MPI, PVM | Boundary: a subsystem | Importance: SSI for all applications of the subsystem
Level: file system | Examples: Sun NFS, OSF DFS, NetWare, and so on | Boundary: shared portion of the file system | Importance: implicitly supports many applications and subsystems
Level: toolkit | Examples: OSF DCE, Sun ONC+, Apollo Domain | Boundary: explicit toolkit facilities: user, service name, time | Importance: best level of support for heterogeneous systems
(c) In Search of Clusters

44 43 SSI at OS Kernel Level
Level: kernel/OS layer | Examples: Solaris MC, Unixware, MOSIX, Sprite, Amoeba/GLUnix | Boundary: each name space: files, processes, pipes, devices, etc. | Importance: kernel support for applications and adm subsystems
Level: kernel interfaces | Examples: UNIX (Sun) vnode, Locus (IBM) vproc | Boundary: type of kernel objects: files, processes, etc. | Importance: modularizes SSI code within the kernel
Level: virtual memory | Examples: none supporting operating system kernel | Boundary: each distributed virtual memory space | Importance: may simplify implementation of kernel objects
Level: microkernel | Examples: Mach, PARAS, Chorus, OSF/1 AD, Amoeba | Boundary: each service outside the microkernel | Importance: implicit SSI for all system services
(c) In Search of Clusters

45 44 SSI at Hardware Level
Level: memory | Examples: SCI, DASH | Boundary: memory space | Importance: better communication and synchronization
Level: memory and I/O | Examples: SCI, SMP techniques | Boundary: memory and I/O device space | Importance: lower overhead cluster I/O
(c) In Search of Clusters

46 45 Availability Support Functions o Single I/O space (SIO): m Any node can access any peripheral or disk device without knowledge of its physical location o Single process space (SPS) m Any process can create processes on any node, and they can communicate through signals, pipes, etc., as if they were on a single node o Checkpointing and process migration (see the sketch below) m Save the process state and intermediate results in memory or on disk; process migration for load balancing o Reduction in the risk of operator errors
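To make checkpointing concrete, here is an application-level sketch in C (an added, assumed illustration; the file name and state layout are hypothetical, and real cluster middleware typically checkpoints processes transparently): the program periodically saves its state to disk so a restarted run resumes from the last checkpoint instead of from the beginning.

    /* Application-level checkpointing sketch: save/restore loop state. */
    #include <stdio.h>

    #define CKPT_FILE "state.ckpt"    /* hypothetical checkpoint file name */

    struct state { long iteration; double result; };

    static int load_checkpoint(struct state *s) {
        FILE *f = fopen(CKPT_FILE, "rb");
        if (!f) return 0;                        /* no checkpoint: start fresh */
        size_t ok = fread(s, sizeof *s, 1, f);
        fclose(f);
        return ok == 1;
    }

    static void save_checkpoint(const struct state *s) {
        FILE *f = fopen(CKPT_FILE, "wb");
        if (!f) return;
        fwrite(s, sizeof *s, 1, f);
        fclose(f);
    }

    int main(void) {
        struct state s = {0, 0.0};
        if (load_checkpoint(&s))
            printf("resuming at iteration %ld\n", s.iteration);
        for (; s.iteration < 1000000; s.iteration++) {
            s.result += 1e-6;                    /* stand-in for real work */
            if (s.iteration % 100000 == 0)
                save_checkpoint(&s);             /* periodic checkpoint */
        }
        save_checkpoint(&s);
        printf("done: result = %f\n", s.result);
        return 0;
    }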

47 46 Relationship among Middleware Modules

48 47 Strategies for SSI o Build as a layer on top of the existing OS (e.g. GLUnix) m Benefits: l Makes the system quickly portable, tracks vendor software upgrades, and reduces development time l New systems can be built quickly by mapping new services onto the functionality provided by the layer beneath, e.g. GLUnix/Solaris-MC o Build SSI at the kernel level (true cluster OS) m Good, but cannot leverage vendor OS improvements m e.g. Unixware and MOSIX (built using BSD Unix)

49 48 Representative Cluster Systems

50 49 Research Projects of Clusters o Beowulf: CalTech, JPL, and NASA o Condor: University of Wisconsin-Madison o DQS (Distributed Queuing System): Florida State U. o HPVM (High Performance Virtual Machine): UIUC & UCSB o Gardens: Queensland U. of Technology, AU o NOW (Network of Workstations): UC Berkeley o PRM (Prospero Resource Manager): USC

51 50 Commercial Cluster Software o Codine (Computing in Distributed Network Environment): GENIAS GmbH, Germany o LoadLeveler: IBM Corp. o LSF (Load Sharing Facility): Platform Computing o NQE (Network Queuing Environment): Craysoft o RWPC: Real World Computing Partnership, Japan o Unixware: SCO o Solaris-MC: Sun Microsystems

52 51 Solaris MC OS o Solaris MC extends standard Solaris o Makes the cluster look like a single machine => easy administration & use m Global file system: an OO VFS (PXFS) m Global process management m Global networking o Runs existing applications unchanged m Supports the Solaris ABI (application binary interface) o State-of-the-art distributed systems techniques m High availability and powerful distributed OO

53 52 Solaris MC Implementation o Solaris MC is a set of C++ loadable modules on top of Solaris m very few changes to existing kernel o A private Solaris kernel per node: provides reliability o Object-oriented system with well-defined interfaces

54 53 Status of Solaris-MC o 4-node, 8-CPU prototype with Myrinet: m Object and communication infrastructure m Global file system (PXFS) with coherency and caching m Networking TCP/IP with load balancing m Global process management (ps, kill, exec, wait, rfork, /proc) m Monitoring tools m Cluster membership protocols o Demonstrated applications m Commercial parallel database m Scalable Web server m Parallel make m Timesharing

55 54 Comparison of 4 Cluster Systems

56 55 Task Forces on Cluster Computing

57 56 IEEE Task Force on Cluster Computing (TFCC) http://www.dgs.monash.edu.au/~rajkumar/tfcc/ http://www.dcs.port.ac.uk/~mab/tfcc/

58 57 TFCC Activities o Mailing list, workshops, conferences, tutorials, web-resources etc. o Resources for introducing the subject in senior undergraduate and graduate levels o Tutorials/workshops at IEEE Chapters o ….. and so on. o Visit TFCC Page for more details: m http://www.dgs.monash.edu.au/~rajkumar/tfcc/

59 58 Efforts in Taiwan o PC Farm Project at Academia Sinica Computing Center: http://www.pcf.sinica.edu.tw/ o NCHC PC Cluster Project: http://www.nchc.gov.tw/project/pccluster/

60 59 NCHC PC Cluster o A Beowulf class cluster

61 60 System Hardware [Figure: cluster hardware layout; interconnect uses 5 Fast Ethernet switching hubs]

62 61 System Software

63 62 Conclusions o Clusters are promising and fun m Offer incremental growth and match funding patterns m New trends in hardware and software technologies are likely to make clusters more promising m Cluster-based HP and HA systems can be seen everywhere!

64 63 The Future o Cluster systems using idle cycles from computers will continue o Individual nodes will have multiple processors o Widespread usage of Fast and Gigabit Ethernet; they will become the de facto networks for clusters o Cluster software will bypass the OS as much as possible o Unix-based OSes are likely to be the most popular, but the steady improvement and acceptance of NT will not be far behind

65 64 The Challenges o Programming m enable applications, reduce programming effort, distributed object/component models? o Reliability (RAS) m programming effort, reliability with scalability to 1000s of nodes o Heterogeneity m performance, configuration, architecture and interconnect o Resource Management (scheduling, performance prediction) o System Administration/Management o Input/Output (both network and storage)

66 65 Pointers to Literature on Cluster Computing

67 66 Reading Resources..1a Internet & WWW o Computer architecture: m http://www.cs.wisc.edu/~arch/www/ o PFS and parallel I/O: m http://www.cs.dartmouth.edu/pario/ o Linux parallel processing: m http://yara.ecn.purdue.edu/~pplinux/Sites/ o Distributed shared memory: m http://www.cs.umd.edu/~keleher/dsm.html

68 67 Reading Resources..1b Internet & WWW o Solaris-MC: m http://www.sunlabs.com/research/solaris-mc o Microprocessors: recent advances m http://www.microprocessor.sscc.ru o Beowulf: m http://www.beowulf.org o Metacomputing m http://www.sis.port.ac.uk/~mab/Metacomputing/

69 68 Reading Resources..2 Books o In Search of Clusters m by G. Pfister, Prentice Hall (2nd ed.), 1998 o High Performance Cluster Computing m Volume 1: Architectures and Systems m Volume 2: Programming and Applications l Edited by Rajkumar Buyya, Prentice Hall, NJ, USA o Scalable Parallel Computing m by K. Hwang & Z. Xu, McGraw Hill, 1998

70 69 Reading Resources..3 Journals o “A Case for NOW (Networks of Workstations)”, IEEE Micro, Feb. 1995 m by Anderson, Culler, and Patterson o “Fault Tolerant COW with SSI”, IEEE Concurrency m by Kai Hwang, Chow, Wang, Jin, Xu o “Cluster Computing: The Commodity Supercomputing”, Software: Practice and Experience m by Mark Baker & Rajkumar Buyya

