1
Beowulf Clusters
Matthew Doney
2
What is a cluster?
- A cluster is a group of several connected computers; there are several different methods of connecting them.
- Distributed: computers widely separated, connected over the Internet; used by research projects such as SETI@home and GIMPS.
- Workstation Cluster: a collection of workstations loosely connected by a LAN.
- Cluster Farm: PCs connected over a LAN that perform work when idle.
3
What is a Beowulf Cluster?
- A Beowulf cluster is one class of cluster computer.
- Uses Commercial Off-The-Shelf (COTS) hardware.
- Typically contains both master and slave nodes.
- Not defined by a specific piece of hardware.
Image Source: http://www.cse.mtu.edu/Common/cluster.jpg
4
What is a Beowulf Cluster?
- The origin of the name "Beowulf": the main character of the Old English poem.
- Described in the poem: "he has thirty men's heft of grasp in the gripe of his hand, the bold-in-battle."
Image Source: http://www.teachingcollegeenglish.com/wp-content/uploads/2011/06/lynd-ward-17-jnanam-dot-net.jpg
5
Cluster Computer History – 1950's
- SAGE, one of the first cluster computers.
- Developed by IBM for NORAD.
- Linked radar stations together to form the first early-warning detection system.
Image Source: http://www.ieeeghn.org/wiki/images/3/34/Sage_nomination.jpg
6
Cluster Computer History – 1970's
Technological advancements:
- VLSI (Very Large Scale Integration)
- Ethernet
- UNIX operating system
7
Cluster Computer History – 1980's
- Increased interest in cluster computing (e.g. the NSA connected 160 Apollo workstations in a cluster configuration).
- First widely used clustering product: VAXcluster.
- Development of task scheduling software: the Condor package, developed by UW-Madison.
- Development of parallel programming software: PVM (Parallel Virtual Machine).
8
Cluster Computer History – 1990's
- NOW (Network of Workstations) project at UC Berkeley: the first cluster on the TOP500 list.
- Development of the Myrinet LAN system.
- Beowulf project started at NASA's Goddard Space Flight Center.
Image Source: http://www.cs.berkeley.edu/~pattrsn/Arch/NOW2.jpg
9
Cluster Computer History – Beowulf
- Developed by Thomas Sterling and Donald Becker.
- 16 individual nodes, each with:
  - 100 MHz Intel 80486 processor
  - 16 MB memory, 500 MB hard drive
  - Two 10 Mbps Ethernet ports
- Ran an early version of Linux.
- Used the PVM library.
10
Cluster Computer History – 1990's
- MPI standard developed: created as a global standard to replace existing message passing protocols.
- DOE, NASA, and California Institute of Technology collaboration:
  - Developed a Beowulf system with a sustained performance of 1 Gflops.
  - Cost $50,000.
  - Awarded the Gordon Bell Prize for price/performance.
- 28 clusters were on the TOP500 list by the end of the decade.
11
Beowulf Cluster Advantages
- Price/performance: using COTS hardware greatly reduces associated costs.
- Scalability: because the system is built from individual nodes, more can easily be added by slightly altering the network.
- Convergence architecture: commodity hardware has standardized operating systems, instruction sets, and communication protocols, so code portability has greatly increased.
12
Beowulf Cluster Advantages
- Flexibility of configuration and upgrades: a large variety of COTS components is available, and their standardization allows for easy upgrades.
- Technology tracking: new components can be used as soon as they come out, with no delay waiting for manufacturers to integrate them.
- High availability: the system will continue to run if an individual node fails.
13
Beowulf Cluster Advantages
- Level of control: the system is easily configured to the user's liking.
- Development cost and time: no special hardware needs to be designed; less time is spent designing the system, since parts are simply chosen from cheaper mass-market components.
14
Beowulf Cluster Disadvantages
- Programming difficulty: programs need to be highly parallelized to take advantage of the hardware design.
- Distributed memory: program data is split over the individual nodes, network speed can bottleneck performance, and results may need to be gathered by a single node.
15
Beowulf Cluster Architecture
Master-slave configuration:
- Master node: job scheduling, system monitoring, resource management.
- Slave nodes: do the assigned work, communicate with other slave nodes, send results to the master node.
16
Node Hardware
- Typically desktop PCs.
- Can consist of other types of computers, e.g.:
  - Rack-mount servers
  - Case-less motherboards
  - PS3s
  - Raspberry Pi boards
17
Node Software
- Operating system
- Resource manager
- Message passing software
18
Resource Management Software
- Condor: developed by UW-Madison; allows distributed job submission.
- PBS (Portable Batch System): initially developed by NASA to schedule jobs on parallel compute clusters.
- Maui: adds enhanced monitoring to an existing job scheduler (e.g. PBS); allows administrators to set individual and group job priorities.
19
Sample Condor Submit File
- Submits 150 copies of the program foo.
- Each copy of the program has its own input, output, and error message file.
- All of the log information from Condor goes to one file.
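A minimal sketch of such a submit description file, assuming the vanilla universe and hypothetical per-copy file names built from Condor's $(Process) macro (foo.0.in, foo.1.in, and so on):

```
# Hypothetical HTCondor submit description file for 150 copies of foo
Executable = foo
Universe   = vanilla

# Each queued copy gets its own input, output, and error file,
# distinguished by the $(Process) number (0 through 149).
Input      = foo.$(Process).in
Output     = foo.$(Process).out
Error      = foo.$(Process).err

# One shared file collects Condor's event log for all 150 copies.
Log        = foo.log

Queue 150
```

Saved as, say, foo.submit, the file would be handed to condor_submit, which queues all 150 processes at once.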
20
Sample Maui Configuration File
- User yangq will have the highest priority, while users of the group ART have the lowest.
- Members of group CS_SE are limited to 20 jobs which use no more than 100 nodes.
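A rough sketch of a maui.cfg fragment expressing those policies, assuming the standard USERCFG/GROUPCFG priority and throttling parameters; the weight and priority numbers are illustrative only:

```
# Hypothetical maui.cfg fragment
# Nonzero credential weights so user/group priorities count toward job priority
USERWEIGHT       10
GROUPWEIGHT      10

USERCFG[yangq]   PRIORITY=1000              # highest-priority user
GROUPCFG[ART]    PRIORITY=-1000             # lowest-priority group
GROUPCFG[CS_SE]  MAXJOB=20 MAXNODE=100      # at most 20 jobs on at most 100 nodes
```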
21
Sample PBS Submit File
- Submits a job "my_job_name" that needs 1 hour and 4 CPUs with 2 GB of memory.
- Uses file "my_job_name.in" as input.
- Uses file "my_job_name.log" as output.
- Uses file "my_job_name.err" as error output.
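A sketch of such a submit script, assuming Torque-style resource syntax, four processors on a single node, and a placeholder executable ./my_program that reads the input file on stdin:

```sh
#!/bin/bash
# Hypothetical PBS submit script
#PBS -N my_job_name            # job name
#PBS -l walltime=01:00:00      # 1 hour of wall-clock time
#PBS -l nodes=1:ppn=4          # 4 CPUs (interpreted here as 4 processors on one node)
#PBS -l mem=2gb                # 2 GB of memory
#PBS -o my_job_name.log        # standard output file
#PBS -e my_job_name.err        # standard error file

cd "$PBS_O_WORKDIR"            # run from the directory the job was submitted from
./my_program < my_job_name.in  # placeholder program reading the input file
```

The script would typically be submitted with qsub.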
22
Message Passing Software
- MPI (Message Passing Interface): widely used in the HPC community; the specification is controlled by the MPI Forum; available for free.
- PVM (Parallel Virtual Machine): the first message passing system to be widely used; provided fault-tolerant operation.
23
MPI Hello World Example
24
MPI Hello World Example (cont.)
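A minimal MPI hello world in C along these lines, in which every process reports its rank and the total process count:

```c
/* Minimal MPI "Hello World" sketch */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank, size;

    MPI_Init(&argc, &argv);               /* start the MPI runtime                */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank); /* this process's id within the job     */
    MPI_Comm_size(MPI_COMM_WORLD, &size); /* total number of processes in the job */

    printf("Hello world from process %d of %d\n", rank, size);

    MPI_Finalize();                       /* shut down MPI before exiting */
    return 0;
}
```

Compiled with mpicc and launched with, for example, mpirun -np 4 ./hello, each of the four processes prints its own rank.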
25
PVM Hello World Example
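A sketch along the lines of the classic two-task PVM hello example, in which a master spawns a worker named hello_other and receives a greeting string back; the names and buffer sizes are illustrative:

```c
/* hello.c -- master task: spawns one copy of hello_other and waits for its message */
#include <stdio.h>
#include "pvm3.h"

int main(void)
{
    int cc, tid;
    char buf[100];

    printf("Master is task t%x\n", pvm_mytid());        /* enroll in PVM, print task id */

    /* spawn one instance of the worker executable anywhere in the virtual machine */
    cc = pvm_spawn("hello_other", (char **)0, PvmTaskDefault, "", 1, &tid);

    if (cc == 1) {
        cc = pvm_recv(-1, -1);                          /* block for a message from any task */
        pvm_bufinfo(cc, (int *)0, (int *)0, &tid);      /* find out which task sent it       */
        pvm_upkstr(buf);                                /* unpack the greeting string        */
        printf("Message from t%x: %s\n", tid, buf);
    } else {
        printf("Could not start hello_other\n");
    }

    pvm_exit();                                         /* leave the virtual machine */
    return 0;
}

/* hello_other.c -- worker task: sends a greeting back to its parent */
#include <string.h>
#include <unistd.h>
#include "pvm3.h"

int main(void)
{
    int ptid = pvm_parent();                 /* task id of the master that spawned us */
    char buf[100];

    strcpy(buf, "hello world from ");
    gethostname(buf + strlen(buf), 64);      /* append this node's hostname */

    pvm_initsend(PvmDataDefault);            /* initialize a send buffer    */
    pvm_pkstr(buf);                          /* pack the string             */
    pvm_send(ptid, 1);                       /* send it with message tag 1  */

    pvm_exit();
    return 0;
}
```

For pvm_spawn to succeed, the compiled hello_other binary has to be installed where the PVM daemon looks for spawned executables.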
27
Interconnection Hardware
- Two main choices: technology and topology.
- Main technologies:
  - Ethernet, with speeds up to 10 Gbps
  - InfiniBand, with speeds up to 300 Gbps
Image Source: http://www.sierra-cables.com/Cables/Images/12X-Infiniband-R.jpg
28
Interconnection Topology
- Torus Network
- Bus Network
- Flat Neighborhood Network
31
Questions???