
Beowulf Clusters Matthew Doney

What is a cluster?  A cluster is a group of interconnected computers that work together as a single system  There are several different methods of connecting them  Distributed  Computers widely separated, connected over the Internet  Used by distributed computing projects such as GIMPS  Workstation Cluster  Collection of workstations loosely connected by a LAN  Cluster Farm  PCs connected over a LAN that perform work when idle

What is a Beowulf Cluster  A Beowulf Cluster is one class of a cluster computer  Uses Commercial Off The Shelf (COTS) hardware  Typically contains both master and slave nodes  Not defined by a specific piece of hardware Image Source:

What is a Beowulf Cluster  The origin of the name “Beowulf”  Main character of Old English poem  Described in the poem – “he has thirty men’s heft of grasp in the gripe of his hand, the bold-in-battle”. Image Source: content/uploads/2011/06/lynd-ward-17-jnanam- dot-net.jpg

Cluster Computer History – 1950’s  SAGE, one of the first cluster computers  Developed by IBM for NORAD  Linked radar stations together to form the first early-warning detection system

Cluster Computer History – 1970’s  Technological Advancements  VLSI (Very Large Scale Integration)  Ethernet  UNIX Operating System

Cluster Computer History – 1980’s  Increased interest in cluster computing  Ex: NSA connected 160 Apollo workstations in a cluster configuration  First widely used clustering product: VAXcluster  Development of task-scheduling software  Condor package developed by UW-Madison  Development of parallel programming software  PVM (Parallel Virtual Machine)

Cluster Computer History – 1990’s  NOW (Network of Workstations) project at UC Berkeley  First cluster on the TOP500 list  Development of the Myrinet LAN system  Beowulf project started at NASA’s Goddard Space Flight Center

Cluster Computer History – Beowulf  Developed by Thomas Sterling and Donald Becker  16 individual nodes  100 MHz Intel processors  16 MB memory, 500 MB hard drive per node  Two 10 Mbps Ethernet ports  Early version of Linux  Used the PVM library

Cluster Computer History – 1990’s  MPI standard developed  Created as a global standard to replace existing message-passing protocols  DOE, NASA, and California Institute of Technology collaboration  Developed a Beowulf system with a sustained performance of 1 Gflops  Cost $50,000  Awarded the Gordon Bell Prize for price/performance  28 clusters were on the TOP500 list by the end of the decade

Beowulf Cluster Advantages  Price/Performance  Using COTS hardware greatly reduces associated costs  Scalability  By using individual nodes, more can easily be added by slightly altering the network  Convergence Architecture  Using commodity hardware has standardized operating systems, instruction sets, and communication protocols  Code portability has greatly increased

Beowulf Cluster Advantages  Flexibility of Configuration and Upgrades  Large variety of COTS components  Standardization of COTS components allows for easy upgrades  Technology Tracking  Can use new components as soon as they come out  No delay time waiting for manufacturers to integrate components  High Availability  System will continue to run if an individual node fails

Beowulf Cluster Advantages  Level of Control  System is easily configured to users liking  Development Cost and Time  No special hardware needs to be designed  Less time designing system, just pick parts to be used  Cheaper mass market components

Beowulf Cluster Disadvantages  Programming Difficulty  Programs need to be highly parallelized to take advantage of the hardware design  Distributed Memory  Program data is split across the individual nodes  Network speed can bottleneck performance  Results may need to be collected and combined by a single node

Beowulf Cluster Architecture  Master-Slave configuration  Master Node  Job scheduling  System monitoring  Resource management  Slave Node  Does assigned work  Communicates with other slave nodes  Sends results to master node

Node Hardware  Typically desktop PCs  Can consist of other types of computers, e.g.  Rack-mount servers  Case-less motherboards  PS3s  Raspberry Pi boards

Node Software  Operating System  Resource Manager  Message Passing Software

Resource Management Software  Condor  Developed by UW-Madison  Allows distributed job submission  PBS (Portable Batch System)  Initially developed by NASA  Developed to schedule jobs on parallel compute clusters  Maui  Adds enhanced monitoring to an existing job scheduler (e.g., PBS)  Allows administrators to set individual and group job priorities

Sample Condor Submit File  Submits 150 copies of the program foo  Each copy of the program has its own input, output, and error message file  All of the log information from Condor goes to one file
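The submit file itself is not preserved in this transcript. A classic Condor submit file with the behavior described above might look like the following sketch; the executable name foo comes from the slide, while the file-name patterns are illustrative assumptions:

```
# Run 150 copies of foo; $(Process) expands to 0..149
universe   = vanilla
executable = foo
input      = foo.$(Process).in    # per-copy input file
output     = foo.$(Process).out   # per-copy output file
error      = foo.$(Process).err   # per-copy error file
log        = foo.log              # one shared log for all 150 jobs
queue 150
```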

Sample Maui Configuration File  User yangq will have the highest priority users of the group ART having lowest  Members of group CS_SE are limited to 20 jobs which use no more than 100 nodes
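The configuration file is likewise missing from the transcript. In maui.cfg syntax the described policy might be sketched as follows; the numeric priority values are illustrative assumptions:

```
# Sample maui.cfg fragment
USERCFG[yangq]    PRIORITY=1000           # highest-priority user
GROUPCFG[ART]     PRIORITY=-1000          # lowest-priority group
GROUPCFG[CS_SE]   MAXJOB=20 MAXNODE=100   # cap: 20 jobs on at most 100 nodes
```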

Sample PBS Submit File  Submits job “my_job_name” that needs 1 hour and 4 CPUs with 2GB of memory  Uses file “my_job_name.in” as input  Uses file “my_job_name.log” as output  Uses file “my_job_name.err” as error output
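A PBS submit script matching that description might look like the following sketch; my_program is a hypothetical executable name, since the actual program is not named on the slide:

```
#!/bin/sh
#PBS -N my_job_name
#PBS -l walltime=01:00:00
#PBS -l ncpus=4
#PBS -l mem=2gb
#PBS -o my_job_name.log
#PBS -e my_job_name.err

cd $PBS_O_WORKDIR
./my_program < my_job_name.in
```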

Message Passing Software  MPI (Message Passing Interface)  Widely used in the HPC community  Specification is controlled by the MPI Forum  Available for free  PVM (Parallel Virtual Machine)  First message-passing protocol to be widely used  Provides fault-tolerant operation

MPI Hello World Example
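The code on this slide did not survive transcription. A canonical MPI hello world, in which every process reports its rank and the total process count, looks like this (requires an MPI installation; compile with mpicc and launch with, e.g., mpirun -np 4):

```c
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv) {
    int rank, size;
    MPI_Init(&argc, &argv);                  /* start the MPI runtime */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);    /* this process's id */
    MPI_Comm_size(MPI_COMM_WORLD, &size);    /* total process count */
    printf("Hello world from process %d of %d\n", rank, size);
    MPI_Finalize();                          /* shut down cleanly */
    return 0;
}
```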


PVM Hello World Example
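This slide's code is also missing. The canonical PVM hello master from the PVM 3 documentation spawns one copy of a slave program (conventionally named hello_other, which packs a greeting string and sends it back) and prints what it receives; it requires a running PVM daemon:

```c
#include <stdio.h>
#include "pvm3.h"

int main(void) {
    int cc, tid;
    char buf[100];

    printf("i'm t%x\n", pvm_mytid());            /* enroll in PVM */
    cc = pvm_spawn("hello_other", (char **)0, 0, "", 1, &tid);
    if (cc == 1) {
        cc = pvm_recv(-1, -1);                   /* wait for any message */
        pvm_bufinfo(cc, (int *)0, (int *)0, &tid);
        pvm_upkstr(buf);                         /* unpack the string */
        printf("from t%x: %s\n", tid, buf);
    } else {
        printf("can't start hello_other\n");
    }
    pvm_exit();                                  /* leave the PVM */
    return 0;
}
```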

Interconnection Hardware  Two main choices: technology and topology  Main Technologies  Ethernet, with speeds up to 10 Gbps  InfiniBand, with speeds up to 300 Gbps

Interconnection Topology  Torus Network  Bus Network  Flat Neighborhood Network

References  [1] Impagliazzo, J., & Lee, J. A. N. (2004). History of Computing in Education. Norwell: Kluwer Academic Publishers.  [2] Pfeiffer, C. (Photographer). (2006, November 25). Cray-1 Deutsches Museum [Web Photo]. Retrieved from 1-deutsches-museum.jpghttp://en.wikipedia.org/wiki/File:Cray- 1-deutsches-museum.jpg  [3] Sterling, T. (2002). Beowulf Cluster Computing with Linux. United States of America: Massahusetts Institue of Technology.  [4] Sterling, T. (2002). Beowulf Cluster Computing with Windows. United State of America: Massachusetts Institute of Technology.  [5] Condor High Throughput Computing. (2013, October 24). Retrieved October 27, 2013, from

References  [6] Beowulf: A Parallel Workstation For Scientific Computation. (1995). Retrieved October 27, 2013, from icpp95.html  [7] Development over Time | TOP500 Supercomputer Sites. Retrieved October 27, 2013, from  [8] Jain, A. (2006). Beowulf cluster design and setup. Retrieved October 27, Informally published manuscript, Department of Computer Science, Boise State University, Retrieved from  [9] Zinner, S. (2012). High Performance Computing Using Beowulf Clusters. Retrieved October 27, Retrieved from

Questions???