Introduction: What is cluster computing? Classification of cluster computing technologies: the Beowulf cluster. Construction of a Beowulf cluster. The use of cluster computing in Bioinformatics and parallel computing.


Introduction
- What is cluster computing?
- Classification of cluster computing technologies: the Beowulf cluster
- Construction of a Beowulf cluster
- The use of cluster computing in Bioinformatics and parallel computing
- Project: high-performance clusters (HPC), a 256-processor Sun cluster
- Build your own cluster!

Mainly in parallel: split the problem into smaller tasks that are executed concurrently.
Why?
- Absolute physical limits of hardware components
- Economic reasons: more complex = more expensive
- Performance limits: doubling the clock frequency does not double performance
- Large applications demand too much memory and time
Advantages: increased speed and better utilization of resources.
Disadvantages: complex programming models and difficult development.

Several application areas for parallel processing:
- Scientific computation
- Digital biology
- Aerospace
- Resource exploration

Architectures of parallel computers:
- PVP (Parallel Vector Processor)
- SMP (Symmetric Multiprocessor)
- MPP (Massively Parallel Processor)
- COW (Cluster of Workstations)
- DSM (Distributed Shared Memory)
Towards inexpensive supercomputing: cluster computing is the commodity supercomputing.

A computer cluster is a group of linked computers working together so closely that in many respects they form a single computer. The components of a cluster are commonly, but not always, connected to each other through fast local area networks. Clusters are usually deployed to improve performance and/or availability over that provided by a single computer, while typically being much more cost-effective than single computers of comparable speed or availability.
A cluster consists of:
- Nodes (master + computing nodes)
- Network
- OS
- Cluster middleware, such as MPI, which permits compute-clustering programs to be portable to a wide variety of clusters

High-availability (HA) clusters (Linux)
- For mission-critical applications.
- High-availability clusters (also known as failover clusters) are implemented to improve the availability of the services the cluster provides: they provide redundancy and eliminate single points of failure.
Network load-balancing clusters
- For web servers, mail servers, etc.
- Operate by distributing a workload evenly over multiple back-end nodes. Typically the cluster is configured with multiple redundant load-balancing front ends; all available servers process requests.
Science clusters
- Beowulf

A Beowulf cluster is a computer design that uses parallel processing across multiple computers to create a cheap and powerful supercomputer. In practice, a Beowulf cluster is usually a collection of generic computers, either stock systems or wholesale parts purchased independently and assembled, connected through an internal network. A cluster has two types of computers: a master computer and node computers. When a large problem or data set is given to a Beowulf cluster, the master computer first runs a program that breaks the problem into small discrete pieces; it then sends a piece to each node to compute. As nodes finish their tasks, the master continually sends them more pieces until the entire problem has been computed.

- Master: also called the service node or front node; used to interact with users and manage the cluster.
- Nodes: a group of computers (computing nodes), typically without keyboard, mouse, floppy, or video.
- Communication between nodes runs over an interconnect network platform (Ethernet, Myrinet, ...).
- For the master and node computers to communicate, some sort of message-passing control structure is required. MPI (Message Passing Interface) is the most commonly used such control.
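A minimal MPI program in C illustrates this master/node message passing. This is a generic sketch, not code from the presentation; it must be built with an MPI compiler wrapper (e.g. mpicc) and launched with mpirun on an MPI-equipped cluster.

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* this process's id */
    MPI_Comm_size(MPI_COMM_WORLD, &size);  /* total number of processes */

    if (rank == 0) {
        /* Master: send one piece of work (here, just an int) to each node. */
        for (int node = 1; node < size; node++) {
            int piece = node * 10;
            MPI_Send(&piece, 1, MPI_INT, node, 0, MPI_COMM_WORLD);
        }
        /* Collect one result back from each node. */
        for (int node = 1; node < size; node++) {
            int result;
            MPI_Recv(&result, 1, MPI_INT, node, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            printf("master got %d from node %d\n", result, node);
        }
    } else {
        /* Node: receive a piece, compute on it, send the result back. */
        int piece, result;
        MPI_Recv(&piece, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        result = piece + rank;  /* stand-in for real computation */
        MPI_Send(&result, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
    }

    MPI_Finalize();
    return 0;
}
```

Run with, for example, `mpirun -np 4 ./a.out` to get one master and three worker nodes.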

To construct a Beowulf cluster, there are four distinct but interrelated areas of consideration:

Brief technical parameters:
- OS: CentOS 5, managed by the Rocks cluster distribution
- Service node: 1 (Intel P4, 2.4 GHz)
- Computing nodes: 32 (Intel P4)
- System memory: 1 GB per node
- Network platforms: Gigabit Ethernet (2 cards per node), Myrinet 2G
- Languages: C, C++, Fortran, Java
- Compilers: GNU gcc, Intel compiler, Sun Java compiler
- Parallel environment: MPICH
- Tools: Ganglia (monitoring), PBS/Torque (scheduler)

OS (Operating System)
Three of the most commonly used OSes that include kernel-level support for parallel programming:
- Windows NT/2000: mainly used to build a high-availability cluster or an NLB (Network Load Balancing) cluster, providing services such as database, file/print, web, and streaming media. Supports 2-4-way SMP, up to 32 processors. Hardly ever used to build a science computing cluster.
- Red Hat Linux: the most widely used OS for a Beowulf cluster. Provides high performance and scalability, high reliability, and low cost (it is freely available and runs on inexpensive commodity hardware).
- Sun Solaris: uses expensive and less common hardware.

Network platform
- Fast Ethernet (100 Mbps): low cost; minimum latency 80 µs
- Gigabit Ethernet (1 Gbps): expensive; minimum latency 300 µs
- Myrinet (a high-speed local area networking system, 2 Gbps): the best network platform of the three
Some design considerations for the interconnect network are:
- Network structure: bus vs. switched
- Maximum bandwidth
- Minimum latency

Parallel environment
Two of the most commonly used parallel interface libraries:
- PVM (Parallel Virtual Machine)
- MPI (Message Passing Interface)
Parallel interface libraries provide a group of communication interface libraries that support message passing. Users can call these libraries directly from their Fortran and C programs.
Cluster computer architecture


- HPC platform for scientific applications
- Storage and processing of large data sets
- Satellite image processing
- Information retrieval and data mining
- Computing systems in an academic environment
- Geologists also use clusters to emulate and predict earthquakes and to model the interior of the Earth and the sea floor
- Clusters are even used to render and manipulate high-resolution graphics in engineering

What is Bioinformatics? Also called "biomedical computing", it is the application of computer science and technology to problems in the biomolecular sciences.
Cluster uses: The Beowulf cluster computing design has been used by parallel-processing projects to build powerful computers that can assist in bioinformatics research and data analysis. In bioinformatics, clusters are used to run DNA string-matching algorithms or protein folding applications. Researchers also use a computer algorithm known as BLAST (Basic Local Alignment Search Tool) to analyze massive sets of DNA sequences.

For bioinformatics, MPICH2 is used: an implementation of MPI that was specifically designed for use with cluster computing systems and parallel processing. It is an open-source set of libraries for various high-level programming languages that gives programmers tools to easily control how large problems are broken apart and distributed to the various computers in a cluster.

Protein folding, and how is folding linked to disease? Proteins are biology's workhorses, its "nanomachines." Before proteins can carry out these important functions, they assemble themselves, or "fold." The process of protein folding, while critical and fundamental to virtually all of biology, in many ways remains a mystery. When proteins do not fold correctly, diseases such as Alzheimer's and Mad Cow disease can result.
How? Folding@home is a distributed computing project: people from throughout the world download and run software, banding together to make one of the largest supercomputers in the world. Each computer uses novel computational methods, coupled with distributed computing, to simulate folding problems. The results get back to the main server: your computer automatically uploads its results each time it finishes a work unit, and downloads a new job at that time.


Brief architectural information:
- Processor: AMD Opteron 2218, dual core, dual socket
- No. of master nodes: 1
- No. of computing nodes: 64
- Cluster software: Rocks version 4.3
- Total peak performance: 1.3 TF
Peak performance: in network performance management, a set of functions that evaluate and report the behavior of telecommunications equipment and the effectiveness of the network or network element, along with other subfunctions such as gathering statistical information, maintaining and examining historical logs, determining system performance under natural and artificial conditions, and altering system modes of operation.

Calculation procedure for peak performance:
- No. of nodes: 64
- Memory (RAM): 4 GB
- Hard disk capacity per node: 250 GB; storage capacity: 4 TB
- No. of processors and cores: 2 × 2 = 4 (dual core, dual socket)
- CPU speed: 2.6 GHz
- Floating-point operations per clock cycle for the AMD processor: 2
Total peak performance = no. of nodes × no. of cores × CPU speed × floating-point operations per cycle = 64 × 4 × 2.6 GHz × 2 = 1.33 TF

Scheduler used: Sun Grid Engine, a job-scheduler software tool.
Application software and compilers:
- Open MPI
- LAM/MPI
- C, C++, and Fortran compilers (both GNU and Intel)
- Bio roll: for biochemical applications

Academically:
- A 1000-node Beowulf cluster system
- Used for genetic algorithm research by John Koza, Stanford University


Which parallel environments are used in building clusters?
Two of the most commonly used parallel interface libraries:
- PVM (Parallel Virtual Machine)
- MPI (Message Passing Interface)
Why MPI over PVM?
1. MPI has more than one freely available, quality implementation (LAM, MPICH, and CHIMP).
2. MPI defines a third-party profiling mechanism.
3. MPI has full asynchronous communication.
4. MPI groups are solid, efficient, and deterministic.
5. MPI efficiently manages message buffers.
6. MPI synchronization protects third-party software.
7. MPI can efficiently program MPPs and clusters.
8. MPI is totally portable.
9. MPI is formally specified.
10. MPI is a standard, and can be implemented on Linux, NT, and many supercomputers.

References:
- Thomas Sterling, Beowulf Cluster Computing with Windows (WMU e-books library).
- Kun Feng, Jiaqi Dong, Jinhua Zhang, Construction of a Beowulf Cluster System for Parallel Computing.