Cluster Computing Overview

What is a Cluster? A cluster is a collection of connected, independent computers that work together to solve a problem.

A Typical Cluster
–Many standalone computers
–All of the cluster can work together on a single problem at the same time
–Portions of the cluster can work on different problems at the same time
–Connected together by a network; larger clusters have separate high speed interconnects
–Administered as a single “machine”

Some Cluster Acronyms
–Node – a machine in a cluster
–Sizes:
–KB – kilobyte – thousand bytes – a small/medium sized file
–MB – megabyte – million bytes – about 2/3 of a 3.5” floppy
–GB – gigabyte – billion bytes – a good amount of computer memory, or a very old disk drive
–TB – terabyte – trillion bytes – about 1/30th of the Library of Congress
–PB – petabyte – quadrillion bytes – about 30 Libraries of Congress
–SMP – symmetric multi-processing (many processors)
–NFS – Network File System
–HPC – High Performance Computing
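As a quick sanity check on these prefixes, here is a minimal sketch using the decimal (powers-of-1,000) interpretation the slide implies; note that memory sizes are often quoted in binary (powers-of-1,024) units instead:

```python
# Decimal byte-size prefixes, as used on this slide (1 KB = 10**3 bytes).
# Note: memory is often measured in binary units (1 KiB = 2**10 bytes).
UNITS = {"KB": 10**3, "MB": 10**6, "GB": 10**9, "TB": 10**12, "PB": 10**15}

def to_bytes(value, unit):
    """Convert a size in the given unit to a raw byte count."""
    return value * UNITS[unit]

# Each prefix is a factor of 1,000 larger than the previous one:
assert to_bytes(1, "PB") == 1000 * to_bytes(1, "TB")
```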

The 1984 Computer Food Chain: Mainframe, Vector Supercomputer, Mini Computer, Workstation, PC

How to Build a Supercomputer: 1980s
A supercomputer was a vector SMP (symmetric multi-processor), e.g., the Cray-2:
–Custom CPUs
–Custom memory
–Custom packaging
–Custom interconnects
–Custom operating system
Costs were extreme: around $5 million/gigaFLOP. Technology evolution tracked roughly 1/3 of Moore’s Law predictions.

The 1994 Computer Food Chain: Mainframe, Vector Supercomputer, MPP, Workstation, PC, Mini Computer – with the mainframe and vector supercomputer hitting a wall, their future bleak

How to Build a Supercomputer: 1990s
A supercomputer was an MPP (massively parallel processor), e.g., the Intel® processor based ASCI Red:
–COTS (Commercial Off The Shelf) CPUs
–COTS memory
–Custom packaging
–Custom interconnects
–Custom operating system
Costs were high: around $200K/gigaFLOP. Technology evolution tracked roughly 1/2 of Moore’s Law predictions.

NCSA’s 1990s Former Cluster
–~1,500 processor SGI, now decommissioned
–Too costly to maintain; software too expensive
–Takes up large amounts of floor space (great for tours – looks impressive, nice displays)
–Gradually being taken out as floor space is required
–Now being used as network file servers

Computer Food Chain (Now and Future)

How to Build a Supercomputer: 2000s
A supercomputer is a cluster, e.g., Loki, an Intel® processor based cluster at Los Alamos National Laboratory (LANL):
–COTS (Commercial Off The Shelf) CPUs
–COTS memory
–COTS packaging
–COTS interconnects
–COTS operating system
Costs are modest: around $4K/gigaFLOP. Technology evolution tracks Moore’s Law.

Upcoming TeraGrid Clusters
–Over 4,000 Itanium 2 processors at 4 supercomputer sites: the National Center for Supercomputing Applications (NCSA), the San Diego Supercomputer Center (SDSC), Argonne National Laboratory, and the California Institute of Technology (Caltech)
–Teraflops of computing power (8 teraflops at NCSA)
–650 terabytes of disk storage
–Linked by a cross-country 40 Gbit network, 16 times faster than the fastest research network currently in existence – 16 minutes to transfer the entire Library of Congress
–Some uses: the study of cosmological dark matter, real-time weather forecasting

Larger Clusters
–Japan wants to top the TOP500 with a new cluster – a 30,000 node cluster is planned
–“Black” clusters (classified) – the NSA used to receive a large fraction of Cray production
–Larger ones are planned, but scaling is a problem – the SciDAC federal mandate aims to solve scaling problems and enable deployment of very large clusters
–Cooperative, widely separated clusters such as SETI@home

Clustering Today Clustering gained momentum when three technologies converged:
–1. Very high-performance microprocessors – workstation performance now equals yesterday’s supercomputers
–2. High speed communication – communication between cluster nodes now rivals that between processors in an SMP
–3. Standard tools for parallel/distributed computing, and their growing popularity

Future Cluster Expansion Directions Hyper-clusters Grid computing

Clusters of Clusters (HyperClusters) [Diagram: three clusters (Cluster 1, Cluster 2, Cluster 3) joined by a LAN/WAN; each cluster has a Scheduler, a Master Daemon, an Execution Daemon, and Submit/Graphical Control clients.]

Towards Grid Computing….

What is the Grid? An infrastructure that couples
–Computers (PCs, workstations, clusters, traditional supercomputers, and even laptops, notebooks, mobile computers, PDAs, etc.)
–Databases (e.g., transparent access to the human genome database)
–Special instruments (e.g., radio telescopes searching for life in the galaxy, or for pulsars)
–People (maybe even animals – who knows; frogs are already planned?)
across local/wide-area networks (enterprise, organisations, or the Internet) and presents them as a unified, integrated (single) resource.

Network Topologies
Cluster has its own private network:
–One or a few outside-accessible machines; most cluster machines on a private network
–Easier to manage
–Better security (only have to secure the entry machines)
–Bandwidth limitations (funneling through a few machines)
–Lower latency between nodes
–Appropriate for smaller clusters
Cluster machines are all on the public network:
–Academic clusters require this
–Some cluster software applications require this
–Harder to secure (have to secure EVERY machine)
–Much higher network bandwidth

Communication Networks
100 Base T (Fast Ethernet):
–10 MB/sec (100 Mb/sec)
–microsecond latency
–Essentially free
Gigabit Ethernet:
–Typically delivers MB/sec
–~$1500 / node (going down rapidly)

Message Passing Most cluster parallel-computing software requires message passing. The speed of a computation often depends as much on message-passing speed as on raw processor speed. Message passing is often done over high speed interconnects because traditional networks are too slow.
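On real clusters this pattern is usually written with a message-passing library such as MPI; purely as an illustration of the send/receive model (using Python's standard multiprocessing pipes on one machine, not a cluster interconnect), it looks like:

```python
# Sketch of the message-passing model: one process sends work to
# another and receives the result back. On a real cluster the same
# pattern uses MPI send/receive calls over the interconnect.
from multiprocessing import Pipe, Process

def worker(conn):
    data = conn.recv()        # block until a message (the work) arrives
    conn.send(sum(data))      # send the partial result back
    conn.close()

if __name__ == "__main__":
    parent_end, child_end = Pipe()
    p = Process(target=worker, args=(child_end,))
    p.start()
    parent_end.send([1, 2, 3, 4])   # message out to the worker
    print(parent_end.recv())        # result comes back: 10
    p.join()
```

The key point is that the only way the two processes share data is by explicit send and receive operations, which is exactly why interconnect speed matters so much.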

High Speed Interconnects
Myrinet, from Myricom (most popular in large clusters):
–Proprietary; Myrinet 2000 delivers 200 MB/sec
–10-15 microsecond latency
–~$1500 / node (going down)
–Scales to 1000’s of nodes
SCI:
–Proprietary; good for small clusters
–100 MB/sec
–~5 microsecond latency
Quadrics:
–Proprietary; very expensive
–200 MB/sec delivered
–5 microsecond latency
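A simple way to compare these interconnects is the classic first-order cost model, time = latency + size / bandwidth. The sketch below plugs in the rough figures quoted above (used purely as illustrative assumptions, not measurements):

```python
# First-order message cost model: time = latency + size / bandwidth.
# The figures below are the rough numbers quoted on this slide, used
# here only as assumptions for illustration.
def transfer_time(size_bytes, latency_s, bandwidth_bps):
    """Estimated time to move one message, in seconds."""
    return latency_s + size_bytes / bandwidth_bps

interconnects = {          # name: (latency in s, bandwidth in bytes/s)
    "Myrinet 2000": (12e-6, 200e6),
    "SCI":          (5e-6,  100e6),
    "Quadrics":     (5e-6,  200e6),
}

for name, (lat, bw) in interconnects.items():
    small = transfer_time(1024, lat, bw)    # 1 KB: latency dominates
    large = transfer_time(100e6, lat, bw)   # 100 MB: bandwidth dominates
    print(f"{name}: 1 KB in {small*1e6:.1f} us, 100 MB in {large:.2f} s")
```

For small messages latency dominates, which is why the microsecond figures above matter as much as the MB/sec numbers.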

InfiniBand – Future of Interconnects?
–First specifications allow up to 30 Gbits/second
–15 times faster than the fastest high speed interconnects
–Just now starting to become available commercially
–An industry standard – will be available from numerous companies
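The "15 times faster" figure can be roughly sanity-checked with unit arithmetic, assuming the 200 MB/sec Myrinet number from the previous slide (and being careful about bits versus bytes):

```python
# Compare InfiniBand's quoted 30 Gbit/s spec against Myrinet 2000's
# delivered 200 MB/s (figures taken from these slides as assumptions;
# note the bits-vs-bytes conversion).
infiniband_bps = 30e9          # 30 Gbit/s
myrinet_bps = 200e6 * 8        # 200 MB/s  ->  1.6 Gbit/s
print(infiniband_bps / myrinet_bps)   # on the order of the "15x" quoted
```

The exact ratio also depends on comparing a specification peak against a delivered rate, so the slide's round "15 times" is only a ballpark.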

Cluster Software: Operating System Choices
Linux:
–Red Hat – most popular
–Mandrake – similar to Red Hat, technically superior
FreeBSD, OpenBSD, and the other BSDs:
–Technically superior to the Linuxes
–Much less popular than the Linuxes
Windoze

Pre-Packaged Cluster Software Choices
–NCSA cluster-in-a-box
–NCSA grid-in-a-box
–OSCAR
–SCore
–Scyld/Beowulf
–MSC
–NPACI Rocks

OSCAR Pre-Packaged Cluster Software
–Packaged open source cluster software
–Designed to support many Unix operating systems – currently Red Hat Linux, with Mandrake support soon to be released
–Supported and developed by NCSA, IBM, Dell, Intel, and Oak Ridge National Laboratory
–Most popular open source cluster software package

SCore Pre-Packaged Cluster Software
–Very popular in Japan
–Very sophisticated

Scyld/Beowulf Pre-Packaged Cluster Software
–A different model – treats a cluster of separate machines like one big machine, with a single process space
–Oriented towards commercial turn-key clusters
–Very slick installation
–Not as flexible – the separate machines are not individually accessible

NPACI Rocks Pre-Packaged Cluster Software
–Based on Red Hat Linux
–Similar to, and a competitor of, OSCAR
–Developed by the San Diego Supercomputer Center and others

OSCAR Overview
–Open Source Cluster Application Resources
–“Cluster on a CD” – automates the cluster install process
–Backed by IBM, Intel, NCSA, ORNL, MSC Software, and Dell
–NCSA “Cluster in a BOX” base
–Wizard driven; nodes are built over the network
–Initial target: clusters of up to 64 nodes, though OSCAR will probably run on two 1,000 node clusters
–Works on commodity PC components
–Red Hat based (for now)
–Components: open source and BSD-style licensed