
Ohio Supercomputer Center
Cluster Computing Overview
Summer Institute for Advanced Computing
August 22, 2000
Doug Johnson, OSC

Overview
- What is Cluster Computing
- Why Cluster Computing
- How Clusters Fit with the OSC Mission
- When Did It All Start
- OSC 128 Processor SGI/Linux Cluster
- Clusters for Production HPC Environments

What is Cluster Computing?
A cluster is a collection of interconnected whole computers used as a single, unified computer.
Cluster computing is many things...
- High performance computing: run programs with parallel algorithms
- High throughput computing: parametric studies (the same program run many times with different parameters)
- High availability computing: fail-over redundancy
Both scientific and commercial applications! (The first two modes are illustrated in the sketch below.)
[Diagram: nodes, each with the common resources CPU(s), memory, hard drive, and network card, joined by a network]
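The difference between the first two modes is easiest to see on the command line. A minimal sketch, assuming a hypothetical executable named ./simulate built as an MPI program; the program name and parameters are illustrative, not from the slides:

    # High performance computing: one job, many cooperating processes
    # (a parallel algorithm started across the cluster with MPI)
    mpirun -np 16 ./simulate input.dat

    # High throughput computing: a parametric study, i.e. the same
    # program run many times with different parameters
    for temp in 100 200 300 400; do
        ./simulate --temperature "$temp" > "run_${temp}.out"
    done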

Brief History of Cluster Computing at OSC
- 1994: The Beowulf project at the Center of Excellence in Space Data and Information Sciences (CESDIS) installs the first cluster: 16 Intel 486 DX4 100 MHz processors, 16 Mbytes of RAM per processor, 10 Mbit Ethernet interconnect (3 per node)
- OSC installs the "Beaker" system, a dual-purpose workstation cluster: 12 DEC Alpha EV4 processors with a full-duplex FDDI interconnect
- OSC installs the "Trout" system, a dual-purpose workstation cluster: 14 SGI O2 workstations with MIPS R-series processors, ATM interconnect
- OSC 10 processor IA32 Linux cluster: Pentium II 400 MHz processors, Myrinet interconnect, 4.5 Gbytes of RAM
- OSC SGI/Linux 128 processor cluster: Pentium III Xeon 550 MHz processors, 66 Gbytes of RAM, Myrinet and 100 Mbit Ethernet interconnects

Why Parallel Computing
- Parallel computing has a strong presence at the national level and is the future of High Performance Computing (HPC)
- Parallel computing platforms are a vital element in our infrastructure
- Parallel systems have traditionally not been an accessible resource compared to single-processor systems:
  - Higher cost (due mostly to the high performance interconnect)
  - Less refined user interface
  - Non-traditional programming techniques with little training available
OSC Mission Statement: "OSC provides a reliable high performance computing and communications infrastructure for a diverse, statewide/regional community including education, academic research, industry, and state government. ..."

Why Cluster Computing
OSC evaluates new and emerging information technologies:
- Cluster computing is one of the hottest fields in high performance computing
Potential benefits of clusters over traditional parallel systems:
- High performance interconnect technology is approaching commodity availability
- The performance of commodity systems is increasing at an aggressive rate, driven by the commercial market for home/office workstations
OSC Mission Statement: "... In collaboration with this community, OSC evaluates, implements, and supports new and emerging information technologies. ..."

Why Cluster Computing (continued)
Potential benefits of clusters over traditional parallel systems (cont.):
- The operating system gives users the same environment on their desk that they have on the parallel system
Other differences:
- System administration implications
  - No single system image: OS and software upgrades must be applied to all nodes (see the sketch after this list)
  - Cluster design lends itself to more frequent hardware upgrades
- Performance implications
- Accounting/funding implications
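As a concrete illustration of the "no single system image" point, a minimal administrative sketch: every node must be updated individually, here with a simple loop. The node naming scheme and package path are hypothetical; rsh was the typical remote shell on clusters of this era, and ssh works the same way.

    # Hypothetical sketch: push a software upgrade to all 32 compute
    # nodes, since there is no single system image to patch once.
    for n in $(seq -w 1 32); do
        echo "upgrading node${n}..."
        rsh "node${n}" rpm -Uvh /usr/local/pkgs/new-package.rpm
    done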

How Clusters Fit With the OSC Mission
- OSC evaluates new and emerging information technologies
  - Multiple software packages have been evaluated to provide the most robust system
  - Four different network interconnects have been installed to evaluate performance
  - Three different processors and operating systems were investigated
- OSC implements new and emerging information technologies
  - A cluster under OSC administration has been available to users since March 1999
  - OSC partnered with the Portland Group to bring the Cluster Development Kit to OSC users
- OSC supports new and emerging information technologies
  - OSC 128 processor cluster in production status
  - Training classes on how to build and use a cluster
  - Staff available to Ohio faculty to help answer questions and troubleshoot problems
OSC Mission Statement: "... In collaboration with this community, OSC evaluates, implements, and supports new and emerging information technologies. ..."

To Summarize
- Develop cluster technology so that it can be rolled out to university research labs
  - Provide a hardware and software configuration that allows labs to construct a working cluster with minimal effort
  - Experienced OSC staff can provide technical assistance
- Evaluate software and hardware configurations to assist researchers in defining a system that best suits their needs
  - Let the researchers focus on science
  - Based on user applications, provide performance analysis showing the optimal hardware and software configuration
- OSC wants to encourage parallel programming
  - Parallel programming is the future of high performance computing
  - Clusters provide increased access to parallel systems

When Did It All Start?
December 1998: OSC management authorizes a dedicated 10 processor cluster for technology evaluation.
- 1 front-end node: 2 Intel Pentium II 400 MHz processors, 512 Mbytes of RAM, 18 Gbyte disk
- 4 compute nodes: 2 Intel Pentium II 400 MHz processors each, 1 Gbyte of RAM, 9 Gbyte disk
- Interconnects: 100 Mbit Ethernet, Dolphin SCI, Myricom Myrinet
- Linux OS, PBS batch system, PGI compiler suite
April 1999: Performance evaluation yields promising results and the machine is opened to users.

OSC/SGI Cluster
- September 1999: Agreement signed between OSC and SGI
- October 1999: System powered on
- November 1999: Machine configured and running applications on the floor of Supercomputing '99
- December 1999: Machine installed at OSC
- February 2000: Machine opened to friendly users

Hardware
All nodes are SGI 1400L servers.
- 1 front-end node configured with:
  - Four 550 MHz Intel Pentium III Xeon processors, each with 512 kB of secondary cache
  - Two Gigabytes of RAM
  - 48 Gigabytes of ultra-wide SCSI disk
  - Two 100Base-T Ethernet interfaces
  - One HIPPI interface
- 32 compute nodes, each configured with:
  - Four 550 MHz Intel Pentium III Xeon processors, each with 512 kB of secondary cache
  - Two Gigabytes of RAM
  - 18 Gigabytes of ultra-wide SCSI disk
  - Two Myrinet interfaces
  - One 100Base-T Ethernet interface

Software and Configuration
- Hardware originally assembled in Mountain View, CA by SGI Professional Services
- OS and software environment installed and configured by OSC staff:
  - Linux operating system
  - Portable Batch System (PBS)
  - Portland Group (PGI) compiler suite
  - Myrinet MPICH-GM interface (example job script below)
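To show how a user drives this stack, here is a minimal sketch of a PBS job script for an MPI program run over Myrinet. The job name, resource limits, and executable are hypothetical, and the exact mpirun wrapper name shipped with MPICH-GM may differ from plain mpirun.

    #!/bin/sh
    # Hypothetical PBS job: 4 nodes x 4 processors = 16 MPI processes
    #PBS -N example_job
    #PBS -l nodes=4:ppn=4
    #PBS -l walltime=01:00:00

    cd $PBS_O_WORKDIR

    # Start the MPI program on the nodes PBS allocated, listed in
    # the machine file PBS provides
    mpirun -np 16 -machinefile $PBS_NODEFILE ./a.out

The script would be submitted with qsub; PBS then allocates nodes and schedules the job alongside other users' work, which is part of what makes the cluster a multi-user production system.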

Clusters for Production HPC Environments
There are two significant efforts in building clusters:
- Building a cluster and making it operational
- Making the cluster a production system
  - Ability to host multiple users simultaneously
  - Ability to schedule system resources
  - Ability to function without constant intervention
The OSC cluster has the following attributes that make it a true production HPC system:
- Connection to a Mass Storage System (MSS)
- Integration into the OSC account database system
- Job accounting
- Good utilization
- High availability

Mass Storage Support
[Diagram: the cluster connects over HIPPI and a private 100 Mbit Ethernet switch to a DMF Origin server running the Data Migration Facility (DMF), backed by terabyte disk storage and IBM terabyte tape storage]

User Accounts and Accounting
- User accounts
  - The cluster is integrated into the Center's database system for automatic account generation and maintenance
- Job accounting
  - Accounting has been configured into the environment to track users' CPU usage
  - CPU usage is converted with a charging algorithm and deducted from a Principal Investigator's account (a sketch of such a conversion follows)
  - Users can view their accounting history with a text command from the Linux command prompt
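The slides do not spell out the actual charging algorithm, so the following is only a hypothetical sketch of converting logged CPU time into charged resource units, assuming a made-up accounting log with one "user cpu-seconds" record per line and a flat rate of 1 resource unit per CPU-hour.

    # Hypothetical charging sketch: sum each user's CPU-seconds from
    # usage.log and convert at a (made-up) 1.0 resource unit/CPU-hour.
    awk '{ cpu[$1] += $2 }
         END { for (u in cpu)
                   printf "%-12s %10.2f RUs\n", u, cpu[u] / 3600 }' usage.log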

Utilization and Availability
- Utilization
  - System utilization is recorded and accessible via a web link
  - For parallel systems, utilization is expected to be around 50 to 70%
  - Current utilization is about 70% parallel and 30% serial
- Availability
  - Good availability has been achieved through significant uptime and minimal system problems
  - Downtime is scheduled every 4 weeks for software upgrades, hardware modifications, and general system maintenance

TCP Stream Performance
[Chart]

TCP Stream Performance
[Chart]

UDP Stream Performance

    ./netperf -l 60 -H fe.ovl.osc.edu -i 10,2 -I 99,10 -t UDP_STREAM -- -m s S

    UDP UNIDIRECTIONAL SEND TEST to fe.ovl.osc.edu : 99% conf.
    Socket  Message  Elapsed      Messages
    Size    Size     Time         Okay  Errors   Throughput
    bytes   bytes    secs            #       #   10^6bits/sec
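For reference, the netperf options above: -l 60 runs the test for 60 seconds, -H names the remote host, -i 10,2 and -I 99,10 repeat the test (between 2 and 10 iterations) until the result is within a 99% confidence interval, and the options after -- set the send message size (-m) and the local and remote socket buffer sizes (-s and -S), whose numeric values were lost in this transcript. The TCP results on the preceding slides would come from the analogous command; a sketch with hypothetical sizes:

    # Hypothetical TCP counterpart; the 32 kB message and 64 kB socket
    # buffers are illustrative values, not taken from the slides.
    ./netperf -l 60 -H fe.ovl.osc.edu -i 10,2 -I 99,10 -t TCP_STREAM -- \
        -m 32768 -s 65536 -S 65536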