Slide 1: Track 1: Cluster and Grid Computing
NBCR Summer Institute, Session 1.1: Introduction to Cluster and Grid Computing
August 8, 2006
Nadya Williams, nadya@sdsc.edu
Slide 2: Cluster Pioneers
In the mid-1990s, the Network of Workstations (NOW) project at UC Berkeley and the Beowulf Project at NASA asked the question: can you build a high-performance machine from commodity components?
NOW pioneered the vision for clusters of commodity processors:
  David Culler
  SunOS/SPARC
  First generation of Myrinet
  GLUnix (Global Layer Unix) execution environment
Beowulf popularized the notion and made it very affordable:
  Thomas Sterling and Donald Becker
  Linux
Slide 3: Types of Clusters
Highly Available (HA) clusters:
  Generally small, fewer than 8 nodes
  Redundant components
  Multiple communication paths
Visualization clusters:
  Each node drives a display
  OpenGL machines
Computing (HPC) clusters:
  AKA Beowulf
Slide 4: Definition: Beowulf
A collection of commodity PCs running an open-source operating system with a commodity network
The network is usually Ethernet, although clusters with non-commodity networks are sometimes called Beowulfs as well
Has come to mean any Linux cluster
www.beowulf.org
Slide 5: HPC Cluster Architecture
[Diagram] Frontend node and compute nodes connected by a public Ethernet, a private Ethernet network, and an optional application network; node power distribution (network-addressable units as an option)
Slide 6: Clusters Now Dominate High-End Computing
http://www.top500.org/lists/2006/06/charts
Slide 7: The Light Side of Clusters
Clusters are phenomenal price/performance computational engines:
  Mainstream tools for a variety of scientific fields
  Expanded from HPC to high availability and visualization
The benefits come from:
  Inexpensive commodity servers
  Open-source software
  A large and expanding community of developers
Slide 8: The Dark Side of Clusters
While clusters are phenomenal price/performance computational engines:
  They can be hard to manage without experience
  High-performance I/O is still unsolved
  The effort to find out where something has failed grows at least linearly with cluster size
  They are not cost-effective if every cluster "burns" a person just for care and feeding
  The programming environment could be vastly improved
  Technology is changing very rapidly; scaling up is becoming commonplace (128-256 nodes)
Slide 9: Most Critical Problems with Clusters
The largest problem in clusters is software skew:
  The software configuration on some nodes differs from that on others
  Small differences (minor version numbers on libraries) can cripple a parallel program (see the sketch after this slide)
The second most important problem is lack of adequate job control of the parallel process:
  Signal propagation
  Cleanup
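One practical way to catch software skew is to compare installed package versions across nodes. A minimal sketch, assuming a Rocks-style frontend where the cluster-fork command (introduced later in this deck) runs a command on every compute node; glibc is only an example package:

  # Query the installed glibc version on every compute node; any node
  # whose answer differs from the rest is a candidate for re-installation.
  cluster-fork 'rpm -q glibc'

  # Compare against the frontend's own version.
  rpm -q glibc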
Slide 10: Top 3 Problems with Software Packages
Software installation works only in interactive mode:
  Requires significant work by the end user
Often rational default settings are not available:
  Extremely time-consuming to provide values
  Should be provided by package developers, but often are not
The package must be installed on a running system:
  Means a multi-step operation: install + update
  The intermediate state can be insecure
Slide 11: Grids
Slide 12: What Is the Grid?
1998: "A computational grid is a hardware and software infrastructure that provides dependable, consistent, pervasive, and inexpensive access to high-end computational capabilities" - Carl Kesselman and Ian Foster
Grid computing is an emerging computing model that provides the ability to perform higher-throughput computing by taking advantage of many networked computers to model a virtual computer architecture that is able to distribute process execution across a parallel infrastructure - from Wikipedia
Slide 13: The Grid's Key Elements
Coordination of resources that are subject to decentralized control:
  Resources from different domains (VO, company, department)
  Users from different domains
Use of standard, open, general-purpose protocols and interfaces:
  Authentication/authorization
  Resource discovery/access
Delivers non-trivial quality of service:
  Utility of the combined system >> sum of its parts
From "What Is the Grid? A Three Point Checklist" - Ian Foster, 2002
Slide 14: HPC Clusters vs. Grids
HPC cluster:
  1. Closely coupled; operates as a single computing resource
  2. Often has high-performance networking interconnects
  3. Uses a specialized OS designed to appear as a single computing resource
  4. SIMD/MIMD models of work execution
Grid:
  1. Heterogeneous; operates as a generalized computing resource
  2. Can use high-performance or standard interconnects
  3. Individual systems are not specialized; based on standard machines and OSes
  4. HPC execution plus different operations or instructions
  5. A grid combines monitoring of the nodes with a queuing system for work units
Slide 15: Light and Dark Sides of the Grid
Light:
  Gives substantial computing power for simulations, complex computations, and analysis
  Existing resources may be better exploited
  Through a VO, can provide redundancy and share resources
  May offer scalability via incremental increase of resources
Dark:
  Indeterminate quality of service
  Commercial grid services are expensive
  Rapidly changing infrastructure
  Grid technologies are varied, and some are immature
  Some require additional specialized high-capacity communication links
Slide 16: Rocks Clusters
Slide 17: Rocks - Open-Source Clustering Distribution
Technology transfer of commodity clustering to application scientists: "make clusters easy"
  Scientists can build their own supercomputers and migrate up to national centers as needed
Rocks is a cluster on a CD:
  Red Hat Enterprise Linux (open source and free)
  Clustering software (PBS, SGE, Ganglia, NMI)
  Highly programmatic software configuration management
Core software technology for several campus projects: BIRN, Center for Theoretical Biological Physics, EOL, GEON, NBCR, OptIPuter
First software release: November 2000
Supports x86, Opteron/EM64T, and Itanium; RedHat/CentOS 4.x
www.rocksclusters.org/wordpress
Slide 18: Philosophy
It is not fun to "care and feed" for the system:
  $ sysadmin > $ cluster
  A 1 TFLOP cluster is less than $200,000
  Close to the actual cost of a full-time administrator
The system administrator is the weakest link in the cluster:
  Bad ones like to tinker
  Good ones still make mistakes
Slide 19: Philosophy (continued)
Optimize for installation:
  Get the system up quickly and in a consistent state
  Build supercomputers in hours, not months
Manage through re-installation:
  Can re-install 128 nodes in under 20 minutes
  No support for on-the-fly system patching
  Do not spend time trying to restore system consistency - reinstall
All nodes are 100% automatically configured:
  Zero "hand" configuration
  This includes site-specific configuration
Run on heterogeneous, standard, high-volume components:
  Use components that offer the best price/performance
  Software installation and configuration must support different hardware
Slide 20: Key Issues in Rocks Design
Reduce hands-on time for the administrator and end user
Capture the complexity of the software infrastructure
Automate tasks as much as possible
Slide 21: How Rocks Handles Complexity
Minimize the required configuration parameters:
  1. Desired rolls (extensions)
  2. Machine identity
  3. Partitioning
  4. Network configuration
  5. Time zone
  6. Root password
Use a configuration graph to define the cluster:
  The graph defines similarities and differences
  Different nodes share configuration
  The graph can be extended as needed and new node types defined
Slide 22: How Rocks Handles Complexity (continued)
Extend the cluster with rolls:
  Rolls are containers for software packages and their configuration scripts
  Rolls dissect a monolithic distribution
100% automated node installation and configuration from a distribution assembled at system installation time
A node is always in a known state:
  Down
  Installing
  Running
Slide 23: How Successful Is Rocks?
Over 500 registered clusters:
  30,000 CPUs and > 135 TFLOPS aggregate
  3 chip architectures: x86, x86_64, Itanium
  Scales to 1024-CPU machines (Tungsten2 at NCSA)
Rocks clusters on the Top 500 list
"Rockstar" built live at SC'03 in under 2 hours:
  Running applications while some nodes were still installing
  Top 500 ranking: November 2003 - #201; June 2004 - #433
Slide 24: Large-Scale Rocks Clusters
Tungsten2, the largest registered Rocks cluster:
  520-node (currently 1024-node) cluster
  Dell hardware, Topspin InfiniBand
  Deployed November 2004
  Top 500 ranking: 47th in June 2005 (520 nodes), 78th in June 2006 (1024 nodes)
  Fabric: 6 72-port TS270 core switches, 29 24-port TS120 edge switches, 174 uplink cables, 512 1 m cables, 18 compute nodes per edge switch (source: Topspin, via Google)
"We went from PO to crunching code in 2 weeks. It only took another 1 week to shake out some math library conflicts, and we have been in production ever since." -- Greg Keller, NCSA (Dell On-site Support Engineer)
Slide 25: User Base
More than 1300 users on the discussion list
5 continents
University, commercial, hobbyist
Slide 26: HPCwire Readers' Choice Awards for 2004/2005
Rocks won in several categories:
  Most Important Software Innovation (Readers' Choice)
  Most Important Software Innovation (Editors' Choice)
  Most Innovative - Software (Readers' Choice)
Slide 27: Where to Find More Info
Rocks website: http://www.rocksclusters.org/wordpress
  Press room
  Downloads
  User guides
  Latest release: 4.2 Beta, June 2006
Discussion list: http://www.rocksclusters.org/wordpress/?page_id=6
Registration: http://www.rocksclusters.org/rocks-register
Top 500: http://www.top500.org/lists/2006/06/
Slide 28: Building a Rocks Cluster
Slide 29: Minimum Requirements
Frontend:
  2 Ethernet ports
  CD-ROM
  18 GB disk drive
  512 MB RAM
Compute nodes:
  1 Ethernet port
  18 GB disk drive
  512 MB RAM
Complete OS installation on all nodes:
  No support for diskless nodes (yet)
  Not a single-system image
All hardware must be supported by RHEL:
  i386 (Pentium/Athlon)
  x86_64 (Opteron/EM64T)
  ia64 (Itanium) server
Slide 30: Additional Components
Machine racks
Power:
  Power cords
Network:
  48-port gigabit Ethernet switches
  CAT5e cables
VGA monitor, PC101 keyboard, mouse
Adequate cooling!
Slide 31: Optional Components
High-performance network:
  Myrinet
  InfiniBand (Infinicon or Voltaire)
Network-addressable power distribution unit
Keyboard/video/mouse network:
  Non-commodity
  How do you manage your management network?
Slide 32: Build a Basic Rocks Cluster
Install the frontend:
  1. Insert the Rocks Boot CD
  2. Answer 6 screens of configuration data:
     1. Desired rolls (extensions)
     2. Machine identity
     3. Partitioning
     4. Network configuration
     5. Time zone
     6. Root password
  3. Drink coffee (about 30 minutes to install)
Install the compute nodes (see the sketch after this slide):
  1. Log in to the frontend
  2. Execute insert-ethers
  3. Boot each compute node with the Boot CD (or PXE)
  4. Optional: monitor the installation with eKV
Add user accounts
Start computing
Optional rolls:
  Condor
  Grid (NMI R4 based)
  Intel (compilers)
  Java
  SCE (developed in Thailand)
  Sun Grid Engine
  PBS (developed in Norway)
  Area51 (security monitoring tools)
  Many others
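A rough sketch of the compute-node and user-account steps above, assuming the frontend install has finished; the user name is illustrative, and the exact account-sync mechanism is an assumption that varies by Rocks release:

  # On the frontend, start listening for new nodes and select the
  # "Compute" appliance type when prompted.
  insert-ethers

  # Power on each compute node with the Rocks Boot CD or PXE;
  # insert-ethers assigns names such as compute-0-0, compute-0-1, ...
  # as the nodes request DHCP leases and begin installing.

  # Once the nodes are up, add a user account on the frontend
  # (propagation of the account to the nodes is handled by Rocks).
  useradd nadya
  passwd nadya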
Slide 33: Example of Node Complexity

  Type                    Packages  Modified configuration files
  Basic frontend          692       92
  Basic compute           385       64
  Tiled-display frontend  730       90
  Tiled-display compute   581       58
  NFS server node         377       60
Slide 34: Registration Page (optional)
Slide 35: Using Rocks Clusters
Slide 36: What Can You Do?
Launch interactive jobs using mpirun (for MPICH and LAM/MPI)
Run a command on all nodes of the cluster using cluster-fork:
  cluster-fork -U$USER
  Prerequisite - set up ssh authentication:
    ssh-agent $SHELL
    ssh-add
Submit jobs via SGE
Submit jobs via PBS
Submit jobs via Condor
Submit jobs via Globus
Submit jobs via portals
(A worked example follows this slide.)
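A hedged example tying these pieces together on a Rocks frontend; the application name, node count, machine file, and SGE script are placeholders, not part of the slide:

  # Set up ssh authentication once per login session.
  ssh-agent $SHELL
  ssh-add

  # Run a command on every compute node.
  cluster-fork 'uptime'

  # Launch an interactive MPI job (MPICH-style mpirun).
  mpirun -np 4 -machinefile machines ./my_mpi_app

  # Or hand the same work to the Sun Grid Engine queue.
  qsub my_job.sh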
Slide 37: Grid Roll
Slide 38: Globus Commands - Setup
Generate a certificate request file:
  grid-cert-request
  Type your name and passphrase when requested
  $HOME/.globus/ is created with 3 files:
    usercert.pem (empty)
    usercert_request.pem
    userkey.pem
Start an SSL certificate proxy (similar to ssh-agent):
  grid-proxy-init
  Type the passphrase once for the lifetime of the proxy
Get proxy information:
  grid-proxy-info
Authenticate to a remote gatekeeper:
  globusrun -a -r host.name.here
(A sample session follows this slide.)
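Putting the setup commands together, a typical first-time session might look like the following; the gatekeeper host is the one used elsewhere in these slides, and prompts and output are omitted:

  # One time only: generate a certificate request, send
  # usercert_request.pem to your certificate authority, and install the
  # signed certificate as $HOME/.globus/usercert.pem.
  grid-cert-request

  # Once per session: create a short-lived proxy from your certificate.
  grid-proxy-init

  # Check the proxy subject and remaining lifetime.
  grid-proxy-info

  # Verify that you can authenticate to a remote gatekeeper.
  globusrun -a -r rocks-32.sdsc.edu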
Slide 39: Globus Commands - globusrun
globusrun is the basic command to submit a job whose attributes are specified in RSL.
There are several ways to specify the RSL (a file-based example follows this slide):
  1. As a file:
     globusrun -o -r <resource> -f <rsl-file>
  2. Resource option plus a command-line argument:
     globusrun -r rocks-32.sdsc.edu -o '&(executable=/bin/env)'
  3. Command-line arguments only:
     globusrun -o "+( &(resourceManagerContact=rocks-32.sdsc.edu) (executable=/bin/env))"
-o is used to capture stdout and stderr to the terminal
-r is used to specify the resource manager contact
Sample RSL:
  & (jobtype = single)
    (executable = "/bin/uname")
    (arguments = "-a")
    (count = 2)
    (stdout = "uname.out")
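For the file-based form, a minimal sketch: save the sample RSL above in a file (uname.rsl is a hypothetical name) and point globusrun at it.

  # -o requests the job's output back at the terminal, -r names the
  # resource manager contact, and -f names the RSL file.
  globusrun -o -r rocks-32.sdsc.edu -f uname.rsl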
Slide 40: Globus Commands - globus-job-run
globus-job-run is a higher-level job submission command: it takes the command-line arguments, generates the RSL, and invokes globusrun.
  1. Submit a simple job:
     globus-job-run rocks-32.sdsc.edu /bin/hostname
  2. Submit a job with staging of the executable (create a shell script testrun.sh first):
     globus-job-run rocks-32.sdsc.edu -s testrun.sh
  3. Specify all options in a file:
     globus-job-run -file myfile
  4. Copy the file first, then submit the job without staging:
     globus-job-run rocks-32.sdsc.edu /bin/mkdir dir1
     globus-url-copy file://$HOME/testrun \
       gsiftp://rocks-32.sdsc.edu/home/nadya/dir1/testrun.sh
     globus-job-run rocks-32.sdsc.edu /bin/chmod +x $HOME/dir1/testrun.sh
     globus-job-run rocks-32.sdsc.edu $HOME/dir1/testrun.sh
Example testrun.sh:
  #!/bin/bash
  date
  hostname -f
  uptime
(Batch-style variants are sketched after this slide.)
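For longer-running work, Globus Toolkit installations of this era usually also provide batch-style wrappers around the same machinery; treat the exact commands below as an assumption about a typical GT2-style setup rather than something shown on the slide:

  # Submit without waiting; the command prints a job contact URL.
  globus-job-submit rocks-32.sdsc.edu /bin/hostname

  # Poll the job and fetch its output later, using the printed contact
  # URL (shown here as a placeholder).
  globus-job-status <job-contact-url>
  globus-job-get-output <job-contact-url>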