Presented by Open Source Cluster Application Resources (OSCAR) Stephen L. Scott Thomas Naughton Geoffroy Vallée Computer Science Research Group Computer Science and Mathematics Division

2 Scott_OSCAR_SC07 OSCAR  Open Source Cluster Application Resources  Snapshot of best-known methods for building, programming, and using clusters  International consortium of academic, research, and industry members

3 Scott_OSCAR_SC07 OSCAR background  Concept first discussed in January 2000  First organizational meeting in April 2000  Cluster assembly is time consuming and repetitive, so a toolkit to automate it is attractive  Leverage wealth of open source components  First public release in April 2001  Over six years of project development and six specialized versions  Current stable: oscar-5.0; development: oscar-5.1/6.0

4 Scott_OSCAR_SC07 What does OSCAR do?  Wizard-based cluster software installation  Operating system  Cluster environment  Automatically configures cluster components  Increases consistency among cluster builds  Reduces time to build/install a cluster  Reduces need for expertise

5 Scott_OSCAR_SC07 OSCAR design goals  Modular metapackage system/API ("OSCAR Packages"); keep it simple for package authors  Open source to foster reuse and community participation; fosters "spin-offs" that reuse the OSCAR framework  Leverage native package systems, existing distributions, and "best practices" whenever possible (management, system, and applications)  Keep the interface simple; provide basic operations for cluster software and node administration  Enable others to reuse and extend the system as a deployment tool; extensibility for new software and projects  Reduce overhead for cluster management

6 Scott_OSCAR_SC07 OSCAR overview Framework for cluster management Simplifies installation, configuration, and operation Reduces time/learning curve for cluster build – Requires a preinstalled head node with a supported Linux distribution – Thereafter, wizard guides user through setup/install of entire cluster Package-based framework Content: Software + configuration, tests, docs Types: – Core: SIS, C3, Switcher, ODA, OPD, APItest, Support Libs – Non-core: Selected and third party (PVM, LAM/MPI, Torque/Maui, etc.) Access: Repositories accessible via OPD/OPDer

7 Scott_OSCAR_SC07 OSCAR packages  Simple way to wrap software & configuration  “Do you offer package Foo version X?”  Basic design goals  Keep simple for package authors  Modular packaging (each self-contained)  Timely release/updates  Leverage RPM + meta file + scripts, tests, docs, etc.  Recently extended to better support RPM, Debs, etc.  Repositories for downloading via OPD/OPDer  Leverage native package format via opkgc  OSCAR Packages compiled into native binary format

8 Scott_OSCAR_SC07 OSCAR Packages (latest enhancements)  Maintain versatility and improve manageability  High-level opkg description  Use ‘opkgc’ to convert to lower-level native binary pkg(s)  Manage binary opkgs via standard tools (rpm/yum, dpkg/apt)  Package repositories  Local repos for restricted access (all via tarball)  Online repos for simplified access (opkgs via yum/apt)  Basis for future work  Easier upgrades  Specialized OSCAR releases (reuse oscar-core with custom opkgs)
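To make the high-level-to-native split concrete, a hypothetical OSCAR Package metadata file might look like the sketch below; the element and attribute names here are illustrative assumptions, not the actual opkg schema.

```xml
<!-- Hypothetical opkg metadata sketch; element names are assumptions,
     not the real OSCAR schema. A compiler such as opkgc would translate
     a high-level description like this into native .rpm/.deb packages. -->
<oscar-package>
  <name>foo</name>
  <version>1.2</version>
  <class>third-party</class>
  <summary>Example HPC library packaged for OSCAR</summary>
  <!-- per-distribution build targets, one native package per target -->
  <target distro="fedora" format="rpm"/>
  <target distro="debian" format="deb"/>
</oscar-package>
```

Once compiled, the resulting binary opkgs can be hosted in a yum or apt repository and managed with the distribution's standard tools, which is the manageability gain the slide describes.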

9 Scott_OSCAR_SC07 OSCAR: cluster installation wizard [Screenshot: the wizard walks through Steps 1–8 in order, from Start to Done, alongside a cluster deployment monitor.]

10 Scott_OSCAR_SC07 OSCAR components Administration/configuration:  System Installation Suite (SIS), Cluster Command & Control (C3), OPIUM, KernelPicker, and cluster services (dhcp, nfs, ntp, etc.)  Security: Pfilter, OpenSSH HPC services/tools:  Parallel libs: MPICH, LAM/MPI, PVM, Open MPI  OpenPBS/MAUI, Torque, SGE  HDF5  Ganglia, Clumon  Other third-party OSCAR Packages Core infrastructure/management:  SIS, C3, Env-Switcher  OSCAR DAtabase (ODA), OSCAR Package Downloader (OPD)  OSCAR Package Compiler (OPKGC)

11 Scott_OSCAR_SC07 OSCAR: C3 power tools  Command-line interface for cluster-system administration and parallel-user tools  Parallel execution: cexec  Execute across a single cluster or multiple clusters at the same time  Scatter/gather operations: cpush/cget  Distribute or fetch files for all node(s)/cluster(s)  Used throughout OSCAR  Mechanism for cluster-wide operations
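As an illustration, typical C3 invocations look like the sketch below; the file paths are made-up examples, and exact option syntax may vary by C3 version, so treat this as a shape rather than verbatim commands.

```shell
# Illustrative C3 usage (hypothetical paths; consult the C3 man pages
# for the exact syntax of your installed version)

# Parallel execution: run a command on every node of the default cluster
cexec uptime

# Scatter: push a config file from the head node out to all nodes
cpush /etc/ntp.conf /etc/ntp.conf

# Gather: fetch a log file from every node back to the head node
cget /var/log/messages /tmp/node-logs/
```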

12 Scott_OSCAR_SC07 OSCAR architecture [Diagram: the user drives the OSCAR user interface, either graphical (GUI) or text only (CLI); system environment and hardware topology specifications produce a system image, which the deployment tool installs onto diskless, disk-full, or virtual systems.]

13 Scott_OSCAR_SC07 Diskless OSCAR  Extension of OSCAR to support both diskless and disk-full nodes  Ensures separation of node-specific and shared data  Current (2007) diskless OSCAR approach  Based on NFS-root for node boot without a local disk  Changes primarily isolated to the System Installation Suite  Modifies the initialization (init) of the compute nodes  Future work will consider parallel filesystems (e.g., PVFS, Lustre)

14 Scott_OSCAR_SC07 OSCAR Normal init (disk-full): 1. Mount /proc 2. Initialize the system 3. Run scripts at run level Modified init (diskless): 1. Mount /proc 2. Start rpc.lockd, portmap 3. Mount NFS shares 4. Initialize the system 5. Run scripts at run level 6. rc.local mounts hard disks and sends a message back to the head node [Diagram: local-disk nodes each hold their own root (/); diskless nodes mount root from the NFS server.]
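For context, an NFS-root setup of the kind the modified init relies on is usually wired up on the head node roughly as sketched below; the image path and export options are illustrative assumptions, not values taken from the slides.

```shell
# Hypothetical head-node NFS export backing diskless roots
# (path and options are assumptions; see exports(5) for details)
echo '/var/lib/systemimager/images/compute *(ro,no_root_squash,sync)' >> /etc/exports
exportfs -ra   # re-export all directories listed in /etc/exports

# A diskless node then boots with kernel parameters along these lines,
# so its root filesystem is mounted over NFS before init runs:
#   root=/dev/nfs nfsroot=<head-node-ip>:/var/lib/systemimager/images/compute ip=dhcp
```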

15 Scott_OSCAR_SC07 OSCAR Devel (v5.1/6.0) In progress OSCAR highlights  Local/remote repository installs  Command-line interface  Enhancements to OPD/OPDer  New OSCAR Package format  Targeted platforms:  Fedora, Red Hat EL, Debian, SuSE  x86, x86_64  Diskless OSCAR  OPKG/node sets  New GUI  Google SoC’07: Benchmark, etc.  Enhanced native package installation  New KernelPicker2 (boot management tool)

16 Scott_OSCAR_SC07 OSCAR: Proven scalability Top eight clusters by CPU count from the registered list at the OSCAR Web site (data taken on 08/14/2007 from the OSCAR Cluster Registration Page): OIC (ORNL): 526 nodes with 1052 CPUs Endeavor: 232 nodes with 928 CPUs McKenzie: 264 nodes with 528 CPUs SUN-CLUSTER: 128 nodes with 512 CPUs Cacau: 205 nodes with 410 CPUs Barossa: 184 nodes with 368 CPUs Smalley: 66 nodes with 264 CPUs PS auguste: 32 nodes with 256 CPUs

17 Scott_OSCAR_SC07 More OSCAR information… Open Cluster Group home page: oscar.OpenClusterGroup.org Development page: svn.oscar.openclustergroup.org/trac/oscar Mailing lists OSCAR symposium OSCAR research supported by the Mathematics, Information, and Computational Sciences Office, Office of Advanced Scientific Computing Research, Office of Science, U.S. Department of Energy, under contract no. DE-AC05-00OR22725 with UT-Battelle, LLC.

18 Scott_OSCAR_SC07 OSCAR “flavors”: HA-OSCAR, NEC's OSCAR-Pro, SSS-OSCAR, SSI-OSCAR, OSCAR-V

19 Scott_OSCAR_SC07 HA-OSCAR  The first known field-grade, open source HA Beowulf cluster release  Self-configuring, multi-head Beowulf system  Combines HA and HPC clustering techniques to enable critical HPC infrastructure  Services: Active/hot standby  Self-healing with 3–5 s automatic failover time  RAS management for HPC clusters: Self-awareness

20 Scott_OSCAR_SC07 NEC’s OSCAR-Pro Presented in the OSCAR’06 keynote by Erich Focht (NEC) Commercial enhancements that leverage the open source tool Joined project/contributions to OSCAR core Integrate additions when applicable Feedback and direction based on user needs

21 Scott_OSCAR_SC07 OSCAR: Scalable systems software Problems:  Computer centers use incompatible, ad hoc sets of system tools  Tools are not designed to scale to multi-teraflop systems  Duplication of work in trying to scale tools  System growth vs. administrator growth Goals:  Define standard interfaces for system components  Create scalable, standardized management tools  Reduce costs and improve efficiency Participants:  DOE labs: ORNL, ANL, LBNL, PNNL, SNL, LANL, Ames  Academics: NCSA, PSC, SDSC  Industry: IBM, Cray, Intel, SGI

22 Scott_OSCAR_SC07 SSS-OSCAR components Bamboo: Queue/job manager BLCR: Berkeley checkpoint/restart Gold: Accounting and allocation management system LAM/MPI (w/ BLCR): Checkpoint/restart-enabled MPI MAUI-SSS: Job scheduler SSSLib: SSS communication library (includes SD, EM, PM, BCM, NSM, NWI) Warehouse: Distributed system monitor MPD2: MPI process manager

23 Scott_OSCAR_SC07 Single System Image – OSCAR (SSI-OSCAR)  Easy use thanks to SSI systems  SMP illusion  High performance  Fault tolerance  Easy management thanks to OSCAR  Automatic cluster install/update

24 Scott_OSCAR_SC07 OSCAR-V  Enhancements to support virtual clusters  Abstracts differences between virtualization solutions  OSCAR-core modifications  Create OSCAR Packages for virtualization solutions  Integrate scripts for automatic installation and configuration  Abstraction layer and tools: libv3m/v2m  Enable easy switching between virtualization solutions  High-level definition and management of VMs: mem/cpu/etc., start/stop/pause
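To illustrate what a "high-level definition" of a VM behind an abstraction layer like libv3m could look like, here is a guessed-at profile sketch; every element name and value below is a hypothetical assumption, not the actual v3m format.

```xml
<!-- Hypothetical VM profile sketch; element names are assumptions,
     not the real v3m schema. The abstraction layer would map such a
     definition onto whichever virtualization backend is in use. -->
<vm name="compute-vm-01">
  <type>xen</type>                      <!-- backend: e.g., Xen or QEMU -->
  <memory>512</memory>                  <!-- MB assigned to the VM -->
  <nic mac="00:16:3e:00:00:01"/>        <!-- MAC fixed at definition time -->
  <image>/var/lib/oscar-v/images/compute.img</image>
</vm>
```

Fixing attributes like the MAC address at definition time is what lets the wizard (next slide) treat virtual compute nodes like physical ones during deployment.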

25 Scott_OSCAR_SC07 OSCAR-V [Wizard screenshot: Host OS installation; OPKG selection for VMs; image creation for VMs; assigning VMs to host OSes; definition of VMs’ MAC addresses; definition of virtual compute nodes.]

26 Scott_OSCAR_SC07 OSCAR-V: Description of steps Initial setup 1. Install a supported distro on the head node (host) 2. Download/set up OSCAR and OSCAR-V  OSCAR: untar oscar-common, oscar-base, etc., and copy distro RPMs  OSCAR-V: untar; run “make install” 3. Start the Install Wizard  Run “./oscarv $network_interface” and follow the steps
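The initial setup above can be sketched as a short shell session; tarball names and file locations are illustrative assumptions, and only the final `./oscarv` invocation is taken from the slide itself.

```shell
# Sketch of the OSCAR-V head-node setup (paths/archive names assumed)

# Step 2a: unpack the OSCAR core pieces and stage the distro RPMs
tar xzf oscar-common-*.tar.gz
tar xzf oscar-base-*.tar.gz
cp /media/distro/RPMS/*.rpm /tftpboot/distro/   # staging location assumed

# Step 2b: unpack and install OSCAR-V
tar xzf oscar-v-*.tar.gz
cd oscar-v && make install

# Step 3: launch the wizard on the cluster-facing network interface
./oscarv eth0
```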

27 Scott_OSCAR_SC07 OSCAR-V: Summary  Capability to create images for host OSes  Minimal image  Leverages OSCAR features for the deployment  Automatic configuration of system-level virtualization solutions  Complete networking tools for virtualization solutions  Capability to create images for VMs  May be based on any OSCAR-supported distribution: Mandriva, SuSE, Debian, Fedora, Red Hat EL, etc.  Leverage the default OSCAR configuration for compute nodes  Resources  V2M/libv3m:  OSCAR-V:  OSCAR:

28 Scott_OSCAR_SC07 Contacts regarding OSCAR Stephen L. Scott Computer Science Research Group Computer Science and Mathematics Division (865) Thomas Naughton Computer Science Research Group Computer Science and Mathematics Division (865) Geoffroy Vallée Computer Science Research Group Computer Science and Mathematics Division (865)