So, Jung-ki Distributed Computing System LAB School of Computer Science and Engineering Seoul National University Implementation of Package Management.

Presentation transcript:

Implementation of Package Management in a Cluster Environment
So, Jung-ki
Distributed Computing System Lab, School of Computer Science and Engineering, Seoul National University

So, Jung-ki (SNU DCS Lab)
Outline: Introduction, Related Work, Design, Evaluation, Conclusion

Introduction (1/2)
- Supercomputer
  - High-performance processors / high network bandwidth
  - An expensive system, whereas a Beowulf system is cost-effective
- Motivation
  - Focus on cluster systems
- Cluster management system
  - Manual method / add-on method / integrated method
- Registry
  - Central repository of information about all aspects of the computer

Introduction (2/2)
- Challenge
  - The integrated method has low availability and reliability
  - Computation nodes cannot be managed separately
  - When a failure occurs, the system cannot be rejuvenated
- Goal (using the Registry)
  - Improve the availability and reliability of the integrated method
  - Let the administrator manage a cluster system easily
  - Restore the cluster system from a backup snapshot

Supercomputer
Domestic supercomputers: 14
- Cluster: 4
- MPP: 4
- Constellation: 6
※ SNU: 2 (51/413)
60.8%

Cluster Management System
- Manual approach
  - The system administrator brings up the entire system manually
- Add-on method
  - Bring up a frontend node, then add cluster packages
  - OSCAR / Warewulf / OpenMosix
- Integrated method
  - Cluster packages are installed and configured during the initial installation
  - Rocks / Scyld

Cluster Management System
- Software stack
[Figure: software stack. Layers: Linux kernel; Linux environment; HPC device drivers; job scheduling and launching; cluster software management; cluster state management / monitoring; message passing / communication layer; parallel code / Grid / computer lab … Group labels: OS (Linux), SGE, Application, HPC]

Rocks Overview
- Identity
  - System to build and manage a Linux cluster
  - Free: open source project
- Goal
  - Make clusters easy
- Philosophy
  - Computation nodes are 100% automatically installed
  - Roll: a set of packages
  - Graph / Kickstart
  - Runs on heterogeneous system architectures
  - Does not attempt to incrementally update software

Rocks System
- Architecture
[Figure: Rocks architecture. Labels: front-end node, node, local network, eth1, eth0, Internet]

What is the Registry?
- Central repository of information about all aspects of the computer
  - Hardware, OS, applications, user information
- Function
  - Retrieve system information
  - Update / add / delete software
  - Back up and restore the system
- Advantage
  - Easier for applications to access the system
  - Stores large amounts of structured data (system information)
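The "back up and restore" function can be illustrated with a small sketch. It is not the presenter's implementation: it assumes the Registry lives in a SQL database and uses Python's built-in sqlite3 module in place of the MySQL database mentioned later in the talk. The function names `take_snapshot` and `restore_snapshot` and the one-table layout are hypothetical.

```python
import sqlite3

def take_snapshot(conn):
    """Serialize the whole registry database to SQL text (the snapshot)."""
    return "\n".join(conn.iterdump())

def restore_snapshot(snapshot_sql):
    """Rebuild a registry database from a snapshot into a fresh connection."""
    restored = sqlite3.connect(":memory:")
    restored.executescript(snapshot_sql)
    return restored

# Hypothetical one-table registry, used only for this illustration.
live = sqlite3.connect(":memory:")
live.execute("CREATE TABLE packages (node TEXT, name TEXT)")
live.execute("INSERT INTO packages VALUES ('compute-0-0', 'amanda')")
live.commit()

snap = take_snapshot(live)

# Simulate a failure that wipes the registry contents.
live.execute("DELETE FROM packages")
live.commit()

restored = restore_snapshot(snap)
rows = restored.execute("SELECT node, name FROM packages").fetchall()
print(rows)
```

A text snapshot keeps the example short; a real deployment would more likely rely on the database's own dump and restore tooling.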

Registry Design
Tables (columns):
- Nodes: ID (primary key), Name, Membership, CPUs, Rack, Rank, Comment
- Network: ID (primary key), Node, MAC, IP, Gateway, Name, Device, Module
- Package: ID (primary key), Node, Name, Version, Release, Install
- Aliases: ID (primary key), Node, Name
- Memberships: ID (primary key), Name, Appliance, Distribution
- Appliances: ID (primary key), Name, Graph, Node
- Distribution: ID (primary key), Name, Release, Lang
Legend: original relational schema / appended relation; H/W information / S/W information
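To make the schema concrete, here is a minimal sketch of three of the tables and a query that lists the packages recorded for one node. Table and column names follow the slide; the column types, the foreign keys, the reading of `install` as a flag, the helper names, and the sample row (including the version string) are assumptions. It uses sqlite3 so that it runs without a database server.

```python
import sqlite3

# Hypothetical DDL: names come from the slide, types and keys are guesses.
SCHEMA = """
CREATE TABLE nodes (
    id         INTEGER PRIMARY KEY,
    name       TEXT NOT NULL UNIQUE,
    membership INTEGER,
    cpus       INTEGER,
    rack       INTEGER,
    "rank"     INTEGER,
    comment    TEXT
);
CREATE TABLE networks (
    id      INTEGER PRIMARY KEY,
    node    INTEGER NOT NULL REFERENCES nodes(id),
    mac     TEXT,
    ip      TEXT,
    gateway TEXT,
    name    TEXT,
    device  TEXT,
    module  TEXT
);
CREATE TABLE packages (
    id        INTEGER PRIMARY KEY,
    node      INTEGER NOT NULL REFERENCES nodes(id),
    name      TEXT NOT NULL,
    version   TEXT,
    "release" TEXT,
    install   INTEGER  -- assumed to be an "is installed" flag
);
"""

def create_registry(path=":memory:"):
    """Create an empty registry database with the sketched tables."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn

def packages_on(conn, node_name):
    """Return (name, version, release) for every package row of a node."""
    query = """
        SELECT p.name, p.version, p."release"
        FROM packages AS p
        JOIN nodes AS n ON n.id = p.node
        WHERE n.name = ?
        ORDER BY p.name
    """
    return conn.execute(query, (node_name,)).fetchall()

conn = create_registry()
# Sample data is made up for the illustration.
conn.execute("INSERT INTO nodes (id, name, cpus) VALUES (1, 'compute-0-0', 1)")
conn.execute(
    "INSERT INTO packages (node, name, version, \"release\", install) "
    "VALUES (1, 'amanda', '1.0', '1', 1)"
)
conn.commit()
print(packages_on(conn, "compute-0-0"))
```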

Strategy of Management
- Rocks setup
  - Minimum modification
    - Take advantage of the original Rocks system
    - Deploy the cluster system easily
  - Modify related source code
    - insert-ethers, kickstart.cgi, Kpp, Kgen, Rgen
- Running system
  - Apply package modification
    - Package management program: add / update / delete packages
    - DB consistency management program

Collection Method
[Figure: collection method. Labels: Rgen, Registry variables, package variables, appended component]

Modification Method
- Insert command
- Packages table: package name / version / release
- Instruction: add / update / delete
    add -c=compute-0-0 -i=amanda i386
    add -c=all -i=all
    del -c=compute-0-0 -i=amanda i386
    del -c=all -i=all
[Diagram labels: Packages table; add / delete / update; compute nodes]
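The command forms on this slide can be parsed with a few lines of code. The sketch below is an illustration only, not the tool from the talk: it assumes `-c=` selects the compute node, `-i=` selects the package, and the optional trailing token (`i386`) is an architecture. The class and function names are hypothetical, and no `update` form is handled because the slide shows none.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class PackageCommand:
    action: str          # "add" or "del"
    node: str            # node name, or "all"
    package: str         # package name, or "all"
    arch: Optional[str]  # trailing token, read here as an architecture (assumption)

def parse_command(line: str) -> PackageCommand:
    """Parse one instruction line in the form shown on the slide."""
    # Slide exports often turn "-" into an en dash; normalize it first.
    tokens = line.replace("\u2013", "-").split()
    if not tokens or tokens[0] not in ("add", "del"):
        raise ValueError(f"unknown instruction: {line!r}")
    node = package = arch = None
    for tok in tokens[1:]:
        if tok.startswith("-c="):
            node = tok[3:]
        elif tok.startswith("-i="):
            package = tok[3:]
        elif arch is None:
            arch = tok
        else:
            raise ValueError(f"unexpected token: {tok!r}")
    if not node or not package:
        raise ValueError(f"both -c= and -i= are required: {line!r}")
    return PackageCommand(tokens[0], node, package, arch)

cmd = parse_command("add -c=compute-0-0 -i=amanda i386")
print(cmd)
```

Treating `all` as an ordinary value keeps the parser simple; expanding it to every node or every package would be the job of whatever executes the command.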

Registry Consistency
- Setup time
  - When the frontend node removes / updates a computation node
  - Dependency: change to the node table → change to the package table
  - Modify kickstart.cgi / kgen
  - Apply cascading table changes
  ※ MySQL does not support the transaction property
- Running system
  - Package install / delete / update
  - Compute node rpm information = frontend node's registry DB
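Both consistency rules can be sketched in a few lines. This is a hedged illustration, not the modified kickstart.cgi / kgen code: `remove_node` applies the cascade by hand (child rows first, then the node row), which is the ordering one falls back on when the database cannot wrap both deletes in a transaction, and `registry_drift` compares the package names in the registry with those reported by a compute node. The function names, the two-table layout, and the sample package names are assumptions; sqlite3 is used only so the example runs as-is.

```python
import sqlite3

def remove_node(conn, node_name):
    """Delete a node and its package rows, child table first.

    Returns the number of package rows removed.
    """
    row = conn.execute(
        "SELECT id FROM nodes WHERE name = ?", (node_name,)
    ).fetchone()
    if row is None:
        return 0
    node_id = row[0]
    removed = conn.execute(
        "DELETE FROM packages WHERE node = ?", (node_id,)
    ).rowcount
    conn.execute("DELETE FROM nodes WHERE id = ?", (node_id,))
    conn.commit()
    return removed

def registry_drift(registry_pkgs, node_pkgs):
    """Report package names that differ between the registry and a node."""
    registry, installed = set(registry_pkgs), set(node_pkgs)
    return {
        "missing_on_node": sorted(registry - installed),
        "missing_in_registry": sorted(installed - registry),
    }

# Made-up sample data for the illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE nodes (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE packages (id INTEGER PRIMARY KEY, node INTEGER, name TEXT);
INSERT INTO nodes VALUES (1, 'compute-0-0'), (2, 'compute-0-1');
INSERT INTO packages (node, name)
    VALUES (1, 'amanda'), (1, 'amanda-client'), (2, 'amanda');
""")

removed = remove_node(conn, "compute-0-0")
remaining = conn.execute("SELECT COUNT(*) FROM packages").fetchone()[0]
drift = registry_drift(["amanda", "mpich"], ["amanda", "vim"])
print(removed, remaining, drift)
```

An empty result in both lists of `registry_drift` corresponds to the slide's condition that the compute node's rpm information equals the frontend's registry DB.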

Experiment Setup
[Diagram labels: public Ethernet, frontend node, compute nodes (14)]
- Frontend node (Rocks.snu.ac.kr): CPU 800 MHz, RAM 768 MB, HDD 40 GB
- Compute nodes, Compute-0-(1~14): CPU 850 MHz, RAM 1 GB, HDD 10 GB

Experiment data
Name        Volume  Capacity
amanda      3       468 KB
HPC         53      117 MB
Rocks roll  479     1.5 GB

Original Rocks Evaluation
- Average service time: 18 min 14 sec
- Average transmit time: 11 min 28 sec
[Figure labels: network card, DHCP request]

Amanda Packages Evaluation
- Average install time: 6.62 sec
- Average delete time: 5.57 sec

HPC Roll Evaluation
- Average install time: 3 min 38 sec
- Average delete time: 1 min 18 sec

Conclusion
- The Registry benefits the cluster system
  - Improves availability and reliability
  - The administrator can manage cluster systems easily
  - Cluster systems can be restored from backup snapshots

Q & A
Questions or comments? Thank you!