Slide 1(18): Deploying a Network of GNU/Linux Clusters with Rocks
Arto Teräs
Free and Open Source Software Developers' European Meeting
Brussels, February 26th, 2005

Slide 2(18): Contents
● Rocks Cluster Distribution
● Task: The Finnish Material Sciences Grid
● Installing a Rocks Master Server
● Customizing Rocks
● Experiences in deploying the systems
● Rocks pros and cons
● Short comparison with Debian FAI
● Questions

Slide 3(18): NPACI Rocks Cluster Distribution
● Cluster-oriented GNU/Linux distribution
● Main developers at the San Diego Supercomputer Center (U.S.A.)
● Base packages, installer components and kernel from Red Hat Enterprise Linux 3.0
● XML-based configuration tree
● Cluster monitoring tools, batch queue systems and other useful software packaged as "rolls" which can be selected during installation

Slide 4(18): Installing a Single Cluster
● Single cluster installation with Rocks is relatively straightforward:
1. Download a CD set or a smaller boot CD
2. Install and configure the cluster frontend
3. Power up the compute nodes one by one and let them install their local OS over the network from the frontend
   ● The frontend hands out IP addresses and stores the MAC addresses automatically (a command-level sketch follows below)
4. Test the installation, start computing
● Well described in the Rocks manual => won't go into details in this presentation
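As a rough illustration of steps 3 and 4: Rocks provides the insert-ethers tool on the frontend to capture node MAC addresses as they network-boot, and cluster-fork to run a command on every compute node. Exact options vary between Rocks releases, so treat this as a hedged sketch rather than a verbatim recipe:

  # On the frontend: start listening for new compute nodes.
  # Each node that PXE-boots is recorded (MAC address, generated
  # name such as compute-0-0) and then installs itself over the network.
  insert-ethers

  # After installation, a quick sanity check from the frontend;
  # cluster-fork runs the given command on every compute node.
  cluster-fork uptime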

Slide 5(18): Finnish Material Sciences Grid (M-grid)
● Joint project between seven Finnish universities, Helsinki Institute of Physics and CSC
● Jointly funded by the Academy of Finland and the participating universities
● First large initiative to put Grid middleware into production use in Finland
● Based on GNU/Linux clusters, targeted for throughput computing, serial and "pleasantly parallel" applications
● Users mainly physicists and chemists

Slide 6(18): M-grid Hardware and CPU Distribution
● Dual AMD Opteron GHz nodes with 2-8 GB memory, 1-2 TB storage, 2x Gbit Ethernet, remote administration
● Number of CPUs: 410 (computing nodes only), 1.5 Tflops theoretical computing power
● 9 sites, size of sites varies greatly

Slide 7(18): One M-grid Cluster

Slide 8(18): System Administration in M-grid
● Tasks divided between CSC and site administrators
● CSC administrators
– Maintain (remotely) the OS, LRMS, Grid middleware and certain libraries for all sites except Oulu
– Separate mini cluster for testing new software releases
● Site administrators
– Local applications and libraries, system monitoring, user support
– Most site admins are researchers working for the department or lab, not IT staff (but are quite competent)
● Regular meetings of administrators, support network

Slide 9(18): Installation Plan

Slide 10(18): Installing the Rocks Master Server
● A standard GNU/Linux box with a web server, with HTTP and rsync open to the clients
– We installed it as a Rocks frontend without nodes (shutting down a few unnecessary services), but it could also be some other distribution
● Rocks distribution mirrored from the Rocks main site as the basis for development
– Rocks Makefile conventions and build system needed some studying to get used to
● For development it is convenient to make a frontend boot CD which automatically sets the IP address and starts the install from the master server over the network
– Just append the relevant kernel parameters in isolinux.cfg: central=your_master_server_name ip=... gateway=... netmask=... dns=... frontend ksdevice=eth1 ks= (an illustrative example follows below)
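The kickstart URL on the slide is truncated, so the following is only a hedged illustration: the host name master.example.org, the addresses and the kickstart path are invented, and exact parameter names may differ between Rocks releases.

  # isolinux.cfg on the frontend boot CD (illustrative values only)
  default rocks
  label rocks
    kernel vmlinuz
    append initrd=initrd.img central=master.example.org ip=10.1.1.1 gateway=10.1.1.254 netmask=255.255.255.0 dns=10.1.1.2 frontend ksdevice=eth1 ks=http://master.example.org/install/kickstart.cgi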

Slide 11(18): Customizing Rocks
● Basic customizations in one cluster fairly easy
– E.g. changing compute node disk partitioning, adding software packages, adding monitoring components
– Large parts of the configuration stored in a MySQL database
● Customizing the XML configuration tree more difficult
– Flexible, but needs work to get familiar with it
– The Red Hat kickstart file is generated from the XML files (a minimal node file is sketched below)
● Especially the frontend installation is hard to debug: a typo in the XML can make the installation fail and the system reboot
– Most people don't make customizations before installing the frontend, so there is little documentation for it
– Developers on the mailing list have been helpful
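For orientation, a Rocks node XML file contributes packages and post-install commands to the generated kickstart. The sketch below is a minimal, assumed example: the package name and the post-install command are invented, and schema details vary between Rocks versions.

  <?xml version="1.0" standalone="no"?>
  <kickstart>
    <description>Example extension node (hypothetical)</description>
    <!-- Extra RPM installed on every node built from this profile -->
    <package>example-numerical-lib</package>
    <!-- Shell commands appended to the generated kickstart %post section -->
    <post>
  echo "custom node configured" &gt;&gt; /var/log/rocks-custom.log
    </post>
  </kickstart>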

Slide 12(18): Our Customizations
● Main principle: don't touch the Rocks base distribution but do all customizations as an additional "roll"
– Basic localization: Finnish keyboard layout etc. (this has been improved in Rocks 3.3.0; we started with 3.2.0)
– Separating NFS and MPI traffic onto two networks
– Additional packages: numerical libraries, a proprietary Fortran compiler, NorduGrid ARC middleware (soon)
– Firewall settings, a preconfigured administrator account
● Problem: the configuration is parsed at the installation stage, when many things aren't yet in place
– Not nice when combined with poor debugging => ended up writing scripts which are executed at first boot after installation (sketched below)
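The slides don't show the exact first-boot mechanism used, so this is only an assumed sketch of one common approach: a kickstart %post fragment drops a one-shot hook into rc.local which runs the real configuration script once the installed system is up, then disarms itself. All paths and script names here are hypothetical.

  # Hypothetical %post fragment: install a one-shot first-boot script
  cat > /etc/rc.d/rocks-firstboot.sh <<'EOF'
  #!/bin/sh
  # Runs once, after the installer environment is gone and services
  # are in place; then removes itself from rc.local.
  /usr/local/sbin/apply-site-config.sh   # hypothetical site script
  sed -i '/rocks-firstboot/d' /etc/rc.d/rc.local
  EOF
  chmod +x /etc/rc.d/rocks-firstboot.sh
  echo "/etc/rc.d/rocks-firstboot.sh" >> /etc/rc.d/rc.local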

Slide 13(18): Deployment Experiences
● CSC prepared the distribution and a boot CD; local administrators were responsible for installing their cluster
– One installation was done completely off-site using remote administration hardware (our frontends can boot from virtual media and we have remote access to power and console)
● Preparing the distribution took more time than expected
● Actual deployment went quite smoothly
– Most sites spent from a few hours to one day installing the OS and nodes; larger sites took two days
– One site had strange problems taking more time
● Some things needed fixing afterwards as the distribution was not preconfigured as well as we had planned

Slide 14(18): Updates
● The Rocks team doesn't currently provide security advisories or updates between releases
– Critical security updates need to be recompiled from RHEL source RPMs or fetched from other sources
● "The Rocks way" is to simply reinstall compute nodes whenever updates or configuration changes are needed
– We wanted to install some packages without a reboot
● Yum is convenient but does not support a three-level structure => use rsync to mirror packages, then install them with yum on both the frontend and the nodes (a sketch follows below)
– Another nice way would be to schedule updates on the nodes as high-priority reinstall jobs which execute when computing jobs finish (probably our next step)
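A hedged sketch of that rsync+yum flow; the mirror host and directory layout are invented, yum is assumed to be configured to use the mirrored directory as a repository, and cluster-fork is the Rocks utility for running a command on every compute node:

  # 1. Mirror updated RPMs from the master server (hypothetical paths)
  rsync -av rsync://master.example.org/updates/ /var/ftp/pub/updates/

  # 2. Install the updates on the frontend itself
  yum update

  # 3. Push the same updates to all compute nodes without reinstalling
  cluster-fork yum update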

Slide 15(18): Installing Updates

Slide 16(18): Rocks Pros and Cons
Good:
● Easy to get started, designed for clusters
● Nice monitoring tools, many things work out of the box
● Many big vendors have their hardware certified for RHEL => Rocks usually works too
● Competent people on the mailing list
Something to improve:
● Security updates not provided by the Rocks team
● Hard to diagnose and debug installation problems when customizing the distribution

Slide 17(18): Rocks Compared to Debian FAI
(Warning: my hands-on experience with FAI is from 2-3 years ago)
● Both Rocks and FAI (Fully Automatic Installation) for Debian GNU/Linux are based on installers, not imaging tools
● Rocks is a bit easier to get started with: tftp, NIS, Ganglia monitoring etc. are preconfigured
● Debian FAI's installation structure is easier to understand, customize and debug
– FAI can easily be bent to exotic things by replacing parts of the standard installation sequence with custom scripts
● FAI is a natural choice for Debian fans; OSCAR is another popular Free package in the "RPM world"

Slide 18(18): More Information
● Rocks home page:
● M-grid home page:
● These slides: Deploying_M-grid_with_Rocks_ pdf
● Technical contact people at CSC:
– Arto Teräs
– Olli-Pekka Lehto
● Thank you! Questions?