Linux Cluster Tools Development

Presentation transcript:

Linux Cluster Tools Development
Dane Skow, Fermilab
October 8, 1999
HEPNT/HEPiX

Projects
- Linux farms (FT and Run II)
- Level 3 trigger farms
- Tape mover nodes (Enstore)
- Desktops
- Prototyping systems (DAQ tests)

FNAL System Census 1999 (embedded document not reproduced in the transcript)

Farms
- Facility farm is now nearly completely Linux (39+50 dual PCs, 6 quad SGIs).
- Run II farms are ramping up from 8 nodes to 50 (CDF and D0 each).
- Decision made to use an I/O node for output building. SGI Origins for I/O nodes.
- Production farm on the 2.0.32 kernel has been fine.
- Prototype on 2.0.35 has been rocky.
- Burn-in on 2.2.10 has had many machines hang. Moving to 2.2.12.
- Level 3 trigger farms: tests have been good to date. Large-scale purchases delayed until late 2000(?).

Prototyping systems
- Linux boxes are popular as test clusters for developing ideas and testing software.
- Used extensively by the online data logging and D0 data handling teams.

Desktops
- Over half of all Linux boxes are still on the desktop.
- Growth continues to keep pace with farm deployment (even with 100+ node purchases).
- Code developers are prime deployment targets.
- Physics analysis users are beginning to adopt it but are running into trouble with tapes.
- People are still using a VAX and Unix workstation mindset.
- Most desktops are run in “Orange” mode.
- “Self-help” mailing list linux-users@fnal.gov has been very successful.

Security
- The AutoRPM system has been popular and effective for distributing security patches.
- The distribution continues to have its default service configuration pared down.
- Bundled applications follow the RedHat release.
- Users are supportive of a “minimal” default.
- Early deployment of the AFS client has had good success. Plan on making it standard for the next release (RH 6.1).
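The patch-distribution idea above amounts to comparing what is installed against what is available. The following is a minimal sketch of that comparison only, not AutoRPM's actual logic. The function names and the package data are hypothetical, and the version parsing assumes purely numeric dotted versions; real RPM version comparison also handles epochs, release fields, and alphabetic segments.

```python
from typing import Dict, List, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Parse a purely numeric dotted version such as '2.2.12' into (2, 2, 12).

    Assumption: no epochs, release suffixes, or letters (unlike real RPM versions).
    """
    return tuple(int(part) for part in text.split("."))


def pending_updates(installed: Dict[str, str], available: Dict[str, str]) -> List[str]:
    """Return the names of installed packages for which a newer version is available."""
    return sorted(
        name
        for name, version in installed.items()
        if name in available
        and parse_version(available[name]) > parse_version(version)
    )
```

Comparing parsed tuples rather than raw strings matters because, as strings, "2.2.9" sorts after "2.2.10". Packages that are available but not installed are ignored, which matches a "minimal" default install.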

Infrastructure
Discussions of the tools that are needed seem to break down into 4 categories:
1. System monitoring and alarms. Currently use simple ping tests and PATROL. This is the area of greatest activity in the Beowulf world.
2. System installation and configuration (patch) management. Use a network install server and AutoRPM.
3. Backup and failure recovery. Systracker and other ideas. Still early.
4. Resource accounting and capacity planning. Use batch systems for scheduling and pacct’ing scripts for usage tracking.
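A "simple ping test" of the kind mentioned under system monitoring can be sketched as follows. This is an illustration under stated assumptions, not the script used at FNAL: the function names are hypothetical, and the `ping` flags (`-c` for count, `-W` for timeout in seconds) assume the Linux iputils version of the command.

```python
import subprocess
from typing import Callable, Iterable, List


def ping_once(host: str, timeout_s: int = 2) -> bool:
    """Return True if a single ICMP echo to `host` succeeds (Linux `ping` flags)."""
    try:
        result = subprocess.run(
            ["ping", "-c", "1", "-W", str(timeout_s), host],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except OSError:
        # `ping` itself could not be executed; treat the node as unreachable.
        return False
    return result.returncode == 0


def unreachable_nodes(
    hosts: Iterable[str], probe: Callable[[str], bool] = ping_once
) -> List[str]:
    """Return the hosts for which the probe failed, in input order."""
    return [host for host in hosts if not probe(host)]
```

Calling `unreachable_nodes(["node01", "node02"])` with hypothetical node names would return the nodes that did not answer, which a cron job could turn into an alarm mail. The probe is a parameter so that the alarm logic can be exercised without touching the network.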

Futures - Infrastructure
- Many people are interested in this area, but efforts are uncoordinated (Beowulf, MOSIX, etc.).
- A small DOE grant funded change control work over the summer (RAP), called systracker.
- A discussion group on “Next Generations Operations” for FNAL datacenter operations is just completing its requirements gathering phase.

Systracker
- Based on our success with AutoRPM, we invited Kirk Bauer to come work on a configuration management tool.
- Prototype of a system change tracking system (logger and replay mechanism).
- The desire is for an easy method to restore changes to the install configuration.
- Perl modules based on concepts from Tripwire, AutoRPM and RCS.
- Local-machine alpha version available.
- Next steps would be an archive server and the addition of other package handling methods (UPS, etc.).

Systracker (architecture diagram; labels: Config Files, System Dirs, RPMs, Systracker difference engine, CVS repository, Replay engine, UPS)

Systracker
- Presume that one can install a system to a base configuration.
- Take a snapshot of this as the system baseline.
- Use Tripwire mechanisms to monitor system files and directories for changes, and check updates into a CVS repository.
- Modified RPM to archive RPMs to a repository.
- Create a module to create a “replay” script from the differences between baseline and target.
- Working on installation scripts to replay the “replay”.
- Alpha code available at http://home.fnal.gov/~dane/systracker.tgz
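The baseline, change detection, and replay steps above can be sketched in a few lines. This is not Systracker's code (which the slides describe as Perl modules) but a hedged Python illustration of the same idea. All function names are hypothetical, a plain directory called `archive` stands in for the CVS repository, and symlinks, ownership, permissions, and RPMs are ignored.

```python
import hashlib
import os
import shlex
from typing import Dict, List


def snapshot(root: str) -> Dict[str, str]:
    """Map each regular file under `root` (path relative to it) to its SHA-256 digest."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if os.path.islink(full) or not os.path.isfile(full):
                continue  # this sketch tracks regular files only
            with open(full, "rb") as fh:
                digests[os.path.relpath(full, root)] = hashlib.sha256(fh.read()).hexdigest()
    return digests


def diff(baseline: Dict[str, str], target: Dict[str, str]) -> Dict[str, List[str]]:
    """Classify paths as added, changed, or removed going from baseline to target."""
    return {
        "added": sorted(p for p in target if p not in baseline),
        "changed": sorted(p for p in target if p in baseline and target[p] != baseline[p]),
        "removed": sorted(p for p in baseline if p not in target),
    }


def replay_script(changes: Dict[str, List[str]], archive: str, root: str) -> str:
    """Emit shell commands that would bring a baseline install at `root` to the target.

    `archive` is a hypothetical directory holding the target copies of the files.
    """
    lines = ["#!/bin/sh", "set -e"]
    for path in changes["added"] + changes["changed"]:
        dst = os.path.join(root, path)
        lines.append(f"mkdir -p {shlex.quote(os.path.dirname(dst))}")
        lines.append(f"cp -p {shlex.quote(os.path.join(archive, path))} {shlex.quote(dst)}")
    for path in changes["removed"]:
        lines.append(f"rm -f {shlex.quote(os.path.join(root, path))}")
    return "\n".join(lines) + "\n"
```

Checksums are used here as a stand-in for Tripwire-style integrity checks: a file counts as changed only when its content digest differs from the baseline. Generating a script rather than applying changes directly keeps the replay step reviewable before it touches a freshly installed machine.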

Futures - Software
- Desktop environment decision (KDE vs. GNOME) likely to be desired soon.
- Strong desire for a centralized backup or archive service.
- Both of these will be exacerbated by increased use of physics analysis tools (PAW now, ROOT most likely).
- Discussions about whether one wants tracked

Futures - Hardware
- Looking harder at high-density systems (2U cases, racked bare boards, etc.).
- Run II purchases likely delayed until FY01.
- Purchasing preconfigured hardware from specified vendors is work not yet done.
- Brave ideas from several future experiments about thousands of PCs per experiment.

Summary
- At FNAL, the Linux installation infrastructure is better than that of most OS flavors.
- Users are “violently” in favor of an “Orange” configuration but not diligent in carrying out admin duties.
- Linux growth has not yet maxed out. It is likely to completely supplant the Unix desktop.
- Serious use by amateurs is not yet there, but is coming soon.
- The list of desired applications for Linux continues to grow. Expect to see videoconferencing, etc., coming.