Renewal of Puppet for Australia-ATLAS

Slides:



Advertisements
Similar presentations
About Me CTO, Individual Digital, Inc. (Startup) Author of ext/tidy, PHP 5 Unleashed, Zend Ent. PHP Patterns
Advertisements

Puppet for GENI Experiments
Windows Deployment Services WDS for Large Scale Enterprises and Small IT Shops Presented By: Ryan Drown Systems Administrator for Krannert.
Version Control Systems Phil Pratt-Szeliga Fall 2010.
G51FSE Version Control Naisan Benatar. Lecture 5 - Version Control 2 On today’s menu... The problems with lots of code and lots of people Version control.
Getting to Push Button Deploys Moovweb January 19, 2012.
AI project components: Facter and Hiera
October, Scientific Linux INFN/Trieste B.Gobbo – Compass R.Gomezel - T.Macorini - L.Strizzolo INFN - Trieste.
Review Security Hardening IPTables SELinux. Today Installations and updates – Rpm command and packages Apache “Issue Ownership”
The Art and Zen of Managing Nagios with Puppet Michael Merideth - VictorOps
SWEN 302: AGILE METHODS Roma Klapaukh & Alex Potanin.
CERN IT Department CH-1211 Genève 23 Switzerland t Experiences running a production Puppet Ben Jones HEPiX Bologna Spring.
Configuration Management with Cobbler and Puppet Kashif Mohammad University of Oxford.
Lucien Boland and Sean Crosby Research Computing.
Graphing and statistics with Cacti AfNOG 11, Kigali/Rwanda.
Support in setting up a non-grid Atlas Tier 3 Doug Benjamin Duke University.
Platform & Engineering Services CERN IT Department CH-1211 Geneva 23 Switzerland t PES AI’s user access, OpenStack security groups and firewall.
1 PUPPET AND DSC. INTRODUCTION AND USAGE IN CONTINUOUS DELIVERY PROCESS. VIKTAR VEDMICH PAVEL PESETSKIY AUGUST 1, 2015.
Web Hosting Providers TERRY HALL. Requirements  FREE  No advertising  FTP access (or another secure transfer method)  Near 100% uptime  Adequate.
DPM Python tools Ivan Calvet IT/SDC-ID DPM Workshop 10 th October 2014.
1 CERN IT Department CH-1211 Genève 23 Switzerland t Puppet in the CERN CC Tomas Karasek Steve Traylen Oct
FROM QUATTOR TO PUPPET A T2 point of view. BACKGROUND GRIF distributed T2 site  6 sub-sites  Used quattor for  GRIF-LAL is the home of 2 well known.
Platform & Engineering Services CERN IT Department CH-1211 Geneva 23 Switzerland t PES Development Workflow of the Configuration Management.
CERN IT Department CH-1211 Genève 23 Switzerland t Migration from ELFMs to Agile Infrastructure CERN, IT Department.
15-Feb-02Steve Traylen, RAL WP6 Test Bed Report1 RAL/UK WP6 Test Bed Report Steve Traylen, WP6 PPGRID/RAL, UK
1 Update at RAL and in the Quattor community Ian Collier - RAL Tier1 HEPiX FAll 2010, Cornell.
T3g software services Outline of the T3g Components R. Yoshida (ANL)
CERN AI Config Management 16/07/15 AI for INFN visit2 Overview for INFN visit.
Cloud Installation & Configuration Management. Outline  Definitions  Tools, “Comparison”  References.
Overview of cluster management tools Marco Mambelli – August OSG Summer Workshop TTU - Lubbock, TX THE UNIVERSITY OF CHICAGO.
Platform & Engineering Services CERN IT Department CH-1211 Geneva 23 Switzerland t PES Agile Infrastructure Project Overview : Status and.
Configuration Services at CERN HEPiX fall Ben Jones, HEPiX Fall 2014.
1 Web Technologies Website Publishing/Going Live! Copyright © Texas Education Agency, All rights reserved.
SCDB Update Michel Jouvin LAL, Orsay March 17, 2010 Quattor Workshop, Thessaloniki.
Passwords Passwords are unpleasant Hard to remember Remember a couple
1 Policy Based Systems Management with Puppet Sean Dague
Platform & Engineering Services CERN IT Department CH-1211 Geneva 23 Switzerland t PES GIT Service in the Agile Infrastructure Project Vítor.
INFSO-RI Enabling Grids for E-sciencE Workshop WLCG Security for Grid Sites Louis Poncet System Engineer SA3 - OSCT.
Nov 05, 2008, PragueSA3 Workshop1 A short presentation from Owen Synge SA3 and dCache.
Lightweight inventory management with Pallet Jack
BY: SALMAN 1.
Pulling the Galaxy’s Strings
Australia Site Report Lucien Boland Goncalo Borges Sean Crosby
Progress Apama Fundamentals
Introduction Adult website business is very big and it has loads of cash. You cannot imagine how much a single famous porn site makes a day. There are.
Agenda:- DevOps Tools Chef Jenkins Puppet Apache Ant Apache Maven Logstash Docker New Relic Gradle Git.
Stress Free Deployments with Octopus Deploy
@ Bucharest DevOps Hacker Meetup
HEPiX Configuration Management WG
L – Modeling and Simulating Social Systems with MATLAB
Site Administration Tools: Ansible
DAQ LHC Workshop Configuration management systems
NGI and Site Nagios Monitoring
AI How to: System Update and Additional Software
BY: SALMAN.
Jason Bury Dylan Drake Rush Corey Watt
How to open source your Puppet configuration
Progress on NA61/NA49 software virtualisation Dag Toppe Larsen Wrocław
Transition from git.cern.ch
Spring Cleaning the Software Repositories Matthias Schröder
4th Forum How to easily offer your application as a self-service template by using OpenShift and GitLab-CI 4th Forum Alberto.
Quattor Usage at Nikhef
Quality Control in the dCache team.
Puppet
Fun with Reporting Services Tools
June 2011 David Front Weizmann Institute
DHCP, DNS, Client Connection, Assignment 1 1.3
Scaling Puppet and Foreman for HPC
Agile testing for web API with Postman
Pete Gronbech, Kashif Mohammad and Vipul Davda
Presentation transcript:

Renewal of Puppet for Australia-ATLAS Sean Crosby Goncalo Borges Lucien Boland Jeremy Hack

Australia-ATLAS services ATLAS Tier 2 100 WNs + Torque/Maui server DPM headnode + 16 storage nodes Regular EMI services (CE, BDII, APEL) perfSonar CoEPP services Public web servers, dokuwiki LDAP/Kerberos auth servers Tier 3 UIs NFS /home CephFS /coepp/cephfs 250 nodes

History of configuration management Until 2013, cfengine2 Scripts written by previous admins Covered SL5 services Lucien and I were not comfortable Didn’t cover everything Lots of hacks for bad packages or bugs fixed ages ago Basically called YAIM for most Grid services But no overwhelming reason to change Along came SL6 however…

Quattor Cfengine Puppet Choices, choices… Quattor Went to many EGI conferences, and was quite popular with French cloud Cfengine Didn’t really like syntax. Cfengine2 to cfengine3 was a big jump Puppet Spoke with Steve Traylen at a HEPIX, and CERN was just getting into it Lucien loves Ruby

Puppet written in Ruby, and has its own DSL Puppet basics Puppet is basically resources (file, service, package, exec, ssh_authorized_key,…) grouped into manifests and modules Puppet written in Ruby, and has its own DSL Facts are constants (OS version, IP address, hard drives etc) accessible to a Puppet run Hiera is a key/value lookup tool for configuration data, built to make Puppet better and let you set node-specific data without repeating yourself. (Puppet website) Forge is a Puppetlabs hosted community page where users can upload their own Puppet modules for others to use. CERN uploads most (all?) of their modules to here, as well as GitHub.

Puppet 3.2/PuppetDB 1.5.2/mod_passenger/Puppet Dashboard Original design Puppet 3.2/PuppetDB 1.5.2/mod_passenger/Puppet Dashboard All modules in single Git repository Hieradata in separate repository Environments were just branches of Git repo Git hooks

Original design Came from cfengine system of classifying nodes into groups (classes) for Nagios checks, common tasks Puppet Dashboard hostgroups with custom Puppet parser function to pull contents from MySQL Custom static Ruby Facter facts grouping hosts into host_group, host_type, location

Use Dashboard hostgroups to populate Nagios hostgroups Original design Use Dashboard hostgroups to populate Nagios hostgroups We did this due to slowness of exported resource compilation in Puppet 3.2

Use Facter host_groups for Puppet manifests (e.g. iptables manifest) Original design Use Facter host_groups for Puppet manifests (e.g. iptables manifest) Use if statements for choices

Single git repo made “safe” development harder Problems Single git repo made “safe” development harder Changing module to support new version of software, but keeping existing clients running involved branching full git repo Merging back was sometimes difficult due to many changes happening in main branch We wrote many modules before the Forge/CERNops was made Harder to integrate 3rd party modules due to dependencies (e.g. DPM puppet modules, VOMS, MySQL) Written in a “just get it done” way. Not very extensible or shareable

Not every part of server was Puppeted Problems Not every part of server was Puppeted Some packages installed in Kickstart Networking not configured (e.g. bonds, LACP, machines with static IP e.g. DHCP server) perfSonar has a small Puppet config applied (ssh keys, firewall) Xenservers also with a small Puppet config At the start, still relied on YAIM for some Grid services Some machines not Puppeted at all /home NFS server. We originally deemed it “too critical” to be Puppeted

Lots of steps can be missed when commissioning server Problems When a machine is not completely controlled by Puppet, it breeds a lack of confidence in server We did manual config to get Puppet servers up and running to accept Puppet connections Puppet servers haven’t been updated in 3 years because we don’t have complete confidence what was done to make them work Lots of steps can be missed when commissioning server Forget to add host to Puppet Dashboard hostgroup stops monitoring Forget to add host to Facter facts stop certain packages/iptables rules being added

Harder to get other team members up to speed Problems Harder to get other team members up to speed “Why isn’t this host joining the right Ganglia cluster” Were not following best practice Cool new features like auto Hiera lookup and structured data from Hiera were impossible for most of our modules without a complete rewrite

Moving virtualisation technology from Xenserver to KVM Impetus for change Moving virtualisation technology from Xenserver to KVM Reinstall Puppetserver on KVM? Could convert existing Puppetserver to KVM… Puppet 4 is coming We relied on node inheritance – deprecated in Puppet 4 Centos 7 servers The final kick for us to move and rewrite

Lots of deprecations Cool new features Node inheritance Puppet 4 changes Lots of deprecations Node inheritance “import” for manifests Variables can’t start with capital letters Class names can’t have hyphens Updating array/hash values Ruby DSL Config file environments Facts no longer stringified Cool new features AIO packaging: newer version of Ruby – made exported resources so much faster EPP templates

Settled on a few golden rules Many “best” practices We searched around, including in textbooks and websites, for “best” practice Settled on a few golden rules Data is in Hiera Always use Puppet variable autolookup Default values for variables in module Hiera Module name should reflect package name (with few exceptions) Search Forge/CERNops first before writing (pick module with least dependencies) 1 module per Git repo (r10k) Roles/profiles ENC sets all node intrinsic values Puppet EVERYTHING, including Puppet

Lots of our new modules are just copied/pasted from Forge Puppet: A new way We created a set of bootstrap scripts, which just install puppet, run r10k, and enough config to then run puppet to install our puppet servers Implemented a separate CA server, to allow for easy scale up of Puppet servers Lots of our new modules are just copied/pasted from Forge

New, simple ENC Hiera based All “intrinsic” properties Puppet: A new way New, simple ENC Hiera based All “intrinsic” properties Easy to add more if needed

Better and simpler Git hooks Puppet: A new way Better and simpler Git hooks

Simpler way to customise modules based on Facts Puppet: A new way Simpler way to customise modules based on Facts Easier to customise/disable Nagios checks for hosts Old hostgroup model made it very difficult to disable a check for a specific host Module writing is actually easier No if statements, no edge cases

Old Puppet worked, but was not optimal Summary Old Puppet worked, but was not optimal Centos 7 was last push for us to change “Best practices” are best for a reason New system is easier to commission a host, and easier to maintain We love the Forge – please share your modules!