CERN IT Department CH-1211 Genève 23 Switzerland t Experiences running a production Puppet Ben Jones HEPiX Bologna Spring.

Slides:



Advertisements
Similar presentations
Focus on Your Content, Not on Ingesting Your Content Terry Brady Applications Programmer Analyst Georgetown University Library
Advertisements

Tier-1 Evolution and Futures GridPP 29, Oxford Ian Collier September 27 th 2012.
About Me CTO, Individual Digital, Inc. (Startup) Author of ext/tidy, PHP 5 Unleashed, Zend Ent. PHP Patterns
Paperless Online Payroll, Integrated HR & Report Generating System.
Hit or Miss ? !!!.  Cache RAM is high-speed memory (usually SRAM).  The Cache stores frequently requested data.  If the CPU needs data, it will check.
Internet Networking Spring 2002 Tutorial 13 Web Caching Protocols ICP, CARP.
Grid Technology CERN IT Department CH-1211 Geneva 23 Switzerland t DBCF GT Simplifying Configuration Ricardo Rocha ( on behalf of the LCGDM.
CERN - IT Department CH-1211 Genève 23 Switzerland t Service-Now UDS training [Jan 2011] - 1 Service-now training for UDS Service-now training.
CERN IT Department CH-1211 Genève 23 Switzerland t Integrating Lemon Monitoring and Alarming System with the new CERN Agile Infrastructure.
CONTINUOUS INTEGRATION, DELIVERY & DEPLOYMENT ONE CLICK DELIVERY.
Module 1: Installing Active Directory Domain Services
Getting to Push Button Deploys Moovweb January 19, 2012.
Christopher Jeffers August 2012
Tools and software process for the FLP prototype B. von Haller 9. June 2015 CERN.
CERN - IT Department CH-1211 Genève 23 Switzerland t Monitoring the ATLAS Distributed Data Management System Ricardo Rocha (CERN) on behalf.
AI project components: Facter and Hiera
Puppet with vSphere Workshop Install, configure and use Puppet on your laptop for vSphere DevOps Billy Lieberman August 1, 2015.
The Art and Zen of Managing Nagios with Puppet Michael Merideth - VictorOps
CERN IT Department CH-1211 Genève 23 Switzerland t Internet Services Job Monitoring for the LHC experiments Irina Sidorova (CERN, JINR) on.
Towards Low Overhead Provenance Tracking in Near Real-Time Stream Filtering Nithya N. Vijayakumar, Beth Plale DDE Lab, Indiana University {nvijayak,
Operating Systems & Information Services CERN IT Department CH-1211 Geneva 23 Switzerland t OIS Drupal Database Selection Tim Bell 6 th June.
Configuration Management Evolution at CERN Gavin
Configuration Management with Cobbler and Puppet Kashif Mohammad University of Oxford.
CERN IT Department CH-1211 Genève 23 Switzerland t Monitoring: Tracking your tasks with Task Monitoring PAT eLearning – Module 11 Edward.
Issues Autonomic operation (fault tolerance) Minimize interference to applications Hardware support for new operating systems Resource management (global.
Platform & Engineering Services CERN IT Department CH-1211 Geneva 23 Switzerland t PES AI’s user access, OpenStack security groups and firewall.
® IBM Software Group © 2007 IBM Corporation Best Practices for Session Management
1 PUPPET AND DSC. INTRODUCTION AND USAGE IN CONTINUOUS DELIVERY PROCESS. VIKTAR VEDMICH PAVEL PESETSKIY AUGUST 1, 2015.
EGEE-III INFSO-RI Enabling Grids for E-sciencE EGEE and gLite are registered trademarks Stephen Childs Trinity College Dublin &
Virtualised Worker Nodes Where are we? What next? Tony Cass GDB /12/12.
1 Andrea Sciabà CERN Critical Services and Monitoring - CMS Andrea Sciabà WLCG Service Reliability Workshop 26 – 30 November, 2007.
WLCG infrastructure monitoring proposal Pablo Saiz IT/SDC/MI 16 th August 2013.
CERN IT Department CH-1211 Geneva 23 Switzerland t CF Computing Facilities Agile Infrastructure Monitoring CERN IT/CF.
Operating Systems & Information Services CERN IT Department CH-1211 Geneva 23 Switzerland t OIS First look at the Mobile Framework Ivan Deloose,
CERN IT Department CH-1211 Genève 23 Switzerland PES 1 Ermis service for DNS Load Balancer configuration HEPiX Fall 2014 Aris Angelogiannopoulos,
CERN IT Department CH-1211 Genève 23 Switzerland t IT Configuration Activities Gavin McCance Online Cross-experiment Meeting, 14 June 2012.
1 CERN IT Department CH-1211 Genève 23 Switzerland t Puppet in the CERN CC Tomas Karasek Steve Traylen Oct
Computing Facilities CERN IT Department CH-1211 Geneva 23 Switzerland t CF Agile Infrastructure Monitoring HEPiX Spring th April.
CERN IT Department CH-1211 Geneva 23 Switzerland t CF Computing Facilities IT Monitoring CERN IT-CF HEPiX Fall 2013.
CERN IT Department CH-1211 Geneva 23 Switzerland t A proposal for improving Job Reliability Monitoring GDB 2 nd April 2008.
Scaling the CERN OpenStack cloud Stefano Zilli On behalf of CERN Cloud Infrastructure Team 2.
Data Transfer Service Challenge Infrastructure Ian Bird GDB 12 th January 2005.
Platform & Engineering Services CERN IT Department CH-1211 Geneva 23 Switzerland t PES Development Workflow of the Configuration Management.
CERN IT Department CH-1211 Genève 23 Switzerland t CERN IT Monitoring and Data Analytics Pedro Andrade (IT-GT) Openlab Workshop on Data Analytics.
Computing Facilities CERN IT Department CH-1211 Geneva 23 Switzerland t CF Alarming with GNI VOC WG meeting 12 th September.
CERN IT Department CH-1211 Genève 23 Switzerland t Migration from ELFMs to Agile Infrastructure CERN, IT Department.
CERN - IT Department CH-1211 Genève 23 Switzerland t Operating systems and Information Services OIS Proposed Drupal Service Definition IT-OIS.
Platform & Engineering Services CERN IT Department CH-1211 Geneva 23 Switzerland t PES AI Images, flavours and partitions Vítor Gouveia,
CERN IT Department CH-1211 Genève 23 Switzerland t CERN Agile Infrastructure Monitoring Pedro Andrade CERN – IT/GT HEPiX Spring 2012.
CERN AI Config Management 16/07/15 AI for INFN visit2 Overview for INFN visit.
Platform & Engineering Services CERN IT Department CH-1211 Geneva 23 Switzerland t PES Improving resilience of T0 grid services Manuel Guijarro.
Breaking the frontiers of the Grid R. Graciani EGI TF 2012.
CERN IT Department CH-1211 Genève 23 Switzerland t Bamboo users meeting IT-CS-CT.
Cloud Installation & Configuration Management. Outline  Definitions  Tools, “Comparison”  References.
Platform & Engineering Services CERN IT Department CH-1211 Geneva 23 Switzerland t PES Agile Infrastructure Project Overview : Status and.
Software collaboration tools as a stack of services Borja Aparicio Cotarelo IT-PES-IS 2HEPiX Fall 2015 Workshop.
Configuration Services at CERN HEPiX fall Ben Jones, HEPiX Fall 2014.
Automating operational procedures with Daniel Fernández Rodríguez - Akos Hencz -
SERVERS. General Design Issues  Server Definition  Type of server organizing  Contacting to a server Iterative Concurrent Globally assign end points.
CERN IT Department CH-1211 Genève 23 Switzerland M.Schröder, Hepix Vancouver 2011 OCS Inventory at CERN Matthias Schröder (IT-OIS)
Platform & Engineering Services CERN IT Department CH-1211 Geneva 23 Switzerland t PES GIT Service in the Agile Infrastructure Project Vítor.
Renewal of Puppet for Australia-ATLAS
@ Bucharest DevOps Hacker Meetup
Dag Toppe Larsen UiB/CERN CERN,
High Availability Linux (HA Linux)
Dag Toppe Larsen UiB/CERN CERN,
Cisco Data Virtualization
Elevator Inspection System
Puppet
Scaling Puppet and Foreman for HPC
Presentation transcript:

CERN IT Department CH-1211 Genève 23 Switzerland t Experiences running a production Puppet Ben Jones HEPiX Bologna Spring 2013

Infrastructure Foreman – kickstart, ENC, report visualisation Multiple puppet masters – load balanced (simple for now) PuppetDB – stored configs, complex manifests, data mining hiera data separation CERN CA git managed manifests – split into “modules”, “hostgroups”, “hieradata” Puppet in production - 2

Production service Customers running production services using the infra – openstack, cvmfs, lxplus – modules appearing for bddi, MyProxy, glexec WN, argus, voms, castor Plant still modest but increasing – ~1350 clients, 2/3 physical – numbers will change soon as batch comes online Lots of changes in this year due to production experience Puppet in production - 3

Don’t turn your computer centre into a giant DDOS machine Puppet in production - 4

Facter Facter a great tool to abstract common information for a machine – virtual fact good example of hidden complexity (180 loc) Unlike what we were used to in Quattor, machine facts can inform configuration Custom facts are simple to write… too simple! – LANDB integration for location, network info, owner etc – 1000s of clients * short runinterval == unhappy LANDB Solutions: cache or use masters – cache directly in facter code or wait for facter 2.0 – use custom functions Puppet in production - 5

Decide on one entry point to apply configuration Puppet in production - 6

the foreman Foreman useful tool for many tasks – provisioning (physical and virtual) – configuration (ENC) – monitoring (reports) Important to consider extent to use the ENC Environment best defined by ENC – no choice anyway in puppet 3 Hostgroups essential in ENC – hostgroups similar to quattor clusters castor/server/diskserver/atlas/t0atlas – hiera data assigned to hostgroup, so machine cannot assert membership Puppet in production - 7

foreman classes Puppet in production - 8

module imports We switched off foreman classes. History of changes not in git Either no parameters, or class{} declarations – most non-trivial modules require some parameters Class / environment import takes time Nested hostgroup module definition was useful – we implemented a puppet include_if_exists (on github) – hostgroup module tree included if in hierarchy – castor/headnode/c2repack hostgroup would load: hostgroups/hg_castor/manifests/init.pp hostgroups/hg_castor/manifests/headnode/c2repack.pp Puppet in production - 9

Don’t let your machines become time travellers. Puppet in production - 10

Compile process Facts Facts received from node ENC Node info from the ENC, cache result. Catalog Combine facts, enc data & config, compile catalog. Puppet in production - 11

Stale definitions With multiple puppet masters, if foreman is unavailable the cache can be stale depending on which master contacted. A change of hostgroup or environment can be painful Obvious answer to not cache, or to share / update cache Lesson is components are horizontally scalable, but not always case that this is robust Puppet in production - 12

Try not to make all your secrets public Puppet in production - 13

PuppetDB Collects facts & catalogs generated by puppet Supports exported resources & inventory service Can be directly queried in manifests – github.com/dalen/puppetdbquery Wider ad-hoc usage useful – git hooks – “who’s using X” Dangerous! – /commands – “secret” data Puppet in production - 14

PuppetDB proxy Whitelist traffic from puppet masters & lb Refuse all /commands/ from load balancers Use resource tags to filter “secrets” { "parameters" : { "entry" : "egroup", "k5file" : "/root/.k5login”,}, "sourceline" : 54, "sourcefile" : "/etc/puppet/environments/bejones/modules/kerberos/manifests/config.pp", "exported" : false, "tags" : [ "config", "k5entry", "ai-admins", "kerberos::k5entry", "base", "class", "kerberos", "kerberos::config" ], "title" : "ai-admins", "type" : "Kerberos::K5entry”, "certname" : "benvm01.cern.ch" } Puppet in production - 15

git workflow is easy to get wrong Puppet in production - 16

git workflow puppet “dynamic environments” – puppet masters pull git repositories, create environment per branch – foreman imports environment list Managed branches for majority – production (master) – devel Topic branches should be easy to create Don’t allow non fast-forward updates to managed branches – obviously, who would forget to do that? Puppet in production - 17

git workflow Goal is to get safe changes applied as quickly as possible Only cherry-picked signed off changes to master Fast track for low impact one-module changes, hostgroup & hiera More involvement from core team for complex changes Future plans to use CI to streamline – bamboo (or similar) for system tests – canary machines Puppet in production - 18

Community “We’re not special” Fork where necessary but always try to return to mainstream Would welcome more involvement from HEPiX sites on common problems Puppet in production - 19