CERN IT Department CH-1211 Genève 23 Switzerland www.cern.ch/it IT Configuration Activities Gavin McCance Online Cross-experiment Meeting, 14 June 2012.

Presentation transcript:

CERN IT Department, CH-1211 Genève 23, Switzerland, www.cern.ch/it
IT Configuration Activities
Gavin McCance
Online Cross-experiment Meeting, 14 June 2012

Why?
We’re changing the tools we use to manage the centre
Ten years ago, we were big in compute
  – There were no real IT ops tools at our scale, so we developed our own
  – Our tools are becoming increasingly brittle and high maintenance
  – Inefficiencies exist but root cause cannot be easily identified
  – Learning curve remains high
About to expand to new remote tier-0
Our needs are no longer special

Why?
Last few years have seen an explosion in the IT operations tool space
  – Configuration, management and monitoring
  – Large, supportive user communities
Strategy is absolute minimum development
  – Other than involvement in upstream projects

Scaling challenges: hosts and people
Currently we have 10k hosts
We’ll add another 5k in the medium term and move to VMs
  – 50–300k “hosts” depending on how we chop the CPUs up
Many, diverse applications (“clusters”) managed by different teams
... and 700+ other “unmanaged” Linux nodes in VMs that could benefit from a simple configuration system

What’s the config stack?
Based around the Puppet tool and eco-system
  – Declarative configuration tool
  – Scales well
  – Very active, wide community
  – Very well integrated with other tools
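To illustrate what "declarative" means here, the following is a minimal Python sketch, not Puppet itself. The desired state is declared as data and an apply step converges the actual state towards it. The resource names and the `apply` function are hypothetical and exist only for this example.

```python
# Illustrative only: desired state is declared as data, and an "apply"
# step converges the actual state towards it. All names are hypothetical.

def apply(desired, actual):
    """Converge `actual` towards `desired`; return the list of changes made."""
    changes = []
    for name, want in desired.items():
        have = actual.get(name)
        if have != want:
            changes.append((name, have, want))
            actual[name] = want
    return changes

desired = {
    "package:httpd": "installed",
    "service:httpd": "running",
}
actual = {"package:httpd": "installed", "service:httpd": "stopped"}

first_run = apply(desired, actual)
second_run = apply(desired, actual)  # nothing left to change

print(first_run)
print(second_run)
```

The second run reports no changes because the state already matches the declaration. That idempotence is the property that makes repeated configuration runs safe.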

Deployment status
~140 nodes in test with single puppetmaster
  – Will be soon expanding to 4k (virtual) nodes on load-balanced puppet setup
  – Integrating with Openstack for VMs
Investigating and understanding tools
  – IT-internal “early adopters” starting (castor, lxbatch, lxplus, webservices, …)
Foreman dashboard as front-end and ENC
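An ENC (external node classifier) is a program that Puppet calls with a node name and that answers with a YAML document listing the node's classes, parameters and environment. The sketch below shows that contract in Python under stated assumptions: the inventory, hostnames and class names are invented, and in the setup described above this role is played by Foreman rather than a hand-written script.

```python
# Hypothetical inventory; in the deployment described above this
# information would come from Foreman, not from a dict in a script.
INVENTORY = {
    "lxplus001.example.org": {
        "environment": "production",
        "classes": ["base", "lxplus"],
        "parameters": {"cluster": "lxplus"},
    },
}
# Fallback classification for nodes that are not in the inventory.
DEFAULT = {"environment": "production", "classes": ["base"], "parameters": {}}

def classify(hostname):
    """Return the classification for a node, or the default one."""
    return INVENTORY.get(hostname, DEFAULT)

def to_yaml(node):
    """Render the classification as the small YAML document an ENC prints."""
    lines = ["---", f"environment: {node['environment']}", "classes:"]
    lines += [f"  - {name}" for name in node["classes"]]
    if node["parameters"]:
        lines.append("parameters:")
        lines += [f"  {k}: {v}" for k, v in sorted(node["parameters"].items())]
    else:
        lines.append("parameters: {}")
    return "\n".join(lines)

print(to_yaml(classify("lxplus001.example.org")))
```

A real ENC would take the hostname from its command-line argument. It is hard-coded here to keep the sketch self-contained.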

Major bits
Puppet and Foreman dashboard using git to version the templates
  – We’re putting “useful to others” modules in
  – We’ve added integration of Puppet to the CERN CA
  – Hiera for cluster-specific parameterisation
      Should make modules more portable in the future.
Our software (and scripts) are built using Koji -> mash -> yum
Automation: Looking at Crucible for automated configuration-code-review
Keeping Lemon for monitoring (for now) though changing alarms to use messaging notifications
mcollective for task orchestration
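The idea behind Hiera-style cluster-specific parameterisation is a hierarchical lookup: the most specific data source that defines a key wins, and modules carry no site-specific values. The following Python sketch imitates that lookup. The hierarchy levels, keys and values are assumptions made for illustration, not CERN's actual hierarchy or Hiera's API.

```python
# Hypothetical hierarchy and data; real Hiera reads both from its own
# configuration and data files.
HIERARCHY = ["hosts/{hostname}", "clusters/{cluster}", "common"]

DATA = {
    "common": {"ntp_server": "ntp.example.org", "max_clients": 100},
    "clusters/lxbatch": {"max_clients": 500},
    "hosts/lxbatch042": {"max_clients": 50},
}

def lookup(key, facts, default=None):
    """Return the value from the most specific level that defines `key`."""
    for level in HIERARCHY:
        source = level.format(**facts)  # e.g. "clusters/lxbatch"
        values = DATA.get(source, {})
        if key in values:
            return values[key]
    return default

node = {"hostname": "lxbatch042", "cluster": "lxbatch"}
print(lookup("max_clients", node))  # host-level override
print(lookup("ntp_server", node))   # falls through to "common"
```

Because the module only asks for `max_clients`, the same module can be reused on another cluster or site by changing the data rather than the code, which is the portability gain the slide mentions.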

Current arch

mcollective: task orchestration
Broadcast
Run
Collect
Very fast response
Automatable
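The broadcast, run, collect pattern can be sketched in Python as follows. This is an illustration of the pattern only: it replaces the messaging layer with a local thread pool, and the node list, the `cluster` filter and the `run_on` function are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical fleet with one fact per node, used for targeting.
NODES = {
    "node01": {"cluster": "lxbatch"},
    "node02": {"cluster": "lxbatch"},
    "node03": {"cluster": "lxplus"},
}

def run_on(node, action):
    """Stand-in for an agent executing the action on one node."""
    return node, f"{action}: ok"

def orchestrate(action, cluster):
    # Broadcast: select every node whose facts match the filter.
    targets = [n for n, facts in NODES.items() if facts["cluster"] == cluster]
    # Run: execute the action on all targets in parallel.
    with ThreadPoolExecutor(max_workers=8) as pool:
        futures = [pool.submit(run_on, n, action) for n in targets]
        # Collect: gather one reply per node into a single result.
        return dict(f.result() for f in futures)

results = orchestrate("puppet-run", "lxbatch")
print(results)
```

Running the action in parallel rather than node by node is what gives the fast response, and returning the replies as plain data is what makes the step easy to automate.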

Interesting CERNish modules
Will be putting things in Modules
  – AFS
  – Keytab, Kerberos
  – CVMFS
  – SSO with Apache httpd
  – SSL Apache load-balancer
  – CERN auth with LDAP (SSSD)
  – CERN Lemon
  – + usual OS level configurations
Openstack integration
Cloud-init auto-registration into Puppet

Summary
We’re moving to standard tools for configuration (and VM + monitoring)
We’re gaining experience using Puppet and friends
  – Internal IT early adopters now
  – On track to move our IT services in 2013
We’re interested in collaborating on the work