CERN Cloud Infrastructure Report
Arne Wiebalck for the CERN Cloud Team
HEPiX Spring Meeting, DESY, Zeuthen, Germany, Apr 19, 2016
Outline: Numbers, Operations, What's new, WIP

CERN Cloud Recap
The CERN Cloud Service is one of the three major components of IT's Agile Infrastructure (AI) project
- Policy: servers in CERN IT shall be virtual
Based on OpenStack
- In production since July 2013, with (almost) 4 rolling upgrades performed since
- Currently in transition from Kilo to Liberty
- Components in use: Nova, Glance, Keystone, Horizon, Cinder, Ceilometer, Heat, Neutron

CERN Cloud Architecture (1)
Two data centers
- 1 region (1 API), 36 cells (+10!)
- Cells map use cases: hardware, hypervisor type, location, users, …
Top cell on several physical nodes in HA
- Clustered RabbitMQ with mirrored queues
- API servers are VMs in various child cells
Child cell controllers are OpenStack VMs
- One controller per cell
- Trade-off between complexity and failure impact (a configuration sketch follows below)
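For illustration, a minimal sketch of how such a cells (v1) layout is typically expressed in nova.conf; the cell names are hypothetical and this is not CERN's actual configuration:

# top cell controller: exposes the API and runs nova-cells in "api" mode
cat >> /etc/nova/nova.conf <<'EOF'
[cells]
enable = True
cell_type = api
name = top
EOF

# each child cell controller: runs nova-cells in "compute" mode,
# together with its own scheduler, conductor and network services
cat >> /etc/nova/nova.conf <<'EOF'
[cells]
enable = True
cell_type = compute
name = child-cell-01
EOF

With cells v1 each cell has its own RabbitMQ and database, which keeps the failure domain per cell small.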

CERN Cloud Architecture (2)
[Architecture diagram: the top cell controller runs nova-cells, nova-api and RabbitMQ next to the API servers; each child cell controller runs nova-cells, nova-api, nova-scheduler, nova-conductor, nova-network and its own RabbitMQ; compute nodes run nova-compute; all of this is backed by the DB infrastructure.]

CERN Cloud in Numbers (1)
5'800 hypervisors in production (+25% compared to 6 months ago)
- Majority qemu/kvm, now on CC7 (~150 Hyper-V hosts)
- ~2'100 HVs at Wigner in Hungary (batch, compute, services)
- HVs on critical power (+50%)
155k cores (+30k), ~350 TB RAM (+100 TB), ~18'000 VMs (+3'000)
To be increased in 2016!
- +57k cores in spring
- … kHS06 in autumn

CERN Cloud in Numbers (2)
2'700 images/snapshots (+700)
- Glance on Ceph
2'200 volumes (+700, uptake doubled)
- Cinder on Ceph (& NetApp) in GVA & Wigner
Every 10 seconds a VM gets created or deleted in our cloud!
[Rally plot]

Operations: NUMA/THP Recap (1)
The HS06 rating of full-node VMs was about 20% lower than that of the underlying host
- Smaller VMs did much better
Investigated various tuning options
- KSM, EPT, PAE, pinning, … plus hardware-type dependencies
Comparison with Hyper-V: no general issue
- Loss without tuning ~3% (full-node), <1% for small VMs
- NUMA awareness!

Operations: NUMA/THP Recap (2)
NUMA awareness identified as the most efficient setting
- Full-node VMs have ~3% overhead in HS06
"EPT-off" side-effect
- Small number of hosts, but very visible there
Use 2 MB huge pages
- Keeps the "EPT off" performance gain with "EPT on"
All details in Arne Wiebalck's talk at BNL

Operations: NUMA/THP Roll-out
Rolled out on ~2'000 batch hypervisors (~6'000 VMs)
- Huge page allocation as boot parameter → reboot
- VM NUMA awareness as flavor metadata → delete/recreate (both settings sketched below)
Cell-by-cell (~200 hosts):
- Queue reshuffle to minimize resource impact
- Draining & deletion of batch VMs
- Hypervisor reconfiguration (Puppet) & reboot
- Recreation of batch VMs
The whole update took about 8 weeks
- Organized between the batch and cloud teams
- No performance issue observed since
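The two settings mentioned above can be sketched as follows; the flavor name and the huge page count are illustrative, not the values used on the CERN batch hypervisors:

# 1) Huge page allocation via a kernel boot parameter (requires a hypervisor reboot),
#    e.g. appended to the kernel command line in the GRUB configuration:
#      hugepagesz=2M hugepages=49152
#
# 2) NUMA awareness and 2 MB huge page backing via flavor extra specs
#    (existing VMs have to be deleted and recreated to pick this up):
openstack flavor set m1.batch \
  --property hw:numa_nodes=2 \
  --property hw:mem_page_size=2048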

Operations: SSDs
# fio --name xyz --rw=randwrite --size=1G --direct=1
xyz: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=sync, iodepth=1
fio
Starting 1 process
Jobs: 1 (f=1): [w(1)] [100.0% done] [0KB/41132KB/0KB /s] [0/10.3K/0 iops] [eta 00m:00s]
xyz: (groupid=0, jobs=1): err= 0: pid=3507: Mon Apr 11 16:09:
  write: io=1024.0MB, bw=42442KB/s, iops=10610, runt= 24706msec
    clat (usec): min=70, max=2200, avg=92.32, stdev=17.43
     lat (usec): min=70, max=2201, avg=92.57, stdev=17.44
    clat percentiles (usec):
     |  1.00th=[   73],  5.00th=[   76], 10.00th=[   78], 20.00th=[   82],
     | 30.00th=[   84], 40.00th=[   86], 50.00th=[   88], 60.00th=[   91],
     | 70.00th=[   95], 80.00th=[  100], 90.00th=[  112], 95.00th=[  124],
     | 99.00th=[  141], 99.50th=[  149], 99.90th=[  171], 99.95th=[  191],
     | 99.99th=[  644]
    bw (KB /s): min=39192, max=46832, per=100.00%, avg= , stdev=
    lat (usec) : 100=79.44%, 250=20.53%, 500=0.01%, 750=0.01%, 1000=0.01%
    lat (msec) : 2=0.01%, 4=0.01%
  cpu          : usr=2.15%, sys=11.96%, ctx=262146, majf=0, minf=32
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=0/w=262144/d=0, short=r=0/w=0/d=0, drop=r=0/w=0/d=0
     latency   : target=0, window=0, percentile=100.00%, depth=1
Run status group 0 (all jobs):
  WRITE: io=1024.0MB, aggrb=42442KB/s, minb=42442KB/s, maxb=42442KB/s, mint=24706msec, maxt=24706msec
Disk stats (read/write):
  vda: ios=0/261747, merge=0/0, ticks=0/22147, in_queue=22114, util=89.35%

New hardware has SSDs only
- To solve the recurrent I/O issues
New flavors
- To match the smaller disks
SSD caching …
- ZFS ZIL/l2arc in front of Cinder volumes?
bcache/dm-cache update:
- dm-cache for lxplus: worked OK
- we'll drop it nonetheless in favor of bcache (performance vs. operations; setup sketched below)
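As a rough sketch of the bcache approach mentioned above (device names are placeholders, not our actual layout):

# register the backing device (HDD) and the SSD as a cache set
make-bcache -B /dev/sdb
make-bcache -C /dev/nvme0n1

# look up the cache set UUID and attach it to the backing device
CSET_UUID=$(bcache-super-show /dev/nvme0n1 | awk '/cset.uuid/ {print $2}')
echo "$CSET_UUID" > /sys/block/bcache0/bcache/attach

# the cached device appears as /dev/bcache0 and can be benchmarked with fio as above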

Operations: Retirement Campaign
About 1'600 nodes to retire from the service by Q3 2016
- ~1'200 compute (started), ~400 with services
We have gained some experience with (manual) migration
- Live and cold (the basic commands are sketched below)
- Seems to work reliably (where it can)
We have developed a tool that can be instructed to drain a hypervisor (or simply live-migrate given VMs)
- Main tasks are VM classification and progress monitoring
- The nova scheduler will pick the target (nova patch)
We will use the "IP service bridging" mentioned at BNL
- See Carles Kishimoto Bisbe's talk for all details
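The underlying OpenStack operations behind such a drain can be sketched as follows (host and VM identifiers are examples; this is not the drain tool itself):

# take the hypervisor out of scheduling
openstack compute service set --disable --disable-reason "retirement" hv1234.cern.ch nova-compute

# live-migrate a VM where possible (the patched nova scheduler picks the target host)
nova live-migration 0b6c4b5e-example-uuid

# cold-migrate a VM that cannot be live-migrated (implies a restart of the VM)
nova migrate 0b6c4b5e-example-uuid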

Operations: Recent Issues
Unexpected VM shutdowns
- Symptoms: VMs shut down without reason, looks like 'shutdown -h now'
- Rare! ~1-2 VMs/month out of >15'000 VMs
- SLC6, CC7, no core dumps (well, almost …), VMs with volumes
- Seems we're hitting Ceph Issue 6480: use of non-thread-safe libnss functions
libvirtd updates can freeze VMs
- Symptoms: Nova is blocked (because libvirt is stuck), stuck messages in RabbitMQ, no operations get through
- Work-around: virsh destroy (sketched below)
Blocked I/Os on SSD-hosted VMs
- Symptoms: I/Os in the VMs completely blocked
- Happened only on the new SSD hypervisors
- Fix: update to a recent qemu-kvm version
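A sketch of the libvirtd freeze work-around (the domain name is an example; bringing the VM back via a hard reboot through Nova is our assumption, not spelled out on the slide):

# identify the stuck domain on the affected hypervisor
virsh list --all

# hard-stop it at the libvirt level
virsh destroy instance-0000abcd

# assumed follow-up: restart the instance through Nova
nova reboot --hard 0b6c4b5e-example-uuid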

Operations: Neutron Migration
We'll need to replace nova-network
- It's going to be deprecated (really, really, really this time)
We have a number of patches to adapt to the CERN network constraints
- We patched nova for each release …
- … neutron allows for out-of-tree plugins!
New potential features (tenant networks, LBaaS, floating IPs, …)

Operations: Neutron Status
We have a working Neutron cell …
- Neutron control plane in Liberty (fully HA)
- Bridge agent in Kilo (nova)
… but there is still quite some work to do
- IPv6, custom DHCP, network naming, …
"As mentioned, there is currently no way to cleanly migrate from nova-network to neutron."
- All efforts to establish a general migration path have failed so far
- Should be OK for us, various options (incl. in-place, with migration, …)

What’s new: Keystone 16Arne Wiebalck: CERN Cloud Report, 2016 HEPiX Spring Meeting at DESY Zeuthen Access to EduGAIN users via Horizon - Allow (limited) access given appropriate membership Additional project attributes - LanDB owner/responsible for extended access - More attributes being discussed (accounting groups) Introduced endpoint filtering - Allows access to features on a per tenant basis - Open service on demand (e.g. Magnum)

What’s new: EC2 API Project 17Arne Wiebalck: CERN Cloud Report, 2016 HEPiX Spring Meeting at DESY Zeuthen EC2 API support in nova was deprecated with Kilo, will be removed with Mitaka New EC2API project - CERN provided initial packaging and Puppet modules - Currently being tested with first users at CERN -

What’s new: Container Integration 18Arne Wiebalck: CERN Cloud Report, 2016 HEPiX Spring Meeting at DESY Zeuthen Magnum: OpenStack project to treat Container Orchestration Engines (COEs) as 1 st class resources Pre-production service available - Supporting Docker Swarm and Kubernetes for now Many users interested, usage ramping up - GitLab CI, Jupyter/Swan, FTS, … All details in Bertrand’s talk on Thursday!

Future Plans
Investigate Ironic (bare-metal provisioning)
- OpenStack as one interface for compute resource provisioning
- Allows for complete accounting
Replace Hyper-V with qemu/kvm?
- Successfully done at other sites
- Removes the dependency on Windows expertise
- Reduces complexity in the service setup

Summary
The cloud service continues to grow and mature
- While experimental, good experience with cells for scaling
- Experience gained helps with general resource provisioning
- New features added (federation, containers)
- Expansion planned (bare-metal provisioning)
Major operational challenges ahead
- Transparent retirement of service hosts
- Replacement of the network layer