ATLAS Cloud Operations

Presentation transcript:

ATLAS Cloud Operations Frank Berghaus (University of Victoria) On behalf of the ATLAS Cloud Computing Group

Overview
- Review of the Cloud Scheduler system
- Worker node virtual machines
- Squid discovery with Shoal
- Recommendations and procedure for adding clouds
Dec 4, 2014 ATLAS Facilities Jamboree

Cloud Job Flow (on the Grid)
[Diagram labels: User, User Job, Panda Queue, Pilot Factory, Pilot Job, HTCondor, Cloud Scheduler, Boot Instances, Clouds, Virtual Machine]
- Easy to connect and use many clouds

CernVM 3 Worker Nodes
Features:
- Operating system and project software are made available over CVMFS
- Cloud-init contextualizes images on boot
- Same image works anywhere, on any hypervisor, and on any cloud type
  https://github.com/berghaus/userdata
- Dynamic Condor slot configuration
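As an illustration of contextualization on boot, the following is a minimal sketch of how cloud-init user data for a worker VM could be generated. It is not taken from the userdata repository linked above. The file path, the placeholder host name `condor.example.org`, and the choice of a single partitionable slot as one reading of "dynamic slot configuration" are all assumptions.

```python
def build_user_data(central_manager: str) -> str:
    """Return a cloud-config document that writes an HTCondor worker config.

    Assumptions (not from the slides): the config path under
    /etc/condor/config.d/ and the use of one partitionable slot that
    covers the whole VM.
    """
    condor_lines = [
        f"CONDOR_HOST = {central_manager}",   # where the startd reports to
        "DAEMON_LIST = MASTER, STARTD",       # worker node runs no schedd
        "NUM_SLOTS_TYPE_1 = 1",
        "SLOT_TYPE_1 = 100%",                 # one slot owning all resources
        "SLOT_TYPE_1_PARTITIONABLE = True",   # carved up per job at run time
    ]
    # Indent the file body so it sits inside the YAML block scalar.
    indented = "\n".join("      " + line for line in condor_lines)
    return (
        "#cloud-config\n"
        "write_files:\n"
        "  - path: /etc/condor/config.d/99-worker.conf\n"
        "    permissions: '0644'\n"
        "    content: |\n"
        f"{indented}\n"
    )


user_data = build_user_data("condor.example.org")  # hypothetical host name
```

The resulting string would be passed as user data when booting the instance; the same document can then be reused on any cloud that supports cloud-init.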

Shoal: Dynamic Squid Discovery
- Ready for larger-scale deployment: installation instructions are available
- Current server: http://shoal.heprc.uvic.ca/
- Connected squids: UVic, TRIUMF, Oxford, CERN Cloud
- Included with CernVM since release 3.2
- Meets the requirements of the squid discovery task force
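The idea of dynamic squid discovery can be sketched as choosing the closest registered squid for a given worker node. The slides do not state how Shoal ranks squids, so ranking by great-circle distance is an assumption here, and the squid names and coordinates are made-up placeholders, not real sites.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points in degrees."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def nearest_squid(client, squids):
    """Pick the squid record closest to the client (lat, lon) tuple."""
    return min(
        squids,
        key=lambda s: haversine_km(client[0], client[1], s["lat"], s["lon"]),
    )


# Hypothetical registry entries; a real registry would be filled by squid agents.
registry = [
    {"name": "squid-a", "lat": 0.0, "lon": 0.0},
    {"name": "squid-b", "lat": 50.0, "lon": 10.0},
]
chosen = nearest_squid((48.0, 9.0), registry)
```

A worker VM would then point its CVMFS proxy setting at the chosen squid instead of relying on a proxy hard-coded into the image.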

Recommendations for New Clouds
Reference from the ATLAS Cloud Ops wiki: AtlasCloudSiteGuide
- Use OpenStack (StratusLab is being worked on as an option)
- Use CernVM
- Install the Shoal agent on your local squid
- Test booting and connecting to an instance
- Test your network connectivity to the CERN cloud scheduler: aiatlas009.cern.ch
- Expose your cloud interface to allow requests from the cloud scheduler
- Create a service account for the ATLAS cloud scheduler
- Email the Cloud Ops Team: atlas-adc-cloudcomputing-ops@cern.ch
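The network connectivity test recommended above could be approximated with a plain TCP connection attempt. This is a minimal sketch: the slides name the host but not the port or protocol, so the port must be supplied by the site after confirming it with the Cloud Ops Team.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

For example, `can_connect("aiatlas009.cern.ch", port)` with the agreed port reports whether the scheduler host is reachable from the site. A successful TCP connection shows only that the route and firewall are open, not that the service behind it accepts requests.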

Procedure for Adding a New Site
Once the cloud team has the credentials and endpoint for the new cloud, we will:
- Create a Panda queue or site to associate with your cloud
- Add single-core, multi-core, analysis, and/or high-memory queues
- Run a set of HammerCloud tests
- Start running jobs

Summary & Outlook
- ATLAS production and analysis are running on IaaS clouds
- Analysis usage is limited by available manpower
- Ready to add more cloud sites

Backup

Infrastructure-as-a-Service (IaaS) Clouds
IaaS cloud: a pool of virtual machine hypervisors presenting a single controller interface
- Run many instances of one virtual machine configured for ATLAS computing
Advantages:
- Isolate complex application software from site administration
- Minimize dependence on local system
- Flexible resource allocation
Examples:
- OpenStack
- Nimbus
- Commercial clouds: Amazon, Google, etc.
Running at labs (e.g., CERN), universities (e.g., Victoria), and research networks (e.g., GridPP)

Cloud Scheduler
[Diagram labels: User, Scheduler status communication, Cloud Scheduler, Condor Central Manager, Cloud Interface (x3), Google, OpenStack (x2), Virtual Machine (x6)]
- Clouds boot virtual machines
- Each VM contextualizes to attach to Condor and then processes jobs
- Cloud Scheduler retires a VM when no jobs require that VM
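The boot and retire behaviour described on this slide can be sketched as one scheduling pass over job demand. This is a toy model, not the Cloud Scheduler code. The `Cloud` class, the one-VM-per-job rule, and the first-cloud-with-free-capacity placement are simplifying assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Cloud:
    name: str
    max_vms: int
    running: list = field(default_factory=list)  # VM type of each booted instance


def schedule(demand, clouds):
    """Run one pass; `demand` maps VM type -> number of jobs needing that type.

    Returns (boots, retires), each a list of (cloud name, VM type).
    """
    boots, retires = [], []
    # Retire every VM whose type is no longer required by any job.
    for cloud in clouds:
        for vm_type in list(cloud.running):
            if demand.get(vm_type, 0) == 0:
                cloud.running.remove(vm_type)
                retires.append((cloud.name, vm_type))
    # Boot VMs for unmet demand on the first cloud that has free capacity.
    for vm_type, n_jobs in demand.items():
        have = sum(c.running.count(vm_type) for c in clouds)
        for _ in range(max(0, n_jobs - have)):
            target = next((c for c in clouds if len(c.running) < c.max_vms), None)
            if target is None:
                break  # every cloud is full; remaining jobs wait
            target.running.append(vm_type)
            boots.append((target.name, vm_type))
    return boots, retires


# Hypothetical clouds and VM type name, for illustration only.
clouds = [Cloud("cloud-a", max_vms=2), Cloud("cloud-b", max_vms=1)]
first_boots, first_retires = schedule({"atlas-worker": 4}, clouds)
second_boots, second_retires = schedule({}, clouds)
```

With four jobs and three VM slots in total, the first pass boots three instances and leaves one job waiting. Once demand drops to zero, the second pass retires all three.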

ATLAS Cloud Production in 2014
- Over 1.2M ATLAS jobs completed
- Mostly single-core production
- New ATLAS requirements: multi-core, high memory
[Plot: Completed Jobs in 2014 (1.2M), Jan 2014 to Sep 2014; labels: CERN, North America, Australia, UK, Helix Nebula, Google, Amazon]