AWS Integration in Distributed Computing

AWS Integration in Distributed Computing
Xianghu Zhao, IHEP Computing Center
2016 BESIIICGEM Cloud Computing Summer School

Content
- Why use AWS in HEP experiments
- How to use AWS
- AWS integration in distributed computing

Amazon Web Services (AWS)

Pricing Models

Free Tier for 12 Months
- New users can test AWS for free for 12 months

Flexible Resource Usage

Situation for HEP Experiments
- Computing requirements fluctuate a lot
  - Demand increases before some meetings
  - Analysis results are needed faster than competitors get theirs
- Deploying local resources takes time and manpower

Why Use Commercial Cloud
- Practically unlimited resources
- Mature and stable
- Reduces the work of maintaining a local site
- AWS as the first choice
  - AWS is the largest and most popular provider
  - CERN has studied and tested AWS extensively

How to Use AWS

AWS EC2 Instance Types
Predefined by AWS:
- General Purpose: T2 (Burstable Performance Instances), M4, M3
- Compute Optimized: C4, C3
- Memory Optimized: X1, R3
- GPU Instances: G2

Create Instance on Web Panel
- Log in to the AWS EC2 panel

Create Instance on Web Panel
- Storage is located on EBS (Elastic Block Store)
- Disks used by the instance
- SSD or magnetic

Create Instance on Web Panel
- Configure the firewall rules
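
On EC2, firewall rules are expressed as security group rules, so the same step can also be scripted. The following is a minimal sketch, assuming boto3 and an already existing security group; the helper names `ssh_ingress_rule` and `allow_ssh` are illustrative and do not come from the slides, and only IPv4 ranges are handled.

```python
import ipaddress


def ssh_ingress_rule(cidr):
    """Build an inbound rule that allows SSH (TCP port 22) from one IPv4 range."""
    # IPv4Network raises ValueError for malformed ranges or ranges with host bits set.
    network = ipaddress.IPv4Network(cidr)
    return {
        "IpProtocol": "tcp",
        "FromPort": 22,
        "ToPort": 22,
        "IpRanges": [{"CidrIp": str(network)}],
    }


def allow_ssh(ec2_client, group_id, cidr):
    """Add the SSH rule to a security group through a boto3 EC2 client."""
    return ec2_client.authorize_security_group_ingress(
        GroupId=group_id,
        IpPermissions=[ssh_ingress_rule(cidr)],
    )
```

With real credentials this could be called as `allow_ssh(boto3.client("ec2", region_name="us-east-1"), "sg-0123456789abcdef0", "192.0.2.0/24")`, where the region, group ID and address range are placeholders.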

Manage the Access Key
- Create an access key in the IAM panel
- Get the access key and secret key for the AWS command-line tools and API
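
Once the key pair exists, the AWS CLI and boto3 can read it from the shared credentials file (`~/.aws/credentials`) or from the environment variables `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY`. The sketch below only renders the text of such a file; `render_credentials` is a hypothetical helper and the key values are placeholders to be replaced by the ones shown in the IAM panel.

```python
import configparser
import io


def render_credentials(access_key, secret_key, profile="default"):
    """Return the text of an AWS shared credentials file with one profile."""
    # Interpolation is disabled so that key characters are written verbatim.
    config = configparser.ConfigParser(interpolation=None)
    config[profile] = {
        "aws_access_key_id": access_key,
        "aws_secret_access_key": secret_key,
    }
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()
```

The returned text can be saved as `~/.aws/credentials` with permissions that keep the secret key readable only by its owner.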

Command Tools for AWS
- AWS Command Line Interface
  - The official, full-featured AWS CLI
  - Install with pip: pip install awscli
- Amazon EC2 API Tools (ec2-api-tools)
  - Official EC2 tools written in Java
- euca2ools
  - Compatible with the Amazon EC2 and IAM APIs

EC2 SDK Example
- boto is the official AWS SDK for Python, with full functionality
- A simple example using the boto3 SDK to access AWS EC2
- Lists all private images and instances
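
The code shown on this slide is not part of the transcript. A minimal sketch of such a listing, assuming boto3 and reading "private images" as images owned by the calling account, might look as follows; the function names are illustrative, and accounts with very many instances may need a paginator instead of a single call.

```python
def list_private_images(ec2_client):
    """Return (image ID, name, state) for images owned by this account."""
    response = ec2_client.describe_images(Owners=["self"])
    return [
        (image["ImageId"], image.get("Name", ""), image["State"])
        for image in response["Images"]
    ]


def list_instances(ec2_client):
    """Return (instance ID, type, state) for the instances in the client's region."""
    response = ec2_client.describe_instances()
    return [
        (inst["InstanceId"], inst["InstanceType"], inst["State"]["Name"])
        for reservation in response["Reservations"]
        for inst in reservation["Instances"]
    ]


def make_ec2_client(region_name):
    """Create a boto3 EC2 client; credentials come from the environment or config files."""
    import boto3  # imported lazily so the helpers above can be tested without boto3

    return boto3.client("ec2", region_name=region_name)
```

A typical call would be `list_instances(make_ec2_client("us-east-1"))`, where the region name is a placeholder.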

AWS Integration in Distributed Computing

Virtual Machine Scheduler
- DIRAC provides the job scheduler
- VMDIRAC provides the virtual machine scheduler
- Supports AWS EC2 through the boto SDK
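
The core decision of a virtual machine scheduler is how many new VMs to start for the jobs that are waiting. The sketch below is a deliberately simplified, hypothetical policy for illustration only; it is not VMDIRAC's actual algorithm, and the function name and parameters are assumptions.

```python
import math


def vms_to_start(waiting_jobs, running_vms, max_vms, jobs_per_vm=1):
    """Number of new VMs to launch for the waiting jobs, capped by the site limit."""
    if waiting_jobs < 0 or running_vms < 0 or max_vms < 0 or jobs_per_vm < 1:
        raise ValueError("counts must be non-negative and jobs_per_vm at least 1")
    # VMs needed if every new VM runs jobs_per_vm jobs at once (e.g. one per core).
    needed = math.ceil(waiting_jobs / jobs_per_vm)
    # Room left under the configured maximum number of VMs.
    free_slots = max(max_vms - running_vms, 0)
    return min(needed, free_slots)
```

For example, with 10 waiting jobs, 2 running VMs, a limit of 5 VMs and 2 jobs per VM, the policy asks for 3 new VMs. A production scheduler would also account for VMs that are still booting and for idle VMs that should be shut down.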

Detailed Configuration
- Import image
  - No public image is suitable for the BESIII software environment
  - Import with ec2-api-tools (ec2-import-instance)
- Test the image and network
  - Create a squid instance for caching HTTP requests
- Python boto SDK test
  - Create access and secret key
- Configuration in VMDIRAC and add new site
  - Add simple support for multi-core instances

AWS Test with BESIII Software
- AWS EC2 runs the computing tasks; output data are transferred back to IHEP grid storage
- Tested with BESIII simulation, reconstruction and analysis jobs
- 600 jobs finished, 10 GB of data transferred back to IHEP
- High success rate, close to 100%
- Computing efficiency and data transfer are reliable
[Figure: number of jobs in several submissions]

AWS Performance Test
- Comparison between different instance types
- c3 instances are best for BESIII computing
  - Higher computing efficiency
  - Comparatively low price
- Computing efficiency is comparable to that of the local physics server
- Local server CPU: E5-2630 v3

Instance type   Simulation (s/event)   Reconstruction (s/event)   Analysis (s/event)   CPU usage
t2.micro        4.08                   1.61                       0.0357               86.5%
m3.medium       1.03                   0.32                       0.0073               95.7%
c3.large        0.64                   0.21                       0.0044               95.6%
Local Server    0.40                   0.13                       0.0028               99.5%
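
To compare the instance types at a glance, the per-event times from the table can be summed over the three steps and divided by the local server's total. This is a small worked calculation on the table values only; the names `TIMES`, `total_time` and `slowdown` are illustrative.

```python
# Seconds per event (simulation, reconstruction, analysis), copied from the table.
TIMES = {
    "t2.micro": (4.08, 1.61, 0.0357),
    "m3.medium": (1.03, 0.32, 0.0073),
    "c3.large": (0.64, 0.21, 0.0044),
    "Local Server": (0.40, 0.13, 0.0028),
}


def total_time(name):
    """Seconds per event for the full sim + rec + ana chain."""
    return sum(TIMES[name])


def slowdown(name, reference="Local Server"):
    """How many times slower than the reference machine the chain runs per event."""
    return total_time(name) / total_time(reference)


if __name__ == "__main__":
    for name in TIMES:
        print(f"{name:13s} {total_time(name):7.4f} s/event  x{slowdown(name):.1f}")
```

By this measure c3.large needs about 1.6 times as long per event as the local server, m3.medium about 2.5 times, and t2.micro about 10.7 times.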

AWS Billing Analysis
- Enable the billing report in the web panel to get detailed billing information in S3
- Test with a c3.large instance running for about 4 hours
- EC2 accounts for most of the bill
- Taking a BESIII MC job (sim+rec+ana) as an example, 1000 rhopi events cost 0.20 RMB

Item                    Billing (CNY)   Percentage
Data Transfer           1.60            2%
EC2 c3.large Instance   73.60           92%
EBS I/O Requests        2.40            3%
EBS Storage
Other                   -
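
The table's figures can be cross-checked with a little arithmetic. The sketch below estimates the total bill from the EC2 line and its quoted share, then derives what is left for the rows without values. Because the percentages in the table are rounded, the results are approximate; the names are illustrative.

```python
# Line items with values in the billing table, in CNY.
KNOWN_ITEMS = {
    "Data Transfer": 1.60,
    "EC2 c3.large Instance": 73.60,
    "EBS I/O Requests": 2.40,
}
EC2_SHARE = 0.92  # share of the EC2 line as quoted in the table (rounded)


def estimated_total():
    """Total bill implied by the EC2 cost and its quoted share."""
    return KNOWN_ITEMS["EC2 c3.large Instance"] / EC2_SHARE


def shares(total):
    """Fraction of the total for each listed item."""
    return {name: cost / total for name, cost in KNOWN_ITEMS.items()}


def unlisted_remainder(total):
    """Amount left for the rows that carry no value in the table."""
    return total - sum(KNOWN_ITEMS.values())
```

Under this reading the total is roughly 80 CNY, the listed shares of 2% and 3% are reproduced, and roughly 2.40 CNY remains for EBS storage and other charges combined.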

Possible Usage in Future
- A good complement to the local resources
- Spot instances
  - Get computing resources at a much lower price
  - Requires adjusting the virtual machine and job scheduling policy
  - Could also require the physics software to change its computing model
- Storage
  - Store data on S3/Glacier
  - Not used in the previous test
  - High price and security considerations
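
For the spot instance option, a request could be scripted with boto3's `request_spot_instances` call. The following is a minimal sketch under stated assumptions: the helper names are hypothetical, the image ID, instance type and maximum price are supplied by the caller, and a real setup would also need key pair, security group and handling of interrupted instances.

```python
def spot_request_params(image_id, instance_type, max_price, count=1):
    """Build keyword arguments for a one-time spot instance request."""
    if max_price <= 0 or count < 1:
        raise ValueError("max_price must be positive and count at least 1")
    return {
        # The API expects the maximum hourly price as a string.
        "SpotPrice": f"{max_price:.4f}",
        "InstanceCount": count,
        "Type": "one-time",
        "LaunchSpecification": {
            "ImageId": image_id,
            "InstanceType": instance_type,
        },
    }


def request_spot(ec2_client, image_id, instance_type, max_price, count=1):
    """Submit the spot request through a boto3 EC2 client."""
    params = spot_request_params(image_id, instance_type, max_price, count)
    return ec2_client.request_spot_instances(**params)
```

Since spot instances can be reclaimed by AWS, jobs scheduled on them should be short or restartable, which is the scheduling-policy adjustment mentioned above.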

Thanks!