Boyd Wilson (Boydw at Omnibond dot com)

Presentation transcript:

Cloud HPC Overview
Boyd Wilson (Boydw at Omnibond dot com)
March 2016

Outline Overview Public Cloud History -> Present Problem Historical Issues Security / Compliance Cost Problem Building Blocks of HPC & Big Data Clusters AWS Security Networking Compute Instance Storage How It's Built Compute Groups & Schedulers High Performance Working or Scratch Storage Storage Access Software Collaboration Demo

Public Cloud History -> Present
Initially, the discussion around the public cloud compared it to on-premises virtualization.
What people are learning now is that the public cloud is really a dynamic, API-driven infrastructure.
Forward-looking companies have used this API-driven infrastructure to leap ahead: Netflix, Airbnb, Yelp, Expedia, Adobe, Pinterest, Zynga, Gilt, MLBAM, Slack, Foursquare, Lyft, Dow Jones, Bristol-Myers Squibb, etc.

Security & Compliance of the Public Cloud
CSA, ISO 9001, ISO 27001, ISO 27018, MPAA, CJIS, DIACAP, DoD, FDA, FedRAMP, FERPA, FIPS, HIPAA, GxP, ITAR, NIST, EU Data Protection, IT-Grundschutz, G-Cloud, Malaysian Privacy Consideration, MLPS, MTCS, Singapore Privacy Consideration, IRAP, New Zealand Privacy Consideration

Costs of Public Clouds (2013-2014) Source: RBC Capital Markets, Company Reports

Economies of Scale? Computers = public facing http://news.netcraft.com/archives/2013/05/20/amazon-web-services-growth-unrelenting.html

Price War II is coming
AWS: the gorilla in the space. Gartner, May 2015: “10x bigger than its next 14 competitors combined”, “5x the cloud capacity in use than the aggregate total of the other 14 providers”.
Azure: investing heavily. Just released ARM (not the processor). Supports IB.
Google: silently releasing more and more AWS-like services.

Confessions of a Former Data Center Director
Power and cooling (kWh)
Compute capacity (cost per GB RAM)
Storage (cost per TB)
Network (cost per port)
Costs are always calculated at max utilization.
Let's not discuss labor, it's a sunk cost…
Time to use (depreciation): per-unit costs go down as more people use a resource, so there is a cost associated with delayed adoption.
Headroom: tape library example
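
The point about costs being calculated at max utilization can be illustrated with a little arithmetic. The sketch below is only an illustration: the function name and the figure of 100.0 per TB are hypothetical, not numbers from the talk. It shows how the cost per unit actually used grows as utilization falls.

```python
def effective_unit_cost(cost_at_max: float, utilization: float) -> float:
    """Cost per unit actually used, given the per-unit cost quoted at 100% utilization."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in the range (0, 1]")
    return cost_at_max / utilization


# Hypothetical figure: storage quoted at 100.0 per TB when fully utilized.
for u in (1.0, 0.5, 0.25):
    print(f"utilization {u:.0%}: {effective_unit_cost(100.0, u):.2f} per TB used")
```

Under these assumptions, a resource that is only half used costs twice the quoted figure per unit of work actually done.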

Social Side of Funding: Open Questions (Conversation with Rick)
What are the break-even costs vs. a local resource?
How do we compare at scale?
How does a site ramp up to the cloud (training)?
How would funding look if certain places went public cloud only?
How would funding agencies recognize the public cloud with respect to funding?

Example AWS EC2 Instance Pricing

Building Blocks

How to pull it all together

CloudyCluster Goals
On-demand HPC and Big Data resources in AWS: compute instances, high-performance storage, choice of schedulers (initially Torque)
Simple deployment, pausing, and deletion from phone, tablet, or desktop
Elastic HPC based on jobs submitted with CCQ
Available in the AWS Marketplace with a pay-as-you-go model (payment goes through Amazon)

Security
Amazon VPC (Virtual Private Cloud): provides a network security layer; public and private subnet options; requires a bastion host for SSH access.
IAM Roles and Permissions: assign roles permissions to create and interact with AWS constructs; option to assign roles to instances, enabling them to perform actions via APIs.
Security Groups: provide restrictions on network interfaces.

Networking
Subnets: public and private options within a VPC; the subnet mask is dynamically calculated based on the number of instances requested.
NAT Instances: provide external (Internet) access for instances in the VPC.
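
The talk does not say how the subnet mask is derived from the instance count, so the following is only a sketch of one plausible calculation. It assumes the standard AWS rules that every VPC subnet reserves 5 addresses and that IPv4 subnets range from /28 to /16; the function name is hypothetical.

```python
AWS_RESERVED_PER_SUBNET = 5   # AWS reserves 5 addresses in every VPC subnet
MIN_HOST_BITS = 4             # smallest VPC subnet is a /28 (16 addresses)
MAX_HOST_BITS = 16            # largest VPC subnet is a /16


def subnet_prefix_for(instances: int) -> int:
    """Return the longest IPv4 prefix whose subnet can hold `instances` hosts."""
    if instances < 1:
        raise ValueError("at least one instance is required")
    needed = instances + AWS_RESERVED_PER_SUBNET
    # Smallest power of two >= needed, expressed as a number of host bits.
    host_bits = max(MIN_HOST_BITS, (needed - 1).bit_length())
    if host_bits > MAX_HOST_BITS:
        raise ValueError("too many instances for a single VPC subnet")
    return 32 - host_bits


print(subnet_prefix_for(100))  # 100 instances + 5 reserved fit in a /25
```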

Compute
Amazon Machine Image (AMI): is the unit of compute; can be of many OS types and flavors; can have the needed software preinstalled; customers can add their own software and save a new AMI.
Compute Instances: the running instantiation of an AMI; option to create an Auto Scaling group with policies for increasing or decreasing the number of instances based on workload through CCQ; roles can be assigned to instances, enabling software on an instance to perform AWS actions via APIs.

Instance Storage & Metadata
Elastic Block Store (EBS) Volume: attached to an instance for block-level storage; IOPS can be configured at creation time.
Local Instance Storage: storage volumes available on the local instance; SSD and rotational types of varying sizes, depending on instance type.
EFS
DynamoDB: NoSQL data service provided by AWS.

How CloudyCluster is Built

Compute Groups & Schedulers
Compute Groups: a CloudyCluster compute group is an AWS Auto Scaling group of a given instance type, with all instances configured to work with the same scheduler.
Schedulers: CloudyCluster provides options for a scheduler for one or more compute groups. Torque/Maui, Slurm, and SGE are planned scheduler options; the initial release supports Torque/Maui. Compute groups are automatically registered with the corresponding scheduler as they are added.
Compute scaling option for compute groups: if a scheduler is configured for elastic scaling through the CCQ dispatcher, jobs drive instance launching and post-job termination automatically.
Diagram labels: Compute Group 1 (C-1 to C-4), C-5 to C-8, Utility, Scheduler, Torque1, SGE1, Condor1
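
To make the idea of jobs driving instance launches concrete, here is a minimal sketch of how a desired group size could be derived from the queue. It is not CCQ's actual algorithm, which the slides do not describe; the function name, the whole-instances-per-job rule, and the cap are assumptions made for illustration.

```python
from typing import Iterable


def instances_needed(job_core_requests: Iterable[int],
                     cores_per_instance: int,
                     max_instances: int) -> int:
    """Instances a compute group would need if every queued job gets whole instances."""
    if cores_per_instance < 1 or max_instances < 0:
        raise ValueError("invalid group configuration")
    # Ceiling division per job: a 16-core job on 8-core instances needs 2 instances.
    wanted = sum(-(-cores // cores_per_instance) for cores in job_core_requests)
    # Never ask the group for more than its configured maximum.
    return min(wanted, max_instances)


# Hypothetical queue: three jobs requesting 16, 8, and 4 cores on 8-core instances.
print(instances_needed([16, 8, 4], cores_per_instance=8, max_instances=10))
```

The result would be the desired capacity handed to the Auto Scaling group; an empty queue yields zero, which corresponds to scaling the group down after jobs finish.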

Working / Scratch and Home Storage
Working / Scratch Storage: CloudyCluster working or scratch storage automatically combines multiple instances and EBS storage into a unified high-performance parallel file system. Future versions will allow storage across the compute instances' local storage. There is an option to configure automatic failover instances if an instance dies. The working / scratch file system is automatically mounted on every node in the cluster. There is an option to make the working / scratch file system and/or EFS available over WebDAV (via a Login Instance).
Other Storage: CloudyCluster offers EFS support (NFS service by AWS). The EFS file system is automatically mounted on every compute node if selected.
Diagram labels: OrangeFS1 (WS-1 to WS-8), Utility, EFS

Storage Access
Storage Access: option to make working / scratch and EFS storage automatically accessible from WebDAV. Data integration can also be accomplished with Globus as a supported storage access method. iRODS will be integrated in the future. Future development targets simplified CloudyCluster data loading and results retrieval.
DynamoDB: stores metadata for CloudyCluster, namely user and collaboration data and CCQ job and data.
Diagram labels: OrangeFS1 (WS-1 to WS-8), Utility, Access Instance, EFS, DynamoDB

All Together
Diagram components: Public Subnet, Login Instance (WebDAV, Globus), NAT, Scheduler, Compute Groups, DynamoDB, highly available Working / Scratch OrangeFS storage, Management Instance

CCQ - Elastic HPC Dispatching
The job is submitted through CCQ (from the Login Instance).
CCQ holds the job, then determines and launches the instances needed.
CCQ sends the job to the scheduler when ready.
The scheduler launches the job normally.
If there are no jobs in the queue for that instance type near the hour, the instances are terminated.
Diagram labels: Public Subnet, Login Instance (WebDAV), Scheduler, DynamoDB
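
The termination rule on this slide can be sketched as a small decision function. This is an illustration rather than CCQ's implementation: it assumes billing hours are counted from instance launch and that "near the hour" means within a five-minute window, and both the function name and the window are hypothetical.

```python
from datetime import datetime, timedelta


def should_terminate(launched_at: datetime,
                     now: datetime,
                     queued_jobs: int,
                     window: timedelta = timedelta(minutes=5)) -> bool:
    """True if the instance is idle and close to the end of its current hour."""
    if queued_jobs > 0:
        return False  # work is waiting for this instance type, keep it running
    elapsed = (now - launched_at).total_seconds()
    remaining_in_hour = 3600 - (elapsed % 3600)
    return remaining_in_hour <= window.total_seconds()


launch = datetime(2016, 3, 1, 12, 0)
print(should_terminate(launch, launch + timedelta(minutes=57), queued_jobs=0))
print(should_terminate(launch, launch + timedelta(minutes=30), queued_jobs=0))
```

Waiting until late in the hour keeps an already-paid-for instance available in case new jobs arrive for it.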

HPC Software Included
Ambertools, ANN, ATLAS, BLAS, Blast, Blender, Burrows-Wheeler Aligner, CESM, GROMACS, LAMM, NCAR NCL, NCO, Nwchem, OpenFoam, PAPI, Paraview, Quantum Espresso, SAMtools, WRF
You can also add your own (ex. EMC 2-Tier)

HPC Infrastructure and Libs Included
Boost, CUDA Toolkit, Docker, FFTW, FLTK, GCC, Gengetopt, GRIB2, GSL, Hadoop, HDF5, ImageMagick, JasPer, NetCDF, NumPy, Octave, OpenCV, OpenMPI, PROJ, R, Rmpi, SciPy, SWIG, WGRIB, UDUNITS

Collaboration
Ability to create collaborations
Invite other collaborators to CloudyCluster
Initially can share Google Drive folders
OAuth and Shibboleth (InCommon) support

Other Items
To run CloudyCluster, you may have to ask for some of the initial AWS limits to be raised:
The number of instances is initially limited to 20. Some instance types are limited further. Read up on limits before you attempt to spin up a larger cluster.
The number of EIPs is limited to 5.
VPCs are limited to 5 per region (each cluster requires a VPC).
Billing alerts are good for general cloud usage.
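
Before launching a larger cluster, the limits listed on this slide can be checked against the planned size. The sketch below hard-codes the initial limits quoted above; the dictionary keys and function name are hypothetical, and real accounts may have different limits, so treat the numbers as a starting assumption only.

```python
# Initial limits as quoted on the slide (actual account limits may differ).
INITIAL_LIMITS = {"instances": 20, "elastic_ips": 5, "vpcs_per_region": 5}


def limits_to_raise(planned: dict, limits: dict = INITIAL_LIMITS) -> list:
    """Return the names of limits that the planned usage would exceed."""
    return sorted(name for name, count in planned.items() if count > limits[name])


# Hypothetical plan: a 32-instance cluster using 2 EIPs in its own VPC.
plan = {"instances": 32, "elastic_ips": 2, "vpcs_per_region": 1}
print(limits_to_raise(plan))
```

Any name returned is a limit increase to request from AWS before attempting the launch.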

Demo
See CloudyCluster.com for videos, docs, and the Quickstart Guide.

Conclusion
Imagine a researcher who needs to solve the next big problem in their area of expertise and who can go to a public cloud marketplace to launch, manage, and easily maintain a high-performance computing infrastructure, getting the results they need without going through long procurement processes.
Now multiply this by the number of people staring at their monitors wondering how they could possibly compute what is needed.
…and the advancements and innovation in the world just exponentially increased.

Omnibond.com Info at: CloudyCluster.com

Solution Areas
Intelligent Transportation Solutions, Identity Manager Drivers & Sentinel Connectors, Parallel Scale-Out Storage Software, Social Media Interaction System