Agenda Who am I? Whirlwind introduction to the Cloud

Slides:



Advertisements
Similar presentations
Making Fly Parviz Deyhim
Advertisements

Gold Sponsors Bronze Sponsors Silver Sponsors Taking SharePoint to the Cloud Aaron Saikovski Readify – Software Solution Specialist.
B. Ramamurthy 4/17/ Overview of EC2 Components (fig. 2.1) 10..* /17/20152.
Lecture 12: Cloud Computing-C Amazon Web Service Tutorial.
1 NETE4631 Cloud deployment models and migration Lecture Notes #4.
Infrastructure as a Service (IaaS) Amazon EC2
Deploying an Application on the Cloud Chapter 4. Topics Your experience with Google App Engine and mine with Pop!World Web application Architecture Machine.
Matt Bertrand Building GIS Apps in the Cloud. Infrastructure - Provides computer infrastructure, typically a platform virtualization environment, as a.
Chien-Chung Shen Google Compute Engine Chien-Chung Shen
© 2009 IBM Corporation ® IBM Software Group Introduction to Cloud Computing Vivek C Agarwal IBM India Software Labs.
Google AppEngine. Google App Engine enables you to build and host web apps on the same systems that power Google applications. App Engine offers fast.
Amazon EC2 Quick Start adapted from EC2_GetStarted.html.
VIR314. Understand the scenarios Application support Understand the scenarios Application support Review of the sequencing process Demo Review of the.
Experiences with AWS and RightScale By: Max Gribov Presented at New York PHP, March 22, 2011
Cloud computing is the use of computing resources (hardware and software) that are delivered as a service over the Internet. Cloud is the metaphor for.
Accessing the Amazon Elastic Compute Cloud (EC2) Angadh Singh Jerome Braun.
© 2010 VMware Inc. All rights reserved Confidential VMware vFabric Data Director Powering Database-as-a-Service for Oracle, SQL Server, Hadoop and vFabric.
From Virtualization Management to Private Cloud with SCVMM 2012 Dan Stolts Sr. IT Pro Evangelist Microsoft Corporation
Eucalyptus 3 (&3.1). Eucalyptus 3 Product Overview – Govind Rangasamy.
Conversing in the Cloud Ryan Kupfer, Scott Wetter, Bryan Welfel, Shekhar Pradhan.
ArcGIS Server for Administrators
Continuous Delivery on AWS
Ubuntu, SUSE, OpenSUSE, CentOS & Oracle EL + hundreds on VM Depot Bring your own framework! Ecosystem Supported Microsoft 1st Party Support.
A Technical Overview Bill Branan DuraCloud Technical Lead.
Deploying Docker Datacenter on AWS © 2016, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Tekslate Introduction to AWS. Introduction to Cloud Computing Cloud computing is the on-demand delivery of IT resources and applications via the Internet.
If it’s not automated, it’s broken!
Prof. Jong-Moon Chung’s Lecture Notes at Yonsei University
Windows 2012R2 Hyper-V and System Center 2012
Connected Infrastructure
data & analytics beyond dashboards
100% Exam Passing Guarantee & Money Back Assurance
Efficient development and deployment of Hydra projects using Vagrant
AWS Integration in Distributed Computing
DocFusion 365 Intelligent Template Designer and Document Generation Engine on Azure Enables Your Team to Increase Productivity MICROSOFT AZURE APP BUILDER.
Customized cloud platform for computing on your terms !
Deploy, Manage, and Scale Your Apps with OpsWorks, Elastic Beanstalk, and CodeDeploy Part 1 – Elastic Beanstalk © 2017 Amazon Web Services, Inc. and.
Connect Your Apps and Automate Your Business: OneSaas on Microsoft Azure Empowers SMBs to Save Time and Make Better Business Decisions MICROSOFT AZURE.
AWS Tutorials i2c Lab.
Cloud Adoption Framework
EPAM Cloud Orchestration
Platform as a Service.
Couchbase Server is a NoSQL Database with a SQL-Based Query Language
Connected Infrastructure
9/13/2018 © 2014 Microsoft Corporation. All rights reserved. Microsoft, Windows, and other product names are or may be registered trademarks and/or trademarks.
Acutelearn Amazon Web Services Training Classroom Training Instructor led trainings at Acutelearn premises Corporate Training Custom tailored trainings.
OpenNebula Offers an Enterprise-Ready, Fully Open Management Solution for Private and Public Clouds – Try It Easily with an Azure Marketplace Sandbox MICROSOFT.
Cloud Computing ISY143.
Take Control of Insurance Product Management: Build, Test, and Launch Any Product Globally 10x Faster, 10x More Cheaply with INSTANDA on Azure Partner.
AWS DevOps Engineer - Professional dumps.html Exam Code Exam Name.
2018 Amazon AWS DevOps Engineer Professional Dumps - DumpsProfessor
Get Amazon AWS-DevOps-Engineer-Professional Exam Real Questions - Amazon AWS-DevOps-Engineer-Professional Dumps Realexamdumps.com
Microsoft Virtual Academy
Managed Public Cloud Services
Logsign All-In-One Security Information and Event Management (SIEM) Solution Built on Azure Improves Security & Business Continuity MICROSOFT AZURE APP.
Data Security for Microsoft Azure
AdQ is Azure-Powered Pre-Roll Ad Management Software That Improves Pre-Roll Ad Performance, Increases Profits, and Optimizes User Experience MICROSOFT.
Introduction to Apache
Microsoft Virtual Academy
Syllabus and Introduction Keke Chen
In this session… Introduce what we’re talking about
Managing Services with VMM and App Controller
Microsoft Virtual Academy
Lecture 16B: Instructions on how to use Hadoop on Amazon Web Services
Different types of Linux installation
Deploying Your First Full Stack Application to the Cloud
Building Serverless Enterprise Applications
Agenda Need of Cloud Computing What is Cloud Computing
Ready Pre-day Azure Monitoring Workshop
Setting up PostgreSQL for Production in AWS
Presentation transcript:

Automating Cloud Cluster Deployment: Beyond the Book Bill Havanki, Cloudera

Agenda Who am I? Whirlwind introduction to the Cloud How things take time Automation techniques and tools Cluster usage patterns from automation Dragons

Who am I? Bill Havanki Working for Cloudera ~4 years now Worked in US government contracting for ~15 years before that Originally from New Jersey Wrote a book about Hadoop and Cloud

You probably already know. MOVING ON What is Hadoop? You probably already know. MOVING ON

instances networking storage What is the Cloud? "The cloud" is a large set of computing resources made available by a cloud provider for customers to use and control for general purposes. starting stopping modifying AWS Google MS Azure Hadoop!

What is Hadoop in the Cloud? Hadoop in the Cloud means running Hadoop clusters on resources offered by a cloud provider.

Key cloud resource concepts c = a + b Instances Networking Storage

Easy! Through web consoles, API, or CLI! Resource Creation Easy! Through web consoles, API, or CLI! https://www.youtube.com/watch?v=0OIBMBteXcI

Faster, Must Go Faster cluster admins analysts Manually is fine for trying things out, but you want automation for real work.

Why does it take so loooooooong Cloud provider operations take time launching the manager instance launching the worker instances patching / updating as necessary

Why does it take so loooooooong Hadoop setup operations take time installing Java and Hadoop setting up cluster accounts configuring HDFS and YARN and … formatting HDFS establishing SSH tunnels or SOCKS proxies or … Hive? Installing Hive, setting up the metastore, … Spark? Installing Spark, …

Bake in the common setup work for instances. Custom images Bake in the common setup work for instances. OS updates Java and Hadoop installation account creation (and SSH key configuration) most of Hadoop configuration

Automate image creation Images can be created manually, but tools like Packer automate the process.

Example Packer template JSON { "builders": [{ "type": "amazon-ebs", "instance_type": "m4.large", "region": "us-east-1", "source_ami": "ami-6d1c2007", ... }], "provisioners": [ { "type": "shell", "script": "prov/packages.sh" }, "script": "prov/users.sh" ... ] }

Hands-on automation using provider CLIs and SDKs aws ec2 run-instances --image-id ami-6d1c2007 --instance-type m4.large --key-name mypair gcloud compute instances create manager-1 --image-family centos-7 --machine-type n1-highcpu-4 az vm create –n manager-1 –g my-resource-group --image CentOS --size Standard_D4_v3 AWS CLI pip install awscli Google Cloud SDK yum install google-cloud-sdk apt-get install google-cloud-sdk Azure CLI curl -L https://aka.ms/InstallAzureCli | bash apt-get install azure-cli

Just when you thought you were out … Creating clusters is one thing. Growing and shrinking is another.

Monitors show you the current conditions Monitor cluster metrics like HDFS usage and YARN / compute load

Alerts activate when thresholds are crossed Create alarm conditions for when metrics cross thresholds that indicate overutilization … or underutilization Alarms can send messages and trigger actions, like adding or removing instances

Automation makes things faster, which leads to new ways of working. New patterns Automation makes things faster, which leads to new ways of working.

Standardized cluster types Build out images, each with the optimal mix of services for a type of work log / IoT ingest and ETL heavy back-end analysis live web or application support Maintain images with the latest and greatest services and tools. Version control them for repeatability and testability.

Automated cluster elasticity Use those metrics and alarms to adjust cluster sizes based on demand. end of quarter long weekend holiday rush Save money and remain responsive.

Transient clusters Don’t run clusters unless they are in use. Create and destroy them based on need. save cloud costs reduce administrative and troubleshooting burden reduce cluster variations shift between regions … or providers, more easily improve security by reducing sharing

Self-service clusters Let cluster users run the automation. no organizational bottlenecks higher customer satisfaction

Ready to bang out some scripts? You’ll want to use an automation framework of course.

BUT

It’s tougher than it looks How will you manage credentials? How will you track asynchronous tasks? What do you do when the provider has a problem? How will you retry tasks? Do you need to set up networks or security rules?

It’s tougher than it looks Do you need to span providers? Will you need a UI? Will others want to use the automation? How will you set up databases? High availability? TLS? Kerberos?

Think about existing cluster automation tools Some are provider specific.

Think about existing cluster automation tools Some are not provider specific.

Where to go from here? identify your own opportunities for automation make some images, by hand if you like try Packer learn your cloud provider’s API and CLI write some automation tools learn about and try existing cluster automation tools

mh2c.com/strata17

Bill Havanki, Cloudera bhavanki@cloudera.com