UC Berkeley 1 Above the Clouds: A Berkeley View of Cloud Computing Armando Fox, UC Berkeley Reliable Adaptive Distributed Systems Lab © 2009 Image: John.

Slides:



Advertisements
Similar presentations
UC Berkeley Above the Clouds A Berkeley View of Cloud Computing 1 UC Berkeley RAD Lab.
Advertisements

Large Scale Computing Systems
Take your CMS to the cloud to lighten the load Brett Pollak Campus Web Office UC San Diego.
Amazon. Cloud computing also known as on-demand computing or utility computing. Similar to other utility providers like electric, water, and natural gas,
Cloud Computing Imranul Hoque. Today’s Cloud Computing.
The Next I.T. Tsunami Paul A. Strassmann. Copyright © 2005, Paul A. Strassmann - IP4IT - 11/15/05 2 Perspective Months  Weeks.
UC Berkeley 1 Above the Clouds: A Berkeley View of Cloud Computing Armando Fox, UC Berkeley Reliable Adaptive Distributed Systems Lab Image: John Curley.
Cloud Computing Ed Lazowska Bill & Melinda Gates Chair in Computer Science & Engineering University of Washington August 2010.
Data-Intensive Science (eScience) Ed Lazowska Bill & Melinda Gates Chair in Computer Science & Engineering University of Washington August 2011.
Live for today as if it is your last day but plan for tomorrow as if it will last forever!
CLOUD COMPUTING.
Cloud Computing: Overview 1. This lecture What is cloud computing? What are its essential characteristics? Why cloud computing? Classification/service.
INTRODUCTION TO CLOUD COMPUTING Cs 595 Lecture 5 2/11/2015.
An Introduction to Cloud Computing. The challenge Add new services for your users quickly and cost effectively.
EA and IT Infrastructure - 1© Minder Chen, Stages in IT Infrastructure Evolution Mainframe/Mini Computers Personal Computer Client/Sever Computing.
Cloud Computing – The Cloud Dr. Jie Liu. Definition  Cloud computing is Web-based processing, whereby shared resources, software, and information are.
Clouds on IT horizon Faculty of Maritime Studies University of Rijeka Sanja Mohorovičić INFuture 2009, Zagreb, 5 November 2009.
Introduction. Readings r Van Steen and Tanenbaum: 5.1 r Coulouris: 10.3.
Introduction to Cloud Computing
A Brief Overview by Aditya Dutt March 18 th ’ Aditya Inc.
Cloud Computing: Overview
CEG7380 Cloud Computing Lecture 1
PhD course - Milan, March /09/ Some additional words about cloud computing Lionel Brunie National Institute of Applied Science (INSA) LIRIS.
Cloud computing is the use of computing resources (hardware and software) that are delivered as a service over the Internet. Cloud is the metaphor for.
“Clouds: a construction zone” (and Why PaaS is the future…) Matt Thompson General Manager, Developer & Platform Evangelism Microsoft.
Cloud Computing Definitions Cloud The set of hardware, networks, storage, services and interfaces that combine to deliver computing as a service Cloud.
GIS and Cloud Computing. Flickr  Upload and manage your photos online  Share your photos with your family and friends  Post your photos everywhere.
Cloud Computing. What is Cloud Computing? Cloud computing is a model for enabling convenient, on-demand network access to a shared pool of configurable.
Cloud Computing 1. Outline  Introduction  Evolution  Cloud architecture  Map reduce operation  Platform 2.
Introduction to Cloud Computing
UC Berkeley 1 The Potential of Cloud Computing: Challenges, Opportunities, Impact Armando Fox, UC Berkeley Adapted from NAE Gilbreth Lecture, October 2011.
The Magic of the Cloud: Supercomputers for Everyone, Everywhere Prof. Eric A. Brewer UC Berkeley.
Cloud Computing Dave Elliman 11/10/2015G53ELC 1. Source: NY Times (6/14/2006) The datacenter is the computer!
Presented by: Mostafa Magdi. Contents Introduction. Cloud Computing Definition. Cloud Computing Characteristics. Cloud Computing Key features. Cost Virtualization.
 Cloud computing is the use of computing resources (hardware and software) that are delivered as a service over a network (typically the Internet). 
The New Zealand Institute for Plant & Food Research Limited Use of Cloud computing in impact assessment of climate change Kwang Soo Kim and Doug MacKenzie.
How AWS Pricing Works Jinesh Varia Technology Evangelist.
“Big Data” and Data-Intensive Science (eScience) Ed Lazowska Bill & Melinda Gates Chair in Computer Science & Engineering University of Washington July.
CSE 451: Operating Systems Spring 2013 Module 26 Cloud Computing Ed Lazowska Allen Center 570 © 2013 Gribble, Lazowska, Levy,
Distributed Systems Foundations Lecture 0. Evolution of computing history Main Frame with terminals Network of PCs & Workstations. Client-Server Now,
Systems and Networking Challenges in Cloud Computing: Toward Software-Defined Clouds Aditya Akella TA: Aaron Gember Fall
Cloud Computing Ed Lazowska Bill & Melinda Gates Chair in Computer Science & Engineering University of Washington.
CSE 451: Operating Systems Autumn 2010 Module 25 Cloud Computing Ed Lazowska Allen Center 570.
Enterprise Cloud Computing
EScience: Techniques and Technologies for 21st Century Discovery Ed Lazowska Bill & Melinda Gates Chair in Computer Science & Engineering Computer Science.
MidVision Enables Clients to Rent IBM WebSphere for Development, Test, and Peak Production Workloads in the Cloud on Microsoft Azure MICROSOFT AZURE ISV.
Web Technologies Lecture 13 Introduction to cloud computing.
Information Systems in Organizations 5.2 Cloud Computing.
1 TCS Confidential. 2 Objective : In this session we will be able to learn:  What is Cloud Computing?  Characteristics  Cloud Flavors  Cloud Deployment.
Cloud Computing Shannon McManus Michael Weihert. What is Cloud Computing?
RANDY MODOWSKI COSC Cloud Computing. Road Map What is Cloud Computing? History of “The Cloud” Cloud Milestones How Cloud Computing is being used.
KAASHIV INFOTECH – A SOFTWARE CUM RESEARCH COMPANY IN ELECTRONICS, ELECTRICAL, CIVIL AND MECHANICAL AREAS
4a. Aula 2o. Período de Livro texto Copyright © 2012, Elsevier Inc. All rights reserved March 5, 2012 Prof. Kai Hwang, USC Cloud Roles in.
© 2012 Eucalyptus Systems, Inc. Cloud Computing Introduction Eucalyptus Education Services 2.
Clouds, Grids and Clusters Prepared by M.Chandana Department of CSE Engineered for Tomorrow Course code: 10CS845.
Cloud Computing Ed Lazowska Bill & Melinda Gates Chair in Computer Science & Engineering University of Washington August 2012.
Above the Clouds: A Berkeley View of Cloud Computing Annajiat Alim Rasel, P Shimul Bala, P Raquibul Bari, P Annajiat Alim.
Intro to Software as a Service (SaaS) and Cloud Computing
What is Cloud Computing - How cloud computing help your Business?
An Introduction to Cloud Computing
Amazon Web Services Submitted By- Section - B Group - 4
Cloud Computing Ed Lazowska August 2011 Bill & Melinda Gates Chair in
CEG7380 Cloud Computing Lecture 1
Introduction to Cloud Computing
Cloud Computing Dr. Sharad Saxena.
Above the Clouds A Berkeley View of Cloud Computing
INFO 344 Web Tools And Development
Syllabus and Introduction Keke Chen
Presentation transcript:

UC Berkeley 1 Above the Clouds: A Berkeley View of Cloud Computing Armando Fox, UC Berkeley Reliable Adaptive Distributed Systems Lab © 2009 Image: John Curley

What is distributed computing? What is warehouse-scale computing? What is cloud computing? Why should you care? 2

UC Berkeley What is distributed computing? 3

The first demonstration of how to build really large Internet sites out of clusters of computers was done by: (a) Stanford (b) Berkeley (c) Yahoo! (d) Google (e) IBM 4

Big Computers c Sun E “supermini” Up to 64 Up to 64 GB RAM Up to 20 TB Disk Used by eBay, among others PC 200 MHz CPU, 32MB RAM, 4 GB disk 5

UC Berkeley Networks Of Workstations ( ) 6

NOW-0 (4 HP-735’s) NOW Four HP-735’s 7

NOW Sun SPARC- stations 8

NOW Sun SPARC-2 9

A Google datacenter built c.2005 would be designed to house approximately _______ computers. (a) 1,000 (b) 5,000 (c) 10,000 (d) 50,000 (e) 100,000 10

UC Berkeley Challenge: how do you program a NOW? (or: what is it good for?) 11

The Killer App for NOWs Prof. Eric Brewer, Armando Fox, Steve Gribble, Paul Gauthier, Yatin Chawathe: Cluster-Based Scalable Network Servers in Symposium on Operating Systems Principles, 1997 Non-goal: build best/fastest search engine –But led to Inktomi, first truly scalable search engine that took advantage of NOW ideas Goal: show general techniques for programming NOW’s for Internet services 12

Access Is the Killer App! UC Berkeley, Project Daedalus: Profs. Katz & Brewer Data, services in infrastructure cloud –search, , personal comms, productivity... Mobile access anywhere, anytime Many “firsts”: –server architecture with auto-scaling –cluster-based Internet service: Inktomi –mobile Web: TopGun Wingman on Palm Challenge: deploying the service!

14

: Datacenter is new “server” “Program” => Web search, , map/GIS, … “Computer” => 1000’s computers, storage, network Warehouse-sized facilities and workloads photos: Sun Microsystems, CNET, & datacenterknowledge.com

RAD Lab 5-year Mission Enable 1 person to develop, deploy, operate next -generation Internet application Key enabling technology: Statistical machine learning Highly interdisciplinary faculty & students –7 faculty across CS, from theory to systems –2 postdocs, ~30 PhD students, ~5 undergrads 16

Utility Computing Arrives Amazon Elastic Compute Cloud (EC2) “Compute unit” rental: $ /hr. –1 CU ≈ GHz 2007 AMD Opteron/Xeon core N No up-front cost, no contract, no minimum Billing rounded to nearest hour; pay-as-you-go storage also available A new paradigm for deploying services? 17 “Instances”PlatformCoresMemoryDisk Small - $0.10 / hr32-bit11.7 GB 160 GB Large - $0.40 / hr64-bit47.5 GB 850 GB – 2 spindles XLarge - $0.80 / hr64-bit815.0 GB1690 GB – 3 spindles 17

But... What is cloud computing, exactly? 18

“It’s nothing (new)” “...we’ve redefined Cloud Computing to include everything that we already do... I don’t understand what we would do differently... other than change the wording of some of our ads.” – Larry Ellison, CEO, Oracle (Wall Street Journal, Sept. 26, 2008) 19

Above the Clouds: A Berkeley View of Cloud Computing abovetheclouds.cs.berkeley.edu 2/09 White paper by RAD Lab PI’s and students Goal: stimulate discussion on what’s really new –Clarify terminology –Comparison with conventional computing –Identify challenges & opportunities Why can we offer new perspective? –Strong engagement with industry –Users of cloud computing in our own research and teaching Over 60,000 downloads 20

Above The Clouds Impact Research collaborations/hires: Amazon, Google, Microsoft, Twitter, Facebook, Cloudera, Yahoo!... Invited presentations/advice –Google, Fujitsu, IBM, HP, Microsoft, SAP, Juniper,... –World Economic Forum –Nat’l Academy of Engineering –OpenCirrus Summit –UCB Office of the CIO –UC Systemwide Cloud Computing Task Force UCB is academic leader in cloud computing in both research & education

What is it? What’s new? Old idea: Software as a Service (SaaS) –Software hosted in the infrastructure vs. installed on local servers or desktops; dumb (but brawny) terminals New: pay-as-you-go utility computing –Illusion of infinite resources on demand –Fine-grained billing: release == don’t pay –Earlier examples: Sun, Intel Computing Services—longer commitment, more $$$/hour, no storage –Public (utility) vs. private clouds 22

How much data per month, approximately, is processed through Google’s BigTable storage system? (a) 1 TB (1,000 GB) (b) 100 TB (c) 1 PB (1,000 TB) (d) 100 PB (e) 1 EB (exabyte = 1,000 PB) 23

Why Now (not then)? The Web “Space Race”: Build-out of extremely large datacenters (10,000’s of commodity PCs) Driven by growth in demand (more users) –Infrastructure software: e.g., Google File System –Operational expertise –Discovered economy of scale: 5-7x cheaper than provisioning a medium-sized (100’s machines) facility More pervasive broadband Internet Free & open source software 24

Unused resources Cloud Economics 101 Static provisioning for peak - wasteful, but necessary for SLA “Statically provisioned” data center “Virtual” data center in the cloud Demand Capacity Time Machines Demand Capacity Time $ 25

Unused resources Risk of Under Utilization Underutilization results if “peak” predictions are too optimistic Static data center Demand Capacity Time Resources 26

Risks of Under Provisioning Lost revenue Lost users Resources Demand Capacity Time (days) 1 23 Resources Demand Capacity Time (days) 1 23 Resources Demand Capacity Time (days)

UC Berkeley What can you do with this? 28

Cost Associativity 1,000 CPUs for 1 hour same price as 1 CPU for 1,000 hours Washington Post converted Hillary Clinton’s travel documents to post on WWW –Conversion time: <1 day after released –Cost: less than $200 RAD Lab graduate students demonstrate improved MapReduce scheduling—on 1,000 servers 29

Risk transfer 2001: CNN home page meltdown on 9/11 –~10x traffic increase in ~15 minutes –result: site had to go offline 2008: Animoto – traffic doubled every 12 hours for 3 days when released as Facebook plug-in –Scaled from 50 to >3500 servers –...then scaled back down 30

Indexing the Web 31 To be or not to be......or a better fool......better to love a fool... toA beA orA notA orB aB betterB foolB betterC toC loveC aC foolC to A,C to A,C be A be A or A,B or A,B not A not A better B,C better B,C love C love C a B,C a B,C fool B,C fool B,C Map & Combine

MapReduce in Practice Example: spam classification –training: 10 7 URLs x 64KB data each = 640GB data –One heavy-duty server: ~270 hours –100 servers in cloud: ~3 hours (= ~$255) Rapid uptake in other scientific research –Large-population genetic risk analysis & simulation (Harvard Medical School) –Genome sequencing (UNC Chapel Hill Cancer Ctr) –many others... so what’s the downside?

Challenges & Opportunities Challenges to adoption, growth, & business/policy models Both technical and nontechnical Most translate to 1 or more opportunities Complete list in paper; I’ll discuss subset 33

Challenge: Cloud Programming Challenge: exposing parallelism –MapReduce relies on “embarrassing parallelism” Programmers must (re)write problems to expose this parallelism, if it’s there to be found Tools still primitive, though progressing rapidly

Challenge: Big Data ApplicationData generated per day DNA Sequencing (Illumina HiSeq machine) 1 TB Large Synoptic Survey Telescope30 TB; 400 Mbps sustained data rate between Chile and NCSA Large Hadron Collider60 TB Source: Ed Lazowska, eScience 2010, Microsoft Cloud Futures Workshop, lazowska.cs.washington.edu/cloud2010.pdf Challenge: Long-haul networking is most expensive cloud resource, and improving most slowly Copy 8 TB to Amazon over ~20 Mbps network => ~35 days, ~$800 in transfer fees How about shipping 8TB drive to Amazon instead? => 1 day, ~$150 (shipping + transfer fees)

Web services in the cloud

Cloud in Education 1.Berkeley research culture: integrate leading research into teaching at all levels 2.RAD Lab need for “killer apps” to show off infrastructure Current efforts (student counts approximate): Great Ideas in Computer Architecture (reinvented Fall 2010): 190 students Software Engineering for SaaS (in its 4 th iteration): students Operating Systems: 70 students Intro. Data Science (Spring 2010): 30 Adv. topics in HCI: 20 students Natural language processing: 20 students

AWS is a great fit for courses... New undergraduate teaching opportunities –SaaS: make a database fall over—would need 200 servers for ~20 project teams –deploy projects publicly, many continue after course Better use of resources –Heavy usage right before lab deadlines

Success stories

Summary Cloud computing democratizes access to “supercomputer-class” capability –All you need is a credit card Puts students, academia on more level playing field to have high impact in industry The next Google, eBay, Amazon, etc. can come from a small team of entrepreneurs even without heavy dose of $$ up front 40

Going back to NOW : using medium- sized clusters for Internet services => several PhD’s 2010: CS169 students do it in 6-8 weeks and deploy on cloud computing 2020: ?

UC Berkeley Thank you! 42