CEG7380 Cloud Computing Lecture 1

1 CEG7380 Cloud Computing Lecture 1
Keke Chen

2 Outline
- Syllabus introduction
  - Scope of this course
  - Tentative schedule
  - Prerequisites
  - Resources
  - Assignments
- Introduction

3 Scope of this course
- Understand the basic ideas of cloud computing
- Get familiar with
  - Tools
  - Systems
- Get exposed to some research topics

4 Two major parts:
- Processing large data with the cloud
- Scaling up/down web applications with the cloud
Note: some programming parts need self-study

5 Prerequisites
- Some programming skills
  - Sufficient knowledge about Java, Python, shell
  - Comfortable with learning new programming frameworks
- Sufficient knowledge about
  - Data structures and databases
  - Operating systems
  - Distributed systems

6 Assignments and Grading
- Reading papers (~3) (10%)
- Some miniprojects (4~5) (60%)
  - Help you master the concepts
  - Learn to use tools and systems
  - Self-motivated research projects are strongly encouraged!
- Final exam (20%)
- Class attendance and discussion (10%)

7 Resources
- Updated reference list
- In-house Hadoop cluster
- AWS access
  - Coupon code for each student
- Pilot
  - Submitting reading assignments and projects

8 Tentative Schedule
- Parallel data processing
  - Distributed file systems (GFS, HDFS)
  - MapReduce
  - High-level distributed data management
- Cloud infrastructures
  - Virtualization
  - AWS and Eucalyptus
  - Interactive front-end – Google App Engine
- Cloud security and privacy
- Research topics

9 In projects, we will learn to use
- Hadoop MapReduce, Pig Latin
- AWS
- Google App Engine

10 Cloud Computing lecture 1-2
Keke Chen
Some slides are borrowed from UC Berkeley RAD Lab

11 Outline
- What is cloud computing?
- Why now?
- Cloud killer applications
- Cloud economics
- Challenges and opportunities "above the cloud"
- "Claremont Report"

12 What is Cloud Computing?
- Old idea: Software as a Service (SaaS)
  - Def: delivering applications over the Internet
- Recently: "[Hardware, Infrastructure, Platform] as a service"
- Utility computing: pay-as-you-use computing
  - Illusion of infinite resources
  - No up-front cost
  - Fine-grained billing (e.g. hourly)

13 Cloud computing vs. grid computing
- Cloud computing = virtualization + grid + services + utility computing
- Grid computing: resource provisioning, load balancing, parallel processing
- Views of different users
  - System admin/Hadoop users: grid
  - Application owners/service users: service, utility

14 Users and cloud providers

15 Why Now?
- Experience with very large datacenters
  - Profitable for cloud providers: economies of scale
- Pervasive broadband Internet
- Fast x86 virtualization
- Pay-as-you-go billing model
- Large user base
  - Online payment
  - Online ads
  - Content distribution
→ Web 2.0 lowers the entry point to e-business → more small e-business owners → large user base of clouds

16 Spectrum of Clouds
- Instruction set VM (Amazon EC2, 3Tera)
- Bytecode VM (Microsoft Azure)
- Framework VM (Google AppEngine, Force.com)
[Figure: spectrum running from lower-level, less management to higher-level, more management, in the order EC2, Azure, AppEngine, Force.com.]

17 Cloud Killer Apps
- Mobile and web applications
- Batch processing / MapReduce
- Data analytics (big data)
  - E.g., OLAP, data mining, machine learning
- Extensions of desktop software
  - Matlab, Mathematica

18 Cloud Economics
- Pay by use instead of provisioning for peak
[Figure: two panels, "Static data center" and "Data center in the cloud", each plotting demand and capacity (resources) over time; the static panel marks unused resources.]
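
The pay-by-use argument can be made concrete with a little arithmetic. The sketch below is illustrative only: the hourly demand figures, the price, and the function names are assumptions, not values from the lecture. It compares paying for peak capacity in every hour (static data center) with paying only for what each hour actually uses (data center in the cloud).

```python
def static_cost_cents(demand, price_cents):
    """Cost of provisioning for peak: pay for peak capacity in every hour."""
    return max(demand) * len(demand) * price_cents


def pay_by_use_cost_cents(demand, price_cents):
    """Cost of pay-by-use: pay only for the servers used in each hour."""
    return sum(demand) * price_cents


def unused_server_hours(demand):
    """Server-hours paid for but left idle when capacity is fixed at peak."""
    return max(demand) * len(demand) - sum(demand)


# Hypothetical workload: number of servers needed in each of six hours.
hourly_demand = [10, 20, 80, 100, 40, 10]
# Hypothetical price: 10 cents per server-hour (integer cents avoid float rounding).
price_cents = 10

print(static_cost_cents(hourly_demand, price_cents))      # 6000
print(pay_by_use_cost_cents(hourly_demand, price_cents))  # 2600
print(unused_server_hours(hourly_demand))                 # 340
```

With these made-up numbers, the static data center pays for 600 server-hours but uses only 260 of them; the remaining 340 correspond to the "unused resources" area in the figure.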

19 Economics of Cloud Users
- Risk of over-provisioning: underutilization
[Figure: "Static data center", plotting demand and capacity (resources) over time, with unused resources marked.]

20 Economics of Cloud Users
- Heavy penalty for under-provisioning
[Figure: three panels plotting demand and capacity (resources) over time (days 1 to 3); the second panel is labeled "Lost revenue" and the third "Lost users".]
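
Both risks come from the gap between a fixed capacity and a varying demand. As a rough sketch, the Python function below measures both sides of that gap: capacity that sits idle (over-provisioning) and demand that cannot be served (under-provisioning, i.e. potential lost revenue). The demand series, the capacities, and the name `provisioning_gap` are hypothetical, and the sketch does not model the follow-on effect of lost users.

```python
def provisioning_gap(demand, capacity):
    """Return (unused, unserved) resource units for a fixed capacity.

    unused:   capacity left idle when demand is below capacity
    unserved: demand that exceeds capacity and cannot be served
    """
    unused = sum(max(capacity - d, 0) for d in demand)
    unserved = sum(max(d - capacity, 0) for d in demand)
    return unused, unserved


# Hypothetical demand over three days, in resource units.
daily_demand = [50, 120, 90]

print(provisioning_gap(daily_demand, capacity=100))  # (60, 20)
print(provisioning_gap(daily_demand, capacity=120))  # (100, 0)
```

Raising the capacity from 100 to 120 removes the unserved demand but increases the idle capacity from 60 to 100 units, which is the trade-off a static data center has to make in advance.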

21 Economics of Cloud Providers
- 5-7x economies of scale [Hamilton 2008]
- Extra benefits
  - Amazon: utilize off-peak capacity
  - Microsoft: sell .NET tools
  - Google: reuse existing infrastructure

Resource        Cost in medium DC     Cost in very large DC   Ratio
Network         $95 / Mbps / month    $13 / Mbps / month      7.1x
Storage         $2.20 / GB / month    $0.40 / GB / month      5.7x
Administration  ≈140 servers/admin    >1000 servers/admin

22 Adoption Challenges
Challenge | Opportunity
Availability | Multiple providers & DCs
Data lock-in | Standardization
Data confidentiality, auditability, and privacy | Encryption, VLANs, firewalls; geographical data storage; privacy-preserving data outsourcing

23 Growth Challenges
Challenge | Opportunity
Data transfer bottlenecks | FedEx-ing disks, data backup/archival
Performance unpredictability | Improved VM support, flash memory, scheduling VMs
Scalable storage | Invent scalable store
Bugs in large distributed systems | Invent debugger that relies on distributed VMs
Scaling quickly | Invent auto-scaler that relies on ML; snapshots

24 Policy and Business Challenges
Challenge | Opportunity
Reputation fate sharing | Offer reputation-guarding services like those for [...]
Software licensing | Pay-for-use licenses; bulk use sales

25 Research Challenges Mentioned by Database Community (Claremont Report)

26 Functionality and operational cost
- Background: compare massive-scale data-intensive computing systems with today's DBMS
- Limited functionality
  - Simple APIs (e.g. MapReduce)
  - Pushes more burden onto developers
- Benefits
  - Easier to manage
  - Lower operational cost
  - Service Level Agreement (SLA) that is hard to provide for a SQL DBMS
P.S. DB systems are notorious for their installation and maintenance expenses.
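
To illustrate what a "simple API" means here, the following is a minimal in-memory sketch of the MapReduce programming model in plain Python. It is not Hadoop's API: the names `map_words`, `reduce_counts`, and `run_mapreduce` are made up for illustration, and a real system would distribute the map, shuffle, and reduce phases across many machines. The point is that the developer supplies only a map function and a reduce function.

```python
from collections import defaultdict


def map_words(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def reduce_counts(word, counts):
    """Reduce step: sum all the counts emitted for one word."""
    return word, sum(counts)


def run_mapreduce(lines, mapper, reducer):
    """Single-process stand-in for the framework: map, group by key, reduce."""
    groups = defaultdict(list)
    for line in lines:
        for key, value in mapper(line):
            groups[key].append(value)  # "shuffle": collect values per key
    return dict(reducer(key, values) for key, values in groups.items())


# Hypothetical two-line input.
result = run_mapreduce(
    ["the cloud", "The grid and the cloud"], map_words, reduce_counts
)
print(result["the"], result["cloud"], result["grid"])  # 3 2 1
```

Word counting fits this model directly. Operations a DBMS provides out of the box, such as joins or indexed lookups, would have to be expressed by the developer in terms of these two functions.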

27 Manageability
- Features of cloud systems
  - Limited human intervention
  - High-variance workloads
  - A variety of shared infrastructures
- No DBAs or administrators to assist developers
- Systems need to do work automatically
  - Self-managing
  - Adaptive (autonomous) computing

28 Data security and privacy
- Users share physical resources in a cloud
  - Protect from each other (security)
  - Protect from curious cloud providers (privacy)
- Successes may depend on specific target usage scenarios
- Examples
  - Query-based services
  - Mining-based services

29 Datasets over multiple clouds
- Interesting datasets might be available in different clouds
  - Different cloud providers
  - Private or public clouds
- Services mashing up datasets
  - Inevitably crossing clouds
- Federated cloud architectures

30 Algorithms on Big Data
- Working on "Big Data"
  - Data mining
  - Machine learning
  - Visualization
- Traditionally assume data is in flat files or relational databases
- Distributed data organization poses new challenges
  - Redesign algorithms
  - Redesign frameworks

