Introduction Dr. Ying Lu Schorr Center 104 CSCE990 Advanced Distributed Systems Seminar.



Giving credit where credit is due:
- Some lecture notes are based on slides created by Dr. Zahorjan at Univ. of Washington, Dr. Konev at Univ. of Liverpool, Steve Crouch at Software Sustainability Institute, Petru Eles at Linköpings University, and Dr. Majd F. Sakr, Mohammad Hammoud, Vinay Kolar at CMU
- I have modified them and added new slides

Types of Distributed Systems?

‘Cloud’ & ‘Grid’ – Utility Computing?
- The Grid… The Cloud…
- Is it really like the grid? Is it more like a fog?
- But… they’re both about providing access to compute and data resources
Images: j.o.h.n walker, charles.frith

The Problem
- Basically, want to run compute/data intensive task
- Don’t have enough resources to run job locally
  - At least, to return results within sensible timeframe
- Would like to use another, more capable resource

Distributed Computing in Olden Times
- Small number of ‘fast’ computers
  - Very expensive
  - Centralized
  - Used nearly all the time
  - Time allocations for users
  - Not updated often
- Punched cards
- Wait time huge
- MailNet, SneakerNet, etc…
- Mainframes
- Cray: $8.8 million, 160 megaflops, 8MB memory
Images: Cray X-MP (Cray-1 successor); Univac 1710. Credits: brewbooks; Michael L. Umbricht and Carl R. Friend

It’s About Scaling Up…
- Compute and data – you need more, you go somewhere else to get it
- Local, Institutional, National, International
- Then… the march towards localization of computation, the Personal Computer
- Computational Science develops in laboratories
- Is this changing again?
Images: nasaimages, Extra Ketchup, Google Maps, Dave Page

What is Cloud Computing?
- Many ways to define it, i.e. one for every supplier of ‘cloud’
- Key characteristics:
  - On demand, dynamic allocation of resources – ‘elasticity’
  - Abstraction
  - Self-managed
  - Billed for what you use, e.g. in terms of CPU, storage
  - Standardized interfaces, e.g. OCCI
- … it’s more like an electricity grid than the Grid

How Does it Deliver?
1- IaaS - Infrastructure as a Service (Hardware, OS)
2- PaaS - Platform as a Service (Framework, Middleware)
3- SaaS - Software as a Service (Hosted Applications, Infrastructure)
- Cloud computing can deliver at any of these levels
- These levels are often blurred and routinely disputed!
- Resources provided on demand
Diagram labels: Internet; End user / Customer; Developer / Service Provider

IaaS – Infrastructure as a Service
- You get access to (usually) virtualised hardware
  - Servers, storage, networking
  - Operating system
- You are responsible for managing OS, middleware, runtime, data, application (development)
- e.g. Amazon EC2

Amazon EC2 – The Idea
- ‘Elastic Computing’
- Sign up
- Select & configure virtualized resources
  - Emulated OS: RHEL, OEL, Windows Server, OpenSolaris, Fedora, Ubuntu, Debian, SUSE, Gentoo, Amazon Linux AMI
  - Infrastructure:
    - Data: IBM DB2, IBM Informix, Microsoft SQL, MySQL, Oracle
    - Web Hosting: Apache HTTP, IIS/ASP.NET, IBM WebSphere
    - Batch Processing: Hadoop, Condor, Open MPI
  - Newer addition - development environments:
    - IBM sMash, Ruby on Rails, JBoss Enterprise Application Platform
  - Moving towards PaaS! (Already there?)
- Additional web services
  - S3: Simple Storage Service – transfer data in/out, 1 byte to 5 TB
  - SQS: Simple Queue Service

Amazon EC2: Pricing
- Free (at the start!):
  - Run single Amazon Micro Instance for a year
  - 750 hours of EC2, 750 hours of Elastic Load Balancing plus 15 GB data processing
  - 15 GB bandwidth in/out across all services
- On demand instances:
  - Pay per hour, no long-term commitment
  - From $0.025/hour -> $0.76/hour
- Reserved instances:
  - Upfront payment, with discount per hour
  - From $227/year + $0.01/hour -> $1820/year + $0.32/hour
- Spot instances:
  - Bid for unused EC2 capacity
  - Spot Price fluctuates with supply/demand; if you bid over the Spot Price, you get it
  - From $0.007/hour -> $0.68/hour
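The trade-off between on-demand and reserved instances can be worked through as a break-even calculation. The sketch below is illustrative only: it pairs the top-end reserved figure ($1820/year + $0.32/hour) with the top-end on-demand figure ($0.76/hour), assuming both describe the same instance type, which the slide does not state. The function names are hypothetical.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours in a non-leap year


def on_demand_cost(hours, rate):
    """Cost of running one instance for `hours` at an hourly on-demand rate."""
    return hours * rate


def reserved_cost(hours, upfront, rate):
    """Cost of one reserved instance: yearly upfront fee plus discounted hourly rate."""
    return upfront + hours * rate


def break_even_hours(upfront, reserved_rate, on_demand_rate):
    """Hours per year above which the reserved instance becomes cheaper."""
    if on_demand_rate <= reserved_rate:
        raise ValueError("reserved rate must be lower than on-demand rate")
    return upfront / (on_demand_rate - reserved_rate)


# Assumed pairing of the slide's top-end figures (same instance type not confirmed).
UPFRONT, RESERVED_RATE, ON_DEMAND_RATE = 1820.0, 0.32, 0.76

if __name__ == "__main__":
    hours = break_even_hours(UPFRONT, RESERVED_RATE, ON_DEMAND_RATE)
    print(f"break-even at about {hours:.0f} hours/year "
          f"({hours / HOURS_PER_YEAR:.0%} utilisation)")
```

Under this assumed pairing, the reserved instance only pays off if it runs for a little under half of the year; below that, paying per hour is cheaper.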

EC2 Application Example
- Peter Harkins, a Senior Engineer at The Washington Post, used 200 EC2 instances (1,407 server hours) to convert 17,481 pages of Hillary Clinton’s travel documents into a more web-friendly form within nine hours of their release
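As a rough worked example of pay-per-use billing, the figures from this job can be combined with the on-demand price range quoted on the pricing slide ($0.025 to $0.76 per hour). This is an assumption for illustration: the slide does not say which instance type or hourly rate applied to the job, so the result is only a lower and an upper bound.

```python
INSTANCES = 200
SERVER_HOURS = 1407
PAGES = 17481
# On-demand range quoted on the pricing slide; the job's real rate is not given.
RATE_LOW, RATE_HIGH = 0.025, 0.76


def job_cost_bounds(server_hours, rate_low, rate_high):
    """Lower and upper bound on the bill for a job of `server_hours` instance-hours."""
    return server_hours * rate_low, server_hours * rate_high


avg_hours_per_instance = SERVER_HOURS / INSTANCES
pages_per_server_hour = PAGES / SERVER_HOURS
low, high = job_cost_bounds(SERVER_HOURS, RATE_LOW, RATE_HIGH)

print(f"average run time per instance: {avg_hours_per_instance:.2f} hours")
print(f"pages converted per server hour: {pages_per_server_hour:.1f}")
print(f"cost between ${low:.2f} and ${high:.2f} at the assumed rates")
```

Each instance ran for about seven hours on average, which is consistent with the nine-hour turnaround.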

PaaS – Platform as a Service
- You get an integrated development environment
  - e.g. application design, testing, deployment, hosting, frameworks for database integration, storage, app versioning, etc.
- Develop applications on top
- You are responsible for managing data, application (development)
- e.g. Google App Engine

Google App Engine: The Idea
- Sign up via Google Accounts
- Develop App Engine web applications locally using SDK – emulates all services
  - Includes tool to upload application code, static files and config files
  - Can ‘version’ your web application instances
- Apps run in a Java/Python ‘sandbox’
- Automatic scaling and load balancing – abstract across underlying resources

Google App Engine: Pricing
- Free within a quota:
  - 500MB storage, 5 million page views a month (~6.5 CPU hours, 1GB)
  - 10 applications/developer
- Billed model:
  - Each app $8/user (max $1000) a month
  - For each app:

    Resource                Unit          Unit cost
    Outgoing bandwidth      GB            $0.12
    Incoming bandwidth      GB            $0.10
    CPU Time                CPU hours     $0.10
    Stored Data             GB/month      $0.15
    High Replication Stg.   GB/month      $0.45
    Recipients emailed      Recipients    $
    Always On               N/A (daily)   $0.30
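The per-resource billing can be sketched as a small calculator. This is a minimal illustration, not Google's billing logic: it uses only the unit costs that are legible in the table, leaves out the per-user fee, the free quota, the "Always On" charge and the emailed-recipients row (whose cost is missing), and the usage numbers are invented for the example.

```python
# Unit costs taken from the table; keys are hypothetical names for this sketch.
UNIT_COST = {
    "outgoing_bandwidth_gb": 0.12,
    "incoming_bandwidth_gb": 0.10,
    "cpu_hours": 0.10,
    "stored_data_gb_month": 0.15,
    "high_replication_gb_month": 0.45,
}


def monthly_bill(usage):
    """Sum unit cost times amount used for every metered resource in `usage`."""
    unknown = set(usage) - set(UNIT_COST)
    if unknown:
        raise KeyError(f"no unit cost for: {sorted(unknown)}")
    return sum(UNIT_COST[name] * amount for name, amount in usage.items())


# Invented usage for one application over one month.
example_usage = {
    "outgoing_bandwidth_gb": 50,
    "incoming_bandwidth_gb": 20,
    "cpu_hours": 100,
    "stored_data_gb_month": 10,
}

print(f"metered charges: ${monthly_bill(example_usage):.2f}")
```

Keeping the unit costs in one table makes the "billed for what you use" model explicit: the bill is a plain sum over metered resources.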

SaaS – Software as a Service
- Top layer consumed directly by end user – the ‘business’ functionality
- Application software provided, you configure it (more or less)
- Various levels of maturity:
  - Level 1: each customer has own customised version of application in own instance
  - Level 2: all instances use same application code, but configured individually
  - Level 3: single instance of application across all customers
  - Level 4: multiple customers served on load-balanced ‘farm’ of identical instances
  - Levels 3 & 4: separate customer data!
- e.g. Gmail, Google Sites, Google Docs, Facebook

Summary of Provision
- Application Migration – adopt the level you want

Cloud Open Standards
- Implementations typically have proprietary standards and interfaces
  - Vendors like this – customers are often locked in to one implementation
- Community ‘push’ towards open cloud standards:
  - Open Grid Forum (OGF) – Open Cloud Computing Interface (OCCI)
  - Distributed Management Task Force (DMTF) – Open Virtualisation Format (OVF)

Why should you Study Distributed Systems?
Application Domain: Associated Networked Application
- Finance and commerce: eCommerce e.g. Amazon and eBay, PayPal, online banking and trading
- The information society: Web information and search engines, ebooks, Wikipedia; social networking: Facebook and MySpace
- Creative industries and entertainment: online gaming, music and film in the home, user-generated content, e.g. YouTube, Flickr
- Healthcare: health informatics, online patient records, monitoring patients
- Education: e-learning, virtual learning environments; distance learning
- Transport and logistics: GPS in route finding systems, map services: Google Maps, Google Earth
- Science: The Grid as an enabling technology for collaboration between scientists
- Environmental management: sensor technology to monitor earthquakes, floods or tsunamis

Definition of a Distributed System
A distributed system is:
- A collection of independent computers that appears to its users as a single coherent system (Tanenbaum book)
- One in which components located at networked computers communicate and coordinate their actions only by passing messages (Coulouris book)

Why Distributed Systems?
- Scale
  - Processing
  - Data
- Diversity in Application Domains
- Collaboration
- Cost

Why Distributed Systems?
A. Big data continues to grow:
- In mid-2010, the information universe carried 1.2 zettabytes, and predictions for 2020 expect 35 zettabytes, roughly 29 times as much (the widely quoted 44-fold growth is measured from the 0.8 zettabytes of 2009).
B. Applications are becoming data-intensive.

Why Distributed Systems?
C. Individual computers have limited resources compared to the scale of current-day problems & application domains:
1. Caches and Memory:
   L1 Cache:    16KB-64KB, 2-4 cycles
   L2 Cache:    512KB-8MB, 6-15 cycles
   L3 Cache:    4MB-32MB, … cycles
   Main Memory: 1GB-4GB, 300+ cycles

Why Distributed Systems?
2. Hard Disk Drive:
- Limited capacity
- Limited number of channels
- Limited bandwidth

Why Distributed Systems?
3. Processor:
- The number of transistors that can be integrated on a single die has continued to grow at Moore’s pace.
- Chip Multiprocessors (CMPs) are now available
Figure: A single Processor Chip; A CMP

Why Distributed Systems?
3. Processor (cont’d):
- Up until a few years ago, CPU speed grew at the rate of 55% annually, while the memory speed grew at the rate of only 7% [H & P].
Figure: Processor-Memory speed gap
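To see how quickly the two growth rates diverge, they can be compounded year over year. The sketch below assumes both rates stay constant and that both speeds start from the same baseline; the function name is hypothetical.

```python
CPU_GROWTH = 0.55  # annual CPU speed growth quoted on the slide
MEM_GROWTH = 0.07  # annual memory speed growth quoted on the slide


def speed_gap(years, cpu_growth=CPU_GROWTH, mem_growth=MEM_GROWTH):
    """Ratio of CPU speed-up to memory speed-up after `years` of compound growth."""
    return ((1 + cpu_growth) / (1 + mem_growth)) ** years


for years in (1, 5, 10):
    print(f"after {years:2d} years the gap is about {speed_gap(years):.1f}x")
```

Under these assumptions the gap widens by roughly 45% each year and reaches about 40-fold after ten years, which is why caches alone cannot hide memory latency indefinitely.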

Why Distributed Systems?
- Even if 100s or 1000s of cores are placed on a CMP, it is a challenge to deliver input data to these cores fast enough for processing.
- Example: a data set of 4 TBs read over 4 I/O channels of 100 MB/s each takes 10,000 seconds (about 3 hours) to load.

Why Distributed Systems?
- A data set of 4 TBs is split across 100 machines
- Only 3 minutes to load data
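The load-time argument can be checked with a few lines of arithmetic. The sketch below assumes decimal units (1 TB = 10^12 bytes), an even split of the data, and that each of the 100 machines has the same 4 channels of 100 MB/s as the single machine; it ignores network and coordination overhead. The function name is hypothetical.

```python
TB = 10**12  # bytes, decimal units assumed
MB = 10**6


def load_time_seconds(dataset_bytes, channels, channel_bandwidth, machines=1):
    """Time to read a data set split evenly across `machines`, each reading its
    share over `channels` parallel I/O channels of `channel_bandwidth` bytes/s."""
    per_machine = dataset_bytes / machines
    return per_machine / (channels * channel_bandwidth)


single = load_time_seconds(4 * TB, channels=4, channel_bandwidth=100 * MB)
cluster = load_time_seconds(4 * TB, channels=4, channel_bandwidth=100 * MB,
                            machines=100)

print(f"one machine:  {single:.0f} s ({single / 3600:.1f} hours)")
print(f"100 machines: {cluster:.0f} s ({cluster / 60:.1f} minutes)")
```

With these assumptions one machine needs 10,000 seconds and 100 machines need 100 seconds. That is the same order of magnitude as the slide's "3 minutes"; the slide does not state the per-machine bandwidth behind its figure.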

Requirements
But this requires:
- A way to express the problem as parallel processes and execute them on different machines (Programming Models and Concurrency).
- A way for processes on different machines to exchange information (Communication).
- A way for processes to cooperate, synchronize with one another and agree on shared values (Synchronization).
- A way to enhance reliability and improve performance (Consistency and Replication).

Requirements
But this requires (Cont.):
- A way to recover from partial failures (Fault Tolerance).
- A way to secure communication and ensure that a process gets only those access rights it is entitled to (Security).
- A way to extend interfaces so as to mimic the behavior of another system, reduce diversity of platforms, and provide a high degree of portability and flexibility (Virtualization).
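The first three requirements (expressing a problem as parallel tasks, exchanging information, and synchronizing) can be illustrated on a single machine with Python's standard library. This is only a sketch under loud assumptions: threads stand in for separate machines, the word-count task and all function names are hypothetical, and fault tolerance, security and replication are not addressed.

```python
from concurrent.futures import ThreadPoolExecutor


def split(data, parts):
    """Partition a list into `parts` nearly equal contiguous chunks."""
    size, extra = divmod(len(data), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def count_words(lines):
    """Work done independently by each worker on its own chunk."""
    return sum(len(line.split()) for line in lines)


def parallel_word_count(lines, workers=4):
    chunks = split(lines, workers)
    # Threads stand in for separate machines in this sketch.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() hands each chunk to a worker and collects the partial results
        # (communication); consuming the results waits for every worker
        # to finish (synchronization).
        partial_counts = list(pool.map(count_words, chunks))
    return sum(partial_counts)


lines = ["a distributed system", "passes messages", "between machines"] * 10
total = parallel_word_count(lines)
print(f"total words: {total}")
```

In a real distributed system the chunks would live on different machines and the partial counts would travel over the network, which is where the remaining requirements come in.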

Course Objective
- This is a course on advanced distributed systems. We will study the state of the art in distributed systems, in particular data-intensive distributed computing systems, how and why the field got there, and how to engage in systems research.