Google and Cloud Computing. Wang Yonggang (王咏刚), Senior Engineer, Google.

Presentation transcript:

Google and Cloud Computing. Wang Yonggang (王咏刚), Senior Engineer, Google

Agenda
- The Internet: From Hardware to Community
- The Innovation: A Computing Cloud
- Breakthroughs for Cloud Computing
- Google Apps for Cloud Computing
- Google Infrastructure for Cloud Computing

The Internet: From Hardware to Community

The Internet: From Hardware to Community
Examples of online communities: MySpace, Facebook, Kaixin (开心网), Xiaonei (校内网), ...

What Do Today’s Users Want?
- Accessibility: access from anywhere and from multiple devices
- Shareability: make sharing as easy as creating and saving
- Freedom: users don’t want their data held hostage
- Simplicity: easy to learn, easy to use
- Security: trust that data will not be lost or seen by unwanted parties

The Innovation: A Computing Cloud

Cloud Computing

Attributes of Cloud Computing
- Data stored on the cloud
- Software & services on the cloud: access via web browser
- Based on standards and protocols: Linux, AJAX, LAMP, etc.
- Accessible from any device
Hardware Centric (Personal PC) → Software Centric (Client Server) → Service Centric (Cloud Computing)

Breakthroughs for Cloud Computing

1. User-Centric
2. Task-Centric
3. Powerful
4. Intelligent
5. Affordable
6. Programmable

User Centric
- Data stored in the “Cloud”
- Data follows you & your devices
- Data accessible anywhere
- Data can be shared with others
Examples of user data: music, preferences, maps, news, contacts, messages, mailing lists, photos, calendar, phone numbers, investments

Example: GMail
- Just a web browser and your account with password!
- Once you log in, the device is “yours”.
- Data is stored on remote servers in the “cloud” (with large capacity).
Access from different places and times: Beijing, on travel; San Francisco, Monday; Home, Wednesday

Use Google Docs to Solve a Task
Task = “Teachers creating a departmental curriculum”
- Access your docs from anywhere
- Chat with others in real time
- Changes instantly appear to other collaborators

Communication Task: Email, Chat, Contacts, Chat History

Task: Collaborate on Spreadsheet (Communicate)
Chat with others editing the spreadsheet

Task: Collaborate on Spreadsheet (Collaborate)
Invite others to collaborate on the spreadsheet

Task: Collaborate on Spreadsheet (Publish)
Invite others to view the spreadsheet

You can also easily organize all your common tasks

Cloud Computing is Powerful: It can do what no PC can do
Example: Google Search
- Is Google Search faster than search in Windows/Outlook/Word? And Google Search must be a much harder problem.
- How much storage does it take to store all of the web pages? 100B pages * 10 KB per page = 1000 TB of disk!
Cloud computing has at its disposal:
- An essentially infinite amount of disk
- An essentially infinite amount of computation (assuming the work can be parallelized)
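The storage estimate on this slide can be checked with a few lines of arithmetic. This is only a sketch: the page count and page size are the slide's round figures, and decimal units (1 KB = 1000 bytes) are an assumption made here for simplicity.

```python
# Back-of-the-envelope check of the slide's estimate.
PAGES = 100 * 10**9          # 100 billion web pages (the slide's figure)
BYTES_PER_PAGE = 10 * 10**3  # 10 KB per page, decimal units (assumption)

total_bytes = PAGES * BYTES_PER_PAGE
total_terabytes = total_bytes / 10**12
total_petabytes = total_bytes / 10**15

print(f"{total_terabytes:,.0f} TB = {total_petabytes:,.0f} PB")
```

Under these assumptions the total comes to 1000 TB, which is the same as 1 PB.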

Web Page Search → Universal Search
- 1st generation: era of single search (not diverse)
- 2nd generation: era of vertical search (too complex)
- 3rd generation: era of Universal Search

From vertical search to universal search: integration of user experience

Universal Search Example

Cloud Computing Infrastructure

GFS Architecture
- Files broken into chunks (typically 64 MB)
- Master manages metadata
- Data transfers happen directly between clients and chunkservers
[Diagram: clients; a GFS master with replica masters; Chunkserver 1 holding chunks C0, C1, C2, C5; Chunkserver 2 holding C1, C3, C5; ...; Chunkserver N holding C0, C2, C5]
[Chart labels: Google 48%, MSN 19%, Yahoo 33%]
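The read path described on this slide can be illustrated with a toy model. This is a minimal sketch, not GFS itself: the class names `Master` and `Chunkserver`, the function `client_read`, and the file name are hypothetical, and the real system adds leases, replication, and failure handling that are omitted here.

```python
from dataclasses import dataclass, field

CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB, the typical chunk size from the slide


@dataclass
class Master:
    """Holds metadata only: which chunkservers store each chunk."""
    # (filename, chunk index) -> list of chunkserver names holding a replica
    chunk_locations: dict = field(default_factory=dict)

    def lookup(self, filename, chunk_index):
        return self.chunk_locations[(filename, chunk_index)]


@dataclass
class Chunkserver:
    """Holds chunk data (hypothetical in-memory stand-in)."""
    name: str
    chunks: dict = field(default_factory=dict)  # (filename, chunk index) -> bytes

    def read(self, filename, chunk_index, offset_in_chunk, length):
        data = self.chunks[(filename, chunk_index)]
        return data[offset_in_chunk:offset_in_chunk + length]


def client_read(master, servers, filename, offset, length):
    """Read bytes that lie within a single chunk."""
    # Translate the file offset into a chunk index and an offset in that chunk.
    chunk_index, offset_in_chunk = divmod(offset, CHUNK_SIZE)
    # Ask the master for metadata only; no file data flows through it.
    replicas = master.lookup(filename, chunk_index)
    # Fetch the data directly from a chunkserver (first replica, a simplification).
    server = servers[replicas[0]]
    return server.read(filename, chunk_index, offset_in_chunk, length)


# Tiny demo: chunk 1 of a hypothetical file lives on two chunkservers.
cs1 = Chunkserver("cs1", {("/logs/web", 1): b"hello gfs"})
cs2 = Chunkserver("cs2", {("/logs/web", 1): b"hello gfs"})
master = Master({("/logs/web", 1): ["cs1", "cs2"]})
servers = {"cs1": cs1, "cs2": cs2}

result = client_read(master, servers, "/logs/web", CHUNK_SIZE + 3, 2)
print(result)
```

The point of the split is that the master answers small metadata queries while bulk data moves between clients and chunkservers, so the master is less likely to become a bandwidth bottleneck.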

Typical Cluster
- Cluster-wide services: scheduling masters, GFS master, lock service
- Each Linux machine (Machine 1, Machine 2, ..., Machine N) runs a GFS chunkserver, a scheduler slave, and one or more user apps (User app1, User app2, User app3)

MapReduce

More specifically...
Programmer specifies two primary methods:
- map(k, v) → <k', v'>*
- reduce(k', <v'>*) → <k', v'>*
All v' with the same k' are reduced together, in order.
Usually also specify:
- partition(k', total partitions) → partition for k'
  - often a simple hash of the key
  - allows reduce operations for different k' to be parallelized
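The three functions above can be sketched as a single-process word count. This is an illustrative toy under stated assumptions, not Google's implementation: `map_fn`, `reduce_fn`, `partition`, and `run_mapreduce` are hypothetical names, the partitions are processed sequentially rather than on separate machines, and CRC32 stands in for "a simple hash of the key".

```python
import zlib
from collections import defaultdict


def map_fn(key, value):
    """map(k, v) -> <k', v'>*: emit (word, 1) for each word in a document."""
    for word in value.split():
        yield word, 1


def reduce_fn(key, values):
    """reduce(k', <v'>*) -> <k', v'>*: sum the counts for one word."""
    yield key, sum(values)


def partition(key, total_partitions):
    """Pick a partition for k' with a simple, deterministic hash of the key."""
    return zlib.crc32(key.encode("utf-8")) % total_partitions


def run_mapreduce(inputs, total_partitions=2):
    # Shuffle phase: group every v' under its k', inside the partition for k'.
    partitions = [defaultdict(list) for _ in range(total_partitions)]
    for k, v in inputs:
        for k2, v2 in map_fn(k, v):
            partitions[partition(k2, total_partitions)][k2].append(v2)
    # Reduce phase: each partition is independent, so a real system
    # could hand each one to a different worker.
    results = {}
    for part in partitions:
        for k2 in sorted(part):  # keys are reduced in sorted order
            for out_k, out_v in reduce_fn(k2, part[k2]):
                results[out_k] = out_v
    return results


counts = run_mapreduce([
    ("doc1", "the cloud is the platform"),
    ("doc2", "the cloud"),
])
print(counts)
```

Because every occurrence of a key hashes to the same partition, no two partitions ever need to coordinate, which is what makes the reduce step parallelizable.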

BigTable
- Distributed multi-level map
  - With an interesting data model
- Fault-tolerant, persistent
- Scalable
  - Thousands of servers
  - Terabytes of in-memory data
  - Petabyte of disk-based data
  - Millions of reads/writes per second, efficient scans
- Self-managing
  - Servers can be added/removed dynamically
  - Servers adjust to load imbalance

BigTable: Basic Data Model
- Distributed multi-dimensional sparse map: (row, column, timestamp) → cell contents
- Good match for most of our applications
[Figure: a table with ROWS and COLUMNS axes; the cell in the “contents” column holds several versions along a TIMESTAMPS axis (t1, t2, t3)]
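The (row, column, timestamp) → cell contents mapping can be modeled with a nested dictionary. This is a minimal in-memory sketch under stated assumptions: the class `SparseTable`, its methods, and the row key `com.example.www` are hypothetical and are not the Bigtable API; distribution, persistence, and column families are left out.

```python
class SparseTable:
    """Toy sparse map: (row, column, timestamp) -> cell contents."""

    def __init__(self):
        # Only cells that were written exist, which is what makes the map sparse.
        self._cells = {}  # (row, column) -> {timestamp: value}

    def put(self, row, column, timestamp, value):
        self._cells.setdefault((row, column), {})[timestamp] = value

    def get(self, row, column, timestamp=None):
        """Return the newest version, or the newest one at or before `timestamp`."""
        versions = self._cells.get((row, column))
        if not versions:
            return None
        if timestamp is None:
            return versions[max(versions)]
        eligible = [t for t in versions if t <= timestamp]
        return versions[max(eligible)] if eligible else None

    def rows(self):
        """Row keys in sorted order, mirroring a sorted-map view of the table."""
        return sorted({row for row, _ in self._cells})


table = SparseTable()
table.put("com.example.www", "contents", 1, "<html>v1</html>")
table.put("com.example.www", "contents", 3, "<html>v3</html>")
table.put("com.example.mail", "contents", 2, "<html>mail</html>")

latest = table.get("com.example.www", "contents")
as_of_2 = table.get("com.example.www", "contents", timestamp=2)
missing = table.get("com.example.www", "anchor")
print(latest, as_of_2, missing, table.rows())
```

Keeping timestamps as a third dimension lets a reader ask for the latest version or for the state as of an earlier time without any extra bookkeeping in the application.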

BigTable: System Architecture
- Bigtable client: uses the Bigtable client library, starting with Open()
- Bigtable cell:
  - Bigtable master: performs metadata ops, load balancing
  - Bigtable tablet servers (several): serve data
- Underlying services:
  - Cluster Scheduling Master: handles failover, monitoring
  - GFS: holds tablet data, logs
  - Lock service: holds metadata, handles master-election

Thanks Q&A