Published by Garry Marsh. Modified over 9 years ago.
1
Google and Cloud Computing
Wang Yonggang (王咏刚), Senior Engineer, Google
2
Agenda
- The Internet: From Hardware to Community
- The Innovation: A Computing Cloud
- Breakthroughs for Cloud Computing
- Google Apps for Cloud Computing
- Google Infrastructure for Cloud Computing
3
The Internet: From Hardware to Community
4
The Internet: From Hardware to Community
- MySpace
- Facebook
- Kaixin001 (开心网)
- Xiaonei (校内网)
- …
5
What Do Today’s Users Want?
- Accessibility: access from anywhere and from multiple devices
- Shareability: make sharing as easy as creating and saving
- Freedom: users don’t want their data held hostage
- Simplicity: easy to learn, easy to use
- Security: trust that data will not be lost or seen by unwanted parties
6
The Innovation: A Computing Cloud
7
Cloud Computing
8
Attributes of Cloud Computing
- Data stored on the cloud
- Software and services on the cloud, accessed via a web browser
- Based on standards and protocols: Linux, AJAX, LAMP, etc.
- Accessible from any device
[Diagram: Personal PC (hardware-centric), Client/Server (software-centric), Cloud Computing (service-centric)]
9
Breakthroughs for Cloud Computing
10
1. User-Centric
2. Task-Centric
3. Powerful
4. Intelligent
5. Affordable
6. Programmable
11
User-Centric
- Data stored in the “Cloud”
- Data follows you and your devices
- Data accessible anywhere
- Data can be shared with others
[Diagram labels: music preferences, maps, news, contacts, messages, mailing lists, photos, e-mails, calendar, phone numbers, investments]
12
Example: Gmail
- All you need is a web browser and your account with its password.
- Once you log in, the device is “yours”.
- Data is stored on remote servers in the “cloud” (with large capacity).
[Figure captions: Beijing, on travel; San Francisco, Monday; Home, Wednesday]
13
Use Google Docs to Solve a Task
Task = “Teachers creating a departmental curriculum”
- Access your docs from anywhere
- Chat with others in real time
- Changes instantly appear to other collaborators
14
Communication Task – Email, Chat, Contacts, Chat History
15
Task: Collaborate on Spreadsheet (Communicate)
- Chat with others editing the spreadsheet
16
Task: Collaborate on Spreadsheet (Collaborate)
- Invite others to collaborate on the spreadsheet
17
Task: Collaborate on Spreadsheet (Publish)
- Invite others to view the spreadsheet
18
You can also easily organize all your common tasks
19
Cloud Computing is Powerful: It Can Do What No PC Can Do
Example: Google Search
- Is Google Search faster than search in Windows/Outlook/Word? Yet Google Search must be a much harder problem.
- How much storage does it take to store all of the web pages? 100B pages * 10 KB per page = 1,000 TB of disk!
- Cloud computing has at its disposal:
  - An essentially infinite amount of disk
  - An essentially infinite amount of computation (assuming the work can be parallelized)
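The storage estimate on this slide can be checked with a few lines of arithmetic. The sketch below is only a back-of-the-envelope calculation: the page count and page size come from the slide, while the choice of decimal units (1 KB = 1,000 bytes, 1 TB = 10^12 bytes) and the function name `total_storage_tb` are assumptions made for illustration.

```python
# Back-of-the-envelope check of the slide's storage estimate.
# Page count and page size are taken from the slide; decimal units
# (1 KB = 1,000 bytes, 1 TB = 10**12 bytes) are an assumption.
PAGES = 100 * 10**9          # 100 billion pages
BYTES_PER_PAGE = 10 * 10**3  # 10 KB per page


def total_storage_tb(pages: int, bytes_per_page: int) -> float:
    """Return the raw storage needed, in decimal terabytes."""
    return pages * bytes_per_page / 10**12


if __name__ == "__main__":
    print(f"{total_storage_tb(PAGES, BYTES_PER_PAGE):,.0f} TB")
```

Under these assumptions the result is 1,000 TB, roughly one petabyte, before any replication or indexing overhead is counted.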
20
Web Page Search and Universal Search
- 1st generation: era of single search (not diverse)
- 2nd generation: era of vertical search (too complex)
- 3rd generation: era of Universal Search
[Diagram labels: W; A, B, C, D, E]
21
From Vertical Search to Universal Search
- Integration of user experience
[Diagram labels: A, B, C, D, E]
22
Universal Search Example
24
Cloud Computing Infrastructure
25
GFS Architecture
- Files broken into chunks (typically 64 MB)
- Master manages metadata
- Data transfers happen directly between clients and chunkservers
[Diagram: clients; a GFS master with replica masters; Chunkserver 1 holding chunks C0, C1, C2, C5; Chunkserver 2 holding C1, C3, C5; …; Chunkserver N holding C0, C2, C5]
[Chart labels also present on this slide: Google 48%, Yahoo 33%, MSN 19%]
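The three points on this slide (fixed-size chunks, a master that holds only metadata, and direct client-to-chunkserver transfers) can be illustrated with a toy model. This is a minimal sketch, not the real GFS interface: the class `ToyMaster`, its methods, the file path, and the server names are all hypothetical, and treating 64 MB as 64 * 2^20 bytes is an assumption.

```python
# Toy model of chunk addressing; NOT the real GFS API.
CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB (binary megabytes assumed)


def chunk_index(offset: int) -> int:
    """Index of the chunk that contains the given byte offset."""
    return offset // CHUNK_SIZE


def chunk_count(file_size: int) -> int:
    """Number of chunks needed to hold file_size bytes (ceiling division)."""
    return (file_size + CHUNK_SIZE - 1) // CHUNK_SIZE


class ToyMaster:
    """Hypothetical stand-in for the master: it stores metadata only."""

    def __init__(self):
        # (filename, chunk index) -> list of chunkserver names
        self._locations = {}

    def register(self, filename, index, servers):
        self._locations[(filename, index)] = list(servers)

    def lookup(self, filename, offset):
        """Return (chunk index, chunkservers holding that chunk)."""
        index = chunk_index(offset)
        return index, self._locations[(filename, index)]


if __name__ == "__main__":
    master = ToyMaster()
    master.register("/logs/web.log", 0, ["chunkserver-1", "chunkserver-N"])
    master.register("/logs/web.log", 1, ["chunkserver-1", "chunkserver-2"])
    # The client asks the master where the data lives, then would read
    # the bytes directly from one of the returned chunkservers.
    print(master.lookup("/logs/web.log", 70 * 1024 * 1024))
```

In this sketch the master never touches file contents; it only answers "which chunk, on which servers", which is the division of labor the slide describes.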
26
Typical Cluster
- Cluster-wide services: scheduling masters, GFS master, lock service
- Each Linux machine (Machine 1, Machine 2, …, Machine N) runs a GFS chunkserver, a scheduler slave, and one or more user apps (User app1, User app2, User app3)
27
MapReduce
28
More specifically…
Programmer specifies two primary methods:
- map(k, v) → <k', v'>*
- reduce(k', <v'>*) → <k', v'>*
All v' with the same k' are reduced together, in order.
Usually also specify:
- partition(k', total partitions) → partition for k'
  - often a simple hash of the key
  - allows reduce operations for different k' to be parallelized
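The map, reduce, and partition methods listed on this slide can be sketched as a single-process word count. This is an illustration of the programming model only, not Google's MapReduce library: the names `map_fn`, `reduce_fn`, and `run_mapreduce` are hypothetical, and CRC32 is used as the "simple hash of the key" purely because it is deterministic across runs.

```python
from collections import defaultdict
import zlib


def map_fn(key, value):
    """map(k, v): emit a (word, 1) pair for every word in the text."""
    for word in value.split():
        yield word.lower(), 1


def reduce_fn(key, values):
    """reduce(k', values): sum all counts emitted for one word."""
    yield key, sum(values)


def partition(key, total_partitions):
    """Assign an intermediate key to a partition via a simple hash."""
    return zlib.crc32(key.encode("utf-8")) % total_partitions


def run_mapreduce(inputs, total_partitions=2):
    # Shuffle phase: group intermediate values by key inside each partition.
    partitions = [defaultdict(list) for _ in range(total_partitions)]
    for k, v in inputs:
        for k2, v2 in map_fn(k, v):
            partitions[partition(k2, total_partitions)][k2].append(v2)
    # Reduce phase: partitions are independent, so each one could run on
    # a different worker; keys within a partition are processed in order.
    results = {}
    for part in partitions:
        for k2 in sorted(part):
            for out_key, out_value in reduce_fn(k2, part[k2]):
                results[out_key] = out_value
    return results


if __name__ == "__main__":
    docs = [("d1", "the cloud the web"), ("d2", "The cloud")]
    print(run_mapreduce(docs))
```

Because every occurrence of a key hashes to the same partition, changing `total_partitions` changes how the work is split but not the final counts.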
29
BigTable
- Distributed multi-level map
  - With an interesting data model
- Fault-tolerant, persistent
- Scalable
  - Thousands of servers
  - Terabytes of in-memory data
  - Petabyte of disk-based data
  - Millions of reads/writes per second, efficient scans
- Self-managing
  - Servers can be added/removed dynamically
  - Servers adjust to load imbalance
30
BigTable: Basic Data Model
- Distributed multi-dimensional sparse map
- (row, column, timestamp) → cell contents
- Good match for most of our applications
[Figure: row “www.cnn.com”, column “contents”, cell versions “…” at timestamps t1, t2, t3; axes labeled ROWS, COLUMNS, TIMESTAMPS]
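The (row, column, timestamp) to cell contents mapping can be pictured as a small in-memory structure. The sketch below is a toy, single-machine illustration and not the Bigtable client API: the class `ToySparseTable`, its `put` and `get` methods, the integer timestamps, and the cell values are all assumptions made for the example.

```python
class ToySparseTable:
    """Toy sparse map: (row, column, timestamp) -> cell contents."""

    def __init__(self):
        # Only cells that were written are stored, so the map is sparse.
        # (row, column) -> {timestamp: value}
        self._cells = {}

    def put(self, row, column, timestamp, value):
        self._cells.setdefault((row, column), {})[timestamp] = value

    def get(self, row, column, timestamp=None):
        """Return the newest version at or before timestamp.

        With timestamp=None the latest version is returned; a cell that
        was never written yields None.
        """
        versions = self._cells.get((row, column))
        if not versions:
            return None
        if timestamp is None:
            return versions[max(versions)]
        eligible = [t for t in versions if t <= timestamp]
        return versions[max(eligible)] if eligible else None


if __name__ == "__main__":
    table = ToySparseTable()
    # Integer timestamps 1, 2, 3 stand in for t1, t2, t3 on the slide.
    table.put("www.cnn.com", "contents", 1, "version at t1")
    table.put("www.cnn.com", "contents", 2, "version at t2")
    table.put("www.cnn.com", "contents", 3, "version at t3")
    print(table.get("www.cnn.com", "contents"))
    print(table.get("www.cnn.com", "contents", timestamp=2))
```

Keeping several timestamped versions per cell is what lets a reader ask either for the latest contents or for the contents as of an earlier time.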
31
BigTable: System Architecture
- Bigtable cell
  - Bigtable master: performs metadata ops, load balancing
  - Bigtable tablet servers (three shown): serve data
- Underlying services
  - Cluster scheduling master: handles failover, monitoring
  - GFS: holds tablet data, logs
  - Lock service: holds metadata, handles master election
- Bigtable client, using the Bigtable client library: Open()
32
Thanks Q&A