The Google Cluster Architecture
Written by: Luiz André Barroso, Jeffrey Dean, Urs Hölzle
Presented by: Omkar Kasinadhuni, Simerjeet Kaur

Agenda
► Introduction
► Architecture Overview
  ▪ Serving a Google query
  ▪ Using replication
► Leveraging Commodity Parts
► The Power Problem
► Hardware-Level Application Characteristics
  ▪ Memory system
  ▪ Large-scale multiprocessing
► Conclusion

Introduction
► Building an infrastructure capable of executing thousands of queries per second, where each query consumes billions of CPU cycles, is a real challenge.
► A single Google query reads hundreds of megabytes of data and consumes tens of billions of CPU cycles.
► The Google architecture features clusters of more than 15,000 commodity-class PCs with fault-tolerant software to achieve superior performance.
► Combining commodity-class PCs with fault-tolerant software gives higher performance than a smaller group of high-end servers.

► Google benefits from on-chip parallelism (multithreading, on-chip multiprocessors)
Important design factors:
► Energy efficiency (considering thousands of machines)
► Price/performance ratio (not peak processor performance)

Google Architecture Overview
► Reliability in software (not in server-class hardware)
  ▪ By replicating services
► Aggregate request throughput (not peak server response time)
  ▪ Manage response time by parallelizing individual requests

Figure: Google search results for the query "ieee society"

Serving a Google Query
► The user enters a query on Google
► The user's browser performs a DNS lookup
► DNS-based load balancing selects a cluster
► The user's browser sends an HTTP request to that cluster
► A hardware-based local load balancer distributes the request among the Google Web Servers (GWSs) in that cluster
► The GWS machine returns the result to the user's browser as HTML
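As an illustration of the two routing tiers just described, here is a minimal round-robin sketch (the cluster and server names are hypothetical; real DNS-based selection also weighs the user's geographic proximity and each cluster's available capacity):

    import itertools

    class LoadBalancer:
        """Toy round-robin balancer over a pool of replicas."""
        def __init__(self, servers):
            self._cycle = itertools.cycle(servers)

        def pick(self):
            return next(self._cycle)

    # Two tiers, mirroring DNS -> cluster -> GWS (names are made up).
    clusters = LoadBalancer(["cluster-us-east", "cluster-eu-west"])
    gws_pool = {"cluster-us-east": LoadBalancer(["gws-1", "gws-2", "gws-3"]),
                "cluster-eu-west": LoadBalancer(["gws-4", "gws-5"])}

    cluster = clusters.pick()       # stands in for the DNS-based step
    gws = gws_pool[cluster].pick()  # stands in for the hardware balancer
    print(f"query routed to {gws} in {cluster}")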

April 4th, 2007Google Cluster Architecture8 Addresses: , , , Aliases: Grid Clusters Worldwide Clusters HTTP Request HTML Response DNS-based load-balancing DNS Lookup -> IP Cluster addresses returned

Serving a Google Query Internally
The GWS machine coordinates query execution in two phases:
Phase 1 (index lookup phase):
► Index servers use an inverted index
  ▪ Maps each query word to a list of matching documents (hit list)
► Compute a relevance score for each matching document
  ▪ The score determines the order of results on the output page
► The search process is challenging
  ▪ Tens of terabytes of uncompressed data in the documents
  ▪ The inverted index itself is many terabytes of data
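A toy sketch of the inverted-index lookup described above (the documents and the scoring by raw term counts are illustrative assumptions, not Google's actual relevance function):

    from collections import defaultdict

    # Map each word to the set of docids containing it -- the "hit list".
    docs = {1: "ieee computer society",
            2: "google cluster architecture",
            3: "ieee society news from the ieee society"}

    index = defaultdict(set)
    for docid, text in docs.items():
        for word in text.split():
            index[word].add(docid)

    def search(query):
        words = query.split()
        # Intersect the hit lists, then order by a stand-in relevance
        # score (here simply the total count of query terms in the doc).
        hits = set.intersection(*(index[w] for w in words))
        score = lambda d: sum(docs[d].split().count(w) for w in words)
        return sorted(hits, key=score, reverse=True)

    print(search("ieee society"))   # -> [3, 1]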

Serving a Google Query
Search is highly parallelizable:
► Divide the index into pieces called index shards
► Each shard is served by a pool of machines
► An intermediate load balancer chooses a machine (or subset of machines) per shard
► Even if a machine fails, the service remains uninterrupted
► The final result of the first phase is an ordered list of document identifiers (docids); see the sketch after the figure below

Serving a Google Query
Figure: Index shards. The full index (a–z) is divided into shards: shard 1 (a–h), shard 2 (i–o), …, shard n−1 (q–v), shard n (w–z).
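A minimal sketch of the shard fan-out and merge (the shard contents and scores are made-up data; in the real system each shard is itself served by a pool of replicas behind a load balancer):

    from concurrent.futures import ThreadPoolExecutor

    # Each shard holds the hits for its slice of the document set; since
    # a document lives in exactly one shard, merging is a simple sort.
    shards = [
        {"ieee": {1: 2.0, 7: 0.5}},
        {"ieee": {42: 1.3}, "society": {42: 0.9, 99: 0.4}},
    ]

    def query_shard(shard, words):
        scores = {}
        for w in words:
            for docid, s in shard.get(w, {}).items():
                scores[docid] = scores.get(docid, 0.0) + s
        return scores

    def search_all(words):
        # Fan the query out to all shards in parallel, then merge the
        # partial results into one ordered docid list.
        with ThreadPoolExecutor() as pool:
            partials = pool.map(lambda sh: query_shard(sh, words), shards)
        merged = [(s, d) for part in partials for d, s in part.items()]
        return [d for s, d in sorted(merged, reverse=True)]

    print(search_all(["ieee", "society"]))   # -> [42, 1, 7, 99]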

Google Web Servers (GWSs)
Figure 1: Google query-serving architecture

Google Web Servers (GWSs)
Phase 2 (document-serving phase):
► Use the list of docids to generate the actual title, URL, and query-specific document summary for each result
► Document servers (docservers) fetch each document from disk and extract its title and a keyword-in-context snippet
► The docservers handle documents by partitioning them across machines, just as in the index lookup phase
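A toy keyword-in-context extraction (an illustrative stand-in, not the docservers' actual algorithm): return a window of words around the first query term found in the document:

    def snippet(text, query_words, window=4):
        words = text.split()
        for i, w in enumerate(words):
            if w.lower() in query_words:
                lo, hi = max(0, i - window), i + window + 1
                return "... " + " ".join(words[lo:hi]) + " ..."
        return " ".join(words[:2 * window]) + " ..."   # fallback: document head

    doc = "The IEEE Computer Society is the premier source for computing information"
    print(snippet(doc, {"society"}))
    # -> ... The IEEE Computer Society is the premier source ...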

Google Web Servers (GWSs)
► The docserver cluster must have access to an online copy of the entire Web
► The GWS also initiates auxiliary tasks such as spell checking and ad serving
Figure 1: Google query-serving architecture

Using Replication
► Google maintains several copies of the entire Web across its clusters for performance and availability
► Most accesses to the index are read-only; updates are applied by diverting queries away from a service replica while it is updated
► Adding machines to each pool increases serving capacity; adding shards accommodates index growth

Using Replication
► Parallelizing the search reduces average latency
► Individual shards do not need to communicate with one another, so the speedup is nearly linear
► The CPU speed of the individual index servers therefore does not directly influence the search's overall performance

Google Cluster Design Principles
► Reliability in software
► Use replication for better request throughput and availability
► Price/performance beats peak performance
► Using commodity PCs reduces the cost of computation

Leveraging Commodity Parts
► Google's racks consist of 40 to 80 x86-based servers mounted in custom-made racks
► Several CPU generations are in use, from single-processor 533-MHz Intel Celerons to dual 1.4-GHz Intel Pentium IIIs
► Each server contains one or more 80-GB IDE drives
► Index servers have less disk space than document servers
► Servers within a rack are interconnected via a 100-Mbps Ethernet switch
► A core gigabit switch connects all racks together

Leveraging Commodity Parts
► The selection criterion is cost per query, i.e., the sum of capital expense (with depreciation) and operating costs (hosting, administration, repairs) divided by performance
  ▪ 1 rack: $287,000 for 3 years (capital cost $278,000 + operating cost $9,000)
  ▪ Queries: 1,000 per second, or roughly 94.6 × 10^9 over 3 years
  ▪ Cost per query = $287,000 / (94.6 × 10^9) ≈ 0.0003 cents
► Load balancing between current-generation (faster) machines and older machines is difficult
► The relatively short amortization period causes equipment cost to have a significant impact on the overall cost equation
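The cost-per-query arithmetic, worked through (the dollar figures and the 1,000 queries/second rate are the slides' own assumptions):

    # One rack over a 3-year amortization period.
    capital_cost = 278_000          # dollars
    operating_cost = 9_000          # dollars
    queries_per_second = 1_000
    seconds_in_3_years = 3 * 365 * 24 * 3600

    total_queries = queries_per_second * seconds_in_3_years   # ~9.46e10
    cost_per_query_cents = 100 * (capital_cost + operating_cost) / total_queries
    print(f"{cost_per_query_cents:.4f} cents per query")      # -> 0.0003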

Leveraging Commodity Parts
► Rack configuration (costs about $278,000):
  ▪ 88 dual-CPU, 2-GHz Intel Xeon servers
  ▪ 2 GB of RAM per server
  ▪ 80-GB hard disk per server
► Monthly capital cost of $7,700 over 3 years
► In total: 176 2-GHz Xeon CPUs, 176 GB of RAM, and about 7 TB of disk space
► A typical multiprocessor x86-based server (costs about $758,000):
  ▪ Eight 2-GHz Xeon CPUs
  ▪ 64 GB of RAM
  ▪ 8 TB of hard disk space
► In comparison, the multiprocessor server is about 3 times more expensive but has 22 times fewer CPUs, 3 times less RAM, and only slightly more disk space

Leveraging Commodity Parts
► Expensive equipment increases absolute performance but decreases performance per unit price
► High-end servers provide high interconnect bandwidth and reliability, which Google instead achieves with its highly redundant architecture
► Managing thousands of PCs instead of a few servers incurs system administration and repair costs, but Google's homogeneous application makes these costs manageable

The Power Problem
► A mid-range server:
  ▪ requires 90 W of DC power: 55 W for the two CPUs, 10 W for the disk drive, 25 W for DRAM and motherboard
  ▪ draws about 120 W of AC power (including power-supply inefficiencies)
► Roughly 10 kW per rack
► Power density = 400 W/ft² (one rack occupies about 25 ft²)
► Power density with high-end processors = 700 W/ft²
► A typical commercial data center supports power densities of only 70 to 150 W/ft²

The Power Problem
► Low-power servers must not carry a large price premium
► Watts per unit of performance matters more than watts alone
► A 10-kW rack consumes about 10 MWh of power per month (including cooling overhead)
► At 15 cents per kWh, power and cooling cost about $1,500 per month, much less than the $7,700 monthly capital cost
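The power-cost arithmetic, worked through (the 1.4× cooling-overhead factor is an assumption chosen to land near the stated 10 MWh per month; the other figures are from the slides):

    rack_power_kw = 10
    hours_per_month = 24 * 30
    cooling_overhead = 1.4                      # assumed multiplier

    energy_mwh = rack_power_kw * hours_per_month * cooling_overhead / 1000
    cost = energy_mwh * 1000 * 0.15             # 15 cents per kWh
    print(f"{energy_mwh:.1f} MWh/month, ${cost:,.0f}/month")  # ~10 MWh, ~$1,500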

Hardware-Level Application Characteristics
► Index servers contribute the most to overall price/performance
► There is not much exploitable instruction-level parallelism in the workload
► The profitable way to exploit parallelism in the index server is through its trivially parallelizable computation
► Both simultaneous multithreading (SMT) and chip multiprocessors (CMP) target this thread-level parallelism

Hardware-Level Application Characteristics
► CMP yields more performance than SMT
► CMP systems use short-pipeline cores, which reduce branch-misprediction penalties
► Thread-level parallelism should allow nearly linear speedup with the number of cores
► A shared L2 cache speeds up interprocessor communication

Memory System
► Index data blocks exhibit no temporal locality
► Accesses within an index data block benefit from spatial locality
► Even with modest cache sizes, good overall cache hit ratios are achieved
► Memory bandwidth is not a bottleneck; the memory bus utilization of a Pentium-class processor system is well under 20 percent
► A modest-sized L2 cache, short L2 and memory latencies, and longer cache lines are most effective for the Google workload

Large-Scale Multiprocessing
Large-scale shared-memory machines are useful when:
► The computation-to-communication ratio is low
► Communication patterns or data partitioning are dynamic or hard to predict
► Total cost of ownership dwarfs hardware costs
None of these requirements apply to Google

Large-Scale Multiprocessing
► All the software is produced in-house
► System management overhead is minimized through automation and monitoring
► Google's hardware cost is only a small fraction of the total system operating expenses

Conclusion
► The cluster solution is well suited for both performance and availability
► The system achieves high fault tolerance through redundancy, not through hardware capabilities
► Any application that focuses on price/performance and can run on servers with no private state can benefit from a similar architecture

Discussion
► The disk capacity in each cluster grows at one rate; the Web grows at another. Is the rate at which the Web grows a potential problem?
► Cost per query is an important factor in the Google architecture; how would you relate it to cost per advertisement?
► At Google's scale of operation, what are the complexities of data storage management?