Bigtable: A Distributed Storage System for Structured Data 1.

Presentation transcript:

Bigtable: A Distributed Storage System for Structured Data 1

Before we begin …
- Bigtable
- Sawzall
- MapReduce
- Bloom Filters

- Introduction
- Data Model
- API
- Building Blocks
- Implementation
- Refinements
- Performance Evaluation
- Real Applications
- Lessons
- Related Work
- Conclusions

What is Bigtable?
- A distributed storage system for managing structured data at Google
- Used by > 60 Google products, including:
  - Google Analytics
  - Google Reader
  - Personalized Search
  - Orkut

Goals
- Wide applicability
- Scalability
- High performance
- High availability
Bigtable and databases
- Bigtable does not support a full relational data model

- A Bigtable is a sparse, distributed, persistent multi-dimensional sorted map
- The map is indexed by a row key, a column key, and a timestamp: (row, column, timestamp) → cell contents
- Example table: Webtable
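The sorted-map idea can be sketched in a few lines of Python. This is only a toy, single-process illustration: the class name TinyTable and its methods are hypothetical, and the row and column names are borrowed from the Webtable example for illustration. It assumes that keys sort by row, then column, then newest timestamp first.

```python
from bisect import insort


class TinyTable:
    """Toy in-memory model of the (row, column, timestamp) -> value map."""

    def __init__(self):
        # Sorted list of (row, column, -timestamp) keys, so that within one
        # cell the newest version sorts first.
        self._keys = []
        self._values = {}

    def put(self, row, column, timestamp, value):
        key = (row, column, -timestamp)
        if key not in self._values:
            insort(self._keys, key)
        self._values[key] = value

    def get(self, row, column, timestamp=None):
        """Return the newest value at or before `timestamp` (newest overall if None)."""
        for r, c, neg_ts in self._keys:  # a linear scan is enough for a sketch
            if (r, c) == (row, column) and (timestamp is None or -neg_ts <= timestamp):
                return self._values[(r, c, neg_ts)]
        return None

    def scan(self, start_row, end_row):
        """Yield (row, column, timestamp, value) for start_row <= row < end_row, in key order."""
        for r, c, neg_ts in self._keys:
            if start_row <= r < end_row:
                yield r, c, -neg_ts, self._values[(r, c, neg_ts)]


table = TinyTable()
table.put("com.cnn.www", "contents:", 3, "<html>v3")
table.put("com.cnn.www", "contents:", 5, "<html>v5")
table.put("com.cnn.www", "anchor:cnnsi.com", 9, "CNN")
latest = table.get("com.cnn.www", "contents:")
older = table.get("com.cnn.www", "contents:", timestamp=4)
```

Because the keys are kept in sorted order, a scan over a row range touches neighbouring entries, which is what makes row-range partitioning natural.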

Rows
- Row keys are arbitrary strings up to 64 KB in size
- Every read or write of data under a single row key is atomic (regardless of the number of columns involved)
- Row ranges are dynamically partitioned into tablets

Column Families
- Column keys are grouped into sets called column families
- Data stored in a column family is usually of the same type
- The number of column families should be small
- The number of columns is unbounded
- Access control is applied at the column-family level

Timestamps
- Each cell in a Bigtable can contain multiple versions of the same data
- Versions are indexed by 64-bit integer timestamps
- Garbage-collection settings are specified per column family:
  - keep only the last n versions of a cell, or
  - keep only new-enough versions
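The two garbage-collection policies can be illustrated with a short sketch. The function names and the dictionary representation of a cell (timestamp mapped to value) are assumptions made for this example, not Bigtable's actual interface.

```python
def keep_last_n(versions, n):
    """Keep only the n newest versions; `versions` maps timestamp -> value."""
    newest = sorted(versions, reverse=True)[:n]
    return {ts: versions[ts] for ts in newest}


def keep_newer_than(versions, now, max_age):
    """Keep only versions whose timestamp lies within `max_age` of `now`."""
    return {ts: value for ts, value in versions.items() if now - ts <= max_age}


# One cell with four versions, keyed by an integer timestamp.
cell = {10: "a", 20: "b", 30: "c", 40: "d"}
last_two = keep_last_n(cell, 2)
recent = keep_newer_than(cell, now=45, max_age=20)
```

Either policy would be applied to every cell of a column family, since the setting is per column family rather than per cell.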

- Rows
- Columns
- Timestamps

- Metadata operations
  - Create/delete tables or column families
  - Change metadata
- Writes (atomic)
- Bigtable does not support general transactions across row keys
- Client-supplied Sawzall scripts
  - do not support writing to Bigtable
  - do support filtering, summarization, and transformation
- Bigtable can be used with MapReduce

- Google File System (GFS)
  - used to store log and data files
- Scheduler (cluster management system)
  - used to manage jobs and resources
- SSTable file format
  - used internally to store Bigtable data
- Chubby distributed lock service
  - highly available, with five active replicas
  - % unavailability for 14 Bigtable clusters
  - % unavailability for most affected cluster

What is a tablet?
- A Bigtable cluster stores a number of tables
- Each table consists of a set of tablets
- Each tablet is managed by a specific tablet server
- As a table grows, it is automatically split into multiple tablets, ( ) MB in size by default
- Tablet servers handle read/write requests for their tablets

Bigtable: Servers
- The master manages the assignment of tablets to tablet servers
[Figure: a Bigtable master, tablet server 1, tablet server 2, and their tablets]

Tablet Location
- A three-level hierarchy of tablets is used to store tablet locations
- The root tablet is never split
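A lookup through such a three-level hierarchy might look like the sketch below. It assumes, following the Bigtable paper, that a Chubby file points at the root tablet, the root tablet points at METADATA tablets, and METADATA tablets point at user tablets identified by table name and end row. The RangeIndex class, the server names, and the sentinel end key are hypothetical simplifications.

```python
from bisect import bisect_left


class RangeIndex:
    """Maps a key to the entry whose end key is the first one >= that key."""

    def __init__(self, entries):
        # entries: list of (end_key, target) pairs, sorted by end_key
        self._ends = [end for end, _ in entries]
        self._targets = [target for _, target in entries]

    def lookup(self, key):
        i = bisect_left(self._ends, key)
        if i == len(self._ends):
            raise KeyError(key)
        return self._targets[i]


MAX = "\uffff"  # sentinel: sorts after every row key used in this sketch

# Level 3: METADATA tablets map (table, end_row) -> server holding the user tablet.
meta_a = RangeIndex([(("users", "m"), "server-1"), (("users", MAX), "server-2")])
meta_b = RangeIndex([(("webtable", "com.f"), "server-3"), (("webtable", MAX), "server-1")])

# Level 2: the root tablet maps key ranges of the METADATA table to METADATA tablets.
root = RangeIndex([(("users", MAX), meta_a), (("webtable", MAX), meta_b)])

# Level 1: a well-known pointer to the root tablet (a Chubby file in the paper).
chubby = {"root_tablet": root}


def locate(table, row):
    """Walk the three levels and return the server responsible for (table, row)."""
    root_tablet = chubby["root_tablet"]
    meta_tablet = root_tablet.lookup((table, row))
    return meta_tablet.lookup((table, row))


server = locate("webtable", "com.cnn.www")
```

Because the root tablet is never split, the hierarchy stays at exactly three levels no matter how many METADATA tablets exist.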

Tablet Assignment
- A master server is responsible for assigning tablets to tablet servers
- The master server also:
  - detects the addition and expiration of tablet servers
  - balances tablet-server load
  - initiates garbage collection of files in GFS
  - reassigns tablets when a tablet server is lost
- If the master server dies, a new master server is started

Tablet Serving
- The persistent state of a tablet is stored in GFS
[Figure: tablet representation, showing Read Op, Write Op, the memtable (memory), and the SSTable files (GFS)]

Compactions
- Minor compactions
  - the memtable size reaches a threshold
  - the memtable is frozen
  - a new memtable is created
  - the frozen memtable is converted into a new SSTable
- Merging compactions
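The minor-compaction steps listed above can be sketched as follows. This is a toy model under stated assumptions: the Tablet class, the entry-count threshold, and the in-memory tuples standing in for SSTable files are all hypothetical, and the merging compaction shown is a simplified version that only merges existing SSTables.

```python
class Tablet:
    """Toy tablet: one mutable memtable plus a list of immutable sorted runs."""

    def __init__(self, memtable_limit=3):
        self.memtable_limit = memtable_limit  # hypothetical threshold, counted in entries
        self.memtable = {}                    # recent writes, held in memory
        self.sstables = []                    # immutable sorted runs, newest last

    def write(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.minor_compaction()

    def minor_compaction(self):
        frozen = self.memtable                # freeze the current memtable
        self.memtable = {}                    # a new memtable takes further writes
        self.sstables.append(tuple(sorted(frozen.items())))  # frozen memtable -> new SSTable

    def read(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for sstable in reversed(self.sstables):  # the newest SSTable wins
            for k, v in sstable:
                if k == key:
                    return v
        return None

    def merging_compaction(self):
        """Merge all SSTables into one; the newest value for each key wins."""
        merged = {}
        for sstable in self.sstables:         # oldest first, so newer values overwrite
            merged.update(sstable)
        self.sstables = [tuple(sorted(merged.items()))]


tablet = Tablet(memtable_limit=2)
tablet.write("a", 1)
tablet.write("b", 2)   # reaches the threshold and triggers a minor compaction
tablet.write("a", 3)
tablet.write("c", 4)   # triggers a second minor compaction
sstables_before_merge = len(tablet.sstables)
tablet.merging_compaction()
```

Reads consult the memtable first and then the SSTables from newest to oldest, which is why bounding the number of SSTables with merging compactions keeps reads cheap.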

- A number of refinements were required for the Bigtable implementation to achieve high:
  - performance
  - availability
  - reliability

Locality groups
- Clients can group multiple column families together into a locality group
- A separate SSTable is generated for each locality group
- Segregating column families that are not typically accessed together enables more efficient reads
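As a rough illustration of why locality groups help reads, the sketch below partitions cells into one store per locality group. The group assignment, the family names, and the "family:qualifier" column naming used to extract the family are assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical assignment of column families to locality groups.
LOCALITY_GROUPS = {"contents": "page", "language": "meta", "checksum": "meta"}


def split_by_locality_group(cells):
    """Partition (row, column, value) cells into one sorted store per locality group."""
    stores = defaultdict(list)
    for row, column, value in cells:
        family = column.split(":", 1)[0]  # assumes "family:qualifier" column keys
        stores[LOCALITY_GROUPS[family]].append((row, column, value))
    return {group: sorted(rows) for group, rows in stores.items()}


cells = [
    ("com.cnn.www", "contents:", "<html>..."),
    ("com.cnn.www", "language:", "EN"),
    ("com.cnn.www", "checksum:", "abc123"),
]
stores = split_by_locality_group(cells)
```

A reader that only needs the small metadata columns can then open the "meta" store alone and never touch the much larger page contents.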

Refinements: Compression
- Clients can control whether compression is used for a locality group
- Many clients use a two-pass compression scheme
  - the first pass uses Bentley and McIlroy's scheme

Refinements: Caching & Bloom Filters
- Tablet servers use two levels of caching to improve read performance
  - Scan caching is useful for data that tends to be read repeatedly
  - Block caching is useful when reads tend to be close to data that was recently read
- Bloom filters reduce disk seeks by answering whether an SSTable might contain any data for a given row/column pair
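A minimal Bloom filter sketch shows how such a membership question can be answered without touching disk. The bit-array size, the number of hash functions, and the use of SHA-256 are arbitrary choices for this illustration and say nothing about what Bigtable actually uses.

```python
import hashlib


class BloomFilter:
    """Probabilistic set: no false negatives, occasional false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, key):
        # Derive num_hashes bit positions by salting the key with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key))


# One filter per SSTable, populated with the row/column pairs it stores.
sstable_filter = BloomFilter()
sstable_filter.add("com.cnn.www/contents:")
present = sstable_filter.might_contain("com.cnn.www/contents:")
```

When might_contain returns False, the SSTable definitely holds no data for that pair and the disk read can be skipped; a True answer still requires checking the SSTable itself.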

Refinements: Speeding Up Tablet Recovery
- When a tablet is moved to another tablet server:
  - the source tablet server performs a minor compaction
  - the tablet server stops serving the tablet
  - it performs another minor compaction (usually fast)
  - the tablet is then moved without requiring any log-entry recovery

Refinements: Exploiting Immutability
- Because SSTables are immutable, various parts of the Bigtable system have been simplified:
  - no synchronization of file-system accesses is needed when reading SSTables
  - permanently removing deleted data is handled entirely through garbage collection
  - splitting tablets is efficient because child tablets can share the SSTables of the parent tablet

- Google set up a Bigtable cluster with N tablet servers to measure performance and scalability as N is varied
  - tablet servers were configured to use 1 GB of memory
  - each machine had two 400 GB IDE hard drives, two dual-core 2 GHz chips, and a single gigabit Ethernet link
- N client machines generated the Bigtable load used for the tests
- Every machine ran a GFS server

Performance Evaluation: Single Tablet-Server Performance
[Table: results per experiment by # of tablet servers; experiments: random reads, random reads (mem), random writes, sequential reads, sequential writes, scans]

Performance Evaluation
- Scaling: aggregate throughput increases by over a factor of 100 as the number of tablet servers is increased from 1 to 500

Real Applications
- As of August 2006:
  - 388 non-test Bigtable clusters
  - tablet servers
[Table: # of tablet servers vs. # of clusters; surviving row: > 500 tablet servers, 12 clusters]

Real Applications
- This table provides some data about a few of the tables currently in use
- Table size (measured before compression) and # Cells indicate approximate sizes

Real Applications: Google Analytics
- Google Analytics is supported by 2 Bigtables
  - 200 TB raw click table
  - 20 TB summary table

Real Applications: Google Earth
- Google Earth is supported by 2 Bigtables
  - 70 TB images table, compression turned off
  - 500 GB index table

Real Applications: Personalized Search
- Personalized Search is supported by 1 Bigtable
  - one row per user id
  - a separate column family for each type of action

Lessons learned
- Large distributed systems are vulnerable to many types of failures:
  - memory and network corruption
  - hung machines
  - extended and asymmetric network partitions
  - bugs in other systems (e.g., Chubby)
  - overflow of GFS quotas
  - planned and unplanned hardware maintenance
- To address the problems experienced:
  - some protocols have been changed
  - some assumptions have been modified

Lessons learned
- It is important to delay adding new features until it is clear how the new features will be used
- It is important to support system-level monitoring
  - it allowed many issues to be detected and fixed
  - it also enables tracking clusters to answer common questions

Related Work
- The Boxwood project's goal is to provide infrastructure for building higher-level services such as file systems or databases,
  while the goal of Bigtable is to directly support client applications that wish to store data

Related Work
- C-Store and Bigtable share many characteristics:
  - a shared-nothing architecture
  - two different data structures
- However, the two systems differ significantly in their APIs and performance optimizations

Conclusions
- Bigtable is a distributed system for storing structured data at Google
  - in production since April 2005
  - seven person-years to design and implement
  - more than 60 projects were using it as of August 2006
  - users like its performance and high availability
- Users can scale their application's capacity simply by adding more machines to the system

Conclusions
- Google has begun deploying Bigtable as a service to product groups
- Google has gained significant advantages by building its own storage solution
  - it has control over implementation and infrastructure
  - it can remove bottlenecks and inefficiencies as they arise

Strengths
- Implemented and usable
- Optimization
- Performance evaluation
- Used by > 60 Google products

Weaknesses
- Complexity
- Chubby
- Master
- Network
