Bigtable: A Distributed Storage System for Structured Data

Presentation transcript:

Bigtable: A Distributed Storage System for Structured Data Written By: Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, and Robert E. Gruber, Google, Inc. Presented By: Manoher Shatha & Naveen Kumar Ratkal

Overview: Introduction, Data Model, API, Building Blocks, Implementation, Refinements, Real Applications, Conclusion, Discussion.

Introduction Bigtable is a distributed storage system for managing structured data that is designed to scale to a very large size: petabytes of data across thousands of commodity servers. Why not a commercial DB? No commercial database is big enough to store petabytes of data, and even if one existed it would be very costly; low-level storage optimizations are also difficult to perform on a commercial DB. With careful design and implementation, Bigtable achieves wide applicability, scalability, and high availability, and it is used by more than sixty Google products and projects.

Database vs. Bigtable

Feature              | Database                       | Bigtable
Relational model     | Supported by most databases    | No
Atomic transactions  | All transactions are atomic    | Limited (single-row only)
Data types           | Many data types supported      | Uninterpreted strings of bytes
ACID guarantees      | Yes                            | No
Operations           | insert, delete, update, etc.   | read, write, update, delete, etc.

Data Model Bigtable is a sparse, distributed, persistent, multidimensional sorted map. The map is indexed by row key, column key, and timestamp: (row: string, column: string, time: int64) -> string. Rows are ordered lexicographically by row key, and the row range of a table is dynamically partitioned; each row range is called a tablet. Column keys use the syntax family:qualifier. Cells can store multiple versions of the same data, distinguished by timestamp. (Figure 1, the example Webtable from the paper: the row "com.cnn.www" has a "contents:" column with versions at timestamps t3, t5, and t6, plus anchor columns "anchor:cnnsi.com" and "anchor:my.look.ca" holding the anchor texts "CNN" and "CNN.com" at t9 and t8.)
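
To make the (row, column, timestamp) -> value indexing concrete, here is a minimal sketch of the data model using nested standard C++ maps. It is purely illustrative, not Google's implementation, and it ignores sparseness on disk, distribution, and persistence.

    // Illustrative sketch only: Bigtable's logical data model,
    // (row:string, column:string, timestamp:int64) -> string, as nested std::maps.
    #include <cstdint>
    #include <cstdio>
    #include <functional>
    #include <map>
    #include <string>

    using Versions = std::map<int64_t, std::string, std::greater<int64_t>>;  // newest first
    using Row      = std::map<std::string, Versions>;  // "family:qualifier" -> versions
    using Table    = std::map<std::string, Row>;       // row key -> row, kept sorted

    int main() {
      Table webtable;
      // Row keys are reversed hostnames so pages from one domain sort near each other.
      webtable["com.cnn.www"]["anchor:cnnsi.com"][9]  = "CNN";
      webtable["com.cnn.www"]["anchor:my.look.ca"][8] = "CNN.com";
      webtable["com.cnn.www"]["contents:"][6]         = "<html>...</html>";

      // Read the most recent version of one cell.
      const Versions& v = webtable["com.cnn.www"]["contents:"];
      std::printf("latest contents at t%lld: %s\n",
                  static_cast<long long>(v.begin()->first), v.begin()->second.c_str());
      return 0;
    }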

API Writing to Bigtable (example taken from the paper):

    // Open the table
    Table *T = OpenOrDie("/bigtable/web/webtable");
    // Write a new anchor and delete an old anchor
    RowMutation r1(T, "com.cnn.www");
    r1.Set("anchor:www.c-span.org", "CNN");
    r1.Delete("anchor:www.abcnews.com");
    Operation op;
    Apply(&op, &r1);

API (continued) Reading from Bigtable (example taken from the paper):

    Scanner scanner(T);
    ScanStream *stream;
    stream = scanner.FetchColumnFamily("anchor");
    stream->SetReturnAllVersions();
    scanner.Lookup("com.cnn.www");
    for (; !stream->Done(); stream->Next()) {
      printf("%s %s %lld %s\n", scanner.RowName(), stream->ColumnName(),
             stream->MicroTimestamp(), stream->Value());
    }

Building Blocks GFS: Bigtable uses the Google File System to store its data. Cluster management: Google's cluster management system schedules jobs and manages the machines in a Bigtable cluster. Chubby: a highly available distributed lock service that allows a multi-thousand-node Bigtable cluster to stay coordinated; it runs five replicas, one of which is elected master.

SSTables The SSTable is the underlying file format used to store Bigtable data. SSTables are immutable: newly written data goes into a new SSTable, and old SSTables are eventually garbage collected. Internally an SSTable is a sequence of blocks (typically 64 KB) plus a block index used to locate the block that contains a given key. (Figure, from Erik Paulson's presentation: an SSTable composed of 64 KB blocks followed by an index.)
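
As a rough illustration of how a block index is used, the sketch below models an SSTable as sorted blocks plus an index of each block's last key; a lookup binary-searches the index and then scans only the selected block. The types and names (SSTableSketch, Get) are invented for this example and are not the paper's implementation.

    // Illustrative sketch (not Google's code): locating a key in an SSTable-like
    // structure via a block index, then scanning a single block.
    #include <algorithm>
    #include <cstdio>
    #include <optional>
    #include <string>
    #include <utility>
    #include <vector>

    struct Block {
      std::vector<std::pair<std::string, std::string>> entries;  // sorted by key
    };

    struct SSTableSketch {
      std::vector<Block> blocks;
      std::vector<std::string> index;  // last key of each block, in block order

      std::optional<std::string> Get(const std::string& key) const {
        // Binary-search for the first block whose last key is >= key.
        auto it = std::lower_bound(index.begin(), index.end(), key);
        if (it == index.end()) return std::nullopt;
        const Block& b = blocks[it - index.begin()];
        for (const auto& kv : b.entries)
          if (kv.first == key) return kv.second;
        return std::nullopt;
      }
    };

    int main() {
      SSTableSketch t;
      Block b1; b1.entries = {{"apple", "1"}, {"boat", "2"}};
      Block b2; b2.entries = {{"cat", "3"}, {"dog", "4"}};
      t.blocks = {b1, b2};
      t.index  = {"boat", "dog"};  // last key of each block
      if (auto v = t.Get("cat")) std::printf("cat -> %s\n", v->c_str());
      return 0;
    }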

Tablets and Tables A tablet contains a contiguous range of rows. Bigtable stores tables; each table consists of a set of tablets, and each tablet holds all rows in its row range (roughly 100-200 MB of data per tablet by default). (Figures, from Erik Paulson's presentation: a tablet covering the row range aardvark-apple built from several SSTables, and a table made up of multiple tablets such as aardvark-apple and apple_two-boat.)

Implementation Bigtable has three major components: a library linked into every client, one master server, and many tablet servers. Tablet servers can be added or removed dynamically based on workload. The master assigns tablets to tablet servers; because clients communicate directly with tablet servers for reads and writes, the master is lightly loaded.

Tablet Location A three-level hierarchy is used to locate tablets: a file in Chubby points to the root tablet (the first METADATA tablet); the root tablet points to all other METADATA tablets; and each METADATA tablet points to the tablets of the user tables (User Table 1 through User Table N). (Figure: tablet location hierarchy.)
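
The sketch below walks through this three-level lookup for a single row key. The function names and stub bodies are hypothetical stand-ins for calls to Chubby and to tablet servers, not the real client library; in the actual system clients also cache tablet locations so that most reads skip these lookups entirely.

    // Hypothetical sketch of the three-level tablet location lookup.
    #include <cstdio>
    #include <string>

    struct TabletLocation {
      std::string tablet_server;  // which server holds the tablet
      std::string tablet_id;      // which tablet on that server
    };

    // Stubbed stand-ins for network calls (assumptions, not real APIs).
    TabletLocation ReadRootLocationFromChubby() {
      return {"ts-root.example", "root-metadata-tablet"};
    }
    TabletLocation LookupInMetadataTablet(const TabletLocation& meta,
                                          const std::string& key) {
      // A real lookup would ask the tablet server 'meta' for the METADATA row
      // covering 'key'; here we fabricate a plausible answer.
      return {"ts-42.example", "child-of-" + meta.tablet_id + ":" + key};
    }

    TabletLocation LocateUserTablet(const std::string& table,
                                    const std::string& row_key) {
      TabletLocation root = ReadRootLocationFromChubby();          // level 1: Chubby file -> root tablet
      TabletLocation meta = LookupInMetadataTablet(root, table);   // level 2: root -> METADATA tablet
      return LookupInMetadataTablet(meta, table + ":" + row_key);  // level 3: METADATA -> user tablet
    }

    int main() {
      TabletLocation loc = LocateUserTablet("webtable", "com.cnn.www");
      std::printf("row served by %s (%s)\n",
                  loc.tablet_server.c_str(), loc.tablet_id.c_str());
      return 0;
    }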

Tablet Assignment Each tablet is assigned to one tablet server, which serves client requests for it. The master keeps track of live tablet servers (via periodic RPCs) and assigns unassigned tablets to them. Each tablet server acquires an exclusive lock on a uniquely named file in a Chubby directory; if a tablet server terminates, it releases its lock, and status is reported back to the master. How does the master learn about tablets and tablet servers? It acquires a unique master lock in Chubby, scans the servers directory in Chubby to find live tablet servers, communicates with each tablet server to learn which tablets it is serving, and scans the METADATA table to find the unassigned tablets. (These steps are taken from the Bigtable paper.)

Tablet Assignment (continued) (Diagram of the master's startup scan: the master acquires its unique lock in the Chubby directory; scanning that directory tells it which tablet servers are alive; it then communicates with every tablet server to learn which tablets each is serving; finally it scans the METADATA table for tablets that remain unassigned.)
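
The same four steps can be summarized as a hypothetical outline in code. All of the helper names and stub bodies below are invented for illustration; the real calls are Chubby operations, RPCs to tablet servers, and a METADATA table scan.

    // Hypothetical outline of the master's startup scan (invented names/stubs).
    #include <cstdio>
    #include <set>
    #include <string>
    #include <vector>

    void AcquireMasterLockInChubby() { /* grab the unique master lock */ }
    std::vector<std::string> ScanChubbyServersDirectory() { return {"ts1", "ts2"}; }
    std::vector<std::string> AskServerForItsTablets(const std::string& s) {
      return s == "ts1" ? std::vector<std::string>{"tabletA"}
                        : std::vector<std::string>{"tabletB"};
    }
    std::vector<std::string> ScanMetadataForAllTablets() {
      return {"tabletA", "tabletB", "tabletC"};
    }

    std::set<std::string> FindUnassignedTablets() {
      AcquireMasterLockInChubby();                                 // step 1: become the master
      std::set<std::string> assigned;
      for (const auto& server : ScanChubbyServersDirectory())      // step 2: find live servers
        for (const auto& tablet : AskServerForItsTablets(server))  // step 3: what they serve
          assigned.insert(tablet);
      std::set<std::string> unassigned;
      for (const auto& tablet : ScanMetadataForAllTablets())       // step 4: all known tablets
        if (!assigned.count(tablet)) unassigned.insert(tablet);
      return unassigned;                                           // here: {"tabletC"}
    }

    int main() {
      for (const auto& t : FindUnassignedTablets())
        std::printf("unassigned: %s\n", t.c_str());
      return 0;
    }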

Tablet Serving Updates are written to a commit log and then inserted into an in-memory buffer called the memtable; older updates live in SSTables, and during recovery the commit log is replayed to rebuild the memtable. When a read or write arrives at a tablet server, the server checks that the request is well-formed and that the sender is authorized (Chubby holds the permission file). A valid mutation is written to the commit log, group commit is used to improve throughput, and the mutation is then applied to the memtable; reads are served from a merged view of the memtable and the SSTables. (Figure, taken from the paper: the memtable in memory, the tablet log and SSTable files in GFS, with the write path going through the log and memtable and the read path over the merged view.)
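
The sketch below imitates this write and read path with invented types (TabletSketch, KVStore); it keeps the "log" and "SSTables" in memory and leaves out GFS, locking, and group commit, so it is only a rough illustration of the data flow.

    // Illustrative tablet-serving sketch: writes append to a log then update the
    // memtable; reads merge the memtable with older SSTable contents.
    #include <cstdint>
    #include <cstdio>
    #include <map>
    #include <optional>
    #include <string>
    #include <utility>
    #include <vector>

    using Cell    = std::pair<int64_t, std::string>;           // (timestamp, value)
    using KVStore = std::map<std::string, std::vector<Cell>>;  // key -> versions

    struct TabletSketch {
      std::vector<std::string> commit_log;  // stands in for the tablet log in GFS
      KVStore memtable;                     // recent writes, in memory
      std::vector<KVStore> sstables;        // older, immutable data

      void Write(const std::string& key, int64_t ts, const std::string& value) {
        commit_log.push_back(key + "@" + std::to_string(ts) + "=" + value);  // 1. log first
        memtable[key].push_back({ts, value});                                // 2. then memtable
      }

      // Read the newest version of a key from the merged view.
      std::optional<std::string> Read(const std::string& key) const {
        std::optional<Cell> best;
        auto consider = [&](const KVStore& s) {
          auto it = s.find(key);
          if (it == s.end()) return;
          for (const Cell& c : it->second)
            if (!best || c.first > best->first) best = c;
        };
        consider(memtable);
        for (const KVStore& sst : sstables) consider(sst);
        if (!best) return std::nullopt;
        return best->second;
      }
    };

    int main() {
      TabletSketch t;
      t.Write("com.cnn.www", 9, "CNN");
      if (auto v = t.Read("com.cnn.www")) std::printf("%s\n", v->c_str());  // prints CNN
      return 0;
    }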

Compaction Minor compaction: when the memtable reaches a threshold size it is frozen, a new memtable is created, and the frozen memtable is converted into an SSTable and written out. Merging compaction: the small SSTables produced by minor compactions are periodically merged (together with the memtable) into a single new, larger SSTable. A merging compaction that rewrites all SSTables into exactly one SSTable is a major compaction. (Figures: minor compaction and merging compaction.)
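
A self-contained toy version of both compactions is sketched below; the types are invented, everything stays in memory, and real Bigtable writes the resulting SSTable files to GFS.

    // Illustrative compaction sketch (invented types, in-memory only).
    #include <cstdio>
    #include <map>
    #include <string>
    #include <vector>

    using SSTableSketch = std::map<std::string, std::string>;  // treated as immutable once built

    // Minor compaction: freeze the memtable and turn it into a new small SSTable.
    SSTableSketch MinorCompaction(std::map<std::string, std::string>& memtable,
                                  std::vector<SSTableSketch>& sstables) {
      SSTableSketch frozen(memtable);  // frozen memtable becomes an SSTable
      sstables.push_back(frozen);
      memtable.clear();                // start a fresh memtable
      return frozen;
    }

    // Merging compaction: merge several SSTables into one. When the inputs are
    // *all* existing SSTables, this is a major compaction.
    SSTableSketch MergingCompaction(std::vector<SSTableSketch>& sstables) {
      SSTableSketch merged;
      for (const SSTableSketch& s : sstables)  // oldest to newest,
        for (const auto& kv : s)               // so newer values overwrite older ones
          merged[kv.first] = kv.second;
      sstables.assign(1, merged);
      return merged;
    }

    int main() {
      std::map<std::string, std::string> memtable = {{"a", "1"}, {"b", "2"}};
      std::vector<SSTableSketch> sstables = {{{"a", "0"}, {"c", "3"}}};
      MinorCompaction(memtable, sstables);  // memtable -> new small SSTable
      MergingCompaction(sstables);          // all SSTables -> one (major compaction)
      std::printf("%zu sstable(s) after compaction\n", sstables.size());
      return 0;
    }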

Refinements: Locality Groups Clients can group multiple column families into a locality group, and a separate SSTable is generated for each locality group in each tablet. Separating column families that are not typically read together makes reads more efficient. Additional tuning parameters (for example keeping a locality group in memory, or compressing it) can be specified per locality group. (Figure, from Jeff Dean's presentation: the Webtable row "com.cnn.www" rearranged so that the "contents:" family and the "anchor:" families are stored in separate locality groups.)
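
A tiny invented illustration of the idea: a schema that maps column families to locality groups, so that a scan over the small metadata families never has to read the bulky page contents. The struct and field names are assumptions made for this sketch, not Bigtable's schema API.

    // Invented illustration of locality groups: column families mapped to groups,
    // each of which would get its own SSTables per tablet.
    #include <cstdio>
    #include <string>
    #include <vector>

    struct LocalityGroup {
      std::string name;
      std::vector<std::string> column_families;
      bool in_memory;  // small, hot groups can be kept in memory
    };

    int main() {
      std::vector<LocalityGroup> webtable_schema = {
          {"content",  {"contents:"}, false},
          {"metadata", {"language:", "checksum:", "anchor:"}, true},
      };
      for (const LocalityGroup& g : webtable_schema)
        std::printf("group %s: %zu families, in_memory=%d\n",
                    g.name.c_str(), g.column_families.size(), (int)g.in_memory);
      return 0;
    }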

Refinements (continued) Caching for read performance: tablet servers use two levels of caching; the Scan Cache caches the key-value pairs returned by the SSTable interface, and the Block Cache caches SSTable blocks read from GFS. Commit-log implementation: what if there were a separate commit log per tablet? Very many files would be written concurrently to GFS, which would become the bottleneck, so it is better for each tablet server to append mutations for all of its tablets to a single commit log. During recovery the log entries are first sorted by key (table, row name, log sequence number) so that each tablet's mutations are contiguous and recovering servers avoid reading the full log many times over.
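
A toy illustration of that recovery-time sort is below; the LogEntry type and the sample entries are invented, but the ordering key matches the (table, row name, log sequence number) triple described above.

    // Toy illustration: sorting commit-log entries by (table, row, sequence)
    // so that one tablet's mutations become a contiguous range.
    #include <algorithm>
    #include <cstdio>
    #include <string>
    #include <tuple>
    #include <vector>

    struct LogEntry {
      std::string table, row;
      long long seq;
      std::string mutation;
    };

    int main() {
      std::vector<LogEntry> log = {
          {"webtable", "com.cnn.www", 7, "set contents:"},
          {"users",    "alice",       5, "set prefs:"},
          {"webtable", "com.cnn.www", 3, "set anchor:cnnsi.com"},
      };
      std::sort(log.begin(), log.end(), [](const LogEntry& a, const LogEntry& b) {
        return std::tie(a.table, a.row, a.seq) < std::tie(b.table, b.row, b.seq);
      });
      for (const LogEntry& e : log)
        std::printf("%s %s #%lld %s\n", e.table.c_str(), e.row.c_str(), e.seq,
                    e.mutation.c_str());
      return 0;
    }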

Refinements (continued) Speeding up tablet recovery: if the master decides to move a tablet from one tablet server to another (for example to rebalance load), the source server performs two minor compactions before unloading the tablet, so that the new server has little or no commit-log state to replay. Exploiting immutability: because SSTables are immutable, no synchronization is needed when reading SSTable data, and SSTables containing deleted data are removed by garbage collection.

Real Applications Figure (taken from the Bigtable paper): characteristics of a few tables in production use.

Real Applications (continued): Google Analytics Google Analytics is a service that helps webmasters analyze traffic patterns at their websites. To enable the service, webmasters embed a small JavaScript program in their web pages. Two of the Bigtable tables used by Google Analytics: the raw click table (~200 TB) maintains a row for each end-user session, and the summary table (~20 TB) contains various predefined summaries for each website. (Figures: a raw click table row keyed by URL with columns such as the visitor's IP address and location, e.g. Norfolk, VA; a summary table row keyed by URL with columns such as language, e.g. English, and page rank, e.g. 6.)

Real Applications (continued): Personalized Search Personalized Search is a service that records user queries and clicks across a variety of Google properties. Personalized Search stores each user's data in Bigtable: each user has a unique userID and is assigned a row named by that userID, and all user actions are stored in the table. The Personalized Search data is replicated across several Bigtable clusters to increase availability and to reduce latency due to distance from clients. (Figure: a user table row keyed by userID, with columns such as the date, e.g. 04/06/2007, and the query text, e.g. "Bloom Filter".)

Conclusion We discussed Google's Bigtable, its architecture, and some real applications that use it. Bigtable is a feasible solution for storing large amounts of structured data, and it reduces the amount of space required to store that data. A drawback is that it can be difficult for new users to adopt.

Discussion How is server expansion done, and are tablets redistributed immediately? What happens when a tablet server crashes?

References Chang, F., Dean, J., Ghemawat, S., Hsieh, W. C., Wallach, D. A., Burrows, M., Chandra, T., Fikes, A., and Gruber, R. E. Bigtable: A Distributed Storage System for Structured Data. OSDI 2006.

Questions?