BigTable and Google File System

BigTable and Google File System Presented by: Ayesha Fawad 10/07/2014

Overview Google File System Basics Design Chunks Replicas Clusters Client

Overview Google File System chunk server Master server Shadow Master Read Request Workflow Write Request Workflow Built-in Functions Limitations

Overview BigTable Introduction What is BigTable? Design example Rows and Tablets Columns and Column Families

Overview BigTable Timestamp Cells Data Structure SSTables and Logs example Tablet Table

Overview BigTable Cluster Chubby How to find a row Mutations BigTable Implementation BigTable Building Blocks Architecture

Overview BigTable Master server Tablet server Client Library In Case of Failure? Recovery Process Compactions Refinement

Overview BigTable Interactions between GFS and BigTable API Why use BigTable? Why not any other Database? Application Design CAP

Overview BigTable Google Services using BigTable BigTable Derivatives Colossus Comparison

Overview Google App Engine Introduction GAE Data store Unsupported Actions Entities Models Queries Indexes

Overview Google App Engine GQL Transactions Data store Software Stack GUI Main Data store options Competitors

Overview Google App Engine Hard Limits Free Quotas Cloud Data Storage options

Google File System Presented by: Ayesha Fawad 10/07/2014

Basics GFS originated in 2003. It is designed for system-to-system interaction, not user-to-system interaction. It runs on a network of inexpensive machines running the Linux operating system.

Design GFS relies on distributed computing to provide users the infrastructure they need to create, access and alter data. Distributed computing is all about networking several computers together and taking advantage of their individual resources in a collective way. Each computer contributes some of its resources, such as memory, processing power and hard drive space, to the overall network, turning the entire network into a massive computer in which each individual machine acts as a processor and data storage device.

Design Autonomic computing: a concept in which computers are able to diagnose problems and solve them in real time without the need for human intervention. The challenge for the GFS development team was to design an autonomic monitoring system that could work across a huge network of computers. Simplification: GFS offers basic commands like open, create, read, write and close, plus a few specialized commands like append and snapshot.

Design Checkpoints can include application-level checksums. Readers verify and process only the file region up to the last checkpoint, which is known to be in a defined state. Checkpointing allows writers to restart incrementally and keeps readers from processing successfully written file data that is still incomplete from the application's point of view. GFS relies on appends rather than overwrites.

Chunks Files on GFS tend to be very large (multi-gigabyte range). GFS handles this by breaking files up into chunks of 64 MB each, which works well for scans, streams, archives and shared queues. Each chunk has a unique 64-bit ID number called a chunk handle. Fixed-size chunks simplify resource allocation: because all file chunks are the same size, the master can check which computers are near capacity, check which computers are underused, and balance the workload by moving chunks from one resource to another.
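To make the fixed chunk size concrete, here is a minimal Python sketch of how a client library might map a byte offset in a file to a chunk index and chunk handle; the names ChunkDirectory and lookup_handle are illustrative, not part of the real GFS client API.

    CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB, the fixed GFS chunk size

    def chunk_index(byte_offset):
        """Which chunk of the file holds this byte offset."""
        return byte_offset // CHUNK_SIZE

    class ChunkDirectory:
        """Illustrative stand-in for the master's (file name, chunk index) -> handle map."""
        def __init__(self):
            self.handles = {}  # (file_name, chunk_index) -> 64-bit chunk handle

        def lookup_handle(self, file_name, byte_offset):
            return self.handles.get((file_name, chunk_index(byte_offset)))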

Replicas Two categories: Primary replica: the chunk copy that a chunk server sends to a client. Secondary replicas: serve as backups on other chunk servers. The master decides which chunks act as primary or secondary. Based on client changes to the data in the chunk, the master server informs chunk servers holding secondary replicas that they have to copy the new chunk off the primary chunk server to stay current.

Design REFERENCE: http://en.wikipedia.org/wiki/Google_File_System

Clusters Google has organized GFS into simple networks of computers called clusters. A cluster contains three kinds of entities: clients, a master server and chunk servers.

Client Clients: any entity making a request Developed by Google for its own use Clients can be other computers or computer applications

Chunk server Chunk servers: workhorses stores the 64 MB file chunks sends requested chunks directly to client replicas are configurable

Master server Master Server: is the coordinator for cluster maintains operation log keeps track of metadata information describing chunks chunk garbage collection re-replication on chunk server failures chunk migration to balance load and disk space does not store the actual chunks

Master server Upon start-up, the master server polls all the chunk servers; the chunk servers respond with information about the data they contain, location details and space details. REFERENCE: http://computer.howstuffworks.com/internet/basics/google-file-system4.htm

Shadow Master Shadow master servers contact the primary master server to stay up to date (operation log, polling chunk servers). If anything goes wrong with the primary master, a shadow master can take over. GFS ensures that shadow master servers are stored on different machines (in case of hardware failure). Shadow masters lag behind the primary master server by fractions of a second. They provide limited services in parallel with the master; those services are limited to reads.

Shadow Master REFERENCE: http://computer.howstuffworks.com/internet/basics/google-file-system4.htm

Read Request Workflow 1. Client sends a read request for a particular file to the master REFERENCE: http://computer.howstuffworks.com/internet/basics/google-file-system3.htm

Read Request Workflow 2. Master responds with the location of the primary replica, where the client can find that particular file REFERENCE: http://computer.howstuffworks.com/internet/basics/google-file-system3.htm

Read Request Workflow 3. Client contacts the chunk server directly REFERENCE: http://computer.howstuffworks.com/internet/basics/google-file-system3.htm

Read Request Workflow 4. Chunk server sends the replica data to the client REFERENCE: http://computer.howstuffworks.com/internet/basics/google-file-system3.htm

Write Request Workflow 1. Client sends the write request to the master server REFERENCE: http://computer.howstuffworks.com/internet/basics/google-file-system3.htm

Write Request Workflow 2. Master responds with the locations of the primary and secondary replicas REFERENCE: http://computer.howstuffworks.com/internet/basics/google-file-system3.htm

Write Request Workflow 3. Client sends the write data to all the replicas, regardless of primary or secondary, closest one first (pipelined) REFERENCE: http://computer.howstuffworks.com/internet/basics/google-file-system3.htm

Write Request Workflow 4. Once the data is received by the replicas, the client instructs the primary replica to begin the write; the primary assigns consecutive serial numbers to each of the file changes (mutations) REFERENCE: http://computer.howstuffworks.com/internet/basics/google-file-system3.htm

Write Request Workflow 5. After the primary applies the mutations to its own data, it sends the write requests to all the secondary replicas REFERENCE: http://computer.howstuffworks.com/internet/basics/google-file-system3.htm

Write Request Workflow 6. Secondary replicas complete the write and report back to the primary replica REFERENCE: http://computer.howstuffworks.com/internet/basics/google-file-system3.htm

Write Request Workflow 7. Primary sends confirmation to the client; if any of this fails, the master identifies the affected replica as garbage REFERENCE: http://computer.howstuffworks.com/internet/basics/google-file-system3.htm
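The write path in the slides above can be condensed into the following Python sketch; locate_replicas, push_data and apply_mutation are hypothetical helpers standing in for the real GFS RPCs, and the replicas' distance attribute is only used to model "closest first" pipelining.

    def gfs_write(master, data, file_name, offset):
        """Sketch of the GFS write control flow (helper names are illustrative)."""
        # Steps 1-2: ask the master for the primary and secondary replica locations.
        primary, secondaries = master.locate_replicas(file_name, offset)
        # Step 3: push the data to all replicas, closest one first (pipelined).
        for replica in sorted([primary] + secondaries, key=lambda r: r.distance):
            replica.push_data(data)
        # Steps 4-5: the primary assigns a serial number to the mutation, applies it,
        # and forwards the write request to all secondary replicas.
        serial = primary.apply_mutation(data, offset)
        acks = [s.apply_mutation(data, offset, serial) for s in secondaries]
        # Steps 6-7: the primary confirms to the client only if every secondary succeeded.
        return all(acks)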

Mutations REFERENCE: http://static.googleusercontent.com/media/research.google.com/en/us/archive/gfs-sosp2003.pdf

Mutations Consistent: a file region is consistent if all clients always see the same data, regardless of which replica is being read. Defined: a region is defined after a file data mutation if it is consistent and clients see what the mutation wrote in its entirety.

Built-in Functions Master and chunk replication. Streamlined recovery process. Rebalancing. Stale replica detection. Garbage removal (configurable). Checksumming: each 64 MB chunk is broken into blocks of 64 KB, and each block has its own 32-bit checksum; chunk servers verify checksums before returning data, which prevents data corruption from propagating.
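The checksumming scheme can be sketched in a few lines of Python; CRC-32 from the standard library is used here only as a stand-in for whatever 32-bit checksum GFS actually computes.

    import zlib

    BLOCK_SIZE = 64 * 1024  # 64 KB blocks within a 64 MB chunk

    def block_checksums(chunk_bytes):
        """Return a 32-bit checksum for each 64 KB block of a chunk."""
        return [zlib.crc32(chunk_bytes[i:i + BLOCK_SIZE])
                for i in range(0, len(chunk_bytes), BLOCK_SIZE)]

    def chunk_is_clean(chunk_bytes, stored_checksums):
        """Detect corruption by comparing fresh checksums against the stored ones."""
        return block_checksums(chunk_bytes) == stored_checksums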

Limitations Suited for batch-oriented applications that prefer high sustained bandwidth over low latency, e.g. web crawling. A single point of failure is unacceptable for latency-sensitive applications such as Gmail or YouTube. The single master is a scanning bottleneck. Consistency problems.

BigTable Presented by: Ayesha Fawad 10/07/2014

Introduction Created by Google in 2005. Maintained as a proprietary, in-house technology. Some technical details were disclosed at a USENIX symposium in 2006. It has been used by Google services since 2005.

What is BigTable? It is a distributed storage system: it can be spread across multiple nodes yet appears to be one large table. It is not a database design; it is a storage design model. REFERENCE: http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/us/archive/bigtable-osdi06.pdf

What is BigTable? Map BigTable is a collection of (key, value) pairs The key identifies a row and the value is the set of columns

What is BigTable? Sparse different rows in the table may use different columns, with many of the columns empty for a particular row

What is BigTable? Column-oriented it can operate on a set of attributes (columns) for all tuples; it stores each column contiguously on disk, allowing more records per disk block and reducing disk I/O. The underlying assumption is that in most cases not all columns are needed for data access. In an RDBMS implementation, each row is usually stored contiguously on disk.

Example webpages REFERENCE: http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/us/archive/bigtable-osdi06.pdf

Example webpages
    { "com.cnn.www" => { "contents" => "html...",
                         "anchor"   => { "cnnsi.com"  => "CNN",
                                         "my.look.ca" => "CNN.com" } } }

What is BigTable? It is semi-structured: a map (key-value pairs); different rows in the same table can have different columns; the key is a string, so keys are not required to be sequential, unlike array indexes

What is BigTable? Lexicographically sorted data is sorted by keys; structure keys in a way that sorting brings related data together, e.g. edu.villanova.cs, edu.villanova.law, edu.villanova.www REFERENCE: http://en.wikipedia.org/wiki/Lexicographical_order
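A tiny, BigTable-independent illustration of why reversed host names make good row keys: lexicographic sorting then clusters all pages of one domain together.

    def row_key(host):
        """Reverse the host name so pages of one domain sort next to each other."""
        return ".".join(reversed(host.split(".")))

    hosts = ["www.villanova.edu", "cs.villanova.edu", "law.villanova.edu", "www.cnn.com"]
    print(sorted(row_key(h) for h in hosts))
    # ['com.cnn.www', 'edu.villanova.cs', 'edu.villanova.law', 'edu.villanova.www']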

What is BigTable? Persistent when a certain amount of data is collected in memory, BigTable makes it persistent by storing the data in Google File System

What is BigTable? Multi-dimensional data is indexed by row key, column name and timestamp; it is like a table with many rows (keys) and many columns, with timestamps, and it acts like a map. For example: URLs are the row keys; metadata of web pages are the column names; the contents of a web page are a column; timestamps record when the page was fetched.

Design Data is indexed by row key, column name and timestamp: (row:string, column:string, time:int64). Each value in the map is an uninterpreted array of bytes. BigTable offers clients some control over data layout and format; a careful choice of schema can control the locality of data. The client decides how to serialize the data.
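As a mental model only (not how BigTable actually stores data), the (row, column, timestamp) -> value indexing can be sketched as a plain Python dictionary:

    table = {}  # (row: str, column: str, timestamp: int) -> bytes

    def put(row, column, timestamp, value):
        table[(row, column, timestamp)] = value

    def get_latest(row, column):
        """Return the most recent version of a cell, or None if the cell is empty."""
        versions = [(ts, v) for (r, c, ts), v in table.items()
                    if r == row and c == column]
        return max(versions)[1] if versions else None

    put("com.cnn.www", "contents:", 3, b"<html>...")
    put("com.cnn.www", "anchor:cnnsi.com", 9, b"CNN")
    print(get_latest("com.cnn.www", "contents:"))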

Row Row keys are up to 64 KB. The row range for a table is dynamically partitioned; each row range is called a tablet, the unit of distribution and load balancing. Clients can select row keys to get better locality of data accesses; reads of short row ranges are efficient and typically require communication with only a small number of machines.

Row every read or write of data using a single row key is atomic, regardless of the number of different columns being read or written in the row; there is no such guarantee across rows. BigTable supports single-row transactions to perform atomic read-modify-write sequences on data stored under a single row key; it does not support general transactions across row keys.

Row with Example ROW

Column and Column Families Column keys are grouped together into sets called column families (family:qualifier). Data stored in the same column family usually has the same data type. BigTable indexes and compresses data within a column family. The number of distinct column families is small, e.g. the language used on a web page.

Column with Example COLUMNS

Column Families with Example COLUMN FAMILY family: qualifier

Timestamp 64-bit integers. Multiple timestamps can exist in each cell to record various versions of the data (created, modified). The most recent version is accessible first. Clients can choose garbage-collection options or ask for specific timestamps. Timestamps are assigned by BigTable (in microseconds) or by the client application.

Timestamp with Example TIMESTAMPS

Cells CELLS

Mutations First, mutations are logged in a log file, which is stored in GFS. Then the mutations are applied to an in-memory version called the memtable.

Mutations REFERENCE: http://www.cse.buffalo.edu/~mpetropo/CSE736-SP10/slides/seminar100409b1.pdf

Data Structure BigTable stores its data in GFS using two data structures: logs and sorted string tables (SSTables). The data structure is defined using protocol buffers (a data description language), used to avoid the inefficiency of converting data from one format to another, e.g. between data formats in Java and .NET.

SSTables and Logs In memory, BigTable provides mutable key-value storage. Once the log or in-memory table reaches a certain limit, the changes are made persistent, and immutable, through GFS: all transactions in memory are saved in GFS as segments called logs; after the changes reach a certain size (the amount you want to keep in memory), they are cleaned; after cleaning, the data is compacted into a series of SSTables and sent out as chunks to GFS.

SSTables and Logs An SSTable provides a persistent, immutable, ordered map from keys to values. A sequence of blocks forms an SSTable. Each SSTable stores one block index; when the SSTable is opened, the index is loaded into memory and specifies block locations.
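A minimal sketch of that read path: the block index is kept in memory, a lookup binary-searches the index, and at most one 64 KB block is consulted. The class and field names below are illustrative, not the real SSTable implementation.

    import bisect

    class SSTableReader:
        """Illustrative reader: an in-memory block index over immutable sorted blocks."""
        def __init__(self, first_keys, blocks):
            # first_keys[i] is the smallest key in blocks[i]; both lists are sorted.
            self.first_keys = first_keys
            self.blocks = blocks  # each block: a dict of key -> value for that key range

        def get(self, key):
            # Binary search the index for the single block that may hold the key.
            i = bisect.bisect_right(self.first_keys, key) - 1
            if i < 0:
                return None
            return self.blocks[i].get(key)  # one block read per lookup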

SSTables and Logs (diagram: an SSTable consists of a block index plus a sequence of 64 KB blocks) REFERENCE: http://www.cse.buffalo.edu/~mpetropo/CSE736-SP10/slides/seminar100409b1.pdf

Tablet Tablets are a range of rows of a table. A tablet contains multiple SSTables and is assigned to a tablet server. (Diagram: a tablet covering row keys from 'aardvark' to 'apple', built from two SSTables, each a sequence of 64 KB blocks plus an index.) REFERENCE: http://www.cse.buffalo.edu/~mpetropo/CSE736-SP10/slides/seminar100409b1.pdf

Table Multiple tablets form a table. SSTables can overlap, but tablets do not overlap. (Diagram: two adjacent tablets, 'aardvark'-'apple' and 'apple'-'boat', each built from SSTables.) REFERENCE: http://www.cse.buffalo.edu/~mpetropo/CSE736-SP10/slides/seminar100409b1.pdf

Cluster A BigTable cluster stores tables; each table consists of tablets. Initially a table contains one tablet; as the table grows, multiple tablets are created. Tablets are assigned to tablet servers; each tablet exists at only one server, and a server contains multiple tablets. Each tablet is 100-200 MB.

How to find a Row? REFERENCE: http://www.cse.buffalo.edu/~mpetropo/CSE736-SP10/slides/seminar100409b1.pdf

How to find a Row? The client reads the location of the root tablet from the Chubby file. The root tablet contains the locations of the METADATA tablets; the root tablet never splits. The METADATA tablets contain the locations of the user tablets.
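The three-level lookup can be summarized in a short sketch; chubby_file and the tablet objects are placeholders for the real structures.

    def locate_user_tablet(chubby_file, table_name, row_key):
        """Sketch of the location hierarchy: Chubby -> root tablet -> METADATA -> user tablet."""
        root_tablet = chubby_file.read_root_location()             # level 1: Chubby file
        metadata_tablet = root_tablet.lookup(table_name, row_key)  # level 2: root tablet (never splits)
        return metadata_tablet.lookup(table_name, row_key)         # level 3: METADATA row -> tablet location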

BigTable Architecture REFERENCE: http://www.crunchertronics.com/high-scalability-architecture-cloud-computing/

BigTable Implementation BigTable has 3 components: Master server. Tablet servers: dynamically added or removed to handle the workload. Client library: linked into every client; it connects clients with the master server and the many tablet servers.

BigTable Implementation REFERENCE: http://www.cse.buffalo.edu/~mpetropo/CSE736-SP10/slides/seminar100409b1.pdf

BigTable Building Blocks Google File System: stores persistent state. Scheduler: schedules jobs involved in serving BigTable. Lock service (Chubby): master election, location bootstrapping. MapReduce: often used to read/write BigTable data.

Chubby Distributed lock service: its namespace consists of directories and files, which are used as locks and provide mutual exclusion. Highly available: one (elected) master and 5 active replicas, with Paxos used to maintain consistency across the replicas. Reads and writes are atomic.

Chubby Responsible for: ensure there is only one active master store the bootstrap location of BigTable data discover tablet servers store BigTable schema information store access control lists REFERENCE: https://www.cs.rutgers.edu/~pxk/417/notes/content/bigtable.html

Chubby Client Library Responsible for providing consistent caching of Chubby files. Each Chubby client maintains a session with the Chubby service. Every client session has a lease expiration time; if the client is unable to renew its session lease within that time, the session expires and all locks and open handles are lost.

Master Server On start-up: acquires a unique master lock in Chubby; discovers live tablet servers in Chubby; discovers existing tablet assignments; scans the METADATA table to learn the set of tablets. Responsible for: adding or deleting tablet servers based on demand; assigning tablets to tablet servers; monitoring and balancing tablet-server load; garbage collection of files in GFS; checking each tablet server for the status of its lock. In case of failure: if its session with Chubby is lost, the master kills itself and an election can take place to find a new master.

Tablet Server On start-up: acquires an exclusive lock on a uniquely named file in a specific Chubby directory. Responsible for: managing tablets; splitting tablets that grow beyond a certain size. For reads and writes, the client communicates directly with the tablet server. In case of failure: if it loses its exclusive lock, the tablet server stops serving; if the file still exists, it will attempt to reacquire the lock; if the file no longer exists, the tablet server kills itself, restarts and joins the pool of unassigned tablet servers.

Tablet Server Failure REFERENCE: http://www.crunchertronics.com/high-scalability-architecture-cloud-computing/

Tablet Server Recovery Process Read the metadata containing the tablet's SSTables and redo points: the METADATA table contains the list of SSTables that comprise a tablet and a set of redo points, which are pointers into the commit logs. Apply the redo points to reconstruct the memtable from the updates in the commit log.
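A sketch of that reconstruction step, assuming hypothetical read_metadata and read_commit_log helpers on a GFS client object:

    def recover_tablet(tablet_id, gfs):
        """Rebuild a tablet's memtable from its SSTable list and commit-log redo points."""
        meta = gfs.read_metadata(tablet_id)  # lists the tablet's SSTables and redo points
        memtable = {}
        for log_file, redo_point in meta.redo_points:
            # Replay only the log entries written after the redo point.
            for key, value in gfs.read_commit_log(log_file, start=redo_point):
                memtable[key] = value
        return meta.sstables, memtable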

Tablet Server Recovery Process Read and write requests at the tablet server are checked to make sure they are well formed, and the permission file in Chubby is checked to ensure authorization. In the case of a write operation, all mutations are written to the commit log and finally a group commit is used. In the case of a read operation, the read is executed on a merged view of the sequence of SSTables and the memtable.

Compactions When the in-memory table is full: Minor compaction converts the memtable into an SSTable, which reduces memory usage and reduces log traffic on restart. Merging compaction reduces the number of SSTables and is a good place to apply a "keep only N versions" policy. Major compaction is a merging compaction that results in only one SSTable, containing no deletion records, only live data.
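A minimal sketch of the minor-compaction trigger: when the memtable crosses a size threshold it is frozen, written out as a new immutable SSTable, and replaced by a fresh memtable. The threshold value and names below are illustrative.

    MEMTABLE_LIMIT = 64 * 1024 * 1024  # illustrative threshold, not Bigtable's actual value

    class TabletWriter:
        def __init__(self, gfs):
            self.gfs = gfs
            self.memtable = {}
            self.memtable_bytes = 0
            self.sstables = []

        def write(self, key, value):
            self.memtable[key] = value
            self.memtable_bytes += len(key) + len(value)
            if self.memtable_bytes >= MEMTABLE_LIMIT:
                self.minor_compaction()

        def minor_compaction(self):
            """Freeze the memtable and persist it as a sorted, immutable SSTable in GFS."""
            frozen = sorted(self.memtable.items())
            self.sstables.append(self.gfs.write_sstable(frozen))
            self.memtable, self.memtable_bytes = {}, 0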

Refinement Locality groups: clients can group multiple column families together into a locality group. Compression: applied to each SSTable block separately; uses Bentley and McIlroy's scheme plus a fast compression algorithm. Caching for read performance: uses a Scan Cache and a Block Cache. Bloom filters: reduce the number of disk accesses.

Refinement Commit-log implementation: rather than one log per tablet, keep one commit log per tablet server. Exploiting SSTable immutability: no need to synchronize accesses to the file system when reading SSTables; concurrency control over rows is efficient; deletes work like garbage collection, removing obsolete SSTables; enables quick tablet splits, since parent SSTables are shared by the children.

Interactions between GFS and BigTable The persistent state of a collection of rows (a tablet) is stored in GFS. Writes: incoming writes are recorded in memory in memtables; they are sorted and buffered in memory; after they reach a certain size, they are stored as a sequence of SSTables (persistent storage in GFS).

Interactions between GFS and BigTable Reads: the requested information can be in memtables or SSTables; stale information must be avoided, but all tables are sorted, so it is easy to find the most recent version. Recovery: to recover a tablet, the tablet server reconstructs the memtable by reading the tablet's metadata and redo points.

API BigTable APIs provide functions for: Creating/deleting tables, column families Changing cluster, table and column family metadata such as access control rights Client applications can: write or delete values lookup values from individual rows iterate over a subset of data Support of single row transactions Allowing cells to be used as integer counters Executing client supplied scripts in the address space of servers

Why use BigTable? The scale is large: more than 100 TB of satellite image data; millions of users and thousands of queries per second, with latency to manage; billions of URLs and billions and billions of pages, each page with many versions.

Why not any other Database? An in-house solution is always cheaper. The scale is too large for most databases, and the cost is too high. The same system can be used across different projects, which again lowers the cost. With relational databases we expect ACID transactions, but it is impossible to guarantee consistency while also providing high availability and network partition tolerance.

CAP REFERENCE: http://stackoverflow.com/questions/7339374/nosql-what-does-it-mean-for-mongodb-or-bigtable-to-not-always-be-available

Application Design Reminders The timestamp is an int64, so the application needs to plan for multiple clients updating the same cell at the same time. At the application level, you need to know the data structures supported by GFS, to avoid conversion.

Google Services using BigTable Used as a database by: Google Analytics, Google Earth, Google App Engine Datastore, Google Personalized Search

BigTable Derivatives Apache HBase, a database built to run on top of the Hadoop Distributed File System (HDFS). Cassandra, which originated at Facebook Inc. Hypertable, an open-source alternative to HBase.

Colossus GFS is better suited for batch operations; Colossus is a revamped file system suited for real-time operations. Colossus supports a new search infrastructure called 'Caffeine', which enables Google to update its search index in real time. In Colossus there are many masters operating at the same time. A number of changes have already been made to open-source Hadoop to make it look more like Colossus.

Comparison (series of slides showing feature-comparison tables for DynamoDB, BigTable and MongoDB) REFERENCE: http://vschart.com/compare/dynamo-db/vs/bigtable/vs/mongodb

Google App Engine Presented by: Ayesha Fawad 10/07/2014

Introduction Also known as GAE or App Engine. The preview started in April 2008; it came out of preview in September 2011. It is a PaaS (platform as a service) that allows developing and hosting web applications in Google-managed data centers. The default choice for storage is a NoSQL solution.

Introduction Language support: plans to support more languages. Automatic scaling: GAE automatically allocates more resources to handle additional demand. It is free up to a certain level of resources (storage, bandwidth, or instance hours) required by the application. It does not allow joins.

Introduction Applications are sandboxed across multiple servers: a security mechanism for executing or running untested code, with restricted resources for the safety of the host system. Reliable: a Service Level Agreement of 99.5% uptime; can sustain multiple data center failures.

GAE Data store It is built on top of BigTable. It follows a hierarchical structure. It is a schema-less object data store, designed to scale for high performance. Queries are served from pre-built indexes. Entities of the same kind are not required to have the same set of properties.

Does Not Support Join operations Inequality filtering on multiple properties Filtering data based on results of sub query

Entities Also known as objects in the App Engine Data store. Each entity is uniquely identified by its own key. An entity path begins with a root entity and proceeds from parent to child. Every entity belongs to an entity group.

Models Model is the superclass for data model definitions, defined in google.appengine.ext.db. Entities of a given kind are represented by instances of the corresponding model class.
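For example, a model class written against the google.appengine.ext.db API might look like the sketch below; the Greeting kind and its properties are illustrative, not part of the presentation.

    from google.appengine.ext import db

    class Greeting(db.Model):
        """Each instance of this class maps to one datastore entity of kind 'Greeting'."""
        author = db.StringProperty()
        content = db.TextProperty(required=True)
        date = db.DateTimeProperty(auto_now_add=True)

    # Creating and storing an entity:
    g = Greeting(author="ayesha", content="hello datastore")
    g.put()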

Queries A Data store query retrieves entities from the data store that meet a specified set of conditions, operating on entity values and keys. The Data store API provides a Query class for constructing queries and a PreparedQuery class for fetching and returning entities from the data store. Filters and sort orders can be applied to queries.

Indexes An index is defined on a list of properties of an entity kind. An index table contains a column for every property specified in the index's definition. The Data store identifies the index that corresponds to the query's kind, filter properties, filter operators and sort orders. App Engine predefines an index on each property of each kind; these indexes are sufficient for simple queries.

GQL GQL is a SQL-like language for retrieving entities or keys from the App Engine Data store.
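Continuing the illustrative Greeting model from the Models slide, a GQL query could be issued like this:

    from google.appengine.ext import db

    # GQL looks like SQL, but it only ever returns entities (or keys) of a single kind.
    q = db.GqlQuery("SELECT * FROM Greeting WHERE author = :1 ORDER BY date DESC", "ayesha")
    for greeting in q.fetch(10):
        print(greeting.content)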

Transactions A transaction is a set of Data store operations on one or more entities. It is atomic, meaning transactions are never partially applied, and it provides isolation and consistency. Transactions are required when users may attempt to create or update an entity with the same string ID at the same time. It is also possible to queue transactions.
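A sketch of a datastore transaction using the db API and the illustrative Greeting model from earlier; db.run_in_transaction runs the function atomically and retries it on contention.

    from google.appengine.ext import db

    def rename_author(key, new_author):
        """All reads and writes inside this function happen in one atomic transaction."""
        greeting = db.get(key)
        greeting.author = new_author
        greeting.put()

    key = Greeting(author="old-name", content="hi").put()  # put() returns the entity's key
    db.run_in_transaction(rename_author, key, "new-name")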

Data store Software Stack

Data store Software Stack App Engine Data store: schema-less storage, advanced query engine. Megastore: multi-row transactions, simple indexes/queries, strict schema. BigTable: distributed key/value store. Next-generation distributed file system.

GUI https://appengine.google.com Everything done through console can also be done through Command Line (appcfg)

GUI Main Data Administration Billing

GUI (Main) Dashboard you can see all metrics related to your application. versions resources and usage much more

GUI (Main) Instances total number of instances availability (e.g. dynamic) average latency average memory much more

GUI (Main) Logs detailed information helps resolving any issue much more

GUI (Main) Versions number of versions default setting deployment information delete a specific version much more

GUI (Main) Backends it is like a worker role: a piece of business logic that does not have a user interface; and much more

GUI (Main) Cron Jobs time-based jobs; can be defined in an xml or yaml file; and much more

GUI (Main) Task Queues can create multiple task queues; the first one is the default automatically; can be defined in an xml or yaml file; and much more

GUI (Main) Quota Details detailed metrics of resources being used For e.g. storage, memcache, mail etc shows daily quota shows rate details of what client is billed for much more

Data store Options High-Replication uses Paxos algorithm multi master read and write provides highest level of availability (99.999% SLA) certain queries will be Eventually Consistent some latency due to multi master writing reads are from the fastest source (local) Reads are transactional

Data store Options Master/Slave offers strong Consistency over availability, for all reads and queries data is written to a single master data center, then replicated asynchronously to other (slave) data centers 99.9% SLA reads from master only

Competitors App Engine offers better infrastructure to host applications in terms of administration and scalability Other hosting services offer better flexibility for applications in terms of languages and configuration

Hard Limits

Free Quotas

Cloud Data Storage Options

References
BigTable:
http://en.wikipedia.org/wiki/BigTable
https://www.cs.rutgers.edu/~pxk/417/notes/content/bigtable.html
http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/us/archive/bigtable-osdi06.pdf
Google File System:
http://en.wikipedia.org/wiki/Google_File_System
http://static.googleusercontent.com/media/research.google.com/en/us/archive/gfs-sosp2003.pdf