Jason Baker, Chris Bond, James C. Corbett, JJ Furman, Andrey Khorlin, James Larson, Jean-Michel Léon, Yawei Li, Alexander Lloyd, Vadim Yushprakh: Megastore.

Presentation transcript:

Megastore: Scalable, Highly Available Storage for Interactive Services
Jason Baker, Chris Bond, James C. Corbett, JJ Furman, Andrey Khorlin, James Larson, Jean-Michel Léon, Yawei Li, Alexander Lloyd, Vadim Yushprakh
Presented by: Hamid Seyedmoradi, Ayoub Hamidi, Ehsan Mohamad Nezamian
Advanced Database Systems, SRBIAU, Kurdistan Campus, 10 May 2012

Megastore
• Motivation
• Introduction
• NoSQL & RDBMS
• Megastore
• Paxos

Megastore
Handles more than 3 billion write and 20 billion read transactions daily.
• Key contributions
  ◦ Data model and storage system
  ◦ Paxos replication
  ◦ Report on experience

AVAILABILITY & SCALE
• Replication: for availability, we implemented a synchronous, fault-tolerant log replicator optimized for long-distance links.
• Partitioning and locality: for scale, we partitioned data into a vast space of small databases.

AVAILABILITY & SCALE
• Replication
• Strategies
  ◦ Asynchronous Master/Slave
  ◦ Synchronous Master/Slave
  ◦ Optimistic Replication
We decided to use Paxos.

Technology Options

Technology Options (continued)

AVAILABILITY & SCALE
• Partitioning and Locality
• Replication
Figure labels: Datacenters, Entity Groups
• Entity groups partition the datastore
• Each entity group is synchronously replicated across datacenters
• ACID semantics within an entity group
• Looser consistency across entity groups
• Entity group data and replication metadata are stored in scalable NoSQL datastores

AVAILABILITY & SCALE
• Partitioning and Locality
• Operations:
Figure labels: Entities (units of data), Entity Group 1, Entity Group 2, Local Index, Send, Queue, Receive
• Most transactions are within a single entity group
• Cross-entity-group transactions are supported via two-phase commit
• Asynchronous communication between entity groups is supported by queues
• Global indexes span entity groups but have weaker consistency
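
The split between local transactions and queue-based messaging can be illustrated with a small in-memory sketch. This is only a toy under stated assumptions: the class `EntityGroup` and its methods `transact` and `drain_inbox` are hypothetical names invented for illustration, not Megastore's API, and "atomic" here just means the updates happen in one Python call.

```python
from collections import deque


class EntityGroup:
    """Toy entity group: a small database with its own inbox queue (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.entities = {}    # entity key -> value
        self.inbox = deque()  # messages sent by other entity groups

    def transact(self, updates, outgoing=()):
        """Apply local updates together and enqueue messages for other groups."""
        self.entities.update(updates)
        for target, message in outgoing:
            target.inbox.append((self.name, message))

    def drain_inbox(self):
        """Process queued messages later, as a separate local step."""
        while self.inbox:
            sender, message = self.inbox.popleft()
            self.entities[f"msg_from_{sender}"] = message


alice = EntityGroup("alice")
bob = EntityGroup("bob")

# One local transaction in alice's group that also sends a message to bob's group.
alice.transact({"outbox_count": 1}, outgoing=[(bob, "hello")])

# bob's group has not processed the message yet: cross-group state lags behind.
pending = len(bob.inbox)
bob.drain_inbox()
```

The gap between sending and draining is the "looser consistency across entity groups" in miniature: the sender's state is updated immediately, the receiver's only when it processes its queue.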

AVAILABILITY & SCALE
• Partitioning and Locality
• Entity Groups
• Selecting Entity Group Boundaries
• Example: Blogs
• Physical Layout

Megastore
• API Design Philosophy
• Data Model
• Pre-Joining with Keys
• SCATTER
• Indexes
• STORING Clause
• Repeated Indexes
• Inline Indexes
• Mapping to Bigtable
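
As a rough illustration of pre-joining with keys, the sketch below stores child entities under keys that begin with their root entity's key, so that sorting by key places an entity group's rows next to each other. The `User` and `Photo` kinds, the ids, and the helpers `row_key` and `scan_entity_group` are illustrative assumptions; a plain Python dict stands in for the sorted key-value store.

```python
def row_key(*parts):
    """Build a composite key; Python tuples compare lexicographically."""
    return tuple(parts)


# Hypothetical rows: a root User is keyed by (user_id),
# a child Photo by (user_id, photo_id).
rows = {
    row_key(101): {"kind": "User", "name": "John"},
    row_key(102): {"kind": "User", "name": "Mary"},
    row_key(101, 500): {"kind": "Photo", "tag": "dinner"},
    row_key(101, 502): {"kind": "Photo", "tag": "paris"},
    row_key(102, 600): {"kind": "Photo", "tag": "hiking"},
}


def scan_entity_group(rows, root_id):
    """Return the root entity and its children in key order."""
    return [(key, value) for key, value in sorted(rows.items()) if key[0] == root_id]


group_101 = scan_entity_group(rows, 101)
group_101_keys = [key for key, _ in group_101]
```

Because the root key is a prefix of every child key, the whole group comes back as one contiguous key range, which is the effect a join would otherwise have to produce at read time.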

Megastore

Megastore
• Transactions and Concurrency Control
• Read
  ◦ current
  ◦ snapshot
  ◦ inconsistent
• Transaction Lifecycle
  1. Read
  2. Application logic
  3. Commit
  4. Apply
  5. Clean up
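
The five lifecycle steps can be sketched against a single in-memory log. This is a simplified, single-process illustration and not Megastore's implementation: `EntityGroupLog`, `ConflictError`, and `increment` are hypothetical names, and the conflict check assumes that a commit succeeds only if nobody else has appended to the log since the read.

```python
class ConflictError(Exception):
    """Raised when another writer already claimed the next log position."""


class EntityGroupLog:
    """Toy write-ahead log for one entity group (hypothetical)."""

    def __init__(self):
        self.log = []     # committed mutations, one entry per transaction
        self.applied = 0  # number of log entries already applied to `data`
        self.data = {}

    def read(self):
        """Step 1: return current data and the position of the last commit."""
        self.apply()
        return dict(self.data), len(self.log)

    def commit(self, read_position, mutations):
        """Step 3: append only if no one has committed since our read."""
        if len(self.log) != read_position:
            raise ConflictError("log advanced since read; retry the transaction")
        self.log.append(mutations)

    def apply(self):
        """Step 4: apply committed-but-unapplied log entries to the data."""
        while self.applied < len(self.log):
            self.data.update(self.log[self.applied])
            self.applied += 1


def increment(group, key):
    snapshot, position = group.read()            # 1. Read
    mutations = {key: snapshot.get(key, 0) + 1}  # 2. Application logic
    group.commit(position, mutations)            # 3. Commit
    group.apply()                                # 4. Apply
    # 5. Clean up: nothing to release in this toy.


group = EntityGroupLog()
increment(group, "counter")

_, stale_position = group.read()  # a slow writer reads here...
increment(group, "counter")       # ...and another writer commits first

try:
    group.commit(stale_position, {"counter": 100})
    conflicted = False
except ConflictError:
    conflicted = True
```

The slow writer loses because its read position is stale, so it has to re-read and retry rather than overwrite the newer value.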

Megastore
• Queues
• Two-Phase Commit

REPLICATION
• Brief Summary of Paxos
• Megastore’s Approach
• Fast Reads
• Fast Writes
• Replica Types
• Witness Replica
• Architecture

Architecture

Data Structures and Algorithms
• Replicated Logs

Data Structures and Algorithms
• Reads
  1. Query Local
  2. Find Position
     ◦ Local read
     ◦ Majority read
  3. Catchup
  4. Validate
  5. Query Data
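
The read steps listed above can be walked through with a toy simulation. It is a sketch under several assumptions: the `Replica` class, its `up_to_date` flag (standing in for what the local coordinator would report), and `current_read` are hypothetical names, replicas are plain objects in one process, and catchup always runs only on the majority-read path.

```python
class Replica:
    """Toy replica of one entity group's log (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.log = {}           # log position -> committed value
        self.up_to_date = True  # what the local coordinator would report

    def highest_position(self):
        return max(self.log, default=0)


def current_read(local, replicas):
    """Return (position, value) for a current read served by `local`."""
    # 1. Query Local: ask whether the local replica is up to date.
    if local.up_to_date:
        # 2a. Find Position (local read): trust the local log.
        position = local.highest_position()
    else:
        # 2b. Find Position (majority read): take the highest position
        #     reported by a majority of replicas.
        majority = replicas[: len(replicas) // 2 + 1]
        position = max(r.highest_position() for r in majority)
        # 3. Catchup: copy missing log entries from replicas that have them.
        for pos in range(1, position + 1):
            if pos not in local.log:
                donor = next(r for r in replicas if pos in r.log)
                local.log[pos] = donor.log[pos]
        # 4. Validate: mark the local replica as up to date again.
        local.up_to_date = True
    # 5. Query Data: read at the chosen position from the local replica.
    return position, local.log.get(position)


a, b, c = Replica("a"), Replica("b"), Replica("c")
for replica in (a, b, c):
    replica.log[1] = "v1"
# A second write reached only a and b, so c's coordinator was invalidated.
a.log[2] = "v2"
b.log[2] = "v2"
c.up_to_date = False

result = current_read(c, [a, b, c])
```

After the call, replica `c` holds the missing entry and is marked up to date, so a following read on `c` takes the local path.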

Data Structures and Algorithms

Data Structures and Algorithms
• Writes
  1. Accept Leader
  2. Prepare
  3. Accept
  4. Invalidate
  5. Apply
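
The write steps above can be sketched for a single log position. This is a heavily simplified, single-process illustration and not a production Paxos: the `Replica` class, the `reachable` flag used to simulate failures, and the `write` function are all assumptions made for the example, and retries and backoff are left out.

```python
class Replica:
    """Toy acceptor for one log position (hypothetical)."""

    def __init__(self, name, reachable=True):
        self.name = name
        self.reachable = reachable  # simulates network or machine failure
        self.promised = -1          # highest proposal number seen
        self.accepted = None        # (proposal number, value) or None
        self.up_to_date = True      # coordinator's view of this replica
        self.applied = None

    def prepare(self, n):
        """Promise to ignore proposals below n; report any accepted value."""
        if self.reachable and n > self.promised:
            self.promised = n
            return True, self.accepted
        return False, None

    def accept(self, n, value):
        if self.reachable and n >= self.promised:
            self.promised = n
            self.accepted = (n, value)
            return True
        return False


def write(leader, replicas, value):
    """Return the value chosen for this log position, or None on failure."""
    majority = len(replicas) // 2 + 1
    n = 0
    # 1. Accept Leader: fast path, ask the leader to accept proposal number 0.
    acceptors = [leader] if leader.accept(n, value) else []
    if not acceptors:
        # 2. Prepare: fall back to a prepare round with a higher number.
        n = max(max(r.promised for r in replicas), 0) + 1
        replies = [r.prepare(n) for r in replicas]
        promises = [accepted for ok, accepted in replies if ok]
        if len(promises) < majority:
            return None
        earlier = [a for a in promises if a is not None]
        if earlier:
            # Paxos rule: adopt the highest-numbered value already accepted.
            value = max(earlier, key=lambda a: a[0])[1]
    # 3. Accept: ask the remaining replicas to accept the value.
    for r in replicas:
        if r not in acceptors and r.accept(n, value):
            acceptors.append(r)
    if len(acceptors) < majority:
        return None
    # 4. Invalidate: replicas that did not accept are marked out of date.
    for r in replicas:
        if r not in acceptors:
            r.up_to_date = False
    # 5. Apply: apply the value at the replicas that accepted it.
    for r in acceptors:
        r.applied = value
    return value


a, b, c = Replica("a"), Replica("b"), Replica("c", reachable=False)
chosen = write(a, [a, b, c], "v1")
```

With `c` unreachable, the write still succeeds on a majority, and `c` is flagged so that later reads there cannot take the local path until it catches up.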

Feedback

END
With thanks. Questions?