Dynamo: Amazon’s Highly Available Key-Value Store
Presented by Roni Hyam and Ami Desai

Introduction
Dynamo is a highly available and scalable key-value storage system used by a number of core services in Amazon’s e-commerce platform to provide an “always-on” experience.
Amazon’s platform is highly decentralized, so it needs storage systems that are always available.
Dynamo provides a simple primary-key-only interface to meet the requirements of these applications.

Introduction (Cont’d)
Dynamo is an internal technology that lets its users trade off cost, consistency, durability and performance while maintaining high availability.
Amazon has also developed a simple storage service, S3, to meet reliability and scaling needs.

System Assumptions and Requirements
Query Model: simple read and write operations on a data item that is uniquely identified by a key.
ACID Properties (Atomicity, Consistency, Isolation, Durability): Dynamo targets applications that accept weaker consistency in exchange for high availability; it provides no isolation guarantees and permits only single-key updates.
Efficiency: strict latency requirements, generally measured at the 99.9th percentile of the distribution.
Other Assumptions: the operating environment is assumed to be non-hostile, so there are no security-related requirements such as authentication and authorization.

Service Level Agreements (SLA)
For an application to deliver its functionality in a bounded time, every dependency in the platform needs to deliver its functionality with even tighter bounds.
Example: a service guaranteeing that it will provide a response within 300 ms for 99.9% of its requests for a peak client load of 500 requests per second.
[Figure: Service-oriented architecture of Amazon’s platform]
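
The SLA example above can be checked mechanically. The following is a minimal sketch, not Amazon's tooling: the function name `meets_sla` and the sample latencies are made up for illustration, and only the 300 ms bound and the 99.9% fraction come from the slide.

```python
def meets_sla(latencies_ms, bound_ms=300.0, fraction=0.999):
    """Return True if at least `fraction` of the requests finished within `bound_ms`."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    within = sum(1 for x in latencies_ms if x <= bound_ms)
    return within >= fraction * len(latencies_ms)


# Hypothetical samples: 10,000 requests, most fast, a few very slow.
mostly_fast = [20.0] * 9995 + [900.0] * 5    # 99.95% within the bound
long_tail = [20.0] * 9980 + [900.0] * 20     # only 99.80% within the bound

print(meets_sla(mostly_fast))
print(meets_sla(long_tail))
```

Both sample sets have a low average latency, yet only the first one satisfies the SLA. This is why the requirement is stated at a high percentile rather than as a mean.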

Design Considerations
Traditional data replication algorithms provide a strongly consistent data access interface, but at the cost of availability under failures; Dynamo instead accepts eventual consistency.
The first important design consideration is when to resolve conflicts: during reads or during writes.
Dynamo targets an “always writeable” data store where no updates are rejected due to failures or concurrent writes, so conflicts are resolved during reads.
The second design consideration is who performs conflict resolution: it can be done by the data store or by the application.
Dynamo is built for an infrastructure within a single administrative domain where all nodes are assumed to be trusted.

Design Considerations (Cont’d)
Incremental Scalability: it must be possible to add nodes on demand with minimal impact.
Symmetry: every node in Dynamo should have the same set of responsibilities as its peers.
Decentralization: in the past, centralized control has resulted in outages, and the goal is to avoid it as much as possible.
Heterogeneity: the system should exploit differences in node capacity. This is essential for adding new nodes with higher capacity without having to upgrade all hosts at once.

System Architecture
Distributed systems techniques used in Dynamo: partitioning, replication, versioning, membership, failure handling, scaling.

Partitioning Algorithm
Consistent hashing: the output range of a hash function is treated as a fixed circular space or “ring”.
Principal advantage of consistent hashing: the departure or arrival of a node affects only its immediate neighbors; other nodes remain unaffected.
Virtual nodes: each physical node can be responsible for more than one virtual node, that is, more than one position on the ring.
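
A minimal sketch of a consistent-hashing ring, to make the "only the neighbors are affected" property concrete. It uses MD5 to place nodes and keys on the ring and gives each node a single position; the class name `Ring`, the node names and the key names are illustrative assumptions, not part of Dynamo's interface.

```python
import bisect
import hashlib


def ring_position(label: str) -> int:
    """Map a string to a point on the ring (the 128-bit MD5 output space)."""
    return int.from_bytes(hashlib.md5(label.encode()).digest(), "big")


class Ring:
    def __init__(self):
        self._points = []   # sorted ring positions of all nodes
        self._owner = {}    # ring position -> node name

    def add_node(self, node: str) -> None:
        pos = ring_position(node)
        bisect.insort(self._points, pos)
        self._owner[pos] = node

    def lookup(self, key: str) -> str:
        """Walk clockwise from the key's position to the first node, wrapping around."""
        i = bisect.bisect_right(self._points, ring_position(key))
        return self._owner[self._points[i % len(self._points)]]


ring = Ring()
for name in ("A", "B", "C"):
    ring.add_node(name)

keys = [f"key-{i}" for i in range(1000)]
before = {k: ring.lookup(k) for k in keys}

ring.add_node("D")  # a new node arrives
after = {k: ring.lookup(k) for k in keys}

moved = [k for k in keys if before[k] != after[k]]
print(len(moved), "of", len(keys), "keys moved")
print("previous owners of moved keys:", {before[k] for k in moved})
```

Every key that changes owner ends up on the new node D, and all of those keys come from the single node that follows D on the ring. Keys held by the other nodes stay where they were.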

Advantages of Using Virtual Nodes
If a node becomes unavailable, the load handled by this node is evenly dispersed across the remaining available nodes.
When a node becomes available again, or a new node is added, it accepts a roughly equivalent amount of load from each of the other available nodes.
The number of virtual nodes that a node is responsible for can be decided based on its capacity, accounting for heterogeneity in the physical infrastructure.
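
The effect of virtual nodes can be sketched as follows. Each physical node is given a number of ring positions proportional to a relative capacity; the figure of 100 tokens per capacity unit, the node names and the capacities are assumptions chosen for illustration only.

```python
import bisect
import hashlib
from collections import Counter


def ring_position(label):
    """Map a string to a point on the ring (the 128-bit MD5 output space)."""
    return int.from_bytes(hashlib.md5(label.encode()).digest(), "big")


def build_ring(capacities, tokens_per_unit=100):
    """capacities: {node: relative capacity}. Returns parallel lists (positions, owners)."""
    points = sorted(
        (ring_position(f"{node}#{i}"), node)
        for node, cap in capacities.items()
        for i in range(cap * tokens_per_unit)
    )
    return [p for p, _ in points], [n for _, n in points]


def lookup(positions, owners, key):
    """First virtual node clockwise from the key's position."""
    i = bisect.bisect_right(positions, ring_position(key)) % len(positions)
    return owners[i]


def load(capacities, keys):
    """Count how many keys each physical node is responsible for."""
    positions, owners = build_ring(capacities)
    return Counter(lookup(positions, owners, k) for k in keys)


keys = [f"key-{i}" for i in range(20000)]
full = load({"small-1": 1, "small-2": 1, "big": 2}, keys)
degraded = load({"small-2": 1, "big": 2}, keys)  # small-1 is unavailable

print("all nodes up:   ", dict(full))
print("small-1 missing:", dict(degraded))
```

With all nodes up, the node with twice the capacity should hold roughly twice as many keys. When `small-1` disappears, its keys are spread over both remaining nodes instead of landing on a single neighbor, because its virtual nodes are scattered around the ring.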

Replication
Each data item is replicated at N hosts.
Preference list: the list of nodes that are responsible for storing a particular key.
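
One way to build a preference list is to walk the ring clockwise from the key and collect the first N distinct physical nodes, skipping virtual nodes that belong to a node already chosen. This is a sketch under assumptions: the node names, the eight tokens per node and the key `cart:42` are invented for the example.

```python
import bisect
import hashlib


def ring_position(label):
    """Map a string to a point on the ring (the 128-bit MD5 output space)."""
    return int.from_bytes(hashlib.md5(label.encode()).digest(), "big")


def build_ring(nodes, tokens_per_node=8):
    """Sorted list of (position, physical node) pairs, several virtual nodes per node."""
    return sorted(
        (ring_position(f"{node}#{i}"), node)
        for node in nodes
        for i in range(tokens_per_node)
    )


def preference_list(ring, key, n):
    """First n distinct physical nodes met when walking clockwise from the key."""
    positions = [p for p, _ in ring]
    start = bisect.bisect_right(positions, ring_position(key))
    chosen = []
    for step in range(len(ring)):
        node = ring[(start + step) % len(ring)][1]
        if node not in chosen:  # skip further virtual nodes of a chosen host
            chosen.append(node)
            if len(chosen) == n:
                break
    return chosen


ring = build_ring(["A", "B", "C", "D", "E"])
prefs = preference_list(ring, "cart:42", n=3)
print(prefs)  # three distinct hosts; the first one owns the key's position
```

Skipping repeated hosts matters because, with virtual nodes, consecutive ring positions may belong to the same machine, and N replicas on one machine would not survive its failure.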

Data Versioning
A put() call may return to its caller before the update has been applied at all the replicas.
A get() call may return many versions of the same object.
Key challenge: distinct version sub-histories need to be reconciled.
Solution: use vector clocks to capture causality between different versions of the same object.

Vector Clocks
A vector clock is a list of (node, counter) pairs.
Every version of every object is associated with one vector clock.
If every counter in the first object’s clock is less than or equal to the counter for the same node in the second object’s clock, the first is an ancestor of the second and can be forgotten. Otherwise the two versions are in conflict and require reconciliation.
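
The ancestor test can be written down in a few lines. This is a minimal sketch that represents a clock as a dictionary from node name to counter; the helper names and the servers `Sx`, `Sy`, `Sz` are illustrative.

```python
def descends(a, b):
    """True if clock `a` equals or descends from clock `b` (b is an ancestor of a)."""
    return all(a.get(node, 0) >= counter for node, counter in b.items())


def concurrent(a, b):
    """True if neither clock descends from the other, so the versions conflict."""
    return not descends(a, b) and not descends(b, a)


def increment(clock, node):
    """Return a copy of `clock` with `node`'s counter advanced by one."""
    new = dict(clock)
    new[node] = new.get(node, 0) + 1
    return new


def merge(a, b):
    """Element-wise maximum, used when a client reconciles two versions."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in a.keys() | b.keys()}


d1 = increment({}, "Sx")             # first write, handled by Sx
d2 = increment(d1, "Sx")             # overwrite, again handled by Sx
d3 = increment(d2, "Sy")             # update of d2 handled by Sy
d4 = increment(d2, "Sz")             # different update of d2 handled by Sz
d5 = increment(merge(d3, d4), "Sx")  # client reconciles d3 and d4, Sx coordinates

print(descends(d2, d1))    # d1 is an ancestor of d2 and can be forgotten
print(concurrent(d3, d4))  # d3 and d4 are siblings that must be reconciled
print(d5)
```

After reconciliation, `d5` descends from both `d3` and `d4`, so a replica holding `d5` can discard either sibling.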

[Figure: Version evolution of an object over time]

Execution of get() and put() Operations
Two strategies a client can use to select a node:
1. Route its request through a generic load balancer that will select a node based on load information.
2. Use a partition-aware client library that routes requests directly to the appropriate coordinator nodes.

Temporary Failures: Sloppy Quorum
R and W are the minimum numbers of nodes that must participate in a successful read and write operation, respectively.
Setting R + W > N yields a quorum-like system.
In this model, the latency of a get (or put) operation is dictated by the slowest of the R (or W) replicas.
For this reason, R and W are usually configured to be less than N, to provide better latency.
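
A small sketch of the two claims above: the overlap condition R + W > N, and the latency effect of waiting for fewer than N replicas. It assumes the coordinator contacts all N replicas in parallel and returns as soon as the first R (or W) have answered; the replica latencies are made-up numbers.

```python
def is_quorum_like(n, r, w):
    """R + W > N guarantees that every read set overlaps every write set."""
    return r + w > n


def operation_latency(replica_latencies_ms, k):
    """Latency when waiting for the first k replies: the k-th fastest replica."""
    if not 1 <= k <= len(replica_latencies_ms):
        raise ValueError("k must be between 1 and the number of replicas")
    return sorted(replica_latencies_ms)[k - 1]


# Hypothetical response times of N = 3 replicas, one of them slow.
latencies = [12.0, 15.0, 240.0]

print(is_quorum_like(n=3, r=2, w=2))       # read and write sets overlap
print(is_quorum_like(n=3, r=1, w=1))       # no overlap guarantee
print(operation_latency(latencies, k=2))   # R = 2: the slow replica is not waited for
print(operation_latency(latencies, k=3))   # R = 3: the slow replica dominates
```

With R = 2 the read completes once the two fastest replicas have answered, so a single slow replica no longer determines the latency, while R + W > N still holds for N = 3, W = 2.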

Hinted Handoff
Assume N = 3.
When A is temporarily down or unreachable during a write, the replica is sent to D instead.
D receives a hint that the replica belongs to A, and delivers it to A when A recovers.
Again: “always writeable”.
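
The scenario on this slide can be simulated in a few lines. This is a toy model, not Dynamo's implementation: the `Node` class, the in-memory stores and the `write` / `deliver_hints` helpers are all assumptions made for the example.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.up = True
        self.store = {}    # replicas this node is responsible for
        self.hinted = []   # (intended owner name, key, value) held for other nodes


def write(key, value, preference_list, fallbacks):
    """Store a replica on every node in the preference list.

    If an owner is down, hand the replica to the next healthy fallback node
    together with a hint naming the intended owner. Returns the number of acks.
    """
    spare = iter(fallbacks)
    acks = 0
    for owner in preference_list:
        if owner.up:
            owner.store[key] = value
            acks += 1
        else:
            standin = next((f for f in spare if f.up), None)
            if standin is not None:
                standin.hinted.append((owner.name, key, value))
                acks += 1
    return acks


def deliver_hints(standin, nodes_by_name):
    """Push hinted replicas back to owners that have recovered; keep the rest."""
    remaining = []
    for owner_name, key, value in standin.hinted:
        owner = nodes_by_name[owner_name]
        if owner.up:
            owner.store[key] = value
        else:
            remaining.append((owner_name, key, value))
    standin.hinted = remaining


a, b, c, d = (Node(n) for n in "ABCD")
nodes = {n.name: n for n in (a, b, c, d)}

a.up = False                                   # A is temporarily unreachable
acks = write("cart:42", "v1", [a, b, c], [d])  # N = 3, D is the next node on the ring
print(acks, d.hinted)

a.up = True                                    # A recovers
deliver_hints(d, nodes)
print(a.store, d.hinted)
```

The write still collects three acknowledgements while A is down, which is what keeps the store "always writeable". Once A is back, D hands over the replica and drops its hinted copy.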

Handling Permanent Failures: Replica Synchronization
Merkle tree: a hash tree whose leaves are hashes of the values of individual keys. Parent nodes higher in the tree are hashes of their respective children.
Advantages:
Each branch of the tree can be checked independently, without requiring nodes to download the entire tree.
Merkle trees help reduce the amount of data that needs to be transferred while checking for inconsistencies among replicas.
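
A minimal sketch of comparing two replicas with Merkle trees. It assumes both replicas hold the same sorted list of keys for a key range and differ only in some values, and it uses SHA-256 as the hash; the function names and the sample data are illustrative.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build(values):
    """Build a tree as nested tuples (hash, left, right); a leaf is (hash, None, None)."""
    if len(values) == 1:
        return (h(values[0]), None, None)
    mid = len(values) // 2
    left, right = build(values[:mid]), build(values[mid:])
    return (h(left[0] + right[0]), left, right)


def diff(a, b, lo, hi):
    """Indices in [lo, hi) whose leaves differ, descending only into mismatching branches."""
    if a[0] == b[0]:
        return []      # identical subtree: nothing below needs to be exchanged
    if a[1] is None:
        return [lo]    # mismatching leaf
    mid = lo + (hi - lo) // 2
    return diff(a[1], b[1], lo, mid) + diff(a[2], b[2], mid, hi)


keys = [f"k{i}" for i in range(8)]
replica_a = [f"value-{i}".encode() for i in range(8)]
replica_b = list(replica_a)
replica_b[5] = b"stale-value"   # replica B missed an update to k5

tree_a, tree_b = build(replica_a), build(replica_b)
out_of_sync = diff(tree_a, tree_b, 0, len(keys))
print([keys[i] for i in out_of_sync])
```

If the root hashes match, the replicas are in sync and nothing else is compared. Otherwise only the branches with differing hashes are followed, so the comparison touches a number of tree nodes proportional to the tree depth for each stale key rather than the whole data set.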

[Table: Summary of techniques used in Dynamo and their advantages]

Implementation
Implemented in Java.
The local persistence component allows different storage engines to be plugged in:
Berkeley Database (BDB) Transactional Data Store: objects of tens of kilobytes
BDB Java Edition
MySQL: objects larger than tens of kilobytes

Dynamo’s Partitioning Scheme
Strategy 1: T random tokens per node and partition by token value.
The space needed to maintain the membership at each node increases linearly with the number of nodes in the system.

Strategy 2: T Random Tokens per Node and Equal-Sized Partitions
Divides the hash space into Q equally sized partitions.
Primary advantages are:
1. Decoupling of partitioning and partition placement.
2. Enabling the possibility of changing the placement scheme at runtime.

Strategy 3: Q/S Tokens per Node, Equal-Sized Partitions
Divides the hash space into Q equally sized partitions.
Each node is assigned Q/S tokens, where S is the number of nodes in the system.
When a node leaves the system, its tokens are randomly distributed to the remaining nodes.
When a node joins the system, it “steals” tokens from nodes already in the system.
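
The bookkeeping behind Strategy 3 can be sketched as a table from partition index to owning node. The slide does not say how donors and heirs are chosen, so the rules below (steal from the most loaded node, hand orphaned partitions to the least loaded node with random tie-breaking) are assumptions made to keep every node at Q/S tokens; Q = 12 and the node names are illustrative.

```python
import random
from collections import Counter


def initial_assignment(q, nodes):
    """Give each of the S nodes Q/S of the Q fixed partitions (round-robin)."""
    return {p: nodes[p % len(nodes)] for p in range(q)}


def join(assignment, new_node, rng):
    """The new node steals partitions, one at a time, from the most loaded node."""
    owners = set(assignment.values()) | {new_node}
    target = len(assignment) // len(owners)   # Q/S with the new S
    for _ in range(target):
        counts = Counter(assignment.values())
        counts.pop(new_node, None)            # never steal from itself
        donor = max(counts, key=counts.get)
        victim = rng.choice([p for p, n in assignment.items() if n == donor])
        assignment[victim] = new_node


def leave(assignment, node, rng):
    """Hand the departing node's partitions to the least loaded remaining nodes."""
    orphans = [p for p, n in assignment.items() if n == node]
    rng.shuffle(orphans)
    counts = Counter(n for n in assignment.values() if n != node)
    for p in orphans:
        fewest = min(counts.values())
        heir = rng.choice(sorted(n for n, c in counts.items() if c == fewest))
        assignment[p] = heir
        counts[heir] += 1


rng = random.Random(0)
table = initial_assignment(12, ["A", "B", "C"])   # Q = 12, S = 3: 4 partitions each
join(table, "D", rng)                             # S = 4: 3 partitions each
after_join = Counter(table.values())
leave(table, "B", rng)                            # S = 3 again: 4 partitions each
after_leave = Counter(table.values())
print(dict(after_join))
print(dict(after_leave))
```

Because partition boundaries never change, membership changes only rewrite entries in this table; which keys belong to which partition stays fixed.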

Strategy 3 achieves better efficiency
Faster bootstrapping/recovery: since partition ranges are fixed, they can be stored in separate files, so a partition can be relocated as a unit by simply transferring the file (avoiding the random accesses needed to locate specific items).
Ease of archival: periodic archiving of the dataset is a mandatory requirement for most of Amazon’s storage services. Archiving the entire dataset stored by Dynamo is simpler in Strategy 3 because the partition files can be archived separately.

Conclusion
Dynamo is a highly available and scalable data store for Amazon’s e-commerce platform.
Dynamo has been successful in handling server failures, data center failures and network partitions.
Dynamo is incrementally scalable and allows service owners to scale up and down based on their current request load.
Dynamo allows service owners to customize their storage system by tuning the parameters N, R and W.