DYNAMO: AMAZON'S HIGHLY AVAILABLE KEY-VALUE STORE
Presenters: Pourya Aliabadi, Boshra Ardallani, Paria Rakhshani
Professor: Dr. Sheykh Esmaili

INTRODUCTION
- Amazon runs a worldwide e-commerce platform that serves tens of millions of customers at peak times, using tens of thousands of servers located in many data centers around the world
- Reliability at massive scale is one of the biggest challenges Amazon faces; even the slightest outage has significant financial consequences and impacts customer trust

INTRODUCTION
- One of the lessons Amazon has learned from operating its platform is that the reliability and scalability of a system depend on how its application state is managed
- To meet these reliability and scaling needs, Amazon has developed a number of storage technologies, of which the Amazon Simple Storage Service (S3) is probably the best known
- Many services on Amazon's platform only need primary-key access to a data store

SYSTEM ASSUMPTIONS AND REQUIREMENTS
Query Model:
- Simple read and write operations to a data item that is uniquely identified by a key
- State is stored as binary objects (blobs)
- No operations span multiple data items
- Dynamo targets applications that need to store relatively small objects (usually less than 1 MB)

SYSTEM ASSUMPTIONS AND REQUIREMENTS
ACID Properties:
- ACID (Atomicity, Consistency, Isolation, Durability) is a set of properties that guarantee that database transactions are processed reliably
- Dynamo targets applications that can operate with weaker consistency in exchange for higher availability
- Dynamo does not provide any isolation guarantees and permits only single-key updates

SYSTEM ASSUMPTIONS AND REQUIREMENTS
Efficiency:
- The system needs to function on a commodity hardware infrastructure
- Services must be able to configure Dynamo such that they consistently achieve their latency and throughput requirements
- The tradeoffs are in performance, cost efficiency, availability, and durability guarantees

SYSTEM ASSUMPTIONS AND REQUIREMENTS
- Dynamo is used only by Amazon's internal services, so its operating environment is assumed to be non-hostile and there are no security requirements such as authentication and authorization
- We will discuss the scalability limitations of Dynamo and possible scalability-related extensions

SERVICE LEVEL AGREEMENTS (SLA)
- To guarantee that an application can deliver its functionality in a bounded time, each and every dependency in the platform needs to deliver its functionality with even tighter bounds
- An example of a simple SLA: a service guarantees that it will provide a response within 300 ms for 99.9% of its requests, for a peak client load of 500 requests per second
- A page request to one of the e-commerce sites typically requires the rendering engine to construct its response by sending requests to over 150 services
- These services often have multiple dependencies
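To make the 99.9th-percentile SLA concrete, here is a small sketch (an illustration, not Amazon's tooling) that checks a batch of measured request latencies against a 300 ms target; the latency samples and the sla_ms/pct parameters are hypothetical.

```python
# Illustrative only: check measured latencies against a 99.9th-percentile SLA.
def percentile(samples, pct):
    """Return the pct-th percentile (0-100) using a simple nearest-rank rule."""
    ordered = sorted(samples)
    rank = max(0, min(len(ordered) - 1, int(round(pct / 100.0 * len(ordered))) - 1))
    return ordered[rank]

def meets_sla(latencies_ms, sla_ms=300.0, pct=99.9):
    return percentile(latencies_ms, pct) <= sla_ms

if __name__ == "__main__":
    import random
    random.seed(0)
    # Simulated latencies: most requests are fast, a small tail is slow.
    samples = [random.gauss(40, 10) for _ in range(10_000)]
    samples += [random.uniform(200, 500) for _ in range(20)]
    print("p99.9 =", round(percentile(samples, 99.9), 1), "ms; SLA met:", meets_sla(samples))
```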

Figure: an abstract view of the architecture of Amazon's platform.

DESIGN CONSIDERATIONS
- Incremental scalability: Dynamo should be able to scale out one storage host (henceforth referred to as a "node") at a time, with minimal impact on both operators of the system and the system itself
- Symmetry: every node in Dynamo should have the same set of responsibilities as its peers; there should be no distinguished nodes that take on special roles or an extra set of responsibilities

DESIGN CONSIDERATIONS
- Decentralization: an extension of symmetry; the design should favor decentralized peer-to-peer techniques over centralized control. In the past, centralized control has resulted in outages, and the goal is to avoid it as much as possible. This leads to a simpler, more scalable, and more available system.
- Heterogeneity: the system needs to be able to exploit heterogeneity in the infrastructure it runs on, e.g. the work distribution must be proportional to the capabilities of the individual servers. This is essential for adding new nodes with higher capacity without having to upgrade all hosts at once.

SYSTEM ARCHITECTURE
- Dynamo stores items that are each associated with a single key
- Two operations are exposed: get() and put()
- get(key): locates the object associated with the key and returns the object, or a list of conflicting versions, along with a context
- put(key, context, object): places the object at the appropriate replicas along with the key and context
- Context: metadata about the object, opaque to the caller, such as version information
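A minimal sketch of what this single-key get/put interface could look like from a client's point of view; the class and method names are hypothetical, and the context is just an opaque dict standing in for version metadata.

```python
# Hypothetical client-side view of Dynamo's two operations (not the real API).
class KeyValueClient:
    def __init__(self):
        # Stand-in for the distributed store: key -> list of (context, object) versions.
        self._store = {}

    def get(self, key):
        """Return (objects, context): one object, or several conflicting versions."""
        versions = self._store.get(key, [])
        objects = [obj for _, obj in versions]
        context = {"versions": [ctx for ctx, _ in versions]}  # opaque to the caller
        return objects, context

    def put(self, key, context, obj):
        """Write obj under key; the context says which versions this write supersedes."""
        self._store[key] = [(context, obj)]

client = KeyValueClient()
client.put("cart:alice", {"versions": []}, b"item-42")
objects, ctx = client.get("cart:alice")
```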

PARTITIONING
- Provides a mechanism to dynamically partition the data over the set of nodes
- Uses consistent hashing, similar to Chord
- Each node gets an ID from the same space as the keys
- Nodes are arranged in a ring
- A data item is stored on the first node encountered while walking the ring clockwise from the position its key hashes to
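A minimal sketch of such a consistent-hashing ring, assuming MD5 as the hash function and a few hypothetical node names; a key is assigned to the first node clockwise from its hash position.

```python
# Minimal consistent-hashing ring (illustrative; node names are hypothetical).
import bisect
import hashlib

def ring_hash(value: str) -> int:
    return int(hashlib.md5(value.encode()).hexdigest(), 16)  # 128-bit ring position

class Ring:
    def __init__(self, nodes):
        # Each node is placed at one point on the ring (virtual nodes come later).
        self._points = sorted((ring_hash(n), n) for n in nodes)

    def node_for(self, key: str) -> str:
        """First node clockwise from the key's position (wrapping around at the end)."""
        positions = [p for p, _ in self._points]
        i = bisect.bisect_right(positions, ring_hash(key)) % len(self._points)
        return self._points[i][1]

ring = Ring(["node-a", "node-b", "node-c"])
print(ring.node_for("cart:alice"))
```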

VIRTUAL NODES
- Each physical node is mapped to multiple points in the ring, i.e. virtual nodes
- Advantages of virtual nodes:
  - Graceful handling of the failure of a node: its load is dispersed across the remaining nodes
  - Easy accommodation of a new node: it takes a roughly equivalent amount of load from each existing node
  - Heterogeneity in the physical infrastructure can be exploited: the number of virtual nodes assigned to a node can reflect its capacity
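Extending the ring sketch above, each physical node can be hashed to several ring positions (tokens). The snippet below shows one simple way such tokens could be generated, with the token count standing in for capacity; the scheme, names, and counts are illustrative.

```python
# Virtual nodes: map each physical node to multiple ring positions ("tokens").
import hashlib

def tokens_for(node: str, count: int):
    """Derive `count` ring positions for one physical node (illustrative scheme)."""
    return [int(hashlib.md5(f"{node}#{i}".encode()).hexdigest(), 16) for i in range(count)]

# A higher-capacity node can simply be given more tokens.
ring_points = sorted(
    (pos, node)
    for node, capacity in [("node-a", 8), ("node-b", 8), ("node-c", 16)]
    for pos in tokens_for(node, capacity)
)
print(len(ring_points), "ring positions for 3 physical nodes")
```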

REPLICATION
- Each data item is replicated at N hosts; N is configured per Dynamo instance
- Each node is responsible for the region of the ring between itself and its Nth predecessor
- Preference list: the list of nodes responsible for storing a particular key
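A sketch of how a preference list could be derived from a virtual-node ring: walk clockwise from the key's position and collect the first N distinct physical nodes, skipping additional virtual nodes of a host already in the list. Node names and N are hypothetical.

```python
# Building a preference list of N distinct physical nodes from a virtual-node ring.
import bisect
import hashlib

def ring_hash(value: str) -> int:
    return int(hashlib.md5(value.encode()).hexdigest(), 16)

# (position, physical node) pairs; each node appears at several positions.
ring = sorted(
    (ring_hash(f"{node}#{i}"), node)
    for node in ["node-a", "node-b", "node-c", "node-d"]
    for i in range(8)
)

def preference_list(key: str, n: int = 3):
    """First n distinct physical nodes encountered walking clockwise from the key."""
    start = bisect.bisect_right([p for p, _ in ring], ring_hash(key))
    nodes, i = [], 0
    while len(nodes) < n and i < len(ring):
        node = ring[(start + i) % len(ring)][1]
        if node not in nodes:          # skip further virtual nodes of the same host
            nodes.append(node)
        i += 1
    return nodes

print(preference_list("cart:alice", n=3))
```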

VERSIONING
- Multiple versions of an object can be present in the system at the same time
- Vector clocks are used for version control
- Vector clock size issue: clocks can grow if writes to the same object are coordinated by many different nodes; Dynamo bounds their size by truncating the oldest entries once a threshold is reached
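A minimal vector-clock sketch, assuming a clock is a map from node name to counter: one version descends from another if all of its counters are at least as large; otherwise the versions conflict and both are kept for the client to reconcile.

```python
# Minimal vector clock: {node_name: counter}. Illustrative, not Dynamo's code.
def increment(clock, node):
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated

def descends(a, b):
    """True if the version with clock `a` is a descendant of (or equal to) `b`."""
    return all(a.get(node, 0) >= count for node, count in b.items())

def conflict(a, b):
    return not descends(a, b) and not descends(b, a)

v1 = increment({}, "node-a")            # written via node-a
v2 = increment(v1, "node-a")            # overwritten via node-a -> descends from v1
v3 = increment(v1, "node-b")            # concurrent write via node-b
print(descends(v2, v1), conflict(v2, v3))   # True True
```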

EXECUTION OF GET() AND PUT() OPERATIONS
- Operations can originate at any node in the system
- Coordinator: the node handling a read or write operation
- The coordinator sends the request to the N highest-ranked reachable nodes in the preference list and waits for R responses for a read and W responses for a write, where R + W > N (a quorum-like system)
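A toy sketch of coordinator-side quorum logic under this R/W scheme: send to the nodes in the preference list and consider the operation successful once R (reads) or W (writes) acknowledgements arrive. Everything here (the in-memory "replicas", node names, and parameters) is illustrative.

```python
# Toy quorum coordinator over in-memory "replicas" (illustrative only).
N, R, W = 3, 2, 2                      # commonly cited configuration; R + W > N
replicas = {"node-a": {}, "node-b": {}, "node-c": {}}   # node -> {key: value}

def coordinate_put(key, value, preference_list):
    acks = 0
    for node in preference_list[:N]:
        replicas[node][key] = value    # in reality: an RPC that might fail
        acks += 1
        if acks >= W:
            return True                # write considered successful after W acks
    return False

def coordinate_get(key, preference_list):
    responses = []
    for node in preference_list[:N]:
        if key in replicas[node]:
            responses.append(replicas[node][key])
        if len(responses) >= R:
            break
    return responses                   # may contain divergent versions to reconcile

prefs = ["node-a", "node-b", "node-c"]
coordinate_put("cart:alice", b"item-42", prefs)
print(coordinate_get("cart:alice", prefs))
```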

HANDLING FAILURES
- Temporary failures: hinted handoff, a mechanism to ensure that read and write operations do not fail due to temporary node or network failures; a replica intended for an unreachable node is written to another node along with a hint about the intended recipient, and is delivered back once that node recovers
- Permanent failures: replica synchronization, in which a node synchronizes with another node in the background, using Merkle trees to detect divergent key ranges and minimize the data transferred
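A small sketch of why Merkle trees help with replica synchronization: each replica hashes its keys into a tree, so comparing roots cheaply detects divergence, and lower levels (here, just the leaves) identify which entries actually differ. This is a simplified illustration, not Dynamo's anti-entropy code.

```python
# Simplified Merkle-tree comparison between two replicas (illustrative).
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def build_tree(leaves):
    """Return bottom-up levels of hashes; leaves is a list of (key, value) pairs."""
    level = [h(f"{k}:{v}".encode()) for k, v in sorted(leaves)]
    levels = [level]
    while len(level) > 1:
        level = [h(level[i] + (level[i + 1] if i + 1 < len(level) else b""))
                 for i in range(0, len(level), 2)]
        levels.append(level)
    return levels

def differing_leaves(a, b):
    """Indices of leaf hashes that differ; matching roots mean the replicas agree."""
    if a[-1] == b[-1]:
        return []
    return [i for i, (x, y) in enumerate(zip(a[0], b[0])) if x != y]

replica1 = [("k1", "v1"), ("k2", "v2"), ("k3", "v3"), ("k4", "v4")]
replica2 = [("k1", "v1"), ("k2", "STALE"), ("k3", "v3"), ("k4", "v4")]
print(differing_leaves(build_tree(replica1), build_tree(replica2)))   # [1]
```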

MEMBERSHIP AND FAILURE DETECTION
- An explicit mechanism is available to initiate the addition and removal of nodes from a Dynamo ring
- To prevent logical partitions, some Dynamo nodes play the role of seed nodes, which are known to all nodes
- A gossip-based distributed protocol propagates membership changes and performs failure detection
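A toy round of gossip-style membership propagation, assuming each node keeps a version-stamped view of membership and periodically reconciles it with one random peer; the node names and data structure are illustrative.

```python
# Toy gossip round: each node merges its membership view with one random peer's.
import random

# node -> {member: version}; the higher version wins during reconciliation.
views = {
    "node-a": {"node-a": 3, "node-b": 1},
    "node-b": {"node-a": 2, "node-b": 5, "node-c": 1},
    "node-c": {"node-c": 2},
}

def merge(mine, theirs):
    merged = dict(mine)
    for member, version in theirs.items():
        if version > merged.get(member, -1):
            merged[member] = version
    return merged

def gossip_round():
    for node in views:
        peer = random.choice([n for n in views if n != node])
        merged = merge(views[node], views[peer])
        views[node] = views[peer] = merged   # both sides reconcile

random.seed(1)
for _ in range(3):
    gossip_round()
print(views["node-c"])   # views converge toward the full membership map
```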

IMPLEMENTATION
Each storage node has three main software components:
- Request coordination: built on top of an event-driven messaging substrate; a coordinator executes client read and write requests; a state machine is created on the node serving each request; each state machine instance handles exactly one client request and contains the entire process and failure-handling logic
- Membership and failure detection
- Local persistence engine: pluggable storage engines, including the Berkeley Database (BDB) Transactional Data Store, BDB Java Edition, MySQL, and an in-memory buffer with a persistent backing store; the engine is chosen based on the application's object size distribution
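One way the "pluggable local persistence engine" idea could be expressed is a small interface with interchangeable backends; the class names below are hypothetical stand-ins for the BDB/MySQL/in-memory engines the slide mentions.

```python
# Hypothetical pluggable storage-engine interface (class names are illustrative).
from abc import ABC, abstractmethod
from typing import Optional

class StorageEngine(ABC):
    @abstractmethod
    def read(self, key: bytes) -> Optional[bytes]: ...
    @abstractmethod
    def write(self, key: bytes, value: bytes) -> None: ...

class InMemoryEngine(StorageEngine):
    """Stand-in for the in-memory buffer with a persistent backing store."""
    def __init__(self):
        self._data = {}
    def read(self, key):
        return self._data.get(key)
    def write(self, key, value):
        self._data[key] = value

# A BDB- or MySQL-backed engine would implement the same two methods;
# per the slide, the engine is chosen by the application's object size distribution.
engine: StorageEngine = InMemoryEngine()
engine.write(b"cart:alice", b"item-42")
print(engine.read(b"cart:alice"))
```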

EXPERIENCES, RESULTS & LESSONS LEARNT
Main Dynamo usage patterns:
1. Business-logic-specific reconciliation, e.g. merging different versions of a customer's shopping cart
2. Timestamp-based reconciliation, e.g. maintaining a customer's session information
3. High-performance read engine, e.g. maintaining the product catalog and promotional items
Client applications can tune the following parameters to achieve their desired performance, availability, and durability:
- N: the number of hosts each data item is replicated at
- R: the minimum number of nodes that must participate in a successful read operation
- W: the minimum number of nodes that must participate in a successful write operation
Commonly used configuration: (N, R, W) = (3, 2, 2)
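As a quick sanity check of the quorum condition behind these knobs, the snippet below verifies R + W > N for a few configurations, including the commonly used (3, 2, 2); the alternative configurations are hypothetical examples.

```python
# Quorum sanity check for a few (N, R, W) configurations (examples are hypothetical).
def overlapping_quorums(n, r, w):
    """R + W > N guarantees a read quorum overlaps the most recent write quorum."""
    return r + w > n

for n, r, w in [(3, 2, 2), (3, 1, 3), (3, 3, 1), (3, 1, 1)]:
    print((n, r, w), "overlapping" if overlapping_quorums(n, r, w) else "may read stale data")
```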

EXPERIENCES, RESULTS & LESSONS LEARNT
Balancing performance and durability:
- Figure: average and 99.9th percentile latencies of Dynamo's read and write operations during a period of 30 days
- Figure: comparison of 99.9th percentile latencies for buffered vs. non-buffered writes over 24 hours

CONCLUSION
Dynamo:
- Is a highly available and scalable data store
- Is used for storing the state of a number of core services of Amazon.com's e-commerce platform
- Has provided the desired levels of availability and performance and has successfully handled server failures, data center failures, and network partitions
- Is incrementally scalable
- Sacrifices consistency under certain failure scenarios and makes extensive use of object versioning
- Demonstrates that decentralized techniques can be combined to provide a single highly available system

THANKS