Dynamo: Amazon's Highly Available Key-Value Store Offense: Jori and Ning.

Outline
o Presentation (Ning)
o Symmetry (Jori)
o WAN Considerations (Ning)
o Consistency (Jori)
o Disaster Recovery (Ning)
o Minor Quibbles (Jori, Ning)

Presentation (Ning)

Dynamo:
o The basic functions are simple.
o The system implementation could be very complex.

This leads to many gaps in the explanation. Topics that are mentioned but not explained include:
o overload handling
o state transfer
o concurrency
o job scheduling
o request marshalling
o request routing
o system monitoring
o alarming
o configuration management

If you don't want to talk about them, don't mention them.

Presentation contd.

It is almost impossible to understand some concepts without reading the cited material.
o Some concepts are used but not well explained:
  - the gossip protocol
  - vector clocks
o Some concepts are not so important: SLA
o Too wordy: at least give a numbered list
o No clear diagram: please use a flow chart!

Despite the length and the many cited resources, it is still very difficult to use the article as a design document.
o Many open-source clones (Cassandra, Voldemort, Riak) have tried.
o Many design concerns aren't touched upon:
  - Why is the decentralized structure better?

The reader must be well-versed in distributed computing concepts to really understand what's going on during the first read-through.
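Since the slide above names vector clocks as a concept the paper uses without explaining, here is a minimal sketch of the general technique. It is not taken from the paper: the function names and the node labels "Sx", "Sy", "Sz" are hypothetical, and a clock is assumed to be a plain dict mapping a node name to a counter.

```python
def increment(clock, node):
    """Return a copy of the clock with this node's counter advanced by one."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def descends(a, b):
    """True if clock a has seen every event recorded in clock b."""
    return all(a.get(node, 0) >= count for node, count in b.items())


def compare(a, b):
    """Classify two versions by their clocks."""
    if descends(a, b) and descends(b, a):
        return "equal"
    if descends(a, b):
        return "a supersedes b"
    if descends(b, a):
        return "b supersedes a"
    return "concurrent"  # neither saw the other: the application must reconcile


# Hypothetical history: two writes handled by Sx, then the lineage forks.
v1 = increment({}, "Sx")
v2 = increment(v1, "Sx")
v3 = increment(v2, "Sy")  # one client updates via Sy
v4 = increment(v2, "Sz")  # another client updates via Sz at the same time

older_vs_newer = compare(v2, v1)
fork = compare(v3, v4)
```

Here `v2` supersedes `v1`, so the older version can be discarded, while `v3` and `v4` are concurrent and both must be kept until something merges them.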

Symmetry (Jori)

There are direct contradictions in regard to symmetry:
o In section 2.3: "Symmetry: Every node in Dynamo should have the same set of responsibilities as its peers; there should be no distinguished node or nodes that take special roles or extra set of responsibilities."
o In section 4.8.2: "To prevent logical partitions, some Dynamo nodes play the role of seeds... Seeds can be obtained either from static configuration or from a configuration service. Typically seeds are fully functional nodes in the Dynamo ring."

Symmetry contd.

No justification is given for this design choice except that it "simplifies the process of system provisioning and maintenance."

Membership and failure detection are presented in a hand-wavy manner.

In this sort of system, specialization can simplify the overall design. Symmetry is not necessary for high availability.
o Chubby (Google's Paxos-based distributed lock service) uses a master coordinator approach, which results in much simpler consistency algorithms. It allows updates to be serialized, which prevents conflicts.
o A distributed directory service layer for lookup would fix Dynamo's scalability issue, since nodes would no longer have to gossip the entire routing table.

Symmetry contd.

Network connectivity is not symmetric, e.g. connections between nodes in the same data center are different from those between nodes in separate data centers.
o The symmetric ring-based system does not reflect this inherent asymmetry.

Server hardware configurations are inherently asymmetric.

By making a symmetric system, you rule out the advantages of specialization. One can no longer use different hardware for different components of a complex system.

WAN Considerations (Ning)

There is no clear introduction to the interactions between data centers.

When a Dynamo cluster spans a WAN, the odds of nodes rejoining the cluster and remaining out of date are significantly increased.
o If a node goes down, 'hinted handoff' sends updates to the next node in the ring.
o Since nodes of two data centers alternate, the updates are sent to the remote data center.
o When the node rejoins the cluster, if the network is partitioned (which happens all the time), the node will not catch up on pending updates for a long time (until the network partition is healed).

Authentication and authorization are ignored in this paper. However, these could cause problems in ring membership management.
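The hinted-handoff scenario on this slide can be sketched as a toy simulation. This is a simplification under stated assumptions, not Dynamo's actual implementation: each key has a single owner, the hint goes to the first live node clockwise, and the class, function, node, and data center names are all hypothetical.

```python
class Node:
    def __init__(self, name, dc):
        self.name = name
        self.dc = dc       # data center label (hypothetical)
        self.up = True
        self.data = {}
        self.hints = []    # (intended owner name, key, value)


def write(ring, owner_idx, key, value):
    """Write to the owner, or leave a hint on the next live node clockwise."""
    owner = ring[owner_idx]
    if owner.up:
        owner.data[key] = value
        return owner
    for step in range(1, len(ring)):
        candidate = ring[(owner_idx + step) % len(ring)]
        if candidate.up:
            candidate.hints.append((owner.name, key, value))
            return candidate
    raise RuntimeError("no live node available")


def deliver_hints(ring, partitioned):
    """Replay hints to their owners; cross-DC delivery fails while partitioned."""
    by_name = {node.name: node for node in ring}
    for holder in ring:
        remaining = []
        for owner_name, key, value in holder.hints:
            owner = by_name[owner_name]
            blocked = partitioned and holder.dc != owner.dc
            if owner.up and not blocked:
                owner.data[key] = value
            else:
                remaining.append((owner_name, key, value))
        holder.hints = remaining


# Nodes of two data centers alternate around the ring, as on the slide.
ring = [Node("A1", "east"), Node("B1", "west"),
        Node("A2", "east"), Node("B2", "west")]

ring[0].up = False
holder = write(ring, 0, "cart", ["book"])   # hint lands in the other DC

ring[0].up = True
deliver_hints(ring, partitioned=True)
stale_after_rejoin = "cart" not in ring[0].data

deliver_hints(ring, partitioned=False)
caught_up = ring[0].data.get("cart") == ["book"]
```

With alternating placement the hint always lands in the remote data center, so a rejoining node stays stale for as long as the WAN partition lasts.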

Consistency (Jori)

Principle of Symmetry and Decentralization
o Centralization does not mean low availability, and consistency does not need to be sacrificed for high availability: see BigTable + GFS.
o A decentralized architecture usually adds a lot of complexity.
o For handling transient failures, hinted handoff is complicated.

"0.06% of inconsistent values"
o Amazon handles millions of transactions a day, so this ends up being a lot: at 0.06%, every million transactions yields about 600 inconsistent values.

Consistency contd.

Stale reads are possible and inconvenient.
o A node that has been down for a significant amount of time can rejoin a cluster completely out of date. There is no resynchronization barrier for reentry and no concept of how far behind the node is. Merkle trees lead to slow catch-up.
o Dynamo provides no bounds on stale reads, to the detriment of developers; e.g. a stale read could indirectly lead to an incorrect write, which is hard to track.

Practical implications:
o Committed writes may not show up in subsequent reads.
o Committed writes may show up in some subsequent reads, but then go missing.
o There is no SLA for when writes are globally committed, i.e. when no nodes are still playing catch-up.
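For readers unfamiliar with the Merkle-tree catch-up mentioned above, the following is a rough sketch of how two replicas can locate divergent keys by comparing hashes from the root down. It illustrates the general technique only: it assumes both replicas hold the same key set, and the function names and sample keys are hypothetical.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_levels(items):
    """Return hash levels for a key/value dict: leaves first, root last."""
    level = [h(f"{k}={v}".encode()) for k, v in sorted(items.items())]
    levels = [level]
    while len(level) > 1:
        pairs = [level[i:i + 2] for i in range(0, len(level), 2)]
        level = [h(b"".join(pair)) for pair in pairs]
        levels.append(level)
    return levels


def differing_leaves(a, b):
    """Descend from the root, visiting only subtrees whose hashes differ.

    Assumes both trees were built over the same set of keys.
    """
    top = len(a) - 1
    if a[top][0] == b[top][0]:
        return []            # roots match: replicas agree, nothing to transfer
    suspects = [0]
    for depth in range(top, 0, -1):
        children = []
        for idx in suspects:
            for child in (2 * idx, 2 * idx + 1):
                below_a, below_b = a[depth - 1], b[depth - 1]
                if child < len(below_a) and below_a[child] != below_b[child]:
                    children.append(child)
        suspects = children
    return suspects


fresh = {"k0": "a", "k1": "b", "k2": "c", "k3": "d"}
stale = {"k0": "a", "k1": "b", "k2": "OLD", "k3": "d"}

diff = differing_leaves(merkle_levels(fresh), merkle_levels(stale))
stale_keys = [sorted(fresh)[i] for i in diff]
```

The comparison narrows the transfer to the keys that differ, but each divergent range still has to be walked and exchanged, which is consistent with the slide's complaint that a node far behind catches up slowly.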

Consistency contd.

Conflict Resolution
o Dynamo exposes resolution logic to the developer, making application logic more complex.
  - Since there are no bounds on stale reads and no centralized commit log, returned data may be woefully out of date.
o As noted before, this data loss can lead to unexpected situations that are hard to predict.
o If the returned object is a list, deleted items may reemerge after a conflict (shopping cart example).
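The shopping cart example above can be made concrete with a small sketch. It assumes the application reconciles divergent versions by taking their union, which is one plausible merge rule for a cart; the function name and the items are hypothetical.

```python
def merge_carts(versions):
    """Reconcile divergent cart versions by taking the union of their items."""
    merged = set()
    for cart in versions:
        merged |= cart
    return merged


base = {"book", "lamp"}

# Two clients update the same cart concurrently on different replicas.
replica_a = base - {"lamp"}   # client 1 deletes the lamp
replica_b = base | {"pen"}    # client 2 adds a pen and never saw the delete

merged = merge_carts([replica_a, replica_b])
lamp_resurfaced = "lamp" in merged
```

A union merge never loses an added item, but it cannot tell a deleted item from one that was simply never seen, so the lamp comes back.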

Disaster Recovery (Ning)

Disaster:
o If an entire data center fails, there is no way to describe the state of the surviving data centers, so data loss is unbounded:
  - One cannot quantify exactly how much data was lost.
  - The lost data may remain corrupted forever.
o Lost data can result in stale reads:
  - These are transactional inconsistencies that most applications are ill-equipped to handle.

Recovery:
o The paper does not outline how disk corruption and disk failures are handled.
o With standard log-shipping-based replication, one can at least keep track of the replication log and therefore have a general idea of how far behind a surviving cluster is.

Minor Quibbles
o Amazon implemented the system in Java, but gave no justification as to why. If the concern is providing high-speed availability, why do it in a slow language like Java?
o There are a few grammar and spelling mistakes throughout; the paper could have used a couple more read-throughs.
o We wish there were comparisons of various (N,R,W) configuration schemes.
o The size constraint on objects limits its applications.
o End of section 4.4: "However, this problem has not surfaced in production and therefore this issue has not been thoroughly investigated."
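As a rough illustration of the (N,R,W) comparison wished for above, the sketch below tabulates two basic properties of a configuration: whether read and write sets must overlap (R + W > N), and how many replica failures reads and writes can tolerate. The function name and the chosen configurations are hypothetical, and the overlap check assumes strict quorums, which Dynamo's sloppy quorum does not always provide.

```python
def describe(n, r, w):
    """Summarize a strict-quorum (N, R, W) configuration."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("R and W must be between 1 and N")
    return {
        # Every read set shares at least one replica with every write set.
        "read_sees_latest_write": r + w > n,
        # Replicas that may be down while the operation still succeeds.
        "read_failures_tolerated": n - r,
        "write_failures_tolerated": n - w,
    }


configs = [(3, 2, 2), (3, 1, 1), (3, 3, 1), (3, 1, 3)]
table = {cfg: describe(*cfg) for cfg in configs}

for cfg, props in table.items():
    print(cfg, props)
```

In this table (3,1,1) is the most available but gives no overlap guarantee, while (3,3,1) and (3,1,3) keep the overlap at the cost of tolerating no failures on reads or writes respectively.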