Dynamo: Amazon’s Highly Available Key-value Store
DeCandia, Hastorun, Jampani, Kakulapati, Lakshman, Pilchin, Sivasubramanian, Vosshall, Vogels
PRESENTED BY: KIMIISA OSHIKOJI
OUTLINE
– Amazon
– Dynamo
– Architecture
– Performance
AMAZON
– Huge infrastructure
– Customer-oriented business
– Reliability is key
DYNAMO
– Data storage system
– Flexible
– Automated addition and removal of storage nodes
DYNAMO-REQUIREMENTS
Requirement: Effect
– Query Model: Read and write operations that are associated with a key
– ACID Properties: Properties for database transactions
– Efficiency: Systems must achieve latency and throughput requirements
– Other Assumptions: What Dynamo assumes
DYNAMO-QUERY MODEL
– Operations are identified by a key
– No operation spans multiple data items
– Stored objects are relatively small (usually under 1 MB)
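DYNAMO-ACID PROPERTIES
Property: Effect
– Atomicity: A transaction happens completely or not at all
– Consistency: A transaction takes the data from one consistent state to another
– Isolation: Data cannot be accessed by other operations while it is in an intermediate state
– Durability: Once a transaction has concluded, it will never be undone
*Dynamo accepts weaker consistency in exchange for availability, provides no isolation guarantees, and permits only single-key updates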
DYNAMO-EFFICIENCY
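DYNAMO-ASSUMPTIONS
– Used only by Amazon's internal services
– Environment is assumed non-hostile, so there are no security requirements such as authentication or authorization
– Limited scale: each service runs its own Dynamo instance, initially designed for up to hundreds of storage hosts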
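DYNAMO-SLA
– Service Level Agreement: a contract between a client and a service about their relationship, such as expected request rates and service latency
– At Amazon, a typical client request involves over 100 services, which may have dependencies of their own
– SLAs are expressed and measured at the 99.9th percentile rather than the mean or median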
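To illustrate why an SLA measured at the 99.9th percentile is stricter than one based on the mean, here is a minimal sketch using the nearest-rank method. The latency numbers are invented for illustration and are not taken from the paper.

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are at or below it."""
    ordered = sorted(samples)
    rank = math.ceil(len(ordered) * p / 100)
    return ordered[max(rank, 1) - 1]

# Hypothetical latencies in milliseconds: most requests are fast,
# a small tail (0.2% of requests) is slow.
latencies_ms = [10] * 9980 + [500] * 20

mean_ms = sum(latencies_ms) / len(latencies_ms)
median_ms = percentile(latencies_ms, 50)
p999_ms = percentile(latencies_ms, 99.9)

print(f"mean={mean_ms:.2f} ms  median={median_ms} ms  p99.9={p999_ms} ms")
```

The mean and median barely register the slow tail, while the 99.9th percentile exposes it, which is the reason for measuring there.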
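DYNAMO-DESIGN
– Focus on availability rather than strong consistency: an answer is returned even when replicas may temporarily disagree
– Eventually consistent data store
– Always writeable: writes are never rejected, so conflicts are resolved on reads
– Performance targets are set at the 99.9th percentile
– Zero-hop DHT: each node holds enough routing information to reach the right node directly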
DYNAMO-PRINCIPLES
Principle: Effect
– Incremental scalability: Storage hosts can be added one at a time without undue impact on the system
– Symmetry: All nodes have the same responsibilities
– Decentralization: Focus on peer-to-peer techniques rather than centralized control
– Heterogeneity: Work must be distributed according to the capabilities of the nodes
ARCHITECTURE-STORAGE
Objects are stored under a key using:
– Get(key): locates the object with that key and returns a single object, or a list of conflicting versions, along with a context
– Put(key, context, object): places the object at its replicas along with the key and context
– Context: opaque metadata about the object, such as its version
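The get/put interface can be sketched as a single-process toy. This is an illustrative model, not Dynamo's implementation: the class name `ToyStore` is hypothetical, and the context is simplified to the set of version numbers the client has read, where the real system carries vector clock information.

```python
class ToyStore:
    """In-memory stand-in for the get/put interface (hypothetical)."""

    def __init__(self):
        self._versions = {}   # key -> list of (version_id, value)
        self._next_id = 0

    def get(self, key):
        """Return (values, context). More than one value means the
        client must reconcile conflicting versions."""
        versions = self._versions.get(key, [])
        values = [value for _, value in versions]
        context = frozenset(version_id for version_id, _ in versions)
        return values, context

    def put(self, key, context, value):
        """Never rejects a write. Versions named in the context are
        superseded; versions the client never saw survive as siblings."""
        self._next_id += 1
        survivors = [(vid, val) for vid, val in self._versions.get(key, [])
                     if vid not in context]
        survivors.append((self._next_id, value))
        self._versions[key] = survivors


store = ToyStore()
store.put("cart", frozenset(), "a")
_, ctx = store.get("cart")

# Two clients update from the same context: both writes are kept.
store.put("cart", ctx, "a+b")
store.put("cart", ctx, "a+c")
siblings, merged_ctx = store.get("cart")

# A client that read both siblings writes back a reconciled value.
store.put("cart", merged_ctx, "a+b+c")
```

Passing the context back on put is what lets the store tell a deliberate overwrite apart from a concurrent update.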
ARCHITECTURE-HASHING
ARCHITECTURE-REPLICATION
– Data is replicated on N hosts (N is configured per Dynamo instance)
– Each key has a coordinator node, which stores the key locally and replicates it to the N-1 nodes that follow it clockwise on the ring
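A minimal sketch of how a key could be mapped to its N replica hosts on a consistent-hashing ring. The class `Ring`, the `node#i` token naming, and the default of 8 tokens per node are assumptions made for illustration. MD5 is used because the paper hashes keys with it, but any uniform hash would serve here.

```python
import bisect
import hashlib

def ring_position(name: str) -> int:
    """Map a string to a point on a 128-bit ring."""
    return int.from_bytes(hashlib.md5(name.encode()).digest(), "big")

class Ring:
    def __init__(self, nodes, tokens_per_node=8):
        # Each physical node owns several positions (virtual nodes).
        self._tokens = sorted(
            (ring_position(f"{node}#{i}"), node)
            for node in nodes
            for i in range(tokens_per_node)
        )
        self._positions = [pos for pos, _ in self._tokens]

    def preference_list(self, key, n):
        """Walk clockwise from the key's position and collect the first
        n distinct physical nodes. The first one is the coordinator."""
        start = bisect.bisect_right(self._positions, ring_position(key))
        chosen = []
        for step in range(len(self._tokens)):
            _, node = self._tokens[(start + step) % len(self._tokens)]
            if node not in chosen:
                chosen.append(node)
            if len(chosen) == n:
                break
        return chosen


ring = Ring(["A", "B", "C", "D", "E"])
replicas = ring.preference_list("cart:42", n=3)
coordinator = replicas[0]
```

Skipping tokens that belong to an already chosen node keeps the N replicas on N different physical hosts even though each host appears on the ring several times.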
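ARCHITECTURE-VERSIONING
– Multiple versions of an object can exist at the same time
– Vector clocks, lists of (node, counter) pairs, are used for version control and show which version descends from which
– Vector clock size issue: clocks grow as more nodes coordinate writes, so the oldest pair is dropped once a size threshold is reached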
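The vector clock comparison can be sketched as follows. Clocks are modeled as plain dicts from node name to counter. The node names `Sx`, `Sy`, `Sz` and the helper names are illustrative, and the size-threshold truncation is left out to keep the sketch short.

```python
def increment(clock, node):
    """Return a copy of the clock with this node's counter advanced."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated

def descends(a, b):
    """True if clock a has seen every event recorded in clock b."""
    return all(a.get(node, 0) >= count for node, count in b.items())

def merge(a, b):
    """Element-wise maximum, used when a client reconciles siblings."""
    return {node: max(a.get(node, 0), b.get(node, 0)) for node in {*a, *b}}

def reconcile(versions):
    """Drop every (clock, value) that is a strict ancestor of another
    version. Whatever remains is mutually concurrent."""
    return [
        (clock, value)
        for i, (clock, value) in enumerate(versions)
        if not any(
            j != i and other != clock and descends(other, clock)
            for j, (other, _) in enumerate(versions)
        )
    ]


d2 = increment(increment({}, "Sx"), "Sx")   # two writes coordinated by Sx
d3 = increment(d2, "Sy")                    # update handled by Sy
d4 = increment(d2, "Sz")                    # concurrent update handled by Sz
siblings = reconcile([(d2, "v2"), (d3, "v3"), (d4, "v4")])
d5 = increment(merge(d3, d4), "Sx")         # client reconciles, Sx coordinates
```

Here `d2` is discarded because both later versions descend from it, while `d3` and `d4` are both kept because neither descends from the other.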
ARCHITECTURE-FAILURE
Failure Type: Description
– Temporary failure of node: The replica that would have been on the failed node is sent to another node with a hint as to its original destination
– Permanent failure of node: Replica synchronization ensures that no information is lost
*Failures are not automatically detected by a central node
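The handling of a temporary failure can be sketched as below. The `Node` class and the functions `write` and `deliver_hints` are hypothetical names for illustration. Real nodes would communicate over the network and detect failures through timeouts, not through an `up` flag.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.up = True
        self.data = {}     # replicas this node owns
        self.hints = []    # (intended_owner, key, value) held for others

def write(key, value, preferred, fallbacks):
    """Store on every healthy preferred node. For each preferred node
    that is down, park the replica on a healthy fallback with a hint."""
    spares = iter(node for node in fallbacks if node.up)
    for owner in preferred:
        if owner.up:
            owner.data[key] = value
            continue
        stand_in = next(spares, None)
        if stand_in is None:
            raise RuntimeError("no healthy fallback node available")
        stand_in.hints.append((owner, key, value))

def deliver_hints(node):
    """Hand parked replicas back to owners that have recovered and
    keep the hints whose owners are still down."""
    still_waiting = []
    for owner, key, value in node.hints:
        if owner.up:
            owner.data[key] = value
        else:
            still_waiting.append((owner, key, value))
    node.hints = still_waiting


a, b, c, d = Node("A"), Node("B"), Node("C"), Node("D")
b.up = False
write("cart", "v1", preferred=[a, b, c], fallbacks=[d])   # D holds B's replica
b.up = True
deliver_hints(d)                                          # B catches up
```

The write still reaches three nodes while B is down, so availability and the replica count are preserved during the outage.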
ARCHITECTURE-ADDING
Discovery Type: Description
– Internal: Gossip-based protocol that leads to an eventually consistent membership list
– External: Seed nodes, known to all nodes in the system
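A small simulation of how gossip plus a seed node could lead to a consistent membership list. It assumes each node's view is a dict from member name to a version number and that the higher version wins on merge. These details, and the function names, are illustrative assumptions, not Dynamo's actual wire protocol.

```python
import random

def merge_views(mine, theirs):
    """Combine two membership views, keeping the newest entry per member."""
    merged = dict(mine)
    for member, version in theirs.items():
        if version > merged.get(member, 0):
            merged[member] = version
    return merged

def bootstrap(names, seed):
    """Every node starts out knowing only itself and the seed node."""
    return {name: ({name: 1} if name == seed else {name: 1, seed: 1})
            for name in names}

def gossip_round(views, rng):
    """Each node contacts one random peer it already knows about and
    the two reconcile their views."""
    for name in sorted(views):
        known_peers = sorted(m for m in views[name] if m != name)
        if not known_peers:
            continue
        peer = rng.choice(known_peers)
        merged = merge_views(views[name], views[peer])
        views[name] = merged
        views[peer] = dict(merged)


views = bootstrap(["a", "b", "c", "d", "e"], seed="a")
rng = random.Random(0)
for _ in range(10):
    gossip_round(views, rng)
```

Because every node knows the seed from the start, no group of nodes can gossip only among themselves and stay cut off from the rest.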
PERFORMANCE-BUFFER
– The system can be optimized without sacrificing the 99.9th percentile
– Buffering writes in memory can decrease 99.9th percentile latency by a factor of 5 during peak traffic, at some cost in durability
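The buffering idea can be sketched as a write-back buffer in front of slower storage. `BufferedStore` is a hypothetical class. It flushes when the buffer fills up, which is a simplification chosen to keep the example deterministic; a periodic background flush is another option.

```python
class BufferedStore:
    def __init__(self, capacity=1000):
        self.capacity = capacity
        self.buffer = {}       # fast, in memory, lost on a crash
        self.disk = {}         # stand-in for slow durable storage
        self.disk_writes = 0   # batched flushes performed so far

    def put(self, key, value):
        """Acknowledge after an in-memory write. Durable storage is
        touched only when the buffer fills up."""
        self.buffer[key] = value
        if len(self.buffer) >= self.capacity:
            self.flush()

    def get(self, key):
        """Check the buffer first so reads see unflushed writes."""
        if key in self.buffer:
            return self.buffer[key]
        return self.disk.get(key)

    def flush(self):
        """Write the whole buffer to durable storage in one batch."""
        if self.buffer:
            self.disk.update(self.buffer)
            self.disk_writes += 1
            self.buffer.clear()


store = BufferedStore(capacity=3)
store.put("k1", "v1")
store.put("k2", "v2")      # still only in memory
store.put("k3", "v3")      # buffer full: one batched disk write
```

Anything still in the buffer is lost if the node crashes before a flush, which is the durability trade-off that comes with the lower latency.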
PERFORMANCE-LOAD DISTRIBUTION
Partitioning scheme: Description
– T random tokens per node, partition by token value: Range sizes vary because tokens are selected at random
– T random tokens per node, equal-sized partitions: Tokens are used only to map values in the hash space to nodes
– Q/S tokens per node, equal-sized partitions: Each node in the system must always have Q/S tokens assigned to it (Q partitions, S nodes)
*The third strategy is the best in terms of load balancing
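A sketch of the third scheme: the hash space is cut into Q equal partitions and each of the S nodes owns exactly Q/S of them. The round-robin assignment and the function names are assumptions made for illustration. Any placement that keeps Q/S partitions per node fits the scheme.

```python
HASH_SPACE = 2 ** 128   # size of the ring, assuming a 128-bit hash

def partition_of(key_hash, q):
    """Index of the equal-sized partition that a hash value falls into."""
    return key_hash * q // HASH_SPACE

def assign_partitions(q, nodes):
    """Give every node exactly q / len(nodes) partitions (round-robin)."""
    s = len(nodes)
    if q % s != 0:
        raise ValueError("Q must be a multiple of S for an even split")
    return {partition: nodes[partition % s] for partition in range(q)}

def tokens_per_node(assignment):
    """Count how many partitions each node owns."""
    counts = {}
    for node in assignment.values():
        counts[node] = counts.get(node, 0) + 1
    return counts


assignment = assign_partitions(q=12, nodes=["A", "B", "C", "D"])
load = tokens_per_node(assignment)             # every node owns 12 / 4 = 3
owner = assignment[partition_of(2 ** 127, q=12)]
```

Since partition boundaries are fixed, adding or removing a node only moves whole partitions between nodes and never changes where the boundaries lie.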
QUESTIONS?