Published by Eleanor Drusilla McKenzie. Modified over 9 years ago.
1
Dynamo: Amazon’s Highly Available Key-value Store
DeCandia, Hastorun, Jampani, Kakulapati, Lakshman, Pilchin, Sivasubramanian, Vosshall, Vogels
PRESENTED BY: KIMIISA OSHIKOJI
2
OUTLINE
– Amazon
– Dynamo
– Architecture
– Performance
3
AMAZON
– Huge infrastructure
– Customer-oriented business
– Reliability is key
4
DYNAMO
– Data storage system
– Flexible
– Automated addition and removal of storage nodes
5
DYNAMO-REQUIREMENTS
Requirement | Effect
Query Model | Read and write operations that are associated with a key
ACID Properties | Properties for database transactions
Efficiency | Systems must achieve latency and throughput requirements
Other Assumptions | What Dynamo assumes
6
DYNAMO-QUERY MODEL
– Each data item is uniquely identified by a key
– No operation spans multiple data items
– Stored objects are relatively small (usually under 1 MB)
7
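DYNAMO-ACID PROPERTIES
Property | Effect
Atomicity | A transaction happens completely or not at all
Consistency | A transaction takes the data from one valid state to another
Isolation | Data cannot be accessed by other operations while it is in an intermediate state
Durability | Once a transaction has concluded, it will never be undone
*Dynamo relaxes consistency in favor of availability, provides no isolation guarantees, and permits only single-key updates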
8
DYNAMO-EFFICIENCY
9
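DYNAMO-ASSUMPTIONS
– Only used by internal Amazon systems
– The environment is assumed to be non-hostile, so there are no security requirements such as authentication or authorization
– Limited scalability: each service runs its own Dynamo instance, initially targeted at up to hundreds of storage hosts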
10
DYNAMO-SLA
– Service Level Agreement (SLA): a contract between a client and a service about their relationship
– At Amazon, a typical client request involves over 100 services, which may themselves have dependencies
– SLAs are expressed and measured at the 99.9th percentile
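To make the 99.9th-percentile idea concrete, here is a minimal sketch of how a tail percentile differs from a mean. The latency numbers are invented for illustration, and `percentile` is a hypothetical helper (nearest-rank method, with the percentile given in per-mille so the arithmetic stays in integers), not something taken from the Dynamo paper.

```python
def percentile(samples, per_mille):
    """Nearest-rank percentile; per_mille=999 means the 99.9th percentile."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    # Integer ceiling of len * per_mille / 1000 gives the 1-based rank.
    rank = -(-len(ordered) * per_mille // 1000)
    return ordered[max(rank, 1) - 1]


# Hypothetical request latencies in milliseconds: 998 fast requests, 2 slow ones.
latencies_ms = [10] * 998 + [300, 450]

mean_ms = sum(latencies_ms) / len(latencies_ms)
p999_ms = percentile(latencies_ms, 999)

print(f"mean = {mean_ms:.2f} ms, 99.9th percentile = {p999_ms} ms")
```

In this made-up sample the mean stays close to the fast requests, while the 99.9th percentile is dominated by the slow ones, which is why an SLA written against the tail is much stricter than one written against the average.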
11
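DYNAMO-DESIGN
– Favors availability over strong consistency: an answer is returned quickly even if it may not reflect the latest write
– Eventually consistent data store
– Always writeable: writes are never rejected, and conflicts are resolved at read time
– Performance targets are set at the 99.9th percentile
– Zero-hop DHT: each node keeps enough routing information to reach the right node directly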
12
DYNAMO-PRINCIPLES
Principle | Effect
Incremental scalability | The system can scale out one storage host at a time without undue impact on the system
Symmetry | All nodes have the same responsibilities
Decentralization | Peer-to-peer techniques are favored over centralized control
Heterogeneity | Work must be distributed in proportion to the capabilities of the nodes
13
ARCHITECTURE-STORAGE
Objects are stored with a key using:
– Get(key): locates the object replicas for the key and returns a single object, or a list of objects with conflicting versions, along with a context
– Put(key, context, object): places the object at its replicas along with the key and context
– Context: opaque metadata about the object, such as its version
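The shape of this interface can be sketched with a single-process stand-in. Everything here (`TinyStore`, `Context`, the version-id scheme) is hypothetical and only illustrates how a context returned by a read is passed back on a write, with the object given as a third argument to `put`; it is not Dynamo's implementation and involves no replication.

```python
import itertools
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    """Opaque metadata: here, the ids of the versions the caller has seen."""
    seen: frozenset = frozenset()


class TinyStore:
    """In-memory stand-in for the get/put interface (assumption: one process)."""

    def __init__(self):
        self._versions = {}            # key -> {version_id: value}
        self._ids = itertools.count(1)

    def get(self, key):
        versions = self._versions.get(key, {})
        # Return every live version plus a context describing what was read.
        return list(versions.values()), Context(frozenset(versions))

    def put(self, key, context, value):
        versions = self._versions.setdefault(key, {})
        # The new write supersedes only the versions named in the context.
        for version_id in context.seen:
            versions.pop(version_id, None)
        versions[next(self._ids)] = value


store = TinyStore()
store.put("cart", Context(), ["book"])
values, ctx = store.get("cart")
store.put("cart", ctx, ["book", "pen"])   # supersedes the version just read
store.put("cart", ctx, ["book", "mug"])   # stale context: kept as a sibling
siblings, _ = store.get("cart")
print(siblings)
```

The second write with the same, now stale, context does not overwrite the first, so the final read returns two objects, mirroring the "list of objects" case on the slide.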
14
ARCHITECTURE-HASHING
15
ARCHITECTURE-REPLICATION
– Data is replicated on N hosts (N is configured per Dynamo instance)
– Each key has a coordinator node, which stores the key locally and replicates it to the next N-1 nodes clockwise on the ring
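The hashing and replication slides can be illustrated together with a small consistent-hashing sketch. It assumes one ring position per node (no virtual nodes) and uses MD5 to place both nodes and keys on a 128-bit ring; the `Ring` class and the node names are hypothetical.

```python
import bisect
import hashlib


def ring_position(name: str) -> int:
    """Map a node name or key to a position on a 128-bit ring."""
    return int.from_bytes(hashlib.md5(name.encode()).digest(), "big")


class Ring:
    def __init__(self, nodes):
        self._points = sorted((ring_position(n), n) for n in nodes)
        self._positions = [pos for pos, _ in self._points]

    def preference_list(self, key, n):
        """Coordinator (first node clockwise of the key) plus its successors."""
        start = bisect.bisect_right(self._positions, ring_position(key))
        count = min(n, len(self._points))
        return [self._points[(start + i) % len(self._points)][1]
                for i in range(count)]


ring = Ring(["A", "B", "C", "D", "E"])
replicas = ring.preference_list("cart:42", n=3)
print("coordinator:", replicas[0], "replicas:", replicas)
```

A useful property of this layout is that removing a node that is not among a key's N replicas leaves that key's replica list unchanged, so membership changes only move data for neighbouring ranges.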
16
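ARCHITECTURE-VERSIONING
– Multiple versions of an object can exist in the system at the same time
– Vector clocks, lists of (node, counter) pairs, are used to track causality between versions
– Vector clocks can grow as more nodes coordinate writes; Dynamo truncates them by dropping the oldest pair once a size threshold is reached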
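A minimal sketch of vector-clock comparison follows, with clocks represented as plain dictionaries from node name to counter. The helper names and the node names `Sx`, `Sy`, `Sz` are illustrative assumptions; clock truncation is left out.

```python
def increment(clock, node):
    """Return a copy of the clock with the coordinating node's counter bumped."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def descends(a, b):
    """True if clock a has seen every event that clock b has seen."""
    return all(a.get(node, 0) >= count for node, count in b.items())


def compare(a, b):
    a_ge_b, b_ge_a = descends(a, b), descends(b, a)
    if a_ge_b and b_ge_a:
        return "equal"
    if a_ge_b:
        return "a supersedes b"
    if b_ge_a:
        return "b supersedes a"
    return "conflict"  # concurrent versions: both must be kept and reconciled


def merge(a, b):
    """Element-wise maximum, used when a client reconciles two versions."""
    return {node: max(a.get(node, 0), b.get(node, 0)) for node in a.keys() | b.keys()}


d1 = increment({}, "Sx")      # first write, coordinated by Sx
d2 = increment(d1, "Sx")      # overwrite through Sx
d3 = increment(d2, "Sy")      # update through Sy
d4 = increment(d2, "Sz")      # concurrent update through Sz
d5 = increment(merge(d3, d4), "Sx")  # reconciled write through Sx

print(compare(d2, d1), "|", compare(d3, d4), "|", compare(d5, d3))
```

Here `d3` and `d4` both descend from `d2` but not from each other, so they are reported as a conflict, and the reconciled `d5` supersedes both.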
17
ARCHITECTURE-FAILURE
Failure Type | Description
Temporary failure of node | The replica that would have been on the failed node is sent to another node with a hint as to its original destination (hinted handoff)
Permanent failure of node | Replica synchronization ensures that no information is lost
*Failures are not automatically detected by a central node
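The temporary-failure row can be sketched as follows. The `Node` class, the `write` and `deliver_hints` functions, and the way stand-in nodes are chosen are all simplifying assumptions made for illustration; a real system would also persist hints and retry delivery in the background.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.up = True
        self.data = {}    # replicas this node owns
        self.hints = []   # (intended_owner_name, key, value) held for others


def write(key, value, preference, fallbacks):
    """Write to each intended replica, diverting to a stand-in if one is down."""
    spare = iter(fallbacks)
    for owner in preference:
        if owner.up:
            owner.data[key] = value
        else:
            # Raises StopIteration if no healthy stand-in is left.
            stand_in = next(node for node in spare if node.up)
            stand_in.hints.append((owner.name, key, value))


def deliver_hints(holder, nodes_by_name):
    """Hand hinted replicas back to owners that have recovered."""
    remaining = []
    for owner_name, key, value in holder.hints:
        owner = nodes_by_name[owner_name]
        if owner.up:
            owner.data[key] = value
        else:
            remaining.append((owner_name, key, value))
    holder.hints = remaining


a, b, c, d = (Node(n) for n in "ABCD")
nodes = {n.name: n for n in (a, b, c, d)}

b.up = False
write("cart:42", "v1", preference=[a, b, c], fallbacks=[d])
b.up = True
deliver_hints(d, nodes)
print(b.data, d.hints)
```

While `B` is down the write still reaches three nodes, and once `B` recovers the stand-in returns the replica and drops its hint.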
18
ARCHITECTURE-ADDING
Discovery Type | Description
Internal | Gossip-based protocol that leads to an eventually consistent membership list
External | Seed nodes, known to all nodes in the system
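How gossip and a seed node lead to a consistent membership list can be sketched with a toy simulation. It assumes that each node starts out knowing only itself and one seed, that membership only grows (no removals), and that a node's view is a plain set of names; the function names are hypothetical.

```python
import random


def reconcile(view_a, view_b):
    """Both peers end up with the union of what either one knew."""
    view_a.update(view_b)
    view_b.update(view_a)


def gossip_until_converged(views, rng, max_rounds=100):
    """Each round, every node reconciles with one randomly chosen known peer."""
    everyone = set(views)
    for round_no in range(1, max_rounds + 1):
        for name, view in views.items():
            peers = sorted(view - {name})
            if peers:
                reconcile(view, views[rng.choice(peers)])
        if all(view == everyone for view in views.values()):
            return round_no
    raise RuntimeError("membership did not converge")


# Every node knows itself and the seed; the seed initially knows only itself.
views = {"seed": {"seed"}}
for name in ["n1", "n2", "n3", "n4"]:
    views[name] = {name, "seed"}

rounds = gossip_until_converged(views, random.Random(0))
print(f"converged after {rounds} round(s)")
```

Because every node can reach the seed, no group of nodes can stay isolated from the rest, which is the role the slide assigns to seed nodes.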
19
PERFORMANCE-BUFFER
– The system can be optimized without sacrificing 99.9th percentile performance
– Buffering writes in memory can decrease 99.9th percentile latency by a factor of 5 during peak traffic times
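The buffering idea can be sketched as a small write-back buffer in front of a slower store. This is an illustration under stated assumptions: `BufferedStore` is a hypothetical class, the `disk` dictionary stands in for a durable storage engine, and the flush is triggered by buffer size rather than by a periodic writer.

```python
class BufferedStore:
    """Write-back buffer: writes land in memory first, reads check it first."""

    def __init__(self, capacity=1000):
        self.capacity = capacity
        self.buffer = {}   # recent writes, held in memory
        self.disk = {}     # stand-in for the durable storage engine

    def put(self, key, value):
        self.buffer[key] = value
        if len(self.buffer) >= self.capacity:
            self.flush()

    def get(self, key):
        if key in self.buffer:
            return self.buffer[key]
        return self.disk.get(key)

    def flush(self):
        self.disk.update(self.buffer)
        self.buffer.clear()


store = BufferedStore(capacity=2)
store.put("a", 1)
print(store.get("a"), store.disk)   # served from memory, nothing durable yet
store.put("b", 2)                   # reaches capacity and triggers a flush
print(store.buffer, store.disk)
```

The trade-off is durability: anything still in the buffer is lost if the process crashes before a flush, so a design like this gains latency at the cost of a window of unpersisted writes.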
20
PERFORMANCE-LOAD DISTRIBUTION
Partitioning scheme | Description
T random tokens per node, partition by token value | Range sizes vary because tokens are selected at random
T random tokens per node, equal-sized partitions | Tokens are used only to map values in the hash space to nodes
Q/S tokens per node, equal-sized partitions | Each node in the system must always have Q/S tokens assigned to it (Q partitions, S nodes)
*The third strategy is the best in terms of load balancing
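The third strategy can be sketched as a fixed set of Q equal-sized partitions shared evenly among S nodes. The round-robin initial assignment and the `add_node` stealing rule below are assumptions chosen for simplicity, and the sketch expects Q to be divisible by the number of nodes before and after a join.

```python
import hashlib
from collections import Counter


def partition_for(key: str, q: int) -> int:
    """Index of the equal-sized slice of the 128-bit hash space holding the key."""
    h = int.from_bytes(hashlib.md5(key.encode()).digest(), "big")
    return (h * q) >> 128


def assign_equal(q, nodes):
    """Give each of the S nodes exactly Q/S partitions (round-robin)."""
    if q % len(nodes):
        raise ValueError("Q must be divisible by the number of nodes")
    return {p: nodes[p % len(nodes)] for p in range(q)}


def add_node(assignment, new_node):
    """A joining node steals partitions until every node holds Q/(S+1)."""
    counts = Counter(assignment.values())
    target = len(assignment) // (len(counts) + 1)
    updated, taken = dict(assignment), 0
    for partition, owner in assignment.items():
        if taken < target and counts[owner] > target:
            updated[partition] = new_node
            counts[owner] -= 1
            taken += 1
    return updated


assignment = assign_equal(12, ["A", "B", "C"])
grown = add_node(assignment, "D")
print(Counter(assignment.values()), Counter(grown.values()))
print("cart:42 lives in partition", partition_for("cart:42", 12))
```

Because partition boundaries never move, a key's partition stays the same when nodes join; only the ownership of whole partitions changes.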
21
QUESTIONS?