Published by Brooke York
1
Paper Survey of DHT (Distributed Hash Table)
2
Usages
- Directory service
  - Very small amounts of information, such as URIs, metadata, …
- Storage
  - Data, such as files, …
  - Immutable, just for download
- Database
  - Each entry is small, but there are a large number of entries
  - Mutable
  - Special operations for queries
3
Challenges
- Immutable
  - Latency
  - Availability
  - Query consistency
- Mutable
  - Object consistency
4
Latency
- Query
  - Different routing architectures: Chord, Tapestry, Pastry, Kademlia, CAN, …
  - Recursive or iterative forwarding
  - Proximity Neighbor Route
  - Parallel queries
  - Routing table size
- Fetch
  - Transport protocol
  - Proximity Neighbor Selection
  - Cache
  - Distributed object
5
Query: Routing Architectures
- Routing complexity: O(log n), O(d), O(1), …
- Principle
  - Each peer has a unique digest
  - Each object has a digest
  - Put the object on the peer with the closest digest
- The well-known architectures are O(log n)
- O(1) cache
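The placement principle on this slide can be sketched in a few lines. This is only an illustration under stated assumptions: SHA-1 as the digest function and XOR as the distance metric (the metric used by Kademlia) are choices made for the example, and the peer names are hypothetical. Other architectures define "closest" differently.

```python
import hashlib


def digest(data: bytes) -> int:
    """Return a 160-bit SHA-1 digest as an integer (hash choice is an assumption)."""
    return int.from_bytes(hashlib.sha1(data).digest(), "big")


def closest_peer(key: int, peer_ids: list[int]) -> int:
    """Return the peer ID with the smallest XOR distance to the key."""
    return min(peer_ids, key=lambda pid: pid ^ key)


# Hypothetical peers and object, for illustration only.
peers = [digest(f"peer-{i}".encode()) for i in range(8)]
obj_key = digest(b"example object")
owner = closest_peer(obj_key, peers)
print(f"object {obj_key:040x} is stored on peer {owner:040x}")
```

A real DHT never sees the full peer list in one place; routing exists precisely to find this peer in O(log n) hops using only local routing tables.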
7
Query: Recursive or Iterative
- The query is forwarded recursively
- Theoretically 2 times faster than iterative forwarding
- Primary parameters
  - Base
  - # of successors
- Persistent problem
8
Query: Recursive or Iterative
- The query is forwarded iteratively
- Not very slow in practice
- Primary parameters
  - # of parallel queries
- Routing table tree
- Learns new neighbors easily
- Exchanges information with other peers
- Flexible
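The "2 times faster" figure can be seen with a simple message-count model. This is a sketch under strong assumptions that are not in the slides: every one-way message costs the same delay, the recursive query is answered directly from the last peer to the origin, and each step of the other style is a full round trip from the origin.

```python
def recursive_cost(hops: int, one_way_delay: float) -> float:
    """Query is passed peer to peer, then one direct reply to the origin."""
    return (hops + 1) * one_way_delay


def iterative_cost(hops: int, one_way_delay: float) -> float:
    """Origin contacts each hop itself: one round trip per hop."""
    return 2 * hops * one_way_delay


for hops in (1, 4, 16):
    r = recursive_cost(hops, 50.0)
    i = iterative_cost(hops, 50.0)
    print(f"hops={hops:2d} recursive={r:6.0f} ms iterative={i:6.0f} ms ratio={i / r:.2f}")
```

The ratio approaches 2 as the hop count grows. In practice the gap narrows because the origin-driven style can issue several queries in parallel and retry around slow peers.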
10
Query: Proximity Neighbor Route
- Route via a node with smaller delay
- Small delay -> small timeout
- TCP > Vivaldi > fixed
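A TCP-style timeout derives a per-neighbor timeout from measured round-trip times, which is why a small delay yields a small timeout. The sketch below uses the standard TCP smoothing constants (1/8 and 1/4) and the usual "mean plus four deviations" rule; whether the surveyed systems use exactly these constants is an assumption, and the class name is hypothetical.

```python
class RttEstimator:
    """Per-neighbor timeout estimate in the style of TCP's retransmission timer."""

    ALPHA = 1 / 8  # weight of a new sample in the smoothed RTT
    BETA = 1 / 4   # weight of a new sample in the RTT variation

    def __init__(self) -> None:
        self.srtt: float | None = None
        self.rttvar: float | None = None

    def observe(self, rtt: float) -> None:
        """Fold one measured round-trip time into the estimate."""
        if self.srtt is None:
            self.srtt = rtt
            self.rttvar = rtt / 2
        else:
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - rtt)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * rtt

    def timeout(self) -> float:
        """Return the current timeout; requires at least one sample."""
        if self.srtt is None:
            raise ValueError("no RTT samples yet")
        return self.srtt + 4 * self.rttvar


est = RttEstimator()
for sample_ms in (100.0, 100.0):
    est.observe(sample_ms)
print(est.timeout())
```

A fixed timeout must be set for the slowest expected neighbor, so every failure on a fast link costs far more waiting than necessary.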
12
Query: Proximity Neighbor Route
- Measurement methods
  - Global sampling
  - Neighbor's neighbors
  - Neighbor's inverse
  - Recursive sampling
14
Query: others
- Parallel query
  - Faster
  - With partial PNS property
  - Persistent
  - More traffic
- Large routing table
  - Easy to find a closer node locally
15
Fetch: Cache
- Cache objects on nodes closer to the primary one
- The number of caching nodes depends on the popularity of the object
- Average query hops can be reduced to a constant number (O(1))
- Hard to apply to mutable objects
- Under churn, caching consumes more bandwidth
16
Fetch: Distributed Object
- Split the object into small pieces and put them on different nodes
- Recover faster
- Download faster
- Hard to maintain
- Only for immutable data
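Splitting an object into pieces can be sketched as follows. The piece size, the use of SHA-1, and the idea of keying each piece by the digest of its own content are assumptions made for the example, not details given on the slide.

```python
import hashlib


def split_object(data: bytes, piece_size: int) -> list[tuple[str, bytes]]:
    """Cut data into fixed-size pieces, each keyed by the digest of its content."""
    pieces = []
    for offset in range(0, len(data), piece_size):
        piece = data[offset:offset + piece_size]
        pieces.append((hashlib.sha1(piece).hexdigest(), piece))
    return pieces


def reassemble(pieces: list[tuple[str, bytes]]) -> bytes:
    """Join pieces back together in their original order."""
    return b"".join(piece for _, piece in pieces)


data = bytes(range(256)) * 4  # 1024 bytes of sample data
pieces = split_object(data, 300)
for key, piece in pieces:
    print(key, len(piece))
```

Because each key is derived from the piece's content, changing a piece changes its key and therefore its home node. That is one reason the approach suits immutable data.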
17
Fetch: Transport Protocol
- Striped Transport Protocol
  - UDP
  - Window control
  - Retransmission
18
Availability
- Replicate
  - Reactive / proactive
  - Eager / lazy repair
- Erasure coding
- Load balance is broken
  - High correlation between uptime and storage
- Maintenance traffic problem
19
Availability: Replicate
- Reactive
  - Duplicate when a copy is lost
  - Consumes a lot of bandwidth in a short time
  - When churn is low, reactive is better
- Proactive
  - Duplicate continually
  - Consumes a small, constant amount of bandwidth
  - Needs availability prediction and redundancy management
  - Bandwidth usage is predictable
21
Availability: Replicate
- Temporary / permanent churn
  - Availability
  - Durability
  - Achieve 100% availability and/or durability?
- Eager repair
  - Duplicate immediately
- Lazy repair
  - Duplicate after a timeout
  - Needs a good choice of timeout
- Reintegrating returning replicas
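The difference between eager and lazy repair comes down to how long a silent replica holder is still counted as alive. The following sketch makes that explicit; the function name, the `last_seen` bookkeeping, and the example numbers are all hypothetical.

```python
def replicas_to_create(last_seen: dict[str, float], now: float,
                       timeout: float, target: int) -> int:
    """Count holders heard from within `timeout` and return the shortfall.

    timeout == 0 behaves like eager repair: any holder not heard from
    right now is treated as lost. A larger timeout gives lazy repair.
    """
    alive = sum(1 for seen in last_seen.values() if now - seen <= timeout)
    return max(0, target - alive)


# Hypothetical replica holders and the time each was last heard from.
last_seen = {"a": 100.0, "b": 95.0, "c": 40.0}
for timeout in (0.0, 30.0, 120.0):
    missing = replicas_to_create(last_seen, now=100.0, timeout=timeout, target=3)
    print(f"timeout={timeout:5.1f} -> create {missing} new replica(s)")
```

Reintegration fits the same model: when a node returns, its `last_seen` entry is refreshed and its copy counts toward the target again, so copies made during a temporary absence are not wasted.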
22
Availability: Erasure Coding
- Matters more for larger objects
- Saves storage and bandwidth
- Under high churn, the bandwidth consumption is still not acceptable
- Complex maintenance
- Download latency is heterogeneous
- Only for immutable data
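The storage trade-off between replication and erasure coding can be checked numerically. The sketch assumes that each node is up independently with probability `p`, and that an erasure-coded object is readable when any `m` of its `n` fragments are reachable; both are modelling assumptions, and the parameter values are illustrative.

```python
from math import comb


def replication_availability(p: float, copies: int) -> float:
    """Probability that at least one of `copies` full replicas is up."""
    return 1 - (1 - p) ** copies


def erasure_availability(p: float, m: int, n: int) -> float:
    """Probability that at least m of n fragments are up (binomial tail)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(m, n + 1))


# Both schemes below store 2x the object size.
for p in (0.5, 0.9):
    rep = replication_availability(p, copies=2)
    ec = erasure_availability(p, m=8, n=16)
    print(f"p={p}: 2 replicas -> {rep:.6f}, 8-of-16 coding -> {ec:.6f}")
```

Under this model coding only pays off when individual nodes are fairly reliable, and it says nothing about repair bandwidth, which is the cost the slide flags under high churn.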
23
Query Consistency
- If a digest-to-object mapping exists, a query for that digest must return it
- Weakly consistent KBR
  - Eventual consistency
  - Most existing DHTs
- Strongly consistent KBR
  - Causal consistency
  - Strong consistency
- Solution
  - Route by W-KBR to a group
  - S-KBR within the group
24
Mutable DHT
- Objects stored in the DHT are mutable
  - Insert, update, delete
- Churn -> replica
- New challenge …
25
Object Consistency
- For immutable data
  - May still be needed for security reasons
  - Merkle tree
- For mutable data
  - Consensus algorithm
    - Distributed algorithm for data consistency
  - Quorum algorithm
  - Read / write locks
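For immutable data, a Merkle tree lets a reader verify blocks fetched from untrusted peers against a single trusted root digest. The sketch below computes such a root; SHA-256 and the rule of duplicating the last node on odd-sized levels are one common convention chosen for the example, not something the slide specifies.

```python
import hashlib


def merkle_root(blocks: list[bytes]) -> bytes:
    """Hash each block, then hash pairs upward until one digest remains."""
    if not blocks:
        raise ValueError("need at least one block")
    level = [hashlib.sha256(block).digest() for block in blocks]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last node on odd levels
        level = [hashlib.sha256(level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return level[0]


blocks = [b"block-0", b"block-1", b"block-2"]
root = merkle_root(blocks)
tampered = merkle_root([b"block-0", b"block-X", b"block-2"])
print(root.hex())
print(root != tampered)
```

Any change to any block changes the root, so a reader who trusts the root can detect a corrupted or forged block.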
26
Pitfalls
- Different kinds of P2P systems have different properties
- Lack of new real traces
- Standard simulation platform
27
References
- Efficient Replica Maintenance for Distributed Storage Systems
- Proactive Replication for Data Durability
- On Object Maintenance in Peer-to-Peer Systems
- Enforcing Routing Consistency in Structured Peer-to-Peer Overlays: Should We and Could We?
- High Availability in DHTs: Erasure Coding vs. Replication
- Toward Fault-tolerant Atomic Data Access in Mutable Distributed Hash Tables
- Kademlia: A Peer-to-peer Information System Based on the XOR Metric
- Total Recall: System Support for Automated Availability Management
- Designing a DHT for Low Latency and High Throughput
28
References
- Fallacies in Evaluating Decentralized Systems
- Anatomy of a P2P Content Distribution System with Network Coding
- Comparing the Performance of Distributed Hash Tables under Churn
- EpiChord: Parallelizing the Chord Lookup Algorithm with Reactive Routing State Management
- Bandwidth-efficient Management of DHT Routing Tables
- Improving Lookup Performance over a Widely-Deployed DHT
- Failure Recovery for Structured P2P Networks: Protocol Design and Performance Evaluation
- Handling Churn in a DHT