Distributed Lookup Systems
Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications
A Scalable Content-Addressable Network
Motivation: How to find data in a distributed system?
[Figure: a publisher stores Key="LetItBe", Value=MP3 data somewhere on the Internet; a client issues Lookup("LetItBe") and must discover which of nodes N1–N5 holds it.]
Applications
- Peer-to-peer systems: Napster, Gnutella, Groove, FreeNet, …
- Large-scale storage management systems: Publius, OceanStore, PAST, Farsite, CFS, …
- Mirroring (web caching)
- Any wide-area name resolution system
Outline
- Types of solutions
- Evaluation criteria
- CAN and Chord: basic idea, insert/retrieve, join/leave, recovery from failures
Centralized Solution: a central server holds the index (Napster)
[Figure: the publisher registers Key="LetItBe", Value=MP3 data with a central DB; the client sends Lookup("LetItBe") to that same DB.]
Distributed Solution (1): Flooding (Gnutella, Morpheus, etc.)
[Figure: the client's Lookup("LetItBe") is flooded from node to node until it reaches the publisher.]
Worst case: O(m) messages per lookup.
Distributed Solution (2): Routed messages (Freenet, Tapestry, Chord, CAN, etc.)
[Figure: the client's Lookup("LetItBe") is routed hop by hop toward the node that stores Key="LetItBe", Value=MP3 data.]
Routing Challenges
- Define a useful nearness metric
- Keep the hop count small
- Keep the routing table the right size
- Stay robust despite rapid changes in membership
Evaluation Criteria
- Scalability: routing path length, per-node state
- Latency
- Load balancing
- Robustness: routing fault tolerance, data availability
Content Addressable Network (CAN)
Basic idea: an Internet-scale hash table.
- Interface: insert(key, value) and value = retrieve(key)
- The table is partitioned among many individual nodes
CAN – Solution
- A virtual d-dimensional Cartesian coordinate space
- Dynamically partitioned into zones, each "owned" by one node
- A key is hashed to a point P in the space
- (key, value) is stored at the node that owns the zone containing P (see the sketch below)
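As an illustration (not from the slides), a minimal Python sketch of that mapping: per-dimension hash functions send a key to a point in the unit d-cube, and an assumed Zone class checks whether the point falls inside a node's region.

```python
import hashlib

def hash_to_point(key: str, d: int = 2) -> tuple:
    """Map a key to a point in the unit d-cube: one hash per dimension (illustrative)."""
    coords = []
    for i in range(d):
        h = hashlib.sha1(f"{i}:{key}".encode()).hexdigest()
        coords.append(int(h, 16) / 16 ** 40)   # normalize the 160-bit digest to [0, 1)
    return tuple(coords)

class Zone:
    """The hyper-rectangle of the coordinate space owned by one CAN node."""
    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi              # per-dimension lower/upper bounds

    def contains(self, point) -> bool:
        return all(l <= x < h for l, x, h in zip(self.lo, point, self.hi))

# Example: a node owning the left half of a 2-d space.
zone = Zone(lo=(0.0, 0.0), hi=(0.5, 1.0))
P = hash_to_point("LetItBe")
print(P, zone.contains(P))
```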
CAN – Example: node I::insert(K,V)
(1) Compute the point: a = hx(K), b = hy(K)
(2) Route (K,V) through the CAN toward the point (a,b)
(3) The node owning the zone that contains (a,b) stores (K,V)
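The same three steps as a tiny runnable sketch, with illustrative names and with routing collapsed into a direct owner lookup; assume four nodes that each own one quadrant of a 2-d space.

```python
import hashlib

def hx(key):   # hash the key to an x coordinate in [0, 1) (illustrative)
    return int(hashlib.sha1(b"x:" + key.encode()).hexdigest(), 16) / 16 ** 40

def hy(key):   # hash the key to a y coordinate in [0, 1)
    return int(hashlib.sha1(b"y:" + key.encode()).hexdigest(), 16) / 16 ** 40

# Four nodes, each owning one quadrant of the unit square: (x_lo, y_lo, x_hi, y_hi).
zones = {"A": (0.0, 0.0, 0.5, 0.5), "B": (0.5, 0.0, 1.0, 0.5),
         "C": (0.0, 0.5, 0.5, 1.0), "D": (0.5, 0.5, 1.0, 1.0)}
store = {name: {} for name in zones}

def owner(a, b):
    """The node whose zone contains (a, b); a real CAN would route here hop by hop."""
    return next(n for n, (x0, y0, x1, y1) in zones.items() if x0 <= a < x1 and y0 <= b < y1)

def insert(key, value):
    a, b = hx(key), hy(key)            # step (1): map the key to a point
    store[owner(a, b)][key] = value    # steps (2)+(3): deliver to and store at the owner

def retrieve(key):
    a, b = hx(key), hy(key)
    return store[owner(a, b)].get(key)

insert("LetItBe", "MP3 data")
print(retrieve("LetItBe"))             # -> 'MP3 data'
```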
CAN – Routing
- Each node maintains routing state only for its neighbors
- A message carries the destination coordinates
- Greedy forwarding: send to the neighbor whose coordinates are closest to the destination
[Figure: a message travels zone by zone from (x,y) toward (a,b).]
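A sketch of greedy forwarding under simplifying assumptions: each zone is summarized by its centre point, and the topology is a hand-built strip of four neighboring nodes.

```python
import math

class Node:
    """A CAN node, summarized by its zone centre plus its neighbor set (illustrative)."""
    def __init__(self, name, centre):
        self.name, self.centre = name, centre
        self.neighbors = []            # nodes whose zones abut this one

def greedy_route(start, dst):
    """Forward hop by hop to whichever neighbor is closest to the destination point."""
    path, current = [start.name], start
    while True:
        best = min(current.neighbors + [current],
                   key=lambda n: math.dist(n.centre, dst))
        if best is current:            # no neighbor is closer: current owns dst's zone
            return path
        current = best
        path.append(current.name)

# Example: four zones in a 1x4 strip, neighbors chained left to right.
a, b, c, d = (Node(n, (x, 0.5)) for n, x in zip("ABCD", (0.125, 0.375, 0.625, 0.875)))
a.neighbors, b.neighbors, c.neighbors, d.neighbors = [b], [a, c], [b, d], [c]
print(greedy_route(a, (0.9, 0.5)))     # -> ['A', 'B', 'C', 'D']
```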
CAN – Node Insertion
1) The new node discovers some node "I" already in the CAN
2) The new node picks a random point (p,q) in the coordinate space
3) I routes to (p,q) and discovers node J, the owner of that zone
4) J's zone is split in half; the new node takes ownership of one half
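A minimal sketch of the join, with a global zones dict standing in for routed discovery and the split always made along x (a real CAN alternates split dimensions); all names are illustrative.

```python
import random

# zones[name] = (x_lo, y_lo, x_hi, y_hi); a global dict stands in for routed discovery.
zones = {"J": (0.0, 0.0, 1.0, 1.0)}    # initially node J owns the whole space

def find_owner(p, q):
    """Step 3 stand-in: in a real CAN this is a routed lookup, not a global scan."""
    return next(n for n, (x0, y0, x1, y1) in zones.items() if x0 <= p < x1 and y0 <= q < y1)

def join(new_name):
    p, q = random.random(), random.random()   # step 2: pick a random point (p, q)
    j = find_owner(p, q)                      # step 3: route to (p, q), discover J
    x0, y0, x1, y1 = zones[j]                 # step 4: split J's zone in half
    mid = (x0 + x1) / 2                       # always along x here, for brevity
    zones[j] = (x0, y0, mid, y1)              # J keeps one half...
    zones[new_name] = (mid, y0, x1, y1)       # ...the new node owns the other

join("new")
print(zones)
```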
CAN – Node Failure
Need to repair the space:
- Recover the database: soft-state updates; use replication and rebuild the database from replicas
- Repair routing: takeover algorithm
CAN – Takeover Algorithm
- Simple failures: know your neighbor's neighbors; when a node fails, one of its neighbors takes over its zone
- More complex failure modes: simultaneous failure of multiple adjacent nodes; scoped flooding to discover neighbors; hopefully a rare event
CAN – Evaluation
Scalability:
- Per node, the number of neighbors is 2d
- The average routing path is (d/4) · n^(1/d) hops
- The network can scale without increasing per-node state
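Plugging example numbers into those two expressions (a back-of-the-envelope check, not from the slides):

```python
def can_costs(n, d):
    """Per-node state and average path length from the expressions above."""
    return 2 * d, (d / 4) * n ** (1 / d)

for d in (2, 4, 8):
    neighbors, hops = can_costs(n=1_000_000, d=d)
    print(f"d={d}: {neighbors} neighbors, ~{hops:.0f} hops")
# d=2: 4 neighbors, ~500 hops;  d=8: 16 neighbors, ~11 hops. State never depends on n.
```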
CAN – Design Improvements
- Increase dimensions and realities (reduce the path length)
- Heuristics (reduce the per-CAN-hop latency): RTT-weighted routing; multiple nodes per zone (peer nodes); deterministically replicate entries
CAN – Weaknesses
- Impossible to perform a fuzzy search
- Susceptible to malicious activity
- Must maintain coherence of all the indexed data (network overhead, efficient distribution)
- Still relatively high routing latency
- Poor performance without the design improvements
Chord
Based on consistent hashing for the key-to-node mapping:
- Standard hashing, e.g. x → ax + b (mod p), provides good balance across bins
- Consistent hashing: a small change in the bucket set does not induce a total remapping of items to buckets (compare the sketch below)
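An illustrative comparison of the two behaviours: with mod-N hashing, adding one bucket remaps most keys, while on a hash ring only the keys near the new bucket move. The key and node names, counts, and hash choice are arbitrary.

```python
import hashlib
from bisect import bisect_left

def h(s, m=2**32):
    return int(hashlib.sha1(s.encode()).hexdigest(), 16) % m

def mod_assign(keys, buckets):
    """Standard hashing: bucket index = h(key) mod number-of-buckets."""
    return {k: buckets[h(k) % len(buckets)] for k in keys}

def ring_assign(keys, buckets):
    """Consistent hashing: a key goes to the first bucket clockwise on the ring."""
    ring = sorted((h(b), b) for b in buckets)
    points = [p for p, _ in ring]
    return {k: ring[bisect_left(points, h(k)) % len(ring)][1] for k in keys}

keys = [f"key{i}" for i in range(1000)]
old, new = [f"node{i}" for i in range(10)], [f"node{i}" for i in range(11)]

for name, assign in (("mod", mod_assign), ("consistent", ring_assign)):
    before, after = assign(keys, old), assign(keys, new)
    moved = sum(before[k] != after[k] for k in keys)
    print(f"{name}: {moved}/1000 keys moved after adding one node")
```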
Chord IDs
- m-bit identifier space for both keys and nodes
- Key identifier = SHA-1(key); e.g. Key="LetItBe" → ID=60
- Node identifier = SHA-1(IP address); e.g. some IP address → ID=123
- Both are uniformly distributed
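One plausible way to derive such identifiers (a sketch: the value of m and the example address are assumptions, so the printed IDs will not match the slide's 60 and 123):

```python
import hashlib

M = 7   # identifier bits; the slides' examples use a 7-bit ring

def chord_id(name: str) -> int:
    """Truncate SHA-1 of a string to an m-bit Chord identifier (illustrative)."""
    return int(hashlib.sha1(name.encode()).hexdigest(), 16) % (2 ** M)

print(chord_id("LetItBe"))          # a key identifier
print(chord_id("192.0.2.7:4000"))   # a node identifier from a made-up IP:port
```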
Chord – Consistent Hashing
[Figure: a circular 7-bit ID space holding nodes N32, N90, N123 and keys K5, K20, K60 (Key="LetItBe"), K101.]
A key is stored at its successor: the node with the next-higher ID.
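A sketch of the successor rule applied to the ring in the figure; the helper function is mine, while the node and key IDs come from the figure.

```python
from bisect import bisect_left

def successor(node_ids, key_id):
    """The node that stores key_id: the first node with ID >= key_id, wrapping around."""
    ids = sorted(node_ids)
    return ids[bisect_left(ids, key_id) % len(ids)]

nodes = [32, 90, 123]                 # N32, N90, N123 from the figure
for k in (5, 20, 60, 101):            # K5, K20, K60, K101
    print(f"K{k} -> N{successor(nodes, k)}")
# K5 -> N32, K20 -> N32, K60 -> N90, K101 -> N123
```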
Chord – Basic Lookup
Every node knows its successor in the ring.
[Figure: N10 asks "Where is 'LetItBe'?"; Hash("LetItBe") = K60; the query follows successor pointers N10 → N32 → N55 → N90, and the answer "N90 has K60" is returned.]
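A sketch of that walk over the figure's ring, assuming a helper that tests circular intervals; the hop count it reports is linear in the number of nodes, which motivates the finger tables on the next slides.

```python
RING = 2 ** 7   # 7-bit identifier space, as in the figure

def between(x, a, b):
    """True if x lies in the circular half-open interval (a, b]."""
    return 0 < (x - a) % RING <= (b - a) % RING

def basic_lookup(start, key_id, succ):
    """Follow successor pointers around the ring: O(N) hops in the worst case."""
    current, hops = start, 0
    while not between(key_id, current, succ[current]):
        current, hops = succ[current], hops + 1
    return succ[current], hops + 1

# Ring from the figure: N10 -> N32 -> N55 -> N90 -> N123 -> (back to) N10.
succ = {10: 32, 32: 55, 55: 90, 90: 123, 123: 10}
print(basic_lookup(10, 60, succ))   # K60 is stored at N90
```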
Chord – "Finger Tables"
- Every node knows m other nodes in the ring
- The distances to them increase exponentially
- Finger i points to the successor of n + 2^i
[Figure: N80's fingers reaching N96, N112, and N16.]
Chord – "Finger Tables": example
[Figure: a ring with nodes N32, N40, N52, N60, N70, N79, N80, N85, N102, N113. N32's finger table (i = 0…6) is N40, N40, N40, N40, N52, N70, N102.]
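A sketch that recomputes N32's finger table from the ring in the figure, assuming finger i = successor(n + 2^i) as defined on the previous slide.

```python
RING = 2 ** 7               # 7-bit identifier space
NODES = sorted([32, 40, 52, 60, 70, 79, 80, 85, 102, 113])   # the ring in the figure

def successor(ids, x):
    """First node with ID >= x, wrapping around the ring."""
    for n in ids:
        if n >= x:
            return n
    return ids[0]

def finger_table(n, m=7):
    """finger[i] = successor(n + 2^i), with distances growing exponentially."""
    return [successor(NODES, (n + 2 ** i) % RING) for i in range(m)]

print(finger_table(32))     # -> [40, 40, 40, 40, 52, 70, 102]
```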
Chord – Lookup Algorithm
[Figure: N32 resolves a lookup by jumping through its finger table (N40, N40, N40, N40, N52, N70, N102) across the ring of N32, N40, N52, N60, N70, N79, N80, N85, N102, N113.]
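A sketch of the finger-based lookup on the same ring; the interval test and the recursion are a simplification of the algorithm in the Chord paper, not a faithful copy.

```python
RING, M = 2 ** 7, 7
NODES = sorted([32, 40, 52, 60, 70, 79, 80, 85, 102, 113])

def between(x, a, b):                  # x in the circular half-open interval (a, b]
    return 0 < (x - a) % RING <= (b - a) % RING

def succ(x):                           # first node with ID >= x, wrapping around
    x %= RING
    return next((n for n in NODES if n >= x), NODES[0])

FINGERS = {n: [succ(n + 2 ** i) for i in range(M)] for n in NODES}

def lookup(n, key_id, hops=0):
    """Jump via the closest preceding finger; each jump roughly halves the distance."""
    if between(key_id, n, succ(n + 1)):          # key lies between n and its successor
        return succ(n + 1), hops + 1
    for f in reversed(FINGERS[n]):               # farthest finger first
        if between(f, n, key_id) and f != key_id:
            return lookup(f, key_id, hops + 1)
    return succ(n + 1), hops + 1                 # fallback (not reached in this example)

print(lookup(32, 99))   # key 99 is stored at N102, reached in a handful of hops
```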
Chord – Node Insertion
Three-step process:
1. Initialize all fingers of the new node
2. Update the fingers of existing nodes
3. Transfer keys from the successor to the new node
Less aggressive mechanism (lazy finger update):
- Initialize only the finger to the successor node
- Periodically verify the immediate successor and predecessor
- Periodically refresh finger table entries
(A sketch of the lazy mechanism follows.)
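A minimal sketch of the lazy mechanism: a new node initializes only its successor pointer, and periodic stabilization repairs successor/predecessor links; finger refresh is omitted. Node IDs and the ring are illustrative.

```python
class Node:
    """Minimal ring node for the lazy join: only successor/predecessor pointers."""
    def __init__(self, nid):
        self.id, self.succ, self.pred = nid, self, None

def between(x, a, b, R=2**7):
    return 0 < (x - a) % R <= (b - a) % R

def join(new, existing):
    """Lazy join: initialize only the successor pointer; fingers are fixed later."""
    cur = existing
    while not between(new.id, cur.id, cur.succ.id):
        cur = cur.succ                       # a linear walk stands in for a finger lookup
    new.succ = cur.succ

def stabilize(n):
    """Periodically verify the successor and notify it, repairing the ring."""
    x = n.succ.pred
    if x and between(x.id, n.id, n.succ.id):
        n.succ = x                           # someone joined between us and our successor
    n.succ.pred = n                          # notify: we may be its new predecessor

# Build a 3-node ring, lazily add N36, then let stabilization repair the pointers.
n20, n40, n80 = Node(20), Node(40), Node(80)
n20.succ, n40.succ, n80.succ = n40, n80, n20
n36 = Node(36)
join(n36, n20)
for n in (n20, n36, n40, n80) * 2:           # a couple of stabilization rounds
    stabilize(n)
print([f"{n.id}->{n.succ.id}" for n in (n20, n36, n40, n80)])  # 20->36, 36->40, ...
```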
Joining the Ring – Step 1: Initialize the new node's finger table
- Locate any node p already in the ring
- Ask p to look up the fingers of the new node
[Figure: N36 joins a ring containing N5, N20, N40, N60, N80, N99 and issues Lookup(37, 38, 40, …, 100, 164).]
Joining the Ring – Step 2: Update the fingers of existing nodes
- The new node calls an update function on existing nodes
- An existing node can recursively update the fingers of other nodes
[Figure: N36's arrival triggers finger updates at N5, N20, N40, N60, N80, N99.]
Joining the Ring – Step 3: Transfer keys from the successor node to the new node
[Figure: keys K30 and K38 are copied from N40 to the new node N36.]
Chord – Handling Failures
Use a successor list:
- Each node knows its r immediate successors
- After a failure, it will know the first live successor
- Correct successors guarantee correct lookups
- The guarantee holds with some probability; r can be chosen to make the probability of lookup failure arbitrarily small
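A sketch of the fallback, assuming r = 3 and an illustrative ring; a failed successor is simply skipped in favour of the next live entry.

```python
# Each node keeps r = 3 immediate successors; on failure, use the first live one.
succ_list = {10: [32, 55, 90], 32: [55, 90, 123], 55: [90, 123, 10],
             90: [123, 10, 32], 123: [10, 32, 55]}
alive = {10, 32, 90, 123}           # suppose N55 has failed

def next_hop(n):
    """The first live entry in n's successor list."""
    for s in succ_list[n]:
        if s in alive:
            return s
    raise RuntimeError("all r successors failed; choose r large enough to make this rare")

print(next_hop(32))                 # N55 is down, so N32 routes to N90 instead
```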
Chord – Evaluation
- Efficient: O(log N) messages per lookup, where N is the total number of servers
- Scalable: O(log N) state per node
- Robust: survives massive changes in membership
- Assuming no malicious participants
Chord – Weaknesses
- Not that simple (compared to CAN)
- Member joining is complicated: the aggressive mechanism requires too many messages and updates, and there is no analysis of convergence for the lazy finger mechanism
- Key management is mixed between layers: the upper layer does insertion and handles node failures, while Chord transfers keys when a node joins (there is no leave mechanism)
- The routing table grows with the number of members in the group
- Worst-case lookups can be slow
Summary
Both systems are:
- Fully distributed and scalable
- Efficient at lookup
- Robust
- Simple (?)
- Susceptible to malicious activity
How do these relate to OSD?
- Very similar if the data is "public"
- If the data is "private", only a few locations are available for storing it
- Does OSD help (make it easy) with peer-to-peer computing?
- Any more comments?