Distributed Lookup Systems

1 Distributed Lookup Systems
Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications
A Scalable Content-Addressable Network

2 Motivation
How to find data in a distributed system?
(Figure: nodes N1–N5 connected over the Internet; a publisher stores Key="LetItBe", Value=MP3 data, and a client issues Lookup("LetItBe") without knowing which node holds it.)

3 Applications
Peer-to-peer systems: Napster, Gnutella, Groove, FreeNet, …
Large-scale storage management systems: Publius, OceanStore, PAST, Farsite, CFS, …
Mirroring (web caching)
Any wide-area name resolution system

4 Outline
Types of solutions
Evaluation criteria
CAN and Chord: basic idea, insert/retrieve, join/leave, recovery from failures

5 Centralized Solution
Central server (Napster)
(Figure: nodes N1–N5 over the Internet; the publisher registers Key="LetItBe", Value=MP3 data with a central DB, and the client sends Lookup("LetItBe") to that DB.)

6 Distributed Solution (1)
Flooding (Gnutella, Morpheus, etc.)
Worst case: O(m) messages per lookup
(Figure: the client's Lookup("LetItBe") is flooded across nodes N1–N5 until it reaches the publisher.)

7 Distributed Solution (2)
Routed messages (Freenet, Tapestry, Chord, CAN, etc.)
(Figure: the client's Lookup("LetItBe") is routed hop by hop to the publisher holding Key="LetItBe", Value=MP3 data.)

8 Routing Challenges
Define a useful nearness metric
Keep the hop count small
Keep the routing table a reasonable size
Stay robust despite rapid changes in membership

9 Evaluation
Scalability: routing path length, per-node state
Latency
Load balancing
Robustness: routing fault tolerance, data availability

10 Content Addressable Network
Basic idea: an Internet-scale hash table
Interface: insert(key, value); value = retrieve(key)
The table is partitioned among many individual nodes

11 CAN - Solution
Virtual d-dimensional Cartesian coordinate space, dynamically partitioned into zones, each "owned" by one node
A key is mapped to a point P
(key, value) is stored at the node that owns the zone P belongs to
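A minimal Python sketch of this mapping, assuming a 2-d unit square and hash functions hx, hy built from salted SHA-1 (the slides only require two uniform hash functions):

import hashlib

def _h(key, salt):
    # hash a key into [0, 1); salted SHA-1 is an assumption, any uniform hash works
    digest = hashlib.sha1((salt + key).encode()).hexdigest()
    return int(digest, 16) / 16 ** 40

def key_to_point(key):
    # P = (hx(K), hy(K)) in the unit square [0, 1) x [0, 1)
    return (_h(key, "x"), _h(key, "y"))

class Zone:
    # an axis-aligned rectangle "owned" by one node
    def __init__(self, x0, x1, y0, y1):
        self.x0, self.x1, self.y0, self.y1 = x0, x1, y0, y1
    def contains(self, p):
        return self.x0 <= p[0] < self.x1 and self.y0 <= p[1] < self.y1

p = key_to_point("LetItBe")
print(p, Zone(0.0, 0.5, 0.0, 1.0).contains(p))  # (key, value) lives at the zone containing P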

12 CAN - Example
(Figure: a 2-d coordinate space partitioned into zones; node I owns one of them.)

13 CAN - Example
node I::insert(K,V)
(Node I initiates the insert.)

14 CAN - Example
node I::insert(K,V)
(1) a = hx(K)

15 CAN - Example
node I::insert(K,V)
(1) a = hx(K), b = hy(K)

16 CAN - Example
node I::insert(K,V)
(1) a = hx(K), b = hy(K)
(2) route(K,V) -> (a,b)

17 CAN - Example
node I::insert(K,V)
(1) a = hx(K), b = hy(K)
(2) route(K,V) -> (a,b)
(3) the node owning (a,b) stores (K,V)
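Tying the three steps together, a toy end-to-end insert into a four-zone space (the hash functions, zone layout, and node names are illustrative, not from the slides):

import hashlib

def h(key, salt):
    # uniform hash into [0, 1); SHA-1 is an assumption
    return int(hashlib.sha1((salt + key).encode()).hexdigest(), 16) / 16 ** 40

# four equal zones of the unit square, each owned by one (hypothetical) node
zones = {"A": (0.0, 0.5, 0.0, 0.5), "B": (0.5, 1.0, 0.0, 0.5),
         "C": (0.0, 0.5, 0.5, 1.0), "D": (0.5, 1.0, 0.5, 1.0)}
store = {owner: {} for owner in zones}

def owner_of(p):
    for owner, (x0, x1, y0, y1) in zones.items():
        if x0 <= p[0] < x1 and y0 <= p[1] < y1:
            return owner

def insert(K, V):
    a, b = h(K, "x"), h(K, "y")       # (1) a = hx(K), b = hy(K)
    owner = owner_of((a, b))          # (2) route (K,V) toward (a,b)
    store[owner][K] = V               # (3) the owner of (a,b) stores (K,V)
    return owner

print(insert("LetItBe", "MP3 data"))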

18 CAN - Routing
Each node maintains state only for its neighbors
A message carries the destination coordinates
Greedy forwarding: send to the neighbor whose coordinates are closest to the destination
(Figure: a message travels from the zone at (x,y) toward the zone containing (a,b).)
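A sketch of the greedy step, assuming each neighbor is summarized by its zone centre (the real check compares the destination against whole neighbor zones):

import math
from dataclasses import dataclass

@dataclass
class Neighbor:
    name: str
    center: tuple   # centre of the neighbour's zone, e.g. (0.25, 0.75)

def next_hop(my_zone_contains_dst, neighbors, dst):
    # pick the neighbour whose zone centre is closest to the destination point
    if my_zone_contains_dst:
        return None                      # this node owns the point: deliver locally
    return min(neighbors,
               key=lambda n: math.hypot(n.center[0] - dst[0],
                                        n.center[1] - dst[1]))

# example: route towards (0.8, 0.8) from a node with two neighbours
hop = next_hop(False,
               [Neighbor("A", (0.25, 0.75)), Neighbor("B", (0.75, 0.75))],
               (0.8, 0.8))
print(hop.name)   # -> "B"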

19 CAN - Node Insertion
1) The new node discovers some node "I" already in the CAN

20 CAN – Node Insertion
2) The new node picks a random point (p,q) in the space

21 CAN – Node Insertion
3) I routes to (p,q) and discovers node J, the owner of that zone

22 CAN - Node Insertion
4) J's zone is split in half; the new node owns one half
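A sketch of the split in step 4, assuming the zone is halved along its longer side (the CAN design cycles through dimensions; either choice illustrates the idea):

from dataclasses import dataclass

@dataclass
class Zone:
    x0: float
    x1: float
    y0: float
    y1: float

def split(z):
    # the old owner J keeps the first half, the joining node gets the second
    if (z.x1 - z.x0) >= (z.y1 - z.y0):        # halve along the longer side
        mid = (z.x0 + z.x1) / 2
        return Zone(z.x0, mid, z.y0, z.y1), Zone(mid, z.x1, z.y0, z.y1)
    mid = (z.y0 + z.y1) / 2
    return Zone(z.x0, z.x1, z.y0, mid), Zone(z.x0, z.x1, mid, z.y1)

j_half, new_half = split(Zone(0.0, 1.0, 0.0, 1.0))
print(j_half, new_half)   # J keeps [0, 0.5) x [0, 1), the new node owns [0.5, 1) x [0, 1)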

23 CAN – Node Failure
Need to repair the space
Recover the database: soft-state updates; use replication and rebuild the database from replicas
Repair routing: takeover algorithm

24 CAN – Takeover Algorithm
Simple failures: know your neighbor's neighbors; when a node fails, one of its neighbors takes over its zone
More complex failure modes: simultaneous failure of multiple adjacent nodes; scoped flooding to discover neighbors; hopefully a rare event

25 CAN - Evaluation
Scalability: per node, the number of neighbors is 2d; the average routing path is (d/4)·n^(1/d) hops
The network can scale without increasing per-node state
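A quick numeric check of these figures (the formulas are from the slide; the concrete n is arbitrary):

def can_state_and_path(n, d):
    # per-node neighbours = 2d; average path = (d/4) * n**(1/d) hops
    return 2 * d, (d / 4) * n ** (1 / d)

for d in (2, 4, 8):
    state, hops = can_state_and_path(1_000_000, d)
    print(f"d={d}: {state} neighbours, ~{hops:.1f} hops")

For a million nodes this prints roughly 500, 32, and 11 hops for d = 2, 4, and 8, which is the trade-off behind the "increase dimensions" improvement on the next slide.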

26 CAN – Design Improvements
Increase dimensions and realities (reduces the path length)
Heuristics (reduce the per-CAN-hop latency): RTT-weighted routing, multiple nodes per zone (peer nodes), deterministically replicated entries

27 CAN - Weaknesses
Impossible to perform a fuzzy search
Susceptible to malicious activity
Must maintain coherence of all the indexed data (network overhead, efficient distribution)
Still relatively high routing latency
Poor performance without the improvements above

28 Chord
Based on consistent hashing for the key-to-node mapping
Standard hashing, e.g. x -> ax + b (mod p), provides good balance across bins
Consistent hashing: a small change in the bucket set does not induce a total remapping of items to buckets
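A small experiment illustrating the difference, with SHA-1 standing in for both the standard and the consistent hash (an assumption; the slide does not fix the functions):

import hashlib
from bisect import bisect_right

def h(s, bits=32):
    return int(hashlib.sha1(s.encode()).hexdigest(), 16) % (2 ** bits)

keys = [f"key-{i}" for i in range(10_000)]

# standard hashing: bucket = hash(key) mod number_of_buckets
before = {k: h(k) % 10 for k in keys}
after = {k: h(k) % 11 for k in keys}          # add one bucket
moved_mod = sum(before[k] != after[k] for k in keys)

# consistent hashing: a key goes to the first node clockwise on the ring
def ring_owner(key, ring):
    i = bisect_right(ring, h(key))
    return ring[i % len(ring)]

ring10 = sorted(h(f"node-{i}") for i in range(10))
ring11 = sorted(ring10 + [h("node-10")])       # add one node
moved_ring = sum(ring_owner(k, ring10) != ring_owner(k, ring11) for k in keys)

print(f"mod-N remapped {moved_mod}/{len(keys)}, ring remapped {moved_ring}/{len(keys)}")

Typically the mod-N scheme remaps the vast majority of keys, while the ring remaps only about 1/11 of them.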

29 Chord IDs
m-bit identifier space for both keys and nodes
Key identifier = SHA-1(key), e.g. Key="LetItBe" -> ID=60
Node identifier = SHA-1(IP address), e.g. ID=123
Both are uniformly distributed
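A sketch of the ID derivation, truncating SHA-1 to m bits (m = 7 here to match the 7-bit ring on the next slide; the IDs 60 and 123 on this slide are illustrative values, and the IP below is hypothetical):

import hashlib

M = 7  # identifier bits

def chord_id(s):
    # SHA-1, reduced to the m-bit identifier space
    return int(hashlib.sha1(s.encode()).hexdigest(), 16) % (2 ** M)

print(chord_id("LetItBe"))    # key identifier
print(chord_id("10.0.0.1"))   # node identifier from an IP address (hypothetical IP)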

30 Chord – Consistent Hashing
Circular 7-bit ID space
A key is stored at its successor: the node with the next-higher ID
(Figure: keys K5, K20, K60 (Key="LetItBe"), K101 and nodes N32, N90, N123 on the ring; K60 is stored at N90.)
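A sketch of the successor rule on the 7-bit ring, using the node IDs from the figure:

from bisect import bisect_left

def successor(key_id, node_ids):
    # the key is stored at the first node whose ID is >= the key ID, wrapping around
    ring = sorted(node_ids)
    i = bisect_left(ring, key_id)
    return ring[i % len(ring)]

nodes = [32, 90, 123]           # N32, N90, N123
print(successor(60, nodes))     # K60 ("LetItBe") -> 90
print(successor(101, nodes))    # K101 -> 123
print(successor(20, nodes))     # K20 -> 32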

31 Chord – Basic Lookup
Every node knows its successor in the ring
(Figure: N10 asks "Where is 'LetItBe'?"; Hash("LetItBe") = K60; the query walks the ring N10 -> N32 -> N55 -> N90, and the answer "N90 has K60" is returned.)
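A sketch of this successor-only lookup, which needs O(N) hops in the worst case (ring node IDs read off the figure):

def in_interval(x, a, b):
    # True if x lies in the circular interval (a, b]
    return (a < x <= b) if a < b else (x > a or x <= b)

def basic_lookup(start, key_id, succ):
    # succ maps a node ID to its successor's ID
    n, hops = start, 0
    while not in_interval(key_id, n, succ[n]):
        n, hops = succ[n], hops + 1
    return succ[n], hops            # the successor of key_id holds the key

ring = [10, 32, 55, 90, 123]
succ = {n: ring[(i + 1) % len(ring)] for i, n in enumerate(ring)}
print(basic_lookup(10, 60, succ))   # -> (90, 2): "N90 has K60" after two forwarding hops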

32 Chord – "Finger Tables"
Every node knows m other nodes in the ring
Distances to them increase exponentially
Finger i points to the successor of n + 2^i
(Figure: fingers of N80 at exponentially growing distances, reaching nodes such as N96, N112, and N16.)

33 Chord – "Finger Tables"
(Figure: N32's finger table on a ring containing N32, N40, N52, N60, N70, N79, N80, N85, N102, N113; its seven fingers point to N40, N40, N40, N40, N52, N70, N102.)
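A sketch reproducing N32's table, assuming the ring consists of the node IDs visible on the slide and that finger i is the successor of n + 2^i for i = 0 … 6:

from bisect import bisect_left

M = 7
RING = sorted([32, 40, 52, 60, 70, 79, 80, 85, 102, 113])

def successor(x):
    i = bisect_left(RING, x % (2 ** M))
    return RING[i % len(RING)]

def finger_table(n):
    return [successor(n + 2 ** i) for i in range(M)]

print(finger_table(32))   # -> [40, 40, 40, 40, 52, 70, 102]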

34 Chord – Lookup Algorithm
(Figure: a lookup starting at N32 uses its finger table (N40, N40, N40, N40, N52, N70, N102) to jump across the ring toward the target key's successor.)
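A sketch of the lookup itself: jump to the closest preceding finger of the target until the target falls between a node and its successor (same ring and finger rule as above; recursion stands in for message forwarding):

from bisect import bisect_left

M = 7
RING = sorted([32, 40, 52, 60, 70, 79, 80, 85, 102, 113])

def successor(x):
    i = bisect_left(RING, x % (2 ** M))
    return RING[i % len(RING)]

def in_interval(x, a, b):
    # circular open interval (a, b)
    return (a < x < b) if a < b else (x > a or x < b)

def fingers(n):
    return [successor(n + 2 ** i) for i in range(M)]

def lookup(n, key_id, hops=0):
    succ_n = fingers(n)[0]                        # finger 0 is the successor
    if key_id == succ_n or in_interval(key_id, n, succ_n):
        return succ_n, hops                       # the successor holds the key
    for f in reversed(fingers(n)):                # closest preceding finger
        if in_interval(f, n, key_id):
            return lookup(f, key_id, hops + 1)
    return lookup(succ_n, key_id, hops + 1)

print(lookup(32, 99))   # -> (102, 3): the key with ID 99 is held by N102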

35 Chord - Node Insertion
Three-step process:
1. Initialize all fingers of the new node
2. Update the fingers of existing nodes
3. Transfer keys from the successor to the new node
Less aggressive mechanism (lazy finger update):
Initialize only the finger to the successor node
Periodically verify the immediate successor and predecessor
Periodically refresh finger table entries

36 Joining the Ring – Step 1
Initialize the new node's finger table:
Locate any node p already in the ring
Ask node p to look up the fingers of the new node
(Figure: N36 joins a ring with N5, N20, N40, N60, N80, N99 and issues Lookup(37, 38, 40, …, 100, 164) via p.)

37 Joining the Ring – Step 2
Update the fingers of existing nodes:
The new node calls an update function on existing nodes
An existing node can recursively update the fingers of other nodes
(Figure: the ring N5, N20, N36, N40, N60, N80, N99 with finger updates triggered by N36's arrival.)

38 Joining the Ring – Step 3
Transfer keys from the successor node to the new node
(Figure: N40 holds K30 and K38; the keys that now map to N36, here K30, are copied from N40 to N36.)
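A compressed sketch of the three steps, using a global view of the ring for brevity (a real node would issue these as lookups through node p rather than reading the ring directly):

from bisect import bisect_left

M = 7

def successor(ring, x):
    i = bisect_left(ring, x % (2 ** M))
    return ring[i % len(ring)]

def join(ring, keys, new_id):
    ring = sorted(ring + [new_id])
    # Step 1: initialise the new node's fingers (normally via Lookup(new_id + 2**i))
    fingers = [successor(ring, new_id + 2 ** i) for i in range(M)]
    # Step 2: existing nodes update fingers that should now point at new_id
    #         (implicit here, since fingers are recomputed from the updated ring)
    # Step 3: keys whose successor is now the new node move from the old successor
    transferred = [k for k in keys if successor(ring, k) == new_id]
    return fingers, transferred

print(join([5, 20, 40, 60, 80, 99], keys=[30, 38], new_id=36))
# K30 moves from N40 to N36; K38 still belongs to N40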

39 Chord – Handling Failures
Use a successor list: each node knows its r immediate successors
After a failure, a node will know its first live successor
Correct successors guarantee correct lookups
The guarantee holds only with some probability, but r can be chosen to make the probability of lookup failure arbitrarily small
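A minimal sketch of the fallback, assuming a set of live node IDs is already known (failure detection itself is out of scope here):

def live_successor(successor_list, alive):
    # fall back to the first live entry in the r-entry successor list
    for s in successor_list:
        if s in alive:
            return s
    return None          # all r successors failed (made unlikely by choosing r large enough)

alive = {80, 99, 5}
print(live_successor([60, 80, 99], alive))   # 60 has failed -> fall back to 80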

40 Chord - Evaluation
Efficient: O(log N) messages per lookup, where N is the total number of servers
Scalable: O(log N) state per node
Robust: survives massive changes in membership, assuming no malicious participants

41 Chord - Weakness
Not that simple (compared to CAN)
Member joining is complicated: the aggressive mechanism requires many messages and updates, and there is no analysis of convergence for the lazy finger mechanism
Key management is mixed between layers: the upper layer performs insertion and handles node failures, while Chord transfers keys when a node joins (there is no leave mechanism!)
The routing table grows with the number of members in the group
Worst-case lookup can be slow

42 Summary
Both systems are fully distributed and scalable
Efficient lookup
Robust
Simple (?)
Susceptible to malicious activity

43 How are these related to OSD?
Very similar if data is "public"
If data is "private", only a few locations are available for storing data
Does OSD help (make it easy) with peer-to-peer computing?
Any more comments?

