1 Peer-to-Peer Networks and Distributed Hash Tables 2006

2 Peer-to-peer networking
 File sharing:
- Files are stored at the end-user machines (peers) rather than at a central server (client/server); files are transferred directly between peers.
 Leverage:
- P2P is a way to leverage vast amounts of computing power, storage, and connectivity from personal computers (PCs) distributed around the world.
Q: What are the new technical challenges?
Q: What new services/applications are enabled?
Q: Is it just "networking at the application level"? Everything old is new again?

3 Napster
 Napster: free music over the Internet
 Key idea: share the content, storage, and bandwidth of individual (home) users
 Model:
- Each user stores a subset of files
- Each user has access to (can download) files from all users in the system
- Application-level, client-server protocol (index server) over point-to-point TCP
 How it works, in four steps (a sketch follows below):
1. Connect to the Napster index server
2. Upload your list of files (push) to the server
3. Give the server keywords to search the full list with
4. Select the "best" of the correct answers (e.g., by pinging candidates)
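A minimal sketch of the centralized-index idea, using a toy in-process IndexServer class (hypothetical names, not Napster's real protocol); the actual file transfer then happens directly between peers:

    # Toy Napster-style central index (hypothetical API; the real protocol differs).
    class IndexServer:
        def __init__(self):
            self.index = {}                      # file name -> set of peers that hold it

        def register(self, peer, filenames):
            for name in filenames:               # step 2: a peer pushes its file list
                self.index.setdefault(name, set()).add(peer)

        def search(self, keyword):
            # step 3: return every indexed file whose name contains the keyword
            return {name: peers for name, peers in self.index.items() if keyword in name}

    server = IndexServer()
    server.register("m5", ["E.mp3", "A.mp3"])
    print(server.search("E"))                    # step 4: the client picks the "best" peer and downloads directly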

4 Napster: Example
[Figure: machines m1-m6 store files A-F; the index server maps each file to its machine; a query for E returns m5, and E is then downloaded directly from m5.]

5 Napster characteristics
 Advantages:
- Simplicity; easy to implement sophisticated search engines on top of the index system
 Disadvantages:
- Centralized index server: a single logical point of failure (can load-balance among servers using DNS rotation), with potential for congestion
- Napster "in control" (freedom is an illusion)
- No security: passwords in plain text, no authentication, no anonymity

6 Main Challenge
 Find where a particular file is stored
 Scale: up to hundreds of thousands or millions of machines
- 7/2001 simultaneous online users: Napster 160K, Gnutella 40K, Morpheus 300K
 Dynamicity: machines can come and go at any time

7 Gnutella
 Peer-to-peer networking: peer applications
 Focus: a decentralized method of searching for files
 How to find a file: flood the request (see the sketch below)
- Send the request to all neighbors
- Neighbors recursively forward the request
- Eventually a machine that has the file receives the request and sends back the answer
 Advantages:
- Totally decentralized, highly robust
 Disadvantages:
- Not scalable; the entire network can be swamped with requests (to alleviate this problem, each request carries a TTL)
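A minimal sketch of TTL-limited flooding, assuming a toy in-memory Node class (names and topology are illustrative, not Gnutella's wire protocol):

    # Gnutella-style flooding with a TTL (simplified model; real Gnutella also carries message IDs, hop counts, etc.).
    class Node:
        def __init__(self, name, files=()):
            self.name = name
            self.files = set(files)
            self.neighbors = []

    def flood_query(node, file_id, ttl, visited=None):
        """Flood a query from `node`; return the name of a peer holding file_id, or None."""
        visited = visited if visited is not None else set()
        if ttl == 0 or node.name in visited:
            return None
        visited.add(node.name)
        if file_id in node.files:                 # a peer that has the file answers
            return node.name
        for neighbor in node.neighbors:           # otherwise forward to all neighbors with TTL - 1
            hit = flood_query(neighbor, file_id, ttl - 1, visited)
            if hit:
                return hit
        return None

    # Example: m1 -- m2, m1 -- m3, m3 -- m5; m5 holds file "E"
    m1, m2, m3, m5 = Node("m1"), Node("m2"), Node("m3"), Node("m5", files={"E"})
    m1.neighbors = [m2, m3]; m3.neighbors = [m5]
    print(flood_query(m1, "E", ttl=4))            # -> "m5"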

8 Gnutella: Example
 Assume: m1's neighbors are m2 and m3; m3's neighbors are m4 and m5; and so on
[Figure: the query for E floods out from m1 through its neighbors until it reaches m5, which holds E and returns it.]

9 Gnutella
 What we care about:
- How much traffic does one query generate?
- How many hosts can it support at once?
- What is the latency associated with querying?
- Is there a bottleneck?
 Late 2000: only 10% of downloads succeeded
 2001: more than 25% of downloads succeeded (is this success or failure?)

10 BitTorrent
 BitTorrent (BT) is a newer generation of P2P; it can make downloads much faster
- The file to be distributed is split into pieces and an SHA-1 hash is calculated for each piece (see the sketch below)
 Swarming: parallel downloads among a mesh of cooperating peers
- Scalable: capacity increases as the number of peers/downloaders increases
- Efficient: it utilizes a large amount of the available network bandwidth
 Tracker:
- A central server keeping a list of all peers participating in the swarm (handles peer discovery)
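A short sketch of the piece-hashing step; the 256 KiB piece length is an assumed example value, not something the slide specifies:

    # Split a file into fixed-size pieces and hash each piece with SHA-1, as done when
    # building a torrent's piece list.
    import hashlib

    def piece_hashes(path, piece_length=256 * 1024):
        """Return a list of SHA-1 digests, one per piece of the file."""
        hashes = []
        with open(path, "rb") as f:
            while True:
                piece = f.read(piece_length)
                if not piece:
                    break
                hashes.append(hashlib.sha1(piece).hexdigest())
        return hashes

    # A downloader recomputes the SHA-1 of each piece it receives and compares it with the
    # published hash, so corrupted or forged pieces can be detected and fetched again.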

11 BitTorrent: a picture
[Figure: a tracker coordinates a swarm of peers, each acting as both uploader and downloader.]

12 BitTorrent: a picture (continued)
[Figure: swarm diagram continued; no further text on this slide.]

13 Freenet
 Additional goals beyond file location:
- Provide publisher anonymity and security
- Resistance to attacks: a third party shouldn't be able to deny access to a particular file (data item, object), even if it compromises a large fraction of machines
 Architecture:
- Each file is identified by a unique identifier
- Each machine stores a set of files and maintains a "routing table" used to route individual requests

14 Data Structure
 Each node maintains a routing table (a stack of entries), where each entry holds:
- id: a file identifier
- next_hop: another node that stores the file id
- file: the file identified by id, if it is stored on the local node
 Forwarding (see the sketch below):
- Each message contains the file id it refers to
- If the file id is stored locally, then stop
- If not, search for the "closest" id in the table and forward the message to the corresponding next_hop
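A minimal sketch of this forwarding rule, assuming integer ids and absolute difference as the notion of "closest" (both are illustrative simplifications; Freenet's real keys are hashes):

    # Freenet-style forwarding over a routing table of (id, next_hop, file) entries.
    class FreenetNode:
        def __init__(self, name):
            self.name = name
            self.table = []                      # entries: {"id": int, "next_hop": FreenetNode, "file": data or None}

        def lookup(self, file_id, ttl):
            for entry in self.table:             # stop if the requested file is stored locally
                if entry["id"] == file_id and entry["file"] is not None:
                    return entry["file"]
            if ttl <= 1 or not self.table:       # give up when the TTL is exhausted or there is nowhere to go
                return None
            # otherwise forward to the next_hop whose id is "closest" to the requested id
            closest = min(self.table, key=lambda e: abs(e["id"] - file_id))
            return closest["next_hop"].lookup(file_id, ttl - 1)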

15 Query
 API: file = query(id)
 Upon receiving a query for document id:
- Check whether the queried file is stored locally
- If yes, return it
- If not, forward the query message
 Notes:
- Each query carries a TTL that is decremented each time the query message is forwarded; to obscure the distance to the originator, the TTL can be initialized to a random value within some bounds, and when TTL = 1 the query is still forwarded with a finite probability
- Each node maintains state for all outstanding queries that have traversed it, which helps to avoid cycles
- When the file is returned, it is cached along the reverse path

16 Query Example
 Note: the figure does not show file caching on the reverse path
[Figure: routing tables of n1-n5; query(10) issued at n1 is forwarded from node to node according to the tables (steps 1-5) until a node holding f10 is reached.]

17 Insert
 API: insert(id, file)
 Two steps:
- Search for the file to be inserted
- If it is not found, insert the file
 Searching: like a query, but nodes maintain state after a collision is detected and the reply is sent back to the originator
 Insertion (see the sketch below):
- Follow the forward path; insert the file at all nodes along the path
- A node probabilistically replaces the originator with itself, to obscure the true originator
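A minimal sketch of the insertion walk, with a made-up 50% replacement probability and the same toy table layout as in the earlier sketch (both assumptions for illustration):

    # Freenet-style insertion along the forward path, with probabilistic replacement of the
    # apparent originator at each hop.
    import random

    class Node:
        def __init__(self, name):
            self.name = name
            self.table = []                       # entries: {"id", "next_hop", "file"}

    def insert_along_path(path, file_id, data, originator):
        """Store (file_id, data) at every node along the forward path used by the failed search."""
        apparent_origin = originator
        for node in path:
            node.table.append({"id": file_id, "next_hop": apparent_origin, "file": data})
            if random.random() < 0.5:             # a node may claim to be the originator itself,
                apparent_origin = node            # obscuring who really inserted the file

    n1, n2, n3, n4 = (Node(x) for x in ("n1", "n2", "n3", "n4"))
    insert_along_path([n1, n2, n3, n4], 10, "f10", originator=n1)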

18 Insert Example
 Assume the query returned failure along the "blue" path; now insert f10
[Figure: the routing tables of n1-n5 before the insert; insert(10, f10) will follow the path taken by the failed query.]

19 Insert Example (continued)
[Figure: n1 adds the entry (10, n1, f10) to its table and forwards insert(10, f10) toward n2 with originator = n1.]

20 Insert Example (continued)
 n2 replaces the originator (n1) with itself
[Figure: n2 and n3 add the entry (10, n2, f10) to their tables and the insert continues with originator = n2.]

21 Insert Example (continued)
 The originator may be replaced again along the path
[Figure: final routing tables after the insert; n4 and n5 store the entry (10, n4, f10).]

22 Freenet Properties
 Newly queried/inserted files are stored on nodes storing similar ids
 New nodes can announce themselves by inserting files
 Attempts to supplant or discover existing files will just spread the files

23 Freenet Summary
 Advantages:
- Provides publisher anonymity
- Totally decentralized architecture, so robust and scalable
- Resistant against malicious file deletion
 Disadvantages:
- Does not always guarantee that a file is found, even if the file is in the network

24 Solutions to the Location Problem
 Goal: make sure that an identified item (file) is always found
- Indexing scheme: used to map file names to their location in the system
- Requires a scalable indexing mechanism
 Abstraction: a distributed hash table (DHT) data structure (see the interface sketch below)
- insert(id, item)
- item = query(id)
- Note: the item can be anything: a data object, document, file, pointer to a file, ...
 Proposals: CAN, Chord, Kademlia, Pastry, Viceroy, Tapestry, etc.
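A sketch of just this interface, written as a single-process stand-in; a real DHT such as CAN or Chord partitions the key space across many nodes:

    # The DHT abstraction: a hash-table interface whose contents would be spread over many nodes.
    class DistributedHashTable:
        def __init__(self):
            self.store = {}                  # in a real DHT this dictionary is partitioned across nodes

        def insert(self, key, item):
            self.store[key] = item           # item: data object, document, file, or a pointer to a file

        def query(self, key):
            return self.store.get(key)       # None if no node holds the key

    dht = DistributedHashTable()
    dht.insert("song_E", "stored-at:m5")
    print(dht.query("song_E"))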

25 Internet-scale hash tables
 Hash tables: an essential building block in software systems
 Internet-scale distributed hash tables: equally valuable to large-scale distributed systems?
- Peer-to-peer systems: Napster, Gnutella, Groove, FreeNet, MojoNation, ...
- Large-scale storage management systems: Publius, OceanStore, PAST, Farsite, CFS, ...
- Mirroring on the Web
 Content-Addressable Network (CAN):
- Scalable
- Operationally simple
- Good performance

26 Content-Addressable Network (CAN): basic idea
 Interface:
- insert(key, value): key (id), value (item)
- value = retrieve(key)
[Figure: insert(K1, V1) into the distributed (key, value) store.]

27 CAN: basic idea (continued)
[Figure: retrieve(K1) finds the node storing (K1, V1) in the distributed (key, value) store.]

28 CAN: basic idea
 Associate with each node and each item a unique id in a d-dimensional Cartesian space
- A key (id) maps to a point in the space; each node owns a zone of the space
 Goals:
- Scales to hundreds of thousands of nodes
- Handles rapid arrival and failure of nodes
 Properties:
- Routing table size O(d)
- Guarantees that a file is found in at most d * n^(1/d) steps, where n is the total number of nodes

29 CAN: solution
 Virtual d-dimensional Cartesian coordinate space
 The entire space is partitioned amongst all the nodes
- Every node "owns" a zone in the overall space
 Abstraction:
- Can store data at "points" in the space
- Can route from one "point" to another
 Point = the node that owns the enclosing zone

30 CAN Example: Two-Dimensional Space
 The space is divided between the nodes
 All nodes together cover the entire space
 Each node covers either a square or a rectangular area with side ratio 1:2 or 2:1
 Example:
- Node n1:(1, 2) is the first node that joins, so it covers the entire space
[Figure: an 8x8 coordinate space (axes 0-7) owned entirely by n1.]

31 CAN Example: Two-Dimensional Space
 Node n2:(4, 2) joins, so the space is divided between n1 and n2
[Figure: the space split in half between n1 and n2.]

32 CAN Example: Two-Dimensional Space
 Node n3:(3, 5) joins, so the space is divided between n1 and n3
[Figure: n1's zone is split again; n3 takes over the half containing (3, 5).]

33 CAN Example: Two-Dimensional Space
 Nodes n4:(5, 5) and n5:(6, 6) join
[Figure: further splits; the space is now partitioned into five zones owned by n1-n5.]

34 Storing a pair: Node I::insert(K, V)
(1) a = h_x(K), b = h_y(K)
(2) route (K, V) toward the point (a, b)
(3) the node that owns (a, b) stores (K, V)
 Simple example: to store a pair (K1, V1), key K1 is mapped onto a point P in the coordinate space using a uniform hash function
 The corresponding (key, value) pair is then stored at the node that owns the zone within which the point P lies (see the sketch below)
 Data stored in the CAN is addressed by name (i.e. key), not location (i.e. IP address)
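A sketch of this insert path under simple assumptions: an 8x8 space, SHA-1-derived per-axis hashes, and a hard-coded two-zone partition (all illustrative, not the paper's exact construction):

    # Hash the key to a point (a, b), then store the pair at the node whose zone contains that point.
    import hashlib

    SPACE = 8                                            # 8x8 coordinate space, as in the example slides

    def h(key, axis):
        """Uniform-ish hash of `key` onto one coordinate axis."""
        digest = hashlib.sha1(f"{axis}:{key}".encode()).digest()
        return int.from_bytes(digest[:4], "big") % SPACE

    # zones: node -> (x_min, x_max, y_min, y_max); each node owns the box [x_min, x_max) x [y_min, y_max)
    zones = {"n1": (0, 4, 0, 8), "n2": (4, 8, 0, 8)}
    storage = {"n1": {}, "n2": {}}

    def owner(a, b):
        for node, (x0, x1, y0, y1) in zones.items():
            if x0 <= a < x1 and y0 <= b < y1:
                return node

    def insert(key, value):
        a, b = h(key, "x"), h(key, "y")                  # (1) map the key to a point
        storage[owner(a, b)][key] = value                # (2)+(3) "route" to the owner and store there

    insert("K1", "V1")
    print(storage)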

35 CAN Example: Two-Dimensional Space
 Each item is stored by the node that owns the point the item maps to in the space
 Nodes: n1:(1, 2); n2:(4, 2); n3:(3, 5); n4:(5, 5); n5:(6, 6)
 Items: f1:(2, 3); f2:(5, 0); f3:(2, 1); f4:(7, 5)
[Figure: each item placed inside the zone of the node that owns it.]

36 Retrieving key K1: node J::retrieve(K)
(1) a = h_x(K), b = h_y(K)
(2) route "retrieve(K)" to the point (a, b)
 Any node can apply the same deterministic hash function to map K1 onto point P and then retrieve the corresponding value from the point P
 If the point P is not owned by the requesting node, the request must be routed through the CAN infrastructure until it reaches the node in whose zone P lies

37 CAN: Query/Routing Example
 Each node knows its neighbors in the d-dimensional space
 Forward the query to the neighbor that is closest to the query id (a greedy-routing sketch follows below)
 Example: assume n1 queries f4
 A node only maintains state for its immediate neighboring nodes
 Can route around some failures
[Figure: the query for f4 travels greedily from n1 through neighboring zones until it reaches the node that owns f4's point.]
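A sketch of the greedy step in two dimensions, choosing at each hop the neighbor whose zone center is nearest to the target point; the zone representation and distance measure are assumptions for illustration:

    # Greedy CAN routing: repeatedly forward to the neighbor closest to the target point.
    import math

    class CanNode:
        def __init__(self, name, zone):
            self.name = name
            self.zone = zone                     # (x_min, x_max, y_min, y_max)
            self.neighbors = []

        def contains(self, point):
            x, y = point
            x0, x1, y0, y1 = self.zone
            return x0 <= x < x1 and y0 <= y < y1

        def center(self):
            x0, x1, y0, y1 = self.zone
            return ((x0 + x1) / 2, (y0 + y1) / 2)

    def route(node, target, max_hops=32):
        """Return the list of nodes visited on the way to the owner of `target`."""
        path = [node]
        while not node.contains(target) and len(path) < max_hops:
            node = min(node.neighbors, key=lambda nb: math.dist(nb.center(), target))
            path.append(node)
        return path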

38 CAN: node insertion
 Inserting a new node affects only a single existing node and its immediate neighbors (a zone-split sketch follows below)
1) The new node discovers some node "I" already in the CAN
2) It picks a random point (p, q) in the space
3) I routes to (p, q) and discovers node J, the owner of that point
4) J's zone is split in half; the new node owns one half
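A sketch of the split in step 4, using a cut along the zone's longer side as an illustrative rule (the CAN paper alternates split dimensions, so treat this as an assumption):

    # Split an owner's zone in half and hand one half to the joining node.
    def split_zone(owner_zone, new_point):
        """Return (owner_half, new_half) after splitting `owner_zone` at its midpoint."""
        x0, x1, y0, y1 = owner_zone
        if (x1 - x0) >= (y1 - y0):               # split along x when the zone is at least as wide as it is tall
            mid = (x0 + x1) / 2
            left, right = (x0, mid, y0, y1), (mid, x1, y0, y1)
            return (left, right) if new_point[0] >= mid else (right, left)
        mid = (y0 + y1) / 2
        bottom, top = (x0, x1, y0, mid), (x0, x1, mid, y1)
        return (bottom, top) if new_point[1] >= mid else (top, bottom)

    # Example: node J owns the whole 8x8 space; a new node joins at point (5, 2)
    print(split_zone((0, 8, 0, 8), (5, 2)))      # J keeps the left half; the new node gets the right half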

39 CAN: Node Failure Recovery
 Simple failures:
- Know your neighbor's neighbors
- When a node fails, one of its neighbors takes over its zone
 More complex failure modes:
- Simultaneous failure of multiple adjacent nodes
- Scoped flooding to discover neighbors
- Hopefully a rare event
 Only the failed node's immediate neighbors are required for recovery

40 Evaluation
 Scalability
 Low latency
 Load balancing
 Robustness

41 CAN: scalability
 For a uniformly partitioned space with n nodes and d dimensions:
- Per node, the number of neighbors is 2d
- The average routing path is (d/4) * n^(1/d) hops
- Simulations show that these results hold in practice
 Can scale the network without increasing per-node state
 Chord/Plaxton/Tapestry/Buzz: log(n) neighbors with log(n) hops
 A worked example of these numbers follows below
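A back-of-the-envelope check of those formulas (plain arithmetic, not a simulation; the node counts are chosen only as examples):

    # Per-node state (2d neighbors) and average path length ((d/4) * n^(1/d) hops) for a few settings.
    def can_state_and_path(d, n):
        return 2 * d, (d / 4) * n ** (1 / d)

    for d, n in [(2, 65_536), (10, 65_536)]:
        neighbors, hops = can_state_and_path(d, n)
        print(f"d={d}, n={n}: {neighbors} neighbors, ~{hops:.1f} hops")
    # d=2: 4 neighbors, ~128 hops; d=10: 20 neighbors, ~7.6 hops.
    # Compare a Chord-style system on the same n: log2(65536) = 16 neighbors and ~16 hops.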

42 CAN: low latency
 Problem:
- Latency stretch = (CAN routing delay) / (IP routing delay)
- Application-level routing may lead to high stretch
 Solution:
- Increase the number of dimensions and realities (reduces the path length)
- Heuristics (reduce the per-CAN-hop latency): RTT-weighted routing, multiple nodes per zone (peer nodes), deterministically replicated entries

43 CAN: low latency
[Figure: latency stretch vs. number of nodes (16K-131K), #dimensions = 2, with and without heuristics.]

44 CAN: low latency
[Figure: latency stretch vs. number of nodes (16K-131K), #dimensions = 10, with and without heuristics.]

45 CAN: load balancing
 Two pieces:
- Dealing with hot spots (popular (key, value) pairs): nodes cache recently requested entries, and an overloaded node replicates popular entries at its neighbors
- Uniform coordinate-space partitioning: uniformly spreads (key, value) entries and uniformly spreads out the routing load

46 CAN: Robustness
 Completely distributed:
- No single point of failure (this does not cover the pieces of the database lost when a node fails)
 Not exploring database recovery (in case there are multiple copies of the database)
 Resilience of routing:
- Can route around trouble

47 Strengths
 More resilient than flooding broadcast networks
 Efficient at locating information
 Fault-tolerant routing
 Node & data high availability (with improvements)
 Manageable routing-table size & network traffic

48 Weaknesses
 Impossible to perform a fuzzy search
 Susceptible to malicious activity
 Must maintain coherence of all the indexed data (network overhead, efficient distribution)
 Still relatively high routing latency
 Poor performance without improvements

49 Suggestions
 Catalog and meta-indexes to perform the search function
 Extensions to handle mutable content efficiently for web hosting
 A security mechanism to defend against attacks

50 Ongoing Work: Topologically-Sensitive CAN Construction (Distributed Binning)
 Goal:
- Bin nodes such that co-located nodes land in the same bin
 Idea (a sketch follows below):
- A well-known set of landmark machines
- Each CAN node measures its RTT to each landmark
- It orders the landmarks in order of increasing RTT
 CAN construction:
- Place nodes from the same bin close together in the CAN
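A small sketch of the binning rule; the landmark names and RTT values are made up for illustration (a real node would measure the RTTs, e.g. by pinging the landmarks):

    # Each node orders a well-known set of landmarks by measured RTT; that ordering is its bin.
    LANDMARKS = ["L1", "L2", "L3", "L4"]

    def bin_for(rtts):
        """rtts: dict of landmark -> measured RTT in ms; the bin is the landmark ordering by increasing RTT."""
        return tuple(sorted(LANDMARKS, key=lambda lm: rtts[lm]))

    node_a = bin_for({"L1": 12, "L2": 80, "L3": 45, "L4": 200})   # -> ('L1', 'L3', 'L2', 'L4')
    node_b = bin_for({"L1": 15, "L2": 95, "L3": 40, "L4": 180})   # same ordering, so the same bin
    print(node_a == node_b)   # co-located nodes tend to share a bin and are placed near each other in the CAN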

51 Distributed Binning
 Setup: 4 landmarks (placed 5 hops away from each other), compared against naive partitioning
[Figure: latency stretch vs. number of nodes (256-4K) for #dimensions = 2 and #dimensions = 4, with and without binning.]

52 Ongoing Work (cont'd)
 CAN security (Petros Maniatis, Stanford):
- Spectrum of attacks
- Appropriate counter-measures
 CAN usage:
- Application-level multicast (NGC 2001)
- Grass-roots content distribution
- Distributed databases using CANs (J. Hellerstein, S. Ratnasamy, S. Shenker, I. Stoica, S. Zhuang)

53 Summary
 CAN:
- An Internet-scale hash table
- A potential building block in Internet applications
 Scalability:
- O(d) per-node state
- Average routing path is (d/4) * n^(1/d) hops
 Low-latency routing:
- Simple heuristics help a lot
 Robust:
- Decentralized; can route around trouble

