Peer-to-Peer Networks and Distributed Hash Tables 2006.

2 Peer-to-peer networking
- File sharing: files are stored at the end-user machines (peers) rather than at a server (client/server), and files are transferred directly between peers.
- Leverage: P2P is a way to leverage vast amounts of computing power, storage, and connectivity from personal computers (PCs) distributed around the world.
- Q: What are the new technical challenges?
- Q: What new services/applications are enabled?
- Q: Is it just "networking at the application level"? Everything old is new again?

3 Napster
- Napster: free music over the Internet.
- Key idea: share the content, storage, and bandwidth of individual (home) users.
- Model: each user stores a subset of files; each user has access to (can download) files from all users in the system.
- Application-level, client-server protocol (central index server) over point-to-point TCP.
- How it works, in four steps:
  1. Connect to the Napster index server.
  2. Upload your list of files (push) to the server.
  3. Give the server keywords to search the full list with.
  4. Select the "best" of the correct answers (e.g., by pinging the candidates).
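A minimal sketch of the centralized-index model on this slide, using a hypothetical in-memory index class (CentralIndex and the peer/file names are invented, not the real Napster protocol): peers push their file lists to the index, keyword searches touch only the index, and the actual transfer would then go directly to the returned peer.

```python
# Hypothetical, simplified sketch of a Napster-style centralized index: peers
# register their file lists, searches touch only the index, and the returned
# peer address is where the actual peer-to-peer download would go.

class CentralIndex:
    def __init__(self):
        self.files = {}                        # filename -> set of peer addresses

    def register(self, peer_addr, filenames):  # step 2: peers push their lists
        for name in filenames:
            self.files.setdefault(name, set()).add(peer_addr)

    def search(self, keyword):                 # step 3: keyword search on the index
        return [(name, peer)
                for name, peers in self.files.items()
                if keyword.lower() in name.lower()
                for peer in peers]

index = CentralIndex()
index.register("m5", ["E.mp3"])
index.register("m2", ["B.mp3"])
print(index.search("E"))                       # -> [('E.mp3', 'm5')]
```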

4 Napster: Example
[Figure: machines m1-m6 each store one file (A-F); the index server records the mapping m1->A, ..., m6->F. A query "E?" is answered with m5, and file E is then downloaded directly from m5 over the Internet.]

5 Napster characteristics
- Advantages:
  - Simplicity; easy to implement sophisticated search engines on top of the index system.
- Disadvantages:
  - Centralized index server: a single logical point of failure; can load-balance among servers using DNS rotation; potential for congestion; Napster is "in control" (freedom is an illusion).
  - No security: passwords in plain text, no authentication, no anonymity.

6 Main Challenge
- Find where a particular file is stored.
- Scale: up to hundreds of thousands or millions of machines.
  - 7/2001 simultaneous online users: Napster ~160K, Gnutella ~40K, Morpheus ~300K.
- Dynamicity: machines can come and go at any time.

7 Gnutella
- Peer-to-peer networking: peer applications.
- Focus: a decentralized method of searching for files.
- How to find a file: flood the request.
  - Send the request to all neighbors.
  - Neighbors recursively forward the request to their neighbors.
  - Eventually a machine that has the file receives the request and sends back the answer.
- Advantages: totally decentralized, highly robust.
- Disadvantages: not scalable; the entire network can be swamped with requests (to alleviate this problem, each request carries a TTL).
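A toy sketch of this flooding search, assuming an in-memory overlay graph (the topology, node names, and files dictionary are invented); each request carries a TTL that is decremented per hop so the flood eventually dies out:

```python
# Toy sketch of Gnutella-style flooding with a TTL (illustrative only).

neighbors = {                          # hypothetical overlay topology
    "m1": ["m2", "m3"],
    "m2": ["m1"],
    "m3": ["m1", "m4", "m5"],
    "m4": ["m3"],
    "m5": ["m3"],
}
files = {"m5": {"E"}, "m4": {"C"}}     # which node stores which files

def flood_query(origin, wanted, ttl=4):
    """Return the set of nodes holding `wanted` that the flood reaches within `ttl` hops."""
    hits, visited, frontier = set(), {origin}, [(origin, ttl)]
    while frontier:
        node, t = frontier.pop()
        if wanted in files.get(node, ()):
            hits.add(node)
        if t == 0:
            continue                   # TTL exhausted: stop forwarding from here
        for nb in neighbors.get(node, []):
            if nb not in visited:      # real Gnutella suppresses duplicate message ids
                visited.add(nb)
                frontier.append((nb, t - 1))
    return hits

print(flood_query("m1", "E"))          # -> {'m5'}
```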

8 Gnutella: Example
[Figure: assume m1's neighbors are m2 and m3, and m3's neighbors are m4 and m5. A query "E?" floods outward from m1 and reaches m5, which holds file E and returns the answer.]

9 Gnutella
- What we care about:
  - How much traffic does one query generate?
  - How many hosts can it support at once?
  - What is the latency associated with querying?
  - Is there a bottleneck?
- Late 2000: only 10% of downloads succeeded; 2001: more than 25% of downloads succeeded (is this success or failure?).

10 BitTorrent
- BitTorrent (BT) is a newer-generation P2P system; it can make downloads much faster.
  - The file to be distributed is split up into pieces and an SHA-1 hash is calculated for each piece.
- Swarming: parallel downloads among a mesh of cooperating peers.
  - Scalable: capacity increases as the number of peers/downloaders increases.
  - Efficient: it utilizes a large amount of the available network bandwidth.
- Tracker: a central server keeping a list of all peers participating in the swarm (handles peer discovery).
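A small sketch of the per-piece hashing mentioned above (the piece size and payload are arbitrary, and this is not the .torrent metainfo format itself): the file is cut into fixed-size pieces and each piece gets its own SHA-1 digest, so a downloader can verify every piece independently, whichever peer it came from.

```python
import hashlib

PIECE_SIZE = 256 * 1024                  # 256 KiB: an arbitrary example piece size

def piece_hashes(data: bytes, piece_size: int = PIECE_SIZE):
    """Split `data` into pieces and return one SHA-1 digest per piece."""
    return [hashlib.sha1(data[i:i + piece_size]).hexdigest()
            for i in range(0, len(data), piece_size)]

def verify_piece(piece: bytes, expected_hex: str) -> bool:
    # A downloader recomputes the hash of each received piece and compares it
    # with the digest published for that piece before accepting it.
    return hashlib.sha1(piece).hexdigest() == expected_hex

payload = b"example content " * 100_000   # stand-in for a real file
hashes = piece_hashes(payload)
print(len(hashes), "pieces;", verify_piece(payload[:PIECE_SIZE], hashes[0]))
```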

11 BitTorrent ... a picture
[Figure: a tracker coordinates a swarm of peers, each of which acts as both uploader and downloader.]

12 BitTorrent ... a picture
[Figure only: the swarm picture, continued; no recoverable text.]

13 Freenet
- Additional goals beyond file location:
  - Provide publisher anonymity and security.
  - Resistance to attacks: a third party shouldn't be able to deny access to a particular file (data item, object), even if it compromises a large fraction of machines.
- Architecture:
  - Each file is identified by a unique identifier.
  - Each machine stores a set of files and maintains a "routing table" to route the individual requests.

14 Data Structure
- Each node maintains a common stack (its routing table); each entry holds:
  - id: a file identifier.
  - next_hop: another node that stores the file id.
  - file: the file identified by id, if it is stored on the local node.
- Forwarding:
  - Each message contains the file id it is referring to.
  - If the file id is stored locally, stop.
  - If not, search for the "closest" id in the stack and forward the message to the corresponding next_hop.
(A code sketch of this data structure and forwarding rule follows slide 15.)

15 Query
- API: file = query(id);
- Upon receiving a query for document id:
  - Check whether the queried file is stored locally.
    - If yes, return it.
    - If not, forward the query message.
- Notes:
  - Each query carries a TTL that is decremented each time the query message is forwarded; to obscure the distance to the originator, the TTL can be initialized to a random value within some bounds, and when TTL = 1 the query is forwarded only with a finite probability.
  - Each node maintains state for all outstanding queries that have traversed it; this helps to avoid cycles.
  - When the file is returned, it is cached along the reverse path.
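A compact sketch of the routing table from slide 14 and the forwarding rule above, under simplifying assumptions (numeric ids, hand-built tables, "closest" taken as smallest absolute difference); real Freenet keys, reverse-path caching, and loop handling are richer than this:

```python
# Sketch of Freenet-style forwarding (simplified): each node keeps entries
# (id, next_hop, file-or-None) and forwards a query for an id it does not
# hold to the next_hop of the numerically closest id it knows about.

tables = {   # hypothetical routing tables: node -> list of (id, next_hop, file)
    "n1": [(3, "n1", "f3"), (9, "n3", None), (14, "n4", None)],
    "n3": [(8, "n5", None), (13, "n2", None), (4, "n1", "f4")],
    "n5": [(10, "n5", "f10"), (2, "n6", None)],
}

def query(node, file_id, ttl=10, path=()):
    if ttl == 0 or node in path:                 # TTL exhausted or cycle seen
        return None
    for fid, _, f in tables.get(node, []):
        if fid == file_id and f is not None:
            return f                             # stored locally: answer here
    if not tables.get(node):
        return None
    # otherwise forward to the next_hop of the closest known id
    _, nxt, _ = min(tables[node], key=lambda e: abs(e[0] - file_id))
    return query(nxt, file_id, ttl - 1, path + (node,))

print(query("n1", 10))   # routed n1 -> n3 -> n5, where f10 is stored
```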

16 Query Example
[Figure: query(10) issued at n1 is forwarded hop by hop following the closest-id rule through the nodes' routing tables until it reaches the node storing f10; file caching on the reverse path is not shown.]

17 Insert
- API: insert(id, file);
- Two steps:
  - Search for the file to be inserted.
  - If it is not found, insert the file.
- Searching: like a query, but nodes maintain state after a collision is detected and the reply is sent back to the originator.
- Insertion:
  - Follow the forward path; insert the file at all nodes along the path.
  - A node probabilistically replaces the originator with itself, obscuring the true originator.
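A rough, self-contained sketch of the insertion step above (it skips the preceding search and uses its own invented numeric ids and tables): the file is stored at every node on the forward path, and each hop rewrites the recorded originator with some probability.

```python
import random

# Self-contained sketch of a Freenet-style insert: the file is stored at
# every node along the forward path, and each hop may rewrite the recorded
# originator. Tables map node -> list of (id, source/next_hop, file).
tables = {
    "n1": [(3, "n1", "f3"), (9, "n3", None)],
    "n3": [(8, "n5", None), (13, "n2", None)],
    "n5": [(2, "n6", None)],
}

def insert(node, file_id, data, originator, ttl=10):
    if ttl == 0 or node not in tables:
        return
    if random.random() < 0.5:                 # this hop may claim to be the
        originator = node                     # originator, hiding the publisher
    tables[node].append((file_id, originator, data))      # store a copy here
    older = [e for e in tables[node] if e[0] != file_id]
    if older:                                 # forward toward the closest known id
        _, nxt, _ = min(older, key=lambda e: abs(e[0] - file_id))
        insert(nxt, file_id, data, originator, ttl - 1)

insert("n1", 10, "f10", originator="n1")
print([e for e in tables["n5"] if e[0] == 10])   # f10 is now stored at n5 too
```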

18 Insert Example
[Figure: assume the query for id 10 returned failure along the "blue" path; insert(10, f10) is now issued at n1.]

19 Insert Example
[Figure: the insert message travels from n1 toward n2; the entry (10, n1, f10) is stored along the way, with the originator recorded as n1.]

20 Insert Example
- n2 replaces the originator (n1) with itself.
[Figure: the stored entries for id 10 now record n2 as the source, and the message continues with orig = n2.]

21 Insert Example
[Figure: the insert continues along the forward path; f10 ends up stored, with routing entries pointing back toward it, at every node the message traversed.]

22 Freenet Properties
- Newly queried/inserted files are stored on nodes that store similar ids.
- New nodes can announce themselves by inserting files.
- Attempts to supplant or discover existing files will just spread those files further.

23 Freenet Summary
- Advantages:
  - Provides publisher anonymity.
  - Totally decentralized architecture, hence robust and scalable.
  - Resistant to malicious file deletion.
- Disadvantages:
  - Does not always guarantee that a file is found, even if the file is in the network.

24 Solutions to the Location Problem
- Goal: make sure that an item (file), once identified, is always found.
  - An indexing scheme is used to map file names to their locations in the system.
  - Requires a scalable indexing mechanism.
- Abstraction: a distributed hash-table (DHT) data structure.
  - insert(id, item);
  - item = query(id);
  - Note: the item can be anything: a data object, document, file, pointer to a file, ...
- Proposals: CAN, Chord, Kademlia, Pastry, Viceroy, Tapestry, etc.
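A bare-bones, single-process stand-in for the abstraction above (ToyDHT is invented and nothing here is distributed); the point is only the two-call interface behind which CAN, Chord, and the others hide their routing:

```python
# Stand-in for the DHT abstraction: same interface, no distribution.
# A real DHT (CAN, Chord, ...) spreads the key space over many nodes and
# routes each call to the node responsible for hash(id).
import hashlib

class ToyDHT:
    def __init__(self):
        self._store = {}

    def _key(self, id_):
        # Uniform hash of the identifier; real systems hash into their own
        # key space (a circle for Chord, a d-dimensional torus for CAN).
        return hashlib.sha1(id_.encode()).hexdigest()

    def insert(self, id_, item):
        self._store[self._key(id_)] = item

    def query(self, id_):
        return self._store.get(self._key(id_))

dht = ToyDHT()
dht.insert("song.mp3", "stored-at-peer-42")     # item can be any object or pointer
print(dht.query("song.mp3"))
```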

25 Internet-scale hash tables
- Hash tables are an essential building block in software systems.
- Are Internet-scale distributed hash tables equally valuable to large-scale distributed systems?
  - Peer-to-peer systems: Napster, Gnutella, Groove, FreeNet, MojoNation, ...
  - Large-scale storage management systems: Publius, OceanStore, PAST, Farsite, CFS, ...
  - Mirroring on the Web.
- The Content-Addressable Network (CAN): scalable, operationally simple, good performance.

26 Content-Addressable Network (CAN): basic idea
- Interface:
  - insert(key, value): key is the id, value is the item.
  - value = retrieve(key)
[Figure: a node calls insert(K1, V1) and the (key, value) pair is stored somewhere in the network.]

27 CAN: basic idea
[Figure: another node calls retrieve(K1) and the stored pair (K1, V1) is located and returned.]

28 CAN: basic idea
- Associate with each node and each item a unique id in a d-dimensional Cartesian space.
  - Each key (id) maps to a point in the space; each node owns a zone of the d-dimensional space.
- Goals:
  - Scale to hundreds of thousands of nodes.
  - Handle rapid arrival and failure of nodes.
- Properties:
  - Routing table size O(d).
  - Guarantees that a file is found in at most d * n^(1/d) steps, where n is the total number of nodes.

29 CAN: solution
- A virtual d-dimensional Cartesian coordinate space.
- The entire space is partitioned amongst all the nodes: every node "owns" a zone in the overall space.
- Abstraction:
  - Can store data at "points" in the space.
  - Can route from one "point" to another.
- A "point" is served by the node that owns the enclosing zone.

30 CAN Example: Two-Dimensional Space
- The space is divided between the nodes.
- Together, all nodes cover the entire space.
- Each node covers either a square or a rectangle with an aspect ratio of 1:2 or 2:1.
- Example: node n1:(1, 2) is the first node to join, so it covers the entire space.

31 CAN Example: Two-Dimensional Space
- Node n2:(4, 2) joins; the space is divided between n1 and n2.

32 CAN Example: Two-Dimensional Space
- Node n3:(3, 5) joins; the space is further divided between n1 and n3.

33 CAN Example: Two-Dimensional Space
- Nodes n4:(5, 5) and n5:(6, 6) join.

34 Node I::insert(K, V)
- Steps:
  (1) a = hx(K); b = hy(K)
  (2) route (K, V) toward the point (a, b)
  (3) the node owning (a, b) stores (K, V)
- Simple example: to store a pair (K1, V1), key K1 is mapped onto a point P in the coordinate space using a uniform hash function.
- The corresponding (key, value) pair is then stored at the node that owns the zone within which the point P lies.
- Data stored in the CAN is addressed by name (i.e., key), not location (i.e., IP address).
(A code sketch of this hashing step follows slide 36.)

35 CAN Example: Two-Dimensional Space
- Each item is stored by the node that owns its mapping in the space.
- Nodes: n1:(1, 2); n2:(4, 2); n3:(3, 5); n4:(5, 5); n5:(6, 6).
- Items: f1:(2, 3); f2:(5, 0); f3:(2, 1); f4:(7, 5).

36 Node J::retrieve(K)
- Simple example: to retrieve key K1:
  (1) a = hx(K); b = hy(K)
  (2) route "retrieve(K)" toward the point (a, b)
- Any node can apply the same deterministic hash function to map K1 onto point P and then retrieve the corresponding value from point P.
- If point P is not owned by the requesting node, the request must be routed through the CAN infrastructure until it reaches the node in whose zone P lies.
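A small sketch of the insert/retrieve hashing on slides 34 and 36, assuming hx and hy are realized as two salted SHA-1 hashes and a made-up 8x8 space with four fixed zones (routing is collapsed into a direct owner lookup here): a key maps to a point (a, b), and the (key, value) pair lives at whichever node owns the zone containing that point.

```python
import hashlib

SPACE = 8.0    # arbitrary side length of the 2-d coordinate space

def _h(key: str, salt: str) -> float:
    """Uniformly hash `key` to a coordinate in [0, SPACE)."""
    digest = hashlib.sha1((salt + key).encode()).digest()
    return int.from_bytes(digest[:8], "big") / 2**64 * SPACE

def point_for_key(key: str):
    # hx and hy from the slides, realized here as two salted hashes (an assumption)
    return (_h(key, "x"), _h(key, "y"))

# A made-up static partition of the space: node -> ((x_lo, y_lo), (x_hi, y_hi))
zones = {
    "n1": ((0, 0), (4, 4)),
    "n2": ((4, 0), (8, 4)),
    "n3": ((0, 4), (4, 8)),
    "n4": ((4, 4), (8, 8)),
}
store = {node: {} for node in zones}

def owner(point):
    """The node whose zone encloses `point` (every point has exactly one owner)."""
    x, y = point
    for node, ((x0, y0), (x1, y1)) in zones.items():
        if x0 <= x < x1 and y0 <= y < y1:
            return node

def insert(key, value):          # store (K, V) at the owner of the hashed point
    store[owner(point_for_key(key))][key] = value

def retrieve(key):               # any node recomputes the same point and asks its owner
    return store[owner(point_for_key(key))].get(key)

insert("K1", "V1")
print(point_for_key("K1"), retrieve("K1"))
```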

37 CAN: Query/Routing Example
- Each node knows its neighbors in the d-dimensional space.
- Forward the query to the neighbor that is closest to the query id.
- Example: assume n1 queries f4.
- A node only maintains state for its immediate neighboring nodes.
- Routing can work around some failures.
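A sketch of this greedy forwarding rule with a hypothetical neighbor map and zone centers (all names and coordinates are invented): each hop hands the query to whichever neighbor's zone center is closest, in Euclidean distance, to the target point.

```python
import math

# Hypothetical 2-d CAN: each node knows only its own zone centre and its
# immediate neighbours; routing greedily moves toward the target point.
centers = {"n1": (1, 2), "n2": (4, 2), "n3": (3, 5), "n4": (5, 5), "n5": (6, 6)}
neighbors = {
    "n1": ["n2", "n3"],
    "n2": ["n1", "n4"],
    "n3": ["n1", "n4"],
    "n4": ["n2", "n3", "n5"],
    "n5": ["n4"],
}

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def route(start, target, max_hops=20):
    """Return the hop-by-hop path from `start` toward the node nearest `target`."""
    path, node = [start], start
    for _ in range(max_hops):
        # pick the neighbour whose zone centre is closest to the target point
        best = min(neighbors[node], key=lambda nb: dist(centers[nb], target))
        if dist(centers[best], target) >= dist(centers[node], target):
            break                 # no neighbour gets closer: this node owns the point
        node = best
        path.append(node)
    return path

print(route("n1", (7, 5)))        # -> ['n1', 'n3', 'n4', 'n5'] (toward item f4)
```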

38 CAN: node insertion
- Inserting a new node affects only a single existing node and its immediate neighbors.
  1. Discover some node I already in the CAN.
  2. Pick a random point (p, q) in the space.
  3. I routes to (p, q) and discovers node J, the owner of that zone.
  4. Split J's zone in half; the new node owns one half.
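A toy sketch of this join step, under assumptions of its own (axis-aligned rectangular zones, the longer side is the one split, and only a single pre-existing node J): the joining node picks a random point, the owner of the zone containing it is found, and that owner gives up half of its zone.

```python
import random

# Toy CAN join: the occupant of the zone containing the newcomer's random
# point splits its zone in half and hands one half to the newcomer.
zones = {"J": ((0.0, 0.0), (8.0, 8.0))}        # node J currently owns everything

def owner(point):
    x, y = point
    for node, ((x0, y0), (x1, y1)) in zones.items():
        if x0 <= x < x1 and y0 <= y < y1:
            return node

def join(new_node):
    p = (random.uniform(0, 8), random.uniform(0, 8))   # step 2: pick a random point
    old = owner(p)                                     # steps 1 and 3: find its owner
    (x0, y0), (x1, y1) = zones[old]
    if x1 - x0 >= y1 - y0:                             # split the longer side in half
        mid = (x0 + x1) / 2
        zones[old], zones[new_node] = ((x0, y0), (mid, y1)), ((mid, y0), (x1, y1))
    else:
        mid = (y0 + y1) / 2
        zones[old], zones[new_node] = ((x0, y0), (x1, mid)), ((x0, mid), (x1, y1))

join("new")
print(zones)    # J keeps one half, "new" owns the other; no other node is affected
```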

39 CAN: Node Failure Recovery
- Simple failures:
  - Know your neighbor's neighbors.
  - When a node fails, one of its neighbors takes over its zone.
- More complex failure modes:
  - Simultaneous failure of multiple adjacent nodes.
  - Scoped flooding to discover neighbors.
  - Hopefully a rare event.
- Only the failed node's immediate neighbors are required for recovery.

40 Evaluation
- Scalability
- Low latency
- Load balancing
- Robustness

41 CAN: scalability
- For a uniformly partitioned space with n nodes and d dimensions:
  - Per node, the number of neighbors is 2d.
  - The average routing path is (d/4) * n^(1/d) hops.
  - Simulations show that these results hold in practice.
- The network can be scaled without increasing per-node state.
- Chord/Plaxton/Tapestry/Buzz: O(log n) neighbors with O(log n) hops.
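A quick numerical check of the two expressions above (the values of n and d are just examples): per-node state is 2d neighbors regardless of n, while the average path grows only as n^(1/d).

```python
# Plugging example numbers into the CAN scaling results: 2d neighbours per
# node, and an average of (d/4) * n**(1/d) hops in a uniformly partitioned space.
for d in (2, 6, 10):
    for n in (2**14, 2**17, 2**20):
        neighbours = 2 * d
        hops = (d / 4) * n ** (1 / d)
        print(f"d={d:2d}  n={n:>9,}  neighbours={neighbours:2d}  avg hops={hops:6.1f}")
```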

42 CAN: low-latency
- Problem:
  - Latency stretch = (CAN routing delay) / (IP routing delay).
  - Application-level routing may lead to high stretch.
- Solution:
  - Increase the number of dimensions and realities (reduces the path length).
  - Heuristics (reduce the per-CAN-hop latency): RTT-weighted routing; multiple nodes per zone (peer nodes); deterministically replicated entries.

43 CAN: low-latency
[Figure: latency stretch versus number of nodes (16K-131K) for #dimensions = 2, with and without the heuristics.]

44 CAN: low-latency
[Figure: latency stretch versus number of nodes (16K-131K) for #dimensions = 10, with and without the heuristics.]

45 CAN: load balancing
- Two pieces:
  - Dealing with hot spots (popular (key, value) pairs): nodes cache recently requested entries; an overloaded node replicates popular entries at its neighbors.
  - Uniform coordinate-space partitioning: uniformly spreads the (key, value) entries and uniformly spreads out the routing load.

46 CAN: Robustness
- Completely distributed: no single point of failure (this does not, however, protect the pieces of the database held by a node that fails).
- Database recovery is not explored here (e.g., the case where multiple copies of the database exist).
- Resilience of routing: can route around trouble.

47 Strengths
- More resilient than flooding/broadcast networks.
- Efficient at locating information.
- Fault-tolerant routing.
- High node and data availability (with improvements).
- Manageable routing-table size and network traffic.

48 Weaknesses
- Impossible to perform a fuzzy search.
- Susceptible to malicious activity.
- Must maintain coherence of all the indexed data (network overhead, efficient distribution).
- Still relatively high routing latency.
- Poor performance without the improvements.

49 Suggestions
- Catalog and meta-indexes to support search functionality.
- Extensions to handle mutable content efficiently, e.g., for web hosting.
- Security mechanisms to defend against attacks.

50 Ongoing Work: Topologically-Sensitive CAN Construction (Distributed Binning)
- Goal: bin nodes such that co-located nodes land in the same bin.
- Idea:
  - A well-known set of landmark machines.
  - Each CAN node measures its RTT to each landmark.
  - It orders the landmarks in order of increasing RTT.
- CAN construction: place nodes from the same bin close together in the CAN.
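A small sketch of this binning rule, assuming a fixed landmark list and invented RTT measurements: a node's bin is simply the ordering of the landmarks by its measured RTTs, so nearby nodes tend to produce the same ordering and land in the same bin.

```python
# Sketch of distributed binning: a node's "bin" is the permutation of the
# well-known landmarks ordered by that node's measured RTT to each of them.
LANDMARKS = ["L1", "L2", "L3", "L4"]

def bin_for(rtts_ms):
    """rtts_ms: dict landmark -> measured RTT in ms; returns the bin label."""
    ordering = sorted(LANDMARKS, key=lambda lm: rtts_ms[lm])
    return ">".join(ordering)

# Two nearby nodes (similar RTT vectors) fall into the same bin; a distant
# node produces a different ordering. The RTT values here are invented.
print(bin_for({"L1": 12, "L2": 40, "L3": 95, "L4": 60}))   # -> L1>L2>L4>L3
print(bin_for({"L1": 15, "L2": 43, "L3": 90, "L4": 58}))   # -> L1>L2>L4>L3 (same bin)
print(bin_for({"L1": 80, "L2": 20, "L3": 35, "L4": 90}))   # -> L2>L3>L1>L4
```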

51 Distributed Binning
[Figure: latency stretch versus number of nodes (256-4K), with 4 landmarks placed 5 hops apart, comparing naive partitioning ("w/o binning") against binning ("w/ binning") for #dimensions = 2 and #dimensions = 4.]

52 Ongoing Work (cont'd)
- CAN security (Petros Maniatis, Stanford): the spectrum of attacks and appropriate counter-measures.
- CAN usage:
  - Application-level multicast (NGC 2001).
  - Grass-roots content distribution.
  - Distributed databases using CANs (J. Hellerstein, S. Ratnasamy, S. Shenker, I. Stoica, S. Zhuang).

53 Summary
- CAN: an Internet-scale hash table; a potential building block in Internet applications.
- Scalability: O(d) per-node state; the average routing path is (d/4) * n^(1/d) hops.
- Low-latency routing: simple heuristics help a lot.
- Robust: decentralized; can route around trouble.