1 EE 122: Overlay Networks and P2P Networks
Ion Stoica
TAs: Junda Liu, DK Moon, David Zats
(Materials with thanks to Vern Paxson, Jennifer Rexford, and colleagues at UC Berkeley)

2 Announcements
No class Wednesday. Happy Thanksgiving!
Homework 3 grades available by Wednesday
Homework 4 due on Wednesday, December 2

3 Overlay Networks: Motivations
Changes in the network happen very slowly. Why?
The Internet is a shared infrastructure; changes need consensus (IETF)
Many proposals require changing a large number of routers (e.g., IP Multicast, QoS); otherwise end-users won't benefit
Proposed changes that haven't happened yet on a large scale:
More addresses (IPv6, 1991)
Security (IPsec, 1993)
Multicast (IP multicast, 1990)

4 Motivations (cont'd)
One size does not fit all
Applications need different levels of:
Reliability
Performance (latency)
Security
Access control (e.g., who is allowed to join a multicast group)
…

5 Goals
Make it easy to deploy new functionality in the network → accelerate the pace of innovation
Allow users to customize their service

6 Solution
Deploy processing in the network
Have packets processed as they traverse the network
[Figure: an overlay network built on top of IP, spanning multiple ASes]

7 Overview
→ Resilient Overlay Network (RON)
Overlay multicast
Peer-to-peer systems

8 Resilient Overlay Network (RON)
Premise: overlay networks can increase the performance and reliability of routing
Install N computers at different Internet locations
Each computer acts as an overlay network router
Between each pair of overlay routers is an IP tunnel (logical link)
The logical overlay topology is all-to-all (N^2 links)
Computers actively measure each logical link in real time:
Packet loss rate, latency, throughput, etc.
Route overlay network traffic based on the measured characteristics
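A minimal sketch of the path-selection idea (illustrative only, not RON's actual implementation): given measured latencies for every logical link, an overlay router compares the direct path with every one-hop detour through another overlay router and picks the cheapest. The node names and latency table below are made up.

# Sketch: choose the best overlay path (direct vs. one-hop detour)
# from measured per-link latencies. All numbers are illustrative.
INF = float("inf")

latency = {                       # measured one-way latency in ms; missing = link down
    ("Berkeley", "MIT"): 120.0,   # direct path is congested
    ("Berkeley", "UCLA"): 10.0,
    ("UCLA", "MIT"): 40.0,
}

def link(a, b):
    return latency.get((a, b), INF)

def best_path(src, dst, nodes):
    """Minimize latency over the direct link and all one-hop detours."""
    best = ([src, dst], link(src, dst))
    for mid in nodes:
        if mid in (src, dst):
            continue
        cost = link(src, mid) + link(mid, dst)
        if cost < best[1]:
            best = ([src, mid, dst], cost)
    return best

path, ms = best_path("Berkeley", "MIT", ["Berkeley", "UCLA", "MIT"])
print(path, ms)   # ['Berkeley', 'UCLA', 'MIT'] 50.0

The same comparison, rerun continuously over fresh measurements, is what lets an overlay react to congestion much faster than BGP reconvergence.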

9 Example
Default IP path determined by BGP & OSPF
Reroute traffic along an alternative overlay network path to avoid the congestion point
[Figure: overlay routers at Berkeley, UCLA, and MIT; traffic rerouted around the congested direct path]

10 Overview
Resilient Overlay Network (RON)
→ Overlay multicast
Peer-to-peer systems

11 IP Multicast Problems
Twenty years of research, still not widely deployed
Poor scalability:
Routers need to maintain per-group, or even per-group and per-sender, state!
Multicast addresses cannot be aggregated
Supporting higher-level functionality is difficult:
IP Multicast is a best-effort multi-point delivery service
Reliability and congestion control for IP Multicast are complicated
No support for access control:
No restriction on who can send → easy to mount Denial of Service (DoS) attacks!

12 Overlay Approach
Provide IP multicast functionality above the IP layer → application-level multicast
Challenge: do this efficiently
Projects: Narada, Overcast, Scattercast, Yoid, Coolstreaming (Roxbeam), Rawflow

13 Narada [Yang-hua et al., 2000]
Source-specific trees
Involves only end hosts
Small group sizes (up to hundreds of nodes)
Typical application: chat

14 Narada: End System Multicast
[Figure: physical topology connecting Berkeley, Stanford, CMU, and Gatech, and the overlay tree built among end hosts Stan1, Stan2, Berk1, Berk2, CMU, and Gatech]

15 Properties
Easier to deploy than IP Multicast:
Don't have to modify every router on the path
Easier to implement reliability than IP Multicast:
Use hop-by-hop retransmissions
But:
Consumes more bandwidth than IP Multicast
Typically has higher latency than IP Multicast
Harder to scale
Optimization: use IP Multicast where available

16 Overview
Resilient Overlay Network (RON)
Overlay multicast
→ Peer-to-peer systems

17 How Did It Start?
A killer application: Napster
Free music over the Internet
Key idea: share the storage and bandwidth of individual (home) users

18 Model
Each user stores a subset of files
Each user has access to (can download) files from all users in the system

19 Main Challenge
Find where a particular file is stored
Note: the problem is similar to finding a particular page in web caching (what are the differences?)
[Figure: six nodes A-F; one node asks "E?" to locate file E]

20 Other Challenges
Scale: up to hundreds of thousands or millions of machines
Dynamicity: machines can come and go at any time

21 Napster
Assume a centralized index system that maps files (songs) to the machines that are alive and store them
How to find a file (song):
Query the index system → it returns a machine that stores the required file
Ideally this is the closest/least-loaded machine
FTP the file from that machine
Advantages: simplicity; easy to implement sophisticated search engines on top of the index system
Disadvantages: robustness, scalability (?)
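A minimal sketch of the centralized-index idea (illustrative only, not Napster's actual protocol): the server keeps a map from file name to the set of live machines storing it, and a query simply consults that map.

# Sketch of a Napster-style centralized index (illustrative).
class CentralIndex:
    def __init__(self):
        self.index = {}          # file name -> set of machines storing it

    def register(self, machine, files):
        """A machine joins and advertises the files it stores."""
        for f in files:
            self.index.setdefault(f, set()).add(machine)

    def unregister(self, machine):
        """A machine leaves; remove it from every entry."""
        for holders in self.index.values():
            holders.discard(machine)

    def lookup(self, file):
        """Return some machine that stores the file (None if unknown).
        A real system would pick the closest/least-loaded holder."""
        holders = self.index.get(file)
        return next(iter(holders)) if holders else None

idx = CentralIndex()
idx.register("m5", ["E"])
idx.register("m1", ["A", "E"])
print(idx.lookup("E"))   # m1 or m5; the client then fetches E directly

The index is a single point of failure and a scaling bottleneck, which is exactly the "robustness, scalability (?)" caveat above.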

22 Napster: Example
[Figure: machines m1-m6 store files A-F; the index maps m1→A, m2→B, m3→C, m4→D, m5→E, m6→F; a client asks the index "E?", gets back m5, and downloads E directly from m5]

23 Gnutella
Distribute the file-location function
Idea: broadcast the request
How to find a file:
Send the request to all neighbors
Neighbors recursively forward (flood) the request
Eventually a machine that has the file receives the request, and it sends back the answer
Advantages: totally decentralized, highly robust
Disadvantages: not scalable; the entire network can be swamped with requests (to alleviate this problem, each request carries a TTL)
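A minimal sketch of TTL-scoped flooding over a made-up neighbor graph (illustrative; the real Gnutella protocol also uses message IDs, duplicate suppression, and back-propagated replies):

# Sketch of Gnutella-style flooding with a TTL (illustrative topology).
neighbors = {
    "m1": ["m2", "m3"],
    "m2": ["m1"],
    "m3": ["m1", "m4", "m5"],
    "m4": ["m3"],
    "m5": ["m3", "m6"],
    "m6": ["m5"],
}
files = {"m6": {"E"}, "m2": {"A"}}   # who stores what

def flood_query(start, wanted, ttl):
    """Flood a query from `start`; return the nodes that have the file.
    `seen` suppresses duplicate forwarding of the same query."""
    hits, seen = [], {start}
    frontier = [start]
    while frontier and ttl > 0:
        nxt = []
        for node in frontier:
            for peer in neighbors[node]:
                if peer in seen:
                    continue
                seen.add(peer)
                if wanted in files.get(peer, ()):  # peer has the file
                    hits.append(peer)
                nxt.append(peer)                   # keep flooding
        frontier, ttl = nxt, ttl - 1
    return hits

print(flood_query("m1", "E", ttl=3))   # ['m6'], reached within 3 hops

Note how the TTL bounds the blast radius of each query; it also means a file outside the TTL horizon is simply not found.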

24 Gnutella: Example
Assume m1's neighbors are m2 and m3; m3's neighbors are m4 and m5; …
[Figure: the query "E?" floods from m1 through the overlay until it reaches the machine storing E]

25 Two-Level Hierarchy
Used by the current Gnutella implementation and KaZaA
Leaf nodes are connected to a small number of ultrapeers (supernodes)
Query:
A leaf sends the query to its ultrapeers
If the ultrapeers don't know the answer, they flood the query to other ultrapeers
More scalable: flooding happens only among ultrapeers
[Figure: ultrapeer and leaf nodes from an October 2003 crawl of Gnutella]

26 Skype
Peer-to-peer Internet telephony
Two-level hierarchy like KaZaA:
Ultrapeers are used mainly to route traffic between NATed end hosts (see next slide)…
…plus a login server to:
authenticate users
ensure that names are unique across the network
(Note: probable protocol; the Skype protocol is not published)
[Figure: hosts A and B exchange login messages with the login server; data traffic flows between the hosts]

27 BitTorrent (1/2)
Goal: allow fast downloads even when sources have low connectivity
How does it work?
Split each file into pieces (~256 KB each), and each piece into sub-pieces (~16 KB each)
The downloader fetches one piece at a time
Within one piece, the downloader can fetch up to five sub-pieces in parallel

28 BitTorrent (2/2)
A download consists of three phases:
Start: get a complete piece as soon as possible
Select a random piece
Middle: spread all pieces as quickly as possible
Select the rarest piece next
End: avoid getting stuck with a slow source when downloading the last sub-pieces
Request the same sub-pieces in parallel from multiple peers
Cancel the slowest downloads once a sub-piece has been received
(For details see:
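A minimal sketch of the "rarest piece first" policy used in the middle phase (illustrative; real clients also handle choking, piece verification, and endgame cancellation): count how many known peers hold each piece and pick the piece we still need with the smallest count.

# Sketch of BitTorrent's rarest-first piece selection (illustrative).
from collections import Counter

def rarest_first(have, peer_bitfields):
    """Pick the piece we still need that the fewest peers have.
    `have`: set of piece indices we already hold.
    `peer_bitfields`: one set per peer of the pieces that peer holds."""
    counts = Counter()
    for bitfield in peer_bitfields:
        counts.update(bitfield)
    candidates = [(n, p) for p, n in counts.items() if p not in have]
    return min(candidates)[1] if candidates else None

peers = [{0, 1, 2}, {0, 1, 2}, {0, 3}]   # piece 3 is held by only one peer
print(rarest_first(have={0}, peer_bitfields=peers))   # -> 3

Replicating the rarest piece first keeps every piece available even as peers come and go.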

29 Distributed Hash Tables
Problem: given an ID, map it to a host
Challenges:
Scalability: hundreds of thousands or millions of machines
Instability: changes in routes, congestion, availability of machines
Heterogeneity:
Latency: 1 ms to 1000 ms
Bandwidth: 32 Kb/s to 100 Mb/s
Nodes stay in the system from 10 seconds to a year
Trust:
Selfish users
Malicious users

30 Content Addressable Network (CAN)
Associate with each node and each item a unique ID in a d-dimensional space
Properties:
Routing table size is O(d)
Guarantees that a file is found in at most d*n^(1/d) steps, where n is the total number of nodes
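To get a feel for these bounds (a worked example, not from the slides): with d = 2 and n = 10,000 nodes, each node keeps O(2) neighbor entries and a lookup takes at most 2 * 10000^(1/2) = 200 steps; raising d to 4 shrinks the bound to 4 * 10000^(1/4) = 40 steps, at the cost of a larger routing table.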

31 CAN Example: Two-Dimensional Space
Space is divided between nodes
Together, the nodes cover the entire space
Each node covers either a square or a rectangle of aspect ratio 1:2 or 2:1
Example: assume the space is of size 8 x 8
Node n1:(1, 2) is the first node that joins → it covers the entire space
[Figure: the 8 x 8 space, entirely owned by n1]

32 CAN Example: Two-Dimensional Space
Node n2:(4, 2) joins → the space is divided between n1 and n2
[Figure: the space split between n1 and n2]

33 CAN Example: Two-Dimensional Space
Node n3:(3, 5) joins → the space is divided further
[Figure: the space split among n1, n2, and n3]

34 CAN Example: Two-Dimensional Space
Nodes n4:(5, 5) and n5:(6, 6) join
[Figure: the space split among n1-n5]

35 CAN Example: Two-Dimensional Space
Nodes: n1:(1, 2); n2:(4, 2); n3:(3, 5); n4:(5, 5); n5:(6, 6)
Items: f1:(2, 3); f2:(5, 1); f3:(2, 1); f4:(7, 5)
[Figure: nodes and items placed at their coordinates in the space]

36 CAN Example: Two-Dimensional Space
Each item is stored by the node that owns the zone its coordinates map into
[Figure: each item f1-f4 stored at the node owning its zone]

37 CAN: Query Example
Each node knows its neighbors in the d-dimensional space
Forward the query to the neighbor closest to the query ID
Example: assume n1 queries f4
[Figure: the query for f4 routed greedily across the zones from n1 to the node storing f4]
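A minimal sketch of CAN-style greedy forwarding (illustrative; real CAN routes on zones, whereas this simplification compares distances to each neighbor's zone center, and the neighbor lists below are made up):

# Sketch of greedy CAN routing in 2-D (illustrative coordinates).
center = {"n1": (1, 2), "n2": (4, 2), "n3": (3, 5), "n4": (5, 5), "n5": (6, 6)}
nbrs = {"n1": ["n2", "n3"], "n2": ["n1", "n4"], "n3": ["n1", "n4"],
        "n4": ["n2", "n3", "n5"], "n5": ["n4"]}

def dist2(p, q):
    """Squared Euclidean distance between two points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def route(start, point):
    """Greedily forward toward `point`; stop when no neighbor is closer."""
    path, node = [start], start
    while True:
        best = min(nbrs[node], key=lambda m: dist2(center[m], point))
        if dist2(center[best], point) >= dist2(center[node], point):
            return path          # current node owns the closest zone
        node = best
        path.append(node)

print(route("n1", (7, 5)))       # ['n1', 'n3', 'n4', 'n5'] toward f4 at (7, 5)

Each hop makes geometric progress toward the item's point, which is where the d*n^(1/d) path-length bound comes from.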

38 Chord
Associate with each node and each item a unique ID in a one-dimensional space 0..2^m - 1
Key design decision: decouple correctness from efficiency
Properties:
Routing table size is O(log N), where N is the total number of nodes
Guarantees that a file is found in O(log N) steps

39 Identifier-to-Node Mapping Example
Node 8 is responsible for identifiers in [5, 8]
Node 15 for [9, 15]
Node 20 for [16, 20]
…
Node 4 for [59, 4]
Each node maintains a pointer to its successor
[Figure: identifier ring showing each node and the identifier range it is responsible for]

40 Lookup
Each node maintains its successor
Route a packet (ID, data) to the node responsible for ID using the successor pointers
[Figure: lookup(37) forwarded around the ring until it reaches node 44, which is responsible for ID 37]
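A minimal sketch of lookup over successor pointers only (no finger tables yet; the ring contents are illustrative): walk the ring until the key falls between a node and its successor.

# Sketch: Chord lookup using only successor pointers (illustrative ring).
M = 6                                # identifier space 0..2^M - 1 = 0..63
nodes = [4, 8, 15, 20, 35, 44, 58]   # sorted node IDs on the ring
succ = {n: nodes[(i + 1) % len(nodes)] for i, n in enumerate(nodes)}

def between(x, a, b):
    """True if x is in the half-open ring interval (a, b]."""
    return (a < x <= b) if a < b else (x > a or x <= b)

def lookup(start, key):
    """Walk successor pointers until we reach the node responsible for key."""
    n = start
    while not between(key, n, succ[n]):
        n = succ[n]
    return succ[n]

print(lookup(8, 37))   # -> 44, the first node whose ID >= 37

This is correct but takes O(N) hops; the finger tables later in the lecture bring it down to O(log N).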

41 Stabilization Procedure
Periodic operation performed by each node N to handle joins:

N: periodically:
    send STABILIZE to N.successor
M: upon receiving STABILIZE from N:
    send NOTIFY(M.predecessor) to N
N: upon receiving NOTIFY(M') from M:
    if M' is between (N, N.successor):
        N.successor = M'
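The same protocol as a runnable sketch (a simplification: local method calls stand in for network messages, and the ring below is illustrative):

# Sketch of Chord's stabilize/notify (messages replaced by method calls).
class Node:
    def __init__(self, ident):
        self.id = ident
        self.successor = self         # a lone node points at itself
        self.predecessor = None

    def stabilize(self):
        """Ask our successor for its predecessor; adopt it if it is closer."""
        x = self.successor.predecessor
        if x is not None and in_interval(x.id, self.id, self.successor.id):
            self.successor = x
        self.successor.notify(self)

    def notify(self, candidate):
        """A node claims to be our predecessor; accept if it is closer."""
        if self.predecessor is None or in_interval(
                candidate.id, self.predecessor.id, self.id):
            self.predecessor = candidate

def in_interval(x, a, b):
    """True if x lies on the ring strictly between a and b."""
    return (a < x < b) if a < b else (x > a or x < b)

n44, n58, n50 = Node(44), Node(58), Node(50)
n44.successor, n58.predecessor = n58, n44   # existing ring segment 44 -> 58
n50.successor = n58                         # node 50 joins, learns succ = 58
n50.stabilize()   # 58 learns pred = 50
n44.stabilize()   # 44 sees 58's pred = 50, adopts succ = 50, notifies 50
print(n44.successor.id, n50.successor.id, n50.predecessor.id)   # 50 58 44

The next slides walk through exactly this sequence of messages.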

42 Joining Operation
Node with id=50 joins the ring
Node 50 needs to know at least one node already in the system; assume the known node is 15
[Figure: ring state before the join: node 50 has succ=nil, pred=nil; node 44 has succ=58, pred=35; node 58 has succ=4, pred=44]

43 Joining Operation
Node 50: sends join(50) to node 15
Node 44: returns node 58
Node 50 updates its successor to 58
[Figure: join(50) routed from node 15 to node 44; node 50 now has succ=58, pred=nil]

44 Joining Operation
Node 50: sends stabilize() to node 58
Node 58:
updates its predecessor to 50
sends notify() back
[Figure: node 58 now has pred=50; node 50 still has pred=nil]

45 Joining Operation (cont'd)
Node 44 sends a stabilize message to its successor, node 58
Node 58 replies with a notify message carrying its predecessor, node 50
Node 44 updates its successor to 50
[Figure: node 44 now has succ=50]

46 Joining Operation (cont'd)
Node 44 sends a stabilize message to its new successor, node 50
Node 50 sets its predecessor to node 44
[Figure: node 50 now has pred=44]

47 Joining Operation (cont'd)
This completes the joining operation!
[Figure: final state: node 44 has succ=50; node 50 has succ=58, pred=44; node 58 has pred=50]

48 Achieving Efficiency: Finger Tables
Say m = 7
The i-th entry at the peer with id n is the first peer with id >= (n + 2^i) mod 2^m
Example: for the peer with id 80, the entry for i = 6 points to the first peer with id >= (80 + 2^6) mod 2^7 = 16
[Figure: finger table ft[i] at node 80 on a ring with identifier space 0..127]
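A sketch of finger-table construction and the resulting logarithmic lookup (the ring contents are the example nodes from earlier slides with m = 6; illustrative, not a full Chord implementation):

# Sketch: build finger tables and do finger-based Chord lookup.
M = 6                                  # IDs in 0..63 (illustrative)
nodes = sorted([4, 8, 15, 20, 35, 44, 58])

def successor_of(ident):
    """First node whose ID >= ident, wrapping around the ring."""
    for n in nodes:
        if n >= ident % (2 ** M):
            return n
    return nodes[0]

# ft[n][i] = first peer with id >= (n + 2^i) mod 2^M
ft = {n: [successor_of(n + 2 ** i) for i in range(M)] for n in nodes}

def between(x, a, b):                  # x in ring interval (a, b]
    return (a < x <= b) if a < b else (x > a or x <= b)

def lookup(n, key):
    """Jump via the farthest finger that does not overshoot the key."""
    while not between(key, n, ft[n][0]):          # ft[n][0] is the successor
        for f in reversed(ft[n]):
            if between(f, n, key) and f != n:
                n = f
                break
        else:
            break
    return ft[n][0]

print(lookup(8, 37))   # -> 44, reached in O(log N) hops

Because each jump roughly halves the remaining ring distance to the key, the number of hops is O(log N) instead of O(N).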

49 Achieving Robustness
To improve robustness, each node maintains its k (k > 1) immediate successors instead of only one
In the notify() message, node A can send its k-1 first successors to its predecessor B
Upon receiving the notify() message, B updates its successor list to A followed by A's successors
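The list update is a one-liner in spirit (a sketch; k and the node IDs below are illustrative):

# Sketch: successor-list update on notify (illustrative).
K = 3   # number of immediate successors each node keeps

def merge_successors(a, a_succ_list, k=K):
    """New list for A's predecessor: A itself, then A's first k-1 successors."""
    return [a] + a_succ_list[:k - 1]

# node 44 hears notify() from node 50, whose successor list is [58, 4, 8]:
print(merge_successors(50, [58, 4, 8]))   # -> [50, 58, 4]

If node 50 later fails, node 44 can fall back to 58 from its list instead of losing the ring.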

50 Discussion
The query can be implemented:
Iteratively
Recursively
Performance: routing in the overlay network can be more expensive than in the underlying network
Usually there is no correlation between node IDs and locality; a query can repeatedly jump between Europe and North America even though both the initiator and the node that stores the item are in Europe!
Solutions: Tapestry takes care of this implicitly; CAN and Chord maintain multiple copies for each entry in their routing tables and choose the closest in terms of network distance

51 Conclusions
The key challenge in building wide-area P2P systems is a scalable and robust directory service
Solutions covered in this lecture:
Napster: centralized location service
Gnutella: broadcast-based decentralized location service
CAN, Chord, Tapestry, Pastry: intelligent-routing decentralized solutions
Guarantee correctness
Tapestry and Pastry provide efficient routing, but are more complex