CS162 Operating Systems and Systems Programming Lecture 23 Peer-to-Peer Systems April 18, 2011 Anthony D. Joseph and Ion Stoica

Lec /18Anthony D. Joseph and Ion Stoica CS162 ©UCB Spring 2012
Three Capstone Lectures
Peer-to-Peer systems (today)
Client-Server (Monday)
Cloud computing (Wednesday, April 25)
  – Invited lecture: Harry Li (Facebook)

P2P Traffic
2004: some Internet Service Providers (ISPs) claimed that more than 50% of their traffic was P2P

P2P Traffic
Today: around 20%
A big chunk of traffic is now video entertainment (e.g., Netflix, iTunes)

Peer-to-Peer Systems
What problem does it try to solve?
  – Provide highly scalable, cost-effective (i.e., free!) services, e.g.,
    » Content distribution (e.g., BitTorrent)
    » Internet telephony (e.g., Skype)
    » Video streaming (e.g., Octoshape)
    » Computation
Key idea: leverage the “free” resources of the users of the service, e.g.,
  – Network bandwidth
  – Storage
  – Computation

How Did it Start?
A killer application: Napster (1999)
  – Free music over the Internet
Use (home) user machines to store and distribute songs

Model
Each user stores a subset of files
Each user can access (download) files from all users in the system
(Figure: users storing files A through F)

Main Challenge
Find a “good” node storing a specified file
By “good” we mean:
  – Has correct content
  – Can get content fast
  – …
(Figure: users storing files A through F; one user asks “E?”)

Other Challenges
Scale: up to hundreds of thousands or millions of machines
Dynamicity: machines can come and go at any time

Napster
Implements a centralized lookup/directory service that maps files (songs) to machines currently in the system
How to find a file (song)?
  – Query the lookup service → it returns a machine that stores the required file
    » Ideally this is the closest/least-loaded machine
  – Download (ftp/http) the file
Advantages:
  – Simplicity, easy to implement sophisticated search engines on top of a centralized lookup service
Disadvantages:
  – Robustness, scalability (?)

Napster: Example
1) A client (initiator) contacts the directory service to get file “C”
2) The directory service returns a (possibly) close-by and lightly loaded peer (m5) storing “C”
3) The client contacts m5 directly to get file “C”
(Figure: peers m1 to m9, the initiator, and a directory service holding A: m3; B: m1, m7; C: m5, m8; D: m8; …)
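
The three steps above can be sketched as a toy directory service. This is a minimal illustration, not Napster's actual protocol: the class name `DirectoryService`, its methods, and the load figures are all assumptions made for the example; only the file-to-peer mapping comes from the slide.

```python
from typing import Dict, List, Optional


class DirectoryService:
    """Toy centralized directory: maps a file name to the peers storing it."""

    def __init__(self) -> None:
        self.index: Dict[str, List[str]] = {}

    def register(self, peer: str, files: List[str]) -> None:
        """Record that `peer` currently stores each file in `files`."""
        for name in files:
            holders = self.index.setdefault(name, [])
            if peer not in holders:
                holders.append(peer)

    def lookup(self, name: str, load: Dict[str, int]) -> Optional[str]:
        """Return the least-loaded peer storing `name`, or None if nobody has it."""
        holders = self.index.get(name, [])
        if not holders:
            return None
        return min(holders, key=lambda peer: load.get(peer, 0))


# Directory contents taken from the slide: A: m3; B: m1, m7; C: m5, m8; D: m8.
directory = DirectoryService()
directory.register("m3", ["A"])
directory.register("m1", ["B"])
directory.register("m7", ["B"])
directory.register("m5", ["C"])
directory.register("m8", ["C", "D"])

load = {"m5": 1, "m8": 4}  # hypothetical load figures, not from the slide
peer = directory.lookup("C", load)
print(peer)  # m5
```

After the lookup, the client would download the file from the returned peer directly; the directory service never carries the file data itself.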

The Rise and Fall of Napster
Founded by Shawn Fanning, John Fanning, and Sean Parker
Operated between June 1999 and July 2001
  – More than 26 million users (February 2001)
Several high-profile songs were leaked before being released:
  – Metallica’s “I Disappear” demo song
  – Madonna’s “Music” single
But it also helped make some bands successful (e.g., Radiohead, Dispatch)
(Reemerged as a music store in 2008)
(Source: ers.svg)

The Aftermath
“Recording Industry Association of America (RIAA) Sues Music Startup Napster for $20 Billion” – December 1999
“Napster ordered to remove copyrighted material” – March 2001
Main legal argument:
  – Napster owns the lookup service, so it is directly responsible for disseminating copyrighted material

Gnutella (2000)
What problem does it try to solve?
  – Get around the legal vulnerabilities by getting rid of the directory service
Main idea: flood the request to peers in the system to find the file

Gnutella (2000)
How does request flooding work?
  – Send request to all neighbors
  – Neighbors recursively multicast the request
  – Eventually a machine that has the file receives the request, and it sends back the answer
Advantages:
  – Totally decentralized, highly robust
Disadvantages:
  – Not scalable; the entire network can be swamped with requests (to alleviate this problem, each request has a TTL)

Gnutella: Time To Live (TTL)
When the client (initiator) sends a request, it associates a TTL with the request
When a node forwards the request, it decrements the TTL
When the TTL reaches 0, the request is no longer forwarded
Typically, Gnutella uses TTL = 7
Example: the initiator sends with TTL = 3; successive hops carry TTL = 3, 2, 1, 0, and the node that receives TTL = 0 stops forwarding the request
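
A minimal sketch of TTL-limited flooding, assuming the overlay is given as an adjacency dictionary and that nodes remember requests they have already seen (duplicate suppression is an assumption here, as are the function name `flood` and the chain topology). It follows the rule above: the initiator sends with the initial TTL, each forwarding node decrements it, and a node that receives TTL = 0 does not forward.

```python
from collections import deque
from typing import Dict, List, Set


def flood(graph: Dict[str, List[str]], initiator: str,
          holders: Set[str], ttl: int) -> Set[str]:
    """Return the nodes in `holders` that a flooded request reaches."""
    seen = {initiator}
    hits: Set[str] = set()
    # Each queue entry is (node receiving the request, TTL carried by the request).
    queue = deque((neighbor, ttl) for neighbor in graph[initiator])
    while queue:
        node, remaining = queue.popleft()
        if node in seen:
            continue  # this node already handled the request
        seen.add(node)
        if node in holders:
            hits.add(node)
        if remaining == 0:
            continue  # TTL exhausted: do not forward any further
        for neighbor in graph[node]:
            if neighbor not in seen:
                queue.append((neighbor, remaining - 1))
    return hits


# Hypothetical chain a - b - c - d - e, used only for illustration.
chain = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c", "e"], "e": ["d"]}
print(flood(chain, "a", {"d"}, ttl=2))  # {'d'}: d is 3 hops away
print(flood(chain, "a", {"e"}, ttl=2))  # set(): e is 4 hops away
```

With TTL = 2 the request travels three hops (carrying TTL 2, 1, 0), so a holder four hops away is never reached even though the file exists in the system.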

Gnutella: Example
Assume a client (initiator) asks for file “C”
Assume TTL = 2
(Figure: overlay of peers storing files A, B, C, D, with the initiator marked)

Gnutella: Example
Initiator sends the request to its neighbor(s)…
… which recursively forward the request to their neighbors
After 3 hops the request is dropped
(Figure: the request “C?” spreading from the initiator through the overlay)

Gnutella: Example
If a node has the requested file, it sends a reply back
  – along the reverse path of the request, or
  – directly to the initiator
(Figure: reply “m” travelling back to the initiator)

Gnutella: Example
Initiator requests file “C” from node “m”
  – Initiator may pick one of several machines if it receives multiple replies
  – Typically uses HTTP to retrieve data

Two-Level Hierarchy
What problem does it try to solve?
  – Inefficient search
Main idea: organize the P2P system in a two-level hierarchy
  – Flooding happens only at the top level

Two-Level Hierarchy
KaZaa, subsequent versions of Gnutella
Leaf nodes are connected to a small number of ultrapeers (supernodes)
(Figure: leaf nodes m2, m3, m4, m7, m8, m10, m11, m13, m15, m17 storing files A to D, attached to ultrapeer nodes; m1 also shown)

Two-Level Hierarchy
Each ultrapeer builds a directory for the content stored at its peers
(Figure: five ultrapeer directories: {B: m2, m3; D: m4; …}, {B: m7; D: m8; …}, {C: m10, m11; …}, {C: m13; …}, {A: m15; C: m17; …})

Gnutella: Example
Query: a leaf sends the query to its ultrapeers
If an ultrapeer has the requested content in its directory, the ultrapeer replies immediately
(Figure: the initiator sends query “B?” and receives reply “m2”)

Gnutella: Example
Query: a leaf sends the query to its ultrapeers
If an ultrapeer doesn’t have the content in its directory, the ultrapeer floods the other ultrapeers
(Figure: the initiator sends query “A?”; the query is flooded among ultrapeers and the reply is “m15”)
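
The two query cases can be sketched as follows. The directory contents come from the slides; the ultrapeer names `u1` to `u5`, the function name `query`, and the simplification that a flood reaches every other ultrapeer directly are assumptions made for the illustration.

```python
from typing import Dict, List, Tuple

# One directory per ultrapeer, as on the slides; ultrapeer names are hypothetical.
ULTRAPEERS: Dict[str, Dict[str, List[str]]] = {
    "u1": {"B": ["m2", "m3"], "D": ["m4"]},
    "u2": {"B": ["m7"], "D": ["m8"]},
    "u3": {"C": ["m10", "m11"]},
    "u4": {"C": ["m13"]},
    "u5": {"A": ["m15"], "C": ["m17"]},
}


def query(home: str, name: str) -> Tuple[List[str], int]:
    """Resolve `name` for a leaf attached to ultrapeer `home`.

    Returns (peers storing the file, number of other ultrapeers contacted).
    """
    local = ULTRAPEERS[home].get(name, [])
    if local:
        return list(local), 0  # answered from the local directory, no flooding
    hits: List[str] = []
    contacted = 0
    for ultrapeer, directory in ULTRAPEERS.items():
        if ultrapeer == home:
            continue
        contacted += 1  # flooding stays at the top level only
        hits.extend(directory.get(name, []))
    return hits, contacted


print(query("u1", "B"))  # (['m2', 'm3'], 0)
print(query("u1", "A"))  # (['m15'], 4)
```

Leaf nodes never see flooded queries in this scheme, which is the source of the efficiency gain over flat Gnutella.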

Example: Oct 2003 Crawl on Gnutella
(Figure legend: ultrapeer nodes, leaf nodes)

Distributed Hash Tables (DHTs)
What problem does it solve?
  – Scalable, low-overhead lookup protocol
  – Guarantee a file is found if it is in the system
Abstraction: a distributed hash-table data structure
  – insert(id, item);
  – item = query(id);
  – Note: item can be anything: a data object, document, file, pointer to a file…
Proposals
  – CAN, Chord, Kademlia, Pastry, Viceroy, Tapestry, etc.

DHTs
Partition hash table across nodes
(Figure: hash table partitioned across nodes m1, m2, m3, m4)

DHT: Insertion
Call insert(id14, item14) at m1
  – Find node responsible for id14, i.e., node m3
  – Insert (id14, item14) at m3
(Figure: insert(id14, item14) issued at m1 and stored at m3, among nodes m1 to m4)

DHT: Query
Call query(id14) at m4
  – Find node responsible for id14, i.e., m3
  – Return (id14, item14) to m4
(Figure: query(id14) issued at m4, item14 returned from m3, among nodes m1 to m4)
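
A minimal sketch of the insert/query abstraction, assuming a fixed set of nodes and a simple "hash modulo number of nodes" rule for choosing the responsible node. That rule, the class name `ToyDHT`, and the use of SHA-1 are illustrative assumptions; real DHTs such as Chord or CAN use routing schemes that tolerate nodes joining and leaving.

```python
import hashlib
from typing import Any, Dict, List, Optional


class ToyDHT:
    """Hash table partitioned across a fixed list of nodes."""

    def __init__(self, nodes: List[str]) -> None:
        self.nodes = list(nodes)
        # Each node holds its own slice of the overall table.
        self.tables: Dict[str, Dict[str, Any]] = {n: {} for n in self.nodes}

    def responsible(self, key: str) -> str:
        """Map a key to the node responsible for it (assumed rule: hash mod N)."""
        digest = hashlib.sha1(key.encode("utf-8")).digest()
        return self.nodes[int.from_bytes(digest, "big") % len(self.nodes)]

    def insert(self, key: str, item: Any) -> None:
        """Can be called at any node; the pair is stored at the responsible node."""
        self.tables[self.responsible(key)][key] = item

    def query(self, key: str) -> Optional[Any]:
        """Can be called at any node; the item is fetched from the responsible node."""
        return self.tables[self.responsible(key)].get(key)


dht = ToyDHT(["m1", "m2", "m3", "m4"])
dht.insert("id14", "item14")
print(dht.query("id14"))  # item14
```

The caller never needs to know which node holds the item; both operations go through the same lookup step, which is why the lookup service is the key primitive.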

DHT: Lookup Service
Key primitive: lookup service
  – Given an ID, map it to a host
Scalability: hundreds of thousands or millions of machines
One example: Chord (Lecture 15)
Another example: CAN (next)

Content Addressable Network (CAN)
Associate with each node and item a unique id in a d-dimensional space, e.g., a torus
Assigning a d-dimensional ID to content (file):
  – Use d hash functions (i.e., h_1(), h_2(), …, h_d()) to compute d coordinates using name or content
  – id = (id_1, id_2, …, id_d), where id_i = h_i(content);
Assigning a d-dimensional ID to a new node, n_new:
  1) Pick random id_new
  2) Find node n_1 responsible for id_new
  3) n_1 may pick a new ID for n_new, id'_new, to better split the ID space
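
A minimal sketch of computing a d-dimensional ID with d hash functions. The slides do not say which hash functions are used; here h_i is assumed to be SHA-256 salted with the index i, and the function name `can_id` and the coordinate range `size` are likewise illustrative.

```python
import hashlib
from typing import Tuple


def can_id(content: str, d: int = 2, size: int = 8) -> Tuple[int, ...]:
    """Return (id_1, ..., id_d), where id_i = h_i(content) mod size."""
    coords = []
    for i in range(1, d + 1):
        # h_i: SHA-256 over the index i and the content (an assumed construction).
        digest = hashlib.sha256(f"{i}:{content}".encode("utf-8")).digest()
        coords.append(int.from_bytes(digest[:8], "big") % size)
    return tuple(coords)


point = can_id("song.mp3", d=2, size=8)
print(point)  # a point in the 8 x 8 space; the same name always maps to the same point
```

Because the ID depends only on the name or content, any node can compute where an item lives without consulting a directory.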

CAN Example: Two Dimensional Space
Space divided between nodes
Together, the nodes cover the entire space
Each node covers either a square or a rectangular area with side ratio 1:2 or 2:1
Example:
  – Assume space size (8 x 8)
  – Node n1:(1, 2) is the first node that joins → it covers the entire space

CAN Example: Two Dimensional Space
Node n2:(4, 2) joins → space is divided between n1 and n2

CAN Example: Two Dimensional Space
Node n3:(3, 5) joins → space is divided between n1 and n3

CAN Example: Two Dimensional Space
Nodes n4:(5, 5) and n5:(6, 6) join

CAN Example: Two Dimensional Space
Nodes: n1:(1, 2); n2:(4, 2); n3:(3, 5); n4:(5, 5); n5:(6, 6)
Items: f1:(2, 3); f2:(5, 0); f3:(2, 1); f4:(7, 5)

CAN Example: Two Dimensional Space
Each item is stored by the node that owns the item’s point in the space

CAN: Query Example
Each node knows its neighbors in the d-space
Forward query to the neighbor that is closest to the query id
Example: assume n1 queries an item f
(Figure: nodes n1 to n5 and items f1 to f4 in the two-dimensional space)
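
A minimal sketch of greedy CAN routing in the 8 x 8 example. The zone boundaries below are an assumption: they are one split sequence consistent with the node coordinates on the slides (each joining node halves the zone it lands in), not values given in the lecture. Torus wrap-around is ignored, and choosing f4 as the queried item is also just an illustrative choice.

```python
import math
from typing import Dict, List, Tuple

Zone = Tuple[int, int, int, int]  # (x0, y0, x1, y1), upper bounds exclusive
Point = Tuple[int, int]

# Assumed zones after n1..n5 have joined.
ZONES: Dict[str, Zone] = {
    "n1": (0, 0, 4, 4),
    "n2": (4, 0, 8, 4),
    "n3": (0, 4, 4, 8),
    "n4": (4, 4, 6, 8),
    "n5": (6, 4, 8, 8),
}


def contains(zone: Zone, p: Point) -> bool:
    x0, y0, x1, y1 = zone
    return x0 <= p[0] < x1 and y0 <= p[1] < y1


def owner(p: Point) -> str:
    """Node whose zone contains point p."""
    return next(name for name, zone in ZONES.items() if contains(zone, p))


def are_neighbors(a: Zone, b: Zone) -> bool:
    """True if the two zones share a border segment of positive length."""
    x_overlap = min(a[2], b[2]) - max(a[0], b[0])
    y_overlap = min(a[3], b[3]) - max(a[1], b[1])
    touch_x = a[2] == b[0] or b[2] == a[0]
    touch_y = a[3] == b[1] or b[3] == a[1]
    return (touch_x and y_overlap > 0) or (touch_y and x_overlap > 0)


def distance(zone: Zone, p: Point) -> float:
    """Euclidean distance from point p to the zone rectangle."""
    dx = max(zone[0] - p[0], 0, p[0] - zone[2])
    dy = max(zone[1] - p[1], 0, p[1] - zone[3])
    return math.hypot(dx, dy)


def route(start: str, target: Point) -> List[str]:
    """Forward hop by hop to the unvisited neighbor closest to the target."""
    path = [start]
    while not contains(ZONES[path[-1]], target):
        here = ZONES[path[-1]]
        candidates = [n for n, z in ZONES.items()
                      if n not in path and are_neighbors(here, z)]
        path.append(min(candidates, key=lambda n: distance(ZONES[n], target)))
    return path


print(owner((7, 5)))        # n5 stores f4:(7, 5) under the assumed zones
print(route("n1", (7, 5)))  # ['n1', 'n2', 'n5']
```

Each node only needs its own neighbor list to make the forwarding decision, which is why the routing table stays small.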

Content Addressable Network (CAN)
Properties
  – Routing table size O(d), i.e., each node needs to know about O(d) neighbors
  – Guarantees that a file is found in at most d*n^(1/d) steps, where n is the total number of nodes
Example: grid with n nodes (d = 2), where a file is found in at most 2*sqrt(n) steps

Break

BitTorrent (2001): The Problem
What problem does it try to solve?
  – Efficient distribution of large files (e.g., movies)
Example:
  – Assume we want to distribute an F = 1GB file to 8 other nodes
  – Assume all clients are identical and have an uplink capacity of C = 1 Mbps
  – How long does it take to distribute the file using Napster/Gnutella/KaZaa?

Efficient Large File Distribution
Step 1: copy file from m1 → m2
  – Duration: F/C = 1GB/1Mbps = 8,000sec
Step 2: copy files from m1 → m3 and m2 → m4
  – Duration: 8,000sec

Efficient Large File Distribution
Step 3: copy files from m1 → m5, m2 → m6, m3 → m7, m4 → m8
  – Duration: 8,000sec
Step k: by the end of step k we have transferred the file to 2^k nodes

Efficient Large File Distribution
Total time to transfer file to N machines: log2(N) × F/C, since the number of nodes holding the file doubles at every step
  – If N = 8 → total distribution time = 3 × 8,000sec = 24,000sec
Can we do better? Yes!
  – Divide file into chunks
  – Pipeline chunk distribution
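
The arithmetic for whole-file doubling can be checked with a short sketch. The function name `doubling_time` is an assumption; the figures (F = 1GB taken as 8,000,000,000 bits, C = 1 Mbps) follow the slides.

```python
def doubling_time(n: int, file_bits: int, uplink_bps: int) -> float:
    """Seconds until n nodes hold the file when holders double every step."""
    steps = (n - 1).bit_length()  # equals ceil(log2(n)) for n >= 1
    return steps * file_bits / uplink_bps


F_BITS = 8_000_000_000  # 1 GB expressed in bits
C_BPS = 1_000_000       # 1 Mbps uplink

print(doubling_time(2, F_BITS, C_BPS))  # 8000.0
print(doubling_time(8, F_BITS, C_BPS))  # 24000.0
```

Every step costs a full F/C because a node cannot start uploading until it has received the entire file, which is the inefficiency that chunking removes.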

Efficient Large File Distribution
Divide file into blocks of size B
Use chain distribution
(Figure: chain m1 → m2 → m3 → …; blocks of the file advance one hop per step, Steps 1 to 4)

Efficient Large File Distribution
Example:
  – File size: F = 1GB
  – Block size: B = 1MB
  – # of nodes: N
  – Total time to distribute file:
    » F/C + (N-1)*B/C = 1GB/1Mbps + (N-1)*1MB/1Mbps = 8,000sec + (N-1)*8sec
    » If N = 8 → total distribution time = 8,056sec
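
The pipelined chain formula F/C + (N-1)*B/C can be evaluated directly. The function name `chain_time` is an assumption; sizes are expressed in bits (1GB as 8,000,000,000 bits, 1MB as 8,000,000 bits) to match the slide's figures.

```python
def chain_time(n: int, file_bits: int, block_bits: int, uplink_bps: int) -> float:
    """Seconds to push a file down a chain of n nodes, one block per hop per step."""
    return file_bits / uplink_bps + (n - 1) * block_bits / uplink_bps


F_BITS = 8_000_000_000  # F = 1 GB
B_BITS = 8_000_000      # B = 1 MB
C_BPS = 1_000_000       # C = 1 Mbps

print(chain_time(8, F_BITS, B_BITS, C_BPS))  # 8056.0
```

The (N-1)*B/C term is the pipeline fill delay: each extra node in the chain adds only the time to forward one block, not the whole file.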

BitTorrent (2001)
Allow fast downloads even when sources have low uplink capacity
How does it work?
  – Seed (origin): site storing the file to be downloaded
  – Tracker: server maintaining the list of peers in the system
  – Split each file into pieces (~256 KB each), and each piece into sub-pieces (~16 KB each)
  – The loader loads one piece at a time
  – Within one piece, the loader can load up to five sub-pieces in parallel

BitTorrent: Join Procedure
  – Peer contacts the tracker responsible for the file it wants to download
  – Tracker returns a list of peers (20-50) downloading the same file
  – Peer connects to the peers in the list
(Figure: seed (origin server), tracker, and peers m1 to m7; the joining peer sends “join” to the tracker and receives the list (m1, m2, m5))

BitTorrent: Download Algorithm
Download consists of three phases:
Start: get a piece as soon as possible
  – Select a random piece
Middle: spread all pieces as soon as possible
  – Select rarest piece next
End: avoid getting stuck with a slow source when downloading the last sub-pieces
  – Request the same sub-piece in parallel
  – Cancel slowest downloads once a sub-piece has been received
(For details see:
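
A minimal sketch of the piece-selection rule for the start and middle phases. The function name `next_piece`, the data layout (a set of piece indices per peer), and the lowest-index tie-break are assumptions; the end phase with parallel sub-piece requests is not modeled.

```python
import random
from collections import Counter
from typing import Dict, Optional, Set


def next_piece(have: Set[int], peer_pieces: Dict[str, Set[int]],
               rng: random.Random = random.Random(0)) -> Optional[int]:
    """Pick the next piece to request, or None if peers offer nothing new."""
    counts: Counter = Counter()
    for pieces in peer_pieces.values():
        counts.update(pieces)  # how many peers hold each piece
    wanted = sorted(p for p in counts if p not in have)
    if not wanted:
        return None
    if not have:
        return rng.choice(wanted)  # start phase: any piece, chosen at random
    rarest = min(counts[p] for p in wanted)
    # Middle phase: rarest piece first (ties broken by lowest index).
    return min(p for p in wanted if counts[p] == rarest)


# Hypothetical view of which peers hold which pieces.
peers = {"m1": {0, 1, 2}, "m2": {0, 1}, "m5": {0}}
print(next_piece({0}, peers))        # 2, held by only one peer
print(next_piece({0, 1, 2}, peers))  # None, nothing left to fetch
```

Requesting rare pieces first increases the number of copies of those pieces, so the swarm is less likely to lose a piece when a peer leaves.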

BitTorrent: Handling Freeriders
Free riders: peers that use the network without contributing (with their upstream bandwidth)
Solution: choking, a variant of Tit-for-Tat
  – Each peer has a limited number of upload slots
  – When a peer's upload bandwidth is saturated, it exchanges upload bandwidth for download bandwidth
  – If peer U downloads from peer C and doesn’t upload in return, C chokes download to U
(Figure: C uploads to U; U does not upload to C; C chokes U)
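
A minimal sketch of the choking decision from the point of view of one peer, assuming it tracks how many bytes each neighbor has recently uploaded to it. The function name `choose_unchoked` and the byte counts are illustrative, and real clients also unchoke an occasional peer optimistically, which is not modeled here.

```python
from typing import Dict, Set


def choose_unchoked(received_from: Dict[str, int], upload_slots: int) -> Set[str]:
    """Give the limited upload slots to the peers that uploaded the most to us."""
    contributors = [p for p in received_from if received_from[p] > 0]
    # Highest contribution first; ties broken by name for determinism.
    contributors.sort(key=lambda p: (-received_from[p], p))
    return set(contributors[:upload_slots])


# Hypothetical recent download totals (bytes) seen by peer C.
received = {"U": 0, "m2": 500, "m5": 900, "m7": 100}
unchoked = choose_unchoked(received, upload_slots=2)
print(sorted(unchoked))  # ['m2', 'm5']
print("U" in unchoked)   # False: U uploads nothing, so C chokes it
```

A free rider can still receive data while slots are idle or through other peers, but it loses out whenever contributors compete for the same slots.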

Skype (2003)
Peer-to-peer Internet telephony
Two-level hierarchy (like KaZaa)
  – Ultrapeers used to route traffic between NATed end-hosts…
  – … plus a login server to
    » authenticate users
    » ensure that names are unique across the network
(Figure: hosts A and B and the login server; legend: messages exchanged with login server, data traffic)
(Note*: probable protocol; Skype protocol is not published)

Skype (cont’d)
The most powerful machine is elected as host and acts as a mixer
Example:
  – B elected as host/mixer
  – A and C send their audio streams to B
  – B mixes the missing streams for A and C (B+C for A, A+B for C) and sends the mixed stream to each of them
Two-way call: 36 kb/s
Three-way call: 54 kb/s

Summary
The key challenge of building wide area P2P systems is a scalable and robust directory service
Solutions covered in this lecture
  – Napster: centralized location service
  – Gnutella: broadcast-based decentralized location service
  – CAN, Chord, Tapestry, Pastry: intelligent-routing decentralized solution
    » Guarantee correctness
BitTorrent: efficient distribution of large files
  – Split file into chunks and blocks
  – Parallelize and pipeline transfers