Peer-to-Peer Applications

Peer-to-Peer Applications. Course on Computer Communication and Networks, CTH/GU. The slides are an adaptation of slides by the authors of the course's main textbook, by the authors listed in the bibliography section, and by Jeff Pang.

P2P file sharing. Example: Alice runs a P2P client application on her notebook computer; she intermittently connects to the Internet, getting a new IP address for each connection; she asks for "Hey Jude"; the application displays other peers that have a copy of Hey Jude. Alice chooses one of the peers, Bob. The file is copied from Bob's PC to Alice's notebook over HTTP. While Alice downloads, other users are uploading from Alice. Alice's peer is both a Web client and a transient Web server. All peers are servers = highly scalable!

Intro. P2P has quickly grown in popularity: dozens or hundreds of file sharing applications; many million people worldwide use P2P networks; audio/video transfer now dominates traffic on the Internet. But what is P2P? Searching or location? Computers "peering"? Taking advantage of resources at the edges of the network: end-host resources have increased dramatically, and broadband connectivity is now common.

The Lookup Problem (figure). A publisher node inserts an item (Key="title", Value=MP3 data…) somewhere in the network; a client node issues Lookup("title") over the Internet. There are 1000s of nodes, and the set of nodes may change.

The Lookup Problem (2). Common primitives: Join: how do I begin participating? Publish: how do I advertise my file? Search: how do I find a file? Fetch: how do I retrieve a file?
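
As a rough illustration only (the names below are assumptions, not from any real P2P library), these four primitives can be written as one abstract interface that each system in this lecture implements differently:

```python
# A minimal sketch of the common primitives, with hypothetical names; each
# system discussed in this lecture fills these in differently.
from abc import ABC, abstractmethod
from typing import Iterable


class PeerToPeerNode(ABC):
    """Abstract view of a peer: Join, Publish, Search, Fetch."""

    @abstractmethod
    def join(self, bootstrap_address: str) -> None:
        """Begin participating, given some known entry point."""

    @abstractmethod
    def publish(self, file_name: str) -> None:
        """Advertise a locally stored file to the network."""

    @abstractmethod
    def search(self, file_name: str) -> Iterable[str]:
        """Return addresses of peers believed to hold the file."""

    @abstractmethod
    def fetch(self, file_name: str, peer_address: str) -> bytes:
        """Retrieve the file contents from a chosen peer."""
```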

Overview. Centralized database: Napster. Query flooding: Gnutella. Hierarchical query flooding: KaZaA. Swarming: BitTorrent. Unstructured overlay routing: Freenet. Structured overlay routing: Distributed Hash Tables. Other.

P2P: centralized directory. Original "Napster" design (1999, S. Fanning): 1) when a peer connects, it informs the central directory server of its IP address and content; 2) Alice queries the directory server for "Hey Jude"; 3) Alice requests the file directly from Bob. Problems? Single point of failure; performance bottleneck; copyright infringement.

Napster: Publish (figure). A peer at 123.2.21.23 tells the central server "I have X, Y, and Z!"; the server records insert(X, 123.2.21.23), and so on for each file.

Napster: Search (figure). A client asks the central server "Where is file A?" (Query); the server replies with search(A) --> 123.2.0.18 (Reply); the client then fetches the file directly from 123.2.0.18.

Napster: Discussion. Pros: simple; search scope is O(1); controllable (pro or con?). Cons: server maintains O(N) state; server does all processing; single point of failure.
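
The centralized design reduces to a single index on the server. A minimal in-process sketch (hypothetical names, no wire protocol) shows why publish and search are cheap per file while all state and processing sit on one machine:

```python
# Minimal sketch of a Napster-style central directory (in-process; a real
# server would speak a wire protocol over TCP). Names are illustrative.
from collections import defaultdict


class CentralDirectory:
    def __init__(self):
        # file name -> set of peer addresses that advertised it
        self.index: dict[str, set[str]] = defaultdict(set)

    def publish(self, peer_address: str, file_names: list[str]) -> None:
        """Peer informs the server of the files it holds ("I have X, Y, Z")."""
        for name in file_names:
            self.index[name].add(peer_address)

    def search(self, file_name: str) -> set[str]:
        """O(1) lookup per file, but all state and processing live on one server."""
        return self.index.get(file_name, set())

    def disconnect(self, peer_address: str) -> None:
        """Remove a departing peer's entries (part of the O(N) bookkeeping burden)."""
        for peers in self.index.values():
            peers.discard(peer_address)


# Usage: Bob publishes, Alice searches, then fetches directly from Bob.
directory = CentralDirectory()
directory.publish("123.2.21.23", ["Hey Jude", "Let It Be"])
print(directory.search("Hey Jude"))  # {'123.2.21.23'}
```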

Overview. Centralized database: Napster. Query flooding: Gnutella. Hierarchical query flooding: KaZaA. Swarming: BitTorrent. Unstructured overlay routing: Freenet. Structured overlay routing: Distributed Hash Tables. Other.

Gnutella: History. In 2000, J. Frankel and T. Pepper from Nullsoft released Gnutella. Soon many other clients followed: Bearshare, Morpheus, LimeWire, etc. In 2001, many protocol enhancements were added, including "ultrapeers".

Gnutella: Overview Query Flooding: Join: on startup, client contacts a few other nodes; these become its “neighbors” Publish: no need Search: ask neighbors, who ask their neighbors, and so on... when/if found, reply to sender. Fetch: get the file directly from peer P2P

Gnutella: Search (figure). A Query "Where is file A?" floods from the requester through its neighbors; the node that has file A sends a Reply ("I have file A.") back along the reverse path.

Gnutella: protocol. Query messages are sent over existing TCP connections; peers forward Query messages; a QueryHit is sent back over the reverse path. File transfer is over HTTP. Scalability: limited-scope flooding.
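
A toy, in-process simulation of limited-scope flooding (object references stand in for TCP connections; names are illustrative): each query carries a TTL and a message id so peers forward it at most once, and hits return along the reverse path.

```python
# Toy simulation of Gnutella-style limited-scope flooding (in-process graph
# instead of TCP connections; hypothetical names).


class GnutellaNode:
    def __init__(self, name: str, files: set):
        self.name = name
        self.files = files
        self.neighbors: list["GnutellaNode"] = []
        self.seen: set[int] = set()  # message IDs already forwarded

    def query(self, msg_id: int, file_name: str, ttl: int, path: list):
        """Flood the query to neighbors; return hits found within the TTL."""
        if msg_id in self.seen or ttl < 0:
            return []
        self.seen.add(msg_id)
        hits = []
        if file_name in self.files:
            # The QueryHit travels back along the reverse path (here: the path list).
            hits.append((self.name, path + [self.name]))
        for neighbor in self.neighbors:
            hits.extend(neighbor.query(msg_id, file_name, ttl - 1, path + [self.name]))
        return hits


# Usage: build a small overlay and flood a query with TTL 3.
a, b, c = GnutellaNode("A", set()), GnutellaNode("B", set()), GnutellaNode("C", {"fileA"})
a.neighbors, b.neighbors = [b], [c]
print(a.query(msg_id=1, file_name="fileA", ttl=3, path=[]))  # [('C', ['A', 'B', 'C'])]
```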

Gnutella: More on peer joining. Joining peer X must find some other peer in the Gnutella network (e.g., from gnutellahosts.com): it uses a list of candidate peers. X sequentially attempts to open TCP connections with peers on the list until a connection is set up with some peer Y. X sends a Ping message to Y; Y forwards (floods) the Ping message. All peers receiving the Ping message respond with a Pong message. X receives many Pong messages and can then (later on, if needed) set up additional TCP connections.

Query flooding: Gnutella. Fully distributed: no central server; public-domain protocol; many Gnutella clients implement the protocol. Overlay network: there is an edge between peers X and Y if there is a TCP connection between them; all active peers and edges form the overlay network. An edge is not a physical link. A given peer is typically connected to fewer than 10 overlay neighbors. What is routing in P2P networks?

Gnutella: Discussion. Pros: fully decentralized; search cost distributed. Cons: search scope is O(N); search time is O(???); nodes leave often, so the network is unstable. Improvements: limiting the depth of search; random walks instead of flooding.

Aside: Search Time? P2P

Aside: All Peers Equal? (figure). Peers differ widely in capacity: 56 kbps modem, 1.5 Mbps DSL, 10 Mbps LAN.

Overview. Centralized database: Napster. Query flooding: Gnutella. Hierarchical query flooding: KaZaA. Swarming: BitTorrent. Unstructured overlay routing: Freenet. Structured overlay routing: Distributed Hash Tables. Other.

KaZaA: History. In 2001, KaZaA was created by the Dutch company Kazaa BV. It used a single network called FastTrack, shared by other clients as well: Morpheus, giFT, etc. Eventually the protocol changed so other clients could no longer talk to it. It became a popular file sharing network with >10 million users (the number varies).

KaZaA: Overview “Smart” Query Flooding: Join: on startup, client contacts a “supernode” ... may at some point become one itself Publish: send list of files to supernode Search: send query to supernode, supernodes flood query amongst themselves. Fetch: get the file directly from peer(s); can fetch simultaneously from multiple peers P2P

KaZaA: Network Design “Super Nodes” P2P

KaZaA: File Insert (figure). A peer at 123.2.21.23 publishes "I have X!" to its supernode, which records insert(X, 123.2.21.23).

KaZaA: File Search (figure). The query "Where is file A?" goes to the supernode; supernodes flood it among themselves and reply with search(A) --> 123.2.0.18, 123.2.22.50.

KaZaA: Discussion. Pros: tries to take into account node heterogeneity: bandwidth, host computational resources, host availability (?); rumored to take into account network locality. Cons: mechanisms are easy to circumvent; still no real guarantees on search scope or search time. The same P2P architecture is used by Skype.

Overview. Centralized database: Napster. Query flooding: Gnutella. Hierarchical query flooding: KaZaA. Swarming: BitTorrent. Unstructured overlay routing: Freenet. Structured overlay routing: Distributed Hash Tables. Other.

BitTorrent: History In 2002, B. Cohen debuted BitTorrent Key Motivation: Popularity exhibits temporal locality (Flash Crowds) E.g., Slashdot effect, CNN on 9/11, new movie/game release Focused on Efficient Fetching, not Searching: Distribute the same file to all peers Single publisher, multiple downloaders Has some “real” publishers: Blizzard Entertainment using it to distribute games P2P

BitTorrent: Overview Swarming: Join: contact centralized “tracker” server, get a list of peers. Publish: Run a tracker server. Search: Out-of-band. E.g., use Google to find a tracker for the file you want. Fetch: Download chunks of the file from your peers. Upload chunks you have to them. P2P

File distribution: BitTorrent (figure). P2P file distribution. Tracker: tracks peers participating in the torrent. Torrent: group of peers exchanging chunks of a file. A joining peer obtains a list of peers from the tracker and begins trading chunks with them.

BitTorrent (1) file divided into 256KB chunks. peer joining torrent: has no chunks, but will accumulate them over time registers with tracker to get list of peers, connects to subset of peers (“neighbors”) while downloading, peer uploads chunks to other peers. peers may come and go once peer has entire file, it may (selfishly) leave or (altruistically) remain 2: Application Layer

BitTorrent (2). Pulling chunks: at any given time, different peers have different subsets of file chunks; periodically, a peer (Alice) asks each neighbor for the list of chunks they have; Alice sends requests for her missing chunks, rarest first. Sending chunks: tit-for-tat; Alice sends chunks to the four neighbors currently sending her chunks at the highest rate; she re-evaluates the top 4 every 10 secs; every 30 secs she randomly selects another peer and starts sending it chunks ("optimistic unchoke"); the newly chosen peer may join the top 4.
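
The two per-peer decisions on this slide, which chunk to request (rarest first) and whom to upload to (top four plus an optimistic unchoke), can be sketched as below; the data structures and names are assumptions for illustration, not BitTorrent wire-protocol code.

```python
# Sketch of the two BitTorrent peer decisions described above; names and
# data structures are illustrative, not an actual protocol implementation.
import random
from collections import Counter


def rarest_first(my_chunks: set, neighbor_chunks: dict):
    """Pick the missing chunk held by the fewest neighbors."""
    counts = Counter()
    for chunks in neighbor_chunks.values():
        for c in chunks - my_chunks:
            counts[c] += 1
    if not counts:
        return None
    return min(counts, key=lambda c: counts[c])


def choose_unchoked(upload_rate_from: dict, k: int = 4) -> set:
    """Unchoke the k peers sending to us fastest, plus one optimistic unchoke."""
    top = sorted(upload_rate_from, key=upload_rate_from.get, reverse=True)[:k]
    others = [p for p in upload_rate_from if p not in top]
    if others:
        top.append(random.choice(others))  # re-chosen every ~30 s in practice
    return set(top)


# Usage: chunk 2 is held by only one neighbor, so it is requested first.
print(rarest_first({0}, {"peerA": {1, 2}, "peerB": {1}}))  # 2
```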

BitTorrent: Tit-for-tat (1) Alice “optimistically unchokes” Bob (2) Alice becomes one of Bob’s top-four providers; Bob reciprocates (3) Bob becomes one of Alice’s top-four providers With higher upload rate, can find better trading partners & get file faster! 2: Application Layer

BitTorrent: Publish/Join Tracker P2P

BitTorrent: Fetch P2P

BitTorrent: Sharing Strategy “Tit-for-tat” sharing strategy “I’ll share with you if you share with me” Be optimistic: occasionally let freeloaders download Otherwise no one would ever start! Also allows you to discover better peers to download from when they reciprocate Approximates Pareto Efficiency Game Theory: “No change can make anyone better off without making others worse off” P2P

BitTorrent: Summary. Pros: works reasonably well in practice; gives peers an incentive to share resources and avoids freeloaders. Cons: a central tracker server is needed to bootstrap the swarm.

BTW: File Distribution: Server-Client vs P2P. Question: how much time does it take to distribute a file of size F from one server to N peers? Notation: u_s = server upload bandwidth; u_i = peer i upload bandwidth; d_i = peer i download bandwidth; the network core has abundant bandwidth.

BTW: File distribution time: server-client. The server sequentially sends N copies, taking NF/u_s time; client i takes F/d_i time to download. So d_cs = max { NF/u_s , F/min_i(d_i) }. The time to distribute F to N clients using the client/server approach increases linearly in N (for large N).

BTW: File distribution time: P2P. The server must send one copy, taking F/u_s time; client i takes F/d_i time to download; NF bits must be downloaded in aggregate, and the fastest possible aggregate upload rate is u_s + Σ_i u_i. So d_P2P = max { F/u_s , F/min_i(d_i) , NF/(u_s + Σ_i u_i) }.

Server-client vs. P2P: example (figure: distribution time vs. N for the two approaches). Client upload rate = u, F/u = 1 hour, u_s = 10u, d_min ≥ u_s.
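
A quick numerical sketch of the two bounds under the example's assumptions (every peer uploads at u, F/u = 1 hour, u_s = 10u, d_min ≥ u_s, so the download terms never dominate); this just reproduces the shape of the missing plot:

```python
# Evaluate the lower bounds from the two previous slides under the example's
# parameters. Times are in hours; d_min >= u_s, so the F/d_min terms are omitted.
F_over_u = 1.0       # F/u = 1 hour: time for one client to upload the whole file
u_s_over_u = 10.0    # the server uploads 10x faster than a client

for N in (1, 5, 10, 20, 30):
    d_cs = N * F_over_u / u_s_over_u                    # NF/u_s
    d_p2p = max(F_over_u / u_s_over_u,                  # F/u_s
                N * F_over_u / (u_s_over_u + N))        # NF/(u_s + N*u)
    print(f"N={N:3d}  client-server={d_cs:5.2f} h  P2P={d_p2p:5.2f} h")
```

The client-server time grows linearly with N, while the P2P time stays below one hour because every new downloader also adds upload capacity.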

Overview. Centralized database: Napster. Query flooding: Gnutella. Hierarchical query flooding: KaZaA. Swarming: BitTorrent. Unstructured overlay routing: Freenet. Structured overlay routing: Distributed Hash Tables. Other.

Freenet: History. In 1999, I. Clarke started the Freenet project. Basic idea: employ shortest-path-like routing on the overlay network to publish and locate files. Additional goals: provide anonymity and security; make censorship difficult.

Freenet: Overview Routed Queries: Join: on startup, client contacts a few other nodes it knows about; gets a unique node id Publish: route file contents toward the file id. File is stored at node with id closest to file id Search: route query for file id toward the closest node id (DFS+caching in freenet vs. BFS+nocaching in Gnutella) Fetch: when query reaches a node containing file id, it returns the file to the sender P2P

Freenet: Routing Tables. Each routing-table entry has: id – file identifier (e.g., hash of the file); next_hop – another node that stores the file id; file – the file identified by id, if stored on the local node. Forwarding of a query for file id: if the file id is stored locally, stop and forward the data back to the upstream requestor; if not, search for the "closest" id in the table and forward the message to the corresponding next_hop; if the data is not found, failure is reported back, and the requestor then tries the next closest match in its routing table.
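
The forwarding rule above amounts to a depth-first search over routing-table entries ordered by closeness to the requested id, with caching on the return path; the sketch below uses hypothetical names and in-process calls instead of messages.

```python
# Sketch of the forwarding rule above: depth-first search toward the closest
# known id, with backtracking and caching on the return path. Hypothetical
# names; real Freenet exchanges messages rather than calling methods.


class FreenetNode:
    def __init__(self, node_id: int):
        self.node_id = node_id
        self.store: dict = {}   # file id -> file contents stored locally
        self.table: dict = {}   # file id -> next_hop FreenetNode

    def lookup(self, file_id: int, visited=None):
        if visited is None:
            visited = set()
        if self.node_id in visited:
            return None                     # already tried this node: back off
        visited.add(self.node_id)
        if file_id in self.store:
            return self.store[file_id]      # found locally: data goes upstream
        # Try routing-table entries in order of closeness to the requested id.
        for known_id in sorted(self.table, key=lambda k: abs(k - file_id)):
            data = self.table[known_id].lookup(file_id, visited)
            if data is not None:
                self.store[file_id] = data  # cache on the return path
                return data
        return None                         # failure reported back to requestor
```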

Freenet: Routing (figure). Example: a query(10) is forwarded hop by hop toward the closest id in each node's routing table, backtracking on failure, and the file is cached along the return path. Freenet uses DFS + caching, vs. BFS + no caching in Gnutella.

Freenet: Routing Properties. "Close" file ids tend to be stored on the same node. Why? Publications of similar file ids route toward the same place. The network tends to be a "small world": a small number of nodes have a large number of neighbors (i.e., ~"six degrees of separation"). Consequence: most queries only traverse a small number of hops to find the file (cf. also the 80-20 rule).

Freenet: Discussion. Pros: intelligent routing makes queries relatively short; search scope is small (only nodes along the search path are involved); no flooding. Cons: still no provable guarantees! +/- anonymity: anonymity properties may give you "plausible deniability", but anonymity features make the system hard to measure and debug.

Overview. Centralized database: Napster. Query flooding: Gnutella. Hierarchical query flooding: KaZaA. Swarming: BitTorrent. Unstructured overlay routing: Freenet. Structured overlay routing: Distributed Hash Tables. Other.

DHT: History. In 2000-2001, academic researchers said "we want to play too!" Motivation: frustrated by the popularity of all these "half-baked" P2P apps :) We can do better! (so we said): guaranteed lookup success for files in the system; provable bounds on search time; provable scalability to millions of nodes. A hot topic in networking ever since.

Motivation (figure). How to find data in a distributed file sharing system? A publisher holds (Key="LetItBe", Value=MP3 data); a client issues Lookup("LetItBe") across the Internet. Lookup is a key problem.

Centralized Solution (figure). A central server (Napster) holds the database: the publisher registers (Key="LetItBe", Value=MP3 data) with it, and the client's Lookup("LetItBe") goes to the server. Requires O(M) state and is a single point of failure.

Distributed Solution (1) (figure). Flooding (Gnutella, Morpheus, etc.): the client's Lookup("LetItBe") is flooded to the nodes until it reaches the publisher of (Key="LetItBe", Value=MP3 data). Worst case: O(N) messages per lookup.

Distributed Solution (2) (figure). Routed messages (Freenet, Tapestry, Chord, CAN, etc.): the client's Lookup("LetItBe") is routed toward the node storing (Key="LetItBe", Value=MP3 data). Only exact matches are supported.

Routing Challenges Define a useful key nearness metric Keep the hop count small Keep the routing tables “right size” Stay robust despite rapid changes in membership P2P

DHT: Overview Abstraction: a distributed “hash-table” (DHT) data structure: put(id, item); item = get(id); Implementation: nodes in system form a distributed data structure Can be Ring, Tree, Hypercube, Skip List, Butterfly Network, ... P2P

DHT: Overview (2) Structured Overlay Routing: Join: On startup, contact a “bootstrap” node and integrate yourself into the distributed data structure; get a node id Publish: Route publication for file id toward a close node id along the data structure Search: Route a query for file id toward a close node id. Data structure guarantees that query will meet the publication. Fetch: Two options: Publication contains actual file => fetch from where query stops Publication says “I have file X” => query tells you 128.2.1.3 has X, use IP routing to get X from 128.2.1.3 P2P

Chord Overview. Provides a peer-to-peer hash lookup service: Lookup(key) → IP address. How does Chord locate a node? How does Chord maintain routing tables? How does Chord cope with changes in membership?

Chord IDs. m-bit identifier space for both keys and nodes. Key identifier = SHA-1(key), e.g., Key="LetItBe" → ID=60. Node identifier = SHA-1(IP address), e.g., IP="198.10.10.1" → ID=123. Both are uniformly distributed. How to map key IDs to node IDs?

Consistent Hashing [Karger 97] (figure: circular 7-bit ID space with keys K5, K20, K60, K101 and nodes N32, N90, N123). A key is stored at its successor: the node with the next higher ID.
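
A minimal sketch of this mapping, with helper names of my own choosing: both keys and node addresses are hashed into the same m-bit circular space, and a key is assigned to the first node at or after it on the ring.

```python
# Minimal consistent-hashing sketch: hash keys and node addresses into the
# same m-bit circular id space; a key lives at its successor node.
import hashlib
from bisect import bisect_left

M = 7  # bits in the identifier space (2**7 = 128 ids, as in the figure)


def chord_id(value: str, m: int = M) -> int:
    """Map a key or an IP address onto the ring (SHA-1 truncated to m bits)."""
    return int(hashlib.sha1(value.encode()).hexdigest(), 16) % (2 ** m)


def successor(key_id: int, node_ids: list) -> int:
    """First node id at or after key_id, wrapping around the ring."""
    ring = sorted(node_ids)
    i = bisect_left(ring, key_id)
    return ring[i % len(ring)]


nodes = [chord_id(ip) for ip in ("198.10.10.1", "10.0.0.2", "10.0.0.3")]
print(successor(chord_id("LetItBe"), nodes))  # the node responsible for the key
```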

Consistent Hashing (figure). If every node knows of every other node: requires global information; routing tables are large, O(N); lookups are fast, O(1). Example: "Where is LetItBe?" Hash("LetItBe") = K60, and any node can answer "N90 has K60" directly.

Chord: Basic Lookup (figure). Every node knows only its successor in the ring: "Where is LetItBe?" Hash("LetItBe") = K60; the query is passed around the ring until it reaches N90, which answers "N90 has K60". This requires O(N) time.

"Finger Tables". Every node knows m other nodes in the ring, at exponentially increasing distances (figure: node N80 keeps fingers at 80 + 2^0, 80 + 2^1, ..., 80 + 2^6, pointing at nodes such as N96 and N112).

"Finger Tables" (2). Finger i points to the successor of n + 2^i (figure: the same example, now showing which node each of N80's fingers points to).

Lookups are Faster. Lookups take O(log N) hops and resemble binary search (figure: Lookup(K19) starting at N80).
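
A small self-contained simulation of finger-table lookup, with node ids assigned directly rather than via SHA-1 and names of my own choosing: each hop forwards to the closest preceding finger, roughly halving the remaining distance, so the hop count comes out logarithmic.

```python
# Simulation sketch of finger-table lookup; hand-picked node ids, illustrative
# names. The point is the O(log N) hop count.
M = 7  # bits in the identifier space


def in_interval(x: int, a: int, b: int) -> bool:
    """True if x lies in the circular interval (a, b]."""
    if a < b:
        return a < x <= b
    return x > a or x <= b


class ChordNode:
    def __init__(self, node_id: int):
        self.node_id = node_id
        self.fingers: list["ChordNode"] = []  # fingers[i] = successor(id + 2**i)

    def closest_preceding_finger(self, key_id: int) -> "ChordNode":
        for finger in reversed(self.fingers):
            if finger.node_id != key_id and in_interval(finger.node_id, self.node_id, key_id):
                return finger
        return self

    def find_successor(self, key_id: int, hops: int = 0):
        succ = self.fingers[0]  # fingers[0] is the immediate successor
        if in_interval(key_id, self.node_id, succ.node_id):
            return succ, hops + 1
        nxt = self.closest_preceding_finger(key_id)
        if nxt is self:
            return succ, hops + 1
        return nxt.find_successor(key_id, hops + 1)


def build_ring(node_ids: list) -> dict:
    """Build all nodes and fill in their finger tables directly."""
    ids = sorted(node_ids)
    nodes = {i: ChordNode(i) for i in ids}

    def successor_of(x: int) -> ChordNode:
        x %= 2 ** M
        return nodes[next((i for i in ids if i >= x), ids[0])]

    for n in nodes.values():
        n.fingers = [successor_of(n.node_id + 2 ** i) for i in range(M)]
    return nodes


ring = build_ring([16, 32, 45, 80, 96, 112])
node, hops = ring[80].find_successor(19)   # Lookup(K19) starting at N80
print(node.node_id, hops)                  # 32 2: N32 is responsible, reached in 2 hops
```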

Chord properties Efficient: O(Log N) messages per lookup N is the total number of servers Scalable: O(Log N) state per node Robust: survives massive changes in membership Proofs are in paper / tech report Assuming no malicious participants P2P

DHT: Chord Summary. Routing table size? log N fingers. Routing time? Each hop is expected to halve the distance to the desired id, so we expect O(log N) hops.

DHT: Discussion. Pros: guaranteed lookup; O(log N) per-node state and search scope. Cons: no one uses them? (only one file sharing app); supporting non-exact-match search is hard.

Overview. Centralized database: Napster. Query flooding: Gnutella. Hierarchical query flooding: KaZaA. Swarming: BitTorrent. Unstructured overlay routing: Freenet. Structured overlay routing: Distributed Hash Tables. Other.

Other P2P networks. Content-Addressable Network (CAN): topological routing (k-dimensional grid). Tapestry, Pastry, DKS: ring organization as in Chord, but with other measures of key closeness and k-ary search instead of the binary-search paradigm. Viceroy: butterfly-type overlay network. ... SETI@home: for sharing computational resources. Skype.

P2P Case study: Skype. Inherently P2P: pairs of users communicate. Proprietary application-layer protocol (inferred via reverse engineering). Hierarchical overlay with supernodes (SNs): Skype clients (SC) connect to SNs and to a Skype login server. An index maps usernames to IP addresses and is distributed over the SNs.

Peers as relays. Problem: when both Alice and Bob are behind "NATs", the NAT prevents an outside peer from initiating a call to an inside peer. Solution: using Alice's and Bob's SNs, a relay is chosen; each peer initiates a session with the relay; the peers can now communicate through NATs via the relay.

P2P: Summary. Observe: application-layer networking (sublayers...); many different styles of organization (data, routing), each with pros and cons. Lessons learned: single points of failure are bad; flooding messages to everyone is bad; the underlying network topology is important; not all nodes are equal; incentives are needed to discourage freeloading; privacy and security are important; structure can provide bounds and guarantees.

Bibliography
Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek, Hari Balakrishnan. Chord: A Scalable Peer-To-Peer Lookup Service for Internet Applications. Proceedings of ACM SIGCOMM, 2001.
Sylvia Ratnasamy, Paul Francis, Mark Handley, Richard Karp, Scott Shenker. A Scalable Content-Addressable Network. Proceedings of ACM SIGCOMM, 2001.
M. A. Jovanovic, F. S. Annexstein, and K. A. Berman. Scalability Issues in Large Peer-to-Peer Networks - A Case Study of Gnutella. University of Cincinnati, Laboratory for Networks and Applied Graph Theory, 2001.
Frank Dabek, Emma Brunskill, M. Frans Kaashoek, David Karger, Robert Morris, Ion Stoica, Hari Balakrishnan. Building Peer-to-Peer Systems With Chord, a Distributed Lookup Service. Proceedings of the 8th Workshop on Hot Topics in Operating Systems (HotOS-VIII), 2001.
Ian Clarke, Oskar Sandberg, Brandon Wiley, and Theodore W. Hong. Freenet: A Distributed Anonymous Information Storage and Retrieval System. Designing Privacy Enhancing Technologies: International Workshop on Design Issues in Anonymity and Unobservability, LNCS 2009, Springer Verlag, 2001.

Some extra notes P2P

Chord: Joining the Ring Three step process: Initialize all fingers of new node Update fingers of existing nodes Transfer keys from successor to new node Less aggressive mechanism (lazy finger update): Initialize only the finger to successor node Periodically verify immediate successor, predecessor Periodically refresh finger table entries P2P
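
The "less aggressive" maintenance described above corresponds to periodic stabilization. The sketch below captures only the idea (a two-node example, hypothetical method names, no finger refresh), not the paper's exact pseudocode.

```python
# Sketch of the lazy maintenance idea only: a node joins knowing just a
# successor, and periodic stabilize()/notify() rounds repair the
# successor/predecessor links. Names and structure are illustrative.


def between(x: int, a: int, b: int) -> bool:
    """True if x lies strictly inside the circular interval (a, b)."""
    if a < b:
        return a < x < b
    return x > a or x < b


class MaintainedNode:
    def __init__(self, node_id: int):
        self.node_id = node_id
        self.successor: "MaintainedNode" = self
        self.predecessor = None

    def join(self, existing: "MaintainedNode") -> None:
        """Lazy join: learn only a successor; stabilization fixes the rest."""
        node = existing
        # Walk the ring to find the node we fit after (a real implementation
        # would use finger-table lookups instead of a linear walk).
        while node.successor is not node and not between(
            self.node_id, node.node_id, node.successor.node_id
        ):
            node = node.successor
        self.successor = node.successor

    def stabilize(self) -> None:
        """Periodically verify the immediate successor; adopt a closer one."""
        x = self.successor.predecessor
        if x is not None and between(x.node_id, self.node_id, self.successor.node_id):
            self.successor = x
        self.successor.notify(self)

    def notify(self, candidate: "MaintainedNode") -> None:
        """Accept a closer predecessor when one announces itself."""
        if self.predecessor is None or between(
            candidate.node_id, self.predecessor.node_id, self.node_id
        ):
            self.predecessor = candidate


# Usage: a two-node ring forms after a couple of stabilization rounds.
n1, n2 = MaintainedNode(10), MaintainedNode(80)
n2.join(n1)
for _ in range(2):
    n2.stabilize()
    n1.stabilize()
print(n1.successor.node_id, n2.successor.node_id)  # 80 10
```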

Joining the Ring - Step 1 (figure). Initialize the new node's finger table: locate any node p in the ring; ask node p to look up the fingers of the new node N36 (e.g., Lookup(37, 38, 40, …, 100, 164)); return the results to the new node.

Joining the Ring - Step 2 (figure). Update the fingers of existing nodes: the new node calls an update function on existing nodes; existing nodes can recursively update the fingers of other nodes.

Joining the Ring - Step 3 (figure). Transfer keys from the successor node to the new node; only keys in the new node's range are transferred (e.g., copy keys 21..36 from N40 to N36).

Chord: Handling Failures Use successor list Each node knows r immediate successors After failure, will know first live successor Correct successors guarantee correct lookups Guarantee is with some probability Can choose r to make probability of lookup failure arbitrarily small P2P

Chord: Evaluation Overview Quick lookup in large systems Low variation in lookup costs Robust despite massive failure Experiments confirm theoretical results P2P