Peer-to-Peer Networking Marina Papatriantafilou CSE Department, Distributed Computing and Systems group Ack: Many of the slides are adaptation of slides.

Slides:



Advertisements
Similar presentations
Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek, Hari Balakrishnan MIT and Berkeley presented by Daniel Figueiredo Chord: A Scalable Peer-to-peer.
Advertisements

Peer to Peer and Distributed Hash Tables
Course on Computer Communication and Networks Lecture 10 Chapter 2; peer-to-peer applications (and network overlays) EDA344/DIT 420, CTH/GU.
Kademlia: A Peer-to-peer Information System Based on the XOR Metric.
Chapter 2 Application Layer Computer Networking: A Top Down Approach, 5 th edition. Jim Kurose, Keith Ross Addison-Wesley, April A note on the use.
Application Layer 2-1 Chapter 2 Application Layer Computer Networking: A Top Down Approach 6 th edition Jim Kurose, Keith Ross Addison-Wesley March 2012.
CompSci 356: Computer Network Architectures Lecture 21: Content Distribution Chapter 9.4 Xiaowei Yang
Presented by Elisavet Kozyri. A distributed application architecture that partitions tasks or work loads between peers Main actions: Find the owner of.
Road Map Application basics Web FTP DNS P2P DHT.
No Class on Friday There will be NO class on: FRIDAY 1/30/15.
Peer-to-Peer Jeff Pang Spring Spring 2004, Jeff Pang2 Intro Quickly grown in popularity –Dozens or hundreds of file sharing applications.
Cis e-commerce -- lecture #6: Content Distribution Networks and P2P (based on notes from Dr Peter McBurney © )
2: Application Layer P2P applications and Sockets.
Peer-to-Peer Applications
Chord-over-Chord Overlay Sudhindra Rao Ph.D Qualifier Exam Department of ECECS.
CSE 124 Networked Services Fall 2009 B. S. Manoj, Ph.D 11/03/2009CSE 124 Network Services FA 2009 Some of these.
Topics in Reliable Distributed Systems Fall Dr. Idit Keidar.
Computer Communications Peer to Peer networking Ack: Many of the slides are adaptations of slides by authors in the bibliography section.
CS 640: Introduction to Computer Networks Yu-Chi Lai Lecture 18 - Peer-to-Peer.
Introduction 1 Lecture 8 Application Layer (DNS, p2p) slides are modified from J. Kurose & K. Ross University of Nevada – Reno Computer Science & Engineering.
CSE 461 University of Washington1 Topic Peer-to-peer content delivery – Runs without dedicated infrastructure – BitTorrent as an example Peer.
P2P File Sharing Systems
Data Communications and Computer Networks Chapter 2 CS 3830 Lecture 10 Omar Meqdadi Department of Computer Science and Software Engineering University.
1 Lecture05: Application layer r Principles of network applications r DNS r P2P and DHT.
By Shobana Padmanabhan Sep 12, 2007 CSE 473 Class #4: P2P Section 2.6 of textbook (some pictures here are from the book)
Application Layer – Peer-to-peer UIUC CS438: Communication Networks Summer 2014 Fred Douglas Slides: Fred, Kurose&Ross (sometimes edited)
Peer-to-Peer Overlay Networks. Outline Overview of P2P overlay networks Applications of overlay networks Classification of overlay networks – Structured.
Content Distribution March 6, : Application Layer1.
1 P2P Computing. 2 What is P2P? Server-Client model.
Introduction of P2P systems
Computer Networks CSE 434 Fall 2009 Sandeep K. S. Gupta Arizona State University Research Experience.
Chapter 2: Application layer
2: Application Layer1 Chapter 2: Application layer r 2.1 Principles of network applications r 2.2 Web and HTTP r 2.3 FTP r 2.4 Electronic Mail  SMTP,
P2P Networking and Content Distribution
2: Application Layer1 Chapter 2 outline r 2.1 Principles of app layer protocols r 2.2 Web and HTTP r 2.3 FTP r 2.4 Electronic Mail r 2.5 DNS r 2.6 Socket.
Content Distribution March 2, : Application Layer1.
Application Layer 2-1 Chapter 2 Application Layer Computer Networking: A Top Down Approach 6 th edition Jim Kurose, Keith Ross Addison-Wesley March 2012.
1 Peer-to-Peer Systems r Application-layer architectures r Case study: BitTorrent r P2P Search and Distributed Hash Table (DHT)
2: Application Layer1 Chapter 2: Application layer r 2.1 Principles of network applications  app architectures  app requirements r 2.2 Web and HTTP r.
Peer-to-Peer (P2P) networks and applications. What is P2P? r “the sharing of computer resources and services by direct exchange of information”
Peer-to-Peer File Sharing Jennifer Rexford COS 461: Computer Networks Lectures: MW 10-10:50am in Architecture N101
Peer-to-Peer Networks Hongli Luo CEIT, IPFW. r Topics m Application architecture m P2P file sharing m P2P networks: Napster Gnutella KaAzA Bittorrent.
Lecture 2 Distributed Hash Table
1 Peer-to-Peer Technologies Seminar by: Kunal Goswami (05IT6006) School of Information Technology Guided by: Prof. C.R.Mandal, School of Information Technology.
CS 3830 Day 10 Introduction 1-1. Announcements r Quiz #2 this Friday r Program 2 posted yesterday 2: Application Layer 2.
Peer to Peer A Survey and comparison of peer-to-peer overlay network schemes And so on… Chulhyun Park
2: Application Layer1 Chapter 2 Application Layer Computer Networking: A Top Down Approach 6 th edition Jim Kurose, Keith Ross Addison-Wesley March 2012.
CS 640: Introduction to Computer Networks Aditya Akella Lecture 24 - Peer-to-Peer.
2: Application Layer 1 Chapter 2 Application Layer Computer Networking: A Top Down Approach, 5 th edition. Jim Kurose, Keith Ross Addison-Wesley, April.
Computer Networks CSE 434 Fall 2009 Sandeep K. S. Gupta Arizona State University Research Experience.
Peer-to-Peer File Sharing
ADVANCED COMPUTER NETWORKS Peer-Peer (P2P) Networks 1.
LOOKING UP DATA IN P2P SYSTEMS Hari Balakrishnan M. Frans Kaashoek David Karger Robert Morris Ion Stoica MIT LCS.
INTERNET TECHNOLOGIES Week 10 Peer to Peer Paradigm 1.
No Class on Friday There will be NO class on: FRIDAY 1/29/15 1.
2: Application Layer 1 CMPT 371 Data Communications and Networking Chapter 2 Application Layer - 2.
CS 347Notes081 CS 347: Parallel and Distributed Data Management Notes 08: P2P Systems.
P2P Search COP6731 Advanced Database Systems. P2P Computing  Powerful personal computer Share computing resources P2P Computing  Advantages: Shared.
P2P Search COP P2P Search Techniques Centralized P2P systems  e.g. Napster, Decentralized & unstructured P2P systems  e.g. Gnutella.
PEAR TO PEAR PROTOCOL. Pure P2P architecture no always-on server arbitrary end systems directly communicate peers are intermittently connected and change.
Marina Papatriantafilou – Overlays and peer-to-peer applications Based on the book Computer Networking: A Top Down Approach, Jim Kurose, Keith Ross, Addison-Wesley.
CS Spring 2010 CS 414 – Multimedia Systems Design Lecture 24 – Introduction to Peer-to-Peer (P2P) Systems Klara Nahrstedt (presented by Long Vu)
05 - P2P applications and Sockets
An example of peer-to-peer application
Server-client vs. P2P: example
Part 4: Peer to Peer - P2P Applications
Pure P2P architecture no always-on server
Chapter 2 Application Layer
#02 Peer to Peer Networking
Presentation transcript:

Peer-to-Peer Networking Marina Papatriantafilou CSE Department, Distributed Computing and Systems group Ack: Many of the slides are adaptation of slides by authors in the bibliography section and by the authors of the course’s main textbook, J.F Kurose and K.W. Ross, All Rights Reserved 1996-2009 P2P

Intro Quickly grown in popularity But what is P2P? Dozens or hundreds of file sharing applications many million people worldwide use P2P networks Audio/Video transfer now dominates traffic on the Internet But what is P2P? Searching or location? Computers “Peering”? Take advantage of resources at the edges of the network End-host resources have increased dramatically Broadband connectivity now common P2P

Pure P2P architecture no always-on server arbitrary end systems directly communicate peers are intermittently connected and change IP addresses Three topics: Searching for information File distribution Case Study: Skype peer-peer 2: Application Layer

First steps in p2p file sharing/lookup Centralized Database Napster Query Flooding Gnutella Hierarchical Query Flooding KaZaA … P2P

P2P: centralized directory directory server peers Alice Bob 1 2 3 original “Napster” design (1999, S. Fanning) 1) when peer connects, it informs central server: IP address, content 2) Alice queries directory server for “Boulevard of Broken Dreams” 3) Alice requests file from Bob Problems? Single point of failure Performance bottleneck Copyright infringement P2P

Napster: Publish insert(X, 123.2.21.23) ... Publish I have X, Y, and Z! 123.2.21.23 P2P

Napster: Search 123.2.0.18 Fetch search(A) --> 123.2.0.18 Query Reply Where is file A? P2P

Napster: Discussion +, - ? +: Simple Search scope is O(1) -: Server maintains O(N) State Server does all processing Single point of failure P2P

First steps in p2p file sharing/lookup Centralized Database Napster Query Flooding Gnutella Hierarchical Query Flooding KaZaA … P2P

Gnutella: Overview Query Flooding: Join: on startup, client contacts a few other nodes; these become its “neighbors” Publish: no need Search: ask neighbors, who ask their neighbors, and so on... when/if found, reply to sender. Fetch: get the file directly from peer P2P

Gnutella: Search I have file A. Reply Query Where is file A? P2P

Gnutella: protocol Query message sent over existing TCP connections File transfer: HTTP Query message sent over existing TCP connections peers forward Query message QueryHit sent over reverse path Query QueryHit Scalability: limited scope flooding P2P

Query flooding: Gnutella Pros: Fully de-centralized Search cost distributed Cons: Search scope is O(N) Search time is O(???) But can limit to some distance Sensitive to churn overlay network: edge between peer X and Y if there’s a TCP connection all active peers and edges is overlay net Edge is not a physical link Given peer will typically be connected with < 10 overlay neighbors What is routing in p2p info-sharing networks? P2P

First steps in p2p file sharing/lookup Centralized Database Napster Query Flooding Gnutella Hierarchical Query Flooding KaZaA … P2P

KaZaA: Overview “Smart” Query Flooding: Join: on startup, client contacts a “supernode” ... may at some point become one itself Publish: send list of files to supernode Search: send query to supernode, supernodes flood query amongst themselves. Fetch: get the file directly from peer(s); can fetch simultaneously from multiple peers P2P

KaZaA: Network Design “Super Nodes” P2P

KaZaA: File Insert insert(X, 123.2.21.23) ... Publish I have X! P2P

KaZaA: File Search search(A) --> 123.2.0.18 123.2.22.50 Replies Query Where is file A? P2P

KaZaA: Discussion Pros: Tries to take into account node heterogeneity: Bandwidth Host Computational Resources Host Availability (?) Rumored to take into account network locality Cons: Still no real guarantees on search scope or search time P2P architecture used by Skype, Joost (communication, video distribution p2p systems) P2P

Next steps in p2p networking Centralized Database Napster Query Flooding Gnutella Hierarchical Query Flooding KaZaA … Academia: “we can show how to do this better” :) Motivation: Frustrated by popularity of all these “half-baked” P2P apps :) Guaranteed lookup success for files in system Provable bounds on search time Provable scalability to millions of node Hot Topic in networking ever since P2P

Next steps in p2p netwoking Structured Overlay Organization and Routing Distributed Hash Tables Swarming BitTorrent … P2P

Distributed Hash Table (DHT) DHT = distributed P2P database Database has (key, value) pairs; key: ss number; value: human name key: content type; value: IP address Peers query DB with key DB returns values that match the key Peers can also insert (key, value) peers

DHT Identifiers: e.g: Assign integer identifier to each peer in range [0,2n-1]. Each identifier can be represented by n bits. Require each key to be an integer in same range. To get integer keys, hash original key. eg, key = h(“Led Zeppelin IV”) This is why they call it a distributed “hash” table

How to assign keys to peers? E.g. Central issue: Assigning (key, value) pairs to peers. Rule: assign key to the peer that has the closest ID. Convention in lecture: closest is the immediate successor of the key. Ex: n=4; peers: 1,3,4,5,8,10,12,14; key = 13, then successor peer = 14 key = 15, then successor peer = 1

Circular DHT (1) 1 3 4 5 8 10 12 15 Each peer only aware of immediate successor and predecessor. “Overlay network”

Circle DHT (2) 0001 0011 1111 0100 1100 0101 1010 1000 O(N) messages on avg to resolve query, when there are N peers Who’s resp for key 1110 ? I am 0011 1111 1110 0100 1110 1110 1100 1110 0101 1110 Define closest as closest successor 1110 1010 1000

Circular DHT with Shortcuts 1 3 4 5 8 10 12 15 Who’s resp for key 1110? Each peer keeps track of IP addresses of predecessor, successor, + shortcuts. Possible to design shortcuts so O(log N) neighbors, O(log N) messages in query: Query simulates binary search (divide and conquer) Reduced from 6 to 2 messages.

Peer Churn Peer 5 abruptly leaves 1 3 4 5 8 10 12 15 To handle peer churn, require each peer to know the IP address of its two successors. Each peer periodically pings its two successors to see if they are still alive. Peer 5 abruptly leaves Peer 4 detects; makes 8 its immediate successor; asks 8 who its immediate successor is; makes 8’s immediate successor its second successor. What if peer 13 wants to join?

Common DHT structures for P2P overlays/searching Chord: ring with ”chords” (i.e. Shortcuts), works as binary tree Content-Addressable Network (CAN) topological routing (k-dimensional grid) Tapestry, Pastry, DKS: ring organization as Chord, other measure of key closeness, k-ary search instead of binary search paradigm Kademlia: tree-based overlay Viceroy: butterfly-type overlay network .... P2P

Next steps in p2p netwoking Structured Overlay Organization and Routing Distributed Hash Tables Swarming BitTorrent … P2P

All Peers Equal? 56kbps Modem 10Mbps LAN 1.5Mbps DSL P2P

BitTorrent: Overview Swarming: Join: contact centralized “tracker” server, get a list of peers. Publish: can run a tracker server. Search: Out-of-band. E.g., use Google, some DHT, etc to find a tracker for the file you want. Get list of peers to contact for assembling the file in chunks Fetch: (FOCUS)Download chunks of the file from your peers. Upload chunks you have to them. P2P

File distribution: BitTorrent P2P file distribution tracker: tracks peers participating in torrent torrent: group of peers exchanging chunks of a file obtain list of peers trading chunks peer 2: Application Layer

BitTorrent: Tit-for-tat (1) Alice “optimistically unchokes” Bob (2) Alice becomes one of Bob’s top-four providers; Bob reciprocates (3) Bob becomes one of Alice’s top-four providers With higher upload rate, can find better trading partners & get file faster!

BitTorrent – joining a torrent new leecher metadata file website 1 2 join peer list 3 data request seed/leecher tracker 4 Peers divided into: seeds: have the entire file leechers: still downloading 1. obtain the metadata file 2. contact the tracker 3. obtain a peer list (contains seeds & leechers) 4. contact peers from that list for data

BitTorrent – exchanging data leecher B leecher A I have ! seed leecher C ● Verify pieces using hashes ● Download sub-pieces in parallel ● Advertise received pieces to the entire peer list ● Look for the rarest pieces

BitTorrent - unchoking leecher B leecher A seed leecher C leecher D ● Periodically calculate data-receiving rates ● Upload to (unchoke) the fastest downloaders ● Optimistic unchoking ▪ periodically select a peer at random and upload to it ▪ continuously look for the fastest partners

BitTorrent (1) file divided into 256KB chunks. peer joining torrent: has no chunks, but will accumulate them over time registers with tracker to get list of peers, connects to subset of peers (“neighbors”) while downloading, peer uploads chunks to other peers. peers may come and go once peer has entire file, it may (selfishly) leave or (altruistically) remain

BitTorrent (2) Sending Chunks: tit-for-tat Pulling Chunks at any given time, different peers have different subsets of file chunks periodically, a peer (Alice) asks each neighbor for list of chunks that they have. Alice sends requests for her missing chunks rarest first Sending Chunks: tit-for-tat Alice sends chunks to (4) neighbors currently sending her chunks at the highest rate re-evaluate top 4 every 10 secs every 30 secs: randomly select another peer, starts sending chunks newly chosen peer may join top 4 “optimistically unchoke”

BitTorrent: Sharing Strategy “Tit-for-tat” sharing strategy “I’ll share with you if you share with me” Be optimistic: occasionally let freeloaders download Otherwise no one would ever start! Also allows you to discover better peers to download from when they reciprocate Approximates Pareto Efficiency Game Theory: “No change can make anyone better off without making others worse off” P2P

BitTorrent: Discussion Pros: Works reasonably well in practice Gives peers incentive to share resources; avoids freeloaders to some extend Cons: Central tracker server needed to bootstrap swarm P2P

Discussion bittorrent, gaming, fairness, etc Gaming Incentives, tuning of behaviour Other issues: sybil attacks Literature, evolution: includes currency, economic games (but brings problems with inflation, etc like in real economics systems) Evolving literature, including economic and social sciences Related issue: information dissemination? P2P

File Distribution: Server-Client vs P2P Question : How much time to distribute file from one server to N peers? us: server upload bandwidth Server ui: peer i upload bandwidth u1 d1 u2 d2 us di: peer i download bandwidth File, size F dN Network (with abundant bandwidth) uN 2: Application Layer

File distribution time: server-client server sequentially sends N copies: NF/us time client i takes F/di time to download F u2 u1 d1 d2 us Network (with abundant bandwidth) dN uN = dcs depends on max { NF/us, F/min(di) } i Time to distribute F to N clients using client/server approach increases linearly in N (for large N) 2: Application Layer

File distribution time: P2P Server server must send one copy: F/us time client i takes F/di time to download NF bits must be downloaded (aggregate) F u2 u1 d1 d2 us Network (with abundant bandwidth) dN uN fastest possible upload rate: us + Sui dP2P depends on max { F/us, F/min(di) , NF/(us + Sui) } i 2: Application Layer

Server-client vs. P2P: example Client upload rate = u, F/u = 1 hour, us = 10u, dmin ≥ us 2: Application Layer

Overview First steps in p2p, file sharing Next steps: structure, performance (DHTs, bit-torrent) NEXT ? … Observe: Application-layer networking P2P

P2P – not only sharing files… Content delivery, software publication Streaming media applications Distributed computations (volunteer computing) Portal systems Distributed search engines Collaborative platforms Communication networks Social applications Other overlay-related applications.... Overlay: a network implemented on top of a network E.g. Peer-to-peer networks, ”backbones” in adhoc networks, transportaiton network overlays, ...

P2P Case study: Skype inherently P2P: pairs of users communicate. Skype clients (SC) inherently P2P: pairs of users communicate. proprietary application-layer protocol (inferred via reverse engineering) hierarchical overlay with SNs Index maps usernames to IP addresses; distributed over SNs Skype login server Supernode (SN) 2: Application Layer

Peers as relays Problem when both Alice and Bob are behind “NATs”. NAT prevents an outside peer from initiating a call to insider peer Solution: Using Alice’s and Bob’s SNs, Relay is chosen Each peer initiates session with relay. Peers can now communicate through NATs via relay 2: Application Layer

More on examples : New power grids Natural overlays

Bibliography (arbitrary order) Do Incentives build Robustness in BitTorrent?, Michael Piatek, Tomas Isdal, Thomas Anderson, Arvind Krishnamurthy and Arun Venkataramani, NSDI 2007. Law and Economics: The Prisoners’ Dilemma mason.gmu.edu/~fbuckley/documents/PrisonersDilemma.ppt Exploiting BitTorrent For Fun (But Not Profit) iptps06.cs.ucsb.edu/talks/Liogkas_BitTorrent.ppt Aberer’s coursenotes http://lsirwww.epfl.ch/courses/dis/2007ws/lecture/week%208%20P2P%20systems-general.pdf http://lsirwww.epfl.ch/courses/dis/2007ws/lecture/week%209%20Structured%20Overlay%20Networks.pdf Chord presentation by Cristine Kiefer, MPII Saarbruecken www.mpi-inf.mpg.de/departments/d5/teaching/ws03_04/p2p-data/11-18-writeup1.pdf www.mpi-inf.mpg.de/departments/d5/teaching/ws03_04/p2p-data/11-18-paper1.ppt Kurose, Ross: Computer Networking, a top-down approach, AdisonWesley 2009 P2P

Bibliography (cont) Kademlia: A Peer to peer information system Based on the XOR Metric. Petar Maymounkov and David Mazières , 1st International Workshop on Peer-to-peer Systems, 2002. Pastry: Scalable, distributed object location and routing for large-scale peer-to-peer systems, A. Rowstron and P. Druschel, IFIP/ACM International Conference on Distributed Systems Platforms (Middleware), November 2001. Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications, Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek, and Hari Balakrishnan. ACM SIGCOMM 2001, San Deigo, CA, August 2001, pp. 149-160. A Scalable Content-Addressable Network, S. Ratnasamy, P. Francis, M. Handley, R. Karp, and S. Shenker, Sigcomm 2001, San Diego, CA, USA, August, 2001. Incentives build Robustness in BitTorrent, Bram Cohen. Workshop on Economics of Peer-to-Peer Systems, 2003. Ian Clarke, Oskar Sandberg, Brandon Wiley, and Theodore W. Hong. Freenet: A Distributed Anonymous Information Storage and Retrieval System. Int’l Workshop on Design Issues in Anonymity and Unobservability. LLNCS 2009. Springer Verlag 2001. Viceroy: A Scalable and Dynamic Emulation of the Butterfly. By D. Malkhi, M. Naor and D. Ratajczak. In Proceedings of the 21st ACM Symposium on Principles of Distributed Computing (PODC '02), August 2002. Postscript. P2P

Bibliography cont. http://en.wikipedia.org/wiki/Chord_(DHT) http://en.wikipedia.org/wiki/Tapestry_(DHT) http://en.wikipedia.org/wiki/Pastry_(DHT) http://en.wikipedia.org/wiki/Kademlia http://en.wikipedia.org/wiki/Content_addressable_network http://en.wikipedia.org/wiki/Comparison_of_file_sharing_applications P2P