NETE4631 Network Information Systems (NISs): Peer-to-Peer (P2P)
Suronapee, PhD



P2P
- Peer-to-peer: “direct” connections between peers
  - Peers are all equal: each is both a sender and a receiver of content
- P2P core principles
  - Self-organizing: no central management, peers are completely independent
  - Large collection of resources: millions of simultaneous users, voluntary participation
  - Scalability with respect to the number of nodes

P2P principle
- P2P is an overlay network on top of the Internet
  - A virtual network on top of the underlying IP network
[Figure: overlay graph]

Overlay network [figure]

Peer-to-Peer (P2P) Systems
- Old idea: USENET news service (still in use)
- Popular around 1999: Napster, Kazaa and Gnutella for sharing files and music
- 2001: BitTorrent launched; heavily used for file and music sharing, and still very popular for sharing multimedia content
- 2003: Skype launched (by the creators of Kazaa); acquired by eBay in 2005 and by Microsoft in 2011
- BitTorrent: 30% of Internet traffic (mid-2000s)
- Skype: 663M users (2010), 700M minutes a day
- Problem: free riders, who only consume and do not contribute

Current State of P2P
- P2P networks are going strong all over the world
- At the time of these slides, P2P was said to account for almost 70% of network traffic
- P2P networks are mostly used for illegal sharing of copyrighted material: music, videos, software, ...
- Content providers are not happy: they sue companies making P2P software (e.g., Napster), software developers (Winny), and users sharing material

P2P Application
- The P2P principle is applicable to many kinds of systems
- Content distribution
  - Most current P2P systems target one application: file sharing
  - Users share files (e.g., music, video, software) and others download them
  - Content is also often shared illegally (except BitTorrent)
  - Examples: BitTorrent, Napster, Gnutella, KaZaA
  - From academia: Chord
- Communication
  - Example: Skype

Napster
- Launched in 1999 by Shawn Fanning
- The term “P2P” was coined by Napster
- In 2000: 25% of traffic out of the Univ. of Wisconsin-Madison; 60M users
- Centralized real-time directory, distributed files (mostly MP3 music)
- Based in the USA; lawsuits put it out of business
  - RIAA sued Napster, asking $100K per download
  - Charge: indirectly helping users to infringe copyright
- Currently a paid service
  - Pays a percentage to songwriters and music companies, as copyright requires
- The Napster protocol is open, so anyone is free to develop for it

Napster
- Client connects to the Napster server
- Client uploads the list of music files it wants to share
- Server stores no files
- Server maintains a list of the shared files and the hosts offering them
[Figure: structure]

Napster search
- Client sends the server keywords to search with
- Server returns a list of hosts (tuples) to the client
- Client pings each host in the list to find transfer rates
- Client fetches the file from the best host
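The search steps above can be sketched as a small in-memory directory. This is a minimal illustration, not Napster's actual protocol: the class and function names, the substring keyword match, and the `measure_rate` callback standing in for the "ping" step are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class DirectoryServer:
    """Central index mapping filename -> set of (ip, port) hosts.

    The server stores no files, only who claims to share them.
    """
    index: dict = field(default_factory=dict)

    def upload_list(self, host, filenames):
        # A client announces the files it is willing to share.
        for name in filenames:
            self.index.setdefault(name.lower(), set()).add(host)

    def search(self, keyword):
        # Return every host sharing a file whose name contains the keyword.
        kw = keyword.lower()
        hosts = set()
        for name, owners in self.index.items():
            if kw in name:
                hosts |= owners
        return sorted(hosts)


def pick_best_host(hosts, measure_rate):
    """Client side: probe each host and keep the one with the best rate."""
    return max(hosts, key=measure_rate)
```

A client would call `upload_list` once after connecting, then `search` and `pick_best_host` before fetching the file directly from the chosen peer.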

Napster Problem
- Centralized server is a source of congestion
- Centralized server is a single point of failure
- Napster.com was declared responsible for users’ copyright violations (“indirect infringement”)
- Next system: Gnutella

Gnutella
- Eliminates the servers
- Clients search and retrieve among themselves
- Clients act as servers too, and are called servents
- Released in 2000 by AOL; 88K users by 2003

How a peer joins the network
- To join, a peer needs the address of another peer that is currently a member
- New peer sends a connect message to the existing peer: GNUTELLA CONNECT
- Reply is simply “OK”

Gnutella search
- Gnutella routes different messages within the overlay graph
- Gnutella protocol has 5 main message types
  - Query (search)
  - QueryHit (response to query)
  - Ping (to probe network for other peers)
  - Pong (reply to ping, contains address of another peer)
  - Push (used to initiate file transfer)

Gnutella Message Header Format [figure]
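The header figure is not preserved in this transcript. As a hedged sketch, the layout below follows the Gnutella Protocol Specification v0.4: a 16-byte descriptor ID, a 1-byte payload descriptor, 1-byte TTL, 1-byte hops, and a 4-byte little-endian payload length (23 bytes in total). The function names are illustrative only.

```python
import struct

# Payload descriptor values as listed in the Gnutella v0.4 specification.
PING, PONG, PUSH, QUERY, QUERY_HIT = 0x00, 0x01, 0x40, 0x80, 0x81

# "<" = little-endian with no padding:
# 16-byte ID, payload descriptor, TTL, hops, payload length.
HEADER = struct.Struct("<16sBBBI")


def pack_header(descriptor_id, payload_type, ttl, hops, payload_length):
    """Serialize one descriptor header into its 23-byte wire form."""
    return HEADER.pack(descriptor_id, payload_type, ttl, hops, payload_length)


def unpack_header(data):
    """Parse the first 23 bytes of a message into a dict of header fields."""
    descriptor_id, payload_type, ttl, hops, length = HEADER.unpack(
        data[:HEADER.size]
    )
    return {
        "descriptor_id": descriptor_id,
        "payload_type": payload_type,
        "ttl": ttl,
        "hops": hops,
        "payload_length": length,
    }
```

A sender would typically generate the descriptor ID with something like `uuid.uuid4().bytes`, so that each message can be recognized when it comes back around the overlay.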

Flooding query message [figure: Query message]

How do search results come back? [figure]

Avoiding excessive traffic
- To avoid duplicate transmissions, each peer maintains a list of recently received messages
- Query is forwarded to all neighbors except the peer from which it was received
- Each Query (identified by DescriptorID) is forwarded only once
- QueryHit is routed back only to the peer from which the Query with the same DescriptorID was received
- Duplicates with the same DescriptorID and Payload descriptor (message type) are dropped
- QueryHit with a DescriptorID for which no Query was seen is dropped
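The forwarding rules above can be tried out in a small in-process simulation. This is a sketch under simplifying assumptions: peers call each other directly instead of using sockets, a query matches only an exact file name, and the `Peer` class and its method names are hypothetical.

```python
class Peer:
    def __init__(self, name, files=()):
        self.name = name
        self.files = set(files)
        self.neighbors = []
        # DescriptorID -> peer the Query arrived from (None if we started it).
        self.seen = {}
        self.hits = []     # responders reported back to us as requestor
        self.dropped = 0   # duplicate Queries and unexpected QueryHits

    def connect(self, other):
        self.neighbors.append(other)
        other.neighbors.append(self)

    def start_query(self, descriptor_id, keyword, ttl):
        self.seen[descriptor_id] = None
        for n in self.neighbors:
            n.on_query(descriptor_id, keyword, ttl, self)

    def on_query(self, descriptor_id, keyword, ttl, sender):
        if descriptor_id in self.seen:
            self.dropped += 1          # each Query is handled only once
            return
        self.seen[descriptor_id] = sender
        if keyword in self.files:
            # Answer along the reverse path, starting with the sender.
            sender.on_query_hit(descriptor_id, self.name, self)
        if ttl > 1:
            for n in self.neighbors:
                if n is not sender:    # never send back where it came from
                    n.on_query(descriptor_id, keyword, ttl - 1, self)

    def on_query_hit(self, descriptor_id, responder, sender):
        if descriptor_id not in self.seen:
            self.dropped += 1          # QueryHit for a Query we never saw
            return
        previous_hop = self.seen[descriptor_id]
        if previous_hop is None:
            self.hits.append(responder)
        else:
            previous_hop.on_query_hit(descriptor_id, responder, self)
```

Keeping the previous hop per DescriptorID is what lets a QueryHit retrace the Query's path without any peer knowing the full route.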

After receiving QueryHit messages
- Requestor chooses the “best” QueryHit responder
  - Initiates an HTTP request directly to the responder’s ip+port
- Responder then replies with file packets after this message:

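Dealing with Firewalls
- When the responder is behind a firewall that blocks incoming connections, the requestor sends a Push to the responder asking for the file transfer
- Responder establishes a TCP connection to the ip_address and port specified in the Push, and sends a message identifying the file
- Requestor then sends GET to the responder (as before) and the file is transferred as explained earlier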

PING-PONG
- Peers initiate Pings periodically
- Pings are flooded out like Queries; Pongs are routed along the reverse path like QueryHits
- Pong replies are used to update the set of neighboring peers
  - This keeps neighbor lists fresh in spite of peers joining, leaving and failing

Problem
- Flooding a query is extremely inefficient
  - Wastes a lot of network and peer resources
  - Repeated searches with the same keywords
  - Solution:
- Gnutella’s network management is not efficient
  - Periodic Ping/Pongs consume a lot of resources
  - Ping/Pong constituted 50% of traffic
- Modem-connected hosts do not have enough bandwidth for passing Gnutella traffic
- Another solution: the FastTrack system

FastTrack
- Hybrid between Gnutella and Napster
- Takes advantage of “healthier” participants in the system
- Underlying technology in Kazaa, KazaaLite, Grokster
- Like Gnutella, but with some peers designated as supernodes

FastTrack (2)
- A supernode stores a directory listing for a subset of nearby peers, similar to Napster servers
- Supernode membership changes over time
- Any peer can become (and stay) a supernode, provided it has earned enough reputation
  - KazaaLite: participation level (= reputation) of a user is between 0 and 1000, initially 10, then affected by length of periods of connectivity and total number of uploads
  - More sophisticated reputation schemes have been invented, especially ones based on economics (see the P2PEcon workshop)
- A peer searches by contacting a nearby supernode

Strength
- Combines good points from Napster and Gnutella
  - Efficient searching under each supernode
  - Flooding restricted to supernodes only
  - Result: efficient searching with “low” resource usage
- Most popular network
  - A lot of content, a lot of users
- Most file-sharing networks of the time adopted this architecture

BitTorrent
- Developed by Bram Cohen in 2001
- Written in Python, available on many platforms
- BitTorrent is a new approach for sharing large files
  - Distributed directories, distributed files
- Each file is divided into chunks
  - Each chunk is 32 KB to 256 KB
  - Each chunk can traverse a different path
- BitTorrent is also widely used for legal content
  - For example, Linux distributions, software patches, official movie releases
- Currently there is a lot of illegal content on BitTorrent too
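A minimal sketch of dividing a file into fixed-size chunks follows. The 256 KB default comes from the slides; the per-chunk SHA-1 digest reflects how BitTorrent metadata lets a downloader verify each piece, and the function names and the returned dictionary layout are assumptions for illustration, not the real .torrent format.

```python
import hashlib

CHUNK_SIZE = 256 * 1024  # 256 KB, the typical size quoted in the slides


def split_into_chunks(data, chunk_size=CHUNK_SIZE):
    """Cut a byte string into consecutive chunks; the last may be shorter."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def describe(name, data, chunk_size=CHUNK_SIZE):
    """Build illustrative metadata: name, size, chunk count, chunk hashes."""
    chunks = split_into_chunks(data, chunk_size)
    return {
        "name": name,
        "size": len(data),
        "num_chunks": len(chunks),
        # One digest per chunk, so each chunk can be checked on arrival.
        "chunk_hashes": [hashlib.sha1(c).hexdigest() for c in chunks],
    }
```

Because every chunk is independently verifiable, different chunks can be fetched from different peers and reassembled in any order.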

Topology of BitTorrent
- Overlay graph
  - (1) physical
  - (2) neighboring peer
  - (3) peering relationship
- A tracker
  - A server which tracks the currently active clients
  - Serves as a centralized directory
- Topology can be changed regularly
- Tracker factors: content, distance, peer churn, randomization

BitTorrent: Players
- Three entities are needed to start distribution of a file
- Terminology
  - A “torrent” file: the metadata about the file
  - Seed: client with a complete copy of the file
  - Leecher: client still downloading the file

BitTorrent Start Up
- New client gets the torrent file and gets a peer list from the tracker

BitTorrent Operation [figure]

Summary of BitTorrent operation
- A new peer A receives a .torrent file from one of the BitTorrent web servers, including the name, size, and number of chunks of a particular file, together with the IP address and port number of the corresponding tracker.
- It then registers with the right tracker. It will also periodically send keep-alive messages to the tracker.
- The tracker sends peer A a list of potential peers (peer set = 50 peers).
- Peer A selects a subset of five peers (following the tit-for-tat and randomization rules) and establishes connections with them.
- Peer A downloads chunks from peers in the peer set and provides them with its own chunks (possibly in parallel)
  - Chunks are typically 256 KB
  - Starting with the rarest chunks
- Every now and then, each peer updates its peer list.

Peering construction methods
- Tracker suggests a set of 50 peers
- The new peer picks 5 peers (at this time) for exchanging chunks
  - Content is exchanged evenly between them (rarest chunk first)
- A peer serves 4 peers in the peer set simultaneously (tit-for-tat)
  - Seeks the 4 best downloaders in the last time slot if it is a seed
  - Seeks the 4 best uploaders in the last time slot if it is a leecher
- The fifth peer is selected randomly (randomization)
- Choking: limit the number of neighbors with concurrent uploads or downloads to at most a fixed number
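The "4 best plus 1 random" rule above can be sketched as a single selection function. This is an illustration rather than the reference client's choking algorithm: `rates` is assumed to hold each neighbor's measured rate in the last time slot (upload rate to us for a leecher, download rate from us for a seed), and the function name is hypothetical.

```python
import random


def choose_unchoked(rates, regular_slots=4, rng=random):
    """Pick the peers to serve in the next time slot.

    rates: dict mapping peer -> rate observed in the last time slot.
    Returns (best, optimistic): the top `regular_slots` peers by rate
    (tit-for-tat) and one randomly chosen peer from the rest
    (randomization), or None if no other peer is left.
    """
    ranked = sorted(rates, key=rates.get, reverse=True)
    best = ranked[:regular_slots]
    rest = ranked[regular_slots:]
    optimistic = rng.choice(rest) if rest else None
    return best, optimistic
```

The random fifth slot gives newcomers and slow peers a chance to receive data and, in turn, to prove themselves as good partners.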

Strength
- Tit-for-tat
  - A peer serves peers that serve it
  - Encourages cooperation, discourages free-riding
- Rarest chunk first
  - Prefer early download of blocks that are least replicated among neighbors
  - Avoids the problem that most peers have most of the chunks but all must wait for the few rare chunks
- Randomization
  - Avoids unfairness toward nodes with little upload capacity
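Rarest-first selection can be sketched by counting how many neighbors hold each chunk and requesting the least replicated ones first. The data shapes here (sets of chunk indices per neighbor) and the tie-break by chunk index are assumptions made for the example.

```python
from collections import Counter


def rarest_first(my_chunks, neighbor_chunks):
    """Order the chunks we still need from rarest to most common.

    my_chunks: set of chunk indices we already hold.
    neighbor_chunks: dict mapping neighbor -> set of chunk indices it holds.
    Ties are broken by chunk index to keep the result deterministic.
    """
    counts = Counter()
    for chunks in neighbor_chunks.values():
        counts.update(chunks)
    wanted = [c for c in counts if c not in my_chunks]
    return sorted(wanted, key=lambda c: (counts[c], c))
```

For instance, if chunk 2 is held by one neighbor and chunk 1 by three, chunk 2 is requested first so that it gets replicated before its only holder leaves.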

Weakness
- File needs to be quite large
  - 256 KB chunks
  - Rarest-first needs a large number of chunks
- Everyone must contribute
  - Low-bandwidth clients are at a disadvantage

How can BitTorrent be free?
- It leverages peer uplink capacities to send chunks of files to each other without deploying many media servers.
- P2P is used for sharing content in BitTorrent.
- Scalable?
  - Many nodes can be added as the network scales up, without creating a bottleneck

P2P versus client-server architecture
- Client-server architecture
  - Each client requests data from the server
  - Clients do not help each other
- P2P
  - A peer is both a sender and a receiver of content
  - Peers help each other in a distributed manner
  - Data transmission is distributed, although the control plane for signaling is centralized

Summary
- Most existing P2P networks are built on searching, but searching does not scale in the same way
  - Either a centralized system with all its problems
  - Or a distributed system with all its problems
  - Hybrid systems cannot guarantee discovery either
- Alternatively, use addressing instead of searching
  - Distributed hash tables (DHTs): efficient searching and object location in a P2P network
  - Examples: Chord, CAN, Plaxton, Pastry, Tapestry
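The idea of addressing instead of searching can be illustrated with a Chord-style identifier ring, where a key is stored at the first node whose identifier is equal to or follows the key's identifier. This sketch assumes a global view of all nodes and omits Chord's finger tables and join/leave handling; the 16-bit identifier space and all names are chosen for readability only.

```python
import hashlib
from bisect import bisect_left

M = 16  # identifier bits; deliberately small for this illustration


def ring_id(key, m=M):
    """Hash a string onto the identifier ring [0, 2**m)."""
    digest = hashlib.sha1(key.encode()).digest()
    return int.from_bytes(digest, "big") % (2 ** m)


class Ring:
    def __init__(self, node_names):
        # Nodes sorted by their position on the ring.
        self.nodes = sorted((ring_id(n), n) for n in node_names)
        self.ids = [i for i, _ in self.nodes]

    def successor_of_id(self, k):
        """First node at or after identifier k, wrapping past the top."""
        pos = bisect_left(self.ids, k)
        return self.nodes[pos % len(self.nodes)][1]

    def successor(self, key):
        """Node responsible for a key: no flooding, just a computed address."""
        return self.successor_of_id(ring_id(key))
```

Every peer that hashes the same key reaches the same node, which is why a DHT can guarantee that an existing object is found, unlike a TTL-limited flood.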
