NETE4631 Network Information Systems (NISs): Peer-to-Peer (P2P) Suronapee, PhD 1.


1 NETE4631 Network Information Systems (NISs): Peer-to-Peer (P2P)
Suronapee, PhD
suronape@mut.ac.th

2 P2P
- Peer-to-peer: “direct” connections between peers
- Peers are all equal: each is both a sender and a receiver of content
- P2P core principles
  - Self-organizing: no central management; peers are completely independent
  - Large collection of resources: millions of simultaneous users, voluntary participation
  - Scalability: scales with respect to the number of nodes

3 P2P principle
- P2P is an overlay network (on top of the Internet)
- A virtual network on top of the underlying IP network
[Figure: Overlay graph]

4 Overlay network

5 Peer-to-Peer (P2P) Systems
- Old ideas
  - 1979: USENET news service (still in use)
- Popular around 1999
  - Napster, Kazaa and Gnutella for sharing files, music, ...
- ’01: BitTorrent launched; heavily used for file and music sharing
- ’03: Skype launched (by the creators of Kazaa); later acquired by eBay (’05) and Microsoft (’11)
- Still very popular today for sharing multimedia content
  - BitTorrent: 30% of Internet traffic (mid 2000s)
  - Skype: 663M users (2010), 700M minutes a day
- Problem: free riders, who only consume and do not contribute

6 Current State of P2P
- P2P networks are going strong all over the world
- Currently P2P accounts for almost 70% of network traffic
- P2P networks are currently used mostly for illegal sharing of copyrighted material
  - Music, videos, software, ...
- Content providers are not so happy
  - They sue companies making P2P software (e.g., Napster), software developers (Winny), and users sharing material

7 P2P Application
- The P2P principle is applicable to many kinds of systems
- Content distribution
  - Most current P2P systems target one application: file sharing
  - Users share files (e.g., music, video, software) and others download them
  - Content is also often shared illegally (except BitTorrent)
  - Examples: BitTorrent, Napster, Gnutella, KaZaA
  - From academia: Chord
- Communication
  - Skype

8 Napster
- Napster was launched in 1999 by Shawn Fanning
- The term “P2P” was coined by Napster
- In 2000: 25% of traffic out of Univ. of Wisconsin-Madison; 60M users
- Centralized real-time directory, distributed files, mostly MP3 music
- Based in the USA; lawsuits put it out of business
  - RIAA sues Napster, asking $100K per download
  - Indirectly helping users to infringe copyright
- Currently a paid service
  - Pays a percentage to songwriters and music companies, as copyright requires
- The Napster protocol is open; people are free to develop clients

9 Napster
- Connect to the Napster server
- Upload the list of music files that you want to share
- Server stores no files
- Maintain a list of
[Figure: Structure]

10 Napster search
- Send the server keywords to search with
- Server returns a list of hosts (tuples) to the client
- Client pings each host in the list to find transfer rates
- Client fetches the file from the best host
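
The search steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `DirectoryServer`, `pick_best_host`, the host addresses and the transfer rates are all hypothetical names and values, not part of the Napster protocol. The point it shows is that the server holds an index only, while the client picks the fastest host itself.

```python
from collections import defaultdict


class DirectoryServer:
    """Hypothetical stand-in for the central index: it stores file lists, never files."""

    def __init__(self):
        self.index = defaultdict(set)  # lower-cased filename -> {(ip, port), ...}

    def register(self, host, filenames):
        # A peer uploads the list of files it is willing to share.
        for name in filenames:
            self.index[name.lower()].add(host)

    def search(self, keyword):
        # Return every host that shares a file whose name contains the keyword.
        keyword = keyword.lower()
        hosts = set()
        for name, owners in self.index.items():
            if keyword in name:
                hosts |= owners
        return sorted(hosts)


def pick_best_host(hosts, measured_rate):
    """Client side: choose the host with the highest measured transfer rate."""
    return max(hosts, key=measured_rate)


server = DirectoryServer()
server.register(("10.0.0.1", 6699), ["Song_A.mp3", "song_b.mp3"])
server.register(("10.0.0.2", 6699), ["song_a.mp3"])

hosts = server.search("song_a")
# Made-up rates (KB/s) standing in for the result of pinging each host.
rates = {("10.0.0.1", 6699): 40.0, ("10.0.0.2", 6699): 95.0}
best = pick_best_host(hosts, rates.get)
```

The file transfer itself would then go directly between the client and `best`, without passing through the server.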

11 Napster Problem
- The centralized server is a source of congestion
- The centralized server is a single point of failure
- Napster.com was declared responsible for users’ copyright violations (“indirect infringement”)
- Next system: Gnutella

12 Gnutella
- Eliminate the servers
- Clients search and retrieve amongst themselves
- Clients act as servers too, called servents
- Released in 2000 by AOL; 88K users by ’03

13 How a peer joins the network
- To join the network, a peer needs the address of another peer that is currently a member
- The new peer sends a connect message to the existing peer
  - GNUTELLA CONNECT
- The reply is simply “OK”

14 Gnutella search
- Gnutella routes different messages within the overlay graph
- The Gnutella protocol has 5 main message types
  - Query (search)
  - QueryHit (response to query)
  - Ping (to probe network for other peers)
  - Pong (reply to ping, contains address of another peer)
  - Push (used to initiate file transfer)

15 Gnutella Message Header Format
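
The header figure did not survive in this transcript, so the following sketch is an assumption rather than a copy of the slide. It follows the commonly documented Gnutella 0.4 layout (16-byte DescriptorID, 1-byte payload descriptor, 1-byte TTL, 1-byte hops, 4-byte little-endian payload length, 23 bytes in total). The helper names `pack_header` and `unpack_header` are hypothetical.

```python
import struct
import uuid

# Assumed Gnutella 0.4 layout: 16s = DescriptorID, B = payload descriptor,
# B = TTL, B = hops, I = payload length; "<" means little-endian, no padding.
HEADER = struct.Struct("<16sBBBI")

# Payload descriptor codes for the five message types listed on slide 14.
PAYLOAD_TYPES = {"Ping": 0x00, "Pong": 0x01, "Push": 0x40, "Query": 0x80, "QueryHit": 0x81}


def pack_header(descriptor_id, msg_type, ttl, hops, payload_length):
    """Serialize one message header to 23 bytes."""
    return HEADER.pack(descriptor_id, PAYLOAD_TYPES[msg_type], ttl, hops, payload_length)


def unpack_header(data):
    """Parse the first 23 bytes of a message back into its fields."""
    descriptor_id, code, ttl, hops, length = HEADER.unpack(data[:HEADER.size])
    names = {value: key for key, value in PAYLOAD_TYPES.items()}
    return descriptor_id, names[code], ttl, hops, length


desc_id = uuid.uuid4().bytes  # any unique 16-byte value serves as a DescriptorID here
raw = pack_header(desc_id, "Query", ttl=7, hops=0, payload_length=12)
```

The DescriptorID and payload descriptor are the two fields the duplicate-suppression rules on slide 18 rely on.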

16 Flooding query message
[Figure: Query message]

17 How do search results come back?

18 Avoiding excessive traffic
- To avoid duplicate transmissions, each peer maintains a list of recently received messages
- A Query is forwarded to all neighbors except the peer from which it was received
- Each Query (identified by DescriptorID) is forwarded only once
- A QueryHit is routed back only to the peer from which the Query with the same DescriptorID was received
- Duplicates with the same DescriptorID and payload descriptor (message type) are dropped
- A QueryHit with a DescriptorID for which no Query was seen is dropped
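
A small simulation can make these rules concrete. It is a sketch under assumptions: the overlay is a plain adjacency dictionary, one call handles a single DescriptorID, and the TTL limit is an assumed parameter that the rules above do not mention. The names `flood_query` and `reverse_path` are hypothetical.

```python
from collections import deque


def flood_query(graph, origin, has_file, ttl=7):
    """Flood one Query over the overlay.

    Returns (came_from, hits). `came_from` maps each reached peer to the
    neighbor it first heard the Query from, so it doubles as the "recently
    seen" list and as the reverse path for QueryHit messages.
    """
    came_from = {origin: None}
    hits = []
    queue = deque([(origin, ttl)])
    while queue:
        peer, remaining = queue.popleft()
        if peer != origin and has_file(peer):
            hits.append(peer)
        if remaining == 0:
            continue  # assumed TTL exhausted: do not forward further
        for neighbor in graph[peer]:
            if neighbor in came_from:
                continue  # already saw this DescriptorID: duplicate is dropped
            came_from[neighbor] = peer
            queue.append((neighbor, remaining - 1))
    return came_from, hits


def reverse_path(came_from, responder):
    """Path a QueryHit takes from the responder back to the requestor."""
    path = [responder]
    while came_from[path[-1]] is not None:
        path.append(came_from[path[-1]])
    return path


overlay = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}
came_from, hits = flood_query(overlay, "A", has_file=lambda peer: peer == "E")
hit_path = reverse_path(came_from, "E")
```

In this toy overlay, peer D receives the Query from both B and C but forwards it only once, which is the duplicate suppression described above.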

19 After receiving QueryHit messages
- Requestor chooses the “best” QueryHit responder
- Initiates an HTTP request directly to the responder’s ip+port
- Responder then replies with file packets after this message:

20 Dealing with Firewalls
- Requestor sends Push to the responder asking for file transfer
- Responder establishes a TCP connection at the ip_address and port specified. Sends
- Requestor then sends GET to the responder (as before) and the file is transferred as explained earlier

21 PING-PONG
- Peers initiate Pings periodically
- Pings are flooded out like Queries; Pongs are routed along the reverse path like QueryHits
- Pong replies are used to update the set of neighboring peers
  - This keeps neighbor lists fresh in spite of peers joining, leaving and failing

22 Problem
- Flooding a query is extremely inefficient
  - Wastes a lot of network and peer resources
  - Repeated searches with the same keywords
  - Solution:
- Gnutella’s network management is not efficient
  - Periodic Ping/Pongs consume a lot of resources
  - Ping/Pong constituted 50% of traffic
- Modem-connected hosts do not have enough bandwidth for passing Gnutella traffic
- Another solution: the FastTrack system

23 FastTrack
- Hybrid between Gnutella and Napster
- Takes advantage of “healthier” participants in the system
- Underlying technology in Kazaa, KazaaLite, Grokster
- Like Gnutella, but with some peers designated as supernodes

24 FastTrack (2)
- A supernode stores a directory listing a subset of nearby ( ), similar to Napster servers
- Supernode membership changes over time
- Any peer can become (and stay) a supernode, provided it has earned enough reputation
  - KazaaLite: the participation level (= reputation) of a user is between 0 and 1000, initially 10, and is then affected by the length of periods of connectivity and the total number of uploads
  - More sophisticated reputation schemes have been invented, especially ones based on economics (see the P2PEcon workshop)
- A peer searches by contacting a nearby supernode

25 Strength
- Combines good points from Napster and Gnutella
  - Efficient searching under each supernode
  - Flooding restricted to supernodes only
  - Result: efficient searching with “low” resource usage
- Most popular network
  - A lot of content, a lot of users
- Currently most file-sharing networks have adopted this architecture

26 BitTorrent
- Developed by Bram Cohen in 2001
- Written in Python, available on many platforms
- BitTorrent is a new approach for sharing large files
  - Distributed directories, distributed files
- Each file is divided into chunks
  - Each chunk contains 32 KB to 256 KB
  - Each chunk can traverse a different path
- BitTorrent is also widely used for legal content
  - For example, Linux distributions, software patches, official movies
- Currently there is a lot of illegal content on BitTorrent too
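
As a worked example of the chunking idea, the sketch below splits a 1,000,000-byte payload into 256 KB chunks, which gives ceil(1,000,000 / 262,144) = 4 chunks, the last one holding the remaining 213,568 bytes. The function names are hypothetical, and the per-chunk SHA-1 digest is an assumption about how chunks can be verified, since the slide does not mention hashing.

```python
import hashlib
import math

CHUNK_SIZE = 256 * 1024  # upper end of the 32 KB to 256 KB range quoted above


def split_into_chunks(data, chunk_size=CHUNK_SIZE):
    """Cut a byte string into fixed-size chunks; the last one may be shorter."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def chunk_digests(chunks):
    """One SHA-1 digest per chunk, so a downloader can verify each chunk alone."""
    return [hashlib.sha1(chunk).hexdigest() for chunk in chunks]


payload = bytes(1_000_000)  # dummy 1,000,000-byte file made of zero bytes
chunks = split_into_chunks(payload)
digests = chunk_digests(chunks)
expected_count = math.ceil(len(payload) / CHUNK_SIZE)
```

Because every chunk is independent, different chunks can be fetched from different peers and reassembled in index order.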

27 Topology of BitTorrent
- Overlay graph
  - (1) physical
  - (2) neighboring peer
  - (3) peering relationship
- A tracker
  - A server which tracks the currently active clients
  - Serves as a centralized directory
- Topology can change regularly
- Tracker factors: content, distance, peer churn, randomization

28 BitTorrent: Players
- Three entities are needed to start distribution of a file
- Terminology:
  - A “torrent” file: the metadata about the file
  - Seed: client with a complete copy of the file
  - Leecher: client still downloading the file

29 BitTorrent Start Up
- New client gets the torrent file and gets a peer list from the tracker

30 BitTorrent Operation

31 Summary of BitTorrent operation
- A new peer A receives a .torrent file from one of the BitTorrent web servers, including the name, size, and number of chunks of a particular file, together with the IP address and port number of the corresponding tracker.
- It then registers with the right tracker. It will also periodically send keep-alive messages to the tracker.
- The tracker sends peer A a list of potential peers (peer set = 50 peers).
- Peer A selects a subset of five peers (following the tit-for-tat and randomization rules) and establishes connections with them.
- Peer A downloads chunks from peers in the peer set and provides them with its own chunks (possibly in parallel).
  - Chunks are typically 256 KB
  - Starting with the rarest chunks
- Every now and then, each peer updates its peer list.

32 Peering construction methods
- Tracker suggests a set of 50 peers
- The new peer picks 5 peers (at this time!) for exchanging chunks
- Contents are exchanged evenly between them (rarest chunk first)
- A peer serves 4 peers in its peer set simultaneously (tit-for-tat)
  - It seeks the 4 best downloaders in the last time slot if it is a seed
  - It seeks the 4 best uploaders in the last time slot if it is a leecher
- The fifth peer is selected at 50% randomly (randomization)
- Choking: limit the number of neighbors with concurrent uploads or downloads to at most a fixed number
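
The selection of four tit-for-tat peers plus one random peer can be sketched as follows. This is a simplified illustration, not the actual client algorithm: `choose_unchoked` is a hypothetical name, `rates` stands for whatever the peer measured in the last time slot (upload rates if it is a leecher, download rates if it is a seed), and the random pick is uniform over the remaining peers.

```python
import random


def choose_unchoked(rates, regular_slots=4, rng=random):
    """Pick the peers to serve in the next time slot.

    rates: {peer: bytes exchanged with us during the last time slot}.
    Returns the `regular_slots` best peers plus one random extra peer, if any.
    """
    ranked = sorted(rates, key=rates.get, reverse=True)
    unchoked = ranked[:regular_slots]   # tit-for-tat: reward the best partners
    rest = ranked[regular_slots:]
    if rest:
        unchoked.append(rng.choice(rest))  # randomization: give someone else a chance
    return unchoked


last_slot = {"p1": 50, "p2": 10, "p3": 80, "p4": 30, "p5": 5, "p6": 70}
picks = choose_unchoked(last_slot, rng=random.Random(0))
```

All other neighbors stay choked until a later time slot ranks them higher or the random slot lands on them.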

33 Strength
- Tit-for-tat
  - A peer serves peers that serve it
  - Encourages cooperation, discourages free-riding
- Rarest chunk first
  - Prefer early download of blocks that are least replicated among neighbors
  - Avoids the problem that most peers have most of the chunks but all must wait for the few rare chunks
- Randomization
  - Avoids unfairness toward nodes with little upload capacity
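
Rarest-first ordering is easy to express in code. The sketch below assumes each neighbor advertises the set of chunk indices it holds; the function name `rarest_first` and the tie-break by chunk index are illustrative choices, not part of the slide.

```python
from collections import Counter


def rarest_first(my_chunks, neighbor_chunks):
    """Order the chunks we still need by how few neighbors hold them.

    my_chunks: set of chunk indices we already have.
    neighbor_chunks: {peer: set of chunk indices that peer holds}.
    """
    counts = Counter()
    for held in neighbor_chunks.values():
        counts.update(held)  # count how many neighbors hold each chunk
    wanted = [chunk for chunk in counts if chunk not in my_chunks]
    # Least replicated first; ties broken by chunk index for determinism.
    return sorted(wanted, key=lambda chunk: (counts[chunk], chunk))


neighbors = {"p1": {0, 1, 2}, "p2": {1, 2}, "p3": {2, 3}}
order = rarest_first(my_chunks={2}, neighbor_chunks=neighbors)
```

Here chunks 0 and 3 are each held by a single neighbor, so they are requested before chunk 1, which two neighbors hold.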

34 Weakness
- File needs to be quite large
  - 256 KB chunks
  - Rarest first needs a large number of chunks
- Everyone must contribute
  - Low-bandwidth clients have a disadvantage

35 How can BitTorrent be free?
- It leverages peer uplink capacities to send chunks of files to each other without deploying many media servers.
- P2P is used for sharing content in BitTorrent.
- Scalable?
  - Many nodes can be added as the network scales up without creating a bottleneck

36 P2P versus client-server architecture
- Client-server architecture
  - Each client requests data from the server
  - Clients do not help each other
- P2P
  - A peer is both a sender and a receiver of content
  - Peers help each other in a distributed manner
  - Data transmission is distributed
  - The control plane for signaling, however, is centralized

37 Summary
- Most existing P2P networks are built on searching; however, searching does not scale in the same way
  - Either a centralized system with all its problems
  - Or a distributed system with all its problems
  - Hybrid systems cannot guarantee discovery either
- Alternatively, use addressing instead of searching
  - Distributed hash tables (DHTs): efficient searching and object location in a P2P network
  - Examples: Chord, CAN, Plaxton, Pastry, Tapestry
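
To illustrate "addressing instead of searching", here is a minimal sketch of the key-to-node mapping that Chord-style DHTs use: node names and keys are hashed onto the same identifier ring, and a key belongs to the first node at or after its identifier. This is not full Chord. It assumes every node knows the whole ring and omits finger tables and routing; `Ring`, `ident` and the peer names are hypothetical.

```python
import bisect
import hashlib


def ident(name):
    """Map a node name or a key onto the identifier ring (160-bit SHA-1)."""
    return int.from_bytes(hashlib.sha1(name.encode("utf-8")).digest(), "big")


class Ring:
    """Global-view identifier ring; real DHTs route hop by hop instead."""

    def __init__(self, node_names):
        self.nodes = sorted((ident(name), name) for name in node_names)
        self.ids = [node_id for node_id, _ in self.nodes]

    def successor(self, key):
        # First node whose identifier is >= the key's; wrap around at the end.
        pos = bisect.bisect_left(self.ids, ident(key))
        return self.nodes[pos % len(self.nodes)][1]


names = ["peer-a", "peer-b", "peer-c"]
ring = Ring(names)
owner = ring.successor("song.mp3")

# Adding a node only moves keys onto the new node; all others keep their owner.
keys = [f"file-{i}" for i in range(50)]
before = {key: ring.successor(key) for key in keys}
bigger = Ring(names + ["peer-d"])
after = {key: bigger.successor(key) for key in keys}
```

Any peer can compute the responsible node from the key alone, so no flooding and no central directory are needed to locate an object.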

38 Reference  Kangasharju: Peer-to-Peer Networks  Brinton, Christopher; Chiang, Mung (2013-06-10). Networks Life: 20 Questions and Answers

