A Look at Peer-to-Peer File Sharing with Gnutella
Prof. Ellis Horowitz
November 25, 2002
Copyright 2002 Ellis Horowitz



Outline
- P2P file sharing clients
- Gnutella protocol
- Gnutella network properties
- Gnutella protocol issues
  - topology mismatch
  - scalability
  - free riding
  - query types
  - anonymity
  - security
- Conclusions

Peer-to-Peer File Sharing is all about the trading of copyrighted music and videos without paying anything to the authors
Kazaa: a native Windows application
[Screenshot callouts: query; music category; banner ad; 3 million users online sharing 4 petabytes of data]

Kazaa Survives by Legal Maneuvering
- March 2001: Kazaa is founded by Niklas Zennstrom and Janus Friis in a Dutch company called Consumer Empowerment.
- The software is based upon their FastTrack P2P Stack, a proprietary algorithm for peer-to-peer communication.
- Kazaa licenses FastTrack to Morpheus and Grokster.
- Oct.: MPAA and RIAA sue Kazaa, Morpheus and Grokster.
- Nov. 2001: Consumer Empowerment is sued in the Netherlands by the Dutch music publishing body, Buma/Stemra. The court orders Kazaa to take steps to prevent its users from violating copyrights or else pay a heavy fine.
- Jan. 2002: Zennstrom and Friis sell the Kazaa software and website to Sharman Networks, based in Vanuatu, an island nation in the Pacific, but operating out of Australia.
- Feb. 2002: Kazaa cuts off Morpheus clients from FastTrack.
- April 2002: Sharman Networks agrees to let Brilliant Digital bundle its own stealth P2P application, called AltNet, within Kazaa. This network would be remotely switched on, allowing Kazaa users to trade Brilliant Digital content throughout FastTrack.

Morpheus File Sharing Software
- a Java application
- searches over multiple categories and metadata
- Morpheus adopts the Jtella version of Gnutella
[Screenshot callouts: behind a firewall; Search Power; banner ad; shopping; web browser]

There are many Gnutella Clients

Gnutella History
- Originally conceived of by Justin Frankel, the 21-year-old founder of Nullsoft
- March 2000: Nullsoft posts Gnutella to the web
- A day later, AOL removes Gnutella at the behest of Time Warner
- The Gnutella protocol version and version gnutella.sourceforge.net/Proposals/Ultrapeer/Ultrapeers.htm
- There are multiple open source implementations at including:
  - Jtella
  - Gnucleus
- Software released under the GNU Lesser General Public License (LGPL)
- The Gnutella protocol has been widely analyzed

Gnutella Protocol Messages
Broadcast messages
- Ping: initiating message ("I'm here")
- Query: search pattern and TTL (time-to-live)
Back-propagated messages
- Pong: reply to a ping, contains information about the peer
- Query response: contains information about the computer that has the needed file
Node-to-node messages
- GET: return the requested file
- PUSH: push the file to me
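The message types can be made concrete with a small sketch. The Python below assumes the 23-byte descriptor header as I understand it from the Gnutella Protocol Specification v0.4 (16-byte descriptor ID, 1-byte payload descriptor, TTL, hops, 4-byte little-endian payload length). The names `MessageType`, `pack_header`, `unpack_header` and `is_broadcast` are hypothetical and do not come from the slides.

```python
import struct
import uuid
from enum import IntEnum


class MessageType(IntEnum):
    # Payload descriptor codes, assumed from the v0.4 specification.
    PING = 0x00
    PONG = 0x01
    PUSH = 0x40
    QUERY = 0x80
    QUERY_HIT = 0x81   # the "query response" on the slide
    # GET is not a descriptor: the download itself is a plain HTTP request.


# descriptor ID (16 bytes), payload descriptor, TTL, hops, payload length
HEADER = struct.Struct("<16sBBBI")

# Messages that are flooded to all neighbors rather than routed back.
BROADCAST = {MessageType.PING, MessageType.QUERY}


def pack_header(msg_type, ttl, hops, payload_len, guid=None):
    """Build a descriptor header; a fresh 16-byte ID is used if none is given."""
    guid = guid or uuid.uuid4().bytes
    return HEADER.pack(guid, int(msg_type), ttl, hops, payload_len)


def unpack_header(data):
    """Parse the first 23 bytes of a message into its header fields."""
    guid, kind, ttl, hops, length = HEADER.unpack(data[:HEADER.size])
    return guid, MessageType(kind), ttl, hops, length


def is_broadcast(msg_type):
    return msg_type in BROADCAST
```

For example, `unpack_header(pack_header(MessageType.QUERY, 7, 0, 12))` returns the message type, TTL, hop count and payload length that were packed, together with the generated ID.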

Gnutella Search Mechanism
Steps:
1. Node 2 initiates a search for file A.
2. It sends the query message to all of its neighbors.
3. The neighbors forward the message.
4. Nodes that have file A initiate a reply message.
5. The query reply message is back-propagated to node 2.
6. Node 2 downloads file A.
Note: file transfer is not possible when both clients are behind firewalls; if only one client, X, is behind a firewall, Y can request that X push the file to Y.
[Figure: overlay network diagram; query replies labeled A:5 and A:7]
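The search steps can be sketched as a breadth-first flood with a TTL, where each node remembers which neighbor first delivered the query so that replies can travel back along the same path. This is a simplified simulation, not a protocol implementation. The seven-node topology is hypothetical (the slide's figure is not recoverable), and `flood_search`, `graph` and `files` are illustrative names.

```python
from collections import deque


def flood_search(graph, files, origin, wanted, ttl=7):
    """Flood a query from `origin`.

    Returns {holder: reply path from the holder back to the origin}.
    """
    parent = {origin: None}          # neighbor that first delivered the query
    hits = {}
    queue = deque([(origin, ttl)])
    while queue:
        node, remaining = queue.popleft()
        if node != origin and wanted in files.get(node, ()):
            # Back-propagate: follow the reverse of the query's path.
            path = [node]
            while parent[path[-1]] is not None:
                path.append(parent[path[-1]])
            hits[node] = path
        if remaining == 0:
            continue                 # TTL exhausted, do not forward
        for neighbor in graph[node]:
            if neighbor not in parent:   # drop duplicates of the same query
                parent[neighbor] = node
                queue.append((neighbor, remaining - 1))
    return hits


# Hypothetical overlay: node 2 searches for file A, held by nodes 5 and 7.
graph = {1: [2, 3], 2: [1, 4], 3: [1, 5], 4: [2, 6], 5: [3], 6: [4, 7], 7: [6]}
files = {5: {"A"}, 7: {"A"}}
replies = flood_search(graph, files, origin=2, wanted="A")
```

In this sketch a TTL of 2 would not reach either holder, since both are three hops from node 2.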

Other Gnutella Issues
- GUID: short for Globally Unique Identifier, a randomized string that is used to uniquely identify a host or message on the Gnutella network. This prevents duplicate messages from being sent on the network.
- GWebCache: a distributed system for helping servents connect to the Gnutella network, thus solving the "bootstrapping" problem. Servents query any of several hundred GWebCache servers to find the addresses of other servents. GWebCache servers are typically web servers running a special module.
- Host Catcher: Pong responses allow servents to keep track of active Gnutella hosts.
- On most servents, the default port for Gnutella is 6346.
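A minimal sketch of how a GUID can suppress duplicate messages: each servent remembers the identifiers it has recently seen and ignores repeats. The bounded cache size, the use of `uuid.uuid4()` as the generator, and the names `new_guid` and `SeenMessages` are assumptions for illustration; real clients may generate and expire identifiers differently.

```python
import uuid
from collections import OrderedDict


def new_guid():
    # 16 bytes, almost all random (UUID4 fixes a few version/variant bits).
    return uuid.uuid4().bytes


class SeenMessages:
    """Bounded record of message GUIDs already handled by this node."""

    def __init__(self, capacity=10_000):
        self.capacity = capacity
        self._seen = OrderedDict()

    def first_time(self, guid):
        """Return True if `guid` is new (and record it), False for a duplicate."""
        if guid in self._seen:
            return False
        self._seen[guid] = True
        if len(self._seen) > self.capacity:
            self._seen.popitem(last=False)   # forget the oldest entry
        return True
```

A node would forward a message only when `first_time(guid)` returns True. Because the cache is bounded, a very old GUID can be forgotten and accepted again, which is a deliberate trade of memory for occasional duplicates.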

Network Growth Statistics
Growth factors:
- DSL and cable modem nodes grew substantially
- Multiple client implementations became available
There was significant growth in the Gnutella network in 2001:
- 5,000 nodes in February 2001
- 10,000 nodes on March 19, 2001
- 20,000 nodes on May 12, 2001
- 40,000 nodes on May 29, 2001
Statistics due to Matei Ripeanu, see PAPERS/gnutella-rc.pdf

LimeWire Count of Gnutella Hosts in 2002
[Graph: the green curve represents unique hosts]

Growth Invariants (1): Average Node Connectivity
- 3.4 links per node on average
[Graph due to Matei Ripeanu]

Growth Invariants (2): Network Diameter
- Node-to-node distance maintains a similar distribution
- Average node-to-node distance grew 25% while the network grew 50 times over 6 months
[Graph due to Matei Ripeanu]

Is Gnutella a Power-Law Network?
Power-law networks: the number of links per node follows a power-law distribution.
Examples:
- the Internet
- in/out links to/from HTML pages
- citation networks
- the US power grid
Implications: high tolerance to random node failure, but low reliability when facing an 'intelligent' adversary.
[Graph, November 2000, due to Matei Ripeanu]

Total Generated Traffic
Ripeanu has determined that Gnutella traffic totals 1 Gbps (or 330 TB/month)!
- Compare to 15,000 TB/month in the US Internet backbone (Dec. 2000)
- This estimate excludes actual file transfers
Reasoning:
- QUERY and PING messages are flooded; they form more than 90% of generated traffic
- Predominant TTL = 7
- More than 95% of nodes are less than 7 hops away
- Measured traffic at each link is about 6 kbps
- Network with 50k nodes and 170k links
Statistics due to Matei Ripeanu
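The headline figures follow from the per-link measurement by simple arithmetic. The check below assumes that "6 kbs" means 6 kilobits per second per link, that a month has 30 days, and that TB means 10^12 bytes; none of these conventions is stated on the slide.

```python
LINKS = 170_000                  # overlay links in the measured network
PER_LINK_BPS = 6_000             # about 6 kbps measured on each link (assumed kilobits)
SECONDS_PER_MONTH = 30 * 24 * 3600

total_bps = LINKS * PER_LINK_BPS
total_gbps = total_bps / 1e9                        # about 1 Gbps
tb_per_month = total_bps * SECONDS_PER_MONTH / 8 / 1e12   # about 330 TB/month
share_of_backbone = tb_per_month / 15_000           # versus 15,000 TB/month
```

Under these assumptions the overlay's control traffic alone comes to roughly 2% of the quoted US backbone volume.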

Mapping between Gnutella Network and Internet Infrastructure
[Figure: nodes A to H, perfect mapping]

Mismatch between Gnutella Network and Internet Infrastructure
[Figure: nodes A to H, inefficient mapping]
Link D-E needs to support six times higher traffic.

Topology Mismatch
The overlay network topology doesn't match the underlying Internet infrastructure topology!
- 40% of all nodes are in the 10 largest Autonomous Systems (AS)
- Only 2-4% of all TCP connections link nodes within the same AS
- Largely 'random wiring'
Most Gnutella-generated traffic crosses AS borders, making the traffic more expensive.
This may cause ISPs to change their pricing schemes.
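The "2-4% of connections stay inside an AS" figure is a ratio that is easy to compute once each node has been mapped to its Autonomous System. The sketch below shows the calculation on made-up data; the function name `intra_as_fraction`, the node labels and the AS numbers are all hypothetical.

```python
def intra_as_fraction(links, as_of):
    """Fraction of overlay links whose two endpoints are in the same AS."""
    if not links:
        return 0.0
    same = sum(1 for a, b in links if as_of[a] == as_of[b])
    return same / len(links)


# Made-up example: eight nodes split across two ASes, five overlay links.
as_of = {"A": 1, "B": 1, "C": 1, "D": 1, "E": 2, "F": 2, "G": 2, "H": 2}
links = [("A", "B"), ("A", "E"), ("B", "F"), ("C", "G"), ("D", "H")]
fraction = intra_as_fraction(links, as_of)   # 1 of 5 links stays inside an AS
```

Every link counted outside this fraction carries its traffic across an AS border.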

Scalability
Whenever a node receives a message (ping or query), it sends copies out to all of its other connections.
Existing mechanisms to reduce traffic:
- TTL counter
- Nodes cache information about the messages they have received, so that they don't forward duplicate messages
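As a rough illustration of why flooding is costly, the sketch below counts the messages one query can generate when every node has the same number of connections and no two paths ever meet, so duplicate suppression never helps. This is a worst-case bound under those assumptions, not a measurement, and `max_flood_messages` is a hypothetical name.

```python
def max_flood_messages(degree, ttl):
    """Upper bound on copies of one message in a cycle-free overlay."""
    total = 0
    frontier = degree                # copies sent by the originator
    for _ in range(ttl):
        total += frontier
        frontier *= degree - 1       # each receiver forwards to all other connections
    return total


worst_case = max_flood_messages(degree=4, ttl=7)   # 4 connections, TTL of 7
```

With 4 connections per node and a TTL of 7, the bound is 4,372 messages for a single query; lowering the TTL shrinks it geometrically, which is why the TTL counter is the main throttle.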

Free Riding on Gnutella
- 70% of Gnutella users share no files
- 90% of users answer no queries
- Those who have files to share may limit the number of connections or the upload speed, resulting in a high download failure rate
- If only a few individuals contribute to the public good, these few peers effectively act as centralized servers
See Adar and Huberman at 2.cs.cmu.edu/~kunwadee/research/p2p/gnutella.html

Free Riding on Gnutella
- More than 25% of Gnutella clients share no files; 75% share 100 files or fewer
- Conclusion: Gnutella has a high percentage of free riders
Statistics due to S. Gribble

Anonymity
Gnutella provides for anonymity by masking the identity of the peer that generated a query.
However, IP addresses are revealed at various points in its operation: HITS packets include the URL for each file, revealing the IP addresses.
Clients claim that they have no control, but:
- they support bootstrapping
- they may control message flow
- they may control metadata searches
- they may control program updates

Query Expressiveness
The format of a query is not standardized:
- There is no standard format or matching semantics for the QUERY string; its interpretation is completely determined by each node that receives it
- String literal vs. regular expression
- Directory name, filename, or file contents
- Malicious users may even return files unrelated to the query
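To illustrate how the same QUERY string can succeed on one node and fail on another, here is a sketch of three plausible interpretations a servent might apply to a filename. The three matcher functions are hypothetical; they are not taken from any particular client.

```python
import re


def match_substring(query, filename):
    """Treat the query as one literal string."""
    return query.lower() in filename.lower()


def match_keywords(query, filename):
    """Treat the query as whitespace-separated keywords, all required."""
    name = filename.lower()
    return all(word in name for word in query.lower().split())


def match_regex(query, filename):
    """Treat the query as a regular expression; invalid patterns match nothing."""
    try:
        return re.search(query, filename, re.IGNORECASE) is not None
    except re.error:
        return False


FILENAME = "Beatles - Help.mp3"
MATCHERS = (match_substring, match_keywords, match_regex)
keyword_style = [m("help beatles", FILENAME) for m in MATCHERS]
regex_style = [m("beatles.*help", FILENAME) for m in MATCHERS]
```

Here "help beatles" is found only by the keyword matcher, and "beatles.*help" only by the regular-expression matcher, so a user cannot know in advance which form will produce hits.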

Gnutella Queries
"The popularity of Gnutella queries and its implications on scalability", Kunwadee Sripanidkulchai, see 2.cs.cmu.edu/~kunwadee/research/p2p/gnutella.html
Examining over 5 million queries

Security
Recently, P2P viruses and worms have been constructed:
- The Benjamin virus uses Kazaa to spread itself, see 90
- Kazaa now includes virus checking software that is applied before upload and after download
- There have been several Gnutella worms: Gnutella.worm, VBS/GWV.a, VBS_GNUTELWORM, VBS.Gnut.A, VBS/Gnu
- A Gnutella worm spreads by making a copy of itself in the Gnutella program directory, then making that directory available for sharing files on the Gnutella network

Conclusions
- Gnutella is a self-organizing, large-scale P2P application that produces an overlay network on top of the Internet; it appears to work
- Growth is hindered by the volume of generated traffic and inefficient resource use
- Since there is no central authority, the open source community must commit to making any changes
- Changes have been suggested in:
  - Peer-to-Peer Architecture Case Study: Gnutella Network, by Matei Ripeanu
  - Improving Gnutella Protocol: Protocol Analysis and Research Proposals, by Igor Ivkovic

Legal Questions
- Do US courts have jurisdiction over P2P companies?
- Do P2P companies really contribute to copyright infringement (cf. the Sony Betamax case)?
- Do P2P companies affect file sharing?
- If Kazaa, Grokster and Morpheus are stopped, will that stop file sharing or copyright infringement?

Some References
[1] Eytan Adar and Bernardo A. Huberman, Free Riding on Gnutella
[2] Igor Ivkovic, Improving Gnutella Protocol: Protocol Analysis and Research Proposals
[3] Jordan Ritter, Why Gnutella Can't Scale. No, Really.
[4] Matei Ripeanu, Peer-to-Peer Architecture Case Study: Gnutella Network
[5] The Gnutella Protocol Specification v0.4