GIA: Making Gnutella-like P2P Systems Scalable

Slides:



Advertisements
Similar presentations
Peer-to-Peer and Social Networks An overview of Gnutella.
Advertisements

P2P data retrieval DHT (Distributed Hash Tables) Partially based on Hellerstein’s presentation at VLDB2004.
INF 123 SW ARCH, DIST SYS & INTEROP LECTURE 12 Prof. Crista Lopes.
GIA: Making Gnutella-like P2P Systems Scalable Yatin Chawathe Intel Research Seattle Sylvia Ratnasamy, Lee Breslau, Scott Shenker, and Nick Lanham.
Technion –Israel Institute of Technology Computer Networks Laboratory A Comparison of Peer-to-Peer systems by Gomon Dmitri and Kritsmer Ilya under Roi.
1 An Overview of Gnutella. 2 History The Gnutella network is a fully distributed alternative to the centralized Napster. Initial popularity of the network.
Search and Replication in Unstructured Peer-to-Peer Networks Pei Cao, Christine Lv., Edith Cohen, Kai Li and Scott Shenker ICS 2002.
Improving Gnutella Willy Henrique Säuberli Seminar in Distributed Computing, 16. November 2005 Papers: I.Making Gnutella-like P2P Systems Scalable; SIGCOMM.
P2p, Spring 05 1 Topics in Database Systems: Data Management in Peer-to-Peer Systems March 29, 2005.
Technion –Israel Institute of Technology Software Systems Laboratory A Comparison of Peer-to-Peer systems by Gomon Dmitri and Kritsmer Ilya under Roi Melamed.
1 Denial-of-Service Resilience in P2P File Sharing Systems Dan Dumitriu (EPFL) Ed Knightly (Rice) Aleksandar Kuzmanovic (Northwestern) Ion Stoica (Berkeley)
CSc 461/561 CSc 461/561 Peer-to-Peer Streaming. CSc 461/561 Summary (1) Service Models (2) P2P challenges (3) Service Discovery (4) P2P Streaming (5)
Efficient Content Location Using Interest-based Locality in Peer-to-Peer Systems Presented by: Lin Wing Kai.
Exploiting Content Localities for Efficient Search in P2P Systems Lei Guo 1 Song Jiang 2 Li Xiao 3 and Xiaodong Zhang 1 1 College of William and Mary,
1 Maximizing Remote Work in Flooding-based P2P Systems Qixiang Sun Neil Daswani Hector Garcia-Molina Stanford University.
Making Gnutella-like P2P Systems Scalable Presented by: Karthik Lakshminarayanan Yatin Chawathe, Sylvia Ratnasamy, Lee Breslau, Nick Lanham, and Scott.
A Distributed Search Service for Peer-to-Peer File Sharing in Mobile Application Presented by Tony Sung On Loy, MC Lab, CUHK IE 1 A Distributed Search.
Chord-over-Chord Overlay Sudhindra Rao Ph.D Qualifier Exam Department of ECECS.
Topics in Reliable Distributed Systems Fall Dr. Idit Keidar.
CS 640: Introduction to Computer Networks Yu-Chi Lai Lecture 18 - Peer-to-Peer.
1 Virtual Direction Routing for Overlay Networks Bow-Nan Cheng Murat Yuksel Shivkumar Kalyanaraman.
Introduction to Peer-to-Peer Networks. What is a P2P network Uses the vast resource of the machines at the edge of the Internet to build a network that.
INTRODUCTION TO PEER TO PEER NETWORKS Z.M. Joseph CSE 6392 – DB Exploration Spring 2006 CSE, UT Arlington.
1 Napster & Gnutella An Overview. 2 About Napster Distributed application allowing users to search and exchange MP3 files. Written by Shawn Fanning in.
P2P Group Meeting (ICS/FORTH) Monday, 21 February, 2005 Making Gnutella-like P2P Systems Scalable (Yatin Chawathe, Sylvia Ratnasamy, Lee Breslau, Nick.
1 - CS7701 – Fall 2004 Review of: Making Gnutella-like P2P Systems Scalable Paper by: – Yatin Chawathe (AT&T) –Sylvia Ratnasamy (Intel) –Lee Breslau (AT&T)
Introduction to Peer-to-Peer Networks. What is a P2P network A P2P network is a large distributed system. It uses the vast resource of PCs distributed.
Peer-to-Peer Networking. Presentation Introduction Characteristics and Challenges of Peer-to-Peer Peer-to-Peer Applications Classification of Peer-to-Peer.
Peer to Peer Research survey TingYang Chang. Intro. Of P2P Computers of the system was known as peers which sharing data files with each other. Build.
Peer-to-Peer Networks University of Jordan. Server/Client Model What?
Chord: A Scalable Peer-to-peer Lookup Protocol for Internet Applications Xiaozhou Li COS 461: Computer Networks (precept 04/06/12) Princeton University.
1 Distributed Hash Tables (DHTs) Lars Jørgen Lillehovde Jo Grimstad Bang Distributed Hash Tables (DHTs)
Super-peer Network. Motivation: Search in P2P Centralised (Napster) Flooding (Gnutella)  Essentially a breadth-first search using TTLs Distributed Hash.
Quantitative Evaluation of Unstructured Peer-to-Peer Architectures Fabrício Benevenuto José Ismael Jr. Jussara M. Almeida Department of Computer Science.
GIA: Making Gnutella-like P2P Systems Scalable Yatin Chawathe Sylvia Ratnasamy, Scott Shenker, Nick Lanham, Lee Breslau (Several slides have been taken.
A Peer-to-Peer Approach to Resource Discovery in Grid Environments (in HPDC’02, by U of Chicago) Gisik Kwon Nov. 18, 2002.
SIGCOMM 2001 Lecture slides by Dr. Yingwu Zhu Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications.
1 Peer-to-Peer Technologies Seminar by: Kunal Goswami (05IT6006) School of Information Technology Guided by: Prof. C.R.Mandal, School of Information Technology.
GIA: Making Gnutella-like P2P Systems Scalable Yatin Chawathe Sylvia Ratnasamy, Scott Shenker, Nick Lanham, Lee Breslau Parts of it has been adopted from.
PEER TO PEER (P2P) NETWORK By: Linda Rockson 11/28/06.
Efficient P2P Search by Exploiting Localities in Peer Community and Individual Peers A DISC’04 paper Lei Guo 1 Song Jiang 2 Li Xiao 3 and Xiaodong Zhang.
CS 640: Introduction to Computer Networks Aditya Akella Lecture 24 - Peer-to-Peer.
"A Measurement Study of Peer-to-Peer File Sharing Systems" Stefan Saroiu, P. Krishna Gummadi Steven D. Gribble, "A Measurement Study of Peer-to-Peer File.
By Jonathan Drake.  The Gnutella protocol is simply not scalable  This is due to the flooding approach it currently utilizes  As the nodes increase.
An overview of Gnutella
Algorithms and Techniques in Structured Scalable Peer-to-Peer Networks
CS Spring 2014 CS 414 – Multimedia Systems Design Lecture 37 – Introduction to P2P (Part 1) Klara Nahrstedt.
INTERNET TECHNOLOGIES Week 10 Peer to Peer Paradigm 1.
P2P Search COP6731 Advanced Database Systems. P2P Computing  Powerful personal computer Share computing resources P2P Computing  Advantages: Shared.
P2P Search COP P2P Search Techniques Centralized P2P systems  e.g. Napster, Decentralized & unstructured P2P systems  e.g. Gnutella.
CS Spring 2012 CS 414 – Multimedia Systems Design Lecture 37 – Introduction to P2P (Part 1) Klara Nahrstedt.
School of Electrical Engineering &Telecommunications UNSW Cost-effective Broadcast for Fully Decentralized Peer-to-peer Networks Marius Portmann & Aruna.
Advanced Computer Networks: Part 2 Complex Networks, P2P Networks and Swarm Intelligence on Graphs.
Peer-to-Peer File Sharing Systems Group Meeting Speaker: Dr. Xiaowen Chu April 2, 2004 Centre for E-transformation Research Department of Computer Science.
Distributed Caching and Adaptive Search in Multilayer P2P Networks Chen Wang, Li Xiao, Yunhao Liu, Pei Zheng The 24th International Conference on Distributed.
CS Spring 2010 CS 414 – Multimedia Systems Design Lecture 24 – Introduction to Peer-to-Peer (P2P) Systems Klara Nahrstedt (presented by Long Vu)
Unstructured Networks: Search Márk Jelasity. 2 Outline ● Emergence of decentralized networks ● The Gnutella network: how it worked and looked like ● Search.
Virtual Direction Routing
Peer-to-peer systems and
Peer to peer Internet telephony challenges, status and trend
Peer-to-Peer Data Management
Handling Churn in Less-structured P2P Systems Elders Know Best
Peer-to-Peer and Social Networks
PROGRAM STUDI TEKNIK INFORMATIKA FAKULTAS ILMU KOMPUTER
مظفر بگ محمدی دانشگاه ایلام
Plethora: Infrastructure and System Design
Early Measurements of a Cluster-based Architecture for P2P Systems
EE 122: Peer-to-Peer (P2P) Networks
Elders know best Lifespan-based ideas in P2P systems
Joydeep Chandra, Santosh Shaw and Niloy Ganguly
Presentation transcript:

GIA: Making Gnutella-like P2P Systems Scalable Dr. Yingwu Zhu

The Peer-to-peer Phenomenon Internet-scale distributed system Distributed file-sharing applications E.g., Napster, Gnutella, KaZaA File sharing is the dominant P2P app Mass-market Mostly music, some video, software

The Problem Potentially millions of users Wide range of heterogeneity Large transient user population Existing search solutions cannot scale Flooding-based solutions limit capacity Distributed Hash Tables (DHTs) not necessarily appropriate

Why Not DHTs Structured solution Can DHTs do file sharing? Given a filename, find its location Can DHTs do file sharing? Probably, but with lots of extra work: Caching, keyword searching Do we need DHTs? Not necessarily: Great at finding rare files, but most queries are for popular files Note: Not questioning the utility of DHTs in general, merely for mass-market file sharing

Our Solution: GIA Unstructured, but take node capacity into account High-capacity nodes have room for more queries: so, send most queries to them Will work only if high-capacity nodes: Have correspondingly more answers, and Are easily reachable from other nodes

GIA Design Make high-capacity nodes easily reachable Dynamic topology adaptation Make high-capacity nodes have more answers One-hop replication Search efficiently Biased random walks Prevent overloaded nodes Active flow control Make high-capacity nodes easily reachable Dynamic topology adaptation Make high-capacity nodes have more answers One-hop replication Search efficiently Biased random walks Prevent overloaded nodes Active flow control Query

Dynamic Topology Adaptation Make high-capacity nodes have high degree (i.e., more neighbors) Per-node level of satisfaction, S: 0  no neighbors, 1  enough neighbors Function of: Node’s capacity ● Neighbors’ capacities Neighbors’ degrees ● Their age When S << 1, look for neighbors aggressively

Simulation Results Compare four systems Metric: FLOOD: TTL-scoped, random topologies RWRT: Random walks, random topologies SUPER: Supernode-based search GIA: search using GIA protocol suite Metric: Collapse point: aggregate throughput that the system can sustain E.g., knee for <90% query success rate

Questions What is the relative performance of the four algorithms? Which of the GIA components matters the most? How does the system behave in the face of transient nodes?

System Performance % % % GIA outperforms SUPER, RWRT & FLOOD by many orders of magnitude in terms of aggregate query load

Factor Analysis Algorithm Collapse point Algorithm Collapse point RWRT 0.0005 RWRT+OHR 0.005 RWRT+BIAS 0.0015 RWRT+TADAPT 0.001 RWRT+FLWCTL 0.0006 Algorithm Collapse point GIA 7 GIA – OHR 0.004 GIA – BIAS 6 GIA – TADAPT 0.2 GIA – FLWCTL 2 No single component is useful by itself; the combination of all of them is what makes GIA scalable

Transient Behavior Even under heavy churn GIA outperforms the other Static SUPER Static RWRT (1% repl) Even under heavy churn GIA outperforms the other algorithms by many orders of magnitude

Summary GIA: scalable Gnutella Unstructured approach is good enough! 3–5 orders of magnitude improvement in system capacity Unstructured approach is good enough! DHTs may be overkill Incremental changes to deployed systems Status: Prototype implementation deployed on PlanetLab