GIA: Making Gnutella-like P2P Systems Scalable Yatin Chawathe Sylvia Ratnasamy, Scott Shenker, Nick Lanham, Lee Breslau (Several slides have been taken.

Slides:



Advertisements
Similar presentations
Peer-to-Peer and Social Networks An overview of Gnutella.
Advertisements

Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek, Hari Balakrishnan MIT and Berkeley presented by Daniel Figueiredo Chord: A Scalable Peer-to-peer.
INF 123 SW ARCH, DIST SYS & INTEROP LECTURE 12 Prof. Crista Lopes.
Pastry Peter Druschel, Rice University Antony Rowstron, Microsoft Research UK Some slides are borrowed from the original presentation by the authors.
Scalable Content-Addressable Network Lintao Liu
GIA: Making Gnutella-like P2P Systems Scalable Yatin Chawathe Intel Research Seattle Sylvia Ratnasamy, Lee Breslau, Scott Shenker, and Nick Lanham.
Sylvia Ratnasamy, Paul Francis, Mark Handley, Richard Karp, Scott Schenker Presented by Greg Nims.
1 An Overview of Gnutella. 2 History The Gnutella network is a fully distributed alternative to the centralized Napster. Initial popularity of the network.
Search and Replication in Unstructured Peer-to-Peer Networks Pei Cao, Christine Lv., Edith Cohen, Kai Li and Scott Shenker ICS 2002.
Antonis Papadogiannakis
Resilient Peer-to-Peer Streaming Paper by: Venkata N. Padmanabhan Helen J. Wang Philip A. Chou Discussion Leader: Manfred Georg Presented by: Christoph.
Gnutella 2 GNUTELLA A Summary Of The Protocol and it’s Purpose By
Improving Gnutella Willy Henrique Säuberli Seminar in Distributed Computing, 16. November 2005 Papers: I.Making Gnutella-like P2P Systems Scalable; SIGCOMM.
An Overview of Peer-to-Peer Networking CPSC 441 (with thanks to Sami Rollins, UCSB)
Receiver-driven Layered Multicast S. McCanne, V. Jacobsen and M. Vetterli SIGCOMM 1996.
Small-world Overlay P2P Network
P2p, Spring 05 1 Topics in Database Systems: Data Management in Peer-to-Peer Systems March 29, 2005.
Technion –Israel Institute of Technology Software Systems Laboratory A Comparison of Peer-to-Peer systems by Gomon Dmitri and Kritsmer Ilya under Roi Melamed.
Cis e-commerce -- lecture #6: Content Distribution Networks and P2P (based on notes from Dr Peter McBurney © )
1 Denial-of-Service Resilience in P2P File Sharing Systems Dan Dumitriu (EPFL) Ed Knightly (Rice) Aleksandar Kuzmanovic (Northwestern) Ion Stoica (Berkeley)
Locality-Aware Request Distribution in Cluster-based Network Servers 1. Introduction and Motivation --- Why have this idea? 2. Strategies --- How to implement?
Efficient Content Location Using Interest-based Locality in Peer-to-Peer Systems Presented by: Lin Wing Kai.
A Scalable Content-Addressable Network Authors: S. Ratnasamy, P. Francis, M. Handley, R. Karp, S. Shenker University of California, Berkeley Presenter:
1 Maximizing Remote Work in Flooding-based P2P Systems Qixiang Sun Neil Daswani Hector Garcia-Molina Stanford University.
Search and Replication in Unstructured Peer-to-Peer Networks Pei Cao Cisco Systems, Inc. (Joint work with Christine Lv, Edith Cohen, Kai Li and Scott Shenker)
Making Gnutella-like P2P Systems Scalable Presented by: Karthik Lakshminarayanan Yatin Chawathe, Sylvia Ratnasamy, Lee Breslau, Nick Lanham, and Scott.
Vassilios V. Dimakopoulos and Evaggelia Pitoura Distributed Data Management Lab Dept. of Computer Science, Univ. of Ioannina, Greece
Chord-over-Chord Overlay Sudhindra Rao Ph.D Qualifier Exam Department of ECECS.
1 CS 194: Distributed Systems Distributed Hash Tables Scott Shenker and Ion Stoica Computer Science Division Department of Electrical Engineering and Computer.
ICDE A Peer-to-peer Framework for Caching Range Queries Ozgur D. Sahin Abhishek Gupta Divyakant Agrawal Amr El Abbadi Department of Computer Science.
1CS 6401 Peer-to-Peer Networks Outline Overview Gnutella Structured Overlays BitTorrent.
INTRODUCTION TO PEER TO PEER NETWORKS Z.M. Joseph CSE 6392 – DB Exploration Spring 2006 CSE, UT Arlington.
1 Napster & Gnutella An Overview. 2 About Napster Distributed application allowing users to search and exchange MP3 files. Written by Shawn Fanning in.
P2P Group Meeting (ICS/FORTH) Monday, 21 February, 2005 Making Gnutella-like P2P Systems Scalable (Yatin Chawathe, Sylvia Ratnasamy, Lee Breslau, Nick.
1 - CS7701 – Fall 2004 Review of: Making Gnutella-like P2P Systems Scalable Paper by: – Yatin Chawathe (AT&T) –Sylvia Ratnasamy (Intel) –Lee Breslau (AT&T)
1 Pertemuan 20 Teknik Routing Matakuliah: H0174/Jaringan Komputer Tahun: 2006 Versi: 1/0.
Peer to Peer Research survey TingYang Chang. Intro. Of P2P Computers of the system was known as peers which sharing data files with each other. Build.
1 BitHoc: BitTorrent for wireless ad hoc networks Jointly with: Chadi Barakat Jayeoung Choi Anwar Al Hamra Thierry Turletti EPI PLANETE 28/02/2008 MAESTRO/PLANETE.
1 Distributed Hash Tables (DHTs) Lars Jørgen Lillehovde Jo Grimstad Bang Distributed Hash Tables (DHTs)
Super-peer Network. Motivation: Search in P2P Centralised (Napster) Flooding (Gnutella)  Essentially a breadth-first search using TTLs Distributed Hash.
Quantitative Evaluation of Unstructured Peer-to-Peer Architectures Fabrício Benevenuto José Ismael Jr. Jussara M. Almeida Department of Computer Science.
A Peer-to-Peer Approach to Resource Discovery in Grid Environments (in HPDC’02, by U of Chicago) Gisik Kwon Nov. 18, 2002.
Peer Pressure: Distributed Recovery in Gnutella Pedram Keyani Brian Larson Muthukumar Senthil Computer Science Department Stanford University.
An IP Address Based Caching Scheme for Peer-to-Peer Networks Ronaldo Alves Ferreira Joint work with Ananth Grama and Suresh Jagannathan Department of Computer.
Kaleidoscope – Adding Colors to Kademlia Gil Einziger, Roy Friedman, Eyal Kibbar Computer Science, Technion 1.
1 Peer-to-Peer Technologies Seminar by: Kunal Goswami (05IT6006) School of Information Technology Guided by: Prof. C.R.Mandal, School of Information Technology.
GIA: Making Gnutella-like P2P Systems Scalable Yatin Chawathe Sylvia Ratnasamy, Scott Shenker, Nick Lanham, Lee Breslau Parts of it has been adopted from.
K-Anycast Routing Schemes for Mobile Ad Hoc Networks 指導老師 : 黃鈴玲 教授 學生 : 李京釜.
"A Measurement Study of Peer-to-Peer File Sharing Systems" Stefan Saroiu, P. Krishna Gummadi Steven D. Gribble, "A Measurement Study of Peer-to-Peer File.
By Jonathan Drake.  The Gnutella protocol is simply not scalable  This is due to the flooding approach it currently utilizes  As the nodes increase.
An overview of Gnutella
Computer Networking P2P. Why P2P? Scaling: system scales with number of clients, by definition Eliminate centralization: Eliminate single point.
CS 6401 Overlay Networks Outline Overlay networks overview Routing overlays Resilient Overlay Networks Content Distribution Networks.
Algorithms and Techniques in Structured Scalable Peer-to-Peer Networks
CS Spring 2014 CS 414 – Multimedia Systems Design Lecture 37 – Introduction to P2P (Part 1) Klara Nahrstedt.
P2P Search COP P2P Search Techniques Centralized P2P systems  e.g. Napster, Decentralized & unstructured P2P systems  e.g. Gnutella.
CS Spring 2012 CS 414 – Multimedia Systems Design Lecture 37 – Introduction to P2P (Part 1) Klara Nahrstedt.
School of Electrical Engineering &Telecommunications UNSW Cost-effective Broadcast for Fully Decentralized Peer-to-peer Networks Marius Portmann & Aruna.
Peer-to-Peer File Sharing Systems Group Meeting Speaker: Dr. Xiaowen Chu April 2, 2004 Centre for E-transformation Research Department of Computer Science.
Distributed Caching and Adaptive Search in Multilayer P2P Networks Chen Wang, Li Xiao, Yunhao Liu, Pei Zheng The 24th International Conference on Distributed.
CS Spring 2010 CS 414 – Multimedia Systems Design Lecture 24 – Introduction to Peer-to-Peer (P2P) Systems Klara Nahrstedt (presented by Long Vu)
Unstructured Networks: Search Márk Jelasity. 2 Outline ● Emergence of decentralized networks ● The Gnutella network: how it worked and looked like ● Search.
Peer-to-Peer Data Management
Peer-to-Peer and Social Networks
Early Measurements of a Cluster-based Architecture for P2P Systems
EE 122: Peer-to-Peer (P2P) Networks
GIA: Making Gnutella-like P2P Systems Scalable
Elders know best Lifespan-based ideas in P2P systems
Joydeep Chandra, Santosh Shaw and Niloy Ganguly
A Scalable Content Addressable Network
Presentation transcript:

GIA: Making Gnutella-like P2P Systems Scalable Yatin Chawathe Sylvia Ratnasamy, Scott Shenker, Nick Lanham, Lee Breslau (Several slides have been taken from the authors’ original presentation)

The Problem Large scale P2P system: millions of users –Wide range of heterogeneity –Large transient user population (in a system with 100,000 nodes, 1600 nodes join and leave per minute) Existing search solutions cannot scale –Flooding-based solutions limit capacity –Distributed Hash Tables (DHTs) not necessarily appropriate (for keyword-based searches)

A Solution: GIA Scalable Gnutella-like P2P system Design principles: –Explicitly account for node heterogeneity –Query load proportional to node capacity Results: –GIA outperforms Gnutella by 3–5 orders of magnitude

Outline Existing approaches GIA: Scalable Gnutella Results: Simulations & Experiments Conclusion

Distributed Hash Tables (DHTs) Structured solution –Given the exact filename, find its location Can DHTs do file sharing? –Yes, but with lots of extra work needed for keyword searching Do we need DHTs? Not necessarily: Great at finding rare files, but most queries are for popular files Poor handling of churn – why?

Other Solutions Supernodes [KaZaA] –Classify nodes as low- or high-capacity –Only pushes the problem to a bigger scale Random Walks [Lv et al] –Forwarding is blind –Queries can get stuck in overloaded nodes Biased Random Walks [Adamic et al] –Right idea, but exacerbates overloaded-node problem

Outline Existing approaches GIA: Scalable Gnutella Results: Simulations & Experiments Conclusion

GIA: High-level view Unstructured, but take node capacity into account –High-capacity nodes have room for more queries: so, send most queries to them Will work only if high-capacity nodes: –Have correspondingly more answers, and –Are easily reachable from other nodes

Make high-capacity nodes easily reachable! –Dynamic topology adaptation converts them into high- degree nodes Make high-capacity nodes have more answers –One-hop replication Search efficiently –Biased random walks Prevent overloaded nodes –Active flow control GIA Design

Dynamic Topology Adaptation Make high-capacity nodes have high degree (i.e., more neighbors), and keep low capacity nodes within short reach from them. Per-node level of satisfaction, S: –0 = no neighbors, 1 = enough neighbors Satisfaction S is a function of: ● Node’s capacity ● Neighbors’ capacities ● Neighbors’ degrees When S << 1, look for neighbors aggressively

Dynamic Topology Adaptation Each GIA node maintains a host cache containing a list of other GIA nodes. The host cache is populated using a variety of methods (like contacting well- known web-based hosts, and exchanging host information using PING-PONG messages. A node X with S < 1 randomly picks a node Y from its host cache, and examines if it can be added as a neighbor.

Topology adaptation steps Life of Node X : it picks node Y from its host cache Case 1 {Y can be added as a new neighbor} (Let C i represent capacity of node i) if num nbrsX + 1 < max nbrs that it can handle then there is room ACCEPT Y ; return Case 2 {Node X decides if to replace an existing neighbor in favor of Y} subset := every neighbor i from nbrsX such that C i ≤ C Y if subset is empty, i.e. no such neighbors exist then REJECT Y ; return else candidate Z := highest-degree neighbor from subset If Y has higher capacity than Z or (num nbrsZ > num nbrsY + H) {Y has fewer nbrs} thenDROP Z; ACCEPT Y elseREJECT Y { Do not drop poorly connected nodes in favor of well-connected ones }

Topology adaptation steps

Active Flow Control Accept queries based on capacity –Actively allocate “tokens” to neighbors –Send query to neighbor only if we have received token from it Incentives for advertising true capacity –High capacity neighbors get more tokens to send outgoing queries –Nodes not using their tokens are marked inactive and this capacity is redistributed among its neighbors.

Outline Existing approaches GIA: Scalable Gnutella Results: Simulations & Experiments Conclusion

Simulation Results Compare four systems –FLOOD: TTL-scoped, random topologies –RWRT: Random walks, random topologies –SUPER: Supernode-based search –GIA: search using GIA protocol suite Metric: –Collapse point: aggregate throughput that the system can sustain (per node query rate beyond which the success rate drops below 90%)

Questions What is the relative performance of the four algorithms? Which of the GIA components matters the most? How does the system behave in the face of transient nodes?

System Performance GIA outperforms SUPER, RWRT & FLOOD by many orders of magnitude in terms of aggregate query load % % % population of the object

Factor Analysis AlgorithmCollapse point RWRT RWRT+OHR RWRT+BIAS RWRT+TADAPT RWRT+FLWCTL AlgorithmCollapse point GIA 7 GIA – OHR GIA – BIAS 6 GIA – TADAPT 0.2 GIA – FLWCTL 2 Topology adaptation Flow control No single component is useful by itself; the combination of them all is what makes GIA scalable

Transient Behavior Static SUPER Static RWRT (1% repl) Even under heavy churn, GIA outperforms the other algorithms by many orders of magnitude

Deployment Prototype client implementation using C++ Deployed on PlanetLab: –100 machines spread across 4 continents Measured the progress of topology adaptation…

Progress of Topology Adaptation Nodes quickly discover each other and soon reach their target “satisfaction level”

Outline Existing approaches GIA: Scalable Gnutella Results: Simulations & Experiments Conclusion

Summary GIA: scalable Gnutella –3–5 orders of magnitude improvement in system capacity Unstructured approach is good enough! –DHTs may be overkill –Incremental changes to deployed systems