Improving Search in Peer-to-Peer Networks Beverly Yang Hector Garcia-Molina Presented by Shreeram Sahasrabudhe

Presentation transcript:

Goals
- Three search techniques:
  1. Iterative Deepening
  2. Directed BFS
  3. Local Indices
- Evaluation and extensive measurements of these techniques on the Gnutella network.
- Ready-to-use results and recommendations.
- In short: reduce the number of nodes that handle a query.

Current Techniques
- Gnutella: Breadth-First Search (BFS) with depth limit D (typically 7)
  Disadvantages: wastes resources; inefficient
- Freenet: Depth-First Search (DFS)
  Disadvantages: poor response time
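To make the "wastage of resources" point concrete, here is a minimal sketch of a depth-limited BFS flood. The overlay graph, the node names, and the function `bfs_flood` are hypothetical illustrations, not part of the Gnutella protocol or of the paper; the sketch assumes each node forwards the query to every neighbor except the one it first received it from.

```python
from collections import deque

def bfs_flood(graph, source, depth_limit):
    """Simulate a depth-limited flood on a hypothetical overlay.

    graph: dict mapping node -> list of neighbor nodes.
    Returns (set of nodes that processed the query, messages sent).
    """
    seen = {source}
    frontier = deque([(source, None, 0)])  # (node, sender, depth)
    messages = 0
    while frontier:
        node, sender, depth = frontier.popleft()
        if depth == depth_limit:
            continue  # TTL exhausted: do not forward further
        for nbr in graph[node]:
            if nbr == sender:
                continue  # never send the query straight back
            messages += 1  # counted even if nbr has already seen the query
            if nbr not in seen:
                seen.add(nbr)
                frontier.append((nbr, node, depth + 1))
    return seen - {source}, messages

overlay = {
    "S": ["A", "B"],
    "A": ["S", "B", "C"],
    "B": ["S", "A", "C"],
    "C": ["A", "B", "D"],
    "D": ["C"],
}
reached, sent = bfs_flood(overlay, "S", 2)
```

In this toy overlay, a flood to depth 2 reaches three nodes but sends six messages, because redundant edges deliver the same query more than once.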

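Iterative Deepening
- Requires a system-wide policy P = {a, b, c}: the successive depths at which the search is attempted.
- W: the waiting time between successive iterations.
- [Diagram: source S with policy P = {a, b, c}. Step 1: the query is sent to depth a, where it is frozen. Wait = W. Resend [(TTL a) + query_id] … the frozen query continues with (TTL b-a) to depth b.]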
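The control flow of iterative deepening can be sketched as follows. This is a simplified model under stated assumptions: `results_by_depth` is a hypothetical list giving how many results sit exactly d hops from the source, the waiting time W is left as a comment rather than simulated, and the function name `iterative_deepening` is illustrative only.

```python
def iterative_deepening(results_by_depth, policy, wanted):
    """Return (depth reached, results found) for a given policy.

    results_by_depth[d - 1]: results held by nodes exactly d hops away
    (hypothetical input). policy: increasing depths, e.g. [a, b, c].
    wanted: number of results that satisfies the query.
    """
    found = 0
    prev_depth = 0
    for depth in policy:
        # Only the new ring of nodes (prev_depth+1 .. depth) processes the
        # query; nodes nearer the source already did so in an earlier round.
        found += sum(results_by_depth[prev_depth:depth])
        prev_depth = depth
        if found >= wanted:
            return depth, found  # satisfied: no Resend is issued
        # Otherwise the source would wait W, then send a Resend message.
    return prev_depth, found

ring_results = [2, 5, 10, 40, 100, 300, 900]  # made-up numbers
```

With policy [1, 3, 7] and 15 wanted results, the search in this example stops at depth 3 and never touches the much larger rings beyond it.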

Directed BFS
- Send queries to a subset of nodes only.
- The subset is selected by heuristics, for example select the node:
  - that has returned the highest number of results for previous queries
  - whose response messages have taken the lowest average number of hops
  - that has forwarded the most messages to our client
  - that has the shortest message queue
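A node could apply such heuristics by keeping simple per-neighbor statistics and ranking neighbors on one of them. The sketch below assumes a hypothetical `NeighborStats` record and hypothetical heuristic names; how a real client gathers these statistics is not specified here.

```python
from dataclasses import dataclass

@dataclass
class NeighborStats:
    # All fields are hypothetical bookkeeping a client might maintain.
    name: str
    results_returned: int      # results returned for earlier queries
    avg_response_hops: float   # average hops taken by its responses
    messages_forwarded: int    # messages it has forwarded to us
    queue_length: int          # length of its message queue

# Each heuristic maps a neighbor to a sort key; smaller keys rank first.
HEURISTICS = {
    "most_results": lambda n: -n.results_returned,
    "fewest_hops": lambda n: n.avg_response_hops,
    "most_forwarded": lambda n: -n.messages_forwarded,
    "shortest_queue": lambda n: n.queue_length,
}

def select_neighbors(neighbors, heuristic, k=1):
    """Return the k neighbors ranked best by the named heuristic."""
    return sorted(neighbors, key=HEURISTICS[heuristic])[:k]

peers = [
    NeighborStats("a", 10, 3.0, 50, 4),
    NeighborStats("b", 30, 2.0, 20, 9),
    NeighborStats("c", 5, 4.5, 80, 1),
]
best = select_neighbors(peers, "most_results")
```

Keeping the heuristics in a table makes it easy to compare them against a baseline such as a randomly chosen neighbor.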

Local Indices
- Each node n maintains an index over the data of all nodes within r hops.
- A node can therefore process a query on behalf of every node within r hops.
- Smaller r means less storage (e.g., 70 KB for r = 1).
- [Diagram: source S, depth 1 marked "process", policy P = {1, 5}.]
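The idea of answering on behalf of nearby nodes can be sketched with a small in-memory index. The overlay, the file names, and the helper functions below are hypothetical; the sketch only assumes that a node knows the file list of every peer within r hops.

```python
from collections import deque

def nodes_within(graph, start, r):
    """Return all nodes at most r hops from start (including start)."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if dist[node] == r:
            continue  # do not expand beyond radius r
        for nbr in graph[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return set(dist)

def build_local_index(graph, files, node, r):
    """Map each file name to the peers within r hops of node that hold it."""
    index = {}
    for peer in nodes_within(graph, node, r):
        for name in files.get(peer, ()):
            index.setdefault(name, set()).add(peer)
    return index

def answer(index, name):
    """Answer a query from the local index alone."""
    return sorted(index.get(name, ()))

chain = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
shared = {"A": ["x"], "B": ["y"], "C": ["x"], "D": ["z"]}
index_r1 = build_local_index(chain, shared, "B", 1)
index_r2 = build_local_index(chain, shared, "B", 2)
```

With r = 1, node B can answer for A and C without forwarding the query to them; raising r to 2 also covers D, at the price of a larger index.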

More Work
- Node join:
  - The joining node sends a join message with a TTL of r, containing metadata over its collection.
  - A node receiving a join message sends a return join message with its own metadata.
- Periodic refreshes
- Cost?
  - QueryJoinRatio = average ratio of queries to join messages
  - QueryUpdateRatio = average ratio of queries to update messages

Experiment
- Data collection
  - Observed Gnutella network traffic for 1 month.
  - Determined general statistics such as the average number of files shared per user, query strings, etc.
- Iterative Deepening
  - For each query Q sent: logged the response messages arriving within 2 min.
  - Sent Ping messages to all neighbors: recorded hops and IP address.
  - The same data was used for Local Indices.
- Directed BFS
  - Same as above, but each query was sent to a single node.

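Cost
- Bandwidth cost in BFS [formula not preserved in the transcript]; terms labeled on the slide:
  - nodes at depth n
  - redundant edges between depths n-1 and n
  - size of the query message
  - response messages from nodes at depth n
  - size of the header
  - total records
  - size of a record
- Processing cost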
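One plausible way to combine the labeled terms into a bandwidth figure is sketched below. The combination is an assumption, not a formula taken from the transcript: it supposes that the query crosses every edge into depth n (nodes plus redundant edges) and that each response travels n hops back to the source. The function name, dictionary keys, and byte sizes are hypothetical.

```python
def bfs_bandwidth_cost(depth_stats, query_size, header_size, record_size):
    """Estimate the bytes transferred by one BFS query (assumed model).

    depth_stats[n - 1] describes depth n with the keys:
      nodes, redundant_edges, responses, records  (all hypothetical).
    """
    total = 0
    for n, s in enumerate(depth_stats, start=1):
        # Assumption: one query message per edge reaching depth n.
        query_bytes = query_size * (s["nodes"] + s["redundant_edges"])
        # Assumption: each response is relayed over n hops to the source.
        response_bytes = n * (
            header_size * s["responses"] + record_size * s["records"]
        )
        total += query_bytes + response_bytes
    return total

stats = [
    {"nodes": 2, "redundant_edges": 0, "responses": 1, "records": 3},
    {"nodes": 4, "redundant_edges": 1, "responses": 2, "records": 5},
]
cost = bfs_bandwidth_cost(stats, query_size=80, header_size=70, record_size=50)
```

Under this model, responses from deeper nodes weigh more because they are multiplied by the hop count, which is one reason a lower search depth saves bandwidth.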

Results
Iterative Deepening
- Neighbors = 8
- Desired number of results Z = 50
- Policies: P_d = {d, d+1, …, D} for d = 1, 2, 3, …, D
- Cost depends on d and on W ("overshooting"); time depends on W and on d.
- [Plot: cost; trend arrows not preserved]

Directed BFS
- Studied 8 heuristics.
- "Random neighbor" is the baseline for comparison.
- [Plot: cost]

Local Indices

Conclusions
- Three new search systems specified and tested.
- Recommendation: Local Indices with r = 1.
- Savings: 61% bandwidth, 49% processing.