Searching In Peer-To-Peer Networks Chunlin Yang

What’s P2P - Unofficial Definition: All of the computers in the network are equal. Each computer functions as both a client and a server, with no administrator. The user of each computer decides which data on that computer is shared on the network.

What’s P2P (continued): The goal is to share huge volumes of data among the peers in the network. There are no dedicated servers and no hierarchy among the computers in the network. Examples: Gnutella, Freenet, and Napster.

Why P2P: The Internet has three fundamental assets: information, bandwidth, and storage space. Information: the amount of information keeps increasing, so finding useful information in real time is increasingly difficult. Bandwidth: more capacity has been added, but popular sites such as Yahoo and eBay attract more and more traffic and become bottlenecks.

Why P2P (continued): Computing resources: processor speeds increase and storage devices get larger, but data centers accumulate more and more computation tasks. P2P networking can greatly improve the utilization of Internet resources.

Why P2P (continued): Traffic is load-balanced to reduce the peak load on the network. The reliability and fault tolerance of the global system increase. The system tolerates server downtime, for example by slicing a big package into small packets and transferring them over multiple paths.

Basic Searching Algorithms: Gnutella uses breadth-first search (BFS). Freenet uses depth-first search (DFS). Napster uses an index server.

Basic Searching Algorithm, Gnutella: Each node of the network acts as a client and as a server at the same time. A node conducts its own searches while listening for incoming queries. The network is completely decentralized; every node is equal.

Basic Searching Algorithm, Gnutella (continued): A node sends a query to all of its neighbors. Each neighbor searches its own resources and forwards the message to all of its own neighbors. If a query is satisfied, a response is sent back to the original requester along the reverse path.

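Basic Searching Algorithm, Gnutella (continued): Queries are assigned GUIDs so that a node does not process the same query twice. A TTL of 7 (about nodes) limits how far a query spreads, so that it does not congest the network. Problem: the overlay can contain cycles, so flooding causes excessive traffic.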
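
The flooding search described on the Gnutella slides can be sketched roughly as follows. This is a minimal single-process simulation, not the Gnutella wire protocol: the `graph` and `files` dictionaries, the function name `flood_query`, and the message counter are assumptions made for illustration. The `seen` set stands in for GUID-based duplicate suppression.

```python
from collections import deque


def flood_query(graph, files, source, wanted, ttl=7):
    """Breadth-first flood from source; returns (hits, messages_sent)."""
    seen = {source}                 # nodes that already handled this query (GUID check)
    queue = deque([(source, ttl)])  # (node, remaining TTL)
    hits, messages = [], 0
    while queue:
        node, remaining = queue.popleft()
        if wanted in files.get(node, ()):
            hits.append(node)       # in the real system a response travels the reverse path
        if remaining == 0:
            continue                # TTL expired: process locally but do not forward
        for neighbor in graph[node]:
            messages += 1           # every forward costs a message, even a duplicate
            if neighbor in seen:
                continue            # duplicate query: dropped by the receiver
            seen.add(neighbor)
            queue.append((neighbor, remaining - 1))
    return hits, messages


# Hypothetical five-node overlay containing cycles.
graph = {
    "A": ["B", "C"],
    "B": ["A", "C", "D"],
    "C": ["A", "B", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}
files = {"D": {"song.mp3"}, "E": {"song.mp3"}}
```

In this toy overlay most of the messages are duplicates that the receiver drops, which is the excessive-traffic problem the slide points out.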

Basic Searching Algorithm, Freenet: Cooperative file distribution improves document distribution efficiency by sharing bandwidth and disk space. Each file has a unique ID and a set of locations. The network consists of equal nodes, each acting as both client and server.

Basic Searching Algorithm, Freenet (continued): Information is stored on hosts under searchable keys. The search is depth-first with a depth limit D. Each node forwards the query to a single neighbor and waits for a definite response from that neighbor. If the query was not satisfied, the node forwards the query to another neighbor.

Basic Searching Algorithm, Freenet (continued): If the query was satisfied, the response is sent back to the query source along the reverse path. Each node along the path also copies the data into its own database. As a result, more popular information becomes easier to access.
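
A rough sketch of the depth-limited, one-neighbor-at-a-time search with caching on the reverse path is shown below. It is a simplification: neighbors are simply tried in list order, and the names `dfs_lookup`, `graph`, and `store` are hypothetical rather than part of Freenet.

```python
def dfs_lookup(graph, store, node, key, depth_limit, visited=None):
    """Return the data stored under key, or None if not found within depth_limit."""
    if visited is None:
        visited = set()
    visited.add(node)
    if key in store[node]:
        return store[node][key]
    if depth_limit == 0:
        return None                  # depth limit D reached: report failure upstream
    for neighbor in graph[node]:     # one neighbor at a time, waiting for its answer
        if neighbor in visited:
            continue
        data = dfs_lookup(graph, store, neighbor, key, depth_limit - 1, visited)
        if data is not None:
            store[node][key] = data  # copy the data while relaying it back
            return data
    return None


# Hypothetical overlay: the file lives only on node D.
graph = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A"], "D": ["B"]}
store = {name: {} for name in graph}
store["D"]["k1"] = "payload"
```

After `dfs_lookup(graph, store, "A", "k1", 3)` succeeds, nodes A and B also hold a copy of `k1`, so a later request for the same key is answered closer to the requester.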

Basic Searching Algorithm, Napster: A centralized server keeps information about online users and song locations in a database for quick search. Once the server has returned the location of a song, clients transfer the file peer-to-peer. Legal problem: it ignores copyright. Technical problem: as in any client-server system, the index server is a bottleneck, and search stops working if the index server goes down.
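
The central-index idea can be illustrated with a small sketch. The class and method names below are hypothetical and do not reflect Napster's actual protocol; the point is only that search is a lookup on one server while the file transfer itself happens between peers.

```python
class IndexServer:
    """Toy central index: maps song titles to the peers that share them."""

    def __init__(self):
        self.locations = {}  # song title -> set of peer addresses

    def register(self, peer, songs):
        """Called when a peer comes online and announces its shared songs."""
        for song in songs:
            self.locations.setdefault(song, set()).add(peer)

    def unregister(self, peer):
        """Called when a peer goes offline."""
        for peers in self.locations.values():
            peers.discard(peer)

    def search(self, song):
        """Return the peers that currently share the song."""
        return sorted(self.locations.get(song, ()))


server = IndexServer()
server.register("10.0.0.1", ["a.mp3", "b.mp3"])
server.register("10.0.0.2", ["b.mp3"])
```

Every query goes through `server`, which is why it becomes both the bottleneck and the single point of failure.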

Improving Search Algorithms in Peer-to-Peer Networks: Iterative Deepening, Directed BFS, Local Indices, Routing Indices, NEVRLATE.

Iterative Deepening: Multiple breadth-first searches are initiated with successively larger depth limits until the query is satisfied or the maximum depth has been reached. Example: under policy P(a,b,c), the first search has depth a, the second depth b, and the third depth c.

Iterative Deepening (continued): A source node S first initiates a BFS of depth a. When a node at depth a receives and processes the query, it stores the query temporarily, so the query is frozen at all nodes that are a hops from the source. Meanwhile S receives response messages from the nodes that have processed the query.

Iterative Deepening (continued): After waiting for a predefined time period W, S does nothing more if the query has been satisfied. Otherwise S starts the next iteration, a BFS of depth b, by sending a resend message with a TTL of a. Nodes only forward the resend message, without processing it again, until it reaches the nodes at a hops.

Iterative Deepening (continued): A node at hop a drops the resend message and unfreezes the corresponding query by forwarding it to all of its neighbors with a TTL of b-a. When the message reaches the nodes at hop b, the process continues in a similar fashion. At depth c, the last depth in the policy, the query is not frozen, and S does not initiate another iteration even if the query is not satisfied. Problem?
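
The freeze-and-resume behavior can be sketched as a synchronous simulation. Several things here are assumptions: the waiting period W is not modeled, "satisfied" is taken to mean at least one hit, and the names `iterative_deepening`, `graph`, and `files` are hypothetical. The list `frozen` plays the role of the nodes that hold the frozen query between iterations.

```python
def iterative_deepening(graph, files, source, wanted, policy):
    """Run BFS in stages given by policy; returns (hits, depth_limit_used)."""
    seen = {source}
    frozen = [source]   # nodes currently holding the frozen query
    depth = 0           # hop distance of the frozen nodes from the source
    hits = []
    for limit in policy:
        # Unfreeze: continue the same BFS from the frozen frontier up to the new limit.
        while depth < limit and frozen:
            depth += 1
            reached = []
            for node in frozen:
                for neighbor in graph[node]:
                    if neighbor in seen:
                        continue
                    seen.add(neighbor)
                    reached.append(neighbor)
                    if wanted in files.get(neighbor, ()):
                        hits.append(neighbor)
            frozen = reached
        if hits:
            return hits, limit   # satisfied: the source starts no further iteration
    return hits, policy[-1]      # last depth reached without success


# Hypothetical chain A - B - C - D - E - F.
names = "ABCDEF"
graph = {n: [] for n in names}
for left, right in zip(names, names[1:]):
    graph[left].append(right)
    graph[right].append(left)
```

With the policy `[1, 3, 5]`, a file on node C is found in the second iteration, so nodes D through F are never contacted; a file on node E is found only in the third.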

Directed BFS: A node sends the query to only a subset of its neighbors, chosen because they are likely to return many results quickly. To do so, a node maintains simple statistics on its neighbors, such as the results of past queries or the latency of the connection with each neighbor. From these statistics, rules such as the following can be used to pick the neighbors to send a query to:

Directed BFS (continued): Neighbors that have returned the highest number of results for previous queries. Neighbors whose response messages have taken the lowest average number of hops. Neighbors that have forwarded the largest number of messages. Neighbors that have the shortest message queue.
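
The four selection rules can be sketched as interchangeable sort keys over per-neighbor statistics. The field names, the heuristic labels, and the function `pick_neighbors` are hypothetical choices made for this illustration.

```python
from dataclasses import dataclass


@dataclass
class NeighborStats:
    results_returned: int = 0    # results for previous queries
    hops_total: int = 0          # sum of hop counts over received responses
    responses: int = 0           # number of responses received
    messages_forwarded: int = 0  # messages this neighbor has forwarded to us
    queue_length: int = 0        # current length of its message queue

    @property
    def average_hops(self):
        if self.responses == 0:
            return float("inf")  # no history: ranked last by this rule
        return self.hops_total / self.responses


# Each heuristic maps statistics to a sort key; smaller means better.
HEURISTICS = {
    "most_results": lambda s: -s.results_returned,
    "fewest_hops": lambda s: s.average_hops,
    "most_forwarded": lambda s: -s.messages_forwarded,
    "shortest_queue": lambda s: s.queue_length,
}


def pick_neighbors(stats, heuristic, k=1):
    """Return the k best neighbor names under the chosen heuristic."""
    key = HEURISTICS[heuristic]
    return sorted(stats, key=lambda name: (key(stats[name]), name))[:k]


stats = {
    "B": NeighborStats(results_returned=40, hops_total=30, responses=10,
                       messages_forwarded=500, queue_length=8),
    "C": NeighborStats(results_returned=15, hops_total=8, responses=4,
                       messages_forwarded=900, queue_length=2),
    "D": NeighborStats(),
}
```

Different rules can prefer different neighbors: here `"most_results"` picks B, while `"fewest_hops"` and `"shortest_queue"` do not.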

Local Indices: Each node n maintains an index over the data of all nodes within r hops of itself, where r is a system-wide variable known as the radius of the index. When a node receives a query, it can process the query on behalf of every node within r hops. The query is therefore processed at fewer nodes, which reduces the cost while keeping the same level of satisfaction.

Local Indices (continued): A system-wide policy specifies the depths at which the query should be processed. All nodes at depths not listed in the policy simply forward the query. Example: under policy P(1,5), only nodes at depths 1 and 5 process the query, while nodes at other depths just forward it. Reason: each node has information about all nodes within 4 hops of itself, so the nodes at the skipped depths are already covered.
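
The combination of a radius-r index and a depth policy can be sketched as follows. For brevity the index is rebuilt on demand, whereas a real node would maintain it continuously; the function names and the chain topology are assumptions for illustration.

```python
from collections import deque


def nodes_within(graph, start, radius):
    """Return {node: hop distance} for every node within radius hops of start."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if dist[node] == radius:
            continue
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def build_local_index(graph, files, node, radius):
    """Index the files of all nodes within radius hops: file name -> set of holders."""
    index = {}
    for other in nodes_within(graph, node, radius):
        for name in files.get(other, ()):
            index.setdefault(name, set()).add(other)
    return index


def search(graph, files, source, wanted, policy, radius):
    """Only nodes at the depths listed in policy process the query; others forward."""
    hits, processed = set(), []
    for node, depth in nodes_within(graph, source, max(policy)).items():
        if depth in policy:
            processed.append(node)
            hits |= build_local_index(graph, files, node, radius).get(wanted, set())
    return hits, processed


# Hypothetical chain A - B - C - D - E - F - G with the file on D.
names = "ABCDEFG"
graph = {n: [] for n in names}
for left, right in zip(names, names[1:]):
    graph[left].append(right)
    graph[right].append(left)
files = {"D": {"paper.pdf"}}
```

With policy `(1, 5)` and radius 4, only B and F process the query, yet the file on D is still found because D lies within 4 hops of B.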

Routing Indices: Routing indices allow a node to select the “best” neighbors to send a query to. A routing index is a data structure, with associated algorithms, that, given a query, returns a list of neighbors ranked according to their goodness for the query. The goodness should in general reflect the number of documents in nearby nodes.

Routing Indices (continued): Each node has a local index for quickly finding local documents when a query is received. Each node also has a compound routing index containing, for each path, the number of documents along that path and the number of documents on each topic of interest.

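Routing Indices Example: [Table: for each path (rows A, B, C, D), the number of documents (#docs) and the number of documents with each topic (columns DB, N, T, L). The cell values were not preserved in the transcript.]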
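
How a compound routing index could be turned into a ranking is sketched below. The estimator used here (total documents on a path multiplied by the fraction of documents on each queried topic) is one simple choice and is an assumption, as are the function names and all of the numbers, which are hypothetical since the slide's table values are missing.

```python
def goodness(row, topics):
    """Estimate how many documents along a path match all queried topics."""
    total = row["docs"]
    if total == 0:
        return 0.0
    estimate = float(total)
    for topic in topics:
        # Assumes topics are independent, so the fractions multiply.
        estimate *= row["topics"].get(topic, 0) / total
    return estimate


def rank_neighbors(routing_index, topics):
    """Return neighbor names from best to worst for a query on the given topics."""
    return sorted(routing_index,
                  key=lambda name: (-goodness(routing_index[name], topics), name))


# Hypothetical compound routing index: one row per path (neighbor).
routing_index = {
    "B": {"docs": 100, "topics": {"DB": 20, "N": 0, "T": 10, "L": 30}},
    "C": {"docs": 1000, "topics": {"DB": 0, "N": 300, "T": 0, "L": 50}},
    "D": {"docs": 200, "topics": {"DB": 100, "N": 0, "T": 100, "L": 150}},
}
```

For a query on topics `["DB", "L"]`, path C has the most documents overall but none on DB, so it ranks last; the raw document count alone would be misleading.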

Routing Indices, Maintenance: When a connection is established between two nodes, they exchange their routing indices; each node then updates its own index and sends an update message to its other neighbors. When a node I disconnects from the network and node D detects this, D removes the row for I and sends its new routing index to all of its neighbors so that they can update theirs.
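
The connect and disconnect steps can be sketched as follows. This sketch assumes that what a node sends to a neighbor is an aggregate: its own local counts plus all of its rows except the recipient's row. That aggregation rule, the class `Node`, and the helper names are assumptions; propagating updates further through the network is left out.

```python
from collections import Counter


class Node:
    def __init__(self, name, local_counts):
        self.name = name
        self.local = Counter(local_counts)  # counts for documents stored locally
        self.rows = {}                      # neighbor name -> Counter (one row per path)

    def aggregate_for(self, recipient):
        """Summary of everything reachable through this node, excluding recipient's path."""
        total = Counter(self.local)
        for neighbor, row in self.rows.items():
            if neighbor != recipient:
                total += row
        return total


def connect(a, b):
    """Both ends exchange aggregates and store them as a new row."""
    to_b = a.aggregate_for(b.name)
    to_a = b.aggregate_for(a.name)
    a.rows[b.name] = to_a
    b.rows[a.name] = to_b


def disconnect(node, gone):
    """Drop the row for the departed neighbor; return the updates to send out."""
    node.rows.pop(gone, None)
    return {neighbor: node.aggregate_for(neighbor) for neighbor in node.rows}


d = Node("D", {"docs": 10})
i = Node("I", {"docs": 5})
j = Node("J", {"docs": 7})
connect(d, i)
connect(d, j)
```

When D later detects that I has left, `disconnect(d, "I")` removes I's row and returns the fresh aggregate that D would send to J.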

NEVRLATE: Network-Efficient Vast Resource Lookup At The Edge. Directory servers are organized into a logical 2-dimensional grid, that is, a set of sets of servers. Registration proceeds along one (“horizontal”) dimension and lookup along the other (“vertical”) dimension.

NEVRLATE (continued): Each node is a directory server. Within each set of servers, the vertical cloud, every member can reach every other member. The set of sets of servers is the entire NEVRLATE network.

NEVRLATE (continued): Each host registers its resources and location with one node of each set. When a query arrives, only one set needs to be searched to find all locations that hold the requested information. A host can also register with two nodes in each set for fault tolerance.
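
The register-across, look-up-within pattern can be sketched with a small in-memory grid. The class `Grid`, its parameters, and the random choice of the directory server within each set are assumptions for illustration, not details of NEVRLATE itself.

```python
import random


class Grid:
    """Toy NEVRLATE-style grid: a list of sets, each a list of directory servers."""

    def __init__(self, num_sets, servers_per_set, seed=0):
        # Each directory server is modeled as a dict: resource -> set of locations.
        self.sets = [[{} for _ in range(servers_per_set)] for _ in range(num_sets)]
        self.rng = random.Random(seed)

    def register(self, resource, location):
        """Register with one server in every set (the 'horizontal' dimension)."""
        for servers in self.sets:
            directory = self.rng.choice(servers)
            directory.setdefault(resource, set()).add(location)

    def lookup(self, resource, set_index=None):
        """Ask every server of a single set (the 'vertical' dimension)."""
        if set_index is None:
            set_index = self.rng.randrange(len(self.sets))
        found = set()
        for directory in self.sets[set_index]:
            found |= directory.get(resource, set())
        return found


grid = Grid(num_sets=3, servers_per_set=4)
grid.register("file.iso", "hostA")
grid.register("file.iso", "hostB")
```

Because every set received exactly one registration per host, a lookup confined to any single set returns both locations. The trade-off is registration cost against lookup cost: more sets mean more registrations per host but fewer servers to ask per query.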

Extension: The total rank of a neighbor is the weighted sum of all its key ranks. Assumption: high-rank nodes are always better to access or closer to the resource. Dominating-set marking process, Rule 1 and Rule 2: when removing a node from the dominating set (DS), choose the node with the lower rank instead of the lower UID.

Extension (continued): With the marking process of Wu and Li, the nodes of the connected dominating set have relatively higher connectivity than non-DS nodes. The dominating-set nodes need to hold the resource information and resource locations of their neighbor nodes. During a search, requests are sent only to DS nodes, which reduces cost and traffic while keeping the same level of satisfaction.
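
A rough sketch of the marking step and a rank-based version of Rule 1 follows. It assumes the usual formulation of the marking process, in which a node is marked when it has two neighbors that are not directly connected, and a Rule 1 that unmarks a node whose closed neighborhood is contained in that of a marked neighbor. Comparing by rank with the node name as tie-breaker is this sketch's reading of the proposed extension; Rule 2 is omitted, and the graph and ranks are hypothetical.

```python
def mark(graph):
    """Mark every node that has two neighbors not connected to each other."""
    marked = set()
    for node, neighbors in graph.items():
        nbrs = sorted(neighbors)
        if any(b not in graph[a] for k, a in enumerate(nbrs) for b in nbrs[k + 1:]):
            marked.add(node)
    return marked


def prune_rule1(graph, marked, rank):
    """Unmark v if a marked neighbor u covers v's closed neighborhood and outranks v."""
    keep = set(marked)
    for v in marked:
        closed_v = set(graph[v]) | {v}
        for u in graph[v]:
            covers = closed_v <= (set(graph[u]) | {u})
            if u in marked and covers and (rank[v], v) < (rank[u], u):
                keep.discard(v)   # the lower-ranked node leaves the dominating set
                break
    return keep


# Hypothetical overlay; adjacency stored as sets.
graph = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d", "e"},
    "d": {"b", "c"},
    "e": {"c"},
}
rank = {"a": 1, "b": 5, "c": 9, "d": 2, "e": 3}
```

Here marking yields b and c; since c covers everything b covers and has the higher rank, only c remains, and every other node is adjacent to it.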

Extension (continued): Clustering: when constructing a cluster, choose the node with the highest rank instead of the one with the lowest UID, and choose the node with the lowest rank as the gateway, since it carries low traffic. Consider not only a node's own rank but also the total rank of its neighbors. Max-min ranking: when searching, consider the minimum as well as the maximum of the key index rank.

Extension (continued): Reason: the maximum-rank node could carry high traffic, while the minimum-rank node carries low traffic. Networks are dynamic and resources are dynamic, so this helps to re-rank the network. Example: Glades Rd/Palmetto Park Rd SW  NE

Summary