Scalable P2P Search
Daniel A. Menascé, George Mason University
Presented by Hasan SÖZER, 07.04.2003

Outline
Motivation behind P2P systems
Considerations
Resource-Location Problem
Probabilistic Search Protocol
Protocol Performance
Conclusion
Other P2P Efforts

Motivation (1)
Client-server model
  – Underuses the Internet's bandwidth
  – Puts increasing load on dedicated servers
  – Servers are vulnerable to attacks
  – Single point of failure

Motivation (2)
P2P systems rely on the computing power and storage capacity of individual computers
  – Better utilize bandwidth
  – Distribute load in a self-organizing manner
  – Robust to random attacks
      Especially if the P2P system exhibits the "small world" property (most peers have few links to other peers)
  – Enhance reliability, leading to fault tolerance
      Do not rely on dedicated servers

Motivation (3)
Some application areas of P2P systems
  – Distributed directory systems
  – E-commerce models
  – Web service discovery

Considerations
P2P nodes
  – act as both clients and servers
  – form an application network and route messages, e.g. to locate a service
The design of the communication (messaging) protocols is crucial for efficiency
Naive approaches usually fail
  – e.g. Gnutella's flood routing adds traffic overhead

Resource Location Problem
Deterministic approach: given a resource name, find the node or nodes that manage the resource
  – May not be feasible in a very large network
Probabilistic approach: given a resource name, find with a given probability the node or nodes that manage the resource

Probabilistic Search Protocol (1)
Trades the probability P_f that a resource will be located for performance and scalability
Goal: achieve a probability P_f close to 1 at a much lower cost than in the deterministic case

Probabilistic Search Protocol (2)
Cost measurement
  – Number of messages exchanged
  – Bandwidth used
  – Number of peers contacted
Abbreviations and terms
  – LD: Local Directory
  – DC: Directory Cache
  – N(s): Neighborhood of peer s
      Nodes that are one hop away or in the same LAN

Probabilistic Search Protocol (3)
The LD can be managed without intervention of the P2P system
Each resource has a unique, location-independent global identifier (GUID)
  – Can be computed with a hash function (e.g. SHA-1) or assigned according to the resource managed (e.g. the ISBN in the case of a book)
The DC points to the presumed location of resources managed by other peers (it also contains the mapping from GUID to physical (network) address)
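
As an illustration of the hash-based option, a GUID can be derived from a resource name with SHA-1. This is a minimal sketch: the function name `guid_from_name`, the choice to hash the UTF-8 resource name, and the sample address are assumptions made for the example, not details given on the slides.

```python
import hashlib


def guid_from_name(resource_name: str) -> str:
    """Return a 160-bit GUID (40 hex characters) derived from the name.

    Assumption: the GUID is the SHA-1 digest of the UTF-8 encoded name.
    The slides only say that a hash function such as SHA-1 can be used.
    """
    return hashlib.sha1(resource_name.encode("utf-8")).hexdigest()


# Hypothetical DC entry: GUID -> presumed network address of the managing peer.
directory_cache = {guid_from_name("report.pdf"): "192.0.2.17:6346"}
```

The same name always yields the same GUID regardless of where the resource is stored, which is what makes the identifier location-independent.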

Probabilistic Search Protocol (4)
Figure: a peer-to-peer computing system

Probabilistic Search Protocol (5)
Figure: the algorithm

Probabilistic Search Protocol (6)
Figure: an example scenario (propagation of the SearchRequest message)

Probabilistic Search Protocol (7)
Figure: an example scenario (propagation of the ResourceFound message)

Probabilistic Search Protocol (8)
Two basic operations:
  – SearchRequest(src, res, RevPath, TTL)
  – ResourceFound(src, res, RevPath, v)
      The resource res being searched for by source src was found at peer v
Broadcast probability, p
  – Can be adjusted according to the path traversed
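
The two operations can be sketched as a small simulation. The algorithm itself is not reproduced in this text, so the handling order below (check the LD, then the DC, then forward to each neighbour with probability p) and the DC update along the reverse path are assumptions. The `Peer` class, the `search` helper and the sequential depth-first traversal are likewise illustrative simplifications, and p is kept fixed rather than adjusted along the path.

```python
import random


class Peer:
    def __init__(self, name, local_resources=()):
        self.name = name
        self.ld = set(local_resources)  # LD: resources managed by this peer
        self.dc = {}                    # DC: resource -> presumed managing peer
        self.neighbors = []             # N(s): peers one hop away


def resource_found(src, res, rev_path, v):
    """ResourceFound(src, res, RevPath, v): walk the reverse path back to src.

    Assumption: every peer on the reverse path records v in its DC.
    """
    for peer in reversed(rev_path):
        peer.dc[res] = v


def search_request(peer, src, res, rev_path, ttl, p, rng, contacted):
    """SearchRequest(src, res, RevPath, TTL) handled at `peer`.

    Returns the peer that manages res, or None if the search fails.
    """
    contacted.add(peer.name)
    if res in peer.ld:
        resource_found(src, res, rev_path, peer)
        return peer
    if ttl <= 0:
        return None
    path = rev_path + [peer]
    cached = peer.dc.get(res)
    if cached is not None and cached not in path:
        targets = [cached]  # follow the presumed location from the DC
    else:
        # Forward to each unvisited neighbour with broadcast probability p.
        targets = [n for n in peer.neighbors
                   if n not in path and rng.random() < p]
    for nxt in targets:
        found = search_request(nxt, src, res, path, ttl - 1, p, rng, contacted)
        if found is not None:
            return found
    return None


def search(src, res, ttl, p, rng=None):
    """Start a search at src; return (managing peer or None, contacted names)."""
    rng = rng or random.Random()
    contacted = set()
    found = search_request(src, src, res, [], ttl, p, rng, contacted)
    return found, contacted
```

With p = 1.0 every eligible neighbour receives the request, which resembles flood routing. Smaller values of p contact fewer peers at the price of a lower chance of locating the resource.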

Protocol Performance (1)
Model: 120 peers, average node degree = 5

Protocol Performance (2)
When p reaches 0.6, P_f reaches 0.9 while only 10.5% of the peers are involved in the search
A relatively small p can yield a reasonable P_f for finding a resource at a very low cost
  – Only a small fraction of the peer nodes are involved in the search

Protocol Performance (3)
Implementation on a randomly generated topology
  – Each computer on a LAN simulates multiple peer nodes
  – For N peers, generate a regular graph in which each node has k neighbours
  – Rewire the graph
      If the graph becomes disconnected, restart the process
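
The topology generation steps can be sketched as follows. The slides do not state the rewiring rule, so this example assumes a ring lattice as the regular graph and rewires each edge independently with a given probability, in the style of the Watts-Strogatz construction. All function names are hypothetical.

```python
import random


def ring_lattice(n, k):
    """Regular graph: each node is linked to its k nearest ring neighbours.

    Assumes k is even and k < n.
    """
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for d in range(1, k // 2 + 1):
            j = (i + d) % n
            adj[i].add(j)
            adj[j].add(i)
    return adj


def rewire(adj, prob, rng):
    """With probability prob, replace edge (i, j) by (i, random new node)."""
    n = len(adj)
    for i in range(n):
        for j in sorted(adj[i]):  # snapshot, so new edges are not revisited
            if j > i and rng.random() < prob:
                candidates = [c for c in range(n)
                              if c != i and c not in adj[i]]
                if candidates:
                    new = rng.choice(candidates)
                    adj[i].discard(j)
                    adj[j].discard(i)
                    adj[i].add(new)
                    adj[new].add(i)


def is_connected(adj):
    """Depth-first traversal from an arbitrary node reaches every node."""
    start = next(iter(adj))
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return len(seen) == len(adj)


def generate_topology(n, k, prob, rng=None):
    """Generate, rewire, and restart whenever the graph is disconnected."""
    rng = rng or random.Random()
    while True:
        adj = ring_lattice(n, k)
        rewire(adj, prob, rng)
        if is_connected(adj):
            return adj
```

For example, `generate_topology(30, 4, 0.1)` returns an adjacency map for 30 peers. Rewiring moves edges without adding or removing any, so the average node degree stays at k.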

Protocol Performance (4)
Model: N = 30, k = 4, P(rewiring) = 0.1, size(DC) = 1% of the number of resources, LRU replacement
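
A directory cache with LRU replacement, as used in this model, might look like the sketch below. The class name `DirectoryCache` and its `lookup` and `store` methods are hypothetical, and the slides do not describe how the cache was actually implemented.

```python
from collections import OrderedDict


class DirectoryCache:
    """Bounded DC that evicts the least recently used entry when full."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._entries = OrderedDict()  # GUID -> presumed network address

    def lookup(self, guid):
        """Return the cached address, or None; a hit counts as a use."""
        if guid not in self._entries:
            return None
        self._entries.move_to_end(guid)
        return self._entries[guid]

    def store(self, guid, address):
        """Insert or refresh an entry, evicting the oldest one if needed."""
        self._entries[guid] = address
        self._entries.move_to_end(guid)
        if len(self._entries) > self.capacity:
            self._entries.popitem(last=False)
```

Under the 1% sizing rule, a system with a hypothetical 1000 resources would give each peer a cache built as `DirectoryCache(capacity=10)`.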

Protocol Performance (5)
The maximum value of the curve occurs at p = 0.5
At this value, a resource is found with probability 0.84 within 3.4 hops of the source on average

Conclusion
Issues that have not been addressed:
  – Cache replacement policies
  – Cache invalidation options
  – Provisions to avoid broadcasting a message more than once
Optimization
  – Sending the result directly to the search source, before traversing the path backwards, would improve the response time

Other P2P Efforts
Gnutella (used mainly for file sharing)
  – Uses flood routing to broadcast queries
Gridella (a Gnutella-compatible system)
  – Reduces bandwidth by superimposing a binary tree on top of the P2P network
Freenet (creates a virtual file system)
  – Pools unused disk space across many computers
JXTA (a suite of protocols)
  – Uses specialized peers that can register with each other
  – A middle ground between centralized and decentralized approaches