Published by Rudolph Simon Hines. Modified over 9 years ago.
1
P2P Search COP5711
2
P2P Search Techniques
Centralized P2P systems, e.g., Napster, SETI@home
Decentralized and unstructured P2P systems, e.g., Gnutella
Hybrid (partially decentralized), e.g., Freenet
Structured P2P systems: DHT, CAN
3
P2P Network
A P2P network is an overlay network built on top of a real physical network (e.g., the Internet).
In a P2P network, peers are network nodes connected by virtual or logical links.
A logical link is a path through many physical links in the underlying network.
4
Napster: Publish a File
Users upload their IP address and the music titles they wish to share.
[Figure: peer 192.1.2.3 registers (xyz.mp3, 192.1.2.3) with the Napster server (central catalog)]
5
Napster: Query for a File
Users search for peers from which to download the desired files.
[Figure: a peer asks the central Napster server "xyz.mp3 ?" and receives 192.1.2.3]
6
Napster: Transfer Requested File
File transfer is P2P, using a proprietary protocol.
[Figure: the requesting peer downloads xyz.mp3 directly from peer 192.1.2.3]
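The three Napster steps (publish, query, transfer) can be sketched as a single in-memory catalog. This is only an illustration: the class name `CentralCatalog` and its methods are hypothetical, and the real Napster protocol was proprietary and is not reproduced here.

```python
from collections import defaultdict


class CentralCatalog:
    """Hypothetical stand-in for the central Napster server (not the real protocol)."""

    def __init__(self):
        # title -> set of IP addresses of peers that share that title
        self._index = defaultdict(set)

    def publish(self, title, ip):
        """A peer uploads one (title, IP address) pair."""
        self._index[title].add(ip)

    def query(self, title):
        """Return the IP addresses of all peers sharing `title`."""
        return sorted(self._index.get(title, ()))


catalog = CentralCatalog()
catalog.publish("xyz.mp3", "192.1.2.3")
providers = catalog.query("xyz.mp3")
# The file itself never passes through the catalog: the requester
# contacts one of `providers` directly to transfer it.
```

Every publish and every query goes through the one catalog object, which is why a central directory is both the bottleneck and the single point of failure.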
7
Disadvantages of Centralized Directory
Performance bottleneck
Single point of failure
Can we do it without a directory?
8
Decentralized P2P - Gnutella
No catalog
Pings the network to locate Gnutella peers
File requests are broadcast to peers (flooding, or breadth-first search)
When a provider is located, the file is transferred via HTTP
9
Gnutella: Join the Network
Who are my neighbors?
Peers are hosts at the edge of the Internet
Special peer maintained by Gnutella
Pings the network to locate peers
10
Gnutella: Broadcast Request to Peers
[Figure: a peer sends the query "xyz.mp3 ?" to its neighbors]
11
Gnutella: Flood the Request (Breadth-First Search)
[Figure: the query spreads from peer to peer until one answers "I have it."]
12
Gnutella: Reply with the File (via HTTP)
[Figure: the peer that answered "I have it." sends xyz.mp3 to the requester]
13
Gnutella - Disadvantages
Network flooding causes unnecessary network traffic
Using a TTL limits the flood, but some files might not be found
Alternatives: use ultranodes (or supernodes), or use depth-first search, e.g., Freenet
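Both disadvantages can be seen in a small simulation of a TTL-limited breadth-first flood. This is a sketch under assumptions: the five-peer topology, the function name `flood_query`, and the message-counting rule are made up for illustration and do not model the actual Gnutella wire protocol.

```python
from collections import deque


def flood_query(neighbors, files, start, wanted, ttl):
    """Breadth-first flood from `start`.

    Returns (peers holding `wanted`, number of query copies sent).
    Every copy sent is counted, including duplicates the receiver drops.
    """
    seen = {start}
    frontier = deque([(start, ttl)])
    hits, messages = [], 0
    while frontier:
        peer, remaining = frontier.popleft()
        if wanted in files.get(peer, ()):
            hits.append(peer)
        if remaining == 0:
            continue  # TTL expired: this peer does not forward the query
        for nxt in neighbors[peer]:
            messages += 1
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, remaining - 1))
    return hits, messages


# Hypothetical overlay: A - {B, C} - D - E, with the file only at E.
neighbors = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}
files = {"E": {"xyz.mp3"}}

short_hits, short_cost = flood_query(neighbors, files, "A", "xyz.mp3", ttl=2)
long_hits, long_cost = flood_query(neighbors, files, "A", "xyz.mp3", ttl=3)
```

With `ttl=2` the query costs 6 messages and misses the file at E; with `ttl=3` it finds E at a cost of 9 messages, several of which are redundant copies.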
14
Morpheus, Kazaa
Flooding only the supernodes
[Figure: supernode layer]
15
Using Ultranodes
Queries flood only the network of ultranodes
Other peer nodes are shielded from query traffic
Combines the benefits of centralized and decentralized search
Takes advantage of the heterogeneity in peer capabilities
16
Freenet - Depth-First Search
17
Freenet – File Not Found
The requested file is not found because of a poor routing decision made at peer D.
In this case, the query backs out of the dead end and tries another peer, in depth-first manner.
[Figure label: "I have file X"]
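The back-out-and-retry behaviour can be sketched as a recursive depth-first search. This is a simplification under assumptions: real Freenet chooses the next peer by key closeness, whereas here the order of each neighbor list stands in for that routing decision, and the topology and function name are hypothetical.

```python
def depth_first_query(neighbors, files, peer, wanted, visited=None):
    """Return the path from `peer` to a peer holding `wanted`, or None."""
    if visited is None:
        visited = set()
    visited.add(peer)
    if wanted in files.get(peer, ()):
        return [peer]
    for nxt in neighbors[peer]:
        if nxt in visited:
            continue  # never revisit a peer already on this query's trail
        path = depth_first_query(neighbors, files, nxt, wanted, visited)
        if path is not None:
            return [peer] + path
    return None  # dead end: the caller backs out and tries its next neighbor


# Hypothetical overlay: A tries B first, which leads to the dead end D.
neighbors = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "D": ["B"],
    "C": ["A", "E"],
    "E": ["C"],
}
files = {"E": {"X"}}

path = depth_first_query(neighbors, files, "A", "X")
```

The query first descends A, B, D, backs out of the dead end, and then succeeds along A, C, E. Only one copy of the query is in flight at a time, in contrast to flooding.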
18
Using Distributed Directory
Data objects are everywhere
Distribute subsets of the data directory among peers
If we can find the relevant sub-directory, we can locate the data object
[Figure: directory, sub-directory, data objects]
19
How to Bound Search Space?
Basic idea: hashing
Objects have hash keys: object “y” has hash key H(y)
Peer nodes also have hash keys in the same hash space: peer “x” has hash key H(x)
Peer x joins the P2P network with Join(H(x)); object y is published with Publish(H(y))
Place location information about an object at the peer with the closest hash key (i.e., a distributed directory)
20
Viewed as a Distributed Hash Table
The hash table spans the keys 0 to 2^128 - 1
Each peer node is responsible for a range of the hash table, according to the peer hash key
Location information about an object is placed in the peer with the closest key (information redundancy)
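The placement rule (same hash space for peers and objects, closest key wins) can be sketched as follows. The assumptions are labeled in the code: the slides do not name a hash function, so SHA-256 truncated to 128 bits is used to match the 0 to 2^128 - 1 space, and "closest" is taken to mean smallest absolute numeric distance. Concrete DHTs define closeness differently.

```python
import hashlib


def hash_key(name):
    """Map a peer or object name into the 128-bit hash space.

    Assumption: SHA-256 truncated to 16 bytes; the slides do not specify H.
    """
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:16], "big")


def responsible_peer(peer_names, object_name):
    """Pick the peer whose hash key is numerically closest to the object's key.

    Assumption: closeness is absolute difference; real systems may use
    ring distance or XOR distance instead.
    """
    target = hash_key(object_name)
    return min(peer_names, key=lambda p: abs(hash_key(p) - target))


peers = ["peer-a", "peer-b", "peer-c", "peer-d"]
holder = responsible_peer(peers, "xyz.mp3")
```

Because every peer applies the same function to the same names, any peer can compute which peer holds the directory entry for `xyz.mp3` without asking a central server.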
21
How to Find an Object?
Look for the peer with the corresponding peer hash key
A peer knows its logical neighbors
Find peer X based on multihop routing
X knows who has the object
[Figure: hash table 0 to 2^128 - 1; peer node X holds the entry "Peer Y has the file"]
22
Distributed Hash Table (DHT) in action
[Figure: peers, each with a K, V table]
23
DHT in action
[Figure: peers, each with a K, V table]
24
DHT in action: put()
A peer wants to share a file and calls insert(K1, V1)
Operation: route the message “I have the file” to the node holding key K1
25
DHT in action: put()
Operation: take the key as input; route messages to the node holding the key
[Figure: (K1, V1) is stored at that node]
26
DHT in action: get()
retrieve(K1)
Operation: retrieve message V1 at the node holding key K1
27
DHT in action
Retrieve the file according to V1
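The put() and get() steps above can be sketched with a toy table in which V1 is the location of the peer that has the file. This is an illustration only: the class `ToyDHT` is hypothetical, the hash function and the closest-key rule are assumptions, and message routing between peers is abstracted away into a direct lookup.

```python
import hashlib


def hash_key(name):
    """Assumed hash: SHA-256 truncated to 128 bits (not specified in the slides)."""
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:16], "big")


class ToyDHT:
    """Hypothetical DHT: each peer owns a small local (K, V) table."""

    def __init__(self, peer_names):
        self.tables = {name: {} for name in peer_names}

    def _holder(self, key):
        # Assumed rule: the peer whose hash key is numerically closest to H(key).
        target = hash_key(key)
        return min(self.tables, key=lambda p: abs(hash_key(p) - target))

    def put(self, key, value):
        """Store (key, value) at the node holding `key`; return that node."""
        holder = self._holder(key)
        self.tables[holder][key] = value
        return holder

    def get(self, key):
        """Fetch the value from the node holding `key`, or None if absent."""
        return self.tables[self._holder(key)].get(key)


dht = ToyDHT(["peer-a", "peer-b", "peer-c"])
holder = dht.put("xyz.mp3", "192.1.2.3")  # "I have the file"
location = dht.get("xyz.mp3")             # V1: where to fetch the file
```

The table stores only location information; the file itself is then retrieved from the address in `location`.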
28
Still Flooding
The network is still flooded, although intermediate nodes do not need to search
Can we avoid flooding?
29
CAN – Content Addressable Network
Each peer is responsible for one zone, i.e., it stores all (key, value) pairs of the zone
Each peer knows the neighbors of its zone
Peers are assigned to zones at random at startup; a zone is split if it is already occupied
Dimension-ordered multihop routing
30
CAN: Object Publishing
node I::publish(K,V)
31
CAN: Object Publishing
node I::publish(K,V)
(1) a = h_x(K), giving the coordinate x = a
32
CAN: Object Publishing
node I::publish(K,V)
(1) a = h_x(K), b = h_y(K), giving the point (x = a, y = b)
33
CAN: Object Publishing
node I::publish(K,V)
(1) a = h_x(K), b = h_y(K)
(2) route (K,V) -> J
34
CAN: Object Publishing
node I::publish(K,V)
(1) a = h_x(K), b = h_y(K)
(2) route (K,V) -> J
(3) J stores (K,V)
35
CAN: Object Retrieval
node I::retrieve(K)
(1) a = h_x(K), b = h_y(K)
(2) route “retrieve(K)” to J, which is in charge of (a,b) and holds (K,V)
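The publish and retrieve steps can be sketched on a small fixed grid. Several simplifications are assumed here and are not from the slides: the space is a 4 x 4 grid of equal zones rather than dynamically split zones, there is no wrap-around at the edges, h_x and h_y are built from SHA-256, and the names `ToyCAN`, `route`, and `GRID` are hypothetical.

```python
import hashlib

GRID = 4  # assumption: 4 x 4 equal zones, one peer per zone


def _h(label, key):
    """Assumed per-dimension hash onto a grid coordinate."""
    digest = hashlib.sha256((label + ":" + key).encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % GRID


def h_x(key):
    return _h("x", key)


def h_y(key):
    return _h("y", key)


def route(src, dst):
    """Dimension-ordered routing: fix x first, then y, one neighbor per hop."""
    x, y = src
    path = [(x, y)]
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path


class ToyCAN:
    def __init__(self):
        # zone coordinate -> (key, value) pairs stored by the peer owning that zone
        self.zones = {(x, y): {} for x in range(GRID) for y in range(GRID)}

    def publish(self, src, key, value):
        dst = (h_x(key), h_y(key))    # (1) a = h_x(K), b = h_y(K)
        path = route(src, dst)        # (2) route (K,V) -> J
        self.zones[dst][key] = value  # (3) J stores (K,V)
        return path

    def retrieve(self, src, key):
        dst = (h_x(key), h_y(key))
        route(src, dst)               # the request travels hop by hop to J
        return self.zones[dst].get(key)


can = ToyCAN()
publish_path = can.publish((0, 0), "xyz.mp3", "192.1.2.3")
value = can.retrieve((3, 3), "xyz.mp3")
```

Each hop moves to an adjacent zone, so a message touches only the zones on one path instead of flooding the whole network.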
36
Maintenance
Inform neighbors that you are alive at a discrete time interval t
If a neighbor does not send an alive message within time t, take over its zone
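The maintenance rule can be sketched as a timeout check over the last alive message received from each neighbor. The function names, the timestamps, and the zone bookkeeping below are hypothetical; the slides give only the rule itself, not how a takeover is coordinated when several neighbors notice the failure.

```python
def find_failed_neighbors(last_alive, now, t):
    """Return neighbors whose last alive message is older than `t`."""
    return sorted(n for n, seen in last_alive.items() if now - seen > t)


def take_over(zones, me, failed):
    """Add the failed neighbor's zones to `me` and drop the failed peer."""
    zones[me].extend(zones.pop(failed))


# Hypothetical state at peer I: time of the last alive message per neighbor.
last_alive = {"J": 10.0, "K": 4.0}
zones = {"I": [(0, 0)], "J": [(1, 0)], "K": [(0, 1)]}

failed = find_failed_neighbors(last_alive, now=12.0, t=5.0)
for neighbor in failed:
    take_over(zones, "I", neighbor)
```

In this example K has been silent for 8 time units, which exceeds t = 5, so peer I takes over K's zone, while J stays in place.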
37
P2P Benefits
Efficient use of resources: use unused bandwidth, storage, and processing power at the edge of the network
Scalability: consumers of resources also donate resources
Reliability: replicas, geographic distribution, no single point of failure
Ease of administration: self-organized nodes, built-in reliability and load balancing
38
Some Prototypes at UCF
iSEE (Internet-scale Sensor Exploration Environment): publishing real-time sensor data; browsing and querying real-time sensor data
P2P video streaming for VoD and live broadcast applications