Published by Gervase Hines. Modified over 8 years ago.
1
P2P Search
COP6731 Advanced Database Systems
2
P2P Computing
Powerful personal computers
Share computing resources
P2P computing advantages:
  Shared infrastructure costs
  Highly scalable
  No single point of failure (SPOF)
  Censorship resistance
3
P2P Search Techniques
Centralized P2P systems, e.g., Napster, SETI@home
Decentralized and unstructured P2P systems, e.g., Gnutella
Hybrid (partially decentralized) systems, e.g., Freenet
Structured P2P systems
  DHT systems (CAN / Chord / Pastry / Tapestry)
  Skip-list based systems
4
Napster
MP3 file sharing with a centralized catalog
Peers hold the files
Napster Inc.'s servers hold the catalog
File transfer is P2P, using a proprietary protocol
5
Napster: Publish a File
Users upload their IP address and the music titles they wish to share.
[Figure: peer 192.1.2.3 sends the entry (xyz.mp3, 192.1.2.3) to the central Napster server]
6
Napster: Query for a File
Users search the central server for peers that hold the desired files.
[Figure: a peer sends the query "xyz.mp3 ?" to the central Napster server, which answers with 192.1.2.3]
7
Napster: Transfer Requested File
File transfer is P2P, using a proprietary protocol.
[Figure: the requesting peer fetches xyz.mp3 directly from peer 192.1.2.3, bypassing the central Napster server]
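The publish and query steps on the Napster slides can be sketched in a few lines of Python. This is only an illustration of a centralized catalog: the class name `CentralDirectory` and its methods are hypothetical, and Napster's proprietary wire protocol is not modeled. The point it shows is that the server stores only (title, address) entries, while the file itself stays on the peer.

```python
from collections import defaultdict


class CentralDirectory:
    """Hypothetical stand-in for the central catalog server."""

    def __init__(self):
        # title -> set of peer addresses that announced that title
        self.catalog = defaultdict(set)

    def publish(self, peer_addr, titles):
        """A peer uploads its address and the titles it is willing to share."""
        for title in titles:
            self.catalog[title].add(peer_addr)

    def query(self, title):
        """Return the addresses of peers holding the title (never the file itself)."""
        return sorted(self.catalog.get(title, set()))


directory = CentralDirectory()
directory.publish("192.1.2.3", ["xyz.mp3"])

providers = directory.query("xyz.mp3")  # peers to contact for the transfer
missing = directory.query("abc.mp3")    # nobody announced this title
```

After the query, the requester would contact one of `providers` directly; every lookup still passes through the single `directory` object, which is the bottleneck the next slide points out.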
8
Disadvantages of a Centralized Directory
Performance bottleneck
Single point of failure
Can we do it without a directory?
9
Gnutella
No catalog
Pings the network to locate Gnutella peers
File requests are broadcast to peers (flooding, or breadth-first search)
When a provider is located, the file is transferred via HTTP
10
Gnutella: Issue a Request
[Figure: a peer issues the query "xyz.mp3 ?"]
11
Gnutella: Flood the Request
12
Gnutella: Reply with the File
[Figure: the peer holding xyz.mp3 replies]
13
Gnutella - Disadvantages
Network flooding: unnecessary network traffic
A TTL limits the flood, so some files might not be found
Alternatives:
  Ultranodes (or supernodes)
  Depth-first search, e.g., Freenet
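Both disadvantages can be seen in a small simulation. The sketch below is an assumption-laden toy, not the Gnutella protocol: the six-peer topology, the function name `flood_query`, and the message accounting (one message per forwarded copy, duplicates dropped on arrival) are all made up for illustration.

```python
from collections import deque


def flood_query(graph, files, start, wanted, ttl):
    """Breadth-first flood from `start`; returns (providers, messages_sent)."""
    seen = {start}
    frontier = deque([(start, ttl)])
    providers, messages = [], 0
    while frontier:
        peer, hops_left = frontier.popleft()
        if wanted in files.get(peer, ()):
            providers.append(peer)
        if hops_left == 0:
            continue  # TTL exhausted: the query is not forwarded further
        for neighbor in graph[peer]:
            messages += 1  # every forwarded copy costs one message
            if neighbor not in seen:  # duplicate copies are dropped on arrival
                seen.add(neighbor)
                frontier.append((neighbor, hops_left - 1))
    return providers, messages


# Hypothetical overlay: the file lives three hops away from peer A.
graph = {
    "A": ["B", "C"],
    "B": ["A", "C", "D"],
    "C": ["A", "B", "E"],
    "D": ["B", "F"],
    "E": ["C", "F"],
    "F": ["D", "E"],
}
files = {"F": {"xyz.mp3"}}

near = flood_query(graph, files, "A", "xyz.mp3", ttl=2)  # flood stops short of F
far = flood_query(graph, files, "A", "xyz.mp3", ttl=3)   # flood reaches F
```

With `ttl=2` the query never reaches peer F, so the file is reported as not found even though it exists; raising the TTL finds it but sends more messages, many of them redundant.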
14
Morpheus, Kazaa Supernode Layer
15
Using Ultranodes
Queries flood only the network of ultranodes
Other peer nodes are shielded from query traffic
Combines the benefits of centralized and decentralized search
Takes advantage of the heterogeneity in peer capabilities
16
Freenet - Depth-First Search
17
Freenet – File not Found
The requested file is not found because of a poor routing decision made at peer D.
In this case, the query backs out of the dead end and tries another peer in depth-first manner.
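The back-out behavior can be illustrated with a plain depth-first search. This is a simplified sketch: real Freenet chooses the next peer by key closeness, whereas here the hypothetical function `depth_first_query` just tries neighbors in list order, and the five-peer topology is invented so that peer D is a dead end.

```python
def depth_first_query(graph, files, start, wanted, ttl):
    """Return the path to a provider, or None. Tries one neighbor at a time."""
    visited = set()

    def visit(peer, hops_left, path):
        visited.add(peer)
        path = path + [peer]
        if wanted in files.get(peer, ()):
            return path
        if hops_left == 0:
            return None
        for neighbor in graph[peer]:
            if neighbor in visited:
                continue  # never hand the query back to a peer that already saw it
            found = visit(neighbor, hops_left - 1, path)
            if found is not None:
                return found
        return None  # dead end: the caller backs out and tries its next neighbor

    return visit(start, ttl, [])


# Hypothetical overlay: B tries D first (a poor choice), then falls back to C.
graph = {
    "A": ["B"],
    "B": ["A", "D", "C"],
    "C": ["B", "E"],
    "D": ["B"],
    "E": ["C"],
}
files = {"E": {"xyz.mp3"}}

path = depth_first_query(graph, files, "A", "xyz.mp3", ttl=4)
```

The returned path omits D because the query backed out of it; only one copy of the query is in flight at any time, which is the contrast with flooding.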
18
Structured P2P Systems
DHT-based
  Chord / Pastry / Tapestry: hash-based into a single-dimensional space
  CAN: hash-based into a multi-dimensional space
  P-Grid: hash-based into a virtual binary search tree
Skip-list based
  Skip graph / SkipNet
Index tree-based
  BATON
19
DHT Design Goals
An “overlay” network with:
  Flexible mapping of keys to physical nodes
  Data independence
  Small network diameter
  Small degree (fan-out)
  Local routing decisions
  Robustness to churn
  Routing flexibility
  Proximity
A “storage” or “memory” mechanism with:
  No guarantees on persistence
  Maintenance via soft state
20
Metrics
Searching / lookup
  Number of hops in searching
  Number of messages
  Database-related metrics:
    Total disk I/O
    Response time
    Accuracy
Maintenance
  Number of hops
  Number of messages
21
How to Bound the Search Space?
Work on placement!
[Figure: the network]
22
Basic Idea - Hashing
Objects have hash keys: object “y” has hash key H(y)
Peer nodes also have hash keys in the same hash space: peer “x” has hash key H(x)
Peer x joins the P2P network with Join(H(x)); object y is published with Publish(H(y))
Place each object on the peer with the closest hash key
23
Viewed as a Distributed Hash Table
Hash table: 0 to 2^128 - 1
Each peer node is responsible for a range of the hash table, according to the peer's hash key
Objects are placed on the peer with the closest key
Note that peers sit at the edges of the Internet
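A minimal sketch of this placement rule follows. Several choices here are assumptions made only for illustration: the hash H is taken to be SHA-256 truncated to 128 bits, "closest" is read as smallest absolute numeric distance (a ring distance would work equally well), and the peer names are invented.

```python
import hashlib

SPACE_BITS = 128  # the slides' hash table spans 0 .. 2^128 - 1


def h(name):
    """Map a string into the 128-bit key space (truncated SHA-256, an assumption)."""
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    return int.from_bytes(digest[: SPACE_BITS // 8], "big")


def closest_peer(peer_names, object_name):
    """Pick the peer whose hash key is numerically closest to the object's key."""
    key = h(object_name)
    return min(peer_names, key=lambda peer: abs(h(peer) - key))


peers = ["peer-a", "peer-b", "peer-c", "peer-d"]  # hypothetical peer names
owner = closest_peer(peers, "xyz.mp3")
```

Because peers and objects share one hash space, any node can recompute `owner` from the object name alone; no catalog has to be consulted.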
24
How to Find an Object?
Hash table: 0 to 2^128 - 1
Simplest idea: everyone knows everyone else, so one hop finds the object
But we want each peer to keep only a few entries!
25
Using Distributed Hash Table (DHT)
Hash table: 0 to 2^128 - 1
A peer only needs to know its logical neighbors
Search is based on multihop routing
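The trade-off between routing-table size and hop count can be made concrete with the smallest possible neighbor set. In the sketch below each peer knows only its successor on a ring, and each peer owns the key range that ends at its own id; both are assumptions chosen for brevity (the slides do not fix a particular scheme), and the ids are tiny integers rather than 128-bit keys.

```python
def route(ring, start, key):
    """Walk successor links until reaching the peer whose range covers `key`.

    `ring` is a sorted list of peer ids. Peer ring[i] owns the keys in
    (ring[i-1], ring[i]], wrapping around at the top of the key space.
    Returns the list of peers visited, starting with `start`.
    """

    def owns(i):
        lo, hi = ring[i - 1], ring[i]
        if lo < hi:
            return lo < key <= hi
        return key > lo or key <= hi  # this range wraps past zero

    i = ring.index(start)
    path = [ring[i]]
    while not owns(i):
        i = (i + 1) % len(ring)  # forward to the only neighbor we know
        path.append(ring[i])
    return path


ring = [10, 40, 70, 100]  # hypothetical peer ids
path = route(ring, 10, 65)
hops = len(path) - 1
```

With a single neighbor per peer the lookup costs a number of hops that grows linearly with the number of peers; structured systems such as Chord or CAN keep a few more neighbors per peer to shorten the path.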
26
DHT in action
[Figure: peers, each holding a (K, V) table]
28
DHT in action
Operation: take a key as input; route messages to the node holding that key
29
DHT in action: put()
insert(K1, V1)
Operation: take a key as input; route messages to the node holding that key
30
DHT in action: put()
insert(K1, V1)
Operation: take a key as input; route messages to the node holding that key
31
DHT in action: put()
[Figure: (K1, V1) now appears in the (K, V) table of the responsible node]
Operation: take a key as input; route messages to the node holding that key
32
DHT in action: get()
retrieve(K1)
Operation: take a key as input; route messages to the node holding that key
33
DHT in action
retrieve(K1)
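The put() and get() operations shown on these slides can be sketched as a toy, single-process DHT. Everything named here is hypothetical: the class `ToyDHT`, the 16-bit id space, and the rule that a key belongs to the first node id at or after the key's id. Multihop routing is collapsed into a direct owner lookup so that the example stays focused on the interface.

```python
import bisect
import hashlib


class ToyDHT:
    """In-memory illustration of put/get; not a networked implementation."""

    def __init__(self, node_ids):
        self.ids = sorted(node_ids)
        self.store = {node: {} for node in self.ids}  # per-node (K, V) table

    def _key_id(self, key):
        # 16-bit id space, chosen only to keep the numbers readable
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:2], "big")

    def _owner(self, key):
        # First node id >= key id, wrapping to the smallest id at the top
        i = bisect.bisect_left(self.ids, self._key_id(key))
        return self.ids[i % len(self.ids)]

    def put(self, key, value):
        self.store[self._owner(key)][key] = value

    def get(self, key):
        return self.store[self._owner(key)].get(key)


dht = ToyDHT([1000, 20000, 40000, 60000])  # hypothetical node ids
dht.put("K1", "V1")
value = dht.get("K1")
holders = [node for node, table in dht.store.items() if "K1" in table]
```

Both operations take only the key as input and arrive at the same node, which is why a later `get("K1")` finds what `put("K1", "V1")` stored without any central index.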
34
CAN – Content Addressable Network
Each peer is responsible for one zone, i.e., it stores all (key, value) pairs of that zone
Each peer knows the neighbors of its zone
Peers are randomly assigned to zones at startup
Dimension-ordered multihop routing
35
CAN: Object Publishing
node I::publish(K,V)
36
CAN: Object Publishing
node I::publish(K,V)
(1) a = h_x(K), which fixes the coordinate x = a
37
CAN: Object Publishing
node I::publish(K,V)
(1) a = h_x(K), b = h_y(K), which fix the point (x, y) = (a, b)
38
CAN: Object Publishing
node I::publish(K,V)
(1) a = h_x(K), b = h_y(K)
(2) route (K,V) -> J
39
CAN: Object Publishing
node I::publish(K,V)
(1) a = h_x(K), b = h_y(K)
(2) route (K,V) -> J
(3) J stores (K,V)
40
CAN: Object Retrieval
node I::retrieve(K)
(1) a = h_x(K), b = h_y(K)
(2) route “retrieve(K)” to J, the node in charge of (a,b), which holds (K,V)
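The publish and retrieve steps can be sketched in two dimensions as follows. This is a toy under stated assumptions, not the CAN protocol: the space is a fixed 4 x 4 grid of equal zones with one peer per zone (real CAN splits zones as peers join), the space does not wrap around, and h_x and h_y are modeled as salted SHA-256 hashes. All function names are hypothetical.

```python
import hashlib

GRID = 4  # hypothetical: 4 x 4 equal zones, one peer per zone


def _h(key, salt):
    """Hash a key to a coordinate in [0, 1); the salt separates h_x from h_y."""
    digest = hashlib.sha256((salt + key).encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") / 2 ** 32


def point_for(key):
    """Step (1): a = h_x(K), b = h_y(K)."""
    return _h(key, "x:"), _h(key, "y:")


def zone_of(point):
    """Grid cell (zone) that contains the point (a, b)."""
    a, b = point
    return int(a * GRID), int(b * GRID)


def route(src, dst):
    """Dimension-ordered route: correct x first, then y, one neighbor per hop."""
    x, y = src
    path = [(x, y)]
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path


storage = {}  # zone -> {key: value}, standing in for each peer's local store


def publish(src_zone, key, value):
    """Steps (2) and (3): route (K,V) to zone J, where it is stored."""
    path = route(src_zone, zone_of(point_for(key)))
    storage.setdefault(path[-1], {})[key] = value
    return path


def retrieve(src_zone, key):
    """Route the request to the zone in charge of (a, b) and read the value."""
    path = route(src_zone, zone_of(point_for(key)))
    return storage.get(path[-1], {}).get(key), path


pub_path = publish((0, 0), "K", "V")
value, get_path = retrieve((3, 3), "K")
```

Publisher and retriever start in different zones but hash the same key to the same point, so both routes end at the same zone J; each hop moves to an adjacent zone, which is all a peer needs to know about.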
41
Some Research Topics
Content-based image retrieval in P2P
Location management in P2P
Security considerations for DHT
P2P backup
Wireless P2P