1
Peer-To-Peer Data Management
Hector Garcia-Molina
ICDE Conference, February 28, 2002
2
What is P2P?
pastry, CAN, jxta, fiorano, napster, freenet, united devices, open cola, aim, ocean store, netmeeting, gnutella, farsite, icq, morpheus, ebay, limewire, bearshare, uddi, groove, jabber, popular power, kazaa, tapestry, process tree, mojo nation, chord
3
Napster join query answer get file central index ...
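The join / query / answer / get file flow on this slide can be sketched as follows. This is a minimal illustration, not Napster's actual protocol: the class `CentralIndex`, its method names, and the peer and file names are all hypothetical. The point it shows is that the central site stores only the index, while the file itself is fetched directly from a peer.

```python
from collections import defaultdict


class CentralIndex:
    """Hypothetical central index: maps a file name to the peers that hold it."""

    def __init__(self):
        self.holders = defaultdict(set)

    def join(self, peer, files):
        # A joining peer reports the files it is willing to share.
        for name in files:
            self.holders[name].add(peer)

    def query(self, name):
        # The answer lists peer names only; the index never stores file contents.
        return sorted(self.holders.get(name, set()))


index = CentralIndex()
index.join("peer-a", ["song1", "song2"])
index.join("peer-b", ["song2"])

# The querying peer would now fetch the file directly from one of these peers.
answer = index.query("song2")
```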
4
Gnutella query
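Gnutella has no central index, so a query is flooded from neighbor to neighbor. The sketch below simulates that on a small in-memory graph, assuming a time-to-live (TTL) hop limit and duplicate suppression. The function `flood_query` and the example topology are hypothetical, and the simulation simplifies the real protocol (for example, it also counts messages sent back toward the sender).

```python
from collections import deque


def flood_query(graph, files, start, wanted, ttl):
    """Breadth-first flood from `start`; returns (hits, messages_sent)."""
    seen = {start}
    frontier = deque([(start, ttl)])
    hits, messages = [], 0
    while frontier:
        node, remaining = frontier.popleft()
        if wanted in files.get(node, ()):
            hits.append(node)
        if remaining == 0:
            continue  # TTL exhausted: do not forward any further
        for neighbor in graph[node]:
            messages += 1  # every forwarded copy costs one message
            if neighbor not in seen:  # duplicate copies are dropped on arrival
                seen.add(neighbor)
                frontier.append((neighbor, remaining - 1))
    return hits, messages


# Hypothetical four-node network in which only node "d" holds the file.
graph = {"a": ["b", "c"], "b": ["a", "d"], "c": ["a", "d"], "d": ["b", "c"]}
files = {"d": {"song"}}
hits, messages = flood_query(graph, files, "a", "song", ttl=2)
```

Even in this tiny network most of the messages are redundant, which is the cost that the search techniques later in the talk try to reduce.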
5
Morpheus ... ... ... ... super peer ... ...
6
satellite dish raw data chunk analyzed data central site ...
7
Lockss D3 D1 library D library A D2 library C library B library E
8
PeerCast Stanford source after: before: Stanford source
9
What is a P2P System?
Multiple sites (at edge)
Distributed resources
Sites are autonomous (different owners)
Sites are both clients and servers
Sites have equal functionality
P2P Purity
10
P2P is BAD IDEA!! Distribution is expensive!
Specialized functionality is good!
11
Example: Distributed Data Management
Distribution is expensive
If you must distribute:
  build centralized directory, index
  use backups for reliability
  for replicated data, use primary copy
12
Computational Efficiency is NOT Main Goal
Main driving force in a P2P system:
  exploiting existing (often free) resources
  sharing costs among many
  legal protection
  autonomy
  anonymity
13
Should We Do P2P Research?
Should we help people break the law? Analogy: Should we develop pillows, knives, hammers, drugs, bath tubs, cars, airplanes, ... ??
14
Should We Do P2P Research?
YES: P2P not exclusively for breaking law
  Remember the VCR
YES: P2P can liberate us from culture “plantation owners” (Lessig)
15
Is “Free Culture” Feasible?
Example: Legal texts
Can we afford it?
  economic activity
  rules of the game today
16
Should DB community work on P2P?
YES
17
P2P Challenges
Easier to list NON-Research-Topics:
  Color schemes for P2P Nodes
  Impact of P2P on Moroccan 15th Century Literature
18
P2P Challenges
  Search
  Resource Management
  Security & Privacy
19
Search Taxonomy
[Figure: taxonomy chart. Axes: type of search (lookup, content queries) and scope of index (single site, regional, global). Labels on the chart: freenet, CAN, partial, replicated SP, gnutella, morpheus, napster, routing.]
20
Index Implementation Taxonomy
[Figure: taxonomy chart. Axes: nature of index (centralized, distributed, P2P) and whether index location is correlated with content location (yes, no). Labels on the chart: routing, replicated SP, partial, freenet, gnutella, morpheus, napster, CAN.]
21
Content Addressable Network (CAN)
Nodes 1 Data 2
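The core idea of a content addressable network is that both nodes and data live in the same coordinate space: each node owns a zone, and a key is hashed to a point whose zone owner stores the data. The sketch below assumes a two-dimensional unit square split into four fixed quadrants. The node names, the zone layout, and the helpers `key_to_point` and `owner` are hypothetical, and neighbor-to-neighbor routing and zone splitting are left out.

```python
import hashlib


def key_to_point(key):
    """Hash a key to a point in the unit square [0, 1) x [0, 1)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Keep 53 bits per coordinate so the float result stays strictly below 1.0.
    x = (int.from_bytes(digest[:8], "big") >> 11) / 2**53
    y = (int.from_bytes(digest[8:16], "big") >> 11) / 2**53
    return x, y


def owner(zones, point):
    """Return the node whose zone (x0, y0, x1, y1) contains the point."""
    x, y = point
    for node, (x0, y0, x1, y1) in zones.items():
        if x0 <= x < x1 and y0 <= y < y1:
            return node
    raise ValueError("zones do not cover the point")


# Four hypothetical nodes, each owning one quadrant of the space.
zones = {
    "n1": (0.0, 0.0, 0.5, 0.5),
    "n2": (0.5, 0.0, 1.0, 0.5),
    "n3": (0.0, 0.5, 0.5, 1.0),
    "n4": (0.5, 0.5, 1.0, 1.0),
}
home = owner(zones, key_to_point("song2"))
```

Because the hash is deterministic, any node can compute where a key lives without consulting a central directory.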
22
Can We Improve Flooding?
[Figure: taxonomy chart. Axes: nature of index (centralized, distributed, P2P) and whether index location is correlated with content location (yes, no). Labels on the chart: routing, replicated SP, partial, freenet, gnutella, morpheus, napster, CAN.]
23
Directed BFS in Gnutella
? ... query
Heuristics for Selecting Direction
>RES: Returned most results
<TIME: Shortest satisfaction time
<HOPS: Min hops for results
>MSG: Sent us most messages (all types)
<QLEN: Shortest queue
<LAT: Shortest latency
>DEG: Highest degree
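Directed BFS forwards a query to a chosen subset of neighbors rather than to all of them. A minimal sketch of the selection step follows, assuming each node keeps per-neighbor statistics. The function `pick_neighbors`, the statistics table, and its values are hypothetical; the `>` and `<` prefixes follow the notation on the slide (prefer the largest or the smallest value).

```python
def pick_neighbors(stats, heuristic, k):
    """Return the k neighbors ranked best under one heuristic such as '>RES'."""
    direction, field = heuristic[0], heuristic[1:]
    if direction not in ("<", ">"):
        raise ValueError("heuristic must start with '<' or '>'")
    # '>' ranks the largest value first, '<' ranks the smallest value first.
    ranked = sorted(stats, key=lambda n: stats[n][field], reverse=(direction == ">"))
    return ranked[:k]


# Hypothetical statistics gathered from past traffic with each neighbor.
stats = {
    "n1": {"RES": 40, "LAT": 120, "DEG": 3},
    "n2": {"RES": 5, "LAT": 30, "DEG": 8},
    "n3": {"RES": 12, "LAT": 75, "DEG": 5},
}
by_results = pick_neighbors(stats, ">RES", 2)
by_latency = pick_neighbors(stats, "<LAT", 1)
```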
24
How Does One Evaluate? Live Gnutella?
Use real Gnutella as “laboratory”
25
Time to Satisfaction for Directed BFS
26
Routing Index
[Figure: nodes A, B, C, and D, each holding a routing index with per-neighbor document counts for the topics AI and DB; a query Q(DB) is routed using these counts.]
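A routing index lets a node forward a query to the neighbor most likely to lead to matching documents, instead of flooding. The sketch below is one simple way to rank neighbors, assuming each index row stores the total number of documents reachable through a neighbor plus a count per topic, and assuming topics are independent. The function names and all counts are hypothetical and are not the figures from the slide.

```python
def goodness(entry, topics):
    """Estimate how many documents reachable via a neighbor match all topics.

    entry: {"total": documents reachable, "<topic>": documents on that topic}
    Assumes topics are statistically independent, which is a simplification.
    """
    total = entry["total"]
    if total == 0:
        return 0.0
    estimate = float(total)
    for topic in topics:
        estimate *= entry.get(topic, 0) / total
    return estimate


def best_neighbor(routing_index, topics):
    """Pick the neighbor with the highest estimated number of matches."""
    return max(routing_index, key=lambda n: goodness(routing_index[n], topics))


# Hypothetical routing index held by one node, with one row per neighbor.
routing_index = {
    "B": {"total": 100, "DB": 20, "AI": 30},
    "C": {"total": 1000, "DB": 0, "AI": 50},
    "D": {"total": 200, "DB": 100, "AI": 150},
}
choice = best_neighbor(routing_index, ["DB", "AI"])
```

Note that neighbor C reaches the most documents overall but scores zero for a DB query, so raw collection size alone would be a poor routing signal.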
27
Types of Routing Indexes
Compound
Hop Count
Exponential Decay
Strategies for Cycles
  Ignore (for Hop-Count, exponential)
  Avoid Update Cycles
  Detect Update Cycles and Recover
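The hop-count and exponential-decay variants both weight nearby documents more heavily than distant ones. The sketch below illustrates the idea under an assumed cost model in which each extra hop multiplies the number of messages by a fixed fanout. The function names, the fanout value, and the counts are hypothetical, and the exact formulation used in the talk may differ.

```python
def hop_count_goodness(per_hop, fanout):
    """Score a neighbor from per-hop document counts.

    per_hop[j] is the number of matching documents j hops beyond the neighbor;
    documents further away are discounted by fanout**j.
    """
    return sum(count / fanout**j for j, count in enumerate(per_hop))


def exponential_entry(local_count, downstream_entries, fanout):
    """Collapse per-hop counts into one number that a node exports.

    Each downstream entry is already decayed, so one more division by the
    fanout is applied per hop as the value propagates.
    """
    return local_count + sum(downstream_entries) / fanout


# Hypothetical neighbors: x has results close by, y has more results further away.
score_x = hop_count_goodness([13, 10], fanout=3)
score_y = hop_count_goodness([0, 31], fanout=3)
exported = exponential_entry(5, [9, 6], fanout=3)
```

Here y reaches more documents in total, yet x scores higher because its results are cheaper to reach. The exponential form stores a single number per neighbor instead of one per hop, trading accuracy for a smaller index.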
28
Effect of Index Compression
29
Effect of Network Topology
30
Resource Management
Resource:
  storage (lockss)
  CPU processing
  bandwidth (PeerCast)
Issues:
  fairness
  load balancing
31
A1 B1 C1 A2 B2 C2 B1 A1 B2 A2 Example: Data Trading site 1 site 2
trade B2 A2 trade
32
A1 B1 C1 A2 B2 C2 B1 A1 C1 A2 C2 B2 Example: Data Trading site 1
trade C1 A2 trade C2 B2 trade
33
Data Trading
Order of trades impacts reliability
Issues:
  Swaps vs. Deeds
  Fixed price vs. bids
  Preference to sites with a lot of space? reliable sites? “desperate” sites?
34
Effect of Bid Policies
bid more (ask more in return) when I have less free space
bid more (ask more in return) when I have more free space
35
Effect of One Maverick Site
always bids high
36
Security & Privacy
Issues:
  Anonymity
  Reputation
  Accountability
  Information Preservation
  Information Quality
  Trust
  Denial of service attacks
37
Information Preservation
Example Policy: make 3 copies of documents
[Figure: A1, make copies]
What can go wrong?
38
What Can Go Wrong?
“Bad” sites make copies
“Bad” site alters copy
“Bad” site publishes fake
“Bad” site makes many copies of other docs
...
[Figure: make copies; copies A1, A1, and altered copy A’1]
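One way to notice an altered copy such as A’1 is to compare content digests across the replicas and flag the sites that disagree with the majority. This is a hedged sketch, not the mechanism described in the talk: the function `audit_copies`, the site names, and the document contents are hypothetical.

```python
import hashlib
from collections import Counter


def audit_copies(copies):
    """Compare replicas by SHA-256 digest; return (majority_digest, suspect_sites)."""
    digests = {site: hashlib.sha256(data).hexdigest() for site, data in copies.items()}
    majority, _ = Counter(digests.values()).most_common(1)[0]
    suspects = sorted(site for site, digest in digests.items() if digest != majority)
    return majority, suspects


# Three hypothetical replicas of document A1, one of which was altered.
copies = {
    "library A": b"document A1",
    "library B": b"document A1",
    "library C": b"document A1 (altered)",
}
majority, suspects = audit_copies(copies)
```

A majority vote only helps while honest copies outnumber bad ones, which is why a site that floods the system with copies is listed as a separate threat.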
39
Conclusion
P2P systems popular today
P2P systems vulnerable and inefficient
Many challenges ahead
  Search
  Resource Management
  Security and Privacy
40
For Additional Information
Google: “Stanford Peers”