1 Peer-To-Peer Data Management Hector Garcia-Molina ICDE Conference, February 28, 2002
2 What is P2P? napster, gnutella, morpheus, kazaa, bearshare, seti@home, folding@home, ebay, limewire, icq, fiorano, mojo nation, jxta, united devices, open cola, uddi, process tree, can, chord, ocean store, farsite, pastry, tapestry, ?, groove, netmeeting, freenet, popular power, aim, jabber
3 Napster (diagram labels: central index; join, query, answer, get file...)
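The join / query / answer / get file sequence can be sketched in a few lines of Python. This is a minimal illustration of a central index, not Napster's actual protocol: the class name `CentralIndex`, its methods, and the peer and file names are all hypothetical.

```python
from collections import defaultdict


class CentralIndex:
    """Toy central index: maps a file name to the set of peers that share it."""

    def __init__(self):
        self._peers_by_file = defaultdict(set)

    def join(self, peer, files):
        # A joining peer uploads the list of files it is willing to share.
        for name in files:
            self._peers_by_file[name].add(peer)

    def leave(self, peer):
        # Forget everything a departing peer advertised.
        for peers in self._peers_by_file.values():
            peers.discard(peer)

    def query(self, name):
        # The answer is a list of peers; the index never stores file contents.
        return sorted(self._peers_by_file.get(name, ()))


# Hypothetical peers and files, for illustration only.
index = CentralIndex()
index.join("peer-a", ["song.mp3", "talk.pdf"])
index.join("peer-b", ["song.mp3"])
answer = index.query("song.mp3")
```

The "get file" step is deliberately absent from the class: once the requester has the answer, it contacts one of the listed peers directly, so only the lookup is centralized.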
4 Gnutella (diagram label: query)
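Query flooding can be sketched as a breadth-first traversal with a hop limit. The code below is a simplified model under stated assumptions: the overlay is a plain dictionary, the function name `flood_query` and the sample graph are hypothetical, and real Gnutella details (message IDs, reverse-path answers) are left out.

```python
from collections import deque


def flood_query(graph, content, start, keyword, ttl):
    """Breadth-first flood: forward the query to every neighbor until the TTL runs out.

    graph:   dict mapping a node to the list of its neighbors (the overlay)
    content: dict mapping a node to the set of keywords it can answer
    Returns (nodes with a match, number of query messages sent).
    """
    seen = {start}
    hits = []
    messages = 0
    frontier = deque([(start, ttl)])
    while frontier:
        node, hops_left = frontier.popleft()
        if keyword in content.get(node, set()):
            hits.append(node)
        if hops_left == 0:
            continue
        for neighbor in graph.get(node, []):
            messages += 1              # every forwarded copy costs a message
            if neighbor not in seen:   # duplicates are received but dropped
                seen.add(neighbor)
                frontier.append((neighbor, hops_left - 1))
    return hits, messages


# Hypothetical five-node overlay.
graph = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["b", "c", "e"],
    "e": ["d"],
}
content = {"d": {"jazz"}, "e": {"jazz"}}
near = flood_query(graph, content, "a", "jazz", ttl=2)
far = flood_query(graph, content, "a", "jazz", ttl=3)
```

Raising the TTL from 2 to 3 reaches one more matching node in this toy graph but also sends more messages, which is the cost that the later slides on directed search and routing indexes try to reduce.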
5 Morpheus... (diagram label: super peer)
6 Seti@Home (diagram labels: satellite dish...; raw data chunk; analyzed data; central site)
7 Lockss (diagram labels: library A, library B, library C, library E, library D; D1, D2, D3)
8 PeerCast (two diagrams labeled "before:" and "after:", each showing Stanford and source)
9 What is a P2P System?
• Multiple sites (at edge)
• Distributed resources
• Sites are autonomous (different owners)
• Sites are both clients and servers
• Sites have equal functionality
(diagram label: P2P Purity)
10 P2P is a BAD IDEA!!
• Distribution is expensive!
• Specialized functionality is good!
11 Example: Distributed Data Management
• Distribution is expensive
• If you must distribute:
  - build centralized directory, index
      use backups for reliability
  - for replicated data, use primary copy
      use backups for reliability
12 Computational Efficiency is NOT Main Goal
• Main driving force in a P2P system:
  - exploiting existing (often free) resources
  - sharing costs among many
  - legal protection
  - autonomy
  - anonymity
13 Should We Do P2P Research?
• Should we help people break the law?
• Analogy: Should we develop pillows, knives, hammers, drugs, bath tubs, cars, airplanes,... ??
14 Should We Do P2P Research?
• YES: P2P not exclusively for breaking law
  - Remember the VCR
• YES: P2P can liberate us from culture “plantation owners” (Lessig)
15 Is “Free Culture” Feasible?
• Example: Legal texts
• Can we afford it?
(diagram labels: economic activity; rules of the game; today)
16 Should DB community work on P2P?
• YES
17 P2P Challenges
• Easier to list NON-Research-Topics:
  - Color schemes for P2P Nodes
  - Impact of P2P on Moroccan 15th Century Literature
18 P2P Challenges
• Search
• Resource Management
• Security & Privacy
19 Search Taxonomy [figure: axes "search" (lookup, content queries) and "scope of index" (single site, regional, global); systems placed: freenet, gnutella, napster, morpheus, can, routing, replicated SP, partial]
20 Index Implementation Taxonomy [figure: axes "nature of index" (centralized, distributed, P2P) and "index location correlated with content location" (yes, no); systems placed: freenet, gnutella, napster, morpheus, can, routing, replicated SP, partial]
21 Content Addressable Network (CAN) (diagram labels: 1, 2; Nodes; Data)
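The idea of placing nodes and data in the same coordinate space can be sketched as follows. This is an illustration under assumptions, not the CAN specification: a two-dimensional unit square, SHA-256 as the hash, and a fixed four-quadrant layout are all choices made for the example, and the helper names `key_to_point` and `owner_of` are hypothetical.

```python
import hashlib


def key_to_point(key):
    """Hash a key to a point in the unit square [0, 1) x [0, 1)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Four bytes per coordinate keep the quotient strictly below 1.0.
    x = int.from_bytes(digest[:4], "big") / 2**32
    y = int.from_bytes(digest[4:8], "big") / 2**32
    return x, y


def owner_of(point, zones):
    """Return the node whose rectangular zone contains the point.

    zones: dict mapping a node to (x_min, y_min, x_max, y_max), half-open.
    """
    x, y = point
    for node, (x0, y0, x1, y1) in zones.items():
        if x0 <= x < x1 and y0 <= y < y1:
            return node
    raise ValueError("zones do not cover the point")


# Hypothetical layout: four nodes, each owning one quadrant of the space.
zones = {
    "n1": (0.0, 0.0, 0.5, 0.5),
    "n2": (0.5, 0.0, 1.0, 0.5),
    "n3": (0.0, 0.5, 0.5, 1.0),
    "n4": (0.5, 0.5, 1.0, 1.0),
}
home = owner_of(key_to_point("song.mp3"), zones)
```

Because the key alone determines the point, any node can compute where an item belongs without consulting an index; the sketch omits how a query is forwarded hop by hop toward that zone.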
22 Can We Improve Flooding? [figure: axes "nature of index" (centralized, distributed, P2P) and "index location correlated with content location" (yes, no); systems placed: freenet, gnutella, napster, morpheus, can, routing, replicated SP, partial]
23 Directed BFS in Gnutella
• Heuristics for Selecting Direction
  - >RES: Returned most results
  - <TIME: Shortest satisfaction time
  - <HOPS: Min hops for results
  - >MSG: Sent us most messages (all types)
  - <QLEN: Shortest queue
  - <LAT: Shortest latency
  - >DEG: Highest degree
(diagram labels: query, ?...)
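One way to read the heuristics is as a ranking over per-neighbor statistics, with ">" meaning larger is better and "<" meaning smaller is better. The sketch below assumes a node already keeps such statistics; the `NeighborStats` fields, the `pick_neighbors` helper, and the sample numbers are hypothetical, and how the statistics are gathered is not shown.

```python
from dataclasses import dataclass


@dataclass
class NeighborStats:
    name: str
    results: int       # results returned for past queries   (>RES)
    avg_time: float    # average time to satisfy a query      (<TIME)
    avg_hops: float    # average hops travelled by results    (<HOPS)
    messages: int      # messages of all types received       (>MSG)
    queue_len: int     # current message queue length         (<QLEN)
    latency: float     # measured link latency                (<LAT)
    degree: int        # number of neighbors it reports       (>DEG)


# Heuristic name -> (attribute, True if larger values are better).
HEURISTICS = {
    ">RES": ("results", True),
    "<TIME": ("avg_time", False),
    "<HOPS": ("avg_hops", False),
    ">MSG": ("messages", True),
    "<QLEN": ("queue_len", False),
    "<LAT": ("latency", False),
    ">DEG": ("degree", True),
}


def pick_neighbors(stats, heuristic, k=1):
    """Return the names of the k neighbors ranked best by the chosen heuristic."""
    attribute, larger_is_better = HEURISTICS[heuristic]
    ranked = sorted(stats, key=lambda s: getattr(s, attribute), reverse=larger_is_better)
    return [s.name for s in ranked[:k]]


# Hypothetical statistics for three neighbors.
neighbors = [
    NeighborStats("n1", 40, 2.0, 3.0, 900, 5, 0.08, 4),
    NeighborStats("n2", 12, 0.9, 2.0, 300, 1, 0.20, 9),
    NeighborStats("n3", 25, 1.5, 4.0, 1200, 8, 0.03, 6),
]
best_by_results = pick_neighbors(neighbors, ">RES")
best_two_by_degree = pick_neighbors(neighbors, ">DEG", k=2)
```

The query is then sent only to the selected neighbors instead of to all of them; different heuristics can pick different neighbors from the same statistics, as the sample data shows.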
24 How Does One Evaluate?
• Live Gnutella?
• Use real Gnutella as “laboratory”
25 Time to Satisfaction for Directed BFS
26 Routing Index [figure: nodes A, B, C, D; each node keeps a table with columns AI and DB and one row per neighbor giving document counts; query Q(DB)]
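A routing index of this kind can be sketched as a per-neighbor table of document counts by topic, consulted to decide where to forward a query first. The table contents below are illustrative numbers rather than a reliable transcription of the figure, and the helper name `rank_neighbors` is hypothetical.

```python
def rank_neighbors(routing_index, topic):
    """Order neighbors by how many documents on `topic` are reachable through them."""
    ranked = sorted(
        routing_index.items(),
        key=lambda item: item[1].get(topic, 0),
        reverse=True,
    )
    # Neighbors with nothing on the topic are not worth a message.
    return [neighbor for neighbor, counts in ranked if counts.get(topic, 0) > 0]


# Illustrative routing index kept at one node: neighbor -> {topic: reachable documents}.
routing_index_at_a = {
    "B": {"AI": 20, "DB": 65},
    "C": {"AI": 50, "DB": 25},
    "D": {"AI": 0, "DB": 15},
}
order_for_db = rank_neighbors(routing_index_at_a, "DB")
order_for_ai = rank_neighbors(routing_index_at_a, "AI")
```

For a query Q(DB) the node would try B before C and D, and for an AI query it would skip D entirely, which is how the index avoids flooding every neighbor.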
27 Types of Routing Indexes
• Compound
• Hop Count
• Exponential Decay
• Strategies for Cycles
  - Ignore (for Hop-Count, exponential)
  - Avoid Update Cycles
  - Detect Update Cycles and Recover
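The difference between a hop-count index and an exponentially decayed one can be illustrated with a small scoring function. The sketch assumes that a hop-count index stores, per neighbor, the number of matching documents at each hop distance, and that decay divides the count at hop i by a constant raised to the power i; the function name, the decay constant, and the sample counts are assumptions for this example, not definitions taken from the slides.

```python
def decayed_goodness(counts_by_hop, decay):
    """Score a neighbor from its per-hop document counts.

    counts_by_hop[i] is the number of matching documents i hops beyond the
    neighbor (index 0 is the neighbor itself); farther documents count less.
    """
    return sum(count / decay**hop for hop, count in enumerate(counts_by_hop))


# Hypothetical hop-count entries for two neighbors.
via_x = [13, 10]   # 13 documents at the neighbor, 10 one hop farther
via_y = [0, 31]    # nothing at the neighbor, 31 one hop farther

score_x = decayed_goodness(via_x, decay=3)
score_y = decayed_goodness(via_y, decay=3)
```

With a decay of 3, neighbor X scores higher even though Y leads to more documents in total, because X's documents are closer. Setting the decay to 1 removes the distance penalty and reduces the score to a plain total.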
28 Effect of Index Compression
29 Effect of Network Topology
30 Resource Management
• Resource:
  - storage (lockss)
  - CPU processing (seti@home)
  - bandwidth (PeerCast)
• Issues:
  - fairness
  - load balancing
31 Example: Data Trading (diagram labels: site 1, site 2, site 3; A1, B1, C1, A2, B2, C2; trade; copies B1, A1, B2, A2)
32 Example: Data Trading (diagram labels: site 1, site 2, site 3; A1, B1, C1, A2, B2, C2; trade; copies B1, A1, C1, A2, C2, B2)
33 Data Trading
• Order of trades impacts reliability
• Issues:
  - Swaps vs. Deeds
  - Fixed price vs. bids
  - Preference to sites with a lot of space? reliable sites? “desperate” sites?
34 Effect of Bid Policies (plot labels: bid more (ask more in return) when I have more free space; bid more (ask more in return) when I have less free space)
35 Effect of One Maverick Site (plot label: always bids high)
36 Security & Privacy
• Issues:
  - Anonymity
  - Reputation
  - Accountability
  - Information Preservation
  - Information Quality
  - Trust
  - Denial of service attacks
37 Information Preservation
• Example Policy: make 3 copies of documents
(diagram labels: A1; make copies)
What can go wrong?
38 What Can Go Wrong?
• “Bad” sites make copies
• “Bad” site alters copy
• “Bad” site publishes fake
• “Bad” site makes many copies of other docs
• ...
(diagram labels: A1; make copies; A’1; A1)
39 Conclusion
• P2P systems popular today
• P2P systems vulnerable and inefficient
• Many challenges ahead
  - Search
  - Resource Management
  - Security and Privacy
40 For Additional Information
• Google: “Stanford Peers”
• http://www-db.stanford.edu/peers/