1 Data Management in Peer-to-Peer Systems
Qi Sun, Beverly Yang
2 Introduction
What is P2P?
  Distributed nodes
  Equal roles and functionality
  Providing/exchanging resources
Why now?
  PCs are becoming valuable resources!
  Computing devices becoming pervasive
3 Many Applications
Grid computing
  e.g., SETI@home
Ubiquitous computing
  Cell phones, wireless devices, handhelds
  Cars, refrigerators, microwaves
Preservation/archival systems
File-sharing
4 File-sharing model
Data: (Title string, File blob)
Query: “Find songs by Madonna”
Result:
  63.274.18.3: Madonna – “Vogue”
  63.274.18.3: Madonna – “Beautiful Stranger”
  27.48.3.124: Madonna – “Like a Prayer”
  17.64.75.18: Madanna – “Vogue”
How is this “search” implemented?
5 Many Approaches
Napster
Gnutella
KaZaA
OverNet
BitTorrent
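6 Napster
“Hybrid” P2P system
[Figure: peers A through F and a central server holding the index; a peer sends its query (“?”) to the server, which answers with the matching peers C, E, F.]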
7 Napster
Benefits
  Efficient
  Comprehensive
  Can handle complex queries
Disadvantages
  Server is single point of failure
  Server is performance bottleneck
  Server costs money to maintain!!!
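The hybrid design can be illustrated with a minimal sketch. It assumes the server keeps an in-memory map from peer to titles and that a query is a case-insensitive substring match; the class `CentralIndex` and its method names are hypothetical and are not Napster's actual protocol.

```python
from collections import defaultdict


class CentralIndex:
    """Hypothetical stand-in for the central server index of a hybrid system."""

    def __init__(self):
        # peer address -> set of titles that peer currently shares
        self._titles_by_peer = defaultdict(set)

    def register(self, peer, titles):
        """Called when a peer joins and reports its shared files."""
        self._titles_by_peer[peer].update(titles)

    def unregister(self, peer):
        """Called when a peer leaves; its entries disappear from the index."""
        self._titles_by_peer.pop(peer, None)

    def search(self, keyword):
        """Return every (peer, title) whose title contains the keyword."""
        needle = keyword.lower()
        return sorted(
            (peer, title)
            for peer, titles in self._titles_by_peer.items()
            for title in titles
            if needle in title.lower()
        )


index = CentralIndex()
index.register("A", ["Other Artist - Another Song"])
index.register("C", ["Madonna - Vogue"])
index.register("E", ["Madonna - Like a Prayer"])
index.register("F", ["Madonna - Beautiful Stranger", "Other Artist - Song"])
matches = index.search("madonna")  # the file itself is then fetched peer to peer
```

One lookup sees every registered file, which is why the approach is efficient and comprehensive. It also shows the weakness: everything depends on the single `index` object.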
8 Gnutella
“Pure” P2P system
[Figure: peers linked by TCP connections into an “overlay network”.]
9 Gnutella
[Figure: a query spreading through the overlay. Legend: source, forward query, processed query, found result, forward response.]
10 Gnutella
Benefits
  No server needed (cost)
  Robust (nodes can come and go)
  Can handle complex queries per node
Disadvantages
  Not comprehensive (can miss results)
  Inefficient! (many messages)
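Both disadvantages show up in a small simulation of query flooding. The sketch below assumes a hop limit (`ttl`) on each query and a substring match at every node; the overlay, the file lists and the function name `flood_query` are made up for illustration.

```python
from collections import deque


def flood_query(overlay, files, source, keyword, ttl):
    """Flood a query breadth-first; return (hits, number of messages sent)."""
    needle = keyword.lower()
    seen = {source}
    frontier = deque([(source, ttl)])
    hits = []
    messages = 0
    while frontier:
        node, remaining = frontier.popleft()
        # Each node evaluates the query against its own local files.
        hits.extend((node, t) for t in files.get(node, []) if needle in t.lower())
        if remaining == 0:
            continue  # hop limit reached: do not forward any further
        for neighbor in overlay.get(node, []):
            messages += 1  # a message is sent even if the neighbor drops it
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, remaining - 1))
    return hits, messages


# Hypothetical overlay: A-B, A-C, B-D, C-D, D-E
overlay = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}
files = {"B": ["Madonna - Like a Prayer"], "E": ["Madonna - Vogue"]}

near_hits, near_msgs = flood_query(overlay, files, "A", "madonna", ttl=2)
far_hits, far_msgs = flood_query(overlay, files, "A", "madonna", ttl=3)
```

With `ttl=2` the copy on E is never reached (not comprehensive). Raising the limit finds it, but at the price of more messages, many of them duplicates.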
11 KaZaA
“Super-peer” P2P system
[Figure: super-peers, each holding an index.]
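12 KaZaA
“Super-peer” P2P system
[Figure: a query (“?”) goes to a super-peer, which answers from its index (like Napster); super-peers pass queries among themselves (like Gnutella).]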
13 KaZaA
Change the ratio of clients to super-peers
  Napster: everyone (minus one) is a client
  Gnutella: no one is a client
Combines strengths of hybrid and pure systems
Leverages heterogeneity of peers
  e.g., bandwidth, memory, processing power
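14 OverNet
Uses all peers to build a distributed index
[Figure: the hash space is split into ranges (0 to 10^6, 10^6 to 2x10^6, 3x10^6 to 4x10^6, 7x10^6 to 8x10^6, ...), each owned by a peer (W, X, Y, Z, ...). The key “ABC” is hashed, Hash(ABC) = 3561246, and indexed at the peer whose range contains that value.]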
15 OverNet: Searching
Given key k, which peer has the index?
Distributed Hash Table (DHT)
[Figure: peers placed on an identifier ring (labels 0, 1, 2, 4, 8, 16, 24, 25, 31); peer 0 is looking for k=25.]
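The question "which peer has the index?" can be sketched with a successor rule on an identifier ring: a key belongs to the first peer whose ID is at or after the key's ID, wrapping around at the end. This is one common DHT convention, chosen here for illustration and not necessarily the routing OverNet itself uses. The 5-bit ring, the peer IDs and the helper names are assumptions, and a real DHT reaches the responsible peer in several hops instead of consulting a global list.

```python
import bisect
import hashlib

RING_BITS = 5  # assumed ring size of 2**5 = 32 identifiers


def key_id(key, bits=RING_BITS):
    """Map an arbitrary string key to an identifier on the ring."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (1 << bits)


def responsible_peer(peer_ids, k):
    """Return the first peer ID >= k, wrapping to the smallest ID if none."""
    ids = sorted(peer_ids)
    position = bisect.bisect_left(ids, k)
    return ids[position % len(ids)]


peers = [0, 1, 2, 4, 8, 16, 24, 31]  # hypothetical peer IDs
owner_of_25 = responsible_peer(peers, 25)
owner_of_abc = responsible_peer(peers, key_id("ABC"))
```

Every peer applies the same rule, so any of them can work out where an index entry lives without asking a central server.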
16 BitTorrent
Downloading of a single file
[Figure: the file is split into blocks Blk1, Blk2, Blk3, ..., Blk n; a tracker points the downloader at peers (e.g., peers 2, 3, 6).]
17 BitTorrent: Downloading
Tit-for-Tat strategy
Choking mechanism
Periodic un-choke
Rare blocks first
[Figure: two block-ownership examples. First: A: 1,2,3,4; B: 3,5; C: 2,3,4. Second: A: 1,2,3,4; B: 3; C: 4.]
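"Rare blocks first" can be sketched as a sort by how many known peers hold each block. The function name `rarest_first` is hypothetical, and ties are broken by block number, which is an assumption; the peer-to-block map is the first example from the slide.

```python
from collections import Counter


def rarest_first(my_blocks, peer_blocks):
    """Order the blocks we still need, fewest holders first."""
    holders = Counter(
        block for blocks in peer_blocks.values() for block in set(blocks)
    )
    wanted = [block for block in holders if block not in my_blocks]
    # Sort by number of holders; break ties by block number (an assumption).
    return sorted(wanted, key=lambda block: (holders[block], block))


peer_blocks = {"A": [1, 2, 3, 4], "B": [3, 5], "C": [2, 3, 4]}
order_for_newcomer = rarest_first(set(), peer_blocks)
order_with_block_1 = rarest_first({1}, peer_blocks)
```

Blocks 1 and 5 each have a single holder, so they are requested before block 3, which every peer already has.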
18 Challenges
Performance, Performance, Performance!
  Find rare/popular files quickly
  Minimize maintenance cost
  Spread workload evenly
  Etc.
Zillions of heuristics/variants
19 Challenges (2)
Participation: Peers are selfish!
  Do not want to “donate” bandwidth
  Do not want to share their files
  Do not care about others
Need some incentive mechanism!!
20 Challenges (3)
Authenticity of data
  How do you know you have the right file?
  Bogus copies
  Corrupt copies
Need detection/correction mechanisms
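One simple detection mechanism is to compare a cryptographic digest of the downloaded bytes with a digest obtained from a trusted source. The sketch assumes such a trusted digest exists, which is the hard part in an open P2P network; the helper names are hypothetical.

```python
import hashlib
import hmac


def sha256_hex(data):
    """Hex-encoded SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def is_authentic(data, trusted_digest_hex):
    """True if the downloaded bytes match the trusted digest."""
    return hmac.compare_digest(sha256_hex(data), trusted_digest_hex)


original = b"bytes of the file as the publisher released it"
trusted_digest = sha256_hex(original)  # assumed to arrive over a trusted channel

good_copy = is_authentic(original, trusted_digest)
corrupt_copy = is_authentic(original[:-1] + b"?", trusted_digest)
```

A digest detects bogus and corrupt copies alike, but it cannot repair them; correction requires fetching the data again from another peer.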
21 Techniques
Performance
  Routing Indices
  Network Awareness
Participation
  SLIC
  Micropayments
Correctness
  DoS Prevention
  Reputation Systems
22 Routing Indices ?
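23 Routing Indices (2)
[Figure: a network of nodes numbered 1 through 13 holding documents on the topics DB, OS, AI and EE. Nodes keep routing-index entries that pair a topic with node numbers (e.g., DB: 2,4; OS: 2; AI: 2,3,4; EE: 3), and a query “DB?” is routed using these entries.]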
24 Routing Indices (3)
Benefits
  Potentially reduce # messages
Drawbacks
  Update cost (any time you have state)
  Size of index
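A routing index can be sketched as a per-neighbor table of how many documents on each topic are reachable through that neighbor. The node then forwards a query only to neighbors with a non-zero count, best first. The counts, neighbor names and the function `rank_neighbors` are illustrative assumptions.

```python
def rank_neighbors(routing_index, topic):
    """Neighbors that can reach documents on the topic, most documents first."""
    scored = [
        (counts.get(topic, 0), neighbor)
        for neighbor, counts in routing_index.items()
    ]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [neighbor for count, neighbor in scored if count > 0]


# Hypothetical index held by one node: neighbor -> {topic: reachable documents}
routing_index = {
    "B": {"DB": 20, "OS": 5},
    "C": {"DB": 3, "AI": 40},
    "D": {"EE": 10},
}

db_route = rank_neighbors(routing_index, "DB")
ee_route = rank_neighbors(routing_index, "EE")
```

Neighbor D never sees the "DB" query, which is where the message savings come from. The cost is that `routing_index` is state that must be updated whenever documents or links change.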
25 Reputation Systems
[Figure: Alice asks “Who has file X?”; Bob answers “I do!” but sends file Y.]
26 Reputation Systems
Have an “opinion list”
Based on personal experience?
Problem: sparse
[Figure: opinion list covering Node 0 through Node 8, with most entries unknown (“?”).]
27 Reputation Systems
Have a “trust list”
Based on personal experience?
Problem: sparse
Ask friends
Efficient
Automatic
[Figure: trust list containing Node 4, Node 1, Node 2, Node 6.]
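The "ask friends" idea can be sketched as a fallback. A peer uses its own opinion when it has one, and otherwise averages the opinions of the nodes on its trust list. Opinions in the range 0 to 1, the plain average and the neutral default of 0.5 are all assumptions made for this example.

```python
def trust_score(own_opinions, friend_opinions, target, default=0.5):
    """Own experience first; otherwise the average of friends' opinions."""
    if target in own_opinions:
        return own_opinions[target]
    votes = [opinions[target] for opinions in friend_opinions if target in opinions]
    if not votes:
        return default  # nobody knows the target: stay neutral
    return sum(votes) / len(votes)


own_opinions = {"node1": 0.9}
friend_opinions = [
    {"node4": 0.2},
    {"node4": 0.6, "node2": 1.0},
]

known = trust_score(own_opinions, friend_opinions, "node1")
via_friends = trust_score(own_opinions, friend_opinions, "node4")
unknown = trust_score(own_opinions, friend_opinions, "node7")
```

Asking friends fills in many of the unknown entries of a sparse opinion list, but the result is only as trustworthy as the friends themselves.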
28 Micropayments
Only if you have money will people do things for you!
Like a vending machine
  Goods are cheap
  Security can’t be too expensive
29 Micropayments
Server is needed…
  Handle accounts
  Distribute and cash coins
  Security
Scalability and performance bottleneck
30 Micropayments
Peers can do work too!
Challenge: SECURITY
31 SLIC: Link-based Incentive
Use quality of service as incentive
[Figure: neighbors A and B join Fragment A and Fragment B of the network.]
They need each other to reach more nodes.
Can retaliate
32 SLIC (2)
[Figure: node A with neighbors B, C, D and link weights W(A,B), W(A,C), W(A,D).]
Adjust weights, and use them to reward good neighbors and to penalize bad ones
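The slide gives no formulas, so the following is only one possible reading of "adjust weights and use them to reward or penalize". It assumes the weight is an exponential moving average of the service quality a neighbor delivered, and that a node splits its own capacity in proportion to the weights. Both rules, the parameter `learning_rate` and the function names are assumptions, not the published SLIC algorithm.

```python
def update_weight(weight, service_quality, learning_rate=0.2):
    """Move the weight toward the quality (0..1) just observed from a neighbor."""
    return (1 - learning_rate) * weight + learning_rate * service_quality


def allocate_capacity(weights, capacity):
    """Split our own query-processing capacity in proportion to the weights."""
    total = sum(weights.values())
    if total == 0:
        return {neighbor: 0.0 for neighbor in weights}
    return {neighbor: capacity * w / total for neighbor, w in weights.items()}


# Node A's hypothetical weights W(A,B), W(A,C), W(A,D)
weights = {"B": 3.0, "C": 1.0, "D": 0.0}
shares = allocate_capacity(weights, capacity=8)

improved = update_weight(0.5, service_quality=1.0)   # neighbor served us well
degraded = update_weight(0.5, service_quality=0.0)   # neighbor ignored us
```

A neighbor that stops serving A sees its weight decay and with it the share of A's capacity it receives, which is the retaliation that gives each side a reason to cooperate.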
33 Network Awareness
Overlay network can be poor!
[Figure: overlay links among peers in San Francisco, Palo Alto, and Timbuktu (Mali, Africa).]
34 Network Awareness (2)
Form only “good” links
Probe a few and pick the best
[Figure: peers in San Francisco, Palo Alto, and Timbuktu (Mali, Africa).]
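"Probe a few and pick the best" can be sketched as sampling some candidate peers, measuring each one, and keeping the closest. The latency figures are invented, and `probe` is a stand-in for a real measurement such as a ping; the function name and parameters are hypothetical.

```python
import random


def pick_neighbors(candidates, probe, sample_size=3, keep=1, rng=random):
    """Probe a random sample of candidates; keep those with the lowest cost."""
    pool = list(candidates)
    sample = rng.sample(pool, min(sample_size, len(pool)))
    return sorted(sample, key=probe)[:keep]


# Hypothetical round-trip times in milliseconds, seen from a San Francisco peer
latency_ms = {"Palo Alto": 5, "Timbuktu": 300, "San Francisco": 8}

best = pick_neighbors(latency_ms, probe=latency_ms.get, sample_size=3, keep=1)
best_two = pick_neighbors(latency_ms, probe=latency_ms.get, sample_size=3, keep=2)
```

Probing only a sample keeps the measurement cost bounded, at the risk of missing the best candidate when the sample is smaller than the candidate set.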
35 Network Awareness (3)
“Swap” peers around
[Figure: peers in San Francisco, Palo Alto, and Timbuktu (Mali, Africa).]
36 Denial of Service
Malicious peers can flood queries on unstructured networks
  Rate limit
  Incentive
  Micro-payment
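A rate limit can be sketched as a token bucket kept per neighbor, where each forwarded query spends one token and tokens refill at a fixed rate. The slide does not say which limiting scheme is meant, so the token bucket, the class name and its parameters are assumptions. Time is passed in explicitly to keep the example deterministic.

```python
class TokenBucket:
    """Per-neighbor limiter: at most `burst` queries at once, `rate` per second."""

    def __init__(self, rate, burst):
        self.rate = rate
        self.burst = burst
        self.tokens = float(burst)
        self.last = 0.0

    def allow(self, now):
        """Return True if a query arriving at time `now` may be forwarded."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = max(self.last, now)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False  # over the limit: drop the query instead of flooding it on


bucket = TokenBucket(rate=1.0, burst=2)
decisions = [bucket.allow(0.0), bucket.allow(0.0), bucket.allow(0.0), bucket.allow(1.0)]
```

A flooding neighbor is cut off after its burst allowance, while a well-behaved one that sends about one query per second is never throttled.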
37 Denial of Service
Malicious peers can drop queries and indices in structured networks
  Tracing/Audit
  Reorganization
  Alternate path
38 Concluding Remarks
P2P provides a cheap infrastructure for leveraging the capacities of the masses.
P2P’s “openness” is both its strength and its weakness.