1
Peer to Peer Technologies Roy Werber, Idan Gelbourt Prof. Sagiv’s Seminar, The Hebrew University of Jerusalem, 2001
2
Lecture Overview 1st Part: The P2P communication model, architecture and applications 2nd Part: Chord and CFS
3
Peer to Peer - Overview A class of applications that takes advantage of resources: Storage, CPU cycles, content, human presence Available at the edges of the Internet A decentralized system that must cope with the unstable nature of computers located at the network edge
4
Client/Server Architecture An architecture in which each process is a client or a server Servers are powerful computers dedicated to providing services – storage, traffic, etc. Clients rely on servers for resources
5
Client/Server Properties Big, strong server Well known port/address of the server Many to one relationship Different software runs on the client/server Client can be dumb (lacks functionality), server performs for the client Client usually initiates connection
6
Client Server Architecture Server Client Internet
7
Client/Server Architecture GET /index.html HTTP/1.0 HTTP/1.1 200 OK... Client Server
8
Disadvantages of C/S Architecture Single point of failure Strong expensive server Dedicated maintenance (a sysadmin) Not scalable - more users, more servers
9
Solutions Replication of data (several servers) Problems: redundancy, synchronization, expensive Brute force (a bigger, faster server) Problems: Not scalable, expensive, single point of failure
10
The Client Side Although the model hasn’t changed over the years, the entities in it have Today’s clients can perform more roles than just forwarding users’ requests Today’s clients have: More computing power Storage space
11
Thin Client Performs simple tasks: I/O Properties: Cheap Limited processing power Limited storage
12
Fat Client Can perform complex tasks: Graphics Data manipulation Etc… Properties: Strong computation power Bigger storage More expensive than thin
13
Evolution at the Client Side ’70s: DEC’s VT100 – no storage. ’80s: IBM PC @ 4.77MHz – 360KB diskettes. 2001: a PC @ 2GHz – 40GB HD.
14
What Else Has Changed? The number of home PCs is increasing rapidly PCs with dynamic IPs Most of the PCs are “fat clients” Software cannot keep up with hardware development As Internet usage grows, more and more PCs are connecting to the global net Most of the time PCs are idle How can we use all this?
15
Sharing Definition: 1.To divide and distribute in shares 2.To partake of, use, experience, occupy, or enjoy with others 3.To grant or give a share in intransitive senses Merriam Webster’s online dictionary (www.m-w.com) There is a direct advantage of a co-operative network versus a single computer
16
Resources Sharing What can we share? Computer resources Shareable computer resources: “CPU cycles” - seti@home Storage - CFS Information - Napster / Gnutella Bandwidth sharing - Crowds
17
SETI@Home SETI – Search for ExtraTerrestrial Intelligence @Home – On your own computer A radio telescope in Puerto Rico scans the sky for radio signals Fills a DAT tape of 35GB in 15 hours That data has to be analyzed
18
SETI@Home (cont.) The problem – analyzing the data requires a huge amount of computation Even a supercomputer cannot finish the task on its own Accessing a supercomputer is expensive What can be done?
19
SETI@Home (cont.) Can we use distributed computing? YEAH Fortunately, the problem can be solved in parallel - examples: Analyzing different parts of the sky Analyzing different frequencies Analyzing different time slices
20
SETI@Home (cont.) The data can be divided into small segments A PC is capable of analyzing a segment in a reasonable amount of time An enthusiastic UFO searcher will lend his spare CPU cycles for the computation When? Screensavers
21
SETI@Home - Example
22
SETI@Home - Summary SETI reverses the C/S model Clients can also provide services Servers can be weaker, used mainly for storage Distributed peers serving the center Not yet P2P but we’re close Outcome - great results: Thousands of unused CPU hours tamed for the mission 3+ million users
23
What Exactly is P2P? A distributed communication model with the properties: All nodes have identical responsibilities All communication is symmetric
24
P2P Properties Cooperative, direct sharing of resources No central servers Symmetric clients Client Internet
25
P2P Advantages Harnesses client resources Scales with new clients Provides robustness under failures Redundancy and fault-tolerance Immune to DoS Load balance
26
P2P Disadvantages -- A Tough Design Problem How do you handle a dynamic network (nodes join and leave frequently)? A number of constraints and uncontrolled variables: No central servers Clients are unreliable Clients vary widely in the resources they provide Heterogeneous network (different platforms)
27
Two Main Architectures Hybrid Peer-to-Peer Preserves some of the traditional C/S architecture. A central server links between clients, stores indices tables, etc Pure Peer-to-Peer All nodes are equal and no functionality is centralized
28
Hybrid P2P A main server is responsible for various administrative operations: Users’ login and logout Storing metadata Directing queries Example: Napster
29
Examples - Napster Napster is a program for sharing information (mp3 music files) over the Internet Created by Shawn Fanning in 1999 although similar services were already present (but lacked popularity and functionality)
30
Napster Sharing Style: hybrid (center + edge) 1. Users launch Napster and connect to the Napster server. 2. Napster creates a dynamic directory from the users’ personal .mp3 libraries (Title – User – Speed): song1.mp3 – beastieboy – DSL; song2.mp3 – beastieboy – DSL; song3.mp3 – beastieboy – DSL; song4.mp3 – kingrook – T1; song5.mp3 – kingrook – T1; song5.mp3 – slashdot – 28.8; song6.mp3 – kingrook – T1; song6.mp3 – slashdot – 28.8; song7.mp3 – slashdot – 28.8. 3. beastieboy enters search criteria (song5). 4. Napster displays the matches to beastieboy. 5. beastieboy makes a direct connection to kingrook for the file transfer.
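A minimal sketch of the hybrid model on this slide (hypothetical class and method names, not Napster’s actual protocol): a central server keeps a directory of who shares which title, answers searches, and leaves the transfer itself to a direct connection between peers.

```python
# Hypothetical sketch of a Napster-style central directory (not the real protocol).
from collections import defaultdict

class CentralDirectory:
    def __init__(self):
        # title -> list of (user, speed) entries shared by connected peers
        self.index = defaultdict(list)

    def login(self, user, speed, shared_titles):
        # Steps 1-2: a client connects and its library is merged into the directory.
        for title in shared_titles:
            self.index[title].append((user, speed))

    def search(self, title):
        # Steps 3-4: the server answers queries from its in-memory index.
        return self.index.get(title, [])

directory = CentralDirectory()
directory.login("beastieboy", "DSL", ["song1.mp3", "song2.mp3", "song3.mp3"])
directory.login("kingrook", "T1", ["song4.mp3", "song5.mp3", "song6.mp3"])
directory.login("slashdot", "28.8", ["song5.mp3", "song6.mp3", "song7.mp3"])

# Step 5: the requester picks a peer from the results and transfers directly from it.
print(directory.search("song5.mp3"))  # [('kingrook', 'T1'), ('slashdot', '28.8')]
```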
31
What About Communication Between Servers? Each Napster server creates its own mp3 exchange community: rock.napster.com, dance.napster.com, etc. This creates a separation, which is bad. We would like multiple servers to share a common ground: it reduces the centralized nature of each server and expands searchability.
32
Various HP2P Models – 1. Chained Architecture Chained architecture – a linear chain of servers Clients login to a random server Queries are submitted to the server If the server satisfies the query – Done Otherwise – Forward the query to the next server Results are forwarded back to the first server The server merges the results The server returns the results to the client Used by OpenNap network
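A minimal sketch of the chained architecture just described (hypothetical names, not OpenNap’s wire protocol): the query walks down the chain until some server can satisfy it, and the results flow back through the server the client logged in to.

```python
# Hypothetical sketch of a chained HP2P query (not OpenNap's actual protocol).
class ChainedServer:
    def __init__(self, name, local_index, next_server=None):
        self.name = name
        self.local_index = local_index      # title -> list of users on this server
        self.next_server = next_server      # next server in the linear chain

    def query(self, title):
        # Answer locally if possible, otherwise forward along the chain.
        hits = list(self.local_index.get(title, []))
        if hits or self.next_server is None:
            return hits
        return hits + self.next_server.query(title)

# Build a chain of three servers; the client talks only to the first one.
s3 = ChainedServer("s3", {"song7.mp3": ["slashdot"]})
s2 = ChainedServer("s2", {"song5.mp3": ["kingrook"]}, next_server=s3)
s1 = ChainedServer("s1", {}, next_server=s2)

print(s1.query("song7.mp3"))  # ['slashdot'] -- found two hops down the chain
```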
33
2. Full Replication Architecture Replication of constantly updated metadata A client logs on to a random server The server sends the updated metadata to all servers Result: All servers can answer queries immediately
34
3. Hash Architecture Each server holds a portion of the metadata Each server holds the complete inverted list for a subset of all words Client directs a query to a server that is responsible for at least one of the keywords That server gets the inverted lists for all the keywords from the other servers The server returns the relevant results to the client
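A minimal sketch of the hash architecture (hypothetical helper names): each keyword is assigned to one server by hashing it, so that server owns the complete inverted list for the word; a multi-keyword query goes to a server owning one of the words, which gathers the remaining lists and intersects them.

```python
# Hypothetical sketch of keyword partitioning across HP2P servers.
import hashlib

NUM_SERVERS = 4

def server_for(word):
    # Hash the keyword to decide which server owns its inverted list.
    return int(hashlib.sha1(word.encode()).hexdigest(), 16) % NUM_SERVERS

# inverted_lists[s] maps word -> set of document ids, for words owned by server s
inverted_lists = [dict() for _ in range(NUM_SERVERS)]

def index_document(doc_id, words):
    for word in words:
        inverted_lists[server_for(word)].setdefault(word, set()).add(doc_id)

def query(keywords):
    # The query is sent to a server owning one keyword; that server fetches the
    # remaining inverted lists from their owners and intersects them.
    lists = [inverted_lists[server_for(w)].get(w, set()) for w in keywords]
    return set.intersection(*lists) if lists else set()

index_document("song5.mp3", ["beastie", "boys", "remix"])
index_document("song6.mp3", ["beastie", "boys"])
print(query(["beastie", "remix"]))  # {'song5.mp3'}
```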
35
4. Unchained Architecture Independent servers which do not communicate with each other A client who logs on to one server can only see the files of other users at the same local server A clear disadvantage of separating users into distinct domains Used by Napster
36
Pure P2P All nodes are equal No centralized server Example: Gnutella
37
A completely distributed P2P network: the Gnutella network is composed of clients. The client software is made of two parts: a mini search engine (the client) and a file serving system (the “server”). Relies on broadcast search.
38
Gnutella - Operations Connect – establishing a logical connection PingPong – discovering new nodes (my friend’s friends) Query – look for something Download – download files (simple HTTP)
39
Gnutella – Form an Overlay [Diagram: a new node sends Connect and gets OK back, then exchanges Ping/Pong messages with its neighbors]
40
How to find a node? Initially, ad hoc ways: email, online chat, news groups… Bottom line: you have to know someone! Set up some long-lived nodes; a newcomer contacts the well-known nodes. Useful for building a better overlay topology.
41
Gnutella – Search [Diagram: a node broadcasts a query for “Green Toad”; nodes A and B both answer “I have”; A looks nice, B is too far]
42
On a larger scale, things get more complicated
43
Gnutella – Scalability Issue Can the system withstand flooding from every node? Use a TTL to limit the range of propagation: with a fan-out of 5 and a TTL of 5, roughly 5^5 = 3125 nodes can be reached. This creates a “horizon” of computers. The promise is that your horizon can change every time you log in.
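A back-of-the-envelope check of the horizon size, assuming each node forwards the query to a fixed number of neighbors (the 5^5 figure corresponds to a fan-out of 5 and a TTL of 5); real Gnutella neighbor sets overlap, so these are upper bounds.

```python
# Upper bound on nodes reachable by flooding with a given TTL and fan-out,
# ignoring overlap between neighbor sets (so real reach is smaller).
def flood_horizon(fanout, ttl):
    return sum(fanout ** hop for hop in range(1, ttl + 1))

print(flood_horizon(5, 5))   # 5 + 25 + ... + 5**5 = 3905 nodes in total
print(5 ** 5)                # 3125: nodes at exactly the last hop, as quoted on the slide
```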
44
The Differences While the pure P2P model is completely symmetric, in the hybrid model elements of both PP2P and C/S coexist. Each model has its disadvantages: PP2P still has problems locating information; HP2P has scalability problems, as with ordinary server-oriented models.
45
P2P – Summary Current conditions have allowed P2P to enter the world of PCs. It controls the niche of resource sharing. The model is being studied from both the academic and the commercial points of view. There are still problems out there…
46
End Of Part I
47
Part II Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications Robert Morris, Ion Stoica, David Karger, M. Frans Kaashoek, Hari Balakrishnan – MIT and Berkeley Presented by Roy Werber, Idan Gelbourt
48
A P2P Problem Every application in a P2P environment must handle an important problem: The lookup problem What is the problem?
49
A Peer-to-peer Storage Problem 1000 scattered music enthusiasts Willing to store and serve replicas How do you find the data?
50
The Lookup Problem [Diagram: nodes N1–N6 connected through the Internet; a publisher inserts (Key=“title”, Value=MP3 data…) and a client issues Lookup(“title”)] In a dynamic network with N nodes, how can the data be found?
51
Centralized Lookup (Napster) [Diagram: the publisher at N4 registers SetLoc(“title”, N4) with a central DB; a client asks the DB Lookup(“title”) and is directed to N4, which holds Key=“title”, Value=MP3 data…] Simple, but O(N) state at the server and a single point of failure. Hard to keep the data in the server updated.
52
Flooded queries (Gnutella) [Diagram: the client floods Lookup(“title”) through nodes N1–N9 until it reaches the publisher, which holds Key=“title”, Value=MP3 data…] Robust, but worst case O(N) messages per lookup. Not scalable.
53
So Far Centralized : - Table size – O(n) - Number of hops – O(1) Flooded queries: - Table size – O(1) - Number of hops – O(n)
54
We Want Efficiency : O(log(N)) messages per lookup N is the total number of servers Scalability : O(log(N)) state per node Robustness : surviving massive failures
55
How Can It Be Done? How do you search in O(log(n)) time? Binary search You need an ordered array How can you order nodes in a network and data items? Hash function!
56
Chord: Namespace The namespace is a fixed-length bit string. Each object is identified by a unique ID. How to get the ID? SHA-1. [Diagram: a key (“Shark”) and a node address (194.90.1.5:8080) are each hashed with SHA-1 into IDs in the same space, e.g. DE11AC and AABBCC]
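A minimal sketch of deriving Chord-style IDs with SHA-1, as on this slide (the helper name and the string encoding of the address are illustrative):

```python
# Deriving Chord identifiers with SHA-1: keys and node addresses share one ID space.
import hashlib

M = 160  # Chord/CFS use a 160-bit identifier space (the size of a SHA-1 digest)

def chord_id(value: str) -> int:
    return int(hashlib.sha1(value.encode()).hexdigest(), 16) % (2 ** M)

object_id = chord_id("Shark")            # ID for a data item / key
node_id = chord_id("194.90.1.5:8080")    # ID for a node, from its address
print(hex(object_id), hex(node_id))
```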
57
Chord Overview Provides just one operation: a peer-to-peer hash lookup: Lookup(key) → IP address. Chord does not store the data. Chord is a lookup service, not a search service. It is a building block for P2P applications.
58
Chord IDs Uses Hash function: Key identifier = SHA-1(key) Node identifier = SHA-1(IP address) Both are uniformly distributed Both exist in the same ID space How to map key IDs to node IDs?
59
Mapping Keys To Nodes [Diagram: items and nodes are mapped onto the same ID space, a line from 0 to M]
60
Consistent Hashing [Karger 97] [Diagram: a circular 7-bit ID space with nodes N32, N90, N105 and keys K5, K20, K80] A key is stored at its successor: the node with the next higher ID.
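A minimal sketch of the successor rule: with the node IDs sorted around the ring, a key is assigned to the first node whose ID is greater than or equal to the key, wrapping around at the top of the ID space (helper names are illustrative).

```python
# Consistent hashing: a key is stored at its successor (next node ID clockwise).
import bisect

def successor(node_ids, key_id, id_space_bits=7):
    ring = sorted(node_ids)
    i = bisect.bisect_left(ring, key_id % (2 ** id_space_bits))
    return ring[i % len(ring)]   # wrap around past the highest node ID

nodes = [32, 90, 105]            # the 7-bit ring from the slide
for key in (5, 20, 80, 110):
    print(f"K{key} -> N{successor(nodes, key)}")
# K5 -> N32, K20 -> N32, K80 -> N90, K110 -> N32 (wraps around)
```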
61
Basic Lookup [Diagram: nodes N10, N32, N60, N90, N105, N120 on the ring; the question “Where is key 80?” is passed from successor to successor until N90 answers “N90 has K80”]
62
“Finger Table” Allows Log(n)-time Lookups [Diagram: in the circular 7-bit ID space, N80 keeps fingers at distances ½, ¼, 1/8, 1/16, 1/32, 1/64 and 1/128 of the ring] N80 knows of only seven other nodes.
63
Finger i Points to Successor of N+2^i [Diagram: N80’s finger for ID 112 (80 + 2^5) points to its successor, N120]
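A minimal sketch of the two finger-table slides above: node n’s i-th finger is the successor of (n + 2^i) mod 2^m, and a lookup repeatedly jumps to the closest finger that does not overshoot the key, roughly halving the remaining distance and yielding O(log N) hops. The helper names are illustrative, not the paper’s pseudocode.

```python
# Finger tables on a small 7-bit ring: finger[i] of node n = successor(n + 2**i).
M = 7
RING = 2 ** M
NODES = sorted([5, 10, 20, 32, 60, 80, 99, 110])

def successor(ident):
    ident %= RING
    for n in NODES:
        if n >= ident:
            return n
    return NODES[0]          # wrap around the ring

def fingers(n):
    return [successor(n + 2 ** i) for i in range(M)]

def in_interval(x, a, b):
    # x lies strictly between a and b going clockwise around the ring
    return (a < x < b) if a < b else (x > a or x < b)

def lookup(start, key, hops=0):
    succ = successor(start + 1)
    # If the key falls between us and our successor, the successor owns it.
    if key == succ or in_interval(key, start, succ):
        return succ, hops + 1
    # Otherwise jump to the closest preceding finger that does not overshoot.
    for f in reversed(fingers(start)):
        if in_interval(f, start, key):
            return lookup(f, key, hops + 1)
    return lookup(succ, key, hops + 1)

print(lookup(80, 19))   # (20, 3): N20 is the successor of K19, reached in a few hops
```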
64
Lookups Take O(log(n)) Hops [Diagram: on a ring with N5, N10, N20, N32, N60, N80, N99, N110, node N32 issues Lookup(K19); each hop jumps via a finger, roughly halving the remaining distance, until the lookup reaches N20, the successor of K19]
65
Joining: Linked List Insert [Diagram: N36 joins between N25 and N40, which holds K30 and K38] 1. N36 wants to join: it performs Lookup(36) to find its successor.
66
Join (2) 2. N36 sets its own successor pointer to N40.
67
Join (3) 3. Keys 26..36 are copied from N40 to N36 (K30 is copied to N36; K38 stays at N40).
68
Join (4) 4. Set N25’s successor pointer to N36. Update finger pointers in the background. Correct successors produce correct lookups.
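A minimal sketch of the four join steps on the preceding slides, modeled as an in-memory ring (a simplification: real Chord does this with RPCs and a background stabilization protocol, and also maintains finger tables).

```python
# Chord join, linked-list style: find successor, set pointer, copy keys, relink.
class Node:
    def __init__(self, ident):
        self.id = ident
        self.successor = self          # a single-node ring points to itself
        self.keys = {}                 # key_id -> value

def between(x, a, b):
    # x in the circular interval (a, b]
    return (a < x <= b) if a < b else (x > a or x <= b)

def find_successor(start, ident):
    n = start
    while not between(ident, n.id, n.successor.id):
        n = n.successor
    return n.successor

def join(new, bootstrap):
    succ = find_successor(bootstrap, new.id)      # 1. find the successor
    new.successor = succ                          # 2. set own successor pointer
    # 3. copy the keys the new node is now responsible for (the successor may
    #    keep its old copies until they expire, as in the slides)
    new.keys.update({k: v for k, v in succ.keys.items()
                     if not between(k, new.id, succ.id)})
    pred = succ                                   # 4. update the predecessor's pointer
    while pred.successor is not succ:             #    (real Chord does this lazily,
        pred = pred.successor                     #    via background stabilization)
    pred.successor = new

n25, n40 = Node(25), Node(40)
join(n25, n40)                         # build a two-node ring {25, 40}
n40.keys = {30: "K30", 38: "K38"}
n36 = Node(36)
join(n36, n25)
print(sorted(n36.keys))                # [30] -- K30 is copied to N36, K38 stays at N40
```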
69
Join: Lazy Finger Update Is OK [Diagram: N2’s finger still points to N40] N2’s finger should now point to N36, not N40. Lookup(K30) visits only nodes < 30, so it will undershoot.
70
Failures Might Cause Incorrect Lookup [Diagram: N10 issues Lookup(90); N85, N102 and N113, between N80 and N120, have failed] N80 doesn’t know its correct successor, so the lookup is incorrect.
71
Solution: Successor Lists Each node knows r immediate successors After failure, will know first live successor Correct successors guarantee correct lookups Guarantee is with some probability
72
Choosing the Successor List Length Assume 1/2 of the nodes fail. P(successor list all dead) = (1/2)^r, i.e. P(this node breaks the Chord ring); this depends on failures being independent. P(no broken nodes) = (1 − (1/2)^r)^N. If we choose r = 2·log2(N), the probability becomes about 1 − 1/N.
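The arithmetic on this slide, checked numerically (log here is base 2):

```python
# Probability that no node's entire successor list is dead, assuming each node
# fails independently with probability 1/2 and lists have r entries.
import math

def p_ring_survives(n, r):
    p_node_breaks = 0.5 ** r              # all r successors of one node are dead
    return (1 - p_node_breaks) ** n       # no node breaks the ring

n = 1000
r = math.ceil(2 * math.log2(n))           # r = 2*log2(N), about 20 successors here
print(r, p_ring_survives(n, r))           # ~0.999, i.e. roughly 1 - 1/N
```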
73
Chord Properties Log(n) lookup messages and table space. Well-defined location for each ID. No search required. Natural load balance. No name structure imposed. Minimal join/leave disruption. Does not store documents…
74
Experimental Overview Quick lookup in large systems Low variation in lookup costs Robust despite massive failure See paper for more results Experiments confirm theoretical results
75
Chord Lookup Cost Is O(log N) [Plot: average messages per lookup vs. number of nodes] The constant is 1/2.
76
Failure Experimental Setup Start 1,000 CFS/Chord servers Successor list has 20 entries Wait until they stabilize Insert 1,000 key/value pairs Five replicas of each Stop X% of the servers Immediately perform 1,000 lookups
77
Massive Failures Have Little Impact [Plot: failed lookups (percent) vs. failed nodes (percent)] (1/2)^6 is 1.6%.
78
Chord Summary Chord provides peer-to-peer hash lookup Efficient: O(log(n)) messages per lookup Robust as nodes fail and join Good primitive for peer-to-peer systems http://www.pdos.lcs.mit.edu/chord
80
Wide-area Cooperative Storage With CFS Robert Morris Frank Dabek, M. Frans Kaashoek, David Karger, Ion Stoica MIT and Berkeley
81
What Can Be Done With Chord Cooperative Mirroring Time-Shared Storage Makes data available when offline Distributed Indexes Support Napster keyword search
82
How to Mirror Open-source Distributions? Multiple independent distributions Each has high peak load, low average Individual servers are wasteful Solution: aggregate Option 1: single powerful server Option 2: distributed service But how do you find the data?
83
Design Challenges Avoid hot spots Spread storage burden evenly Tolerate unreliable participants Fetch speed comparable to whole-file TCP Avoid O(#participants) algorithms Centralized mechanisms [Napster], broadcasts [Gnutella] CFS solves these challenges
84
CFS Overview CFS – Cooperative File System: a P2P read-only storage system. Read-only – only the owner can modify files. Completely decentralized. [Diagram: each node acts as both client and server, connected through the Internet]
85
CFS - File System A set of blocks distributed over the CFS servers 3 layers: FS – interprets blocks as files (Unix V7) Dhash – performs block management Chord – maintains routing tables used to find blocks
86
Chord Uses 160-bit identifier space Assigns to each node and block an identifier Maps block’s id to node’s id Performs key lookups (as we saw earlier)
87
Dhash – Distributed Hashing Performs block management on top of Chord: block retrieval, storage and caching. Provides load balance for popular files. Replicates each block at a small number of places (for fault-tolerance).
88
CFS - Properties Tested on prototype : Efficient Robust Load-balanced Scalable Download as fast as FTP Drawbacks No anonymity Assumes no malicious participants
89
Design Overview [Diagram: layered design – FS on top of DHash on top of Chord] DHash stores, balances, replicates and caches blocks. DHash uses Chord [SIGCOMM 2001] to locate blocks.
90
Client-server Interface Files have unique names. Files are read-only (single writer, many readers). Publishers split files into blocks. Clients check files for authenticity. [Diagram: the client’s FS layer inserts/looks up file f; server nodes insert/look up blocks]
91
Naming and Authentication 1. Name could be a hash of the file content: easy for the client to verify, but an update requires a new file name. 2. Name could be a public key: the document contains a digital signature, which allows verified updates with the same name.
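A minimal sketch of option 1, content-hash naming: a block’s name is the SHA-1 of its content, so a client can verify what it fetched by re-hashing it (the helper names stand in for DHash and are not the actual CFS API; option 2 would verify a digital signature against the public key instead).

```python
# Content-hash naming: the block's ID is SHA-1(content), so fetches are verifiable.
import hashlib

store = {}   # stands in for DHash: block_id -> block bytes

def put_block(content: bytes) -> str:
    block_id = hashlib.sha1(content).hexdigest()
    store[block_id] = content
    return block_id                      # the ID doubles as the block's name

def get_block(block_id: str) -> bytes:
    content = store[block_id]
    # The client re-hashes the content; a corrupted or forged block won't match.
    assert hashlib.sha1(content).hexdigest() == block_id, "authenticity check failed"
    return content

bid = put_block(b"inode block: H(B1), H(B2)")
print(get_block(bid) == b"inode block: H(B1), H(B2)")   # True
```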
92
CFS File Structure [Diagram: a root block, signed with the publisher’s public key, contains H(D); the directory block D contains H(F); the inode block F contains H(B1), H(B2), …; B1, B2, … are data blocks]
93
File Storage Data is stored for an agreed-upon finite interval Extensions can be requested No specific delete command After expiration – the blocks fade
94
Storing Blocks Long-term blocks are stored for a fixed time; publishers need to refresh them periodically. The cache uses LRU (Least Recently Used) replacement. [Diagram: the disk is split between cache space and long-term block storage]
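A minimal sketch of the LRU policy mentioned here, using an ordered dictionary (illustrative only; the real cache manages disk space rather than an in-memory dict).

```python
# LRU block cache: the least recently used block is evicted when space runs out.
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()          # block_id -> block, oldest first

    def get(self, block_id):
        if block_id not in self.blocks:
            return None
        self.blocks.move_to_end(block_id)    # mark as most recently used
        return self.blocks[block_id]

    def put(self, block_id, block):
        self.blocks[block_id] = block
        self.blocks.move_to_end(block_id)
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)  # evict the least recently used block

cache = LRUCache(capacity=2)
cache.put("b1", b"...")
cache.put("b2", b"...")
cache.get("b1")                  # touch b1 so it becomes most recently used
cache.put("b3", b"...")          # evicts b2, the least recently used
print(list(cache.blocks))        # ['b1', 'b3']
```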
95
Replicate Blocks at k Successors [Diagram: block 17 is stored at its successor and replicated at the next k nodes on the ring (N20, N40, N50, …)] Replica failures are independent.
96
Lookups Find Replicas [Diagram: a node issues Lookup(BlockID=17) on the ring] RPCs: 1. Lookup step 2. Get successor list 3. Failed block fetch 4. Block fetch
97
First Live Successor Manages Replicas [Diagram: the first live successor of block 17 holds the block; a copy of 17 is placed at the following node]
98
DHash Copies to Caches Along Lookup Path [Diagram: a node issues Lookup(BlockID=45)] RPCs: 1. Chord lookup 2. Chord lookup 3. Block fetch 4. Send to cache
99
Naming and Caching [Diagram: block D30 is stored at N32; clients 1 and 2 look it up along converging paths] Every hop covers a smaller distance in ID space, so the chance that different lookups collide on the same nodes is high – caching is efficient.
100
Caching Doesn’t Worsen Load Only O(log N) nodes have fingers pointing to N32. This limits the single-block load on N32.
101
Virtual Nodes Allow Heterogeneity – Load Balancing Hosts may differ in disk/net capacity. Hosts may advertise multiple IDs, chosen as SHA-1(IP address, index). Each ID represents a “virtual node”. Host load is proportional to the number of virtual nodes, which is manually controlled. [Diagram: node A runs virtual nodes N10, N60, N101; node B runs only N5]
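A minimal sketch of the virtual-node scheme on this slide: a host derives several IDs as SHA-1(IP address, index) and joins the ring once per ID, so its share of the keyspace grows with the number of virtual nodes it advertises (the string encoding of the address is an assumption).

```python
# Virtual nodes: one physical host owns several ring IDs, SHA-1(ip, index).
import hashlib

def virtual_node_ids(ip_address: str, count: int, m: int = 160):
    ids = []
    for index in range(count):
        digest = hashlib.sha1(f"{ip_address},{index}".encode()).hexdigest()
        ids.append(int(digest, 16) % (2 ** m))
    return ids

# A powerful host advertises 3 virtual nodes, a weak one only 1 (manually chosen).
print(virtual_node_ids("194.90.1.5", 3))
print(virtual_node_ids("10.0.0.7", 1))
```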
102
Server Selection By Chord [Diagram: during Lookup(47), a node can forward to fingers with different round-trip times, e.g. 10ms, 12ms, 50ms, 100ms] Each node monitors the RTTs to its own fingers. Tradeoff: ID-space progress vs. delay.
103
Why Blocks Instead of Files? Cost: one lookup per block Can tailor cost by choosing good block size Benefit: load balance is simple For large files Storage cost of large files is spread out Popular files are served in parallel
104
CFS Project Status Working prototype software Some abuse prevention mechanisms Guarantees authenticity of files, updates, etc. Napster-like interface in the works Decentralized indexing system Some measurements on RON testbed Simulation results to test scalability
105
Experimental Setup (12 nodes) One virtual node per host 8Kbyte blocks RPCs use UDP Caching turned off Proximity routing turned off
106
CFS Fetch Time for 1MB File [Plot: fetch time (seconds) vs. prefetch window (KBytes)] Average over the 12 hosts. No replication, no caching; 8 KByte blocks.
107
Distribution of Fetch Times for 1MB [Plot: fraction of fetches vs. time (seconds), for 8, 24 and 40 KByte prefetch windows]
108
CFS Fetch Time vs. Whole File TCP [Plot: fraction of fetches vs. time (seconds), comparing a 40 KByte prefetch window with whole-file TCP]
109
Robustness vs. Failures [Plot: failed lookups (fraction) vs. failed nodes (fraction)] Six replicas per block; (1/2)^6 is 0.016.
110
Future work Test load balancing with real workloads Deal better with malicious nodes Indexing Other applications
111
CFS Summary CFS provides peer-to-peer r/o storage Structure: DHash and Chord It is efficient, robust, and load-balanced It uses block-level distribution The prototype is as fast as whole-file TCP http://www.pdos.lcs.mit.edu/chord
112
The End