P2P Networking 2010
Client/Server Architecture
[Figure: clients send requests such as "GET /index.html HTTP/1.0" to a central server, which answers "HTTP/1.1 200 OK ..."]
Peer-to-Peer Architecture
[Figure labels: Gateway, Server, Peers]
The Architectures
Server-based architecture (client-server / server cluster)
  Problems:
  - Limited resources
  - All load is concentrated on the server
  - Low scalability
  - High setup and maintenance cost
Peer-to-peer (P2P) architecture
  Advantages:
  - Load is distributed across all users
  - Users both consume and provide resources
  - High scalability
  - Low setup and maintenance cost
The Client Side
- Today’s clients can do more than just forward users’ requests
- Today’s clients have:
  - more computing power
  - more storage space
[Figure labels: thin client, fat client]
Evolution at the Client Side
[Timeline figure]
- ’70: DEC’s VT100, no storage
- ’80: IBM 8-bit PC @ 4.77MHz, 360k diskettes
- 2010: 64-bit PC @ 4-core 4GHz, 750GB HD
What Else Has Changed?
- The number of home PCs is increasing rapidly
- Most of these PCs are “fat clients”
- As Internet usage grows, more and more PCs are connecting to the global net
- Most of the time, these PCs are idle
- How can we use all this? Peer-to-Peer (P2P)
What is peer-to-peer (P2P)? “Peer-to-peer is a way of structuring distributed applications such that the individual nodes have symmetric roles. Rather than being divided into clients and servers each with quite distinct roles, in P2P applications a node may act as both a client and a server.” -- Charter of Peer-to-peer Research Group, IETF/IRTF, June 24, 2004 (http://www.irtf.org/charters/p2prg.html)
Resource Sharing
What can we share? Computer-related resources:
- CPU cycles: SETI@home, GIMPS
- Bandwidth: PPLive, PPStream
- Storage space: OceanStore, Murex
- Data: Napster, Gnutella
- People: Buddy Finder
- Camera, microphone, sensor, service???
SETI@Home
- SETI: Search for Extra-Terrestrial Intelligence
- @Home: on your own computer
- A radio telescope in Puerto Rico scans the sky for radio signals
- It fills a 35GB DAT tape in 15 hours
- That data has to be analyzed
SETI@Home - Example
MUREX: A Mutable Replica Control Scheme for Peer-to-Peer Storage Systems Jehn-Ruey Jiang ACN Lab NCU
Murex: Basic Concept HotOS Attendee
Peer-to-Peer Video Streaming … Video stream
Napster -- Shawn Fanning
Napster Sharing Style: hybrid center+edge
1. Users launch Napster and connect to the Napster server
2. Napster creates a dynamic directory from users’ personal .mp3 libraries
3. beastieboy enters the search criterion “song5”
4. Napster displays matches to beastieboy
5. beastieboy makes a direct connection to kingrook to transfer song5.mp3

Users’ libraries:
  “beastieboy”: song1.mp3, song2.mp3, song3.mp3
  “kingrook”:   song4.mp3, song5.mp3, song6.mp3
  “slashdot”:   song5.mp3, song6.mp3, song7.mp3

Directory on the Napster server:
  Title      User        Speed
  song1.mp3  beastieboy  DSL
  song2.mp3  beastieboy  DSL
  song3.mp3  beastieboy  DSL
  song4.mp3  kingrook    T1
  song5.mp3  kingrook    T1
  song5.mp3  slashdot    28.8
  song6.mp3  kingrook    T1
  song6.mp3  slashdot    28.8
  song7.mp3  slashdot    28.8
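The five steps above can be sketched as a small central index. This is only an illustration of the hybrid style, not Napster's actual protocol: the class name `CentralDirectory` and its methods are hypothetical, and substring matching on the title is an assumed search rule.

```python
from collections import defaultdict


class CentralDirectory:
    """Hypothetical stand-in for the Napster server's song directory."""

    def __init__(self):
        # title -> list of (user, speed) entries
        self.index = defaultdict(list)

    def register(self, user, speed, titles):
        """Step 2: add a connected user's library to the directory."""
        for title in titles:
            self.index[title].append((user, speed))

    def search(self, term):
        """Steps 3-4: return (title, user, speed) rows whose title contains term."""
        return [
            (title, user, speed)
            for title in sorted(self.index)
            for user, speed in self.index[title]
            if term in title
        ]


directory = CentralDirectory()
directory.register("beastieboy", "DSL", ["song1.mp3", "song2.mp3", "song3.mp3"])
directory.register("kingrook", "T1", ["song4.mp3", "song5.mp3", "song6.mp3"])
directory.register("slashdot", "28.8", ["song5.mp3", "song6.mp3", "song7.mp3"])

matches = directory.search("song5")
# Step 5 happens outside the server: the searcher contacts a matching peer directly.
print(matches)
```

Only the directory is central here; the file transfer itself never touches the server, which is what makes the style "hybrid".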
History of Napster
- 5/99: Shawn Fanning (freshman, Northeastern University) founds Napster Online (supported by Groove)
- 12/99: First lawsuit
- 7/01: 160K simultaneous online users
- 6/02: Files for bankruptcy
- …
- 10/03: Napster 2 (supported by Roxio; users pay $9.99/month)
Gnutella -- Justin Frankel and Tom Pepper
Gnutella (pronounced /nʊˈtɛlə/ with a silent g, or alternatively /gnʊˈtɛlə/, following R. M. Stallman’s pronunciation of GNU) is a file-sharing network. As of December 2005, Gnutella was the third-most-popular file-sharing network on the Internet, after eDonkey 2000 and FastTrack. Gnutella is thought to have approximately 2.2 million users on average, of whom around 750,000-1,000,000 are online at any given moment.[1]
GNU: Recursive Acronym
- The ‘animal’ gnu
- GNU: GNU’s Not Unix ….
- Gnutella = GNU + Nutella
- Nutella: a hazelnut chocolate spread produced by the Italian confectioner Ferrero ….
GNU
- GNU: GNU's Not Unix
- 1983: Richard Stallman (MIT) proposed the GNU Project; he established the Free Software Foundation in 1985
- Free software is not freeware
- GPL: GNU General Public License
Gnutella History
- Gnutella was written by Justin Frankel, the 21-year-old founder of Nullsoft (Nullsoft was acquired by AOL in June 1999).
- Nullsoft (maker of WinAmp) posted Gnutella on the Web on March 14, 2000.
- A day later, AOL yanked Gnutella at the behest of Time Warner.
- Too late: 23k users were already on Gnutella.
- People had already downloaded and shared the program.
- Gnutella continues today, run by independent programmers.
Gnutella Protocol
Scenario: Joining the Gnutella Network
- The new node connects to a well-known ‘anchor’ (or ‘bootstrap’) node.
- It then sends a PING message to discover other nodes.
- Hosts that offer new connections reply to the new node with PONG messages.
- The new node then makes direct connections to the newly discovered nodes.
[Figure: the new node sends a PING to node A, the PING is forwarded across the Gnutella network, and PONG replies come back]
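The joining scenario can be simulated on an in-memory graph. This is a rough sketch, not the Gnutella wire protocol: the function `join_network`, the `ttl` and `max_peers` parameters, and the rule that every reached host answers with a PONG are all assumptions made for illustration.

```python
def join_network(neighbors, new_node, bootstrap, ttl=2, max_peers=3):
    """Connect new_node via bootstrap, then to peers that answer its PING.

    neighbors maps each node to the set of nodes it is connected to and is
    updated in place. Returns the list of nodes that sent a PONG.
    """
    # Connect to the well-known bootstrap node first.
    neighbors[new_node] = {bootstrap}
    neighbors[bootstrap].add(new_node)

    # The PING reaches the bootstrap node, which forwards it hop by hop.
    reached = {new_node, bootstrap}
    frontier = [bootstrap]
    pongs = [bootstrap]
    for _ in range(ttl - 1):
        next_frontier = []
        for node in frontier:
            for peer in sorted(neighbors[node]):
                if peer not in reached:
                    reached.add(peer)
                    next_frontier.append(peer)
                    pongs.append(peer)  # assumed: every reached host replies
        frontier = next_frontier

    # Open direct connections to the discovered nodes (bootstrap included).
    for peer in pongs[:max_peers]:
        neighbors[new_node].add(peer)
        neighbors[peer].add(new_node)
    return pongs


network = {"A": {"B", "C"}, "B": {"A", "D"}, "C": {"A"}, "D": {"B"}}
responders = join_network(network, "New", "A", ttl=2)
print(responders, sorted(network["New"]))
```

With `ttl=2` the PING stops one hop past the bootstrap node, so node D is never discovered; a larger TTL widens the search at the cost of more messages.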
Peer-to-Peer Overlay Network Focus at the application layer
Peer-to-Peer Overlay Network
[Figure: end systems connected through the Internet; one overlay hop (end-to-end communication) is a TCP connection through the Internet]
Topology of a Gnutella Network
Gnutella: Issue a Request xyz.mp3 ?
Gnutella: Flood the Request
Gnutella: Reply with the File Fully distributed storage and directory! xyz.mp3
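The request, flood, and reply sequence on these slides can be sketched as a breadth-first flood with a hop limit. This is a simplified model under stated assumptions: the function `flood_query`, the example topology, and the default `ttl=7` are illustrative, every node forwards the query to all of its neighbors (including the one it came from), and duplicates are dropped on arrival.

```python
from collections import deque


def flood_query(graph, files, origin, filename, ttl=7):
    """Flood a query from origin; return (nodes holding the file, messages sent)."""
    seen = {origin}
    hits = []
    messages = 0
    queue = deque([(origin, ttl)])
    while queue:
        node, remaining = queue.popleft()
        if node != origin and filename in files.get(node, ()):
            hits.append(node)  # this node would reply with the file
        if remaining == 0:
            continue  # hop limit reached: do not forward any further
        for neighbor in graph[node]:
            messages += 1  # one copy of the query per link
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, remaining - 1))
    return hits, messages


graph = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}
files = {"E": {"xyz.mp3"}}

print(flood_query(graph, files, "A", "xyz.mp3"))
print(flood_query(graph, files, "A", "xyz.mp3", ttl=2))
```

No node keeps a directory of other nodes' files, but the number of messages grows with the size of the network, and a small TTL can miss a file that exists.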
So Far
n: number of participating nodes
- Centralized: directory size O(n), number of hops O(1)
- Flooded queries: directory size O(1), number of hops O(n)
We Want
- Efficiency: O(log(n)) messages per lookup
- Scalability: O(log(n)) state per node
- Robustness: surviving massive failures
How Can It Be Done?
- How do you search in O(log(n)) time? Binary search
- You need an ordered array
- How can you order nodes in a network and data objects? Hash function!
Example of Hashing
- “Shark” → SHA-1 → Object ID (key): AABBCC
- 194.90.1.5:8080 → SHA-1 → Object ID (key): DE11AC
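The hashing step can be tried with Python's standard `hashlib`. Note that a real SHA-1 digest is 160 bits (40 hex digits), so the short IDs on the slide are best read as abbreviated illustrations; the helper name `object_id` is hypothetical.

```python
import hashlib


def object_id(name):
    """Return the SHA-1 digest of name as a 40-character hex string."""
    return hashlib.sha1(name.encode("utf-8")).hexdigest()


shark_key = object_id("Shark")           # key for an object name
peer_key = object_id("194.90.1.5:8080")  # key for a peer address

# Both keys are numbers in the same space, so they can be compared directly.
print(shark_key, peer_key)
print(int(shark_key, 16) < int(peer_key, 16))
```

Because object names and peer addresses go through the same function, both land in one shared, ordered key space.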
Basic Idea
- Objects have hash keys: object “y” is published under H(y)
- Peer nodes also have hash keys in the same hash space: peer “x” joins under H(x)
- Place each object on the peer with the closest hash key
[Figure: Publish(H(y)) and Join(H(x)) into the P2P network; y and x are placed along the hash-key axis]
Mapping an object to the closest node with a larger key
[Figure labels: M; legend: a node, a data object]
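On a sorted list of node keys, this mapping is a binary search. A minimal sketch, assuming integer keys, wrap-around to the smallest node key when no larger one exists, and that a node whose key equals the object key also counts as a match; `responsible_node` is a hypothetical name.

```python
from bisect import bisect_left


def responsible_node(node_keys, object_key):
    """Return the first node key >= object_key, wrapping around the key space."""
    ordered = sorted(node_keys)
    i = bisect_left(ordered, object_key)
    return ordered[i % len(ordered)]


nodes = [8, 20, 45]
print(responsible_node(nodes, 10))  # falls between 8 and 20
print(responsible_node(nodes, 50))  # past the largest node key, so it wraps
```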
Viewed as a Distributed Hash Table
[Figure: a hash table with keys up to 2^128-1, partitioned among peer nodes connected over the Internet]
DHT: Distributed Hash Table
- Input: key (file name)
- Output: value (file location)
- Each node is responsible for a range of the hash table, according to the node’s hash key.
- Objects’ directories are placed in (managed by) the node with the closest key.
- It must adapt to nodes dynamically joining and leaving.
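The put/get interface and the join/leave requirement can be modeled in a single process. This is a toy sketch, not a real DHT: the class `ToyDHT`, the 16-bit key space, and the node IDs are assumptions, and on a join it simply re-places every entry, whereas a real system would move only the affected range.

```python
import hashlib
from bisect import bisect_left, insort


class ToyDHT:
    """Single-process toy model of a DHT; names and sizes are illustrative."""

    def __init__(self, bits=16):
        self.space = 2 ** bits  # small key space so IDs stay readable
        self.node_ids = []      # sorted node hash keys
        self.tables = {}        # node id -> {file name: file location}

    def key_of(self, name):
        digest = hashlib.sha1(name.encode("utf-8")).hexdigest()
        return int(digest, 16) % self.space

    def owner(self, key):
        """Node responsible for key: first node id >= key, wrapping around."""
        i = bisect_left(self.node_ids, key)
        return self.node_ids[i % len(self.node_ids)]

    def put(self, name, location):
        self.tables[self.owner(self.key_of(name))][name] = location

    def get(self, name):
        return self.tables[self.owner(self.key_of(name))].get(name)

    def join(self, node_id):
        insort(self.node_ids, node_id)
        self.tables[node_id] = {}
        self._redistribute()

    def leave(self, node_id):
        # Assumes a graceful departure and at least one remaining node.
        departed = self.tables.pop(node_id)
        self.node_ids.remove(node_id)
        for name, location in departed.items():
            self.put(name, location)

    def _redistribute(self):
        entries = [item for table in self.tables.values() for item in table.items()]
        for table in self.tables.values():
            table.clear()
        for name, location in entries:
            self.put(name, location)


dht = ToyDHT()
for node_id in (1000, 30000, 60000):
    dht.join(node_id)
dht.put("xyz.mp3", "194.90.1.5:8080")
dht.join(45000)   # entries whose owner changed move to the new node
dht.leave(30000)  # the departing node hands its entries to the new owner
print(dht.get("xyz.mp3"))
```

Lookups keep working across the join and the leave because every entry always sits on the node that currently owns its key.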
How to Find an Object?
[Figure: hash table with keys up to 2^128-1 and peer nodes]
Simple Idea
- Track peers that allow us to move quickly across the hash space
- A peer p tracks the peers responsible for hash keys (p + 2^(i-1)), i = 1, ..., m
[Figure: hash table with keys up to 2^128-1; peer node i points to i+2^2, i+2^4, i+2^8]
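The tracked keys are easy to compute. A small worked example, assuming an m-bit circular key space so that the sums wrap modulo 2^m; `finger_starts` is a hypothetical helper name.

```python
def finger_starts(p, m):
    """Keys (p + 2^(i-1)) mod 2^m for i = 1..m that peer p keeps a pointer for."""
    return [(p + 2 ** (i - 1)) % (2 ** m) for i in range(1, m + 1)]


print(finger_starts(2, 4))           # peer 2 in a 4-bit space
print(finger_starts(10, 4))          # peer 10 in the same space; the last key wraps
print(len(finger_starts(2, 128)))    # entries needed in a 128-bit space
```

Each peer keeps only m pointers, yet the largest one jumps halfway around the key space, which is what makes logarithmic lookups possible.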
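Chord Lookup – with finger table
Circular 4-bit ID space

Finger table of node 2:
  Start  Interval  Node
  2+1    [3,4)     3
  2+2    [4,6)     7
  2+4    [6,10)    7
  2+8    [10,2)    10

Finger table of node 10:
  Start  Interval  Node
  10+1   [11,12)   12
  10+2   [12,14)   12
  10+4   [14,2)    14
  10+8   [2,10)    2

Lookup: “I’m node 2. Please find key 14!”
- At node 2: 14 ∈ [10,2), so the request goes to node 10
- At node 10: 14 ∈ [14,2), so the request goes to node 14

O(log n) hops (messages) for each lookup!!
O(log n) states per node
[Figure: ring with labels 1, 2, 3, 7, 10, 12, 14, 15]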
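The lookup on this slide can be replayed in code. This is a sketch of the interval-based forwarding shown in the finger tables, not a full Chord implementation: the node set is an assumption read off the ring labels, all names are hypothetical, and the global `NODES` list stands in for information that real peers would learn through the protocol.

```python
from bisect import bisect_left

M = 4                                    # bits in the circular ID space
RING = 2 ** M
NODES = [1, 2, 3, 7, 10, 12, 14, 15]     # assumed ring membership (sorted)


def successor(key):
    """First node whose ID is >= key, wrapping around the ring."""
    i = bisect_left(NODES, key % RING)
    return NODES[i % len(NODES)]


def predecessor(node):
    return NODES[NODES.index(node) - 1]


def in_interval(x, lo, hi):
    """True if x lies in the circular half-open interval [lo, hi)."""
    if lo < hi:
        return lo <= x < hi
    return x >= lo or x < hi


def finger_table(n):
    """Rows (start, end, node) for the intervals [n + 2^(i-1), n + 2^i)."""
    rows = []
    for i in range(1, M + 1):
        start = (n + 2 ** (i - 1)) % RING
        end = (n + 2 ** i) % RING
        rows.append((start, end, successor(start)))
    return rows


def lookup(start_node, key):
    """Return (responsible node, list of nodes visited)."""
    node, path = start_node, [start_node]
    while True:
        # A node is responsible for the keys in (predecessor, node].
        if in_interval(key, (predecessor(node) + 1) % RING, (node + 1) % RING):
            return node, path
        # Otherwise forward to the finger whose interval contains the key.
        for start, end, finger in finger_table(node):
            if in_interval(key, start, end):
                node = finger
                break
        path.append(node)


print(finger_table(2))
print(lookup(2, 14))
```

Starting at node 2, the request for key 14 visits nodes 2, 10, and 14, matching the two forwarding steps on the slide.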
Classification of P2P Systems
- Hybrid P2P: preserves some of the traditional C/S architecture. A central server links clients, stores index tables, etc.
  Examples: Napster
- Unstructured P2P: no control over topology and file placement
  Examples: Gnutella, Morpheus, Kazaa, etc.
- Structured P2P: topology is tightly controlled and placement of files is not random
  Examples: Chord, CAN, Pastry, Tornado, etc.
Notes on names:
- Morpheus: the Greek god of dreams and of sleep
- Tornado: a violent and destructive whirlwind, in the form of a funnel-shaped cloud moving on a long and narrow path
Q&A