Slide 1: CMPT 401, Summer 2007. Dr. Alexandra Fedorova. Lecture XIV: P2P

Slide 2: Outline
– Definition of peer-to-peer systems
– Motivation and challenges of peer-to-peer systems
– Early P2P systems (Napster, Gnutella)
– Structured overlays (Pastry)
– P2P applications: Squirrel, OceanStore

Slide 3: Definition of P2P
[Diagram: a peer-to-peer network contrasted with a client-server architecture]
P2P systems are motivated by the massive computing resources connected over networks all around the world.

Slide 4: Definition of Peer-to-Peer
Simple definition:
– A network architecture
– Without centralized coordination
Detailed definition:
– Each node/peer is a client and a server at the same time
– Each peer provides content and/or resources
– Peers exchange data directly with each other
– Peers are autonomous (they can join and leave at will)

Slide 5: Why P2P?
– To enable the sharing of data and resources
– Computer and Internet usage has exploded in recent years
– Massive computing resources are available at the edges of the Internet: storage, cycles, content, human presence

Slide 6: World Internet Usage and Population Statistics

Region                     Population (2007 est.)  % of World Pop.  Internet Users   Penetration  % of World Usage  Growth 2000-2007
Africa                     933,448,292             14.2%            33,545,600       3.6%         2.9%              643.1%
Asia                       3,712,527,624           56.5%            418,007,015      11.3%        36.2%             265.7%
Europe                     809,624,686             12.3%            321,853,477      39.8%        27.9%             206.2%
Middle East                193,452,727             2.9%             19,539,300       10.1%        1.7%              494.8%
North America              334,538,018             5.1%             232,655,287      69.5%        20.2%             115.2%
Latin America/Caribbean    556,606,627             8.5%             109,961,609      19.8%        9.5%              508.6%
Oceania/Australia          34,468,443              0.5%             18,796,490       54.5%        1.6%              146.7%
World total                6,574,666,417           100.0%           1,154,358,778    17.6%        100.0%            219.8%

Slide 7: Benefits and Challenges
Benefits:
– Massive resources
– Load balancing
– Anonymity
– Fault tolerance
– Locality
Challenges:
– Security
– Failure handling (nodes constantly joining and leaving)
– Efficiency: how to search a massive system efficiently
– Supporting data mutation

Slide 8: Evolution of P2P Systems
Three generations:
– Generation 1: early music-exchange services (Napster, Gnutella)
– Generation 2: greater scalability, anonymity, and fault tolerance (Kazaa)
– Generation 3: middleware layers for application-independent management of distributed resources (Pastry, Tapestry)

Slide 9: Architecture
[Diagram: taxonomy of P2P architectures]
– Hybrid (Napster, SETI@Home)
– Pure
  – Unstructured (Gnutella)
  – Structured (Pastry)
– Super-peer (Kazaa)

Slide 10: Overlay Routing versus IP Routing
– Routing overlays route from one node in the P2P system to another
– At each hop, the message is delivered to the next P2P node
– This is another layer of routing on top of the existing IP routing

Slide 11: Search in Hybrid P2P: Napster
[Diagram: peers A-D connected to a central lookup server holding an index table]
0. Peers upload the names of the songs they hold to the lookup server
1. Peer A queries the server for song A
2. The server returns IP(B), the address of peer B, which holds the song
3. Peer A downloads the song directly from peer B
Key points: lookup is centralized; peers provide only meta-information to the lookup server; the data exchange happens directly between peers.
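
The centralized index amounts to a lookup table keyed by song name. Below is a minimal, hypothetical Python sketch of this flow (the names and structure are ours; the real Napster protocol differed in its details):

    class LookupServer:
        def __init__(self):
            self.index = {}  # song name -> set of peer addresses

        def upload(self, song, peer_addr):
            # Step 0: a peer registers a song it is willing to serve
            self.index.setdefault(song, set()).add(peer_addr)

        def query(self, song):
            # Steps 1-2: return the addresses of peers holding the song
            return self.index.get(song, set())

    server = LookupServer()
    server.upload("song A", "10.0.0.2")   # Peer B registers song A
    peers = server.query("song A")        # Peer A asks the server
    # Step 3: Peer A downloads song A directly from an address in `peers`;
    # the server never touches the file data itself.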

Slide 12: Search in Unstructured P2P
[Diagram: peer A floods a query for song A through peers B-H until it reaches peer I, which holds the song]
1. Peer A sends the query to its neighbours with a time-to-live counter (TTL = N)
2. Each peer forwards the query to its own neighbours, decrementing the TTL at every hop (TTL = N-1, ...); the query dies when the TTL reaches zero
3. If the file is found, the requesting peer downloads it directly from the peer that holds it
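
The flooding search is essentially a breadth-first traversal that stops when the TTL runs out. The following Python sketch is a simplification (real Gnutella also deduplicates queries by message ID, a role played here by the `seen` set):

    def flood_query(graph, start, song, ttl):
        """Return the set of peers holding `song` within `ttl` hops of `start`."""
        hits, seen, frontier = set(), {start}, [start]
        while frontier and ttl > 0:
            ttl -= 1                      # TTL drops by one per hop
            nxt = []
            for peer in frontier:
                for nb in graph[peer]["neighbours"]:
                    if nb in seen:
                        continue          # don't re-flood a peer
                    seen.add(nb)
                    if song in graph[nb]["songs"]:
                        hits.add(nb)      # file found: download directly from nb
                    nxt.append(nb)
            frontier = nxt
        return hits

    graph = {
        "A": {"neighbours": ["B", "C"], "songs": set()},
        "B": {"neighbours": ["A", "I"], "songs": set()},
        "C": {"neighbours": ["A"], "songs": set()},
        "I": {"neighbours": ["B"], "songs": {"song A"}},
    }
    print(flood_query(graph, "A", "song A", ttl=3))  # -> {'I'}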

Slide 13: Common Issues
– Organizing and maintaining the overlay network (node arrivals, node failures)
– Resource allocation / load balancing
– Resource location
– Network proximity routing
Idea: provide a generic P2P substrate that solves these problems once (Pastry, Chord, Tapestry)

Slide 14: Architecture
[Diagram: layered architecture of a structured P2P system]
– Top: P2P application layer — network storage, event notification, and other future applications (the "?")
– Middle: P2P substrate — a self-organizing overlay network (Pastry)
– Bottom: TCP/IP, the Internet

Slide 15: Pastry
– A generic P2P location and routing substrate
– Self-organizing overlay network
– Lookup/insert of an object takes < log_16 N routing steps (expected); for example, with N = 10^6 nodes that is about 5 hops
– O(log N) routing state per node

Slide 16: Pastry: Object Distribution
[Diagram: the circular GUID space, with nodeIds and an objId placed on the circle]
– Objects and nodes carry globally unique IDs (GUIDs)
– GUIDs live in a 128-bit circular space, from 0 to 2^128 - 1
– nodeIds are assigned uniformly at random; objIds are also uniformly distributed
– Invariant: the node whose nodeId is numerically closest to an objId maintains that object
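
The GUID assignment and the closest-node invariant can be sketched as follows. Pastry derives GUIDs with SHA-1; truncating the 160-bit digest to 128 bits and the exact tie-breaking rules are our simplifications:

    import hashlib

    SPACE = 2 ** 128

    def guid(data: bytes) -> int:
        # Truncate the 160-bit SHA-1 digest to the 128-bit GUID space
        return int.from_bytes(hashlib.sha1(data).digest()[:16], "big")

    def circular_distance(a: int, b: int) -> int:
        d = abs(a - b)
        return min(d, SPACE - d)   # the shorter way around the circle

    def home_node(obj_id: int, node_ids: list[int]) -> int:
        # Invariant: the node with the numerically closest nodeId stores the object
        return min(node_ids, key=lambda n: circular_distance(n, obj_id))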

Slide 17: Pastry: Object Insertion/Lookup
[Diagram: a message with key X travelling around the circular GUID space, Route(X)]
– A message with key X is routed to the live node whose nodeId is closest to X
– Problem: a complete routing table (one entry per node) is not feasible

Slide 18: Pastry Routing
– Leaf set: the nodes numerically closest to this node
– Routing table: a subset of nodes that are far away
– If you are far from the target node/object, route using the routing table
– Once you get close, use the leaf set
– The routing table has to be well populated so that many far-away destinations are reachable
– But a complete routing table would be very large; how do we keep its size feasible?

Slide 19: Pastry: Routing Properties
[Diagram: Route(d46a1c) starting at node 65a1fc and hopping via d13da3, d4213f, and d462ba towards d467c4, near d471f1 on the ring]
– log_16 N routing steps
– O(log N) state per node

Slide 20: Pastry: Routing Table of Node 65a1fc
[Figure: the routing table, log_16 N rows of 15 entries each]
– Row 0: entries 0x, 1x, 2x, 3x, 4x, 5x, 7x, 8x, 9x, ax, bx, cx, dx, ex, fx (every first digit except the node's own 6)
– Row 1: entries 60x through 6fx, except 65x
– Row 2: entries 650x through 65fx, except 65ax
– Row 3: entries 65a0x through 65afx, except 65a1x

Slide 21: Pastry Routing Table
– Row i holds entries whose GUIDs share a prefix of length i with this node's GUID (row 0: 0 hex digits in common; row 1: 1 hex digit in common; ...)
– Column d holds the entry whose (i+1)-th digit is d (column 0: the first non-shared digit is 0; column a: the first non-shared digit is a)
– Each entry is a [GUID, IP address] pair
– To route, go as far down the rows as the shared prefix with the destination allows
– When no more digits match, forward the request to the [GUID, IP] entry in the column given by the destination's first non-matching digit (see the sketch below)
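
The row/column arithmetic amounts to computing the shared hex prefix and reading off the destination's next digit, as in this small sketch (GUIDs written as hex strings, as on the slides):

    def shared_prefix_len(a: str, b: str) -> int:
        n = 0
        while n < len(a) and a[n] == b[n]:
            n += 1
        return n

    def row_and_column(node: str, dest: str) -> tuple[int, int]:
        row = shared_prefix_len(node, dest)
        col = int(dest[row], 16)   # first digit where the GUIDs differ
        return row, col

    # The slides' example: 65a1fc routing towards d46a1c
    print(row_and_column("65a1fc", "d46a1c"))   # (0, 13) -> row 0, column d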

Slide 22: Pastry Routing: What's the Next Hop?
[Figure: the same routing table of node 65a1fc as on slide 20]

Slide 23: Pastry: Routing Algorithm

    if destination D is within range of our leaf set:
        forward to the numerically closest member of the leaf set
    else:
        let l = length of the prefix shared with D
        let d = value of the (l+1)-th digit of D's address
        if the table entry R[l][d] exists:
            forward to the IP address stored at R[l][d]
        else:
            forward to a known node that
            (a) shares at least as long a prefix with D, and
            (b) is numerically closer to D than this node
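
A runnable Python rendering of this pseudocode is sketched below. The data layout (`table[l][d]` holding a GUID or None; leaf set and known nodes as lists of hex strings) is our assumption, and real Pastry additionally prefers nearby nodes by network proximity:

    def next_hop(node: str, dest: str, leaf_set, table, known):
        """One routing decision at `node`; returns the GUID of the next hop."""
        if node == dest:
            return node
        as_int = lambda g: int(g, 16)
        dist = lambda g: abs(as_int(g) - as_int(dest))
        # Case 1: destination within the leaf set's range -> deliver to the
        # numerically closest member (possibly this node itself)
        ids = list(map(as_int, leaf_set))
        if min(ids) <= as_int(dest) <= max(ids):
            return min(leaf_set + [node], key=dist)
        # Case 2: routing-table hop (row = shared prefix length,
        # column = the destination's next digit)
        l = shared_prefix_len(node, dest)   # helper from the earlier sketch
        d = int(dest[l], 16)
        if table[l][d] is not None:
            return table[l][d]
        # Case 3 (rare): fall back to any known node that shares at least
        # as long a prefix with the destination and is numerically closer
        for g in known:
            if shared_prefix_len(g, dest) >= l and dist(g) < dist(node):
                return g
        return node  # nothing better known: this node is responsible for dest

On slide 28 below, for instance, the row 3 / column a entry is empty, so the final fallback branch applies and returns a node such as d469ab, which shares the longer prefix d46 and is numerically closer to the destination.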

Slide 24: Let's Play Pastry!
– A user at node 65a1fc wants to reach the object with GUID d46a1c
– We will see how each next hop is found using a routing table or a leaf set
– So let's start with the routing table and leaf set at node 65a1fc

Slide 25:
– Node: 65a1fc; destination: d46a1c
– Leaf set: 65a123, 65abba, 65badd, 65cafe — the destination is not within its range
– The destination shares no prefix digits with 65a1fc, so use row 0, column d of the routing table
[Figure: routing table of node 65a1fc]
– Next hop: the dx entry, GUID d13da3

Slide 26:
– Node: d13da3; destination: d46a1c
– Leaf set: d13555, d14abc, da1367, dbcdd5 — no member is close to the destination
– Shared prefix with the destination: d (length 1); the next digit is 4, so use row 1, column 4
[Figure: routing table of node d13da3]
– Next hop: the d4x entry, GUID d4213f

Slide 27:
– Node: d4213f; destination: d46a1c
– Leaf set: d42cab, d42fab, dacabb, ddaddd — no member is close to the destination
– Shared prefix: d4 (length 2); the next digit is 6, so use row 2, column 6
[Figure: routing table of node d4213f]
– Next hop: the d46x entry, GUID d462ba

Slide 28:
– Node: d462ba; destination: d46a1c
– Leaf set: d46cab, d46fab, dacada, deaddd — the destination is not within its range
– Shared prefix: d46 (length 3); the next digit is a, so look at row 3, column a — but that entry is empty
[Figure: routing table of node d462ba, with the d46ax entry empty]
– Fall back: forward to any known GUID that shares at least as long a prefix and is numerically closer than the current node
– Next hop: GUID d469ab

Slide 29:
– Node: d469ab; destination: d46a1c
– Leaf set: d469ac, d46a00, d46a1c, dcadda — the destination itself is in the leaf set
[Figure: routing table of node d469ab]
– Deliver to d46a1c. We are done!

Slide 30: A New Node Joining Pastry
– The new node computes its own GUID X by applying the SHA-1 hash function to its public key
– It obtains the IP address of at least one Pastry node (publicly available)
– It finds a nearby Pastry node A (by repeatedly querying the leaf sets of known Pastry nodes)
– It sends a join message to A with destination X
– A routes the message to the node Z whose nodeId is numerically closest to X
– Call the nodes along the route B, C, ...
– Each node on the route sends X a part of its routing table and leaf set
– X constructs its own routing table and leaf set from these, requesting additional information if needed (a sketch follows below)
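
How X assembles its initial state from the join route can be sketched as follows (the node objects and their `.table`/`.leaf_set` attributes are hypothetical):

    def build_state(x: str, route_nodes):
        """`route_nodes[i]` is the i-th node object on the join route
        (A = route_nodes[0], ..., Z = route_nodes[-1])."""
        table = []
        for i, hop in enumerate(route_nodes):
            # The i-th hop shares an i-digit prefix with X, so its row i
            # is a sensible initial value for X's row i
            table.append(list(hop.table[i]))
        # Z is numerically closest to X, so Z's leaf set seeds X's own
        leaf_set = list(route_nodes[-1].leaf_set)
        return table, leaf_set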

Slide 31: Node Failure or Departure
Repairs to the leaf set:
– Members of the leaf set are monitored with heartbeat messages
– If a member has failed, the node searches for the live node numerically closest to the failed member
– The node asks that node for its leaf set and adds members from it to its own leaf set
– The node also informs its other neighbours of the failure (see the sketch below)
Repairs to the routing table:
– Done lazily, on a "when discovered" basis
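
A sketch of the leaf-set repair, with hypothetical node objects and method names standing in for the remote calls:

    def repair_leaf_set(self_node, failed):
        """get_leaf_set() and notify_failure() stand in for RPCs."""
        self_node.leaf_set.remove(failed)
        # Ask the live member numerically closest to the failed node
        donor = min(self_node.leaf_set,
                    key=lambda n: abs(int(n.guid, 16) - int(failed.guid, 16)))
        for member in donor.get_leaf_set():
            if member is not self_node and member not in self_node.leaf_set:
                self_node.leaf_set.append(member)
        for neighbour in self_node.leaf_set:
            neighbour.notify_failure(failed)   # inform the other neighbours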

Slide 32: Pastry Evaluation: Experimental Setup
– Evaluated on a simulator: a single machine simulates a large network of nodes
– Message passing is replaced by simulated transmission delays
– The join/leave behaviour of hosts is modelled
– IP delays and join/leave parameters are based on real measurements
– The simulator was validated against a real installation of 52 nodes

Slide 33: Pastry Evaluation: Dependability
With an IP message loss rate of 0%:
– Pastry failed to deliver 1.5 in 100,000 requests (due to unavailability of the destination host)
– All requests that were delivered arrived at the correct node
With an IP message loss rate of 5%:
– Pastry lost 3.3 in 100,000 requests
– 1.6 in 100,000 requests were delivered to the wrong node

Slide 34: Pastry Evaluation: Performance
– Performance metric: relative delay penalty (RDP)
– RDP is the ratio between the delay in delivering a request via the routing overlay and the delay in delivering the same request directly via UDP/IP
– RDP is a direct measure of the extra cost of overlay routing; e.g., an RDP of 1.8 means a request takes 1.8 times as long as direct UDP/IP delivery
– RDP in Pastry: 1.8 with zero network message loss; 2.2 with 5% network message loss

Slide 35: Squirrel
– A P2P web cache; the idea is P2P caching of web objects
– Web objects are cached on the nodes of a local network, organized in a P2P overlay over Pastry
– Motivation: no need for a centralized proxy cache
– Each Squirrel node has a Pastry GUID
– Each URL also has a Pastry GUID, computed by applying the SHA-1 hash to the URL
– The Squirrel node whose GUID is numerically closest to a URL's GUID becomes the home node for that URL, i.e., it caches that URL (sketch below)
– A simulation-based evaluation concluded that performance is comparable to that of a centralized cache
– Squirrel was subsequently deployed for real use on a local network of 52 nodes
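
Reusing the `guid` and `home_node` helpers from the Pastry sketch above, the URL-to-home-node mapping is just:

    def squirrel_home(url: str, node_ids: list[int]) -> int:
        url_guid = guid(url.encode())         # SHA-1 hash of the URL
        return home_node(url_guid, node_ids)  # numerically closest Squirrel node

    # Requests for the URL are then routed (via Pastry) to this home node,
    # which serves the object from its cache or fetches it from the origin.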

Slide 36: OceanStore
– A massive, incrementally scalable, persistent storage facility
– Replicated storage of both mutable and immutable objects
– Built on top of the P2P middleware Tapestry (based on GUIDs, similar to Pastry)
– OceanStore objects are like files: data stored in a set of blocks
– Each object is an ordered sequence of immutable versions that are (in principle) kept forever
– Any update to an object results in the generation of a new version (sketch below)
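
The version model can be pictured as an append-only list per object; the class below is our simplification (real OceanStore stores versions as copy-on-write trees of read-only blocks):

    class OceanStoreObject:
        """A simplified stand-in for an OceanStore object."""
        def __init__(self, guid: int):
            self.guid = guid
            self.versions = []              # ordered; entries never change

        def update(self, blocks: tuple):
            # An update never rewrites old data: it appends a new version
            self.versions.append(blocks)
            return len(self.versions) - 1   # index of the new version

        def read(self, version: int = -1):
            return self.versions[version]   # default: the latest version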

Slide 37: OceanStore
[Figure: OceanStore architecture]

Slide 38: OceanStore: Updates
– Clients contact primary replicas to make update requests
– Primary replicas are powerful, stable machines; they reach an agreement on whether to accept the update
– The update data is sent to archive servers for permanent storage
– Meanwhile, the update data is propagated to secondary replicas, which serve queries issued by other clients
– Clients must periodically check for new versions

Slide 39: Summary
– P2P systems harness the massive computing resources available at the edges of the Internet
– Early systems partly depended on a central server (Napster) or used unstructured routing such as flooding (Gnutella)
– Later it was recognized that the common requirements of P2P systems could be met by P2P middleware (Pastry, Tapestry, Chord)
– P2P middleware provides routing, self-organization, node arrival and departure, and failure recovery
– Most P2P applications support sharing of immutable objects (Kazaa, BitTorrent)
– Some support mutable objects (OceanStore, Ivy)
– Other uses of P2P technology include Internet telephony (Skype)

