“Umbrella”: A novel fixed-size DHT protocol A.D. Sotiriou.

A. D. Sotiriou, P. Kalliaras, N. Mitrou

Overview
- Novel distributed hash table architecture
- Supports key publishing and retrieval on top of an overlay network for content distribution
- Efficient algorithms based on a distributed routing table of constant size for each node
- Minimizes traffic load

Related Work
- Plaxton, Rajaraman and Richa
  - Algorithm was not developed for P2P systems
  - Based on the ground rule of comparing one byte at a time
  - Required knowledge of latencies between all nodes
- Tapestry
  - Variation of the Plaxton algorithm
  - Adjusted for P2P systems
  - Routing table of β · log_β N neighbors; lookups take at most log_β N steps
- Chord applied a different approach
  - Placed nodes in a circular space
  - Maintained information only for a number of successor and predecessor nodes through a finger table
  - Finger table of O(log N) size
- CAN built further on Pastry's variation
  - Applied the DHT to a d-dimensional Cartesian space on a d-torus
  - Constantly divided the space and distributed it among nodes
  - Each node maintained information about its neighbors
  - Constant O(d) table size, but lookups required O(d · N^(1/d)) steps

Structure Overview
- Creation of an overlay network
- All inserting nodes are identified by a unique code
  - SHA-1 on the combination of IP and computer name
- Main objective of the architecture
  - Insert and retain nodes in a simple and well-structured manner
  - Allow querying and fetching of content
    - Efficient
    - Fault-tolerant
  - Retain up-to-date information about a limited, constant number of neighboring nodes
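A minimal sketch of how such an identifier could be derived. The slide only says SHA-1 is applied to a combination of IP and computer name, so the separator, the encoding, the hexadecimal rendering and the function name `node_id` are assumptions made for illustration.

```python
import hashlib


def node_id(ip: str, hostname: str) -> str:
    """Return a 160-bit identifier rendered as 40 hexadecimal digits."""
    # Assumption: the two fields are joined with '|' and encoded as UTF-8;
    # the slides do not specify how IP and computer name are combined.
    material = f"{ip}|{hostname}".encode("utf-8")
    return hashlib.sha1(material).hexdigest()


ident = node_id("192.0.2.10", "workstation-7")
print(len(ident))  # 40 hex digits
```

Hexadecimal digits are a convenient rendering here because each digit takes one of 16 values, which lines up with a 16-ary tree.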

Structure Overview
- Form of a 16-ary tree
  - Each node is placed in a hierarchy
  - 1 parent node
  - 16 child nodes
  - Each node operates autonomously
  - Further links for fault tolerance
- Each level n holds at most 16^(n+1) nodes
- The relation between a parent node at level n and a child node at level n+1:
  - The first n+1 digits of the parent's identifier equal the corresponding digits of the child's
  - Digit n+2 of the child's identifier determines the child's position in the parent's child list
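The parent/child relation can be checked with a few lines of code. This is a sketch assuming identifiers are strings of hexadecimal digits; the function names are hypothetical. Digits are counted from 1 on the slide and from 0 in the code, so "digit n+2" becomes index n+1.

```python
def max_nodes_at_level(n: int) -> int:
    """Upper bound on the number of nodes at level n of the 16-ary tree."""
    return 16 ** (n + 1)


def is_child_of(child_id: str, parent_id: str, parent_level: int) -> bool:
    """True if the first n+1 digits agree, where n is the parent's level."""
    k = parent_level + 1
    return child_id[:k] == parent_id[:k]


def child_slot(child_id: str, parent_level: int) -> int:
    """Position (0..15) of the child in the parent's child list."""
    # Digit n+2 in the slide's 1-based counting is index n+1 here.
    return int(child_id[parent_level + 1], 16)
```

For example, a parent `a30` at level 1 and a child `a3f` share the first two digits, and the child's third digit `f` puts it in slot 15.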

Routing Table
- Three sets of neighborhood nodes
  - Basic: main table for routing
  - Upper: allows routing to nodes of a higher level (when the parent node is unreachable)
  - Lower: allows routing to nodes of a lower level (when child nodes fail)
- Each node is responsible for modifying or fixing its routing table when nodes
  - Enter
  - Leave
  - Fail to communicate
- Maximum steps required: O(log_b N)
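To give a feel for the logarithmic bound, the sketch below counts how many levels a completely packed tree needs to hold N nodes, using the per-level capacity of 16^(n+1) from the structure slides. It is a best-case illustration only: it assumes every level fills up before the next one opens, which a real network is not guaranteed to do. The function name is hypothetical.

```python
def min_levels(n_nodes: int, b: int = 16) -> int:
    """Smallest number of levels whose combined capacity reaches n_nodes."""
    levels, capacity = 0, 0
    while capacity < n_nodes:
        capacity += b ** (levels + 1)  # level `levels` holds b^(levels+1) nodes
        levels += 1
    return levels


for n in (10, 272, 6000):
    print(n, min_levels(n))
```

Under this assumption 6000 nodes fit in four levels (levels 0 to 3), since the first three levels hold 16 + 256 + 4096 = 4368 nodes.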

Main Algorithms – Insert
- The new node contacts an already connected node and issues a request for insertion
- The established node checks whether the first n+1 digits of the new identifier match its own, where n is the level at which the established node resides
  - If not, the insertion message is forwarded to the node's parent
  - If yes, the message is forwarded to the child whose digit n+2 equals that of the new node
    - If no such child exists, the new node is placed as a child of the current node
- The new node is informed of its new neighbors and vice versa
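The insertion rule can be sketched as an in-memory walk over the tree. Several things here are assumptions rather than part of the slides: the `Node` class, the dictionary of children keyed by digit, and a virtual root at level -1 that matches every identifier and holds the level-0 nodes. Neighbor notification is left out.

```python
class Node:
    """Hypothetical in-memory node; `children` maps one digit to a child."""

    def __init__(self, ident, level=-1, parent=None):
        self.ident, self.level, self.parent = ident, level, parent
        self.children = {}

    def matches(self, ident):
        # Compare the first n+1 digits, where n is this node's level.
        k = self.level + 1
        return ident[:k] == self.ident[:k]


def insert(contact, new_id):
    """Route an insertion request from `contact`; return (new node, hops)."""
    node, hops = contact, 0
    while True:
        if not node.matches(new_id):
            node, hops = node.parent, hops + 1   # prefix differs: go up
            continue
        digit = new_id[node.level + 1]           # digit n+2 in 1-based terms
        child = node.children.get(digit)
        if child is None:                        # free slot: attach here
            new = Node(new_id, node.level + 1, node)
            node.children[digit] = new
            return new, hops
        node, hops = child, hops + 1             # slot taken: go down


root = Node("")                 # assumed virtual root above level 0
a, _ = insert(root, "a1")
b, _ = insert(a, "a2")
c, hops = insert(b, "b7")
print(a.level, b.level, c.level, hops)
```

In this run `a2` lands below `a1`, while `b7` climbs two hops to the root before being attached at level 0.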

Main Algorithms – Publish
- If the first n+1 digits of the content's identifier differ from those of the node, the publish message is forwarded to the parent node
- If they match, the message is forwarded to the child with the matching digit n+2
  - If no such child exists, the node publishes the content itself

Main Algorithms – Search
- The node first checks for the keyword in its list of published keywords
  - If it exists, the search terminates
  - If not, the node checks whether the first n+1 digits are identical to those of its own identifier
    - If not, the message is forwarded to its parent
    - If yes, it is forwarded to the child with the matching digit n+2
    - If no such child exists, the search fails
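Publish and search follow the same up-then-down walk, which the sketch below makes explicit. The `Node` class, the `keys` dictionary and the virtual root at level -1 are assumptions for illustration; real nodes would exchange messages rather than follow object references.

```python
class Node:
    """Hypothetical in-memory node; `children` maps one digit to a child."""

    def __init__(self, ident, level=-1, parent=None):
        self.ident, self.level, self.parent = ident, level, parent
        self.children = {}
        self.keys = {}  # published keyword identifier -> content

    def matches(self, ident):
        k = self.level + 1
        return ident[:k] == self.ident[:k]

    def child_for(self, ident):
        i = self.level + 1  # index of digit n+2 in 0-based terms
        return self.children.get(ident[i]) if i < len(ident) else None


def publish(start, key_id, value):
    """Store `value` at the node responsible for `key_id`; return that node."""
    node = start
    while True:
        if not node.matches(key_id):
            node = node.parent          # prefix differs: forward to parent
            continue
        child = node.child_for(key_id)
        if child is None:               # no matching child: publish here
            node.keys[key_id] = value
            return node
        node = child


def search(start, key_id):
    """Return the published content, or None if the search fails."""
    node = start
    while True:
        if key_id in node.keys:         # found locally: search terminates
            return node.keys[key_id]
        if not node.matches(key_id):
            node = node.parent
            continue
        child = node.child_for(key_id)
        if child is None:               # nowhere left to go: search fails
            return None
        node = child
```

Because search inspects the keyword list of every node it passes, content published at a node is still found after new children have been attached below that node.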

Main Algorithms – Departure
- If the node has no children, all of its keywords are forwarded to its parent and it informs all its neighbors of its departure
- If it has any children, it randomly picks one and copies all of its neighborhood and keyword information to it before departing
  - The chosen child moves up a level and substitutes the departing node
- If the chosen child has any children of its own, the previous step is repeated recursively until a node with no children is reached; the first step is then executed, ending the algorithm
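One way to read the departure rule is as a vacancy that travels down the tree: a random child fills the gap, which opens a gap one level lower, and so on until a childless position is reached. The sketch below follows that reading. The `Node` class, the `_fill` helper and the injectable `rng` are assumptions, and neighbor notification is omitted.

```python
import random


class Node:
    """Hypothetical in-memory node; `children` maps one digit to a child."""

    def __init__(self, ident, level=-1, parent=None):
        self.ident, self.level, self.parent = ident, level, parent
        self.children = {}
        self.keys = {}


def depart(node, rng=random):
    """Remove `node`, promoting descendants to fill the gap it leaves."""
    slot = node.ident[node.level]  # digit that selects `node` at its parent
    _fill(node.parent, slot, node.level, node.children, node.keys, rng)


def _fill(parent, slot, level, candidates, keys, rng):
    if not candidates:
        # Childless position: keywords go to the parent, the slot is freed.
        parent.keys.update(keys)
        parent.children.pop(slot, None)
        return
    digit = rng.choice(sorted(candidates))
    heir = candidates.pop(digit)        # randomly chosen child moves up
    orphans = heir.children             # the heir's old subtree needs a head
    heir.parent, heir.level = parent, level
    heir.keys.update(keys)              # keyword information is copied over
    heir.children = candidates          # former siblings become children
    for sibling in candidates.values():
        sibling.parent = heir
    parent.children[slot] = heir
    _fill(heir, digit, level + 1, orphans, {}, rng)  # repeat one level down
```

The promotion keeps the prefix rule intact: a child shares the departing node's first n+1 digits, so it is a valid occupant of the vacated position, and its former siblings are valid children of it.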

Enhanced Algorithms
- System liable to sudden node departures
  - Voluntary departure without calling the appropriate mechanism
  - Sudden departures due to client errors
  - Network disconnections
- Treat all of the above cases in the same manner
  - Changes in the algorithms already presented
  - Allow the system to bypass node failures
- Most changes are based on using the upper and lower sets
  - The upper set is used to forward messages to nodes of a higher level
  - The lower set is used for nodes of a lower level

Enhanced Algorithms – Parent Failure
- Forward requests, in this order, to:
  - The parent's parent node (field Up2 in the upper set)
  - The node to the right of the parent node (field Right2 in the upper set)
  - The node to the left of the parent node (field Left2 in the upper set)
- Whichever of the above succeeds first terminates the mechanism
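The fallback order can be expressed as a simple first-success loop. The field names Up2, Right2 and Left2 come from the slide; representing the upper set as a mapping from field name to address and passing in a `send` callable that reports success are assumptions for this sketch.

```python
from typing import Callable, Mapping, Optional

# Order taken from the slide: grandparent first, then the parent's
# right-hand and left-hand neighbors.
PARENT_FALLBACK_ORDER = ("Up2", "Right2", "Left2")


def forward_past_parent(
    upper_set: Mapping[str, Optional[str]],
    send: Callable[[str], bool],
) -> Optional[str]:
    """Try each fallback in turn; return the field that accepted the request."""
    for field in PARENT_FALLBACK_ORDER:
        address = upper_set.get(field)
        if address is not None and send(address):
            return field  # first success terminates the mechanism
    return None           # every fallback failed


upper = {"Up2": "n1", "Right2": "n2", "Left2": "n3"}
alive = {"n2", "n3"}
print(forward_past_parent(upper, lambda addr: addr in alive))  # Right2
```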

Enhanced Algorithms – Child Failure
- Forward requests, in this order, to:
  - One of the child's children (field Umbrella2 in the lower set)
  - The node to the right of the child (field Umbrella in the basic set)
  - The node to the left of the child (field Umbrella in the basic set)
  - A child of the node to the right of the issuing node (field Right3)
  - A child of the node to the left of the issuing node (field Left3)
- Whichever of the above succeeds first terminates the mechanism

Repair Mechanism
- We have designed a repair mechanism
  - Invoked whenever a node failure is detected
- The algorithm uses the delete algorithm to repair a child failure
  - All failures can be transformed into a child failure by contacting nodes in the neighborhood table and forwarding a repair message
  - Once the appropriate node is reached and informed of the child failure, a variation of the delete algorithm is invoked to repair the failure
    - Substituting the failed node with one of its children
    - Deleting it if none is available
- Each node is responsible for checking its neighborhood table periodically
  - Issuing ping messages to all node entries
  - Invoking the repair mechanism whenever a failure is detected
- This mechanism greatly increases the system's stability and fault tolerance

Repair Mechanism
- Check whether the failed node had children
  - If it had none, contact all of its neighbors through the neighborhood table and inform them of the new structure
  - If it did, one of them must be in the Umbrella2 entry
    - Pick a random entry in the Umbrella2 field and inform all neighbors of the change
    - The new child is informed and gathers the appropriate new neighborhood settings from nearby nodes
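A rough sketch of the choice made when a child has failed. The layout is assumed, not taken from the slides: `children` maps a slot digit to a node address, and `umbrella2` maps the same slot digit to the known children of that child. Informing neighbors and rebuilding the replacement's neighborhood table are not modeled.

```python
import random


def repair_child(children, umbrella2, failed_slot, rng=random):
    """Replace a failed child with one of its own children, if any is known.

    Returns the address of the replacement, or None if the slot was freed.
    """
    candidates = umbrella2.get(failed_slot, [])
    if not candidates:
        # The failed node had no children: simply drop it from the table.
        children.pop(failed_slot, None)
        return None
    replacement = rng.choice(candidates)   # random entry of the Umbrella2 field
    children[failed_slot] = replacement
    # The replacement's remaining siblings are what is left in Umbrella2.
    umbrella2[failed_slot] = [c for c in candidates if c != replacement]
    return replacement


children = {"3": "node-a3", "7": "node-a7"}
umbrella2 = {"3": ["node-a31", "node-a3c"]}
print(repair_child(children, umbrella2, "7"))  # None: slot 7 is freed
```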

Simulation Results
- Extended the NeuroGrid simulator
- Implemented the Umbrella algorithms
- Two sets of results
  - Without repair mechanism
  - With repair mechanism
- Variable network size
- Random node failures

No Failures - Hops
- Prove the integrity of our design under normal conditions
- Conducted simulations with node populations varying from 10 nodes up to 6000 nodes
- Investigated the number of hops required for a successful insertion and lookup with a varying population of nodes
  - Number of hops grows logarithmically with the node population in all mechanisms

No Failures - Messages
- Investigated the overall traffic generated by our architecture
- Total messages per request
  - Low number of messages exchanged
  - Due to the small number of hops required for each successful request
  - Also due to the limited (constant) number of neighbors maintained by each node
- Total number increases linearly with the node population

Failures With No Repair
- Conducted a second set of simulations to test the system's tolerance of node failures
  - Progressively caused node failures from 0 up to 80% of the total node population, in steps of 5%
- For a failure rate of up to 22% of nodes, the success rate stays high (over 80%)
  - Degrades slowly up to a mid-point of 50%
  - Beyond that point the system becomes unstable and success rates drop dramatically

Failures With Repair – Success Rate
- Conducted a third set of simulations with repair
  - Progressively caused node failures from 0 up to 80% of the total node population, in steps of 5%
  - 3T, 6T and 20T repair periods, where T is a constant representing communication activity
    - This ensures that an inactive node will not flood the network with repair messages
- Repair mechanism dramatically increases the success rate
  - Regardless of the node population

Failures With Repair – Messages
- Total number of messages was expected to increase
  - Remains almost constant for failure rates of up to 50%
  - Increases linearly from then on
- In all cases, the average per node is kept reasonably low

Conclusions
- Novel protocol
  - Based on a distributed hash table
  - Supports key publishing and retrieval on top of an overlay network for content distribution
- Analysed our system
  - Proved its correctness and efficacy
- Its main strengths are
  - Fixed-size routing table
  - Efficient routing in O(log_b N) steps
    - Even when more than half of the system's population suddenly fails

Questions?