The Oceanstore Regenerative Wide-area Location Mechanism
Ben Zhao, John Kubiatowicz, Anthony Joseph
Endeavor Retreat, June 2000


Wide-area Location
As systems grow in scale, wide-area location becomes more vital to distributed systems
Existing wide-area location mechanisms:
– SLP with the WASRV extension
– LDAP centroids
– Berkeley SDS
– Globe location system
Unresolved issues:
– true scalability
– fault tolerance against: single/multiple node failures, network partitions, data corruption and malicious users, denial-of-service attacks
– support for high update rates / mobile data
Oceanstore approach:
– wide-area location using Plaxton trees

Previous Work: Plaxton Trees
Distributed tree structure in which every node is the root of some tree
Simple mapping from an object's ID to the root ID of the tree it belongs to
Nodes keep "nearest" neighbor pointers differing in one ID digit
– along with a referrer map of the closest referrer nodes
Insertion:
– insert the object at the local node
– proceed toward the root node hop by hop
– leave back-pointers to the object at each hop
Benefits:
– decouples tree traversal from any single node
– exploits locality with short-cutting back-pointers
– guaranteed number of hops O(log N), where N = size of the namespace
Query:
– proceed toward the root node hop by hop
– if an intersecting node has the desired back-pointer, follow it
– if the root (or best approximate node) is reached with no pointer to the object, the object has not been inserted
[Figure: a search client inserting and searching an object (e.g. obj #62942) along the path to its root node]
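The hop-by-hop traversal toward an object's root can be sketched as digit-by-digit suffix matching. The code below is an illustrative sketch only, assuming 4-digit hexadecimal IDs; the function and variable names are not from the original system.

```python
BASE = 16    # assumed digit base for this sketch
DIGITS = 4   # assumed ID length in digits

def shared_suffix_len(a: int, b: int) -> int:
    """Count the trailing base-16 digits shared by IDs a and b."""
    n = 0
    while n < DIGITS and (a % BASE) == (b % BASE):
        a //= BASE
        b //= BASE
        n += 1
    return n

def route_to_root(start: int, obj_id: int, nodes: set) -> list:
    """Hop by hop, move to a node sharing one more trailing digit with
    obj_id; stop at the root or the best approximate node."""
    path, current = [start], start
    while True:
        want = shared_suffix_len(current, obj_id) + 1
        candidates = [n for n in nodes
                      if shared_suffix_len(n, obj_id) >= want]
        if not candidates:
            return path  # root or best approximation reached
        # take the smallest improvement, so routing fixes one digit per hop
        current = min(candidates, key=lambda n: shared_suffix_len(n, obj_id))
        path.append(current)
```

Each hop resolves one more digit of the object's ID, which is what bounds the route at O(log N) hops.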

Introducing Sibling Meshes
Set of meta-tree structures that assist node insertion and fault recovery
– every node keeps n pointers to nodes "similar" with respect to one property
– example: all nodes ending in 629 belong to a single mesh
– each node belongs to log(N) meshes, where N = number of unique IDs in the namespace
– meshes decrease in size as the granularity becomes finer
Each mesh represents a single hop on the route to a given root
Sibling nodes maintain pointers to each other
(Optional) Each referrer has pointers to the desired node's siblings
[Figure: Plaxton trees at the "ground level," with the Oceanstore "canopy" of sibling pointers above; the single path to the root becomes single hops to the root]
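Mesh membership follows directly from ID suffixes. As a minimal sketch, assuming 4-digit IDs represented as strings (names here are illustrative, not from the system):

```python
DIGITS = 4  # assumed ID length for this sketch

def mesh_keys(node_id: str) -> list:
    """Suffixes naming each mesh a node belongs to, one per suffix
    length, from coarse (1 digit) to fine (the full ID)."""
    return [node_id[-k:] for k in range(1, DIGITS + 1)]

def mesh_members(suffix: str, nodes: list) -> list:
    """All nodes in the mesh identified by the given suffix."""
    return [n for n in nodes if n.endswith(suffix)]
```

Longer suffixes match fewer nodes, which is why the meshes shrink as granularity becomes finer.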

Building a Plaxton Grove
Incremental node-addition algorithm
For a new node N_n to be added with node ID D:
– do a hop-by-hop search for D
– at each hop, visit the X closest nodes
– for each node N_i in the set X:
  – integrate the next neighbor map from N_i's neighbor map
  – take the referrer list from N_i
  – measure the distance between each referrer and N_n
  – if the new distance is shorter, notify the referrer to point to N_n
– stop when no node matching the next digit of the ID can be found
[Figure: at each hop, the new node N_n aggregates neighbor and referrer maps from the closest nodes; referrers looking for IDs closer to N_n are redirected to point to it, within domains of influence of increasing ID granularity]
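The referrer-takeover step above can be sketched as follows. This is a hypothetical sketch, with an arbitrary pluggable distance function; the names are illustrative assumptions, not the original implementation.

```python
def claim_referrers(new_node, visited_node, referrers, dist):
    """Return the referrers of visited_node that should repoint to
    new_node, because new_node is nearer to them than the node they
    currently point at."""
    claimed = []
    for r in referrers:
        if dist(r, new_node) < dist(r, visited_node):
            claimed.append(r)
    return claimed
```

With a one-dimensional distance for illustration, a new node at position 5 would claim the referrers at 4 and 6 from a visited node at 10, but not the referrer at 9.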

Node Removal / Failure
Simple detection of pointer corruption and node failure
Recovery from node failure and data corruption:
– mark the node pointer with an invalid tag
– use the next closest sibling of the failed node
Invalid pointers have a second-chance time-to-live:
– failures are expected to recover within a finite timeframe
– entries marked invalid carry a countdown timer
– each request has some chance of being forwarded to the invalid node, to check whether recovery has completed
– the referrer tracks traffic to the failed node and assigns each packet a "validation" probability
– a restarted node notifies its referrers to remove the invalid tag
– nodes that fail to recover within the timer period must reinsert as new nodes
Node removals are intentional exits from the system:
– the node actively announces its removal to referrers, so invalidation is skipped
– referrers maintain backups by requesting another sibling pointer
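The second-chance scheme can be sketched as a pointer entry with a countdown timer and a small probe probability. This is a minimal sketch assuming illustrative field names and parameter values; none of them come from the original system.

```python
import random

class PointerEntry:
    def __init__(self, node, sibling, ttl=3, probe_p=0.1):
        self.node = node        # primary (possibly failed) node
        self.sibling = sibling  # next closest sibling, used as fallback
        self.invalid = False
        self.ttl = ttl          # remaining second-chance periods
        self.probe_p = probe_p  # chance a request probes the failed node

    def mark_invalid(self):
        self.invalid = True

    def next_hop(self, rng=random.random):
        if not self.invalid:
            return self.node
        if self.ttl > 0 and rng() < self.probe_p:
            return self.node    # probe: has the node recovered yet?
        return self.sibling

    def tick(self):
        """One timer period elapses; an expired entry's node must
        reinsert itself as a new node."""
        if self.invalid and self.ttl > 0:
            self.ttl -= 1
        return self.ttl > 0
```

Routing most traffic to the sibling while occasionally probing the failed node gives recovery detection without blocking queries.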

Self-tuning and Stability
Self-maintenance built into searches
Self-tuning of non-optimal routes:
– keep running totals of subpath distances
– inform nodes of better routes
Stability:
– overlay multiple mappings of node IDs onto physical nodes
– single queries are handled by multiplexing into one query per map
– the overlap provides additional protection against single and multiple server failures, network partitions, and corrupted data
Temporary map pointers:
– referrer and neighbor entries have time-to-live fields
– renewal by usage (implicit) or by explicit renewal messages
– an implicit priority queue: the least often used paths can be "forgotten" in favor of more vital paths
– natural node recovery: simply wait for messages to renew the maps
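Renewal-by-usage can be sketched as a soft-state map in which every lookup implicitly refreshes an entry's time-to-live, so rarely used paths age out. The class and field names below are assumptions made for illustration.

```python
class SoftStateMap:
    def __init__(self, ttl=5):
        self.ttl = ttl
        self.entries = {}  # key -> (value, remaining ttl)

    def put(self, key, value):
        self.entries[key] = (value, self.ttl)

    def get(self, key):
        if key not in self.entries:
            return None
        value, _ = self.entries[key]
        self.entries[key] = (value, self.ttl)  # implicit renewal on use
        return value

    def tick(self):
        """One period elapses; forget entries whose TTL ran out."""
        self.entries = {k: (v, t - 1)
                        for k, (v, t) in self.entries.items() if t > 1}
```

A recovering node needs no special repair protocol under this scheme: it simply waits for ordinary traffic to repopulate and renew its maps.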

Node Replication
Removes the single-node bottleneck:
– critical for load balancing
– adds fault tolerance at root nodes
Node replication:
– copy the neighbor mapping, then regenerate
– redirect referrer traffic
Replica groups:
– share referrer mappings
– use peer monitoring to detect node failure and redirect traffic as necessary
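A replica group can be sketched as a set of peers sharing one referrer mapping, with traffic redirected away from replicas that peers have reported failed. This is an illustrative sketch; the names are assumptions, not the original design.

```python
class ReplicaGroup:
    def __init__(self, replicas):
        self.replicas = list(replicas)          # shared across the group
        self.alive = {r: True for r in self.replicas}

    def report_failure(self, replica):
        """A monitoring peer detected this replica's failure."""
        self.alive[replica] = False

    def route(self, preferred):
        """Send to the preferred replica, or redirect to a live peer."""
        if self.alive.get(preferred):
            return preferred
        for r in self.replicas:
            if self.alive[r]:
                return r
        return None
```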

Malicious Users and DoS (Ongoing...)
Misleading/malicious advertisement:
– source validation at storage nodes
– an orthogonal mechanism ensures the association between an advertisement and its principal of trust
Denial-of-service attacks:
– overload of infrastructure nodes: distribute routing and storage load via node replication
– DoS source identification: probabilistic source packet stamping (Savage et al.)
Invalidation propagation:
– invalidations can be issued by authoritative servers
– they can propagate as datagrams to referrers

More Ongoing Issues
Support for mobility
– Mobile data:
  – all back-pointers point to the initial node (N_init)
  – location updates are sent to the previous node and to N_init
  – all back-pointers other than the root's can be updated to the new position by the current query
  – traveled node pointers time out via pointer expiration
  – on failure, revert to the root, then to N_init, for the current position
– Mobile clients and asynchronous operations:
  – chain location updates on visited nodes
  – when the expected asynchronous requests are satisfied, the node leaves the chain and informs the previous node of its forward link
Oceanstore location as a routing infrastructure
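The chained location updates for mobile clients can be sketched as a list of forward links, where a departing node shortens the chain by handing its forward link to its predecessor. This is a hypothetical sketch; all names are illustrative assumptions.

```python
class ChainNode:
    def __init__(self, name):
        self.name = name
        self.forward = None  # next hop toward the client's position

    def leave(self, prev):
        """Drop out of the chain once pending asynchronous requests are
        satisfied: the predecessor inherits our forward link."""
        prev.forward = self.forward

def resolve(start):
    """Follow forward links to the client's current location."""
    node = start
    while node.forward is not None:
        node = node.forward
    return node.name
```

Resolution still reaches the client after intermediate nodes depart, since each departure splices the chain rather than breaking it.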