Distributed Lookup Systems


Distributed Lookup Systems
Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications
A Scalable Content-Addressable Network

Motivation: How to find data in a distributed system? A publisher somewhere on the Internet stores (Key="LetItBe", Value=MP3 data) on one of the nodes N1..N5; a client must resolve Lookup("LetItBe") to the node holding that data.

Applications
- Peer-to-peer systems: Napster, Gnutella, Groove, FreeNet, …
- Large-scale storage management systems: Publius, OceanStore, PAST, Farsite, CFS, …
- Mirroring (web caching)
- Any wide-area name resolution system

Outline
- Types of solutions
- Evaluation criteria
- CAN and Chord: basic idea, insert/retrieve, join/leave, recovery from failures

Centralized Solution: a central server holds the database (Napster). The publisher inserts (Key="LetItBe", Value=MP3 data) into the central DB; the client sends Lookup("LetItBe") to that same server.

Distributed Solution (1): Flooding (Gnutella, Morpheus, etc.). The client's Lookup("LetItBe") is flooded from node to node until the publisher is reached; worst case O(m) messages per lookup.

Distributed Solution (2): Routed messages (Freenet, Tapestry, Chord, CAN, etc.). The lookup is forwarded hop by hop toward the node storing (Key="LetItBe", Value=MP3 data).

Routing Challenges
- Define a useful nearness metric
- Keep the hop count small
- Keep the routing table a reasonable size
- Stay robust despite rapid changes in membership

Evaluation
- Scalability: routing path length, per-node state
- Latency
- Load balancing
- Robustness: routing fault tolerance, data availability

Content Addressable Network
- Basic idea: an Internet-scale hash table
- Interface: insert(key, value); value = retrieve(key)
- The table is partitioned among many individual nodes
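To make the interface concrete, here is a minimal single-node sketch of insert/retrieve. The DHTNode class and its local dict are illustrative assumptions; in a real CAN the table is partitioned across many nodes and requests are routed between them, as the following slides show.

```python
# A minimal, single-node sketch of the insert/retrieve interface described
# above. DHTNode and its local dict are illustrative; in a real CAN the
# table is partitioned across many nodes and requests are routed.
class DHTNode:
    def __init__(self):
        self.store = {}          # this node's share of the partitioned table

    def insert(self, key, value):
        self.store[key] = value  # a real node would first route to the owner

    def retrieve(self, key):
        return self.store.get(key)

node = DHTNode()
node.insert("LetItBe", b"MP3 data")
print(node.retrieve("LetItBe"))
```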

CAN – Solution
- Virtual d-dimensional Cartesian coordinate space, dynamically partitioned into zones, each "owned" by one node
- A key is mapped into a point P
- (key, value) is stored at the node which owns the zone P belongs to
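A sketch of the key-to-point mapping, under the assumption that each coordinate is derived by hashing the key with a per-axis SHA-1; the slides do not specify the hash functions hx and hy, so this construction is illustrative.

```python
import hashlib

# Illustrative mapping of a key to a point in a d-dimensional unit space.
# Deriving each coordinate from SHA-1 of (axis, key) is an assumption; the
# slides only say "a key is mapped into a point P".
def key_to_point(key, d=2):
    point = []
    for axis in range(d):
        digest = hashlib.sha1(f"{axis}:{key}".encode()).digest()
        coord = int.from_bytes(digest[:8], "big") / 2**64   # normalise to [0, 1)
        point.append(coord)
    return tuple(point)

print(key_to_point("LetItBe"))   # a point in [0,1)^2; its zone owner stores (K,V)
```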

CAN – Example: node I::insert(K,V)
1. a = hx(K), b = hy(K)
2. route (K,V) toward the point (a,b)
3. the node owning the zone containing (a,b) stores (K,V)

CAN – Routing
- Each node maintains state only for its neighbors
- A message contains the destination coordinates
- Greedy forwarding: forward to the neighbor whose coordinates are closest to the destination
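A sketch of the greedy forwarding rule, assuming a hypothetical neighbors map from node name to zone-centre coordinates and ignoring the wrap-around of the coordinate space.

```python
import math

# Greedy forwarding sketch: pick the neighbour whose zone centre is closest
# to the destination point. Neighbour bookkeeping and the wrap-around of the
# coordinate torus are simplified away.
def next_hop(neighbors, dst):
    """neighbors: dict mapping node name -> zone-centre coordinates (tuple)."""
    def dist(p, q):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))
    return min(neighbors, key=lambda n: dist(neighbors[n], dst))

neighbors = {"A": (0.1, 0.6), "B": (0.7, 0.2), "C": (0.4, 0.9)}
print(next_hop(neighbors, (0.8, 0.1)))   # -> "B", the closest neighbour
```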

CAN – Node Insertion
1. The new node discovers some node I already in the CAN
2. It picks a random point (p,q) in the space
3. I routes to (p,q) and discovers node J, the current owner of that point
4. J's zone is split in half; the new node owns one half

CAN – Node Failure
- Need to repair the space
- Recover the database: soft-state updates; use replication and rebuild the database from replicas
- Repair routing: takeover algorithm

CAN – Takeover Algorithm
- Simple failures: know your neighbor's neighbors; when a node fails, one of its neighbors takes over its zone
- More complex failure modes: simultaneous failure of multiple adjacent nodes; scoped flooding to discover neighbors; hopefully a rare event
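A sketch of the simple takeover rule. The slide only says that one neighbor takes over the failed zone; choosing the live neighbor with the smallest zone, and the zone volumes and names below, are illustrative assumptions.

```python
# Sketch of takeover for simple failures: when a neighbour stops responding,
# its zone is claimed by (for example) the live neighbour with the smallest
# zone. Zone volumes and node names are illustrative.
def takeover_candidate(failed, zones, neighbors_of, is_alive):
    """Pick which neighbour of `failed` should absorb its zone."""
    live = [n for n in neighbors_of[failed] if is_alive(n)]
    return min(live, key=lambda n: zones[n])      # smallest zone volume wins

zones = {"A": 0.25, "B": 0.125, "C": 0.5}          # fraction of the space owned
neighbors_of = {"X": ["A", "B", "C"]}
alive = {"A": True, "B": True, "C": True}
print(takeover_candidate("X", zones, neighbors_of, alive.get))   # -> "B"
```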

CAN – Evaluation
- Scalability: per node, the number of neighbors is 2d; the average routing path is (d/4) · n^(1/d) hops
- The network can scale without increasing per-node state
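To give a feel for that formula, a quick calculation with illustrative values of d and n:

```python
# Quick illustration of the average path length (d/4) * n**(1/d)
# for a few illustrative configurations (numbers chosen for illustration only).
def avg_path_hops(d, n):
    return (d / 4) * n ** (1 / d)

for d in (2, 4, 8):
    print(d, round(avg_path_hops(d, 1_000_000), 1))
# for a million nodes: d=2 -> 500.0 hops, d=4 -> ~31.6 hops, d=8 -> ~11.2 hops
```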

CAN – Design Improvements
- Increase dimensions and realities (reduce the path length)
- Heuristics (reduce the per-CAN-hop latency): RTT-weighted routing; multiple nodes per zone (peer nodes); deterministically replicate entries

CAN – Weaknesses
- Impossible to perform a fuzzy search
- Susceptible to malicious activity
- Must maintain coherence of all the indexed data (network overhead, efficient distribution)
- Still relatively high routing latency
- Poor performance without the improvements above

Chord
- Based on consistent hashing for the key-to-node mapping
- Standard hashing, e.g. x → ax + b (mod p), provides good balance across bins
- Consistent hashing: a small change in the bucket set does not induce a total remapping of items to buckets
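A small experiment contrasting the two, assuming SHA-1-based hashing and illustrative node names: adding one bucket under "hash mod N" remaps most keys, while a ring of hashed nodes only remaps the arc claimed by the new node.

```python
import hashlib

# Contrast standard "mod N" hashing with ring-based consistent hashing when
# one bucket is added. Hash construction and node names are illustrative.
def h(s):
    return int.from_bytes(hashlib.sha1(s.encode()).digest()[:8], "big")

def mod_assign(keys, nodes):
    return {k: nodes[h(k) % len(nodes)] for k in keys}

def ring_assign(keys, nodes):
    ring = sorted((h(n), n) for n in nodes)
    def owner(k):
        hk = h(k)
        for hn, n in ring:           # successor: first node hash >= key hash
            if hn >= hk:
                return n
        return ring[0][1]            # wrap around the ring
    return {k: owner(k) for k in keys}

keys = [f"key{i}" for i in range(10_000)]
old = [f"n{i:02d}" for i in range(50)]
new = old + ["n50"]                  # add a single bucket
for assign in (mod_assign, ring_assign):
    a, b = assign(keys, old), assign(keys, new)
    moved = sum(a[k] != b[k] for k in keys)
    print(assign.__name__, f"{moved / len(keys):.0%} of keys remapped")
# mod hashing remaps nearly all keys; the ring typically remaps only the
# small arc taken over by the new node
```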

Chord IDs
- m-bit identifier space for both keys and nodes
- Key identifier = SHA-1(key), e.g. Key="LetItBe" → ID=60
- Node identifier = SHA-1(IP address), e.g. IP="198.10.10.1" → ID=123
- Both are uniformly distributed
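A sketch of the ID assignment, truncating SHA-1 to m bits. The example IDs above (60, 123) are the slides' toy values and will not match real SHA-1 output.

```python
import hashlib

# Sketch of Chord ID assignment: reduce SHA-1 to an m-bit identifier.
# m=7 matches the toy ring on the next slide; the slide's example IDs
# (60, 123) are illustrative and will not match real SHA-1 output.
def chord_id(s, m=7):
    digest = hashlib.sha1(s.encode()).digest()
    return int.from_bytes(digest, "big") % (2 ** m)

print(chord_id("LetItBe"))        # key identifier
print(chord_id("198.10.10.1"))    # node identifier from the IP address
```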

Chord – Consistent Hashing
- Circular 7-bit ID space, e.g. nodes N32, N90, N123 (IP="198.10.10.1") and keys K5, K20, K60 (Key="LetItBe"), K101
- A key is stored at its successor: the node with the next higher ID (sketched below)
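A sketch of the successor rule over a sorted list of node IDs, using the node and key IDs from the slide's 7-bit example.

```python
from bisect import bisect_left

# "A key is stored at its successor": the first node ID clockwise from the
# key ID. Node and key IDs are the ones from the slide's 7-bit example.
def successor(node_ids, key_id):
    node_ids = sorted(node_ids)
    i = bisect_left(node_ids, key_id)
    return node_ids[i % len(node_ids)]   # wrap past the top of the ring

nodes = [32, 90, 123]
for key in (5, 20, 60, 101):
    print(f"K{key} -> N{successor(nodes, key)}")
# K5 -> N32, K20 -> N32, K60 -> N90, K101 -> N123
```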

Chord – Basic Lookup
- Every node knows its successor in the ring
- Example: "Where is LetItBe?" Hash("LetItBe") = K60; the query is forwarded along successor pointers (ring nodes N10, N32, N55, N90, N123) until it reaches the answer "N90 has K60"
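A sketch of this successor-pointer walk; the ring dictionary below (node → successor) is an illustrative assumption matching the nodes in the slide.

```python
# Basic lookup sketch: walk successor pointers until the key falls between a
# node and its successor. Runs in O(N) hops; finger tables (next slides)
# reduce this to O(log N). The ring dict below is illustrative.
def in_interval(x, a, b, m=7):
    """True if x is in the half-open ring interval (a, b]."""
    x, a, b = x % 2**m, a % 2**m, b % 2**m
    return (a < x <= b) if a < b else (x > a or x <= b)

def lookup(successor_of, start, key_id):
    n = start
    while not in_interval(key_id, n, successor_of[n]):
        n = successor_of[n]
    return successor_of[n]            # this node stores the key

ring = {10: 32, 32: 55, 55: 90, 90: 123, 123: 10}   # node -> its successor
print(lookup(ring, 10, 60))           # Hash("LetItBe") = K60 -> 90 (N90 has K60)
```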

Chord – “Finger Tables”
- Every node knows m other nodes in the ring, at exponentially increasing distances
- Finger i points to the successor of n + 2^i
- Example: N80's fingers point to the successors of 80 + 2^0, 80 + 2^1, …, 80 + 2^6 (nodes N96, N112, N16 in the figure)
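A sketch of finger-table construction from a node set; the nodes are the ones shown around N80 in the slide, and the successor helper repeats the rule sketched earlier.

```python
from bisect import bisect_left

# Finger table sketch: finger i of node n points to successor(n + 2**i).
# Node set and m=7 follow the slides' toy ring.
def successor(node_ids, key_id):
    node_ids = sorted(node_ids)
    i = bisect_left(node_ids, key_id)
    return node_ids[i % len(node_ids)]

def finger_table(n, node_ids, m=7):
    return [successor(node_ids, (n + 2**i) % 2**m) for i in range(m)]

nodes = [16, 80, 96, 112]
print(finger_table(80, nodes))
# N80's fingers for 80+1, 80+2, ..., 80+64 -> [96, 96, 96, 96, 96, 112, 16]
```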

Chord – “Finger Tables”: N32's finger table on a ring with nodes N32, N40, N52, N60, N70, N79, N80, N85, N102, N113:
- 33..33  → N40
- 34..35  → N40
- 36..39  → N40
- 40..47  → N40
- 48..63  → N52
- 64..95  → N70
- 96..31  → N102

Chord – Lookup Algorithm: an example lookup on the same ring (N32, N40, N52, N60, N70, N79, N80, N85, N102, N113), starting from N32 and using its finger table above to jump to the finger closest to the target ID.
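A sketch of the lookup with finger tables: hop to the closest preceding finger until the key falls between a node and its successor. The node set is the slides' toy ring, and the finger tables are computed from it rather than hard-coded; the printed hop list is just one illustrative trace.

```python
from bisect import bisect_left

# Lookup sketch with finger tables: repeatedly jump to the closest finger
# that precedes the key, then return that node's successor. Node set is the
# toy ring from these slides (m = 7).
M = 7
NODES = sorted([32, 40, 52, 60, 70, 79, 80, 85, 102, 113])

def successor(key_id):
    i = bisect_left(NODES, key_id % 2**M)
    return NODES[i % len(NODES)]

def fingers(n):
    return [successor(n + 2**i) for i in range(M)]

def between(x, a, b):                   # x in the open ring interval (a, b)
    x, a, b = x % 2**M, a % 2**M, b % 2**M
    return (a < x < b) if a < b else (x > a or x < b)

def find_successor(n, key_id):
    hops = [n]
    while not between(key_id, n, fingers(n)[0] + 1):     # key not yet in (n, succ]
        n = next(f for f in reversed(fingers(n)) if between(f, n, key_id))
        hops.append(n)
    return fingers(n)[0], hops

print(find_successor(32, 99))
# -> (102, [32, 70, 79, 85]): key 99 is stored at N102, reached in a few hops
```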

Chord – Node Insertion
- Three-step (aggressive) process: initialize all fingers of the new node; update fingers of existing nodes; transfer keys from the successor to the new node
- Less aggressive mechanism (lazy finger update): initialize only the finger to the successor node; periodically verify the immediate successor and predecessor; periodically refresh finger table entries (see the sketch below)
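A sketch of the lazy mechanism, loosely following the shape of the Chord paper's stabilize/notify pseudocode; the Node class and the toy three-node run are illustrative, and the network layer is elided.

```python
# Lazy join/maintenance sketch: a new node only sets its successor, then
# periodic stabilize()/notify() rounds repair successor/predecessor pointers.
M = 7

def between(x, a, b):                      # x in the open ring interval (a, b)
    x, a, b = x % 2**M, a % 2**M, b % 2**M
    return (a < x < b) if a < b else (x > a or x < b)

class Node:
    def __init__(self, ident):
        self.id, self.successor, self.predecessor = ident, self, None

    def join(self, existing):
        self.successor = existing          # lazy: only the successor pointer

    def stabilize(self):
        x = self.successor.predecessor     # who does our successor think precedes it?
        if x is not None and between(x.id, self.id, self.successor.id):
            self.successor = x
        self.successor.notify(self)

    def notify(self, candidate):           # candidate claims to precede us
        if self.predecessor is None or between(
                candidate.id, self.predecessor.id, self.id):
            self.predecessor = candidate

# Toy run: N36 joins between N20 and N40; a few stabilize rounds repair the
# pointers without an aggressive global update.
n20, n36, n40 = Node(20), Node(36), Node(40)
n20.successor, n40.successor = n40, n20
n20.predecessor, n40.predecessor = n40, n20
n36.join(n40)
for _ in range(3):
    for n in (n20, n36, n40):
        n.stabilize()
print(n20.successor.id, n36.successor.id, n40.predecessor.id)
# -> 36 40 36: the ring is now N20 -> N36 -> N40
```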

Joining the Ring – Step 1: initialize the new node's finger table
- Locate any node p in the ring
- Ask p to look up the fingers of the new node N36, e.g. Lookup(37, 38, 40, …, 100, 164)

Joining the Ring – Step 2: update fingers of existing nodes
- The new node calls an update function on existing nodes
- An existing node can recursively update the fingers of other nodes

Joining the Ring – Step 3: transfer keys from the successor node to the new node
- Example: copy keys 21..36 from N40 to N36, so K30 moves to N36 while K38 stays at N40
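A sketch of the key hand-off in this step, using the slide's example IDs (predecessor N20, new node N36, successor N40 holding K30 and K38) and ignoring ring wrap-around.

```python
# Key hand-off sketch: the new node takes from its successor every key whose
# ID now falls in (predecessor, new_node]. IDs follow the slide's example.
def transfer_keys(successor_store, predecessor_id, new_id):
    moved = {k: v for k, v in successor_store.items()
             if predecessor_id < k <= new_id}       # ignoring ring wrap-around
    for k in moved:
        del successor_store[k]
    return moved

n40_store = {30: "value-K30", 38: "value-K38"}
n36_store = transfer_keys(n40_store, predecessor_id=20, new_id=36)
print(sorted(n36_store), sorted(n40_store))   # [30] moves to N36, [38] stays at N40
```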

Chord – Handling Failures
- Use a successor list: each node knows its r immediate successors
- After a failure, a node knows its first live successor; correct successors guarantee correct lookups
- The guarantee is probabilistic, but r can be chosen to make the probability of lookup failure arbitrarily small
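A sketch of the fallback: route to the first live entry of the successor list. The is_alive check stands in for a real network probe, and the node names are illustrative.

```python
# Successor-list fallback sketch: keep r successors and, on failure, route to
# the first one that is still alive.
def first_live_successor(successor_list, is_alive):
    for node in successor_list:
        if is_alive(node):
            return node
    raise RuntimeError("all r successors failed (unlikely for adequate r)")

succ_list = ["N40", "N52", "N60"]          # r = 3 immediate successors
alive = {"N40": False, "N52": True, "N60": True}
print(first_live_successor(succ_list, alive.get))   # -> "N52"
```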

Chord – Evaluation
- Efficient: O(log N) messages per lookup, where N is the total number of servers
- Scalable: O(log N) state per node
- Robust: survives massive changes in membership (assuming no malicious participants)

Chord – Weaknesses
- NOT that simple (compared to CAN)
- Member joining is complicated: the aggressive mechanism requires too many messages and updates, and there is no analysis of convergence for the lazy finger mechanism
- Key management is mixed between layers: the upper layer does insertion and handles node failures, while Chord transfers keys when a node joins (there is no leave mechanism!)
- Routing table grows with the number of members in the group
- Worst-case lookup can be slow

Summary: both systems are
- fully distributed and scalable
- efficient at lookup
- robust
- simple (?)
- susceptible to malicious activity

How are these related to OSD?
- Very similar if the data is "public"
- If the data is "private", only a few locations are available for storing it
- Does OSD help (make it easy) for peer-to-peer computing?
- Any more comments?