IBM Haifa Research 1 Finding Data in the Cloud using Distributed Hash Tables (Chord)
IBM Haifa Research, Storage Systems

IBM Haifa Research 2 In File Systems
• The app needs to know the path
–/home/user/my pictures/…
• The file system looks up the directory structure to find the file's inode, which reveals the physical location of the file's data on disk.
• At cloud scale the file system must be distributed
–Millions of different users uploading and downloading files
–Billions of files
• This brings challenges that traditional file systems are not built for
–File system metadata is usually a single point of failure
–File systems must adhere to POSIX semantics, requiring strong consistency of the file system metadata
• Something else is required here…

IBM Haifa Research 3 The Abstraction: Distributed Hash Table (DHT)
• A distributed application, possibly spread over many nodes (e.g. file sharing), calls put(key, data) and get(key) → data on the DHT.
• The DHT layer (e.g. DHash) distributes the data storage over many nodes.
• Underneath, a lookup service (e.g. Chord) resolves lookup(key) → IP address of the node responsible for the key.
Picture taken from the lecture slides "Topics in Data-Intensive Computing Systems", given at UBC.
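To make the layering concrete, here is a minimal single-process sketch in Python; the class and method names are illustrative, not taken from the Chord/DHash code, and the "lookup" here is a toy stand-in for the distributed protocol described on the following slides.

import hashlib

M = 8  # toy identifier space of size 2^M

def key_id(key: bytes) -> int:
    # Hash the application key into the identifier space.
    return int.from_bytes(hashlib.sha1(key).digest(), "big") % (2 ** M)

class LookupService:            # Chord's role: lookup(key) -> node IP address
    def __init__(self, nodes):  # nodes: dict of identifier -> IP address
        self.nodes = nodes

    def lookup(self, ident: int) -> str:
        # First node whose identifier is equal to or follows the key's.
        ids = sorted(self.nodes)
        follows = [i for i in ids if i >= ident]
        return self.nodes[follows[0] if follows else ids[0]]

class DHT:                      # DHash's role: store/retrieve data blocks
    def __init__(self, lookup_service):
        self.lookup = lookup_service
        self.disks = {}         # stand-in for each node's local storage

    def put(self, key: bytes, data: bytes) -> None:
        ip = self.lookup.lookup(key_id(key))
        self.disks.setdefault(ip, {})[key] = data

    def get(self, key: bytes) -> bytes:
        ip = self.lookup.lookup(key_id(key))
        return self.disks[ip][key]

# The distributed application (e.g. file sharing) only sees put/get:
dht = DHT(LookupService({10: "10.0.0.1", 200: "10.0.0.2"}))
dht.put(b"my_song.mp3", b"<file bytes>")
assert dht.get(b"my_song.mp3") == b"<file bytes>"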

IBM Haifa Research 4 The Requirements (Chord)
• Scalability. Billions of keys stored on hundreds to millions of nodes.
–Operations cannot take time substantially larger than logarithmic in the number of keys.
• Availability.
–The lookup service must be able to function despite network partitions and node failures. Chord provides 'best effort' availability by keeping r replicas of each key.
–Note: in the presence of many joins/failures, data may be temporarily unavailable until the network stabilizes.
• Load Balancing.
–Resource usage is evenly distributed among the machines in the system. In a system with N nodes and K keys, each node holds approximately K/N keys.
• Dynamism.
–Nodes can join or leave the system without any downtime.
• NOT covered, but interesting in some cases:
–Authenticated inserts / access control.
–Protection against malicious servers.
–Stronger consistency.

IBM Haifa Research 5 The API (Chord)
• Insert(key, value) – inserts the key/value pair at r distinct nodes
• Lookup(key) – returns the value associated with the key
• Update(key, newval) – replaces the value associated with the key
• Join(n) – causes a node to add itself as a Chord node, where n is an existing node
• Leave() – leave the Chord system
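A runnable toy illustrating the replication behind Insert/Lookup: the key is stored at r distinct nodes that follow it on the identifier circle, and any surviving replica can answer a lookup. The helper names here are assumptions for illustration, not from the Chord paper.

NODE_IDS = [0, 1, 3]                       # the example ring used later
stores = {i: {} for i in NODE_IDS}         # per-node key/value storage

def successors_of(ident, r):
    """The r distinct nodes that are equal to or follow ident on the circle."""
    ordered = sorted(NODE_IDS)
    start = next((k for k, i in enumerate(ordered) if i >= ident), 0)
    return [ordered[(start + j) % len(ordered)] for j in range(r)]

def insert(key_id, value, r=2):
    for node in successors_of(key_id, r):
        stores[node][key_id] = value       # each of the r replicas stores it

def lookup(key_id, r=2):
    for node in successors_of(key_id, r):  # first live replica answers
        if key_id in stores[node]:
            return stores[node][key_id]
    raise KeyError(key_id)

insert(6, "value-of-key-6")                # lands on nodes 0 and 1
assert lookup(6) == "value-of-key-6"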

IBM Haifa Research 6 The Chord Structure (Based on Consistent Hashing)
• Choose a good hash function mapping into some m-bit domain, defining a space of size 2^m.
–A popular choice is SHA1, mapping to 160-bit strings and defining a space of size 2^160.
• Assign each key and node in the system an m-bit identifier.
–For each node, apply SHA1 to the node's IP.
–For each human/application-readable key, apply SHA1 to that key.
• Given the identifiers, each key k is stored on the node whose identifier id is equal to or follows k in the identifier space.
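A minimal Python sketch of this placement rule, assuming identifiers are SHA1 digests reduced to m bits; the function names are illustrative.

import hashlib
from bisect import bisect_left

def identifier(value: str, m: int = 160) -> int:
    """Map a node IP or a readable key into the 2^m identifier space."""
    digest = int.from_bytes(hashlib.sha1(value.encode()).digest(), "big")
    return digest % (2 ** m)

def successor(node_ids, key_id):
    """First node identifier equal to or following key_id on the circle."""
    node_ids = sorted(node_ids)
    i = bisect_left(node_ids, key_id)
    return node_ids[i % len(node_ids)]   # wrap past the largest identifier

# The m = 3 example of the next slide: nodes 0, 1, 3 and keys 1, 2, 6.
assert successor([0, 1, 3], 1) == 1
assert successor([0, 1, 3], 2) == 3
assert successor([0, 1, 3], 6) == 0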

IBM Haifa Research 7 The Chord Structure with m=3
• An identifier circle of size 2^3 = 8 with three nodes (0, 1, 3) and three keys (1, 2, 6):
–successor(1) = 1
–successor(2) = 3
–successor(6) = 0
• If a node with id=7 now joins the system, it would capture the key with identifier 6.
Picture taken from a presentation by Anthony D. Joseph and John Kubiatowicz (Berkeley) on peer-to-peer systems.

IBM Haifa Research 8 The Fundamental Properties of Consistent Hashing
• For any set of N nodes and K keys, with high probability:
–Each machine is responsible for at most (1+ε)K/N keys.
–When an (N+1)st machine joins or leaves the network, O(K/N) keys are moved (and only to or from the joining or leaving machine).
• In practice, a good hash function such as SHA1 should be sufficient to achieve the above.
• To achieve a small ε, each physical node actually needs to run log N 'virtual' nodes, each with an independent identifier on the ring, as sketched below.
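A small sketch of the virtual-node idea: one physical machine joins the ring under roughly log2(N) independent identifiers, derived here (as an assumption, for illustration) by hashing the IP together with a virtual-node index.

import hashlib
import math

def virtual_ids(ip: str, n_physical: int, m: int = 160):
    """Derive ~log2(N) independent ring identifiers for one physical node."""
    v = max(1, math.ceil(math.log2(n_physical)))
    return [
        int.from_bytes(hashlib.sha1(f"{ip}#{i}".encode()).digest(), "big") % (2 ** m)
        for i in range(v)
    ]

ids = virtual_ids("10.0.0.7", n_physical=64)   # 6 virtual identifiers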

IBM Haifa Research 9 Routing in Chord (how do I get to the value of k=6?)
• Holding a complete table of all the nodes and their ids would do the trick in O(1) time, but:
–This is a large table to hold.
–This is a constantly changing table.
• Alternatively, let each node know only about its successor.
–A search can take up to O(N) messages.
• It turns out that holding a table of m entries (3 in our example, 160 in the SHA1 case) on each node suffices to locate any key with O(log N) messages.

IBM Haifa Research 10 The Routing Information in Chord
• Each node holds a finger table that 'divides' the identifier circle into exponentially increasing intervals.
• For each interval, it keeps the first node whose identifier is equal to or follows the interval's start point.
• In addition, each node keeps a pointer to its immediate predecessor on the identifier circle.

IBM Haifa Research 11 The Routing Information in Chord – Example
• Finger tables for the m=3 ring with nodes 0, 1 and 3:

Node 0 (pred. 3, succ. 1):
start | interval | node
  1   | [1,2)    | 1
  2   | [2,4)    | 3
  4   | [4,0)    | 0

Node 1 (pred. 0, succ. 3):
start | interval | node
  2   | [2,3)    | 3
  3   | [3,5)    | 3
  5   | [5,1)    | 0

Node 3 (pred. 1, succ. 0):
start | interval | node
  4   | [4,5)    | 0
  5   | [5,7)    | 0
  7   | [7,3)    | 0

Picture taken from a presentation by Anthony D. Joseph and John Kubiatowicz (Berkeley) on peer-to-peer systems.

IBM Haifa Research 12 The Routing Information in Chord – Some Notes
• Each node stores information about only a small number of other nodes.
–160 finger entries in the case of SHA1.
• The amount of information maintained about other nodes falls exponentially with the distance between the two nodes on the circle.

IBM Haifa Research 13 The Finger Table – Formal Notation
• Node n holds the following finger table:

Notation           | Definition
finger[k].start    | (n + 2^(k-1)) mod 2^m, for 1 ≤ k ≤ m
finger[k].interval | [finger[k].start, finger[k+1].start), for 1 ≤ k < m; [finger[k].start, n), for k = m
finger[k].node     | the first node whose identifier is equal to or follows n.finger[k].start
successor          | the immediate successor of node n on the identifier circle (equals finger[1].node)
predecessor        | the immediate predecessor of node n on the identifier circle
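Constructing these entries is mechanical once all node identifiers are known; a toy Python sketch (a real node learns its fingers via the join protocol, not from global knowledge):

from bisect import bisect_left

def build_finger_table(n: int, node_ids, m: int):
    node_ids = sorted(node_ids)

    def node_for(start):  # successor of `start` on the circle
        i = bisect_left(node_ids, start)
        return node_ids[i % len(node_ids)]

    fingers = []
    for k in range(1, m + 1):
        start = (n + 2 ** (k - 1)) % (2 ** m)     # finger[k].start
        fingers.append((start, node_for(start)))  # (start, finger[k].node)
    return fingers

# The m = 3 example: node 0's fingers point at nodes 1, 3, 0.
assert build_finger_table(0, [0, 1, 3], 3) == [(1, 1), (2, 3), (4, 0)]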

IBM Haifa Research 14 Routing in Chord – Pseudocode

// ask node n to find the successor of id
n.find_successor(id)
  n' = find_predecessor(id);
  return n'.successor;

// search the local table for the closest predecessor of id
n.closest_preceding_node(id)
  for i = m downto 1
    if (finger[i].node ∈ (n, id))
      return finger[i].node;
  return n;

IBM Haifa Research 15 Routing in Chord – Pseudocode

// ask node n to find the predecessor of id
n.find_predecessor(id)
  if (n == successor)
    return n  // n is the only node in the network
  n' = n
  while (id ∉ (n', n'.successor])
    n' = n'.closest_preceding_node(id)
  return n'
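The pseudocode above can be simulated in a single Python process; this is a runnable sketch over the m = 3 example ring, where the ring state is wired up with global knowledge (a real deployment builds it incrementally with join/stabilize).

M = 3  # identifier bits in the running example

def in_interval(x, a, b, inclusive_right=False):
    """Membership test on the identifier circle: x in (a, b) or (a, b]."""
    if inclusive_right and x == b:
        return True
    if a < b:
        return a < x < b
    return x > a or x < b  # the interval wraps around zero

class Node:
    def __init__(self, ident):
        self.id = ident
        self.successor = None
        self.finger = []  # list of (start, node) pairs

    def closest_preceding_node(self, key_id):
        # Scan fingers from farthest to nearest, as in the pseudocode.
        for _start, node in reversed(self.finger):
            if in_interval(node.id, self.id, key_id):
                return node
        return self

    def find_predecessor(self, key_id):
        n = self
        while not in_interval(key_id, n.id, n.successor.id, inclusive_right=True):
            n = n.closest_preceding_node(key_id)
        return n

    def find_successor(self, key_id):
        return self.find_predecessor(key_id).successor

# Wire up nodes 0, 1 and 3 on the 2^3 identifier circle.
nodes = {i: Node(i) for i in (0, 1, 3)}
ids = sorted(nodes)
def succ_of(x):  # first node whose id is equal to or follows x
    return nodes[min((i for i in ids if i >= x), default=ids[0])]
for n in nodes.values():
    n.successor = succ_of((n.id + 1) % 2 ** M)
    n.finger = [((n.id + 2 ** (k - 1)) % 2 ** M,
                 succ_of((n.id + 2 ** (k - 1)) % 2 ** M)) for k in range(1, M + 1)]

print(nodes[1].find_successor(6).id)  # -> 0, matching successor(6) = 0 on slide 7
print(nodes[3].find_successor(2).id)  # -> 3, matching successor(2) = 3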

IBM Haifa Research 16-18 Routing in Chord – Run Time
• Theorem: with high probability, the number of nodes that must be contacted to resolve a successor query in an N-node network is O(log N).
• Denote:
–n is the node wishing to resolve the successor of the key k.
–Let p be the predecessor of k; p is the actual node we are looking for.
–Suppose that p is in the i'th interval of n.
-This does not mean that p actually appears in n's finger table.
-It merely says that this interval is not empty.
–Let f be n.finger[i].node, that is, the first node in n's i'th interval.
–The distance between n and f is at least 2^(i-1). However, the distance between f and p is at most 2^(i-1), as both appear in the i'th interval.
–Conclusion: a single hop reduces the distance to the key by at least half.

IBM Haifa Research 19 Routing in Chord – Run Time (continued)
• The maximum distance between a node and a key is 2^m.
• At each hop, the distance is reduced by at least half.
–After log N steps the distance is at most 2^m / N.
• Since node identifiers are evenly distributed (this is where the high probability comes from), the expected number of nodes in any interval of size 2^m / N is 1.
• With high probability, the number of nodes in any interval of size 2^m / N is bounded by O(log N).
• Even if we traverse those in single steps, the total time would still be O(log N).
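Written out compactly in LaTeX notation, the slide's argument is: letting d_t denote the remaining clockwise distance to the key after t hops,

\[
d_0 \le 2^m, \qquad d_{t+1} \le \frac{d_t}{2}
\;\Longrightarrow\; d_{\log_2 N} \le \frac{2^m}{N},
\]
\[
\mathbb{E}\big[\#\{\text{nodes in an interval of length } 2^m/N\}\big]
= N \cdot \frac{2^m/N}{2^m} = 1,
\]

and with high probability that count is O(log N), so the total hop count is O(log N) + O(log N) = O(log N).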

IBM Haifa Research 20 Maintaining the Routing Information
• Maintaining / initializing the routing information in the presence of continuous joins and departures of nodes is hard.
• Solution: periodically update the finger table and the successor / predecessor pointers.
–It turns out that maintaining just the successor and predecessor in the presence of many changes is enough for queries to succeed, at the cost of the running time.

// Verify n's immediate pred/succ.
// Called periodically.
n.stabilize()
  x = predecessor;
  x = x.successor;
  if (x ∈ (predecessor, n))
    predecessor = x;
  x = successor;
  x = x.predecessor;
  if (x ∈ (n, successor))
    finger[1].node = successor = x;
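A Python sketch of this stabilization round, reusing the Node class and in_interval() helper from the routing sketch above and assuming Node is extended with a .predecessor field; illustrative, not the paper's exact protocol.

def stabilize(n):
    # Did a new node slip in between our predecessor and us?
    x = n.predecessor.successor
    if in_interval(x.id, n.predecessor.id, n.id):
        n.predecessor = x
    # Did a new node slip in between us and our successor?
    x = n.successor.predecessor
    if in_interval(x.id, n.id, n.successor.id):
        n.successor = x   # finger[1].node is the successor, so it updates too

Run periodically on every node, this repairs the successor/predecessor ring after churn; finger entries are refreshed by a similar background pass.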

IBM Haifa Research 21 References
• Chord: A Scalable Peer-to-Peer Lookup Service for Internet Applications (2001), Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek, Hari Balakrishnan.
• Topics in Data-Intensive Computing Systems, UBC lecture slides.
• Peer-to-Peer Systems, Anthony D. Joseph, John Kubiatowicz, Berkeley course slides.