Consistent Hashing and Distributed Hash Table


Consistent Hashing and Distributed Hash Table Chen Qian Department of Computer Science & Engineering qian@ucsc.edu https://users.soe.ucsc.edu/~qian/

Motivating question
Suppose we have a large number of web pages to store across 100 servers. How can we know where a web page (identified by its URL) is stored among the 100 servers? Is there a method simpler than keeping a complete directory?

Simple Solution Using Hashing
Let us use a mapping from all web pages to all servers. A hash function maps elements of a universe U to buckets. A good hash function is easy to compute and uniformly random.

Say there are n servers, indexed by 0, 1, 2, …, n-1. Then we can just store the web page with URL x at the cache server named h(x) mod n.
Problem: suppose we add a new server, so n is now 101. For an object x, it is very unlikely that h(x) mod 100 and h(x) mod 101 are the same number. Thus, changing n forces almost all objects to relocate.
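
To see the scale of the problem, here is a minimal sketch (Python, with SHA-1 standing in for h and made-up URLs): it hashes 10,000 objects with h(x) mod n, grows n from 100 to 101, and counts how many objects land on a different server.

```python
import hashlib

def server_for(url: str, n: int) -> int:
    # Stable hash of the URL (Python's built-in hash() is salted per run).
    h = int.from_bytes(hashlib.sha1(url.encode()).digest(), "big")
    return h % n

urls = [f"http://example.com/page{i}" for i in range(10000)]
moved = sum(server_for(u, 100) != server_for(u, 101) for u in urls)
print(f"{moved / len(urls):.0%} of objects relocate")   # ~99%
```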

Consistent Hashing
Each object is mapped to the next bucket (server) that appears in clockwise order on the unit circle.
[Figures: servers and objects placed on the unit circle; when a server fails, its objects move to the next server clockwise; when a server is added (or comes back online), it takes over objects from its clockwise successor.]

The object and server names need to be hashed to the same range, such as 32-bit values. The n servers partition the circle into n segments, with each server responsible for all objects in one of these segments. In expectation, adding an nth server causes only a 1/n fraction of the objects to relocate.

In practice
Use a binary search tree to find the server responsible for an object.
If you pick n random points on the circle, you are very unlikely to get a perfect partition of the circle into equal-sized segments. One way to decrease this variance is to make k “virtual copies” of each server s.
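
A minimal sketch of such a ring, assuming SHA-1 positions on a 2^160 circle and k virtual copies per server; a sorted list with bisect plays the role of the binary search tree.

```python
import bisect
import hashlib

def _pos(name: str) -> int:
    # Position on the circle: SHA-1 interpreted as an integer in [0, 2^160).
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big")

class Ring:
    def __init__(self, servers, k=100):
        # k "virtual copies" of each server smooth out segment sizes.
        self._points = sorted(
            (_pos(f"{s}#{i}"), s) for s in servers for i in range(k)
        )

    def server_for(self, key: str) -> str:
        # Binary search for the first point at or clockwise of the key,
        # wrapping around the top of the circle.
        i = bisect.bisect(self._points, (_pos(key),))
        return self._points[i % len(self._points)][1]

ring = Ring([f"server{j}" for j in range(100)])
print(ring.server_for("http://example.com/page1"))
```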

Pros? Cons?

P2P: searching for information
“Index” in a P2P system: maps information to peer location (location = IP address & port number). The “index” may be a server, or may be a distributed system; it is an abstraction on this slide.
Example: each file shared using eMule is hashed using the MD4 algorithm. The top-level MD4 hash, file size, filename, and several secondary search attributes such as bit rate and codec are stored on eD2k servers and the serverless Kad network. The Kad network is a peer-to-peer (P2P) network which implements the Kademlia P2P overlay protocol. The majority of users on the Kad network are also connected to servers on the eDonkey network, and Kad clients typically query known nodes on the eDonkey network to find an initial node on the Kad network.

Distributed Hash Table (DHT)
DHT = distributed P2P database.
The database has (key, value) pairs; key: content name; value: IP address.
Peers query the database with a key; the database returns values that match the key.
Peers can also insert (key, value) pairs into the database.
Finding “needles” requires that the P2P system be structured.

Chord: A Scalable Peer-to-peer Lookup Protocol for Internet Applications
Ion Stoica, Robert Morris, David Karger, Frans Kaashoek, Hari Balakrishnan
Slides by Xiaozhou Li

Distributed hash table
[Figure: a distributed application calls put(key, data) and get(key) on the DHT, which spans many nodes.]
DHT provides the information lookup service for P2P applications.
Nodes are uniformly distributed across the key space.
Nodes form an overlay network and maintain a list of neighbors in a routing table.
The overlay is decoupled from the physical network topology.
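
As a sketch of this put/get interface, the toy below routes each key to a node and stores the pair in that node's (in-process) table; node_for is a stand-in for whatever lookup scheme the DHT uses.

```python
import hashlib

def node_for(key: str, n: int = 100) -> int:
    # Stand-in lookup: any scheme that maps a key to a node works here
    # (e.g., the consistent-hashing ring sketched earlier).
    return int.from_bytes(hashlib.sha1(key.encode()).digest(), "big") % n

class DHT:
    def __init__(self):
        self.store = {}                          # node id -> {key: value}

    def put(self, key, value):
        self.store.setdefault(node_for(key), {})[key] = value

    def get(self, key):
        return self.store.get(node_for(key), {}).get(key)

dht = DHT()
dht.put("beat it", "MP3 data...")
print(dht.get("beat it"))                        # -> MP3 data...
```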

Routed queries (Chord)
[Figure: a publisher stores Key = “beat it”, Value = MP3 data… at a node; a client’s Lookup(“beat it”) is routed across nodes N1–N9 to reach it.]

Routing challenges
Define a useful key nearness metric.
Keep the hop count small.
Keep the tables small.
Stay robust despite rapid change.
Chord emphasizes efficiency and simplicity.

Chord properties
Efficient: O(log N) messages per lookup, where N is the total number of servers.
Scalable: O(log N) state per node.
Robust: survives massive failures.
Proofs are in the paper / tech report, assuming no malicious participants.

Chord overview
Provides peer-to-peer hash lookup: Lookup(key) → IP address. Chord does not store the data itself.
How does Chord route lookups? How does Chord maintain routing tables?

Chord IDs
Key identifier = SHA-1(key). Node identifier = SHA-1(IP address).
Both are uniformly distributed, and both exist in the same ID space.
How to map key IDs to node IDs? The heart of the Chord protocol is consistent hashing.
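
A sketch of this ID scheme, assuming Python's hashlib; the content name "beat it" is from the earlier figure, and the IP:port is made up.

```python
import hashlib

m = 160                                   # SHA-1 gives 160-bit identifiers

def chord_id(name: str) -> int:
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big") % 2**m

key_id  = chord_id("beat it")             # identifier of a content name
node_id = chord_id("192.0.2.7:4000")      # identifier of a node (made-up IP:port)
print(hex(key_id), hex(node_id))          # both live in the same 160-bit space
```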

Review: consistent hashing for data partitioning and replication
[Figure: nodes A–F on the ring, with hash(key1) and hash(key2) mapped onto it; replication factor N = 3.]
A key is stored at its successor: the node with the next-higher ID.

Identifier-to-node mapping example
Node 8 maps [5, 8]; node 15 maps [9, 15]; node 20 maps [16, 20]; …; node 4 maps [59, 4].
Each node maintains a pointer to its successor.
[Figure: ring with nodes 4, 8, 15, 20, 32, 35, 44, 58.]

Lookup
Each node maintains its successor. Route a packet (ID, data) to the node responsible for ID using successor pointers.
[Figure: a lookup for ID 44 forwarded hop by hop around the ring until it reaches node 44.]
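
A sketch of this baseline successor-pointer lookup over the example ring (nodes 4, 8, 15, 20, 32, 35, 44, 58); with no finger tables, the query walks the ring in O(N) hops.

```python
def in_range(x, left, right):
    # Is x in the half-open ring interval (left, right]?
    return (left < x <= right) if left < right else (x > left or x <= right)

succ = {4: 8, 8: 15, 15: 20, 20: 32, 32: 35, 35: 44, 44: 58, 58: 4}

def lookup(start, key_id):
    n = start
    while not in_range(key_id, n, succ[n]):
        n = succ[n]              # forward the query to our successor
    return succ[n]               # successor(n) is responsible for key_id

print(lookup(4, 44))             # -> 44, the node responsible for ID 44
```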

Join Operation
A node with id = 50 joins the ring via node 15:
Node 50 sends join(50) to node 15.
Node 44 returns node 58.
Node 50 updates its successor to 58 (its predecessor is still nil).
[Figure: node 50 joining between nodes 44 and 58 on the ring.]

Periodic Stabilize
Node 50 runs periodic stabilize: it sends a stabilize message to its successor 58 and learns succ.pred = 44 (no closer successor, so no change), then sends notify(node=50) to 58.
Node 58 updates its predecessor from 44 to 50.

Periodic Stabilize
Node 44 runs periodic stabilize: it asks its successor 58 for its predecessor and learns succ.pred = 50.
Node 44 updates its successor from 58 to 50.

Periodic Stabilize
Node 44 has a new successor (50), so node 44 sends a notify message to node 50.
Node 50 updates its predecessor from nil to 44.

Periodic Stabilize Converges!
This completes the join operation.
[Figure: final ring state, with node 44's successor = 50, node 50's predecessor = 44 and successor = 58, and node 58's predecessor = 50.]
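
A single-process sketch of the join/stabilize/notify sequence just walked through, with the same three nodes (44, 50, 58); real Chord runs these steps as periodic RPCs between machines.

```python
def between(x, left, right):
    # Is x in the open ring interval (left, right)?
    return (left < x < right) if left < right else (x > left or x < right)

class Node:
    def __init__(self, nid):
        self.id, self.succ, self.pred = nid, None, None

    def stabilize(self):
        x = self.succ.pred                 # ask our successor for its predecessor
        if x is not None and between(x.id, self.id, self.succ.id):
            self.succ = x                  # someone joined between us: adopt it
        self.succ.notify(self)

    def notify(self, n):
        if self.pred is None or between(n.id, self.pred.id, self.id):
            self.pred = n                  # n is the closest predecessor we know

n44, n50, n58 = Node(44), Node(50), Node(58)
n44.succ, n58.pred = n58, n44              # ring state before the join
n50.succ = n58                             # join(50): node 44 returned node 58
n50.stabilize()                            # node 58 learns pred = 50
n44.stabilize()                            # node 44 learns succ = 50, notifies 50
print(n44.succ.id, n58.pred.id, n50.pred.id)   # -> 50 50 44
```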

Achieving Efficiency: finger tables
The ith finger at the peer with id n is the first peer with id >= (n + 2^i) mod 2^m.
Example: m = 7, ring nodes 20, 32, 45, 80, 96, 112. Finger table at node 80:

i   (80 + 2^i) mod 2^7   ft[i]
0           81             96
1           82             96
2           84             96
3           88             96
4           96             96
5          112            112
6           16             20

Each node stores only O(log N) entries, and each lookup takes at most O(log N) hops.
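
A sketch reproducing the finger table above for node 80 (m = 7, the same six ring nodes).

```python
m = 7
nodes = sorted([20, 32, 45, 80, 96, 112])

def successor(x):
    # First node at or clockwise of position x on the m-bit ring.
    for nid in nodes:
        if nid >= x:
            return nid
    return nodes[0]                       # wrap around the ring

fingers = [successor((80 + 2**i) % 2**m) for i in range(m)]
print(fingers)                            # -> [96, 96, 96, 96, 96, 112, 20]
```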

Achieving Robustness
What if nodes FAIL?
Ring robustness: each node maintains the k (> 1) immediate successors instead of only one successor.
If the smallest successor does not respond, substitute the second entry in its successor list.
It is unlikely that all successors fail simultaneously.
This requires modifications to the stabilize protocol (see the paper!).
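
A minimal sketch of that failover rule; is_alive is a hypothetical stand-in for a real liveness check such as an RPC with a timeout.

```python
def next_hop(successor_list, is_alive):
    # Try the k successors in order; return the first live one.
    for s in successor_list:
        if is_alive(s):
            return s
    raise RuntimeError("all k successors failed (unlikely for modest k)")

print(next_hop([50, 58, 4], lambda s: s != 50))   # 50 is down -> 58
```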

Cooperative File System (CFS)
[Figure: layered architecture. File-system features (block storage, availability / replication, authentication, caching, consistency, server selection, keyword search) sit on top of DHash, a distributed block store, which sits on top of Chord for lookup.]
A powerful lookup layer simplifies the other mechanisms.

Cooperative File System (cont.)
Block storage: split each file into blocks and distribute those blocks over many servers, balancing the load of serving popular files.
Data replication: replicate each block on k servers to increase availability and reduce latency (fetch from the server with the least latency).
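
A sketch of this placement policy on the small m = 7 ring used earlier; the block size and k = 3 replicas are illustrative assumptions, not CFS's actual parameters.

```python
import hashlib

BLOCK_SIZE = 8192          # assumed block size for the sketch

def place_blocks(data: bytes, ring_nodes, m=7, k=3):
    # Hash each block to an m-bit ID and store it on the k servers
    # that follow the ID clockwise on the ring.
    nodes = sorted(ring_nodes)
    placement = {}
    for off in range(0, len(data), BLOCK_SIZE):
        block = data[off:off + BLOCK_SIZE]
        bid = int.from_bytes(hashlib.sha1(block).digest(), "big") % 2**m
        i = next((j for j, n in enumerate(nodes) if n >= bid), 0)
        placement[bid] = [nodes[(i + j) % len(nodes)] for j in range(k)]
    return placement

print(place_blocks(b"x" * 20000, [20, 32, 45, 80, 96, 112]))
```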

Cooperative File System (cont.)
Caching: blocks are cached along the lookup path, which avoids overloading servers that hold popular data.
Load balance: different servers may have different capacities, so a real server may act as multiple virtual servers by being hashed to several different IDs.

Problem of DHT
There is no good way to keep a finger table both scalable and consistent under churn.
BitTorrent's solution: maintain the trackers (servers) as a DHT, since they are more reliable; users query the trackers to get the locations of a file. The file sharing itself is not structured.

Next class
Please read Chapters 3-4 of the textbook BEFORE class.

Thank You
Chen Qian
cqian12@ucsc.edu
https://users.soe.ucsc.edu/~qian/