Lecture 12: Distributed Hash Tables
CPE 401/601 Computer Network Systems
Slides modified from Jennifer Rexford


Hash Table
- Name-value pairs (or key-value pairs)
  - E.g., "Mehmet Hadi Gunes" and an e-mail address
  - E.g., a URL and the corresponding Web page
  - E.g., "HitSong.mp3" and the address of a machine storing it
- Hash table
  - Data structure that associates keys with values: lookup(key) returns the value stored under key (sketch below)
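To make the abstraction concrete, here is a minimal Python sketch of lookup(key) over the built-in dict; the e-mail and host addresses are made-up placeholders.

    # A hash table associates keys with values and supports lookup(key).
    table = {
        "Mehmet Hadi Gunes": "gunes@example.edu",  # hypothetical e-mail address
        "HitSong.mp3": "192.0.2.7",                # hypothetical host address
    }

    def lookup(key):
        """Return the value associated with key, or None if absent."""
        return table.get(key)

    print(lookup("HitSong.mp3"))  # -> 192.0.2.7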

Distributed Hash Table
- Hash table spread over many nodes
  - Distributed over a wide area
- Main design goals
  - Decentralization: no central coordinator
  - Scalability: efficient even with a large number of nodes
  - Fault tolerance: tolerate nodes joining and leaving

Distributed Hash Table
- Two key design decisions
  - How do we map names onto nodes?
  - How do we route a request to that node?

Hash Functions
- Hashing
  - Transform the key into a number
  - And use the number to index an array
- Example hash function
  - Hash(x) = x mod 101, mapping to 0, 1, ..., 100
- Challenges (see the sketch below)
  - What if there are more than 101 nodes? Fewer?
  - Which nodes correspond to each hash value?
  - What if nodes come and go over time?
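A quick Python sketch of why a plain mod-based hash handles churn badly: growing from 101 to 102 nodes remaps almost every key, even though only one node was added.

    # Hash(x) = x mod num_nodes maps keys to buckets 0 .. num_nodes-1.
    def bucket(x, num_nodes):
        return x % num_nodes

    keys = range(10_000)
    # Count the keys that land in a different bucket after adding one node.
    moved = sum(1 for k in keys if bucket(k, 101) != bucket(k, 102))
    print(f"{moved} of {len(keys)} keys remapped")  # nearly all of them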

Consistent Hashing
- "View" = subset of hash buckets that are visible
  - For this conversation, "view" is O(n) neighbors
  - But we don't need strong consistency on views
- Desired features
  - Balanced: in any one view, load is equal across buckets
  - Smoothness: little impact on hash bucket contents when buckets are added or removed
  - Spread: regardless of views, only a small set of hash buckets may hold a given object
  - Load: across views, the number of objects assigned to a hash bucket is small

Consistent Hashing
[Figure: buckets placed around the circle, e.g., Bucket 14]
- Construction
  - Assign each of C hash buckets to random points on a mod 2^n circle; hash key size = n
  - Map each object to a random position on the circle
  - Hash of object = closest clockwise bucket (see the sketch below)
- Desired features
  - Balanced: no bucket is responsible for a large number of objects
  - Smoothness: addition of a bucket does not cause movement among existing buckets
  - Spread and load: a small set of buckets lie near an object
- Similar to the scheme later used in P2P Distributed Hash Tables (DHTs)
  - In DHTs, each node has only a partial view of its neighbors
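A minimal sketch of the construction, assuming SHA-1 stands in for the random placement and that "clockwise" means increasing identifiers; the bucket and key names are arbitrary. It also demonstrates smoothness: adding an 11th bucket moves only the keys that now fall to it.

    import bisect
    import hashlib

    M = 2 ** 32  # the identifier circle: points are taken mod 2^n, n = 32 here

    def point(s):
        """Hash a string to a (pseudo-)random point on the circle."""
        return int(hashlib.sha1(s.encode()).hexdigest(), 16) % M

    def closest_clockwise(points, x):
        """Index of the first bucket at or after x, wrapping around."""
        return bisect.bisect_left(points, x) % len(points)

    buckets = sorted(point(f"bucket-{i}") for i in range(10))
    keys = [f"key-{i}" for i in range(1000)]
    before = {k: buckets[closest_clockwise(buckets, point(k))] for k in keys}

    buckets = sorted(buckets + [point("bucket-10")])  # add one bucket
    moved = sum(1 for k in keys
                if buckets[closest_clockwise(buckets, point(k))] != before[k])
    print(f"{moved} of {len(keys)} keys moved")  # roughly 1000/11, not all 1000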

Consistent Hashing
- Large, sparse identifier space (e.g., 128 bits)
  - Hash a set of keys x uniformly to the large id space
  - Hash nodes to the id space as well
  - Hash(name) -> object_id
  - Hash(IP_address) -> node_id
[Figure: the id space represented as a ring]

Where to Store a (Key, Value) Pair?
- Mapping keys in a load-balanced way
  - Store the key at one or more nodes
  - Nodes with identifiers "close" to the key, where distance is measured in the id space (see the sketch below)
- Advantages
  - Even distribution
  - Few changes as nodes come and go
[Figure: Hash(name) -> object_id and Hash(IP_address) -> node_id on the ring]
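A sketch of the placement rule, assuming "close" means the first node clockwise from the key's id (the successor rule, as in Chord). MD5 is used only because it conveniently yields 128-bit ids, and the node addresses are made up.

    import bisect
    import hashlib

    def ident(s):
        """Hash a string into the 128-bit identifier space."""
        return int(hashlib.md5(s.encode()).hexdigest(), 16)

    # Hypothetical node IP addresses hashed into the same id space as keys.
    node_ids = sorted(ident(ip) for ip in ("192.0.2.1", "192.0.2.2", "192.0.2.3"))

    def responsible_node(name):
        """The (key, value) pair is stored at the first node id clockwise
        from the key's object_id, wrapping around the ring."""
        object_id = ident(name)
        i = bisect.bisect_left(node_ids, object_id) % len(node_ids)
        return node_ids[i]

    print(hex(responsible_node("HitSong.mp3")))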

Joins and Leaves of Nodes
- Maintain a circularly linked list around the ring
  - Every node has a predecessor and a successor
[Figure: a node on the ring with its pred and succ pointers]

Joins and Leaves of Nodes
- When an existing node leaves
  - The node copies its (key, value) pairs to its successor, which now covers the departing node's range of ids
  - The predecessor is pointed at the node's successor in the ring
- When a node joins
  - The node does a lookup on its own id
  - and learns the node responsible for that id
  - This node becomes the new node's successor
  - and the new node can learn that node's predecessor, which becomes the new node's predecessor
- Both operations are sketched below
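A single-process sketch of these steps, assuming the Chord-style convention that a node is responsible for the ids in (predecessor, self]; the strawman lookup walks the ring one successor at a time (the next slides improve on this).

    class Node:
        """A ring node keeping only pred/succ pointers and its pairs."""
        def __init__(self, node_id):
            self.id = node_id
            self.pred = self.succ = self  # a lone node points to itself
            self.pairs = {}               # key_id -> value pairs stored here

        def leave(self):
            # Hand our pairs to the successor, which now covers our id range,
            # then splice ourselves out of the circularly linked list.
            self.succ.pairs.update(self.pairs)
            self.pred.succ = self.succ
            self.succ.pred = self.pred

        def join(self, member):
            # Look up our own id via any existing member: the node responsible
            # for it becomes our successor, its old predecessor becomes ours.
            succ = member.lookup(self.id)
            self.succ, self.pred = succ, succ.pred
            succ.pred.succ = self  # old predecessor now points to us
            succ.pred = self
            # Keys that now fall in our range migrate from the successor.
            for k in [k for k in succ.pairs if between(k, self.pred.id, self.id)]:
                self.pairs[k] = succ.pairs.pop(k)

        def lookup(self, key_id):
            # Strawman O(n) walk: follow successors until we reach the node
            # whose range (pred.id, id] covers key_id.
            node = self
            while not between(key_id, node.pred.id, node.id):
                node = node.succ
            return node

    def between(x, a, b):
        """True if x lies in the half-open interval (a, b] on the ring."""
        return (a < x <= b) if a < b else (x > a or x <= b)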

Nodes Coming and Going
- Small changes when nodes come and go
  - Only the keys mapped to the node that comes or goes are affected
[Figure: ring with Hash(name) -> object_id and Hash(IP_address) -> node_id]

How to Find the Nearest Node?
- Need to find the closest node
  - To determine who should store a (key, value) pair
  - To direct a future lookup(key) query to that node
- Strawman solution: walk through the linked list
  - Circular linked list of nodes in the ring
  - O(n) lookup time with n nodes in the ring
- Alternative solution:
  - Jump further around the ring
  - "Finger" table of additional overlay links

Links in the Overlay Topology
- Trade-off between the number of hops and the number of neighbors
  - E.g., log(n) for both, where n is the number of nodes
  - E.g., overlay links that span 1/2, 1/4, 1/8, ... of the ring
  - Each hop traverses at least half of the remaining distance (see the sketch below)
[Figure: ring with links spanning 1/2, 1/4, and 1/8 of the id space]
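A sketch of Chord-style finger tables, assuming finger j of node n points to the first node clockwise from n + 2^j; the global view used to build the tables is purely for illustration, since real nodes build them incrementally. Greedy routing hops to the farthest finger that does not overshoot the key, so each hop covers at least half of the remaining distance.

    import bisect
    import random

    BITS = 16
    M = 2 ** BITS  # a small id space, just for illustration

    def successor(ids, x):
        """First node id clockwise from point x (ids is sorted)."""
        return ids[bisect.bisect_left(ids, x % M) % len(ids)]

    def fingers(ids, n):
        """Finger j of node n is successor(n + 2^j)."""
        return [successor(ids, n + 2 ** j) for j in range(BITS)]

    def between(x, a, b):
        """True if x lies in the open ring interval (a, b)."""
        return (a < x < b) if a < b else (x > a or x < b)

    def lookup(key, start, table):
        """Route toward successor(key); returns (owner, hop count)."""
        n, hops = start, 0
        while True:
            succ = table[n][0]  # finger 0 is n's immediate successor
            if key == succ or between(key, n, succ):
                return succ, hops
            # Farthest finger that still precedes the key on the ring.
            n = max((f for f in table[n] if between(f, n, key)),
                    key=lambda f: (f - n) % M)
            hops += 1

    ids = sorted(random.sample(range(M), 100))
    table = {n: fingers(ids, n) for n in ids}
    owner, hops = lookup(random.randrange(M), ids[0], table)
    print(f"resolved in {hops} hops")  # about log2(100) ≈ 7, not ~50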