File Sharing: Hash/Lookup Yossi Shasho (HW in last slide) Based on Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications. Partially based on The Impact of DHT Routing Geometry on Resilience and Proximity. Partially based on Building a Low-latency, Proximity-aware DHT-Based P2P Network. Some slides liberally borrowed from: Carnegie Mellon Peer-2-Peer, and Petar Maymounkov and David Mazières' Kademlia talk, New York University. 1

Peer-2-Peer – Distributed systems without any centralized control or hierarchical organization. – Long list of applications: redundant storage, permanence, selection of nearby servers, anonymity, search, authentication, hierarchical naming and more. – The core operation in most P2P systems is efficient location of data items. 2

Outline 1. The Problem 2. Distributed hash tables (DHT) 3. Chord: a DHT scheme Geometry Lookup Node Joins Performance 4. Extras 3

1. The Problem 2. Distributed hash tables (DHT) 3. Chord: a DHT scheme Geometry Lookup Node Joins Performance 4. Extras 4

Think Big /home/google/ One namespace, thousands of servers – Map each key (=filename) to a value (=server) – Hash table? Think again: What if a new server joins? A server fails? How to keep track of all servers? What about redundancy? And proximity? Not scalable, centralized, fault-intolerant. Lots of new problems to come up… 5
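To make the problem concrete, here is a small sketch (not from the slides; the file names and server counts are made up) of the naive approach, hashing each filename modulo the number of servers. Adding a single server remaps nearly every key:

```python
# Naive key -> server mapping: hash(key) mod num_servers.
# Adding one server changes the modulus, so nearly every key moves.
import hashlib

def server_for(key: str, num_servers: int) -> int:
    digest = int(hashlib.sha1(key.encode()).hexdigest(), 16)
    return digest % num_servers

keys = [f"/home/google/file{i}" for i in range(10_000)]
before = {k: server_for(k, 100) for k in keys}
after = {k: server_for(k, 101) for k in keys}     # one new server joins
moved = sum(before[k] != after[k] for k in keys)
print(f"{moved / len(keys):.1%} of keys changed servers")   # ~99%, not ~1/101
```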

1. The Problem 2. Distributed hash tables (DHT) 3. Chord: a DHT scheme Geometry Lookup Node Joins Performance 4. Extras 6

DHT: Overview Abstraction: a distributed “hash-table” (DHT) data structure: – put(id, item); – item = get(id); Scalable, Decentralized, Fault Tolerant Implementation: nodes in system form a distributed data structure – Can be Ring, Tree, Hypercube, Skip List, Butterfly Network,... 7
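As a rough illustration of the abstraction (a hypothetical sketch, not any particular DHT's API; the class and method names are made up), the application-facing interface is just put/get, while the implementation decides which node owns each id:

```python
# Hypothetical sketch of the DHT abstraction: applications see put/get,
# the implementation maps each id to the node responsible for it.
class DHTNode:
    def __init__(self, node_id: int, ring: "dict[int, DHTNode]"):
        self.node_id = node_id
        self.ring = ring          # stand-in for the real routing structure
        self.store = {}           # items this node is responsible for

    def responsible_node(self, key_id: int) -> "DHTNode":
        # Placeholder: a real DHT routes to the owner in O(log n) hops.
        owner = min(self.ring, key=lambda nid: (nid - key_id) % 2**32)
        return self.ring[owner]

    def put(self, key_id: int, item) -> None:
        self.responsible_node(key_id).store[key_id] = item

    def get(self, key_id: int):
        return self.responsible_node(key_id).store.get(key_id)

ring: "dict[int, DHTNode]" = {}
for nid in (5, 20, 80):
    ring[nid] = DHTNode(nid, ring)
ring[5].put(12, "some item")
print(ring[5].get(12), "stored on node", ring[5].responsible_node(12).node_id)
```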

DHT: Overview (2) Many DHTs: 8

DHT: Overview (3) Good properties: – Distributed construction/maintenance – Load-balanced with uniform identifiers – O(log n) hops / neighbors per node – Provides underlying network proximity 9

Consistent Hashing When adding rows (servers) to a hash table, we don't want all keys to change their mappings. When adding the Nth row, we want ~1/N of the keys to change their mappings. Is this achievable? Yes. 10
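A small sketch (my own illustration, not from the slides) of how consistent hashing achieves this: servers and keys hash onto the same circle, each key belongs to the first server clockwise from it, and a new server only takes keys from its immediate neighbor, roughly 1/N of them:

```python
# Consistent hashing sketch: adding the Nth server moves only ~1/N of keys.
import bisect, hashlib

M = 2**32
def h(s: str) -> int:
    return int(hashlib.sha1(s.encode()).hexdigest(), 16) % M

def owner(key: str, ring: list) -> int:
    i = bisect.bisect_left(ring, h(key))        # first server id >= key id
    return ring[i % len(ring)]                  # wrap around the circle

servers = sorted(h(f"server{i}") for i in range(99))
keys = [f"file{i}" for i in range(10_000)]
before = {k: owner(k, servers) for k in keys}

servers_plus = sorted(servers + [h("server99")])   # the 100th server joins
after = {k: owner(k, servers_plus) for k in keys}
moved = sum(before[k] != after[k] for k in keys)
print(f"{moved / len(keys):.1%} of keys moved (expect ~1/100 = 1.0%)")
```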

11 1. The Problem 2. Distributed hash tables (DHT) 3. Chord: a DHT scheme Geometry Lookup Node Joins Performance 4. Extras


Chord: Overview Just one operation: item = get(id) Each node needs routing info about only a few other nodes O(log N) for lookup, O(log² N) for join/leave Simple, provable correctness, provable performance Apps built on top of it do the rest 14

Chord: Geometry Identifier space [1,N], example: binary strings Keys (filenames) and values (server IPs) on the same identifier space Keys & values evenly distributed Now, put this identifier space on a circle Consistent Hashing: A key is stored at its successor. 15

Chord: Geometry (2) A key is stored at its successor: the node with the next higher ID. (Figure: circular ID space with nodes N32, N90, N105 and keys K5, K20, K80; Get(5)=32, Get(20)=32, Get(80)=90. Who maps to 105? Nobody.) 16
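In code, the successor rule from the figure looks roughly like this (a sketch using the slide's example node IDs):

```python
# Successor rule on the slide's example ring: a key lives on the first
# node whose ID is >= the key ID, wrapping around the circle.
nodes = sorted([32, 90, 105])

def successor(key: int) -> int:
    for n in nodes:
        if n >= key:
            return n
    return nodes[0]                  # wrap: nothing >= key, so the lowest node

for key in (5, 20, 80):
    print(f"key {key} is stored at node {successor(key)}")   # 32, 32, 90
# None of these keys lands on node 105 ("Who maps to 105? Nobody.")
```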

Chord: Back to Consistent Hashing "When adding the Nth row, we want ~1/N of the keys to change their mappings." (The problem, a few slides back.) (Figure: the same ring after nodes N15 and N50 join; Get(5) changes from 32 to 15, while Get(20)=32 and Get(80)=90 are unchanged. Who maps to 105? Still nobody.) 17

18 1. The Problem 2. Distributed hash tables (DHT) 3. Chord: a DHT scheme Geometry Lookup Node Joins Performance 4. Extras

Chord: Basic Lookup (Figure: ring with N10, N32, N60, N90, N105, N120; a node asks "Where is key 80?" and the answer "N90 has K80" comes back after walking the ring.) Each node remembers only the next node, so lookup takes O(N) time – no good! get(k): if I have k, return "ME"; else P ← next node; return P.get(k). 19
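The slide's get(k) pseudocode, expanded into a runnable sketch (the ring wiring is illustrative) that walks the ring one successor at a time, which is why it costs O(N) hops:

```python
# Linear lookup: each node only knows its successor, so a query may
# traverse the whole ring -- O(N) hops in the worst case.
class Node:
    def __init__(self, node_id: int):
        self.node_id = node_id
        self.keys = set()            # keys this node stores
        self.next = None             # successor pointer

    def get(self, k: int, hops: int = 0):
        if k in self.keys:
            return self.node_id, hops          # "ME"
        return self.next.get(k, hops + 1)      # forward to the next node

# Build the slide's ring: N10 -> N32 -> N60 -> N90 -> N105 -> N120 -> N10
ids = [10, 32, 60, 90, 105, 120]
nodes = {i: Node(i) for i in ids}
for a, b in zip(ids, ids[1:] + ids[:1]):
    nodes[a].next = nodes[b]
nodes[90].keys.add(80)                         # K80 lives at its successor N90

print(nodes[10].get(80))                       # (90, 3): found at N90 after 3 hops
```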

Chord: “Finger Table” The previous lookup was O(N); we want O(log N). Entry i in the finger table of node n is the first node n' such that n' ≥ n + 2^i. In other words, the i-th finger of n points 1/2^(m-i) of the way around the ring, where m is the number of ID bits. (Figure: node N80 with fingers spanning 1/2, 1/4, 1/8, ..., 1/128 of the ring; finger table rows i = 0, 1, 2 with id+2^i = 81, 82, 84 and successors left blank.) 20

Chord: “Finger Table” Lookups Same picture as before (N80, fingers spanning 1/2, 1/4, ..., 1/128 of the ring; finger table rows id+2^i = 81, 82, 84). The lookup now forwards to the closest finger ≤ k instead of simply the next node: get(k): if I have k, return "ME"; else P ← closest finger ≤ k; return P.get(k). 21
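As a concrete illustration (my own sketch; it assumes a 7-bit ID space and reuses the basic-lookup ring with N80 added, since the slide leaves the successor column blank), here is how N80's finger table would be filled in:

```python
# Finger table sketch for node 80, assuming a 7-bit ID space (0..127)
# and the example ring N10, N32, N60, N80, N90, N105, N120.
M = 7
SIZE = 2**M
NODES = sorted([10, 32, 60, 80, 90, 105, 120])

def successor(i: int) -> int:
    """First node clockwise from id i (the node responsible for i)."""
    for n in NODES:
        if n >= i:
            return n
    return NODES[0]                       # wrap around the ring

n = 80
for i in range(M):
    target = (n + 2**i) % SIZE
    print(f"finger[{i}]: id+2^{i} = {target:3d} -> successor N{successor(target)}")
# finger[0..3] -> N90, finger[4] -> N105, finger[5] -> N120, finger[6] -> N32
```

Note how the targets double each time, so the fingers cover exponentially larger arcs of the ring.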

Chord: “Finger Table” Lookups (Figure: ring with N2, N9, N19, N31, N49, N65, N74, N81, N90; a query "Where is key 40?" hops along finger-table entries until the node responsible for K40 answers "40!". Parts of the finger tables of the intermediate nodes are shown in the figure.) get(k): if I have k, return "ME"; else P ← closest finger ≤ k; return P.get(k). 22

Chord: Example Assume an identifier space [0..8) (3-bit IDs). Node n1 joins and is responsible for all keys. (Succ = successor.) Succ. table of n1: i=0: id+2^0=2 → 1; i=1: id+2^1=3 → 1; i=2: id+2^2=5 → 1. 23

Chord: Example Node n2 joins. Succ. table of n1: 2 → 2, 3 → 1, 5 → 1. Succ. table of n2: 3 → 1, 4 → 1, 6 → 1. 24

Chord: Example Nodes n0 and n6 join. Succ. table of n1: 2 → 2, 3 → 6, 5 → 6. Succ. table of n2: 3 → 6, 4 → 6, 6 → 6. Succ. table of n0: 1 → 1, 2 → 2, 4 → 6. Succ. table of n6: 7 → 0, 0 → 0, 2 → 2. 25

Chord: Example Nodes: n0, n1, n2, n6. Items: 1 and 7. Each item is stored at its successor: item 1 at n1, item 7 at n0. (The successor tables are as on the previous slide.) 26

Chord: Routing Upon receiving a query for item id, a node: 1. checks if it stores the item locally; 2. if not, forwards the query to the largest node i in its finger table such that i ≤ id. (Figure: query(7) is forwarded along the finger tables until it reaches n0, which stores item 7.) 27
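Here is a runnable sketch (my own code, not from the slides) of this routing rule on the example ring. It uses the circular-interval form of "closest preceding finger" so the wrap-around at 7 → 0 is handled, and it assumes the query for item 7 starts at n1:

```python
# Sketch of Chord routing over the example ring (m = 3 bits, nodes 0,1,2,6).
M = 3
SIZE = 2**M
NODES = sorted([0, 1, 2, 6])

def in_interval(x, a, b, inclusive_right=False):
    """True if x lies in the circular interval (a, b), optionally (a, b]."""
    if a < b:
        return a < x < b or (inclusive_right and x == b)
    return x > a or x < b or (inclusive_right and x == b)   # wrapped interval

def successor(i):
    for n in NODES:
        if n >= i % SIZE:
            return n
    return NODES[0]

def fingers(n):
    return [successor((n + 2**k) % SIZE) for k in range(M)]

def find_successor(n, key, path):
    path.append(n)
    succ = successor((n + 1) % SIZE)
    if in_interval(key, n, succ, inclusive_right=True):
        return succ                      # succ is responsible for key
    # forward to the closest finger that precedes key on the circle
    for f in reversed(fingers(n)):
        if in_interval(f, n, key):
            return find_successor(f, key, path)
    return find_successor(succ, key, path)

path = []
print("item 7 lives at node", find_successor(1, 7, path), "route:", path)  # node 0, route [1, 6]
```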

28 1. The Problem 2. Distributed hash tables (DHT) 3. Chord: a DHT scheme Geometry Lookup Node Joins Performance 4. Extras

Chord: Node Join Node n joins: need one existing node, n', in hand. 1. Initialize the fingers of n – ask n' to look them up (log N fingers to init). 2. Update the fingers of the rest – only a few nodes need to be updated – look them up and tell them n is new in town. 3. Transfer keys. 29
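A rough, runnable sketch of what the join must accomplish (my own simulation over a global node list, not the real Chord RPC protocol; step 2, updating other nodes' fingers, is omitted here):

```python
# Simulation of a join: the new node builds its fingers by looking up
# n + 2^i for each i, and takes over the keys whose successor it now is.
M = 3
SIZE = 2**M

def successor(i, nodes):
    for n in sorted(nodes):
        if n >= i % SIZE:
            return n
    return min(nodes)

def join(new_id, nodes, keys):
    nodes = sorted(nodes + [new_id])
    # Step 1: initialize fingers of the new node via lookups.
    fingers = [successor((new_id + 2**i) % SIZE, nodes) for i in range(M)]
    # Step 3: keys whose successor is now the new node move to it.
    transferred = [k for k in keys if successor(k, nodes) == new_id]
    return nodes, fingers, transferred

nodes, keys = [1, 2], [1, 7]              # items 1 and 7, as in the example
nodes, fingers, moved = join(0, nodes, keys)
print("fingers of n0:", fingers)          # [1, 2, 0]
print("keys transferred to n0:", moved)   # [7] -- item 7 moves from n1 to n0
```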

Chord: Improvements Every 30s, ask your successor for its predecessor – fix your own successor based on this. Also, pick and verify a random finger – rebuild finger-table entries this way. Keep a successor list of r successors – deals with unexpected node failures – can also be used to replicate data. 30
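A minimal sketch of the stabilize/notify idea behind this (simplified from the Chord paper's protocol; the node IDs are made up, and real Chord runs these steps periodically over RPC and also refreshes a random finger):

```python
# Stabilization sketch: each node periodically checks whether a new node
# has slipped in between it and its successor, and repairs its pointers.
class Node:
    def __init__(self, node_id):
        self.node_id = node_id
        self.successor = self
        self.predecessor = None

def between(x, a, b):                     # x in the open circular interval (a, b)
    return (a < x < b) if a < b else (x > a or x < b)

def stabilize(n: Node):
    x = n.successor.predecessor
    if x is not None and between(x.node_id, n.node_id, n.successor.node_id):
        n.successor = x                   # a node slipped in between us
    notify(n.successor, n)

def notify(n: Node, candidate: Node):
    if n.predecessor is None or between(candidate.node_id,
                                        n.predecessor.node_id, n.node_id):
        n.predecessor = candidate

# Example: n26 joins between n21 and n32 knowing only its successor n32;
# one round of stabilize() at n26 and n21 repairs the ring pointers.
n21, n26, n32 = Node(21), Node(26), Node(32)
n21.successor, n32.predecessor = n32, n21
n26.successor = n32                        # n26 joined via a lookup of its successor
stabilize(n26)                             # tells n32 about n26
stabilize(n21)                             # n21 learns n32's predecessor is now n26
print(n21.successor.node_id, n26.successor.node_id, n32.predecessor.node_id)  # 26 32 26
```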

31 1. The Problem 2. Distributed hash tables (DHT) 3. Chord: a DHT scheme Geometry Lookup Node Joins Performance 4. Extras

Chord: Performance Routing table size? – log N fingers. Routing time? – Each hop is expected to halve the distance to the desired id => expect O(log N) hops. Node joins – Query for the fingers => O(log N) – Update other nodes' fingers => O(log² N). 32
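A quick back-of-the-envelope check of the hop count (the ~½·log2 N average path length is the figure reported in the Chord paper; the N values below are arbitrary):

```python
# If every hop at least halves the remaining distance in a 2^m-id space,
# a lookup needs at most m hops; with N random nodes the average is ~0.5*log2(N).
import math

for N in (100, 1_000, 10_000, 100_000):
    print(f"N = {N:>7}: log2(N) = {math.log2(N):5.1f}  "
          f"expected hops ~ {0.5 * math.log2(N):4.1f}")
```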

Chord: Performance (2) (Figure: measured lookup time as a function of the number of nodes.) 33

Chord: Performance (3) 34 Comparing to other DHTs

Chord: Performance (4) Promises a few O(log N) hops on the overlay – but on the physical network, each hop can be quite far. (Figure: a Chord network with N=8 nodes and an m=8-bit key space.) 35

36 1. The Problem 2. Distributed hash tables (DHT) 3. Chord: a DHT scheme Geometry Lookup Node Joins Performance 4. Extras

Applications employing DHTs eMule (KAD implements Kademlia – a DHT); an anonymous network (≥ 2 million downloads to date); BitTorrent (≥ beta) – Trackerless BitTorrent, allows anonymity (thank god). 1. Clients A & B handshake. 2. A: "I have DHT, it's on port X." 3. B pings port X of A. 4. B gets a reply => starts adjusting – nodes, rows… 37

Kademlia (KAD) The distance between A and B is A XOR B. Nodes are treated as leaves in a binary tree. A node's position in A's tree is determined by the longest prefix it shares with A. (Example IDs for A and B appear on the slide.)
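A tiny sketch of the XOR metric (8-bit toy IDs, chosen only for illustration): the longer the shared prefix of two IDs, the smaller their XOR distance.

```python
# XOR distance sketch: distance(A, B) = A xor B; a longer shared prefix
# means a smaller distance.
BITS = 8

def distance(a: int, b: int) -> int:
    return a ^ b

def common_prefix_len(a: int, b: int) -> int:
    return BITS - distance(a, b).bit_length() if a != b else BITS

A, B, C = 0b00110101, 0b00110011, 0b11000000     # made-up example IDs
print(distance(A, B), common_prefix_len(A, B))    # 6, shared prefix of 5 bits
print(distance(A, C), common_prefix_len(A, C))    # 245, no shared prefix
```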

Kademlia: Prefix Tree A node's position in A's tree is determined by the longest prefix it shares with A (=> log N subtrees). (Figure: the binary tree around our node, divided into subtrees with common prefix 001, common prefix 00, common prefix 0, and no common prefix.) 39

Kademlia: Lookup Consider a query for ID … initiated by node … (Figure: the node tree, with our node marked.) 40

Kademlia: K-Buckets A node's binary tree is divided into a series of subtrees. Consider the routing table for a node with prefix 0011. A contact consists of (IP address, UDP port, node ID). The routing table is composed of a k-bucket corresponding to each of these subtrees. Consider a 2-bucket example: each bucket will have at least 2 contacts for its key range. (Figure: the ID space from 11…11 to 00…00, with our node and its buckets marked.)
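A minimal sketch of such a routing table (my own simplification: contacts are bare IDs rather than (IP, port, ID) triples, and Kademlia's least-recently-seen eviction policy is omitted):

```python
# K-bucket sketch: one bucket per XOR-distance range [2^i, 2^(i+1)),
# i.e. per shared-prefix length, each holding up to k contacts.
BITS = 8
K = 2                                     # k = 2, as in the slide's example

class RoutingTable:
    def __init__(self, my_id: int):
        self.my_id = my_id
        self.buckets = [[] for _ in range(BITS)]   # bucket i covers 2^i <= d < 2^(i+1)

    def bucket_index(self, other_id: int) -> int:
        return (self.my_id ^ other_id).bit_length() - 1

    def add_contact(self, other_id: int) -> None:
        if other_id == self.my_id:
            return
        bucket = self.buckets[self.bucket_index(other_id)]
        if other_id not in bucket and len(bucket) < K:
            bucket.append(other_id)        # real Kademlia pings the oldest entry instead

table = RoutingTable(0b00110000)
for peer in (0b00110001, 0b00110011, 0b00111100, 0b11001100):
    table.add_contact(peer)
for i, b in enumerate(table.buckets):
    if b:
        print(f"bucket {i} (distance {2**i}..{2**(i+1)-1}): {[bin(x) for x in b]}")
```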

Summary 1. The Problem 2. Distributed hash tables (DHT) 3. Chord: a DHT scheme Geometry Lookup Node Joins Performance 4. Extras

Homework Load balance is achieved when all servers in the Chord network are responsible for (roughly) the same number of keys. Still, with some probability, one server can end up responsible for significantly more keys than others. How can we lower the upper bound on the number of keys assigned to a single server? Hint: simulation. 43