Principles of Reliable Distributed Systems
Lecture 2: Peer-to-Peer (P2P) Lookup Systems
Spring 2007
Idit Keidar, Technion EE

Today's Material: Papers
Looking up Data in P2P Systems
–Balakrishnan et al.
Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications
–Stoica et al.

What Does Peer-to-Peer Mean?
Characterization from the CFP of IPTPS 2004:
–decentralized,
–self-organizing
–distributed systems,
–in which all or most communication is symmetric.

Typical Characteristics
Lots of nodes (e.g., millions)
Dynamic: frequent join, leave, failure
Little or no infrastructure
–no central server
Communication possible between every pair of nodes (cf. the Internet)
All nodes are "peers" – they have the same role and do not have many resources

The Main Challenge
To design and implement a robust and scalable distributed system composed of inexpensive, individually unreliable computers in unrelated administrative domains
"Looking up Data in P2P Systems", Balakrishnan et al., CACM, Feb 2003

It All Started with Lookup
Goal: make billions of objects available to millions of concurrent users
–e.g., music files
Need a mechanism to keep track of them
–map files to their locations
First there was Napster
–centralized server/database
–pros and cons?

Traditional Scalability Solution
Hierarchy
–tree overlay: organize the nodes into a spanning tree; communicate on the links of the tree
–structured lookup: know where to forward the query next
–e.g., DNS
Pros and cons?

Overlay Networks
A virtual structure imposed over the physical network (e.g., the Internet)
–over the Internet, there is an (IP-level) unicast channel between every pair of hosts
–an overlay uses a fixed subset of these
–nodes that could communicate directly with each other do not necessarily use that capability in the overlay
What is this good for?

Symmetric Lookup Algorithms
All nodes were created equal
No hierarchy
–the overlay is not a tree
So how does the search go?
–it depends…

Searching in Overlay Networks, Take I: Gnutella
Build a decentralized unstructured overlay
–each node has several neighbors
–and holds several keys in its local database
When asked to find a key X
–check the local database to see whether X is known
–if yes, return it; if not, ask your neighbors
What is the communication pattern?

Resolve Query by Flooding
[Figure: the query from source S floods the overlay hop by hop until it reaches a node holding X]
A Time-To-Live (TTL) of 5 would have been enough
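
To make the flooding pattern concrete, here is a minimal Python sketch of TTL-limited flooding over an unstructured overlay; the overlay and the local databases are represented as plain dictionaries, and all names (flood_lookup, neighbors, local_keys) are illustrative rather than taken from Gnutella.

from collections import deque

def flood_lookup(neighbors, local_keys, source, key, ttl=5):
    # neighbors: dict node -> list of neighbor nodes (the unstructured overlay)
    # local_keys: dict node -> set of keys held in that node's local database
    visited = {source}
    frontier = deque([(source, ttl)])
    messages = 0
    while frontier:
        node, budget = frontier.popleft()
        if key in local_keys.get(node, set()):
            return node, messages            # a node holding the key was reached
        if budget == 0:
            continue                          # TTL exhausted along this branch
        for nbr in neighbors.get(node, []):
            if nbr not in visited:
                visited.add(nbr)
                messages += 1                 # one query message per overlay edge used
                frontier.append((nbr, budget - 1))
    return None, messages                     # not found within the TTL

# Example: S asks for X; the key sits two hops away.
overlay = {"S": ["A", "B"], "A": ["S", "C"], "B": ["S"], "C": ["A"]}
stored  = {"C": {"X"}}
print(flood_lookup(overlay, stored, "S", "X"))   # -> ('C', 3)

Every edge explored costs a message, which is why the communication pattern scales poorly even when the TTL is small.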

How Come It Works?
Search is fast
–which is what people care about
People don't care so much about wasting bandwidth
–this may change as ISPs start charging for bandwidth
Scalability is limited
–normally no more than ~40,000 peers
Files are replicated many times, so flooding with a small TTL usually finds the file
–even if there are multiple connected components

Take II: FastTrack, KaZaA, eDonkey
Improve scalability by re-introducing a hierarchy
–though not a tree
–super-peers have more resources and more neighbors, and know more keys
–searches go through the super-peers
Pros and cons?

Structured Lookup Overlays
Many recent academic systems:
–CAN, Chord, D2B, Kademlia, Koorde, Pastry, Tapestry, Viceroy, …
OverNet is based on the Kademlia algorithm
Symmetric, no hierarchy
Decentralized self-management
Structured overlay – data is stored in a defined place, and the search follows a defined path
Implement the Distributed Hash Table (DHT) abstraction

Reminder: Hashing
A data structure supporting the operations:
–void insert( key, item )
–item search( key )
The implementation uses a hash function to map keys to array cells
Expected search time is O(1)
–provided that there are few collisions
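
As a reminder of the single-machine case, here is a minimal sketch of a chained hash table with exactly the insert/search interface above (Python's built-in dict already provides this; the point is only the key-to-bucket mapping):

class HashTable:
    # A key is hashed to one of the array cells ("buckets"); collisions chain.
    def __init__(self, num_buckets=101):
        self.buckets = [[] for _ in range(num_buckets)]

    def _bucket(self, key):
        return self.buckets[hash(key) % len(self.buckets)]

    def insert(self, key, item):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, item)    # overwrite an existing key
                return
        bucket.append((key, item))

    def search(self, key):
        # Expected O(1): only keys that collide into this bucket are scanned.
        for k, item in self._bucket(key):
            if k == key:
                return item
        return None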

Distributed Hash Tables (DHTs)
Nodes store the table entries
–playing the role of the array cells
A good abstraction for lookup? Why?
Requirements for an application to be able to use a DHT?
–data is identified with unique keys
–nodes can (agree to) store keys for each other
–the stored value can be the location of the object or the actual object

The DHT Service Interface
lookup( key ) returns the location of the node currently responsible for this key
–the key is usually numeric (in some range)

Using the DHT Interface
How do you publish a file?
How do you find a file?
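
A sketch of how an application could sit on top of the lookup interface; dht.lookup, node.store and node.get are assumed placeholder APIs (the DHT itself only maps keys to nodes), and hashing the file name with SHA-1 into an m-bit key space follows the key-mapping convention used later in the lecture.

import hashlib

M = 160                                  # size of the key space in bits

def key_of(name):
    # Hash the file name into the m-bit key space.
    digest = hashlib.sha1(name.encode()).digest()
    return int.from_bytes(digest, "big") % (2 ** M)

def publish(dht, name, location):
    # Publishing: store (key -> location of the file, or the file itself)
    # at the node that the DHT says is responsible for the key.
    node = dht.lookup(key_of(name))
    node.store(key_of(name), location)

def find(dht, name):
    # Finding: route to the same responsible node and read the entry back.
    node = dht.lookup(key_of(name))
    return node.get(key_of(name))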

What Does a DHT Implementation Need to Do?
Map keys to nodes
–needs to be dynamic as nodes join and leave
–how does this affect the service interface?
Route a request to the appropriate node
–routing on the overlay

Lookup Example
[Figure: insert(K1, V1) stores the pair at the node responsible for K1; lookup(K1) is routed to that node and returns (K1, V1)]

Mapping Keys to Nodes
Goal: load balancing
–why?
Typical approach:
–give an m-bit identifier to each node and each key (e.g., by applying SHA-1 to the key or to the node's IP address)
–map each key to the node whose ID is "close" to the key (this requires a distance function)
–how is load balancing achieved?
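
A sketch of the identifier assignment under the stated assumptions (SHA-1, m = 160); the names node_id, clockwise_distance and responsible_node are illustrative, and the "distance" here is the clockwise gap on the identifier circle, which is the notion of closeness Chord uses:

import hashlib

M = 160                                   # m-bit identifier space

def node_id(address):
    # Node identifier: SHA-1 of the node's IP address (and port), mod 2^m.
    digest = hashlib.sha1(address.encode()).digest()
    return int.from_bytes(digest, "big") % (2 ** M)

def clockwise_distance(a, b):
    # Distance from a to b going clockwise around the 2^m circle.
    return (b - a) % (2 ** M)

def responsible_node(key, node_ids):
    # The key is mapped to the node whose ID is "closest": the first node
    # reached going clockwise from the key (distance 0 if the IDs are equal).
    return min(node_ids, key=lambda n: clockwise_distance(key, n))

Because SHA-1 spreads both node IDs and keys roughly uniformly over the circle, each node ends up responsible for about the same number of keys, which is where the load balancing comes from.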

Routing Issues
Each node must be able to forward each lookup query to a node closer to the destination
Maintain routing tables adaptively
–each node knows some other nodes
–must adapt to changes (joins, leaves, failures)
–goals?

Handling Join/Leave
When a node joins, it needs to assume responsibility for some keys
–ask the application to move these keys to it
–how many keys will need to be moved?
When a node fails or leaves, its keys have to be moved to others
–what else is needed in order to implement this?

P2P System Interface
Lookup
Join
Move keys

Chord
Stoica, Morris, Karger, Kaashoek, and Balakrishnan

Chord Logical Structure
m-bit ID space (2^m IDs), usually m = 160
Think of the nodes as organized in a logical ring according to their IDs
[Figure: ring with nodes N1, N8, N10, N14, N21, N30, N38, N42, N48, N51, N56]

Consistent Hashing: Assigning Keys to Nodes
Key k is assigned to the first node whose ID equals or follows k – successor(k)
[Figure: on the ring above, key K54 is assigned to its successor N56]
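
With a global view of the ring, successor(k) is just a binary search over the sorted node IDs with wrap-around; a small sketch using the IDs from the figure (the real protocol, of course, computes this without any global view):

import bisect

def successor(key, node_ids):
    # node_ids: sorted list of the IDs currently on the ring.
    # Returns the first node whose ID equals or follows key, wrapping around.
    i = bisect.bisect_left(node_ids, key)
    return node_ids[i % len(node_ids)]

ring = [1, 8, 10, 14, 21, 30, 38, 42, 48, 51, 56]
assert successor(54, ring) == 56      # K54 is assigned to N56, as in the figure
assert successor(57, ring) == 1       # past the largest ID we wrap around to N1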

Moving Keys upon Join/Leave
When a node joins, it becomes responsible for some keys previously assigned to its successor
–a local change
–assuming the load is balanced, how many keys should move?
And what happens when a node leaves?

Consistent Hashing Guarantees
For any set of N nodes and K keys, with high probability:
–Each node is responsible for at most (1 + ε)K/N keys
–When an (N + 1)st node joins or leaves the network, responsibility for O(K/N) keys changes hands (and only to or from the joining or leaving node)
When consistent hashing is implemented as described above, the bound holds with ε = O(log N). The consistent hashing paper shows that ε can be reduced to an arbitrarily small constant by having each node run Ω(log N) virtual nodes, each with its own identifier.
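
A sketch of the virtual-node idea mentioned in the last bullet: one physical machine simply appears on the ring under several identifiers (the address format and the hash are illustrative):

import hashlib

def virtual_ids(address, v, m=160):
    # One physical node joins the ring under v identifiers; running
    # v = Omega(log N) virtual nodes per machine drives epsilon down to an
    # arbitrarily small constant in the (1 + epsilon)K/N load bound.
    ids = []
    for i in range(v):
        digest = hashlib.sha1(f"{address}#{i}".encode()).digest()
        ids.append(int.from_bytes(digest, "big") % (2 ** m))
    return ids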

Simple Routing Solutions
Each node knows only its successor
–routing around the circle
Each node knows all other nodes
–O(1) routing
–cost?

Chord Skiplist Routing
Each node has "fingers" to the nodes ½ of the way around the ID space from it, ¼ of the way, …
finger[i] at node n contains successor(n + 2^(i-1))
the successor is finger[1]
[Figure: ring with nodes N0, N8, N10, N14, N21, N30, N38, N42, N48, N51, N56]
How many fingers are in the finger table?
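
A small sketch of where the fingers point; M = 7 is chosen so the numbers match the example figure on the next slide (the real Chord uses m = 160):

M = 7                                     # 2^7 = 128 IDs, as in the example figure

def finger_starts(n, m=M):
    # finger[i] of node n is successor(n + 2^(i-1)), for i = 1..m.
    # The table therefore has m entries, but typically only O(log N) of them
    # point to distinct nodes.
    return [(n + 2 ** (i - 1)) % (2 ** m) for i in range(1, m + 1)]

# Node 0 aims its seven fingers at IDs 1, 2, 4, 8, 16, 32 and 64.
assert finger_starts(0) == [1, 2, 4, 8, 16, 32, 64]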

Example: Chord Fingers
[Figure: a 7-bit ring with nodes N0, N10, N21, N30, N47, N72, N82, N90, N114; N0's fingers 1..4 all point to N10, finger[5] points to N21, finger[6] to N47, and finger[7] to N72]
m entries
log N distinct fingers with high probability

Chord Data Structures (At Each Node)
Finger table
–the first finger is the successor
Predecessor

Forwarding Queries
A query for key k is forwarded to the finger with the highest ID not exceeding k
[Figure: lookup(K54) is forwarded along fingers around the ring of nodes N0…N56 until it reaches the node responsible for K54]
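
A runnable sketch of the forwarding rule; the Node class, the interval helper and the iterative style are illustrative simplifications of the pseudocode in the Chord paper (closest_preceding_finger falls back to the plain successor so that the sketch also works with an empty finger table):

M = 7
RING = 2 ** M

def in_interval(x, a, b, inclusive_right=False):
    # Is x inside the circular interval (a, b), optionally (a, b]?
    if a < b:
        return a < x < b or (inclusive_right and x == b)
    return x > a or x < b or (inclusive_right and x == b)

class Node:
    def __init__(self, ident):
        self.id = ident
        self.successor = self            # set once the ring is built
        self.fingers = []                # fingers[i-1] = successor(id + 2^(i-1))

    def closest_preceding_finger(self, key):
        # The finger with the highest ID that still precedes the key.
        for f in reversed(self.fingers):
            if in_interval(f.id, self.id, key):
                return f
        return self.successor            # fallback: always makes progress

    def find_successor(self, key):
        # Forward until the key lies between a node and its successor.
        n = self
        while not in_interval(key, n.id, n.successor.id, inclusive_right=True):
            n = n.closest_preceding_finger(key)
        return n.successor

# Demo on the ring from the fingers example (m = 7).
ids = [0, 10, 21, 30, 47, 72, 82, 90, 114]
nodes = {i: Node(i) for i in ids}
for pos, i in enumerate(ids):
    nodes[i].successor = nodes[ids[(pos + 1) % len(ids)]]
    starts = [(i + 2 ** j) % RING for j in range(M)]
    nodes[i].fingers = [nodes[next((x for x in ids if x >= s), ids[0])] for s in starts]

assert nodes[0].find_successor(54).id == 72   # the successor of key 54 is N72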

How Long Does It Take?
[Figure: each forwarding step along the lookup path is a Remote Procedure Call (RPC)]

Routing Time
Node n looks up a key stored at node p
p is in n's ith interval: p ∈ ((n + 2^(i-1)) mod 2^m, (n + 2^i) mod 2^m]
n contacts f = finger[i]
–the interval is not empty (because p is in it), so f ∈ ((n + 2^(i-1)) mod 2^m, (n + 2^i) mod 2^m]
–n RPCs f
f is at least 2^(i-1) away from n
p is at most 2^(i-1) away from f
The distance is halved at each step: at most m steps

Routing Time Refined
Assuming a uniform node distribution around the circle, the number of nodes in the search space is halved at each step:
–expected number of steps: log N
Note that:
–m = 160
–for 1,000,000 nodes, log N ≈ 20

Joining Chord
Goals?
Required steps:
–find your successor
–initialize your finger table and predecessor
–notify the other nodes that need to change their finger tables and predecessor pointers: O(log² N) messages
–learn the keys that you are responsible for, and notify the others that you are assuming control over them

Join Algorithm: Take II
Observation: for correctness, successors suffice
–fingers are only needed for performance
Upon join, update the successor only
Periodically,
–check that successors and predecessors are consistent
–fix the fingers
(a sketch of these periodic steps follows the next slide)

Creation and Join
[Figure: the create() and join() pseudocode from the Chord paper]
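
A sketch following the create/join/stabilize/notify pseudocode of the Chord paper; the interval helper is the same illustrative one as in the routing sketch, and find_successor is simplified here to successor-only routing:

def in_interval(x, a, b, inclusive_right=False):
    if a < b:
        return a < x < b or (inclusive_right and x == b)
    return x > a or x < b or (inclusive_right and x == b)

class ChordNode:
    def __init__(self, ident):
        self.id = ident
        self.successor = self
        self.predecessor = None

    def create(self):
        # Start a new ring containing only this node.
        self.predecessor = None
        self.successor = self

    def join(self, existing):
        # Join through any known node; only the successor is set here,
        # the predecessor and fingers are repaired later by the periodic tasks.
        self.predecessor = None
        self.successor = existing.find_successor(self.id)

    def stabilize(self):
        # Periodic: adopt a closer successor if one has appeared in between,
        # then tell the successor about ourselves.
        x = self.successor.predecessor
        if x is not None and in_interval(x.id, self.id, self.successor.id):
            self.successor = x
        self.successor.notify(self)

    def notify(self, candidate):
        # candidate believes it might be our predecessor.
        if self.predecessor is None or in_interval(candidate.id, self.predecessor.id, self.id):
            self.predecessor = candidate

    def find_successor(self, key):
        # Simplified: walk successors only (the finger version is sketched earlier).
        n = self
        while not in_interval(key, n.id, n.successor.id, inclusive_right=True):
            n = n.successor
        return n.successor

# A two-node ring forms after one round of stabilization on each side.
a, b = ChordNode(10), ChordNode(70)
a.create(); b.join(a); b.stabilize(); a.stabilize()
assert a.successor is b and b.successor is a
assert a.predecessor is b and b.predecessor is a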

Join Example
–the joiner finds its successor
–and gets its keys
–stabilize fixes the successor
–stabilize fixes the predecessor

Join Stabilization Guarantee
If any sequence of join operations is executed interleaved with stabilizations, then at some time after the last join the successor pointers will form a cycle on all the nodes in the network
Model assumptions?

Performance with Concurrent Joins
If we take a stable network with N nodes with correct finger pointers, and another set of up to N nodes joins the network, and all successor pointers (but perhaps not all finger pointers) are correct, then lookups will still take O(log N) time with high probability
Model assumptions?

Failure Handling
Periodically fix the fingers
Keep a list of r successors instead of a single successor
Periodically probe the predecessor, clearing it if it has failed
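
A sketch of the two fault-tolerance mechanisms on this slide; the failure detector (alive) is a placeholder for a ping with a timeout, and R = 3 is illustrative (the analysis below uses r = Ω(log N)):

R = 3                                    # length of the successor list (illustrative)

class FaultTolerantNode:
    def __init__(self, ident):
        self.id = ident
        self.successors = []             # the r nearest successors, closest first
        self.predecessor = None
        self.up = True                   # toy liveness flag for the sketch

    def alive(self, node):
        # Placeholder failure detector, e.g. a ping with a timeout.
        return node is not None and node.up

    def first_live_successor(self):
        # Lookups and stabilization skip over dead entries; the ring stays
        # connected as long as at least one of the r successors is alive.
        for s in self.successors:
            if self.alive(s):
                return s
        return self                      # degenerate case: act as if alone

    def check_predecessor(self):
        # Periodic: clear a failed predecessor so a joining node can notify us.
        if not self.alive(self.predecessor):
            self.predecessor = None

    def refresh_successor_list(self):
        # Periodic: take the live successor's list, shifted by one, truncated to r.
        s = self.first_live_successor()
        self.successors = ([s] + s.successors)[:R]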

The Model?
Reliable messages among correct nodes
–no network partitions
Node failures can be accurately detected!
The properties hold as long as the failure rate is bounded:
–If we use a successor list of length r = Ω(log N) in a network that is initially stable, and then every node fails with probability 1/2, then with high probability find_successor returns the closest living successor to the query key
–In a network that is initially stable, if every node then fails with probability 1/2, then the expected time to execute find_successor is O(log N)

What About Moving Keys?
Left up to the application
Solution: keep soft state, refreshed periodically
–every refresh operation performs lookup(key) before storing the key in the right place
How can we increase reliability for the time between a failure and the next refresh?
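
A sketch of the soft-state refresh loop under the stated assumptions; dht.lookup and node.store (with a TTL argument) are hypothetical placeholder APIs, and the interval is an arbitrary example value:

import hashlib
import threading

REFRESH_INTERVAL = 60.0                  # seconds; illustrative, not from the paper

def key_of(name, m=160):
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big") % (2 ** m)

def keep_published(dht, name, location):
    # Soft state: periodically redo lookup(key) and re-store the mapping, so
    # that after a failure it reappears at the new responsible node within one
    # refresh interval; the stored entry expires if the refreshing stops.
    def refresh():
        key = key_of(name)
        node = dht.lookup(key)            # whoever is responsible for the key now
        node.store(key, location, ttl=2 * REFRESH_INTERVAL)
        threading.Timer(REFRESH_INTERVAL, refresh).start()
    refresh()

For the window between a failure and the next refresh, a standard answer to the closing question is to also replicate each entry at the next few nodes on the ring, i.e. the same successors that are already tracked for fault tolerance.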