P2P data retrieval DHT (Distributed Hash Tables) Partially based on Hellerstein’s presentation at VLDB2004.

Slides:



Advertisements
Similar presentations
Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek, Hari Balakrishnan MIT and Berkeley presented by Daniel Figueiredo Chord: A Scalable Peer-to-peer.
Advertisements

Peer to Peer and Distributed Hash Tables
Pastry Peter Druschel, Rice University Antony Rowstron, Microsoft Research UK Some slides are borrowed from the original presentation by the authors.
Digital Library Service – An overview Introduction System Architecture Components and their functionalities Experimental Results.
Peer-to-Peer Systems Chapter 25. What is Peer-to-Peer (P2P)? Napster? Gnutella? Most people think of P2P as music sharing.
Peer-to-Peer (P2P) Distributed Storage 1Dennis Kafura – CS5204 – Operating Systems.
Clayton Sullivan PEER-TO-PEER NETWORKS. INTRODUCTION What is a Peer-To-Peer Network A Peer Application Overlay Network Network Architecture and System.
P2P Systems and Distributed Hash Tables Section COS 461: Computer Networks Spring 2011 Mike Freedman
CHORD – peer to peer lookup protocol Shankar Karthik Vaithianathan & Aravind Sivaraman University of Central Florida.
Chord: A scalable peer-to- peer lookup service for Internet applications Ion Stoica, Robert Morris, David Karger, M. Frans Kaashock, Hari Balakrishnan.
Distributed Hash Tables CPE 401 / 601 Computer Network Systems Modified from Ashwin Bharambe and Robert Morris.
Chord A Scalable Peer-to-peer Lookup Service for Internet Applications
Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications Ion StoicaRobert Morris David Liben-NowellDavid R. Karger M. Frans KaashoekFrank.
Topics in Reliable Distributed Systems Lecture 2, Fall Dr. Idit Keidar.
Introduction to Peer-to-Peer (P2P) Systems Gabi Kliot - Computer Science Department, Technion Concurrent and Distributed Computing Course 28/06/2006 The.
Efficient Content Location Using Interest-based Locality in Peer-to-Peer Systems Presented by: Lin Wing Kai.
Distributed Lookup Systems
Idit Keidar, Principles of Reliable Distributed Systems, Technion EE, Spring Principles of Reliable Distributed Systems Lecture 2: Peer-to-Peer.
Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek and Hari alakrishnan.
Object Naming & Content based Object Search 2/3/2003.
Chord-over-Chord Overlay Sudhindra Rao Ph.D Qualifier Exam Department of ECECS.
Topics in Reliable Distributed Systems Fall Dr. Idit Keidar.
1 CS 194: Distributed Systems Distributed Hash Tables Scott Shenker and Ion Stoica Computer Science Division Department of Electrical Engineering and Computer.
Peer To Peer Distributed Systems Pete Keleher. Why Distributed Systems? l Aggregate resources! –memory –disk –CPU cycles l Proximity to physical stuff.
Wide-area cooperative storage with CFS
File Sharing : Hash/Lookup Yossi Shasho (HW in last slide) Based on Chord: A Scalable Peer-to-peer Lookup Service for Internet ApplicationsChord: A Scalable.
Lecture 10 Naming services for flat namespaces. EECE 411: Design of Distributed Software Applications Logistics / reminders Project Send Samer and me.
Introduction to Peer-to-Peer Networks. What is a P2P network Uses the vast resource of the machines at the edge of the Internet to build a network that.
INTRODUCTION TO PEER TO PEER NETWORKS Z.M. Joseph CSE 6392 – DB Exploration Spring 2006 CSE, UT Arlington.
A Survey of Peer-to-Peer Content Distribution Technologies Stephanos Androutsellis-Theotokis and Diomidis Spinellis ACM Computing Surveys, December 2004.
Distributed Systems Concepts and Design Chapter 10: Peer-to-Peer Systems Bruce Hammer, Steve Wallis, Raymond Ho.
1 P2P Computing. 2 What is P2P? Server-Client model.
Introduction to Peer-to-Peer Networks. What is a P2P network A P2P network is a large distributed system. It uses the vast resource of PCs distributed.
Wide-area cooperative storage with CFS Frank Dabek, M. Frans Kaashoek, David Karger, Robert Morris, Ion Stoica.
Content Overlays (Nick Feamster). 2 Content Overlays Distributed content storage and retrieval Two primary approaches: –Structured overlay –Unstructured.
Chord & CFS Presenter: Gang ZhouNov. 11th, University of Virginia.
Introduction of P2P systems
Overlay network concept Case study: Distributed Hash table (DHT) Case study: Distributed Hash table (DHT)
Chord: A Scalable Peer-to-peer Lookup Protocol for Internet Applications Xiaozhou Li COS 461: Computer Networks (precept 04/06/12) Princeton University.
Network Computing Laboratory Scalable File Sharing System Using Distributed Hash Table Idea Proposal April 14, 2005 Presentation by Jaesun Han.
Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek, Hari Balakrishnan MIT and Berkeley presented by Daniel Figueiredo Chord: A Scalable Peer-to-peer.
Super-peer Network. Motivation: Search in P2P Centralised (Napster) Flooding (Gnutella)  Essentially a breadth-first search using TTLs Distributed Hash.
SIGCOMM 2001 Lecture slides by Dr. Yingwu Zhu Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications.
1 Peer-to-Peer Technologies Seminar by: Kunal Goswami (05IT6006) School of Information Technology Guided by: Prof. C.R.Mandal, School of Information Technology.
PEER TO PEER (P2P) NETWORK By: Linda Rockson 11/28/06.
Peer to Peer A Survey and comparison of peer-to-peer overlay network schemes And so on… Chulhyun Park
1 Secure Peer-to-Peer File Sharing Frans Kaashoek, David Karger, Robert Morris, Ion Stoica, Hari Balakrishnan MIT Laboratory.
Lecture 12 Distributed Hash Tables CPE 401/601 Computer Network Systems slides are modified from Jennifer Rexford.
CSE 486/586, Spring 2014 CSE 486/586 Distributed Systems Distributed Hash Tables Steve Ko Computer Sciences and Engineering University at Buffalo.
Algorithms and Techniques in Structured Scalable Peer-to-Peer Networks
LOOKING UP DATA IN P2P SYSTEMS Hari Balakrishnan M. Frans Kaashoek David Karger Robert Morris Ion Stoica MIT LCS.
Bruce Hammer, Steve Wallis, Raymond Ho
CS Spring 2014 CS 414 – Multimedia Systems Design Lecture 37 – Introduction to P2P (Part 1) Klara Nahrstedt.
Two Peer-to-Peer Networking Approaches Ken Calvert Net Seminar, 23 October 2001 Note: Many slides “borrowed” from S. Ratnasamy’s Qualifying Exam talk.
INTERNET TECHNOLOGIES Week 10 Peer to Peer Paradigm 1.
P2P Search COP6731 Advanced Database Systems. P2P Computing  Powerful personal computer Share computing resources P2P Computing  Advantages: Shared.
P2P Search COP P2P Search Techniques Centralized P2P systems  e.g. Napster, Decentralized & unstructured P2P systems  e.g. Gnutella.
Malugo – a scalable peer-to-peer storage system..
1 Secure Peer-to-Peer File Sharing Frans Kaashoek, David Karger, Robert Morris, Ion Stoica, Hari Balakrishnan MIT Laboratory.
CS 425 / ECE 428 Distributed Systems Fall 2015 Indranil Gupta (Indy) Peer-to-peer Systems All slides © IG.
CS Spring 2010 CS 414 – Multimedia Systems Design Lecture 24 – Introduction to Peer-to-Peer (P2P) Systems Klara Nahrstedt (presented by Long Vu)
Ion Stoica, Robert Morris, David Liben-Nowell, David R. Karger, M
Distributed Hash Tables
Peer-to-Peer Data Management
(slides by Nick Feamster)
EE 122: Peer-to-Peer (P2P) Networks
DHT Routing Geometries and Chord
Building Peer-to-Peer Systems with Chord, a Distributed Lookup Service
P2P Systems and Distributed Hash Tables
Consistent Hashing and Distributed Hash Table
Presentation transcript:

P2P data retrieval DHT (Distributed Hash Tables) Partially based on Hellerstein’s presentation at VLDB2004

Outline Motivations Early Architectures DHT: Chord Conclusions

What is P2P P2P means Peer-to-Peer and has many connotations Common use: filestealing systems (Gnutella, eDonkey) and user-centric networks (ICQ, Yahoo! Messenger) Our focus: many independent data providers, large-scale systems, non- existent management

Motivations IM (Instant Messaging) 50 million users identified by username. To connect to user A we have to resolve username A to her ip-address Centralized solution: fault-tolerance, load balancing?

Motivations PM (Profile management): Amazon, Google, Yahoo! 100 million users store, retrieve and update their profiles 1000 computers are available for PM How to build fast, robust, fault-tolerant storage?

Motivations File-sharing(stealing) networks 20 million users share files identified by keywords How to build efficient file search? High churn rate! Average lifetime of a node in the networks = 2 hours

What is common? High Churn Few guarantees on transport, storage, etc. Huge optimization space Network bottlenecks & other resource constraints No administrative organizations

Early P2P I: Client-Server Napster xyz.mp3 ? xyz.mp3

Early P2P I: Client-Server Napster C-S search xyz.mp3

Early P2P I: Client-Server Napster C-S search xyz.mp3 ? xyz.mp3

Early P2P I: Client-Server Napster C-S search “pt2pt” file xfer xyz.mp3 ? xyz.mp3

Early P2P I: Client-Server Napster C-S search “pt2pt” file xfer xyz.mp3 ? xyz.mp3

Early P2P I: Client Server Server assigns work units My machine info

Early P2P I: Client Server Server assigns work units Task: f(x)

Early P2P I: Client Server Server assigns work units Result: f(x) 60 TeraFLOPS!

Early P2P II: Flooding on Overlays xyz.mp3 ? xyz.mp3 An overlay network. “Unstructured”.

Early P2P II: Flooding on Overlays xyz.mp3 ? xyz.mp3 Flooding

Early P2P II: Flooding on Overlays xyz.mp3 ? xyz.mp3 Flooding

Early P2P II: Flooding on Overlays xyz.mp3

Early P2P II.v: “Ultrapeers” Ultrapeers can be installed (KaZaA) or self-promoted (Gnutella)

What is a DHT? Hash Table data structure that maps “keys” to “values” essential building block in software systems Distributed Hash Table (DHT) similar, but spread across the Internet Interface insert(key, value) lookup(key)

How? Every DHT node supports a single operation: Given key as input; route messages toward node holding key

K V DHT in action

K V DHT in action

K V DHT in action Operation: take key as input; route messages to node holding key

K V DHT in action: put() insert(K 1,V 1 ) Operation: take key as input; route messages to node holding key

K V DHT in action: put() Operation: take key as input; route messages to node holding key insert(K 1,V 1 )

(K 1,V 1 ) K V DHT in action: put() Operation: take key as input; route messages to node holding key

retrieve (K 1 ) K V DHT in action: get() Operation: take key as input; route messages to node holding key

retrieve (K 1 ) K V Iterative vs. Recursive Routing Operation: take key as input; route messages to node holding key Previously showed recursive. Another option: iterative

DHT Design Goals An “overlay” network with: Flexible mapping of keys to physical nodes Small network diameter Small degree (fanout) Local routing decisions Robustness to churn Routing flexibility Not considered here: Robustness (erasure codes, replication) Security, privacy

An Example DHT: Chord Assume n = 2 m nodes for a moment A “complete” Chord ring We’ll generalize shortly

An Example DHT: Chord

Overlayed 2 k -Gons

Routing in Chord At most one of each Gon E.g. 1-to-0

Routing in Chord At most one of each Gon E.g. 1-to-0

Routing in Chord At most one of each Gon E.g. 1-to-0

Routing in Chord At most one of each Gon E.g. 1-to-0

Routing in Chord At most one of each Gon E.g. 1-to-0

Routing in Chord At most one of each Gon E.g. 1-to-0 What happened? We constructed the binary number 15! Routing from x to y is like computing y - x mod n by summing powers of Diameter: log n (1 hop per gon type) Degree: log n (one outlink per gon type)

Joining the Chord Ring Need IP of some node Pick a random ID (e.g. SHA-1(IP)) Send msg to current owner of that ID That’s your predecessor

Joining the Chord Ring Need IP of some node Pick a random ID (e.g. SHA- 1(IP)) Send msg to current owner of that ID That’s your predecessor Update pred/succ links Once the ring is in place, all is well! Inform app to move data appropriately Search to install “fingers” of varying powers of 2 Or just copy from pred/succ and check! Inbound fingers fixed lazily

Conclusions DHT implements log n lookup, insert, Log2 n newNode, DeleteNode Next part: P2P database systems