What is a P2P system?
A distributed system architecture:
- No centralized control
- Nodes are symmetric in function
- Large number of unreliable nodes
- Enabled by technology improvements
[Figure: nodes connected to each other through the Internet]

How to build critical services?
Many critical services use the Internet:
- Hospitals, government agencies, etc.
These services need to be robust against:
- Node and communication failures
- Load fluctuations (e.g., flash crowds)
- Attacks (including DDoS)

The promise of P2P computing
- Reliability: no central point of failure
  - Many replicas
  - Geographic distribution
- High capacity through parallelism:
  - Many disks
  - Many network connections
  - Many CPUs
- Automatic configuration
- Useful in public and proprietary settings

Traditional distributed computing: client/server
- Successful architecture, and will continue to be so
- Tremendous engineering necessary to make server farms scalable and robust
[Figure: clients connecting to a central server over the Internet]

The abstraction: Distributed hash table (DHT)
[Figure: layered architecture]
- Distributed application (file sharing): calls put(key, data) and get(key) -> data
- Distributed hash table (DHash): distributes data storage over many nodes
- Lookup service (Chord): lookup(key) -> node IP address
- The application itself may be distributed over many nodes

A DHT has a good interface
- put(key, value) and get(key) -> value
- Simple interface!
- API supports a wide range of applications
  - DHT imposes no structure/meaning on keys
- Key/value pairs are persistent and global
  - Can store keys in other DHT values
  - And thus build complex data structures
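
A minimal sketch of how an application might drive this interface, assuming a hypothetical `dht` object that exposes exactly `put(key, value)` and `get(key)`; storing the keys of some values inside another value is how larger structures get built:

```python
import hashlib
import json

def sha1(data: bytes) -> bytes:
    return hashlib.sha1(data).digest()

def publish_file(dht, chunks: list[bytes]) -> bytes:
    """Store each chunk under its content hash, then a directory of chunk keys."""
    chunk_keys = []
    for chunk in chunks:
        key = sha1(chunk)
        dht.put(key, chunk)           # the DHT imposes no meaning on the key
        chunk_keys.append(key.hex())
    directory = json.dumps(chunk_keys).encode()
    root_key = sha1(directory)
    dht.put(root_key, directory)      # a value that contains other keys
    return root_key                   # hand out only this key

def fetch_file(dht, root_key: bytes) -> bytes:
    chunk_keys = json.loads(dht.get(root_key))
    return b"".join(dht.get(bytes.fromhex(k)) for k in chunk_keys)
```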

A DHT makes a good shared infrastructure
- Many applications can share one DHT service
  - Much as applications share the Internet
- Eases deployment of new applications
- Pools resources from many participants
  - Efficient due to statistical multiplexing
  - Fault-tolerant due to geographic distribution

Recent DHT-based projects
- File sharing [CFS, OceanStore, PAST, Ivy, ...]
- Web cache [Squirrel, ...]
- Archival/backup store [HiveNet, Mojo, Pastiche]
- Censor-resistant stores [Eternity, FreeNet, ...]
- DB query and indexing [PIER, ...]
- Event notification [Scribe]
- Naming systems [ChordDNS, Twine, ...]
- Communication primitives [I3, ...]
Common thread: data is location-independent

Roadmap
- One application: CFS/DHash
- One structured overlay: Chord
- Alternatives:
  - Other solutions
  - Geometry and performance
  - The interface
  - Applications

CFS: Cooperative file sharing
- DHT used as a robust block store
- Client of the DHT implements the file system
  - Read-only: CFS, PAST
  - Read-write: OceanStore, Ivy
[Figure: file system layered on the DHT; put(key, block) and get(key) -> block]

CFS Design

File representation: self-authenticating data
- Signed blocks:
  - Root blocks; Chord ID = H(publisher's public key)
- Unsigned blocks:
  - Directory blocks, inode blocks, data blocks; Chord ID = H(block contents)
[Figure: a file tree of root block, directory blocks, i-node block, and data blocks, each referenced by its SHA-1 key]
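
A rough sketch of the two naming rules, assuming SHA-1 as on the slide; the signature on the root block is elided, since its exact format is CFS-specific:

```python
import hashlib

def content_block_id(block: bytes) -> bytes:
    # Unsigned blocks (directory, inode, data): Chord ID = H(block contents),
    # so any node or client can verify a fetched block against its ID.
    return hashlib.sha1(block).digest()

def root_block_id(publisher_public_key: bytes) -> bytes:
    # Signed root block: Chord ID = H(publisher's public key), so the
    # publisher can replace the root's contents without changing its ID.
    return hashlib.sha1(publisher_public_key).digest()

def verify_content_block(block_id: bytes, block: bytes) -> bool:
    return hashlib.sha1(block).digest() == block_id
```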

DHT distributes blocks by hashing IDs
[Figure: signed and unsigned blocks spread across nodes A through D according to their IDs]
- DHT replicates blocks for fault tolerance
- DHT caches popular blocks for load balance

DHT implementation challenges
1. Scalable lookup
2. Balance load (flash crowds)
3. Handling failures
4. Coping with systems in flux
5. Network-awareness for performance
6. Robustness with untrusted participants
7. Programming abstraction
8. Heterogeneity
9. Anonymity
10. Indexing
Goal: simple, provably-good algorithms

1. The lookup problem
[Figure: a publisher calls put(key=SHA-1(data), value=data); a client elsewhere calls get(key=SHA-1(data)); which of nodes N1 through N6 holds the key?]
- get() is a lookup followed by a check
- put() is a lookup followed by a store
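
A sketch of that decomposition, assuming a `lookup(key)` routine (the subject of the rest of the talk) that returns a handle to the responsible node with hypothetical `store`/`fetch` methods:

```python
import hashlib

def put(lookup, data: bytes) -> bytes:
    key = hashlib.sha1(data).digest()
    node = lookup(key)                 # the hard part: find who is responsible
    node.store(key, data)
    return key

def get(lookup, key: bytes) -> bytes:
    node = lookup(key)
    data = node.fetch(key)
    if hashlib.sha1(data).digest() != key:   # the "check": data is self-certifying
        raise ValueError("node returned data that does not match the key")
    return data
```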

Centralized lookup (Napster)
[Figure: N4 registers SetLoc("title", N4) with a central DB; the client asks the DB Lookup("title") and then fetches the file from N4]
- Simple, but O(N) state and a single point of failure

Flooded queries (Gnutella)
[Figure: the client floods Lookup("title") to its neighbors, which forward it until the node holding the file ("title" -> MP3 data) is reached]
- Robust, but worst case O(N) messages per lookup

Algorithms based on routing
- Map keys to nodes in a load-balanced way
  - Hash keys and nodes into a string of digits
  - Assign each key to the "closest" node
- Examples: CAN, Chord, Kademlia, Pastry, Tapestry, Viceroy, ...
- Forward a lookup for a key to a closer node
- Join: insert the node in the ring
[Figure: circular ID space with nodes N32, N60, N90, N105 and keys K5, K20, K80]
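
A minimal sketch of Chord-style assignment on the circular ID space (other systems use prefix or XOR distance instead); node and key IDs come from SHA-1, and a key belongs to the first node at or after it on the ring:

```python
import hashlib
from bisect import bisect_left

M = 2 ** 160                      # size of the SHA-1 ID space

def ident(name: str) -> int:
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big")

def successor(key: int, ring: list[int]) -> int:
    """First node clockwise from key; ring is the sorted list of node IDs."""
    i = bisect_left(ring, key % M)
    return ring[i % len(ring)]    # wrap around past the largest ID

ring = sorted(ident(a) for a in ["10.0.0.1:4000", "10.0.0.2:4000", "10.0.0.3:4000"])
owner = successor(ident("block-42"), ring)
```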

Chord’s routing table: fingers
[Figure: node N80's fingers point 1/2, 1/4, 1/8, 1/16, 1/32, 1/64, and 1/128 of the way around the ring]
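
Concretely, finger i of node n points at the node responsible for the ID 2^i past n, so the last fingers land halfway, a quarter of the way, and so on around the ring; a sketch reusing successor() from the previous example:

```python
def finger_table(n: int, ring: list[int], m: int = 160) -> list[int]:
    # finger[i] = successor(n + 2^i) for i = 0 .. m-1; finger[m-1] is the node
    # halfway around the ring, finger[m-2] a quarter of the way, and so on.
    return [successor((n + 2 ** i) % M, ring) for i in range(m)]
```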

Lookups take O(log N) hops
[Figure: Chord ring with nodes N5 through N110; Lookup(K19) follows fingers around the ring toward K19]
- Lookup: route to the closest predecessor
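
A sketch of the routing rule, again reusing the helpers above; each hop moves to the finger that most closely precedes the key, which cuts the remaining ID distance roughly in half and yields the O(log N) bound:

```python
def between(x: int, a: int, b: int) -> bool:
    """True if x lies strictly on the clockwise arc from a to b."""
    return 0 < (x - a) % M < (b - a) % M

def closest_preceding_finger(n: int, fingers: list[int], key: int) -> int:
    for f in reversed(fingers):        # farthest fingers first
        if between(f, n, key):
            return f
    return n                           # nothing precedes key: n is the predecessor

def route(start: int, key: int, fingers_of: dict[int, list[int]], ring: list[int]) -> int:
    n = start
    while True:
        nxt = closest_preceding_finger(n, fingers_of[n], key)
        if nxt == n:                   # n precedes the key, so n's successor owns it
            return successor(key, ring)
        n = nxt
```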

Can we do better?
- Caching
- Exploit flexibility at the geometry level
- Iterative vs. recursive lookups

2. Balance load
[Figure: the same ring; copies of K19 are cached along the lookup path]
- Hash function balances keys over nodes
- For popular keys, cache along the path

Why Caching Works Well
- Only O(log N) nodes have fingers pointing to N20
- This limits the single-block load on N20

3. Handling failures: redundancy
[Figure: K19's replicas stored on the nodes that follow it on the ring]
- Each node knows the IP addresses of the next r nodes
- Each key is replicated at the next r nodes
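
A sketch of the replica placement rule with the same ring helpers as above; r is the replication factor (the length of each node's successor list):

```python
def replica_nodes(key: int, ring: list[int], r: int = 3) -> list[int]:
    """The r nodes that immediately succeed the key hold its replicas."""
    i = bisect_left(ring, key % M)
    return [ring[(i + j) % len(ring)] for j in range(min(r, len(ring)))]
```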

Lookups find replicas
[Figure: Lookup(BlockID=17) routed around the ring; Block 17 is fetched from a live replica]
RPCs:
1. Lookup step
2. Get successor list
3. Failed block fetch
4. Block fetch

First Live Successor Manages Replicas
[Figure: ring with Block 17 and a copy of it on successive nodes]
- A node can locally determine that it is the first live successor

4. Systems in flux
- Lookup takes O(log N) hops
  - If the system is stable
  - But the system is never stable!
- What we desire are theorems of the type:
  1. In the almost-ideal state, lookups take O(log N) hops
  2. The system maintains the almost-ideal state as nodes join and fail

Half-life [Liben-Nowell 2002]
- Doubling time: time for N new nodes to join
- Halving time: time for N/2 of the old nodes to fail
- Half-life: min(doubling time, halving time)
[Figure: a network of N nodes; N new nodes join while N/2 old nodes leave]
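
For example (illustrative numbers, not from the talk): in a network of N = 1,000 nodes where about 1,000 new nodes join every 6 hours and about 500 of the original nodes fail every 2 hours, the doubling time is 6 hours, the halving time is 2 hours, and the half-life is min(6, 2) = 2 hours.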

Applying half-life
- For any node u in any P2P network: if u wishes to stay connected with high probability, then, on average, u must be notified about Ω(log N) new nodes per half-life
- And so on, ...

5. Optimize routing to reduce latency
- Nodes close on the ring may be far away in the Internet
- Goal: put nodes in the routing table that result in few hops and low latency
[Figure: nodes N20, N40, N41, N80 adjacent on the ring but distant in the underlying network]

“Close” metric impacts choice of nearby nodes
- Chord's numerical closeness and (original) routing table restrict choice
- Should new nodes be able to choose their own IDs?
- Other metrics allow for more choice (e.g., prefix-based, XOR)
[Figure: nodes N06 through N105 spread across the USA, Europe, and the Far East; key K104]

6. Malicious participants
- Attacker denies service
  - Floods the DHT with data
- Attacker returns incorrect data [detectable]
  - Self-authenticating data
- Attacker denies data exists [liveness]
  - Bad node is responsible, but says no
- Bad node supplies incorrect routing info
  - Bad nodes make a bad ring, and a good node joins it
Basic approach: use redundancy

Sybil attack [Douceur 02]
- Attacker creates multiple identities
- Attacker controls enough nodes to foil the redundancy
[Figure: attacker-controlled identities occupying much of the ring]
Need a way to control creation of node IDs

One solution: secure node IDs
- Every node has a public key
- A certificate authority signs the public keys of good nodes
- Every node signs and verifies messages
- Quotas per publisher

Another solution: exploit practical Byzantine protocols
- A core set of servers is pre-configured with keys and performs admission control [OceanStore]
- The servers achieve consensus with a practical Byzantine recovery protocol [Castro and Liskov ’99 and ’00]
- The servers serialize updates [OceanStore] or assign secure node IDs [Configuration service]
[Figure: a small core of servers admitting nodes N06 through N105 into the ring]

A more decentralized solution: weak secure node IDs
- ID = SHA-1(node's IP address)
- Assumption: attacker controls a limited number of IP addresses
- Before using a node, challenge it to verify its ID
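
A simplified sketch of that check; in practice the challenge also has to prove that the node can actually receive traffic at the claimed address, which is elided here:

```python
import hashlib

def expected_node_id(ip_address: str) -> bytes:
    # Weakly secure ID: derived from the node's IP address, so an attacker
    # needs to control many addresses to obtain many IDs.
    return hashlib.sha1(ip_address.encode()).digest()

def id_matches_address(claimed_id: bytes, address_seen_on_wire: str) -> bool:
    return claimed_id == expected_node_id(address_seen_on_wire)
```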

Using weak secure node IDs
- Detect malicious nodes
  - Define verifiable system properties
    - Each node has a successor
    - Data is stored at its successor
  - Allow the querier to observe lookup progress
    - Each hop should bring the query closer
  - Cross-check routing tables with random queries
- Recovery: assume a limited number of bad nodes
  - Quota per node ID

Summary