T-110.5140 Network Application Frameworks and XML: Distributed Hash Tables (DHTs), 19.2.2008, Sasu Tarkoma.


T-110.5140 Network Application Frameworks and XML: Distributed Hash Tables (DHTs). Sasu Tarkoma

Contents
- Motivation
- Terminology
- Overlays
  - Clusters and wide-area
- Applications for DHTs
  - Internet Indirection Infrastructure (i3)
  - Delegation Oriented Architecture (DOA)
- Next week
  - Multi-homing, mobility, and the Host Identity Protocol

Motivation for Overlays
- Directories are needed
  - Name resolution and lookup
  - Mobility support
    - Fast updates
- Required properties
  - Fast updates
  - Scalability
- DNS has limitations
  - Update latency
  - Administration
- One solution: overlay networks
- Alternative: DynDNS
  - Still centralized management
  - Inflexible; what about reverse DNS?

Terminology
- Peer-to-peer (P2P)
  - Different from the client-server model
  - Each peer has both client and server features
- Overlay networks
  - Routing systems that run on top of another network, such as the Internet
- Distributed Hash Tables (DHTs)
  - An algorithm for creating efficient distributed hash tables (lookup structures)
  - Used to implement overlay networks
- Typical features of P2P / overlays
  - Scalability, resilience, high availability, and tolerance of frequent peer connections and disconnections

Peer-to-peer in more detail
- A P2P system is distributed
  - No centralized control
  - Nodes are symmetric in functionality
- A large fraction of the nodes is unreliable
  - Nodes come and go
- P2P was enabled by the evolution of data communications and technology
- Current challenges:
  - Security (zombie networks, trojans), IPR issues
- P2P systems are decentralized overlays

Evolution of P2P systems
- Started from centralized servers
  - Napster
    - Centralized directory
    - Single point of failure
- The second generation used flooding
  - Gnutella
    - Local directory at each peer
    - High cost, worst-case O(N) messages per lookup
- Research systems use DHTs
  - Chord, Tapestry, CAN, ...
  - Decentralization, scalability

Challenges
- Challenges in the wide-area:
  - Scalability: increasing numbers of users, requests, files, and traffic
  - Resilience: more components -> more failures
  - Management: intermittent resource availability -> complex management

Overlay Networks
- Origin in peer-to-peer (P2P)
- Builds upon Distributed Hash Tables (DHTs)
- Easy to deploy
  - No changes to routers or the TCP/IP stack
  - Typically at the application layer
- Overlay properties
  - Resilience
  - Fault tolerance
  - Scalability

[Figure: protocol layering with an overlay. Upper layers use DNS names and custom identifiers; the overlay layer (with congestion control and end-to-end functions) uses overlay addresses; the routing layer uses IP addresses and routing paths.]

DHT interfaces
- A DHT typically offers a small interface:
  - put(key, value)
  - get(key) -> value
  - delete(key)
- Supports a wide range of applications
  - Similar interface to UDP/IP:
    - send(IP address, data)
    - receive(IP address) -> data
- No restrictions are imposed on the semantics of values and keys
- An arbitrary data blob can be hashed to a key
- Key/value pairs are persistent and global
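To make the interface concrete, here is a minimal in-memory stand-in for the put/get/delete API, with a helper that hashes an arbitrary blob to a key. The class and names are illustrative, not a real DHT client; a real DHT distributes the store across nodes.

```python
import hashlib

class DHT:
    """Minimal local stand-in for the put/get/delete interface (illustrative only)."""
    def __init__(self):
        self._store = {}

    def put(self, key, value):
        self._store[key] = value

    def get(self, key):
        return self._store.get(key)

    def delete(self, key):
        self._store.pop(key, None)

def blob_to_key(data: bytes, m: int = 160) -> int:
    """Any data blob can be hashed to an m-bit key (SHA-1 gives 160 bits)."""
    return int.from_bytes(hashlib.sha1(data).digest(), "big") % (2 ** m)

dht = DHT()
key = blob_to_key(b"hello.txt")
dht.put(key, b"file contents")
assert dht.get(key) == b"file contents"
```

Note that the interface imposes no semantics on keys or values: the same put/get pair works for files, triggers, or name records.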

[Figure: distributed applications call put(key, value) and get(key) -> value on the Distributed Hash Table (DHT), which balances keys and data across the participating nodes.]

Some DHT applications
- File sharing
- Web caching
- Censorship-resistant data storage
- Event notification
- Naming systems
- Query and indexing
- Communication primitives
- Backup storage
- Web archiving

Cluster vs. Wide-area
- Clusters are:
  - single, secure, controlled administrative domains
  - engineered to avoid network partitions
  - built on low-latency, high-throughput SANs
  - predictable in behaviour, a controlled environment
- Wide-area:
  - Heterogeneous networks
  - Unpredictable delays and packet drops
  - Multiple administrative domains
  - Network partitions possible

Wide-area requirements
- Easy deployment
- Scalability to millions of nodes and billions of data items
- Availability
  - Copes with routine faults
- Self-configuring, adaptive to network changes
- Takes locality into account

Distributed Data Structures (DDS)
- DHTs are an example of a DDS
- DHT algorithms are available for both clusters and wide-area environments
  - They are different!
- Cluster-based solutions
  - Ninja
  - LH* and variants
- Wide-area solutions
  - Chord, Tapestry, ...
  - Flat DHTs; peers are equal
  - Each peer maintains a subset of the peers in its routing table

Distributed Data Structures (DDS)
- Ninja project (UCB)
  - A new storage layer for cluster services
  - Partitions a conventional data structure across the nodes in a cluster
  - Replicates partitions with replica groups in the cluster
    - Availability
  - Syncs replicas to disk (durability)
- Other DDSs for data / clusters
  - LH*: Linear Hashing for Distributed Files
  - Redundant versions for high availability

Cluster-based Distributed Hash Tables (DHTs)
- A directory for non-hierarchical data
- Several different ways to implement
- A distributed hash table
  - Each "brick" maintains a partial map
    - "local" keys and values
  - Overlay addresses are used to direct requests to the right "brick"
    - a "remote" key is mapped to its brick using hashing
- Resilience through parallel, unrelated mappings

[Figure: cluster DDS architecture. Clients interact with any service front-end; each service uses a DDS library that exposes the hash table API; the storage "bricks" (single-node, durable hash tables, replicated) are connected by a redundant, low-latency, high-throughput SAN.]

Chord
- Chord is an overlay algorithm from MIT
  - Stoica et al., SIGCOMM 2001
- Chord is a lookup structure (a directory)
  - Resembles binary search
- Uses consistent hashing to map keys to nodes
  - Keys are hashed to m-bit identifiers
  - Nodes have m-bit identifiers
    - The IP address is hashed
  - SHA-1 is used as the baseline algorithm
- Supports rapid joins and leaves
  - Churn
  - Maintains routing tables
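The identifier mapping can be sketched in a few lines. The helper name and the sample address are invented for illustration; the lecture's examples use m = 6, while a real deployment would use the full 160 bits of SHA-1.

```python
import hashlib

M = 6  # identifier bits; the lecture's examples use m = 6, real Chord uses m = 160

def chord_id(data: bytes, m: int = M) -> int:
    """Hash a node's IP address or a key to an m-bit Chord identifier (SHA-1)."""
    digest = hashlib.sha1(data).digest()
    return int.from_bytes(digest, "big") % (2 ** m)

node_id = chord_id(b"192.0.2.17:4000")  # a node hashes its IP address
key_id = chord_id(b"song.mp3")          # a key hashes the data (or its name)
assert 0 <= node_id < 2 ** M and 0 <= key_id < 2 ** M
```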

Chord routing I
- A node has a well-determined place within the ring
- Identifiers are ordered on an identifier circle modulo 2^m
  - The Chord ring with m-bit identifiers
- A node has a predecessor and a successor
- A node stores the keys between its predecessor and itself
  - A (key, value) pair is stored on the successor node of the key
- A routing table (finger table) keeps track of other nodes
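The successor rule can be sketched directly, here using the node identifiers from the lecture's m = 6 example ring:

```python
def successor(key: int, node_ids: list[int], m: int = 6) -> int:
    """First node at or after `key` on the identifier circle modulo 2^m."""
    key %= 2 ** m
    ring = sorted(node_ids)
    for n in ring:
        if n >= key:
            return n
    return ring[0]  # wrap around the circle

# Node identifiers from the lecture's m = 6 example ring:
nodes = [1, 8, 14, 21, 32, 38, 42, 51, 56]
assert successor(10, nodes) == 14  # the (key, value) pair for key 10 lives on N14
assert successor(60, nodes) == 1   # keys past N56 wrap around to N1
```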

Finger Table
- Each node maintains a routing table with at most m entries
- The i-th entry of the table at node n contains the identity of the first node, s, that succeeds n by at least 2^(i-1) on the identifier circle
- s = successor(n + 2^(i-1))
  - This is the i-th finger of node n
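The finger-table rule can be checked against the lecture's example ring (m = 6, nodes N1 ... N56). The helper below computes the table centrally and is a sketch, not Chord's distributed algorithm:

```python
def finger_table(n: int, node_ids: list[int], m: int = 6) -> list[int]:
    """Finger i (i = 1..m) of node n is successor((n + 2^(i-1)) mod 2^m)."""
    ring = sorted(node_ids)

    def successor(key: int) -> int:
        for node in ring:
            if node >= key:
                return node
        return ring[0]  # wrap around the circle

    return [successor((n + 2 ** (i - 1)) % (2 ** m)) for i in range(1, m + 1)]

nodes = [1, 8, 14, 21, 32, 38, 42, 51, 56]
# N8's fingers target identifiers 9, 10, 12, 16, 24, 40, which resolve to:
assert finger_table(8, nodes) == [14, 14, 14, 21, 32, 42]
```

This matches the figure on the next slide: N8's fingers x+1, x+2, x+4 all map to N14, x+8 to N21, x+16 to N32, and x+32 to N42.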

[Figure: a Chord ring with m = 6 and nodes N1, N8, N14, N21, N32, N38, N42, N51, N56 on the identifier circle 0 ... 2^m - 1. For a node p, finger j (j = 1, ..., m) maps to successor(p + 2^(j-1)); for N8, fingers x+1, x+2, and x+4 map to N14, x+8 to N21, x+16 to N32, and x+32 to N42. The predecessor node is also marked.]

Chord routing II
- Routing steps:
  - Check whether the key k falls between n and the successor of n
  - If not, forward the request to the closest finger preceding k
- Each node knows a lot about nearby nodes and less about nodes farther away
- The target node is eventually found
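The routing loop can be simulated locally on the example ring. This sketch computes every node's fingers centrally, whereas real Chord nodes only know their own tables:

```python
M = 6
NODES = sorted([1, 8, 14, 21, 32, 38, 42, 51, 56])  # the lecture's example ring

def succ(key):
    """First node at or after key on the circle (mod 2^M)."""
    key %= 2 ** M
    for v in NODES:
        if v >= key:
            return v
    return NODES[0]

def in_interval(x, a, b):
    """True if x lies in the half-open circular interval (a, b]."""
    return (a < x <= b) if a < b else (x > a or x <= b)

def fingers(n):
    return [succ((n + 2 ** (i - 1)) % 2 ** M) for i in range(1, M + 1)]

def lookup(n, key):
    """Trace of the nodes visited while routing a lookup for key from node n."""
    path = [n]
    # stop when key falls between n and n's successor
    while not in_interval(key, n, succ(n + 1)):
        preceding = [f for f in fingers(n) if in_interval(f, n, key)]
        # forward to the closest finger preceding key (fall back to the successor)
        n = max(preceding, key=lambda f: (f - n) % 2 ** M) if preceding else succ(n + 1)
        path.append(n)
    path.append(succ(n + 1))
    return path

assert lookup(8, 54) == [8, 42, 51, 56]  # key 54 is stored on N56
```

Each hop roughly halves the remaining distance on the circle, which is why lookups take O(log N) messages.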

[Figure: a Chord lookup on the m = 6 ring (nodes N1, N8, N14, N21, N32, N38, N42, N51, N56, identifiers 0 ... 2^m - 1): the query is forwarded along fingers until it reaches the successor of the key.]

Invariants
- Two invariants:
  - Each node's successor is correctly maintained.
  - For every key k, node successor(k) is responsible for k.
- A node stores the keys between its predecessor and itself
  - A (key, value) pair is stored on the successor node of the key

Join
- A new node n joins
- It needs to know an existing node n'
- Three steps:
  - 1. Initialize the predecessor and fingers of node n
  - 2. Update the fingers and predecessors of existing nodes to reflect the addition of n
  - 3. Notify the higher-layer software and transfer keys
- Leave uses steps 2 (update for the removal) and 3 (relocate keys)

1. Initialize routing information
- Initialize the predecessor and fingers of the new node n
- n asks n' to look up its predecessor and fingers
  - One predecessor and m fingers
- Looking up the predecessor
  - Requires O(log N) time, one lookup
- Looking up each finger (at most m fingers)
  - O(log N) each, giving log N * log N
  - O(log^2 N) time in total

Steps 2 and 3
- 2. Update the fingers of existing nodes
  - Existing nodes must be updated to reflect the new node (and any keys that are transferred to it)
  - Performed counter-clockwise on the circle
    - The algorithm takes the i-th finger of n and walks in the counter-clockwise direction until it encounters a node whose i-th finger precedes n
    - Node n will become the i-th finger of this node
  - O(log^2 N) time
- 3. Transfer keys
  - Keys are transferred only from the node immediately following n

Chord Properties
- Each node is responsible for K/N keys (K is the number of keys, N is the number of nodes)
- When a node joins or leaves the network, only O(K/N) keys are relocated
- Lookups take O(log N) messages
- Re-establishing the routing invariants after a join/leave takes O(log^2 N) messages
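A quick back-of-the-envelope check of these bounds, with illustrative numbers (1,000 nodes and 1,000,000 keys, not figures from the slides):

```python
import math

# Illustrative sizes, not from the slides:
N = 1_000          # nodes
K = 1_000_000      # keys

keys_per_node = K / N               # each node is responsible for ~K/N keys
relocated_on_join = K / N           # O(K/N) keys move when a node joins/leaves
lookup_msgs = math.log2(N)          # O(log N) messages per lookup
stabilize_msgs = math.log2(N) ** 2  # O(log^2 N) messages to restore invariants

assert keys_per_node == 1000
assert round(lookup_msgs) == 10     # ~10 hops for 1,000 nodes
assert round(stabilize_msgs) == 99  # ~100 messages after a join/leave
```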

Tapestry
- A DHT developed at UCB
  - Zhao et al., UC Berkeley TR 2001
- Used in OceanStore
  - A secure, wide-area storage service
- Tree-like geometry
- Suffix-based hypercube
  - 160-bit identifiers
- Suffix routing from A to B
  - hop h shares a suffix of length h digits with B
- Tapestry core API:
  - publishObject(ObjectID, [serverID])
  - routeMsgToObject(ObjectID)
  - routeMsgToNode(NodeID)

Suffix routing
- Example: 0312 routes to 1643 through intermediate hops:
  - 1st hop: shares a 1-digit suffix with 1643
  - 2nd hop: shares a 2-digit suffix with 1643
  - 3rd hop: shares a 3-digit suffix with 1643
- Routing table with b * log_b(N) entries
- Entry (i, j): pointer to a neighbour whose identifier ends with digit j followed by the shared (i-1)-digit suffix
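Suffix matching is easy to state in code. The intermediate identifiers below are hypothetical examples, chosen so that each hop extends the suffix shared with the target 1643 by one digit:

```python
def suffix_match(a: str, b: str) -> int:
    """Number of trailing digits two identifiers share."""
    n = 0
    while n < min(len(a), len(b)) and a[-1 - n] == b[-1 - n]:
        n += 1
    return n

def next_hop_candidates(current: str, target: str, nodes: list[str]) -> list[str]:
    """Nodes that extend the suffix shared with the target by at least one digit."""
    have = suffix_match(current, target)
    return [n for n in nodes if suffix_match(n, target) > have]

# Hypothetical 4-digit, base-10 identifiers: each hop fixes one more trailing
# digit of the target 1643.
assert suffix_match("0312", "1643") == 0
assert suffix_match("2713", "1643") == 1
assert suffix_match("9043", "1643") == 2
assert suffix_match("2643", "1643") == 3
```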

Pastry I
- A DHT based on a circular, flat identifier space
- Prefix routing
  - A message is sent towards the node that is numerically closest to the target node
  - The procedure is repeated until the node is found
  - Prefix match: the number of identical digits before the first differing digit
  - The prefix match increases with every hop
- Similar performance to Chord

Pastry II
- Routing a message:
  - If the key is covered by the leaf set --> deliver locally to the numerically closest node
  - Else send to the identifier in the routing table with the longest common prefix (longer than the current node's)
  - Else query the leaf set for a numerically closer node with the same prefix match as the current node
- Used in event notification (Scribe)
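The three routing rules can be sketched as follows. The identifiers are hypothetical base-10 examples, and the functions simplify Pastry's actual leaf-set and routing-table structures:

```python
def prefix_match(a: str, b: str) -> int:
    """Number of identical digits before the first differing digit."""
    n = 0
    while n < min(len(a), len(b)) and a[n] == b[n]:
        n += 1
    return n

def route(current: str, key: str, leaf_set: list[str], table: list[str]) -> str:
    """One forwarding decision, following the slide's three rules (simplified)."""
    # 1. If the key falls within the leaf set's range, deliver to the
    #    numerically closest leaf.
    if int(min(leaf_set, key=int)) <= int(key) <= int(max(leaf_set, key=int)):
        return min(leaf_set, key=lambda n: abs(int(n) - int(key)))
    # 2. Otherwise use a routing-table entry with a strictly longer common
    #    prefix than the current node has.
    have = prefix_match(current, key)
    better = [n for n in table if prefix_match(n, key) > have]
    if better:
        return max(better, key=lambda n: prefix_match(n, key))
    # 3. Otherwise fall back to a known node with at least the same prefix
    #    match that is numerically closer to the key.
    same = [n for n in leaf_set + table if prefix_match(n, key) >= have]
    return min(same, key=lambda n: abs(int(n) - int(key)))

# Hypothetical base-10, 4-digit identifiers: 1290 shares 3 prefix digits with 1299.
assert route("1234", "1299", leaf_set=["1230", "1240"], table=["1290", "2000"]) == "1290"
```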

Security Considerations
- Malicious nodes
  - An attacker floods the DHT with data
  - An attacker returns incorrect data
    - Defense: self-authenticating data
  - An attacker denies that data exists or supplies incorrect routing information
- Basic solution: redundancy
  - k-redundant networks
- What if attackers have a quorum?
  - Need a way to control the creation of node ids
  - Solution: secure node identifiers
    - Use public keys

Applications for DHTs
- DHTs are used as a basic building block for application-level infrastructure
  - Internet Indirection Infrastructure (i3)
    - A new forwarding infrastructure based on Chord
  - DOA (Delegation Oriented Architecture)
    - A new naming and addressing infrastructure based on overlays

Internet Indirection Infrastructure (i3)
- A DHT-based overlay network
  - Based on Chord
- Aims to provide a more flexible communication model than current IP addressing
- Also a forwarding infrastructure:
  - i3 packets are sent to identifiers
  - Each identifier is routed to the i3 node responsible for it
  - That node maintains triggers, which are installed by receivers
  - When a matching trigger is found, the packet is forwarded to the receiver
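The trigger mechanism can be sketched as a table kept at the responsible i3 node. The names here are illustrative, not the i3 API:

```python
# Sketch of the i3 rendezvous abstraction: receivers install (id, addr)
# triggers, and senders send packets to ids, never to addresses directly.
class I3Node:
    def __init__(self):
        self.triggers = {}  # identifier -> set of receiver addresses

    def insert_trigger(self, ident, addr):
        self.triggers.setdefault(ident, set()).add(addr)

    def remove_trigger(self, ident, addr):
        self.triggers.get(ident, set()).discard(addr)

    def send(self, ident, data):
        """Forward data to every receiver whose trigger matches the identifier."""
        return [(addr, data) for addr in sorted(self.triggers.get(ident, ()))]

node = I3Node()
node.insert_trigger("id42", "R1")
assert node.send("id42", "pkt") == [("R1", "pkt")]
# Mobility: the receiver moves from R1 to R2 and updates its trigger;
# the sender keeps sending to "id42" and notices nothing.
node.remove_trigger("id42", "R1")
node.insert_trigger("id42", "R2")
assert node.send("id42", "pkt") == [("R2", "pkt")]
```

The same indirection point supports multicast (several triggers on one identifier) and anycast (matching rules over identifiers), as the following figures show.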

i3 II
- An i3 identifier may be bound to a host, object, or session
- i3 has been extended with ROAM
  - Robust Overlay Architecture for Mobility
  - Allows end hosts to control the placement of rendezvous points (indirection points) for efficient routing and handovers
  - Legacy application support
    - A user-level proxy encapsulates IP packets into i3 packets

[Figure: mobility in i3. Receiver R inserts a trigger (id, R) and receives all packets sent to identifier id. When the host changes its address from R1 to R2, it updates its trigger from (id, R1) to (id, R2). Mobility is transparent to the sender.]

[Figure: a multicast tree built using a hierarchy of triggers.]

[Figure: anycast using the longest matching prefix rule.]

[Figure: sender-driven and receiver-driven service composition, each using a stack of identifiers.]

Layered Naming Architecture
- Presented in the paper:
  - A Layered Naming Architecture for the Internet, Balakrishnan et al., SIGCOMM 2004
- Service Identifiers (SIDs) are host-independent data names
- End-point Identifiers (EIDs) are location-independent host names
- Protocols bind to names and resolve them
  - Applications use SIDs as handles
- SIDs and EIDs should be flat
  - Stable-name principle: a stable name should not impose restrictions on the entity it names
- Inspiration: HIP + i3 + Semantic-Free Referencing
- Prototype: Delegation Oriented Architecture (DOA)
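The two-step resolution (SID to EID, then EID to IP) can be sketched with two lookup tables. The identifiers and mappings below are invented for the example; in the architecture each step would be a resolution service such as a DHT:

```python
# Illustrative two-step resolution: SID -> EID (service to host),
# then EID -> IP (host to location). All names are invented examples.
sid_to_eid = {"sid:photo-service": "eid:af12"}
eid_to_ip = {"eid:af12": "192.0.2.7"}

def resolve(sid: str) -> str:
    eid = sid_to_eid[sid]  # SID resolution (e.g. via a DHT)
    return eid_to_ip[eid]  # EID resolution (e.g. via a DHT)

assert resolve("sid:photo-service") == "192.0.2.7"

# If the host moves, only the EID -> IP mapping changes; the SID, which
# applications hold as a handle, stays stable.
eid_to_ip["eid:af12"] = "198.51.100.9"
assert resolve("sid:photo-service") == "198.51.100.9"
```

This illustrates the stable-name principle: flat SIDs and EIDs impose no restrictions on where the named service or host lives.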

[Figure: the layered naming stack beside the protocol stack. User-level descriptors (e.g. search queries) are resolved to SIDs; the application session binds to an EID and uses the SID as a handle; SIDs are resolved to EIDs, and EIDs are resolved to IP addresses. The resulting packet carries the IP header, EID, TCP, and SID.]

DOA cont.
- The DOA header is located between the IP and TCP headers and carries the source and destination EIDs
- The mapping service maps EIDs to IP addresses
- This allows the introduction of various middleboxes into the routing of packets
  - Service chains, end-point chains
- Outsourcing of intermediaries
  - Ideally, clients may select the most useful network elements to use
- Differences to i3:
  - Not a forwarding infrastructure

OpenDHT
- A publicly accessible distributed hash table (DHT) service
- OpenDHT runs on a collection of nodes on PlanetLab
- Clients do not need to participate as DHT nodes
- Built on the Bamboo DHT
- A test-bed infrastructure
  - Open infrastructure for puts and gets
  - Organizing clients handle application upcalls
- Reference: OpenDHT: A Public DHT Service and Its Uses. Sean Rhea, Brighten Godfrey, Brad Karp, John Kubiatowicz, Sylvia Ratnasamy, Scott Shenker, Ion Stoica, and Harlan Yu. Proceedings of ACM SIGCOMM 2005, August 2005.

Summary
- Mobility and multi-homing require directories
  - Scalability, low-latency updates
- Overlay networks have been proposed
  - Searching, storing, routing, notification, ...
  - Lookup (Chord, Tapestry, Pastry), coordination primitives (i3), middlebox support (DOA)
  - Logarithmic scalability, decentralized, ...
- Many applications for overlays
  - Lookup, rendezvous, data distribution and dissemination, coordination, service composition, general indirection support
- Deployment remains an open question; PlanetLab serves as a test bed.

Discussion Points
- Why do the most popular P2P programs use simple algorithms?
- How secure should P2P systems / overlays be?
- Will DHTs eventually be deployed in commercial software?
- Will DHTs be part of the future core Internet?