LH* RS P2P : A Scalable Distributed Data Structure for P2P Environment W. LITWIN CERIA Laboratory H.YAKOUBEN Paris Dauphine University

Presentation transcript:

LH* RS P2P : A Scalable Distributed Data Structure for P2P Environment
W. LITWIN, H. YAKOUBEN: CERIA Laboratory, Paris Dauphine University
T. SCHWARZ: Santa Clara University (USA)

Plan
- Objective
- Overview: SDDS & P2P
- LH* RS P2P
- Architecture
- Addressing
- Properties
- Churn Management
- Conclusion

Objective
- Design a new SDDS for a structured P2P environment.
- A highly available data structure that handles churn.
- At most one forwarding message for a key search, insert or scan (fastest known performance).
- Very large scalable files.

SDDS (1993)
- A file of records identified by keys.
- SDDS client nodes face the applications and send queries to SDDS server nodes.
- No centralized addressing.
- Servers contain application or parity data, in buckets.
- Overflowing servers split onto new servers.
- Servers do not notify clients about splits.

SDDS (1993)
- Clients use images of the file state for addressing: key-based queries, range queries, scans, …
- Images get adjusted towards the file state during queries by Image Adjustment Messages (IAMs), triggered by incorrect addressing by the client.
- IAMs reflect the file evolution by splits or, rarely, merges.
- IAMs also reflect location changes caused by failures and recovery.

SDDS Typology
[Figure: typology tree of data structures. Labels: Classics; SDDS (1993); Hash; Tree; 1-dimensional; d-dimensional; 1-d Tree; m-d Tree; LH*, DDH, EH*, IH*…; RP*, k-RP*, SD-Rtree, DRT*; High Availability; k-Availability; Security; LH* m, LH* g, LH* s, LH* sa, LH* rs; Alg. Sign…; Structured P2P Schemes: CHORD, BATON, VBI-Tree, ...]

SDDS Expansion
[Figure: clients and peers. The file grows through splits under inserts, and a new peer is added.]

SDDS Client Image Evolution
[Figure: clients and servers exchanging image adjustment messages.]

SDDS 2007 Prototype
- Available at the CERIA site; announced at DbWorld.
- Manages LH* RS and RP* files: in distributed RAM, under Windows, over 1 Gbit/s Ethernet.
- Various functions.
- Response time reaching 30 microseconds, up to 300 times faster than disk files.

P2P (1995 ?)
- Autonomous nodes store and search data.
- By flooding in early systems: Freenet, Napster, Gnutella…
- Structured P2P schemes reduce the flooding by using decentralized data structures.
- Distributed Hash Tables (DHT) especially. Few people know that the concept is due to B. Devine (FODO 93). Examples: Chord, P-tree, VBI, Baton…
- Structured P2P schemes are specific SDDS schemes.

LH* RS P2P Addressing
Hash functions: h_i(C) = C mod 2^i
Global Addressing Rule:
  a ← h_i(C) ; /* a is the address of the peer destination of the key C */
  if a < n then a ← h_{i+1}(C) ;
  /* (i, n) is the state of the SDDS file; it is known only to the file coordinator node */
Client Address Calculus:
  a' ← h_{i'}(C) ; /* a' is the address of the peer destination of the key C */
  if a' < n' then a' ← h_{i'+1}(C) ;
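The two rules above have the same shape and differ only in whether they use the true file state (i, n) or a client image (i', n'). A minimal Python sketch, assuming integer keys; the names `h` and `address` and the example values are illustrative and do not come from the slides:

```python
def h(level: int, key: int) -> int:
    """Linear hashing function h_level(C) = C mod 2^level."""
    return key % (2 ** level)


def address(key: int, i: int, n: int) -> int:
    """Apply the addressing rule to a file state or to an image (i, n)."""
    a = h(i, key)
    if a < n:  # per the rule, buckets below n are addressed with h_{i+1}
        a = h(i + 1, key)
    return a


# Assumed example: the true file state is (i=2, n=1), while a client
# still holds the older image (i'=1, n'=0).
true_address = address(4, i=2, n=1)
client_guess = address(4, i=1, n=0)
print(true_address, client_guess)
```

With these assumed values the client computes address 0 while the key belongs to bucket 4. This is the kind of addressing error that the server-side forwarding and the IAMs are meant to handle.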

LH* RS P2P File Expansion
- The file starts with i = 0, n = 0 and a single data bucket 0.
- Every bucket m keeps the bucket level j of the hash function h_j last used to split it; j = 0 initially.
- An overflowing bucket m alerts the coordinator.
- The coordinator notifies bucket n to split.
- Bucket n applies h_{i+1}.
- About half of the keys migrate to the new bucket n + 2^i.
- Bucket n and the new bucket set j = j + 1.
- The coordinator performs: n = n + 1; if n = 2^i then i = i + 1 and n = 0.
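The expansion steps can be sketched as a small single-process simulation. This is only an illustration under simplifying assumptions: buckets are Python lists in one dictionary, splits are triggered by calling `split()` directly instead of by an overflow alert, and the class name `LHFile` is hypothetical.

```python
class LHFile:
    """Toy model of the coordinator state (i, n) and the data buckets."""

    def __init__(self):
        self.i = 0               # file level
        self.n = 0               # split pointer: next bucket to split
        self.buckets = {0: []}   # bucket address -> keys
        self.levels = {0: 0}     # bucket address -> bucket level j

    def address(self, key):
        a = key % 2 ** self.i
        if a < self.n:
            a = key % 2 ** (self.i + 1)
        return a

    def insert(self, key):
        self.buckets[self.address(key)].append(key)

    def split(self):
        old = self.n
        new = old + 2 ** self.i
        keys = self.buckets[old]
        # Bucket n applies h_{i+1}: each key either stays or moves to n + 2^i.
        self.buckets[old] = [k for k in keys if k % 2 ** (self.i + 1) == old]
        self.buckets[new] = [k for k in keys if k % 2 ** (self.i + 1) == new]
        # Both buckets set j = j + 1.
        self.levels[old] += 1
        self.levels[new] = self.levels[old]
        # Coordinator update of (i, n).
        self.n += 1
        if self.n == 2 ** self.i:
            self.i += 1
            self.n = 0


f = LHFile()
for key in range(8):
    f.insert(key)
for _ in range(3):
    f.split()
print(f.i, f.n, f.buckets)
```

After three splits the state is (i, n) = (2, 0) and the eight keys are spread evenly over buckets 0 to 3, each with level j = 2.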

Architecture based on LH* RS
[Figure: peer architecture. Labels: LH* P2P Peer with a Client Part (i', n') and a Server Part (j); LH* RS P2P Peer: LH* RS Client, LH* RS DB; LH* RS P2P Peer: LH* RS Client, LH* RS PB; Candidate Peer: Client & Spare Storage; Pupils.]

Peer & Pupil Image Adjustment During Peer Split
  i' = j - 1 ; /* j - 1 is the bucket level j before the split */
  n' = a + 1 ; /* a is the splitting bucket */
  if n' = 2^i' then i' = i' + 1 ; n' = 0 ;

Example
Before splitting:
  Coordinator Peer (CP): i=1, n=1
  P0: j=2, i'=1, n'=1
  P1: j=1, i'=1, n'=0
  P2: j=2, i'=1, n'=1
After splitting:
  CP: i=2, n=0
  P0: j=2, i'=1, n'=1
  P1: j=2, i'=2, n'=0
  P2: j=2, i'=1, n'=1
  P3: j=2, i'=2, n'=0
Calculus for the splitting peer m = 1: i' = j = 1; n' = m + 1 = 1 + 1; if n' = 2^1 then n' = 0 and i' = i' + 1, hence (i', n') = (2, 0).

Server Address Calculus
  a' ← h_j(C) ;
  if a' = a then exit /* bucket a is the correct one */
  else send C to bucket a' ; /* forwarding to bucket a' */
  exit ;
- Simpler and faster than for LH*, as only one forwarding is possible.
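A minimal sketch of this server-side check in Python. It relies on the claim made on the slide that at most one forwarding is possible, so it does not include the extra test of the original LH* server algorithm. The function name `receive` and the returned tuples are illustrative assumptions.

```python
def receive(a: int, j: int, key: int):
    """Bucket a with level j decides whether it keeps or forwards a key."""
    target = key % (2 ** j)       # a' = h_j(C)
    if target == a:
        return ("keep", a)        # bucket a is the correct one
    return ("forward", target)    # send the key to bucket a'


# Values taken from the IAM example in the slides: key 9 arrives at P1 (j = 4).
print(receive(1, 4, 9))
print(receive(9, 4, 9))
```

With these values P1 forwards key 9 to bucket 9, and bucket 9 keeps it.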

Peer Image Adjustment by IAM
  i' ← j - 1 ; n' ← a + 1 ;
  if n' ≥ 2^i' then n' ← 0 ; i' ← i' + 1 ;
- The IAM comes from the correct bucket.
- Bucket a is the forwarding one.
- Bucket level j is that of the correct bucket, and of the forwarding one as well.
- Same algorithm as for the adjustment of the local client and of pupils after a split.
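The adjustment can be written as a small pure function. In this sketch the image is reset as soon as n' reaches 2^i', which is the behaviour shown in the split example of the slides; the function name `adjust_image` is an assumption.

```python
def adjust_image(j: int, a: int):
    """Return the new image (i', n') from the bucket level j and address a."""
    i_img = j - 1
    n_img = a + 1
    if n_img >= 2 ** i_img:   # the split pointer wrapped around
        n_img = 0
        i_img += 1
    return i_img, n_img


# IAM example from the slides: a = 1, j = 4.
print(adjust_image(4, 1))
# Split example from the slides: bucket 1 splits and reaches level j = 2.
print(adjust_image(2, 1))
```

The first call yields (3, 2) and the second yields (2, 0), matching the images shown in the two examples.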

Peer Image Adjustment by IAM (example, before the adjustment)
[Figure: PC: i=3, n=2. Peers: P0: j=4, i'=3, n'=1; P1: j=4, i'=3, n'=2; P4: j=3, i'=2, n'=1; P9: j=4, i'=3, n'=2. Key 9 is checked and forwarded using A2. IAM: a = 1, j = 4.]

Peer Image Adjustment by IAM (example, after the adjustment)
[Figure: PC: i=3, n=2. Peers: P0: j=4, i'=3, n'=1; P1: j=4, i'=3, n'=2; P4: j=3, i'=3, n'=2; P9: j=4, i'=3, n'=2. Key 9. IAM: a = 1, j = 4.]

Example of the File Expansion
[Figure: PC: i=2, n=2, then i=2, n=3. Peers: P0: j=3, i'=2, n'=1; P2: j=2, i'=1, n'=1, then j=3, i'=2, n'=3; P5: j=3, i'=2, n'=2; P6: j=3, i'=2, n'=3. Candidate Peer: i'=0, n'=0. Pupil: i'=2, n'=1. Labels: Tutor; Update Pupil.]
- A tutor is assigned to a candidate peer by the LH hash of the client IP address.
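One possible reading of the tutor assignment, sketched in Python. The slides only say that the tutor is found by the LH hash of the client IP address; treating the IPv4 address as an integer key and hashing it with the coordinator state (i, n) are assumptions of this sketch, as are the function name `tutor_for` and the sample address.

```python
import ipaddress


def tutor_for(ip: str, i: int, n: int) -> int:
    """Map a candidate peer's IP address to the address of its tutor peer."""
    key = int(ipaddress.ip_address(ip))  # the IP address used as an integer key
    a = key % 2 ** i
    if a < n:
        a = key % 2 ** (i + 1)
    return a


# Assumed candidate address, hashed with the two file states of the example.
print(tutor_for("10.0.0.6", 2, 2))
print(tutor_for("10.0.0.6", 2, 3))
```

Under state (2, 2) this candidate would be tutored by peer 2. Under state (2, 3), after bucket 2 has split, the same address maps to peer 6.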

Properties of LH* RS P2P
1. The maximal number of forwarding messages for a key search is one.
2. The maximal number of rounds for a scan search is two.
3. The worst-case addressing performance of LH* RS P2P, as defined by Property 1, is the fastest possible for any SDDS or practical structured P2P addressing scheme.

Proof Property 1
- Case 1: i' = i and n' < n.
- Peer a addresses peer a', using its image (i', n') from its last split.
- No IAM came since.
- No forwarding.
[Figure: address axis with the marks 0, a', a, n, 2^i', a + 2^i', n + 2^i'; bucket levels j = i' + 1 and j = i'.]

Proof Property 1
- Case 1: i' = i and n' < n.
- Peer a addresses peer a', using its image (i', n') from its last split.
- No IAM came since.
- Forwarding is possible for any address a' between (a, n).
[Figure: address axis with the marks 0, a', a, n, 2^i', a + 2^i', n + 2^i'; bucket levels j = i' + 1 and j = i'.]

Proof Property 1
- Case 2: i = i' + 1 and n < n'.
- Peer a addresses peer a', using its image (i', n') from its last split.
- No IAM came since.
- Forwarding is possible for any address a' beyond [n, a].
[Figure: address axis with the marks 0, n, a, a', 2^i', 2^(i'+1), n + 2^(i'+1); bucket levels j = i' + 2, j = i' + 1 and j = i'.]

Proof Property 2
- Peer a sends the scan, including its image (i', n'), to all buckets in its image.
- A receiving peer a' can have the same bucket level j as in the image: j(a) = j'(a). Then there is no forwarding of the scan.
- Or bucket a' has split, once and only once: j(a) = j'(a) + 1.
- See the figures for the key address calculus.

Proof Property 2
- Peer a' forwards the scan to its (only) child.
- No child can have a child: peer a would first need to split again as well.
- Every peer thus gets the scan, and only once.
- There are at worst two rounds.

Proof Property 3
- The only faster worst-case performance is zero forwarding messages.
- Every split would then have to be notified to every peer.
- This would go against the scalability goal of every SDDS and structured P2P scheme.

LH* RS P2P Churn Management
- A bucket reliability group with k parity buckets protects against up to k bucket failures per group.
[Figure: a parity peer and data peers; rank; parity record and data records; tutoring records.]
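To illustrate the idea of a parity record, here is a deliberately simplified Python sketch for the special case k = 1. It uses plain XOR parity as a stand-in for the erasure code of LH* RS, so it shows the principle of recovery within one record group and not the actual encoding. The function names and sample records are hypothetical.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def parity_record(records):
    """XOR parity over the data records of one group (same rank)."""
    width = max(len(r) for r in records)
    padded = [r.ljust(width, b"\0") for r in records]  # zero-pad to equal length
    return reduce(xor_bytes, padded)


def recover(surviving, parity):
    """Rebuild the single missing record from the survivors and the parity."""
    padded = [r.ljust(len(parity), b"\0") for r in surviving]
    return reduce(xor_bytes, padded, parity)


group = [b"alpha", b"beta", b"gamma"]   # assumed data records of one group
parity = parity_record(group)
lost = recover([b"alpha", b"gamma"], parity).rstrip(b"\0")
print(lost)
```

Here the record `b"beta"` is rebuilt after its data peer is assumed lost. Tolerating k > 1 failures per group needs k parity records built with a stronger code than XOR, which this sketch does not cover.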

LH* RS P2P Churn Management: peer leaves with notice
[Figure: coordinator peer; peers P0 … Pm … Pl with bucket levels j and image (i', n'); candidate peer. Messages: Notification; "Say that's OK".]

LH* RS P2P Churn Management: peer leaves without notice or fails
[Figure: coordinator peer; peers P l-1, Pm, Pl with bucket levels j and image (i', n'); parity peer. Messages: Query; Forward. LH* RS bucket recovery.]

LH* RS P2P Churn Management: peer leaves without notice or fails
[Figure: coordinator peer; peers P l-1, Pm, Pl with bucket levels j and image (i', n'); parity peer. Message: Answer. LH* RS bucket recovery.]

LH* RS P2P Churn Management: Sure Search
- Protects against a read from an outdated server (transient communication or peer failure).
[Figure: coordinator peer; peers P l-1, Pm, Pl with bucket levels j and image (i', n'); a second copy of Pl with j and (i', n'); parity peer. Messages: Query; Answer.]

Conclusion
- LH* RS P2P requires at most one forwarding message when an addressing error occurs.
- It is the fastest known SDDS and P2P key-based addressing algorithm.
- It protects efficiently against churn.
- It allows managing very large scalable files.
- It should have numerous applications.

Current & Future Work
- Implementation of the peer node architecture and the tutoring functions, using the existing LH* RS prototype created by Rim Moussa and shown at VLDB 2004.
- Performance analysis.
- Variants.

END
Work partly funded by the IST eGov-Bus project.
Thank you for your attention.

References
[1] Crainiceanu, A., Linga, P., Gehrke, J., Shanmugasundaram, J. Querying Peer-to-Peer Networks Using P-Trees. Proc. of the Seventh International Workshop on the Web and Databases (WebDB 2004), June.
[2] Bolosky, W. J., Douceur, J. R., Howell, J. The Farsite Project: A Retrospective. Operating System Review, April 2007.
[3] Devine, R. Design and Implementation of DDH: A Distributed Dynamic Hashing Algorithm. Proc. of the 4th Intl. Foundation of Data Organisation and Algorithms (FODO).
[4] Litwin, W., Neimat, M-A., Schneider, D. LH*: Linear Hashing for Distributed Files. ACM-SIGMOD Int. Conf. on Management of Data, 93.
[5] Litwin, W., Neimat, M-A., Schneider, D. LH*: A Scalable Distributed Data Structure. ACM-TODS, Dec. 1996.
[6] Litwin, W., Neimat, M-A. High Availability LH* Schemes with Mirroring. Intl. Conf. on Cooperating Systems, IEEE Press.
[7] Litwin, W., Moussa, R., Schwarz, T. LH*rs: A Highly Available Distributed Data Storage. Proc. of the 30th VLDB Conference.
[8] Litwin, W., Moussa, R., Schwarz, T. LH*rs: A Highly Available Scalable Distributed Data Structure. ACM-TODS, Sept.
[9] Gribble, S. D., Brewer, E. A., Hellerstein, J. M., Culler, D. Scalable, Distributed Data Structures for Internet Service Construction. Proc. of the Fourth Symposium on Operating Systems Design and Implementation (OSDI 2000).
[10] Stoica, Morris, Karger, Kaashoek, Balakrishnan. CHORD: A Scalable Peer to Peer Lookup Service for Internet Application. SIGCOMM'01, August 27-31, 2001.
