An Optimal Broadcast Algorithm for Content-Addressable Networks. Ludovic Henrio, Fabrice Huet, Justine Rochas. 18/12/2013, OPODIS (Nice)

Presentation transcript:


Outline: Background, Efficient Algorithm, Experiments

General Motivation: RDF Storage
- Context: Semantic Web, RDF data
- Challenge: store and retrieve RDF data in a large-scale setting
- Our solution: Content-Addressable Network

Content-Addressable Networks (CAN)
- Overlay network
- Nodes are peers
- Structured organization
- Multidimensional Cartesian space
- Entirely partitioned
- A zone is managed by one peer
- A zone = a (hyper)rectangle
- Neighborhood based on adjacent zones
- Routing = successively approaching the target value in all dimensions
[Figure: 2D CAN with zones A, B, C, D, E; axes dim #1 and dim #2]
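The CAN structure described on this slide can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: the `Zone` class, the five-zone layout, and the greedy rule "forward to the neighbor whose zone is closest to the target point" are all assumptions made for the example (the slide only says that routing successively approaches the target value in all dimensions).

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Zone:
    """A hyperrectangle [lo, hi) managed by one peer (hypothetical model)."""
    name: str
    lo: Tuple[float, ...]
    hi: Tuple[float, ...]

    def contains(self, p: Sequence[float]) -> bool:
        return all(l <= x < h for l, x, h in zip(self.lo, p, self.hi))

    def distance2(self, p: Sequence[float]) -> float:
        # Squared distance from point p to the (closed) rectangle.
        return sum(max(l - x, 0.0, x - h) ** 2
                   for l, x, h in zip(self.lo, p, self.hi))


def are_neighbors(a: Zone, b: Zone) -> bool:
    """Adjacent zones abut in exactly one dimension and overlap in the others."""
    abut = 0
    for al, ah, bl, bh in zip(a.lo, a.hi, b.lo, b.hi):
        if ah == bl or bh == al:
            abut += 1
        elif min(ah, bh) - max(al, bl) <= 0:
            return False  # separated by a gap in this dimension
    return abut == 1


def route(zones: List[Zone], start: Zone, target: Sequence[float]) -> List[str]:
    """Greedy routing: hop to the neighbor closest to the target point."""
    path, current = [start], start
    for _ in range(len(zones)):  # hop limit as a safety net
        if current.contains(target):
            return [z.name for z in path]
        nbrs = [z for z in zones if are_neighbors(current, z)]
        current = min(nbrs, key=lambda z: z.distance2(target))
        path.append(current)
    raise RuntimeError("target not reached within the hop limit")


# Hypothetical 2D layout with five zones (not the layout from the slide).
zones = [
    Zone("A", (0.0, 0.0), (0.5, 0.5)),
    Zone("B", (0.5, 0.0), (1.0, 0.5)),
    Zone("C", (0.0, 0.5), (0.5, 1.0)),
    Zone("D", (0.5, 0.5), (0.75, 1.0)),
    Zone("E", (0.75, 0.5), (1.0, 1.0)),
]
print(route(zones, zones[0], (0.9, 0.7)))
```

Zone bounds here are dyadic fractions, so the exact float comparisons in `are_neighbors` are safe; a real implementation would need a more careful boundary representation.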

Problem: Cost of Queries
- 2 queries over 2 variables: conjunction of two 2-dimensional broadcasts
- 1 query over 2 variables
- 1 query over 1 variable
- Naive broadcast does not scale
[Figure labels: OK, NOT OK]

Problem: Duplicated Messages
- Duplicated messages: 11 peers, 40 messages!
- How to eliminate duplicates?
- For each peer P, find the peer that is responsible for sending the message to P
[Figure: 2D CAN with zone E marked; axes dim #1 and dim #2]
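To make the duplication problem concrete, here is a small sketch that counts the messages sent by naive flooding. It assumes a hypothetical 3x3 grid of equal zones (not the 11-peer layout of the slide) and assumes that each peer, on first receipt, forwards to every neighbor except the sender.

```python
from collections import deque


def grid_adjacency(width, height):
    """Adjacency lists for a width x height grid of equal zones (assumed layout)."""
    adj = {}
    for x in range(width):
        for y in range(height):
            adj[(x, y)] = [
                (x + dx, y + dy)
                for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
                if 0 <= x + dx < width and 0 <= y + dy < height
            ]
    return adj


def flood(adjacency, initiator):
    """Naive flooding; returns (messages sent, peers reached)."""
    seen = {initiator}
    queue = deque((initiator, n) for n in adjacency[initiator])
    messages = 0
    while queue:
        sender, receiver = queue.popleft()
        messages += 1
        if receiver in seen:
            continue  # duplicate: the peer already has the message
        seen.add(receiver)
        queue.extend((receiver, n) for n in adjacency[receiver] if n != sender)
    return messages, len(seen)


messages, reached = flood(grid_adjacency(3, 3), (1, 1))
print(messages, reached, messages - (reached - 1))  # total, peers, duplicates
```

On this grid, flooding sends twice as many messages as the 8 that a spanning tree over 9 peers would need, which is the overhead an optimal broadcast removes.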

Existing Solutions
- Use the CAN structure to route messages
  - Meghdoot [1]: "upperLeft" predicate
  - M-CAN [2]
- M-CAN principles
  - Initiator peer sends to all neighbors
  - Other peers forward to neighbors on:
    - the same dimension, on the opposite side
    - lower dimensions, on all sides
  - Forwarding on the last dimension depends on a constraint
[Figure: Meghdoot, start from a corner; zones A, B, C]
[1] A. Gupta, O. D. Sahin, D. Agrawal, A. El Abbadi: Meghdoot: Content-Based Publish/Subscribe over P2P Networks. Middleware 2004

M-CAN Execution
[Figure: M-CAN broadcast starting from the INIT peer. Legend: corner constraint; message; message that leads to duplication]
[2] S. Ratnasamy, M. Handley, R. M. Karp, S. Shenker: Application-Level Multicast Using Content-Addressable Networks. Networked Group Communication 2001

Preliminary Work
- Existence of an optimal algorithm proved [3]
- A solution to exhibit existence
  - Valid for a very generic definition of CAN
  - Not efficient (execution time)
  - Parallelizes message sending only when reaching a "border"
[3] Francesco Bongiovanni, Ludovic Henrio: A Mechanized Model for CAN Protocols. FASE 2013

Outline: Background, Efficient Algorithm, Experiments

Hypotheses and Goals
- CAN = adjacent rectangles
- No additional structure
- Tolerates churn between two broadcasts
- Not implementation-dependent
- Does not tolerate churn during a broadcast
- Optimal in number of messages, with good parallelization
[Figure: a spanning tree rooted at the INIT peer]

Efficient Algorithm: Principle
- Removes all duplicates, in all dimensions
- How?
  - Uses the corner constraint
  - Plus a spatial constraint: a set of fixed values
  - Reduces the problem
  - Applies recursively
[Figures: spatial constraint in a 2D CAN; spatial constraint in a 3D CAN]

Efficient Algorithm
- Observation #1: easy to forward in 1D
- Observation #2: only one zone touches a corner
- Idea of the algorithm
  - Suppose an efficient broadcast in dimension N
  - Apply it on a hyperplane of dimension N - 1
  - Send to both sides of this hyperplane using the corner constraint
  - Repeat until the hyperplane is just a line (dimension 1)
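The recursive, dimension-by-dimension structure of this idea can be illustrated on a deliberately simplified case. The sketch below assumes a regular grid in which every zone has the same size, so each peer has at most one neighbor per side and the corner and spatial constraints never have to arbitrate between several candidate senders. It is therefore not the algorithm from the talk, only the forwarding skeleton: a peer that receives on dimension D keeps going in the same direction on D and opens both directions on every lower dimension. The function name and the grid model are hypothetical.

```python
from collections import Counter


def grid_broadcast(shape, initiator):
    """Dimension-ordered broadcast on a regular grid (simplifying assumption).

    Returns (messages sent, Counter of receipts per peer).
    """
    receipts = Counter()
    messages = 0
    pending = []  # entries: (receiver, dimension, direction)

    def send(cell, dim, step):
        nxt = list(cell)
        nxt[dim] += step
        if 0 <= nxt[dim] < shape[dim]:  # no wrap-around: not a torus
            pending.append((tuple(nxt), dim, step))

    # The initiator sends on both sides of every dimension.
    for dim in range(len(shape)):
        for step in (-1, 1):
            send(initiator, dim, step)

    while pending:
        cell, dim, step = pending.pop()
        messages += 1
        receipts[cell] += 1
        send(cell, dim, step)            # same dimension, same direction
        for lower in range(dim):         # lower dimensions, both sides
            for s in (-1, 1):
                send(cell, lower, s)
    return messages, receipts


shape = (4, 3, 2)
messages, receipts = grid_broadcast(shape, (1, 1, 0))
print(messages, max(receipts.values()))
```

Each peer is reached along a single path that visits dimensions in decreasing order, so on this grid the broadcast forms a spanning tree. With irregular zones that property no longer comes for free, which is where the constraints listed on the slides come in.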

Efficient Algorithm: Execution
[Figure: broadcast starting from the INIT peer. Legend: corner constraint; spatial constraint; message; message that leads to duplication]

Efficient Algorithm: Properties
- Proved to be correct: all peers receive a broadcast message at least once
- Proved to be minimal: all peers receive a broadcast message at most once
- Elements of proof, when receiving on dimension D:
  - dim < D: spatial constraint is satisfied
  - dim = D: ascending or descending direction
  - dim > D: corner constraint is satisfied
- This algorithm is optimal: all peers receive a broadcast message exactly once

Outline: Background, Efficient Algorithm, Experiments

Experimental Setup
- Using the Grid5000 platform
  - Multisite experimentation
- Deployment
  - From 50 to 1500 peers
  - Up to 200 physical machines
- CAN setting
  - Successively split zones in half
  - Zone to split is chosen randomly
[Figure: zones A, B, C]
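The CAN setting above can be reproduced with a short generator. This is a sketch under stated assumptions: the slide says only that a randomly chosen zone is split in half, so the choice of split dimension (here, the longest side of the zone) and the names `build_can` and `volume` are hypothetical.

```python
import math
import random


def build_can(num_peers, dims=2, seed=0):
    """Build a CAN by repeatedly halving a randomly chosen zone.

    Each zone is a (lo, hi) pair of coordinate lists in the unit hypercube.
    Assumption: the zone is split along its longest side.
    """
    rng = random.Random(seed)
    zones = [([0.0] * dims, [1.0] * dims)]
    while len(zones) < num_peers:
        lo, hi = zones.pop(rng.randrange(len(zones)))  # random zone to split
        dim = max(range(dims), key=lambda d: hi[d] - lo[d])
        mid = (lo[dim] + hi[dim]) / 2
        left_hi = list(hi)
        left_hi[dim] = mid
        right_lo = list(lo)
        right_lo[dim] = mid
        zones.append((lo, left_hi))    # existing peer keeps one half
        zones.append((right_lo, hi))   # joining peer takes the other half
    return zones


def volume(zone):
    lo, hi = zone
    return math.prod(h - l for l, h in zip(lo, hi))


zones = build_can(50)
print(len(zones), sum(volume(z) for z in zones))
```

Because zones are chosen uniformly rather than by size, the resulting partition is irregular, which is the kind of layout where naive forwarding rules produce duplicates.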

Number of messages
[Figure: plot of the number of messages. Annotation: maximum gain of 5.3 MB]

Number of messages
[Figure: plot of the number of messages]

Execution Time
[Figure: plot of the execution time. Annotation: significant speedup]

Conclusion: Broadcast on CAN
- We found an optimal solution
  - Proved to be correct and optimal
  - Efficient in large-scale settings
  - Supports range multicast
- Currently in use in the EventCloud project [4]
  - Management of RDF data
  - Algorithm used for one year
  - Tested and approved!
[Figure: a range multicast]
[4] EventCloud

[Figure: 3D CAN; axes dim #1, dim #2, dim #3]