Minimizing the communication cost of aggregation in Publish/Subscribe systems. Navneet Kumar Pandey (1), Stéphane Weiss (1), Roman Vitenberg (1), Kaiwen Zhang (2), Hans-Arno Jacobsen (2). (1) University of Oslo, (2) University of Toronto.

Presentation transcript:

Minimizing the communication cost of aggregation in Publish/Subscribe systems
Navneet Kumar Pandey (1), Stéphane Weiss (1), Roman Vitenberg (1), Kaiwen Zhang (2), Hans-Arno Jacobsen (2)
(1) University of Oslo, (2) University of Toronto

Aggregation in Pub/Sub systems. Motivation: Stock Market Application
Content provider: stock exchanges
Content subscriber: brokers, buyers
Aggregate subscription: stock indicators (e.g. MACD)
Non-aggregate subscription: stock updates

Motivation: Intelligent Transport System (ITS)
Information providers: road sensors, crowdsourced mobile apps
Information seekers: commuters, police, first responders, radio networks, etc.
Aggregate subscriptions:
- Count number of cars passing a street light per hour
- Average speed of cars on a road segment per day
Non-aggregate subscriptions:
- Accident reports
- Traffic violation reports

Objective: Aggregation in Pub/Sub
- Pub/sub is well known for efficient content filtering and dissemination for distributed event sources and sinks.
- However, pub/sub does not support aggregation, which is required in emerging applications.
- Our primary objective is to retain the traditional pub/sub focus on low communication cost, while adding support for aggregation.
- Supporting aggregation inside pub/sub is more communication- and computation-efficient than running two separate systems for pub/sub and aggregation.

Contributions: aggregation in pub/sub
- We introduce and formalize the problem of minimizing communication for aggregation in pub/sub.
- We present a solution which is optimal under complete knowledge of publications and subscriptions by a broker.
- By reducing the problem to minimum vertex cover over bipartite graphs, we show that it is solvable in polynomial time.
- We propose an alternative algorithm which is less computationally expensive.
- We evaluate the trade-off between communication and computation costs for these two solutions.

Background: Advertisement-based pub/sub model
[Figure: brokers (B), including B_p, B_q, B_I and B_S, forming a Subscription Delivery Tree (SDT). Example messages: advertisement A[val, >, 4], subscription S[val, >, 3], publication P[val, 8].]
Our design choice: to maximize communication efficiency, we reuse the dissemination flow, i.e. the SDTs.

Proposal: aggregation in Pub/Sub system
Aggregate subscription: {, operator, duration (ω), shift size (δ)}
Example: {RoadID = 101, speed > 10, op = 'avg', ω = 2 hours, δ = 1 hour}
[Figure: timeline (time in hours) showing the subscription, publications Pub 1 to Pub 3, and the notification window ranges (NWR) NWR 1 and NWR 2, with window duration ω and shift size δ.]
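The window parameters above can be illustrated with a small sketch. It assumes, purely for illustration, that NWRs are half-open intervals [k·δ, k·δ + ω) starting at time 0 and that publications are (time, value) pairs; the function names `matching_windows` and `aggregate_avg` are hypothetical and do not come from the slides.

```python
import math
from collections import defaultdict

def matching_windows(t, omega, delta, start=0):
    """Return every window [s, s + omega) with s = start + k * delta that contains t.

    Assumption: windows are half-open and the first one opens at `start`.
    """
    if t < start:
        return []
    k_max = math.floor((t - start) / delta)
    k_min = max(0, math.floor((t - start - omega) / delta) + 1)
    return [(start + k * delta, start + k * delta + omega)
            for k in range(k_min, k_max + 1)]

def aggregate_avg(publications, omega, delta):
    """Average the values of (time, value) publications per window."""
    buckets = defaultdict(list)
    for t, value in publications:
        for window in matching_windows(t, omega, delta):
            buckets[window].append(value)
    return {w: sum(vals) / len(vals) for w, vals in buckets.items()}

# Three speed readings, omega = 2 hours, delta = 1 hour (as in the example).
pubs = [(0.5, 20), (1.5, 40), (2.5, 60)]
averages = aggregate_avg(pubs, omega=2, delta=1)
```

With ω larger than δ the windows overlap, so one publication can contribute to several NWRs; this overlap is what makes the later aggregate-or-forward decision non-trivial.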

Challenge: Distribute the computation across the brokers
Local aggregation decision by each broker on an SDT for each NWR: aggregate or forward incoming publications for that NWR.
[Figure: an aggregation decision broker on the SDT receives publication messages and sends either publication messages or result messages. Example 1: Pub 1 to Pub 3 match NWR 1 and NWR 2; publication load 3, result load 2 (Res 1, Res 2). Example 2: Pub 1 to Pub 3 match NWR 1 to NWR 4; publication load 3, result load 4 (Res 1 to Res 4).]
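The two examples in the figure can be read as a comparison of loads. The sketch below is only an illustration of that comparison, not the decision procedure proposed later in the talk; the function name `naive_decision` and the tie-breaking rule (forward on equal load) are assumptions.

```python
def naive_decision(publication_load, result_load):
    """Pick the cheaper option for a group of publications and the NWRs they match.

    publication_load: messages sent if the publications are forwarded as they are.
    result_load: messages sent if one aggregated result is sent per matched NWR.
    Assumption: ties go to forwarding.
    """
    return "aggregate" if result_load < publication_load else "forward"

# Example 1 from the figure: 3 publications, 2 NWRs.
first = naive_decision(publication_load=3, result_load=2)
# Example 2 from the figure: 3 publications, 4 NWRs.
second = naive_decision(publication_load=3, result_load=4)
```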

Trade-off: multiple factors affect the decision

Increasing parameter | Favors
Publication matching rate | Aggregate
Number of matched NWRs | Forward
Overlap among aggregate subscriptions | Forward
Ratio between aggregate and regular subscriptions | Aggregate

Challenges:
- No global knowledge about topology.
- SDTs are beyond control of the aggregation scheme.
- SDTs change dynamically during execution.

Unique challenges compared to other aggregation systems

Aggregation in pub/sub | Other aggregation systems
Topology is not known to individual broker nodes | Require global view of the topology
Publication sources and sinks are dynamic | Require a priori knowledge of publication sources
Brokers are loosely coupled | Need control layer
SDTs are dynamic and outside of control of aggregation scheme | Demand a static query plan
Publications come at an irregular rate | Optimized for continuous data streams

Problem formulation: Minimum-Notification for Aggregation (MNA)
Objective: Given the set of subscriptions and the set of incoming messages (both publications and previously aggregated results), minimize the number of notifications, i.e. publications and aggregation results, sent by a broker.
[Figure: matching graph with NWR vertices N_a, N_b, N_c and publication vertices P_1 to P_4. Legend: N_n is an NWR n, P_p is a publication p, and an edge denotes the matching of a publication to an NWR.]
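One way to write the objective down, as a sketch consistent with the vertex-cover reduction mentioned in the contributions: the symbols F (forwarded publications), A (aggregated NWRs) and E (matching edges) are notation introduced here for illustration, and the formulation assumes that every matching pair must be served either by forwarding the publication or by aggregating for the NWR.

```latex
% P: incoming publications, N: NWRs, E \subseteq P \times N: matching edges.
% F: publications forwarded as-is, A: NWRs for which an aggregated result is sent.
\begin{aligned}
\min_{F \subseteq P,\; A \subseteq N} \quad & |F| + |A| \\
\text{subject to} \quad & p \in F \;\lor\; n \in A \qquad \forall (p, n) \in E
\end{aligned}
```

Under this reading, a feasible pair (F, A) is exactly a vertex cover of the bipartite matching graph, and the number of notifications is the size of the cover.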

Optimal solution
Unrealistic assumption: brokers have information about
- all the matching publications, and
- all the NWRs within the entire execution,
and this information is available a priori.
Idea: each broker constructs an undirected bipartite graph (the matching graph) and computes a minimum vertex cover.
[Figure: matching graph over N_a, N_b, N_c and P_1 to P_4; a minimum vertex cover = {N_a, N_b, N_c}.]
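A minimal sketch of the idea, assuming the standard textbook route for bipartite graphs: compute a maximum matching with augmenting paths, then derive a minimum vertex cover via König's theorem. The slides do not say which algorithm the authors use, and the edge set below is a hypothetical example, since the edges of the figure are not recoverable from the transcript.

```python
def max_matching(adj):
    """Kuhn's augmenting-path algorithm. adj: publication -> list of matching NWRs."""
    match_of_nwr = {}

    def try_augment(pub, seen):
        for nwr in adj[pub]:
            if nwr in seen:
                continue
            seen.add(nwr)
            if nwr not in match_of_nwr or try_augment(match_of_nwr[nwr], seen):
                match_of_nwr[nwr] = pub
                return True
        return False

    for pub in adj:
        try_augment(pub, set())
    return match_of_nwr

def min_vertex_cover(adj):
    """Return (publications to forward, NWRs to aggregate) forming a minimum cover."""
    match_of_nwr = max_matching(adj)
    match_of_pub = {p: n for n, p in match_of_nwr.items()}
    # Alternating search from unmatched publications (Koenig's construction).
    reached_pubs = {p for p in adj if p not in match_of_pub}
    reached_nwrs = set()
    stack = list(reached_pubs)
    while stack:
        pub = stack.pop()
        for nwr in adj[pub]:
            if nwr in reached_nwrs or match_of_pub.get(pub) == nwr:
                continue
            reached_nwrs.add(nwr)
            partner = match_of_nwr.get(nwr)
            if partner is not None and partner not in reached_pubs:
                reached_pubs.add(partner)
                stack.append(partner)
    forward = {p for p in adj if p not in reached_pubs}
    return forward, reached_nwrs

# Hypothetical matching graph (edges assumed for illustration only).
graph = {"P1": ["Na"], "P2": ["Na", "Nb"], "P3": ["Nb", "Nc"], "P4": ["Nc"]}
forward, aggregate = min_vertex_cover(graph)
```

For this assumed graph the cover contains only NWR vertices, so the broker would aggregate for all three NWRs and send three results instead of four publications.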

Practical solution: Aggregation Decision, Optimal with Complete Knowledge (ADOCK)
Idea: make decisions with partial knowledge. Decisions are based on the current state of publications, NWRs and their interconnectivity.
Implication: decisions may be suboptimal.
Communication cost: close to the optimal solution in the experiments. Difference between Optimal and ADOCK across the tested #subscriptions settings: 3.53%, 0.88%, 4.29%, 3.27%.
Computation cost: expressed in terms of |N| (#NWRs), |P| (#publications) and deg_A(N) (average degree of NWR vertices).
Scalability issue: computation cost grows more than quadratically with the number of NWRs.
[Figure: matching graph over N_a, N_b, N_c and P_1 to P_4, with only part of the publications known at decision time.]

Practical solution: Weighted Aggregation Decision (WAD)
Idea: reduce the number of vertices used for making a decision. Only vertices within 2 hops of the NWR affect the decision.
Low computation cost: O(deg_A(N) × |N|)
Steps (similar to ADOCK, take a decision per NWR):
1. Assign a weight to each publication vertex, equal to the inverse of its degree.
2. Compute the cumulative weight of an NWR from its matching publications.
3. Aggregate the matching publications if the cumulative weight ≥ 1; otherwise forward them.
[Figure: matching graph over N_a, N_b, N_c and P_1 to P_4 with publication weights such as 1/3 and 1/2. An NWR with cumulative weight 2/3 (< 1) forwards; an NWR with cumulative weight ≥ 1 aggregates.]
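The three steps can be sketched as follows. This is an illustrative reading of the slide, not the authors' PADRES implementation; the function name `wad_decisions` and the example matching graph are assumptions. Exact fractions are used so that the comparison with 1 is not affected by floating-point rounding.

```python
from fractions import Fraction

def wad_decisions(matches):
    """matches: NWR -> set of matching publications. Returns NWR -> (decision, weight)."""
    # Step 1: the weight of a publication is the inverse of its degree,
    # i.e. of the number of NWRs it matches.
    degree = {}
    for pubs in matches.values():
        for pub in pubs:
            degree[pub] = degree.get(pub, 0) + 1

    decisions = {}
    for nwr, pubs in matches.items():
        # Step 2: cumulative weight of the NWR over its matching publications.
        cumulative = sum((Fraction(1, degree[p]) for p in pubs), Fraction(0))
        # Step 3: aggregate if the cumulative weight reaches 1, else forward.
        decisions[nwr] = ("aggregate" if cumulative >= 1 else "forward", cumulative)
    return decisions

# Hypothetical matching graph: P1 matches three NWRs, P2 two, P3 one.
example = {
    "Na": {"P1"},
    "Nb": {"P1", "P2"},
    "Nc": {"P1", "P2", "P3"},
}
result = wad_decisions(example)
```

Each decision only looks at the publications matching one NWR and at their degrees, which is the 2-hop neighborhood the slide refers to.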

Experimental setup
Implemented in Java over the PADRES framework
Topology: 16 brokers, a combination of publisher-edge only, subscriber-edge only and mixed brokers
Real-life datasets:
- Traffic dataset from the ONE-ITS service
- Yahoo! Finance Stock dataset
Metrics:
- Communication: number of messages exchanged
- Computation: total computation overhead
Existing baseline: per Broker Adaptive Technique (BAT)

Varying number of publications
Setting: Stock dataset
[Figure: two panels, computation cost and communication cost.]
- There is a trade-off between WAD and ADOCK over communication and computation cost.
- WAD is up to 73% faster than ADOCK, at the expense of up to a 22% increase in communication cost.
- BAT sends more messages than either of the proposed solutions.

Varying number of subscriptions
[Figure: two panels, computation cost and communication cost, comparing ADOCK and WAD.]
- ADOCK's communication cost is around 12% lower than WAD's.
- However, ADOCK's computation overhead is more than twice that of WAD.
- This is also supported by analytical findings.

Impact of sliding windows
- A higher sliding parameter increases the NWR interconnectivity and enlarges the decision graph.
- ADOCK is up to four times slower than WAD, while WAD sends up to 27% more messages.

Key lessons from experiments
Our results confirm that interconnectivity is the key reason for the trade-off between computation and communication cost.
The trade-off is substantially affected by these factors:
- Publication matching rate.
- Number of matching NWRs.
- Overlap among aggregate subscriptions.
- Ratio between aggregate and regular subscriptions.
Recommendation: ADOCK is preferred if the system expects a moderate number of subscriptions with high selectivity. Otherwise, WAD is recommended.

Conclusions
- We formalize the MNA problem and reduce it to Minimum Vertex Cover over a bipartite graph.
- We provide two solutions: the communication-efficient ADOCK and the computation-efficient WAD.
- We experimentally demonstrate the trade-off between computation and communication cost in these approaches.

Thank you! Questions & comments