Operator Placement for In-Network Stream Query Processing. U. Srivastava, K. Munagala, and J. Widom, PODS 2005. ICS280 class presentation by Iosif Lazaridis.


Operator Placement for In-Network Stream Query Processing. U. Srivastava, K. Munagala, and J. Widom, PODS 2005. ICS280 class presentation by Iosif Lazaridis (Winter 2005).

Problem Motivation

Previous Solutions
Push all data to the server, where queries are processed
– Does not utilize in-network resources
Push simple filters to the leaf nodes
– e.g., "select all values > 3"
Perform aggregation in intermediate nodes
But what about expensive operations?
– e.g., filters over image data, or operations involving remote lookups

Basic System Model
Let $s(F)$ be the selectivity of a filter $F$
– i.e., the fraction of tuples it allows to pass
Let $c(F, i)$ be the per-tuple cost of filter $F$ at level $i$
– Costs scale between levels: $c(F, i+1) = \gamma_i \, c(F, i)$
Let $l_i$ be the cost of network transmission of a tuple from $N_i$ to $N_{i+1}$

Basic theorem: Rank
Placing filters in order of increasing rank is optimal:
$\mathrm{rank}(F) = \dfrac{\mathrm{cost}(F)}{1 - \mathrm{selectivity}(F)}$
Intuition:
– Evaluate "cheap" filters early
– Evaluate very "strict" filters (those that pass few tuples) early
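The rank rule can be checked on a small case. The following is a minimal Python sketch, not code from the paper: the `Filter` class, the helper names, and the three filter values are all illustrative assumptions. It orders filters by increasing rank and compares the resulting per-tuple cost against every other ordering.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class Filter:
    name: str
    cost: float         # per-tuple cost of evaluating the filter
    selectivity: float  # fraction of tuples that pass


def rank(f: Filter) -> float:
    """rank(F) = cost(F) / (1 - selectivity(F)); infinite if nothing is dropped."""
    if f.selectivity >= 1.0:
        return float("inf")
    return f.cost / (1.0 - f.selectivity)


def order_by_rank(filters):
    """Return the filters sorted by increasing rank."""
    return sorted(filters, key=rank)


def expected_cost(ordered):
    """Expected per-tuple cost of evaluating the filters in the given order."""
    total, surviving = 0.0, 1.0
    for f in ordered:
        total += surviving * f.cost   # only surviving tuples pay for this filter
        surviving *= f.selectivity
    return total


# Illustrative values (assumed for this sketch).
filters = [Filter("A", 10.0, 0.8), Filter("B", 3.0, 0.6), Filter("C", 1.0, 0.5)]
best = order_by_rank(filters)
brute_force_min = min(expected_cost(p) for p in permutations(filters))
print([f.name for f in best], expected_cost(best), brute_force_min)
```

On these values the rank order matches the brute-force minimum over all six orderings, which is what the theorem predicts for a single node.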

Problem Statement
$n$ filters and $m$ levels of hierarchy
Hence: $m^n$ possible filter placements
Problem: choose the optimal plan from these $m^n$ choices
Two algorithms: a greedy one and an optimal one

Greedy Algorithm
Let $c(P, i)$ be the cost of plan $P$ incurred at node $i$
– i.e., the cost of applying the filters at node $i$ and transmitting the results to node $i+1$
Greedy algorithm: minimize $c(P, 1)$ by choosing a set of filters $F_1$ from the total set $F$, then minimize $c(P, 2)$ by choosing $F_2$ from $F - F_1$, and so on
– At node 1, choose all filters with rank less than $l_1$

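Example
Three filters: {F1, F2, F3}, on two nodes (1 and 2) connected by a link with $l_1 = 15$

Filter  Selectivity  Cost  Rank
F1      0.8          10    50
F2      0.6          3     7.5
F3      0.5          1     2

(The table values are inferred from the cost arithmetic on this slide; F1's cost of 10 is implied by the stated total of 9.1.)

Then, evaluate {F3, F2} in node 1
Cost = 1 + 0.5*3 + 0.5*0.6*15 = 7
Better than, e.g., {} (cost = 15) or {F3, F2, F1} (cost = 1 + 0.5*3 + 0.5*0.6*10 + 0.5*0.6*0.8*15 = 9.1)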
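The greedy step at node 1 can be sketched in Python as follows. This is a hedged illustration, not the authors' code: the function names are hypothetical, the selectivities and the costs of F2 and F3 come from the arithmetic on the slide, and F1's cost of 10 is an inferred assumption chosen so that the stated total of 9.1 works out.

```python
# (cost at node 1, selectivity) per filter.
# F1's cost of 10 is an assumption inferred from the slide's total of 9.1.
FILTERS = {"F1": (10.0, 0.8), "F2": (3.0, 0.6), "F3": (1.0, 0.5)}
L1 = 15.0  # cost of sending one tuple from node 1 to node 2


def rank(cost, selectivity):
    """rank = cost / (1 - selectivity)."""
    return cost / (1.0 - selectivity)


def greedy_node1(filters, link_cost):
    """Pick every filter whose rank is below the link cost, in rank order."""
    chosen = [n for n, (c, s) in filters.items() if rank(c, s) < link_cost]
    return sorted(chosen, key=lambda n: rank(*filters[n]))


def node1_cost(names, filters, link_cost):
    """c(P, 1): filtering cost at node 1 plus transmission of the survivors."""
    total, surviving = 0.0, 1.0
    for n in names:
        cost, sel = filters[n]
        total += surviving * cost
        surviving *= sel
    return total + surviving * link_cost


chosen = greedy_node1(FILTERS, L1)
for plan in ([], chosen, ["F3", "F2", "F1"]):
    print(plan, round(node1_cost(plan, FILTERS, L1), 2))
```

Under these assumed values the greedy choice is F3 then F2, and the three plans cost about 15, 7, and 9.1 at node 1, matching the figures on the slide.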

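Why Greedy is not optimal
[Table: Filter, Selectivity, Cost(1), Cost(2) for F1, F2, F3; cell values not recoverable]
Previous plan: {F3, F2} at node 1, then {F1} at node 2, has total cost = 7 + 0.5*0.6*8 = 9.4
Consider instead the plan {F3, F2, F1} at node 1, then {} at node 2 (total cost = 9.1)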

Optimal Algorithm
Model a link as a filter with selectivity $\gamma_i$ and cost $l_i$
Each node has an "incoming" and an "outgoing" link
– Evaluate on the node all filters whose rank lies between the ranks of the incoming and outgoing transmission "filters"
If the rank of the incoming link is greater than that of the outgoing link
– It is optimal to "short-circuit" the node, i.e., not evaluate any filters on it
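For a hierarchy with only two nodes and one link, the link-as-filter rule reduces to a single threshold, which the Python sketch below illustrates. It is a simplified illustration under stated assumptions, not the paper's full algorithm: it does not handle longer hierarchies or short-circuiting, the function names are hypothetical, and the values (F1's cost of 10 and a scaling factor of 0.8) are assumptions inferred from the example's arithmetic. A brute-force search over all placements is included as a sanity check.

```python
from itertools import combinations

# (cost at node 1, selectivity); F1's cost and GAMMA are inferred assumptions.
FILTERS = {"F1": (10.0, 0.8), "F2": (3.0, 0.6), "F3": (1.0, 0.5)}
LINK_COST = 15.0  # l_1
GAMMA = 0.8       # node-2 cost of a filter = GAMMA * node-1 cost (must be < 1)


def rank(cost, selectivity):
    return cost / (1.0 - selectivity)


def place_two_nodes(filters, link_cost, gamma):
    """Treat the link as a filter (cost l_1, selectivity gamma) and split by rank."""
    link_rank = rank(link_cost, gamma)
    ordered = sorted(filters, key=lambda n: rank(*filters[n]))
    node1 = [n for n in ordered if rank(*filters[n]) < link_rank]
    node2 = [n for n in ordered if rank(*filters[n]) >= link_rank]
    return node1, node2


def total_cost(node1, node2, filters, link_cost, gamma):
    """Per-tuple cost of running node1 filters, sending survivors, then node2 filters."""
    total, surviving = 0.0, 1.0
    for n in node1:
        cost, sel = filters[n]
        total += surviving * cost
        surviving *= sel
    total += surviving * link_cost
    for n in node2:
        cost, sel = filters[n]
        total += surviving * gamma * cost  # filters are cheaper on node 2
        surviving *= sel
    return total


def brute_force(filters, link_cost, gamma):
    """Try every subset for node 1 (each side kept in rank order)."""
    ordered = sorted(filters, key=lambda n: rank(*filters[n]))
    plans = []
    for k in range(len(ordered) + 1):
        for subset in combinations(ordered, k):
            rest = [n for n in ordered if n not in subset]
            plans.append((total_cost(list(subset), rest, filters, link_cost, gamma),
                          list(subset), rest))
    return min(plans, key=lambda p: p[0])


node1, node2 = place_two_nodes(FILTERS, LINK_COST, GAMMA)
best_cost, best_node1, best_node2 = brute_force(FILTERS, LINK_COST, GAMMA)
print(node1, node2, round(total_cost(node1, node2, FILTERS, LINK_COST, GAMMA), 2))
```

With these assumed values the link has rank 15 / (1 - 0.8) = 75, which exceeds the rank of every filter, so all three filters are placed on node 1. The brute-force search agrees, and the greedy split ({F3, F2} then {F1}) comes out more expensive.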

Processing Joins
Two input streams $R$, $S$ with rates $r_1$, $r_2$
Output stream consists of tuples $(r, s)$ with $r$ in $R$ and $s$ in $S$
Join cost = $a r_1 + b r_2 + c r_1 r_2$
– Order the filters that apply to $r$ and to $s$ separately
– Order the filters that apply to $(r, s)$
Example: "temperature>10 and temperature<20 and pressure>100 and temperature+0.5*pressure>120"
– Filters $F_1$ (on the stream with rate $r_1$): temperature>10, temperature<20
– Filters $F_2$ (on the stream with rate $r_2$): pressure>100
– Filters $F_{1,2}$ (on the join output): temperature+0.5*pressure>120

Conclusions
Systematic way to push filters into the network, taking into account their relative cost and the capabilities of nodes
Perhaps does not take into account practical issues such as broadcast communication or faults
It would be interesting to see practical values for $\gamma$, $c$, $s$ in a real deployment