Adaptive Ordering of Pipelined Stream Filters
S. Babu, R. Motwani, K. Munagala, I. Nishizawa, and J. Widom
In Proc. of SIGMOD 2004, June 2004.

Presentation transcript:

Outline
- Introduction
- Ordering of filters
- Adaptive ordering algorithm
- Preliminaries
- Algorithms
- Experimental evaluation
- Conclusion

Introduction
Why "stream"?
- Many modern applications deal with data that
  - is updated continuously
  - needs to be processed in real time
Why "adaptive"?
- Arrival characteristics of streams may vary significantly over time
Characteristics of filters
- Direct
- Commutative

Ordering of filters
Challenges
- Selectivities across filters may be correlated.
- An exhaustive algorithm becomes infeasible.
  => Greedy approach
- Arrival characteristics of streams may vary significantly over time.
  => Adaptive approach

Adaptive ordering algorithm
Three-way tradeoff
- Run-time overhead
  - Monitor changes in statistics
  - Determine when to change the current order
- Convergence properties
  - If data and filter characteristics stabilize, the adaptive algorithm should converge to a solution with desirable properties
- Speed of adaptivity
  - The time convergence takes when data and filter characteristics change

Preliminaries
Notation table (symbol, meaning); the meanings defined are:
- query
- input stream
- filters
- stream tuple
- ordering
- conditional drop probability
- processing time per tuple
- total cost
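A minimal Python sketch of what "total cost" means for a pipelined ordering, assuming each tuple is passed through the filters in order until one of them drops it. The names `pipeline_cost`, `times`, and `drops` are illustrative and do not come from the slides; `drops` is a hypothetical sample recording which filters would drop each tuple.

```python
def pipeline_cost(ordering, times, drops):
    """Average per-tuple processing time for the given filter ordering.

    times[i]    -- processing time per tuple of filter i (assumed known)
    drops[k][i] -- True if filter i would drop sample tuple k
    """
    total = 0.0
    for row in drops:
        for f in ordering:
            total += times[f]      # the tuple reaches filter f and is processed
            if row[f]:             # filter f drops it: later filters never see it
                break
    return total / len(drops)


# Hypothetical sample: 4 tuples, 3 filters.
times = [1.0, 2.0, 4.0]
drops = [
    (True, False, False),
    (False, True, True),
    (False, False, False),
    (True, True, False),
]

print(pipeline_cost([0, 1, 2], times, drops))  # 3.0
print(pipeline_cost([2, 1, 0], times, drops))  # 6.0
```

On this sample, placing the cheap filter first halves the average cost, which is the effect the ordering algorithms try to exploit.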

Algorithms
- The A-GREEDY algorithm
- The SWEEP algorithm
- The INDEPENDENT algorithm
- The LOCALSWAPS algorithm

(Static) Greedy algorithm Greedy approach
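The greedy approach can be sketched in Python as follows. This is an illustration, not the paper's pseudocode. It assumes the rule is: at each position, pick the remaining filter with the highest ratio of drop rate to per-tuple processing time, where the drop rate is measured only on tuples that survive the filters already chosen. `greedy_order`, `times`, and `drops` are hypothetical names.

```python
def drop_rate(f, alive):
    """Fraction of the still-alive sample tuples that filter f would drop."""
    return sum(row[f] for row in alive) / len(alive) if alive else 0.0


def greedy_order(times, drops):
    """Build an ordering position by position from a sample of tuples."""
    remaining = list(range(len(times)))
    alive = list(drops)            # sample tuples not dropped by chosen filters
    order = []
    while remaining:
        best = max(remaining, key=lambda f: drop_rate(f, alive) / times[f])
        order.append(best)
        remaining.remove(best)
        alive = [row for row in alive if not row[best]]
    return order


# Hypothetical correlated sample: filter 1 only drops tuples that filter 0
# already drops, so it is useless once filter 0 has run.
times = [1.0, 1.0, 1.0]
drops = [(k < 4, k < 3, k in (4, 5)) for k in range(8)]

print(greedy_order(times, drops))  # [0, 2, 1]
```

Ranking by unconditional drop rate would give `[0, 1, 2]` here; conditioning on the surviving tuples moves filter 2 ahead of filter 1.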

A-GREEDY Profiler
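A rough Python sketch of a profiler, assuming it samples incoming tuples, records for each sampled tuple which filters would drop it (a "profile tuple"), and keeps only the most recent ones in a sliding "profile window". The class name, parameters, and sampling scheme are assumptions made for illustration.

```python
import random
from collections import deque


class Profiler:
    def __init__(self, filters, window_size, sample_prob, rng=None):
        self.filters = filters                   # predicates: True means "tuple passes"
        self.window = deque(maxlen=window_size)  # oldest profile tuples fall out
        self.sample_prob = sample_prob
        self.rng = rng or random.Random()

    def maybe_profile(self, tup):
        """With probability sample_prob, evaluate every filter on tup and
        append the resulting drop vector to the profile window."""
        if self.rng.random() >= self.sample_prob:
            return False
        self.window.append(tuple(not f(tup) for f in self.filters))
        return True


# Usage with two hypothetical filters over integers.
profiler = Profiler([lambda x: x % 2 == 0, lambda x: x > 3],
                    window_size=3, sample_prob=1.0)
for x in range(5):
    profiler.maybe_profile(x)
print(list(profiler.window))
```

Evaluating every filter on a sampled tuple is extra work that the normal pipeline would skip, which is why profile-tuple creation shows up as a run-time overhead.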

A-GREEDY Reoptimizer violation
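A hedged Python sketch of how a reoptimizer might detect a violation from the profile window. It assumes a matrix view `V` where `V[i][j]` counts profile tuples that survive the filters in positions before `i` and are dropped by the filter in position `j`, and it assumes a violation at `(i, j)` means the filter at a later position `j` has a better drops-per-unit-time ratio than the one at position `i`. All names are illustrative.

```python
def matrix_view(order, window):
    """V[i][j]: tuples surviving positions < i that the filter at position j drops."""
    n = len(order)
    V = [[0] * n for _ in range(n)]
    for row in window:
        for i in range(n):
            for j in range(i, n):
                if row[order[j]]:
                    V[i][j] += 1
            if row[order[i]]:      # dropped at position i: stop following this tuple
                break
    return V


def first_violation(order, times, V):
    """Return the first (i, j) with j > i where position j beats position i."""
    n = len(order)
    for i in range(n):
        for j in range(i + 1, n):
            if V[i][i] / times[order[i]] < V[i][j] / times[order[j]]:
                return (i, j)
    return None


# Hypothetical profile window over three filters with equal processing times.
times = [1.0, 1.0, 1.0]
window = [(k < 4, k < 3, k in (4, 5)) for k in range(8)]

V = matrix_view([0, 1, 2], window)
print(V)                                     # [[4, 3, 2], [0, 0, 2], [0, 0, 2]]
print(first_violation([0, 1, 2], times, V))  # (1, 2)
```

A correction step would then move the better filter forward and recheck the later positions; that step is omitted here.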

A-GREEDY Algorithm (1/2)

A-GREEDY Algorithm (2/2)

A-GREEDY properties
Convergence
- Good
Run-time overhead
- Profile-tuple creation
- Profile-window maintenance
- Matrix-view update
- Detection and correction of GI (Greedy Invariant) violations
Adaptivity
- Rapid

The SWEEP Algorithm Rotating over

SWEEP properties
Convergence
- Good
Run-time overhead
- A-Greedy:
- Sweep:
Adaptivity
- Violations may remain undetected for a relatively long time
- Up to stages

The INDEPENDENT Algorithm
Assume the filters are independent
- An assumption made frequently in the database literature
- Seldom true in practice
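Under the independence assumption, an ordering can be computed from per-filter statistics alone. The Python sketch below assumes the rule is to sort filters by unconditional drop rate divided by per-tuple processing time, highest first; `independent_order`, `times`, and `drop_rates` are illustrative names.

```python
def independent_order(times, drop_rates):
    """Order filters by drop rate per unit of processing time, best first.

    drop_rates[i] is the unconditional fraction of tuples filter i drops;
    correlations between filters are ignored by construction.
    """
    return sorted(range(len(times)),
                  key=lambda i: drop_rates[i] / times[i],
                  reverse=True)


# Hypothetical statistics: ratios are 0.2, 0.4 and 0.225.
print(independent_order([1.0, 2.0, 4.0], [0.2, 0.8, 0.9]))  # [1, 2, 0]
```

Only one number per filter has to be maintained, which is consistent with the lower run-time overhead noted for this algorithm.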

INDEPENDENT properties
Convergence
- Independent filters: converges to the optimal ordering
- Dependent filters: cost can be a multiple of that of the GI ordering
Run-time overhead
- Lower than A-Greedy
Adaptivity
- Rapid

The LOCALSWAPS Algorithm
Detect
- situations where a swap between adjacent filters in the current ordering would improve performance
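A small Python sketch of one pass of adjacent-swap checks, assuming drop rates are estimated from a sample of tuples. For adjacent filters a then b, swapping lowers the expected cost exactly when b's drop rate per unit time, measured on tuples reaching that position, exceeds a's. The function and variable names are hypothetical.

```python
def local_swap_pass(order, times, window):
    """Scan adjacent pairs once, swapping whenever the later filter is better."""
    order = list(order)
    for i in range(len(order) - 1):
        a, b = order[i], order[i + 1]
        # Sample tuples that survive every filter placed before position i.
        alive = [row for row in window if not any(row[f] for f in order[:i])]
        if not alive:
            break
        rate_a = sum(row[a] for row in alive) / len(alive)
        rate_b = sum(row[b] for row in alive) / len(alive)
        if rate_b / times[b] > rate_a / times[a]:
            order[i], order[i + 1] = b, a
    return order


# Hypothetical sample over three filters with equal processing times.
times = [1.0, 1.0, 1.0]
window = [(k < 4, k < 3, k in (4, 5)) for k in range(8)]

print(local_swap_pass([1, 0, 2], times, window))  # [0, 2, 1]
```

Because only neighbours are compared, a filter that belongs several positions earlier moves one step per swap.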

LOCALSWAPS properties
Convergence
- Depends on the way characteristics change
- In some cases, cost can be a multiple of that of the GI ordering
Run-time overhead
- Lower than A-Greedy
Adaptivity
- May take more time to converge
- May get stuck in a local optimum

Experimental evaluation
Three parts
- Convergence experiments
- Run-time overhead experiments
- Adaptivity experiments

Convergence experiments
Factors
- Number of filters
- Filter selectivities
- Cost of filters
- Correlation among filters
Comparison
- Optimal
- A-Greedy
- Independent

Convergence experiments (number)
Factor: number of filters

Convergence experiments (selectivity)
Factor: filter selectivities

Convergence experiments (correlation)
Factor: correlation among filters

Run-time overhead experiments
Factors
- Number of filters
- Filter selectivities
- Cost of filters
Comparison
- Optimal
- A-Greedy
- Sweep
- LocalSwaps
- Independent

Run-time overhead experiments (time)
Factor: spending time ≥ 98%

Run-time overhead experiments (number)
Factor: number of filters

Run-time overhead experiments (cost)
Factor: cost of filters

Adaptivity experiments
Factors
- Rate of change
- Cost of filters
Comparison
- A-Greedy
- Sweep
- LocalSwaps
- Independent

Adaptivity experiments (speed)

Adaptivity experiments (rate)
Factor: rate of change

Adaptivity experiments (cost)
Tradeoff
- Run-time overhead ↔ adaptivity

Conclusion
Different points along the tradeoff spectrum
- Fast adaptivity: A-Greedy
- Low run-time overhead
  - Slow adaptivity, good convergence: Sweep
  - Independent filters: Independent
  - Swaps between adjacent filters, unpredictable convergence: LocalSwaps

Appendix

stable

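Correlation factor Γ
Filters are divided into groups
- Each group contains Γ filters
Filters in
- different groups are independent
- the same group produce the same result for 80% of input tuples
Extremes
- Γ equal to the number of filters (a single group): most correlated
- Γ = 1 (one filter per group): completely independent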
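One way to generate synthetic data with a correlation factor is sketched below in Python. It is an assumption about the setup, not the authors' generator: each group draws one shared drop outcome per tuple, and each filter in the group copies it with probability `agree` (0.8 would mirror the 80% figure) or otherwise draws its own outcome. All names and parameters are hypothetical.

```python
import random


def synth_drops(n, gamma, drop_prob, agree, count, seed=0):
    """Return `count` drop vectors for n filters split into groups of gamma."""
    if n % gamma != 0:
        raise ValueError("n must be a multiple of gamma")
    rng = random.Random(seed)
    rows = []
    for _ in range(count):
        row = []
        for _group in range(n // gamma):
            shared = rng.random() < drop_prob    # one outcome per group
            for _filter in range(gamma):
                if rng.random() < agree:
                    row.append(shared)           # follow the group outcome
                else:
                    row.append(rng.random() < drop_prob)  # independent draw
        rows.append(tuple(row))
    return rows


# Four filters in two groups of two, with fully agreeing groups for clarity.
rows = synth_drops(n=4, gamma=2, drop_prob=0.5, agree=1.0, count=50, seed=1)
print(rows[:3])
```

With `gamma=1` every filter forms its own group, so the drop vectors have independent components regardless of `agree`.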

Example 1

Example 2 For, Total cost = 20 Input total Cost

A-GREEDY: one of + + permutation of others; cost:
Example: … … 1, 2, 3, …, 100, 1, 2, …, 100, 1, 2, … …
INDEPENDENT: permutation of +; cost:

A-GREEDY: + permutation of others; cost:
Example 7: 1, 2, 3, …, 100, 1, 2, …, 100, 1, 2, …
LOCALSWAPS: cost: x x xx x x x Before, After, /100 ?