Presentation is loading. Please wait.

Presentation is loading. Please wait.

Freshness-Aware Scheduling of Continuous Queries in the Dynamic Web Mohamed A. Sharaf Alexandros Labrinidis Panos K. Chrysanthis Kirk Pruhs Advanced Data.

Similar presentations


Presentation on theme: "Freshness-Aware Scheduling of Continuous Queries in the Dynamic Web Mohamed A. Sharaf Alexandros Labrinidis Panos K. Chrysanthis Kirk Pruhs Advanced Data."— Presentation transcript:

1 Freshness-Aware Scheduling of Continuous Queries in the Dynamic Web Mohamed A. Sharaf Alexandros Labrinidis Panos K. Chrysanthis Kirk Pruhs Advanced Data Management Technologies Lab Department of Computer Science University of Pittsburgh HDMS 2005

2 HDMS 2005 -  Athens, Greece -  Sharaf, Labrinidis, Chrysanthis, Pruhs University of Pittsburgh 2 Motivation Price < $5

3 HDMS 2005 -  Athens, Greece -  Sharaf, Labrinidis, Chrysanthis, Pruhs University of Pittsburgh 3 Motivation Web Server Wish List 1 Price < $5 Title = Star Wars I Price < $5 TP T Wish List 2 P Price < $7 Input data streams Price = $8   Price = $4  Price = $8Price = $4Price = $2  Price < $5  Price = $5

4 HDMS 2005 -  Athens, Greece -  Sharaf, Labrinidis, Chrysanthis, Pruhs University of Pittsburgh 4 Continuous Queries A continuous query is a long-running standing query whose execution is triggered with the arrival of new data The output of a continuous query is a continuous data stream (e.g., e-mails or update personalized Web page) A Web server should propagate updates to users as soon as they become available, however delays occur due to: 1) Time spent by a query processing updates 2) Time spent by a query waiting to be executed

5 HDMS 2005 -  Athens, Greece -  Sharaf, Labrinidis, Chrysanthis, Pruhs University of Pittsburgh 5 Scheduling Multiple Continuous Queries The execution order of continuous queries determines the overall behavior of the system. For example: In Aurora: the Minimum Latency scheduler reduces response time [Carney et. al., VLDB’03] In STREAM: the Chain scheduler minimizes memory usage [Babcock et. al., SIGMOD’03] Problem Statement: Devise a policy for scheduling the execution of multiple continuous queries (MCQ) with the objective of maximizing the overall quality of data (QoD) of output data streams

6 HDMS 2005 -  Athens, Greece -  Sharaf, Labrinidis, Chrysanthis, Pruhs University of Pittsburgh 6 Outline Motivation Quality of Data (QoD) Freshness-Aware Scheduling of MCQ (FAS-MCQ) Experimental Evaluation Conclusions and Future Work

7 HDMS 2005 -  Athens, Greece -  Sharaf, Labrinidis, Chrysanthis, Pruhs University of Pittsburgh 7 Quality of Data QoD based on freshness (deviation from the ideal) At any time instance, the output data stream is fresh when it matches the ideal one, otherwise it is stale TxTx TyTy t1t1 t2t2 t3t3 Delay (t i )= wait time + processing time Freshness = 1- (t 1 + t 2 + t 3 )/(T y - T x ) Ideal output data stream Actual output data stream Fresh Stale

8 HDMS 2005 -  Athens, Greece -  Sharaf, Labrinidis, Chrysanthis, Pruhs University of Pittsburgh 8 Outline Motivation Quality of Data (QoD) Freshness-Aware Scheduling of MCQ (FAS-MCQ) Experimental Evaluation Conclusions and Future Work

9 HDMS 2005 -  Athens, Greece -  Sharaf, Labrinidis, Chrysanthis, Pruhs University of Pittsburgh 9 How updates are processed Q1Q1 N1N1 W1W1 C1C1 S 1 =1 Cost of executing operator Q1 Selectivity of operator Q1 Wait time of top update in input queue for operator Q1 Number of pending updates Incoming updates Operator Q1

10 HDMS 2005 -  Athens, Greece -  Sharaf, Labrinidis, Chrysanthis, Pruhs University of Pittsburgh 10 Freshness-Aware Scheduling of MCQ (FAS-MCQ) Compute loss in freshness (L) under two policies: Policy X: first Q 1 then Q 2 Policy Y: first Q 2 then Q 1 L X = (W 1 + N 1 *C 1 ) + (W 2 + N 2 *C 2 + N 1 *C 1 ) Current loss in freshness Loss until processing pending tuples Q 1 ’s loss Loss due to waiting for Q 1 to finish execution Q 2 ’s loss Q1Q1 N1N1 W1W1 C1C1 Q2Q2 N2N2 W2W2 C2C2 S 1 =1S 2 =1

11 HDMS 2005 -  Athens, Greece -  Sharaf, Labrinidis, Chrysanthis, Pruhs University of Pittsburgh 11 Freshness-Aware Scheduling of MCQs Under policy X: first Q 1 then Q 2 L X = (W 1 + N 1 *C 1 ) + (W 2 + N 1 *C 1 + N 2 *C 2 ) Under policy Y: first Q 2 then Q 1 L Y = (W 2 + N 2 *C 2 ) + (W 1 + N 2 *C 2 + N 1 *C 1 ) For L X N 1 * C 1 < N 2 * C 2 Q1Q1 N1N1 W1W1 C1C1 Q2Q2 N2N2 W2W2 C2C2 Priority of Q i = 1/(N i * C i ) S 1 =1S 2 =1

12 HDMS 2005 -  Athens, Greece -  Sharaf, Labrinidis, Chrysanthis, Pruhs University of Pittsburgh 12 Impact of Selectivity A query is a tree of operators Each query operator is associated with: Cost (c): processing time Selectivity (s): probability of producing an output after processing an input update Maximum cost: C = c 1 + c 2 + c 3 Total selectivity: S = s 1 * s 2 * s 3 Average/Expected Cost: C avg = c 1 + (c 2 * s 1 ) + (c 3 * s 1 * s 2 ) 1 2 3

13 HDMS 2005 -  Athens, Greece -  Sharaf, Labrinidis, Chrysanthis, Pruhs University of Pittsburgh 13 Selectivity-Aware FAS-MCQ Priority of Q i = 1/(N i * C i avg ) But we need to consider selectivity: If S i = 0, no appending and the output data stream is fresh If S i = 1, appending and the output data stream is stale Compute the staleness probability (P i ) = 1 - (1-S i ) N i Q1Q1 N1N1 W1W1 C 1 avg Q2Q2 N2N2 W2W2 C 2 avg S1S1 S2S2 Priority of Q i = P i / (N i * C i avg )

14 HDMS 2005 -  Athens, Greece -  Sharaf, Labrinidis, Chrysanthis, Pruhs University of Pittsburgh 14 Summary Priority of Q i = P i / (N i * C i avg ) FAS-MCQ behaves as follows: If all queries have the same P (staleness probability) and N (number of pending updates), then FAS-MCQ selects the query with the lowest cost If all queries have the same P and C (expected cost) then FAS-MCQ selects the query with the lowest number of pending updates If all queries have the same N and C, then FAS-MCQ selects the query with the highest staleness probability

15 HDMS 2005 -  Athens, Greece -  Sharaf, Labrinidis, Chrysanthis, Pruhs University of Pittsburgh 15 Intuitions underlying FAS-MCQ Priority of Q i = P i / (N i * C i avg ) The priority of a query increases if it has: Small processing cost (C i ), Small number of pending updates (N i ), High staleness probability (P i )

16 HDMS 2005 -  Athens, Greece -  Sharaf, Labrinidis, Chrysanthis, Pruhs University of Pittsburgh 16 Outline Motivation Quality of Data (QoD) Freshness-Aware Scheduling of MCQ (FAS-MCQ) Experimental Evaluation Conclusions and Future Work

17 HDMS 2005 -  Athens, Greece -  Sharaf, Labrinidis, Chrysanthis, Pruhs University of Pittsburgh 17 Simulation Testbed Simulated the execution of 250 continuous queries with variable costs and selectivities 10 input data streams following Poisson distribution Half of the input streams are bursty Experiments to show: Impact of utilization, Impact of Selectivity, (Impact of burstiness), Fairness

18 HDMS 2005 -  Athens, Greece -  Sharaf, Labrinidis, Chrysanthis, Pruhs University of Pittsburgh 18 Scheduling Algorithms FAS-MCQ: Freshness-Aware Scheduling of Multiple Continuous Queries RR: Round-Robin (Aurora) SRPT: Shortest Remaining Processsing Time FCFS: First-Come First-Served

19 HDMS 2005 -  Athens, Greece -  Sharaf, Labrinidis, Chrysanthis, Pruhs University of Pittsburgh 19 Impact of Utilization (Selectivity = 1) 22% Quality of Data

20 HDMS 2005 -  Athens, Greece -  Sharaf, Labrinidis, Chrysanthis, Pruhs University of Pittsburgh 20 Impact of Selectivity 50% Quality of Data

21 HDMS 2005 -  Athens, Greece -  Sharaf, Labrinidis, Chrysanthis, Pruhs University of Pittsburgh 21 Fairness Standard Deviation for QoD

22 HDMS 2005 -  Athens, Greece -  Sharaf, Labrinidis, Chrysanthis, Pruhs University of Pittsburgh 22 Conclusions We proposed a policy for freshness-aware scheduling of multiple continuous queries. Our policy exploits the properties of continuous queries (i.e., cost and selectivity) as well as the properties of input data streams (variability of updates) We showed experimentally that our proposed policy outperforms the traditional scheduling policies Future: study multi-stream queries and different definitions for QoD

23 HDMS 2005 -  Athens, Greece -  Sharaf, Labrinidis, Chrysanthis, Pruhs University of Pittsburgh 23   Advanced Data Management Technologies Lab http://db.cs.pitt.edu


Download ppt "Freshness-Aware Scheduling of Continuous Queries in the Dynamic Web Mohamed A. Sharaf Alexandros Labrinidis Panos K. Chrysanthis Kirk Pruhs Advanced Data."

Similar presentations


Ads by Google