Efficient Scheduling of Heterogeneous Continuous Queries Mohamed A. Sharaf Panos K. Chrysanthis Alexandros Labrinidis Kirk Pruhs Advanced Data Management Technologies Lab Department of Computer Science University of Pittsburgh VLDB 2006

Motivating Example
Tell me when there are airplane tickets such that:
  Itinerary: Pittsburgh -> Korea -> Pittsburgh
  Dates: September 8 -> September 16
  Price < $1200
This is a form of Continuous Query (CQ):
  CQs are registered ahead of time
  Arrival of new data triggers execution
CQs support monitoring applications.

Data Stream Management System (DSMS)
DSMS = Database system + Online system
Our Goal: Improve the online performance of a DSMS
[Figure: DSMS architecture. Input data streams feed continuous queries Q_1 ... Q_n, each drawn as a chain of operators 1, 2, 3, producing output data streams D_1 ... D_n. Components shown: Query Scheduler, Load Shedder, Memory Manager, Query Optimizer.]

Need for Query Scheduling
The execution order of continuous queries determines the overall behavior of the system, e.g., memory usage [Babcock et al., SIGMOD’03]
Traditionally:
  One operator per thread
  Resource management done by the OS
Problems:
  No objective for optimization
  Does not exploit query semantics

Scheduling Multiple Continuous Queries (MCQ)
Given:
  A set of n queries ready to execute (queries with pending updates)
  A certain metric to optimize
Then:
  The MCQ Scheduler decides the execution order of the n queries so as to optimize the given metric
[Figure: queries CQ_1, CQ_2, ..., CQ_n, each drawn as a chain of operators 1, 2, 3.]

Outline
Introduction
Scheduling for Quality of Service (QoS)
  Average response time
  Average slowdown
  Balancing the trade-off between average and worst case
Implementation issues
Conclusions

Response Time
The response time of a tuple is the time between its arrival at the DSMS and its departure
Tuples that are filtered out (discarded) during query processing do not contribute to the metric
Shortest Remaining Processing Time (SRPT) is the policy used to optimize response time in Web servers
Would SRPT optimize response time for multiple CQs? No, because it does not exploit the characteristics of CQs.

Impact of Selectivity
Selectivity of a query (S) is the probability of producing an output tuple after processing an input tuple (i.e., detecting a related event)
  S = 0.1: 10 input tuples -> 1 output event
  S = 1.0: 10 input tuples -> 10 output events
If two queries have the same cost, the one with higher selectivity produces more tuples per time unit (higher Output Rate).

Impact of Output Rate
Q_1: S_1 = 1.0 and C_1 = 1 ms, so OR_1 = 1.0
Q_2: S_2 = 0.2 and C_2 = 1 ms, so OR_2 = 0.2
5 pending tuples arrived at time 0
[Figure: execution timelines from 0 to 10 ms for the two orders, "Q_1 then Q_2" and "Q_2 then Q_1", with a table comparing the response time (RT) of the two orders.]
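The effect of execution order on response time can be sketched numerically. The snippet below is an illustration, not the authors' simulator: it assumes each query has 5 pending tuples, that every tuple of a query passes independently with probability equal to the selectivity, and it reports the expected total response time divided by the expected number of output tuples. The function name `expected_avg_response_time` is hypothetical.

```python
def expected_avg_response_time(order, n_pending):
    """order: list of (selectivity, cost_ms) pairs in execution order.

    All tuples are assumed to arrive at time 0, so a tuple's response
    time equals the clock value when its processing finishes.
    """
    clock = 0.0
    weighted_rt = 0.0       # finish times weighted by P(tuple is output)
    expected_outputs = 0.0  # expected number of output tuples
    for selectivity, cost_ms in order:
        for _ in range(n_pending):
            clock += cost_ms
            weighted_rt += selectivity * clock
            expected_outputs += selectivity
    return weighted_rt / expected_outputs


q1 = (1.0, 1.0)  # S_1 = 1.0, C_1 = 1 ms
q2 = (0.2, 1.0)  # S_2 = 0.2, C_2 = 1 ms

q1_first = expected_avg_response_time([q1, q2], n_pending=5)
q2_first = expected_avg_response_time([q2, q1], n_pending=5)
print(q1_first < q2_first)  # True: running the higher-rate query first helps
```

Under these assumptions, running Q_1 first gives roughly 3.8 ms against roughly 7.2 ms for the reverse order, because the productive query's tuples no longer wait behind tuples that are mostly filtered out.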

Highest Rate Policy
Assign each query a priority equal to its output rate
The output rate of a query = selectivity / cost
How to compute the output rate of a query with more than one operator?
[Figure: a query drawn as a chain of operators 1, 2, 3.]
At each scheduling point, schedule the query with the highest global output rate: the Highest Rate (HR) policy
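A minimal sketch of the selection step, assuming each ready query is described only by a selectivity and a per-tuple cost. The `Query` class and `pick_highest_rate` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Query:
    name: str
    selectivity: float  # probability an input tuple yields an output tuple
    cost_ms: float      # processing cost per input tuple

    @property
    def output_rate(self) -> float:
        # Output rate as defined on the slide: selectivity / cost.
        return self.selectivity / self.cost_ms


def pick_highest_rate(ready):
    """Return the ready query with the highest output rate."""
    return max(ready, key=lambda q: q.output_rate)


ready = [Query("Q1", 1.0, 1.0), Query("Q2", 0.2, 1.0)]
chosen = pick_highest_rate(ready)
print(chosen.name)  # Q1
```

A linear scan is enough for a sketch; a real scheduler with many queries would more likely keep the ready queries in a priority queue.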

Simulation Testbed
Developed a DSMS simulator in C++
Policies for multi-query scheduling:
  Round Robin (RR; Aurora)
  Highest Rate (HR)
  First Come First Serve (FCFS)
  Shortest Remaining Processing Time (SRPT)
Input traces from Internet traffic
Generate 500 continuous queries:
  select-join-project
  Uniform distribution of costs and selectivities
  Assigned costs and selectivities determine the system’s utilization (or load)

Results: Average Response Time
[Figure: plot of average response time for the compared policies; annotated differences: 65% and 73%.]

Outline
Introduction
Scheduling for Quality of Service (QoS)
  Average response time
  Average slowdown
  Balancing the trade-off between average and worst case
Implementation issues
Conclusions

Slowdown
Slowdown (or stretch) [Mehta & DeWitt, VLDB’93]: the ratio of a tuple’s response time to its ideal processing time, i.e., its processing time if it were the only tuple in the system
Slowdown is fairer than response time:
  It relates response time to demand: tuples for an expensive query are expected to stay longer, as they contribute more to the load
Ideally, slowdown = 1
Slowdown increases with increasing load

SRPT vs. HR
In Web servers, SRPT is:
  Optimal for response time, and
  Near optimal for slowdown
  Short jobs spend a shorter time in the system
In DSMSs:
  HR minimizes average response time, but what about average slowdown?
  Is it possible under HR for short queries to experience high slowdown, leading to an overall high slowdown?
  Queries with low selectivity are penalized!

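Example
Q_1: S_1 = 1.0 and C_1 = 5 ms, so OR_1 = 0.2
Q_2: S_2 = 0.33 and C_2 = 2 ms, so OR_2 = 0.33 / 2 = 0.165
Pending tuples arrived at time 0 (the timelines show three executions per query)
HR policy (Q_1 first): the three Q_1 output tuples have SD = 1, 2, 3; the Q_2 output tuple has SD = 9.5
Another policy (Q_2 first): the Q_2 output tuple has SD = 2; the three Q_1 output tuples have SD = 2.2, 3.2, 4.2
Averages over the output tuples:
  HR: average SD = (1 + 2 + 3 + 9.5) / 4, about 3.9; average RT = (5 + 10 + 15 + 19) / 4 = 12.25 ms
  Other: average RT = 13 ms; average SD = 2.9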
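The per-tuple slowdowns in this example can be reproduced with a short simulation. This is a sketch under stated assumptions: three pending tuples per query, all arriving at time 0; every Q_1 tuple is output; only the second Q_2 tuple is output (the choice that matches SD = 9.5 and SD = 2 on the slide); and the ideal processing time of a single-operator query equals its cost. The names `simulate` and `mean` are hypothetical.

```python
def simulate(order):
    """order: list of (cost_ms, emitted_flags), one flag per pending tuple.

    Returns the response times and slowdowns of the output tuples.
    """
    clock = 0.0
    response_times, slowdowns = [], []
    for cost_ms, emitted_flags in order:
        for emitted in emitted_flags:
            clock += cost_ms
            if emitted:  # filtered-out tuples do not count toward the metrics
                response_times.append(clock)
                slowdowns.append(clock / cost_ms)
    return response_times, slowdowns


def mean(values):
    return sum(values) / len(values)


q1 = (5.0, [True, True, True])    # S_1 = 1.0, C_1 = 5 ms
q2 = (2.0, [False, True, False])  # S_2 = 0.33: assume the second tuple passes

hr_rt, hr_sd = simulate([q1, q2])        # HR runs Q_1 first (OR_1 > OR_2)
other_rt, other_sd = simulate([q2, q1])  # the alternative order

print(hr_sd)                                          # [1.0, 2.0, 3.0, 9.5]
print(mean(hr_rt), round(mean(hr_sd), 3))             # 12.25 3.875
print(mean(other_rt), round(mean(other_sd), 1))       # 13.0 2.9
```

HR wins on average response time (12.25 ms against 13 ms) but loses on average slowdown (about 3.9 against 2.9), which is the trade-off the slide illustrates.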

Parameters for Scheduling
For a query Q_x with three operators, with costs c_1, c_2, c_3 and selectivities s_1, s_2, s_3:
S_x = s_1 * s_2 * s_3
C_x^avg = c_1 + (c_2 * s_1) + (c_3 * s_1 * s_2)
C_x = cost of detecting an event = c_1 + c_2 + c_3 = ideal processing time
W_x = the current wait time of the oldest tuple in the input queue of Q_x
[Figure: a query drawn as a chain of three operators.]
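These parameters generalize naturally from three operators to a chain of any length. The sketch below assumes a linear chain in which an operator only processes tuples that survived all upstream operators; the function name `query_parameters` and the sample numbers are hypothetical.

```python
def query_parameters(costs, selectivities):
    """Return (S_x, C_x_avg, C_x) for a linear chain of operators.

    costs[i] and selectivities[i] describe the i-th operator in the chain.
    """
    surviving = 1.0  # fraction of input tuples reaching the current operator
    avg_cost = 0.0   # C_x^avg: expected cost per input tuple
    for cost, selectivity in zip(costs, selectivities):
        avg_cost += cost * surviving
        surviving *= selectivity
    ideal_time = sum(costs)  # C_x: cost of a tuple that passes every operator
    return surviving, avg_cost, ideal_time


# Illustrative three-operator query (numbers are made up).
s_x, c_avg, c_x = query_parameters([1.0, 2.0, 3.0], [0.5, 0.5, 1.0])
print(s_x, c_avg, c_x)  # 0.25 2.75 6.0
```

Note that C_x^avg is never larger than C_x: filtering early means later operators run on fewer tuples.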

Scheduling for Slowdown (1)
Compute slowdown (H) under two policies:
  Policy X: first Q_1 then Q_2
  Policy Y: first Q_2 then Q_1
[Figure: Q_1 (wait time W_1, processing time C_1^avg, selectivity S_1) emits tuple t_1; Q_2 (W_2, C_2^avg, S_2) emits tuple t_2. Annotations on the slowdown expression: probability that t_1 is produced, wait time, processing time, extra wait time for Q_1 to finish execution; t_1's slowdown is normalized by C_1 and t_2's slowdown by C_2.]

Scheduling for Slowdown (2)
Under policy X: first Q_1 then Q_2
Under policy Y: first Q_2 then Q_1
[Figure: the slowdown expressions H_X and H_Y for the two policies, in terms of W_1, W_2, C_1, C_2, C_1^avg, C_2^avg, S_1, S_2, followed by the condition for H_X < H_Y.]
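One way to read these two slides as algebra is sketched below. It is a reconstruction under assumptions, not a copy of the slide: each tuple's slowdown is weighted by the probability that the tuple is produced (S_1, S_2), the query scheduled second waits for the average processing time of the query scheduled first, and H is the sum of the two weighted slowdowns.

```latex
\begin{align*}
H_X &= S_1\,\frac{W_1 + C_1}{C_1}
     + S_2\,\frac{W_2 + C_1^{avg} + C_2}{C_2} \\
H_Y &= S_2\,\frac{W_2 + C_2}{C_2}
     + S_1\,\frac{W_1 + C_2^{avg} + C_1}{C_1} \\
H_X < H_Y
    &\iff \frac{S_2\,C_1^{avg}}{C_2} < \frac{S_1\,C_2^{avg}}{C_1}
     \iff \frac{S_1}{C_1^{avg}\,C_1} > \frac{S_2}{C_2^{avg}\,C_2}
\end{align*}
```

Under this reading, the wait times W_1 and W_2 cancel, and Q_1 should run first exactly when its output rate divided by its ideal processing time is the larger of the two.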

Scheduling for Slowdown (3)
S_x / C_x^avg is the output rate (OR_x) of Q_x
C_x is the ideal processing time of a tuple produced by Q_x
Priority of Q_x = OR_x / C_x = S_x / (C_x^avg * C_x)
Our Highest Normalized Rate (HNR) policy emphasizes the ideal processing time of the tuple
  Inexpensive queries with low productivity are not penalized
  For equal costs (C_i = 1): HNR = HR
  For selectivity 1 (S_i = 1): HNR = SRPT
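A small sketch of the HNR priority, assuming it is the output rate S_x / C_x^avg divided by the ideal processing time C_x, which is the form consistent with the two special cases listed on the slide. The function name `hnr_priority` is hypothetical; the two queries reuse the numbers from the earlier slowdown example (single-operator queries, so C_x^avg = C_x).

```python
def hnr_priority(selectivity, avg_cost, ideal_time):
    """Assumed HNR priority: output rate normalized by ideal processing time."""
    return selectivity / (avg_cost * ideal_time)


def hr_priority(selectivity, avg_cost):
    """HR priority: the plain output rate."""
    return selectivity / avg_cost


# Q_1: S = 1.0, C = 5 ms.  Q_2: S = 0.33, C = 2 ms.
q1_hr, q2_hr = hr_priority(1.0, 5.0), hr_priority(0.33, 2.0)
q1_hnr, q2_hnr = hnr_priority(1.0, 5.0, 5.0), hnr_priority(0.33, 2.0, 2.0)

print(q1_hr > q2_hr)    # True: HR runs the expensive, productive query first
print(q2_hnr > q1_hnr)  # True: HNR runs the inexpensive query first
```

With unit ideal time the two priorities coincide, and with selectivity 1 on a single operator the HNR priority becomes 1 / C_x^2, which orders queries shortest-first as SRPT does.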

Results: Average Slowdown
[Figure: plot of average slowdown for the compared policies; annotated difference: 20%.]

Outline
Introduction
Scheduling for Quality of Service (QoS)
  Average response time
  Average slowdown
  Balancing the trade-off between average and worst case
Implementation issues
Conclusions

Worst-Case Performance
Queries/events may experience starvation
  Especially queries with low selectivity and/or high cost
Typically measured using:
  maximum response time, or
  maximum slowdown
Maximum slowdown (or response time):
  Is a very sensitive metric
  Does not consider the average-case performance

Trade-off between Average Case and Worst Case
Maximum slowdown = worst-case performance
Average slowdown = average-case performance
We need to look at both metrics at the same time
The L_p norm of slowdowns captures both metrics
L_2 norm of N tuples = sqrt(H_1^2 + H_2^2 + ... + H_N^2), where H_i is the slowdown of tuple i
  It takes into account all values
  It penalizes outliers
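To see how the L_2 norm sits between the average and the maximum, the sketch below applies the standard definition (square root of the sum of squares) to the per-tuple slowdowns from the earlier two-query example. The function name `l2_norm` is hypothetical.

```python
import math


def l2_norm(values):
    """Standard L2 norm: square root of the sum of squared values."""
    return math.sqrt(sum(v * v for v in values))


hr_slowdowns = [1.0, 2.0, 3.0, 9.5]     # HR policy in the earlier example
other_slowdowns = [2.0, 2.2, 3.2, 4.2]  # the alternative policy

for name, sds in [("HR", hr_slowdowns), ("Other", other_slowdowns)]:
    avg = sum(sds) / len(sds)
    print(name, round(avg, 2), max(sds), round(l2_norm(sds), 2))
```

The single outlier of 9.5 dominates the L_2 norm of the first schedule (about 10.2 against about 6.1), even though the averages differ by only about one unit; that sensitivity to outliers is what the slide refers to.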

Scheduling for the L_2 Norm of Slowdowns
Balance Slowdown (BSD) policy
Priority of Q_x = (S_x / (C_x^avg * C_x)) * (W_x / C_x), i.e., normalized rate times current slowdown
A query is scheduled either because:
  It has a high normalized rate, or
  Its pending tuples have accumulated a high slowdown
All users are satisfied = Fairness
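The interplay of the two factors can be sketched as follows, assuming the BSD priority is the product of the normalized rate S_x / (C_x^avg * C_x) and the current slowdown W_x / C_x, as the slide's two labels suggest. The function name `bsd_priority` and the sample numbers are hypothetical.

```python
def bsd_priority(selectivity, avg_cost, ideal_time, wait_time):
    """Assumed BSD priority: normalized rate times current slowdown."""
    normalized_rate = selectivity / (avg_cost * ideal_time)
    current_slowdown = wait_time / ideal_time  # W_x / C_x
    return normalized_rate * current_slowdown


# A productive, cheap query whose oldest tuple has barely waited.
fresh = bsd_priority(selectivity=1.0, avg_cost=1.0, ideal_time=1.0, wait_time=1.0)

# An unproductive, costlier query whose oldest tuple has waited a long time.
starved = bsd_priority(selectivity=0.1, avg_cost=2.0, ideal_time=2.0, wait_time=100.0)

print(starved > fresh)  # True: accumulated wait eventually outweighs a low rate
```

Because W_x keeps growing while a query is not scheduled, its priority grows too, which is how this form of the policy avoids starving low-rate queries.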

Results: Balancing the trade-off
[Figure: two plots, maximum slowdown and average slowdown, for the compared policies; annotated differences: 77% and 31%.]

Results: L_2 Norm of Slowdowns
[Figure: plot of the L_2 norm of slowdowns for the compared policies; annotated difference: 24%.]

Slowdown per Class (same cost queries)

Outline
Introduction
Scheduling for Quality of Service (QoS)
Implementation issues
  Scheduling overhead
  Shared operators (details in paper)
Conclusions

Optimization Methods
[Figure: plot of the ratio L_2 SD of BSD-Logarithmic / L_2 SD of BSD-Hypothetical.]
BSD-Hypothetical = BSD without overhead

Conclusions
In this talk, we presented:
  QoS metrics for evaluating the performance of a DSMS
  Scheduling policies that exploit the properties of CQs
Policies to improve QoS:
  Highest Rate (HR) for average response time
  Highest Normalized Rate (HNR) for average slowdown
  Balance Slowdown (BSD) for balancing the trade-off between average- and worst-case performance
We addressed implementation issues to ensure the applicability of our proposed policies
We empirically evaluated the gains provided by the proposed policies compared to existing policies

Thank You
Questions?
Thanks: NSF IIS (AQSIOS Project)