PSoup Kevin Menard CS 561 4/11/2005. Streaming Queries over Streaming Data Sirish Chandrasekaran UC Berkeley August 20, 2002 with Michael J. Franklin.


1 PSoup Kevin Menard CS 561 4/11/2005

2 Slides are modified versions of the following original presentation: Streaming Queries over Streaming Data, Sirish Chandrasekaran with Michael J. Franklin, UC Berkeley, VLDB 2002, August 20, 2002.

3 PSoup Insight #1: Queries and data are duals. Store new queries and apply them to data that arrived earlier; store new data and apply it to queries that arrived earlier. Multi-query processing = “join” of queries and data. Supports all three types of queries: queries over the past (landmark and sliding window), continuous, and hybrid. Diagram labels: Data, Index, Result, Queries, Query Index.

4 PSoup Insight #1: Queries and data are duals. Store new queries and apply them to data that arrived earlier; store new data and apply it to queries that arrived earlier. Multi-query processing = “join” of queries and data. Supports all three types of queries: queries over the past (landmark and sliding window), continuous, and hybrid. Diagram labels: Index, Data, Result, Data, Queries.

5 Motivation: Why another model for continuous queries? What is wrong with how Aurora and STREAM supply responses?

6 Motivation: Disconnected Operation. Previous solutions stream out answers immediately, which is not feasible or suitable for all applications. Intermittent connectivity: e.g., applications on hand-held devices (as in this morning’s keynote address). Even if connected, clients are not always interested in streaming answers.

7 PSoup Insight #2: Separate computation from delivery. Query answers are continuously generated in the background; windows are applied on demand to transmit the “current” results. This gives efficient support for disconnected operation: low response time, and computation and storage shared across invocations. Diagram: clients Register and later Invoke queries; a Data table (ID, R.a, R.b) and a Query table (ID, Predicate) feed a Results Structure of T/F entries indexed by Queries and Data.

8 PSoup Query Model: SELECT select_list FROM from_list WHERE where_clause BEGIN begin_time END end_time. WHERE clause: a conjunction of boolean factors. BEGIN-END clause: expressed in system clock time or sequence numbers. (begin_time, end_time) = (constant, constant): snapshot query; (constant, variable): landmark window query; (variable, variable): sliding window query.
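The three window types on this slide differ only in whether each endpoint is a constant or moves with the clock. A minimal Python sketch of that classification follows; the `Now` class and the function names are hypothetical illustrations, not part of PSoup.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Now:
    """Hypothetical marker for a variable endpoint: current time plus an offset."""
    offset: int = 0  # e.g. Now(-10) means "10 time units before now"


Endpoint = Union[int, Now]  # an int is a constant timestamp or sequence number


def window_kind(begin: Endpoint, end: Endpoint) -> str:
    """Classify a BEGIN-END clause as on the slide."""
    begin_var = isinstance(begin, Now)
    end_var = isinstance(end, Now)
    if not begin_var and not end_var:
        return "snapshot"
    if not begin_var and end_var:
        return "landmark"
    if begin_var and end_var:
        return "sliding"
    raise ValueError("(variable, constant) is not one of the listed window types")


def resolve(begin: Endpoint, end: Endpoint, now: int) -> Tuple[int, int]:
    """Turn a BEGIN-END clause into concrete bounds at invocation time."""
    def fix(e: Endpoint) -> int:
        return now + e.offset if isinstance(e, Now) else e
    return fix(begin), fix(end)
```

For example, `window_kind(Now(-10), Now())` classifies a window covering the last 10 time units as sliding, and `resolve` evaluates it against whatever the clock reads when the query is invoked.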

9 Query Registration: SELECT select_list FROM from_list WHERE where_clause BEGIN begin_time END end_time. The SELECT-FROM-WHERE part is the Standing Query Clause (SQC), which goes to the symmetric join; the BEGIN-END clause goes to the Windows_Table. QueryID: a handle for future query invocations.

10 Selections over Single Stream: Arrival of New Query Specification. (a) Initial state. Data Store, as ID (R.a, R.b): 48 (4, 3); 49 (7, 3); 50 (3, 8); 51 (0, 0); 52 (8, 4). Query Store, as ID: predicate: 20: 0<R.a<=5; 21: R.a>4 and R.b=3; 22: 0>R.b>4; 23: R.a=4 and R.b=3.

11 Selections over Single Stream: Arrival of New Query Specification. (b) Arrival of new query: SELECT * FROM R WHERE R.a = 3. Data Store and Query Store are unchanged.

12 Selections over Single Stream: Arrival of New Query Specification. (c) Building Query Store: the new query is inserted (BUILD) into the Query Store as 24: R.a=3.

13 Selections over Single Stream: Arrival of New Query Specification. (d) Probing Data Store: query 24 (R.a=3) is used to PROBE the Data Store for matching tuples.

14 Selections over Single Stream: Arrival of New Query Specification. (e) Inserting results: the Results Structure has one row per data ID (48-52) and one column per query ID (20-24); the entries being inserted are still shown as “?”. Tuples shown as matches: 48 (4, 3) and 50 (3, 8).

15 Selections over Single Stream: Arrival of New Query Specification. (e) Inserting results: the inserted entries for data 48, 49, 50, 51, 52 are T, F, T, F, F. Tuples shown as matches: 48 (4, 3) and 50 (3, 8).
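The build, probe, and insert steps for a newly arrived query can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the tuple values are read off the garbled slide table, the Data Store is scanned linearly rather than through an index, and `register_query` is a hypothetical name.

```python
# Data Store as read from slide 10: data ID -> tuple of stream R.
data_store = {
    48: {"a": 4, "b": 3},
    49: {"a": 7, "b": 3},
    50: {"a": 3, "b": 8},
    51: {"a": 0, "b": 0},
    52: {"a": 8, "b": 4},
}
query_store = {}  # query ID -> predicate over one R tuple
results = {}      # (data ID, query ID) -> bool


def register_query(qid, predicate):
    """Store a new query, then apply it to data that arrived earlier."""
    query_store[qid] = predicate             # BUILD into the Query Store
    matches = []
    for did, row in data_store.items():      # PROBE the Data Store (linear scan here)
        hit = predicate(row)
        results[(did, qid)] = hit            # INSERT into the Results Structure
        if hit:
            matches.append(did)
    return matches


# The new query from slide 11: SELECT * FROM R WHERE R.a = 3
matches = register_query(24, lambda r: r["a"] == 3)
```

A real system would probe an index over the stored data instead of scanning every tuple; the scan is used only to keep the sketch short.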

16 Selections over Single Stream: Arrival of New Data. (a) Initial state. Data Store, as ID (R.a, R.b): 48 (4, 3); 49 (7, 3); 50 (3, 8); 51 (0, 0); 52 (8, 4). Query Store, as ID: predicate: 20: 0<R.a<=5; 21: R.a>4 and R.b=3; 22: 0>R.b>4; 23: R.a=4 and R.b=3; 24: R.a=3.

17 Selections over Single Stream: Arrival of New Data. (b) Arrival of new data: tuple 53 (R.a=3, R.b=6). Data Store and Query Store are unchanged.

18 Selections over Single Stream: Arrival of New Data. (c) Building Data Store: the new tuple 53 (3, 6) is inserted (BUILD) into the Data Store.

19 Selections over Single Stream: Arrival of New Data. (d) Probing Query Store: tuple 53 (3, 6) is used to PROBE the Query Store for matching queries.

20 Selections over Single Stream: Arrival of New Data. (e) Inserting results: the Results Structure gains a row for data 53, with the entries for queries 20-24 still shown as “?”. Matching queries: 24 (R.a=3) and 20 (0<R.a<=5).

21 Selections over Single Stream: Arrival of New Data. (e) Inserting results: row 53 of the Results Structure is T, F, F, F, T for queries 20, 21, 22, 23, 24. Matching queries: 24 (R.a=3) and 20 (0<R.a<=5).
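The dual case, where a new tuple probes the stored queries, can be sketched the same way. The predicates below are copied from the Query Store on the slides (including 0>R.b>4 exactly as written); `insert_tuple` is a hypothetical name and the Query Store is scanned linearly, which is an assumption of this sketch rather than how a query index would work.

```python
# Query Store as shown on slide 16: query ID -> predicate over one R tuple.
query_store = {
    20: lambda r: 0 < r["a"] <= 5,
    21: lambda r: r["a"] > 4 and r["b"] == 3,
    22: lambda r: 0 > r["b"] > 4,   # as written on the slide; never true
    23: lambda r: r["a"] == 4 and r["b"] == 3,
    24: lambda r: r["a"] == 3,
}
data_store = {}  # data ID -> tuple
results = {}     # data ID -> {query ID: bool}


def insert_tuple(did, row):
    """Store a new tuple, then apply it to queries that arrived earlier."""
    data_store[did] = row                                             # BUILD
    row_bits = {qid: pred(row) for qid, pred in query_store.items()}  # PROBE
    results[did] = row_bits                                           # INSERT
    return row_bits


# The new tuple from slide 17: ID 53 with R.a = 3, R.b = 6
bits = insert_tuple(53, {"a": 3, "b": 6})
row = "".join("T" if bits[q] else "F" for q in sorted(bits))  # "TFFFT", as on slide 21
```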

22 Query Invocation: the system returns the results corresponding to the current value of the BEGIN-END clause. Diagram: the Results Structure (data rows 48-53, query columns 20-24), with the rows that fall inside the current window (BEGIN begin_time END end_time) bracketed.
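Because results are materialized in the background, invocation reduces to selecting the rows that fall inside the current window. The sketch below assumes data IDs double as sequence numbers and arrive in increasing order; the class and method names are hypothetical.

```python
from bisect import bisect_left, bisect_right


class ResultsStructure:
    """Hypothetical buffer of precomputed results: rows are data, columns are queries."""

    def __init__(self):
        self.data_ids = []  # kept sorted; assumes IDs arrive in increasing order
        self.rows = {}      # data ID -> {query ID: bool}

    def record(self, did, qid, hit):
        if did not in self.rows:
            self.rows[did] = {}
            self.data_ids.append(did)
        self.rows[did][qid] = hit

    def invoke(self, qid, begin, end):
        """Return IDs of data in [begin, end] that satisfy query qid."""
        lo = bisect_left(self.data_ids, begin)
        hi = bisect_right(self.data_ids, end)
        return [d for d in self.data_ids[lo:hi] if self.rows[d].get(qid, False)]


rs = ResultsStructure()
# Entries for query 20 (0<R.a<=5), computed from its predicate over tuples 48-53.
for did, hit in zip([48, 49, 50, 51, 52, 53], [True, False, True, False, False, True]):
    rs.record(did, 20, hit)

current = rs.invoke(20, begin=50, end=53)  # window covering data 50..53
```

No predicate is evaluated at invocation time, which is where the low response time claimed for disconnected clients comes from.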

23 Joins over R and S: Arrival of New Query Specification. (a) Initial state. Query Store, as ID: predicate: 20: R.a=5 and R.b<S.b; 21: R.a>4 and R.b<S.b and S.a<10; 22: R.b=4 and R.a+5>S.a and S.b>2. R-Data Store, as ID (R.a, R.b): 10 (2, 5); 14 (3, 3); 31 (4, 1); 48 (9, 7). S-Data Store, as ID (S.a, S.b): 21 (2, 2); 25 (3, 3); 36 (4, 4); 49 (5, 5).

24 Joins over R and S: Arrival of New Query Specification. (b) Arrival of new query: 23: R.a S.a and S.b>1. Query Store, R-Data Store and S-Data Store are unchanged.

25 Joins over R and S: Arrival of New Query Specification. (c) Building Query Store: query 23 (R.a S.a and S.b>1) is inserted (BUILD) into the Query Store.

26 Joins over R and S: Arrival of New Query Specification. (d) Probing R-Data Store: query 23 is used to PROBE the R-Data Store; the matching R tuples are marked.

27 Joins over R and S: Arrival of New Query Specification. (e) Constructing hybrid structs from the matches. Hybrid structs, as R.ID, Q.ID, Q.Predicate: 10, 23, 2>S.a and S.b>1; 14, 23, 3>S.a and S.b>1; 31, 23, 4>S.a and S.b>1.

28 Joins over R and S: Arrival of New Query Specification. (f) Probing S-Data Store: the hybrid structs are used to PROBE the S-Data Store; the (R, S, Q) results are still shown as “?”.

29 Joins over R and S: Arrival of New Query Specification. (f) Probing S-Data Store: the (R, S, Q) results are (14, 21, 23), (31, 21, 23) and (31, 25, 23).
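The join walkthrough for a new query can be reproduced with a small Python sketch. The store contents are read off the slide tables. The single-relation factor of query 23 is lost in the transcript, so `R.a < 5` below is purely an assumption chosen to select R tuples 10, 14 and 31; the residual factors `R.a > S.a` and `S.b > 1` are taken from the hybrid structs shown on slide 27.

```python
# Stores as read from slide 23: ID -> (a, b).
r_store = {10: (2, 5), 14: (3, 3), 31: (4, 1), 48: (9, 7)}
s_store = {21: (2, 2), 25: (3, 3), 36: (4, 4), 49: (5, 5)}

QID = 23


def r_only(ra, rb):
    # ASSUMPTION: the R-only factor is missing from the transcript;
    # R.a < 5 is a stand-in that matches R tuples 10, 14 and 31.
    return ra < 5


def residual(ra, rb):
    """Bind R.a into the join factor, leaving a predicate over S alone."""
    return lambda sa, sb: ra > sa and sb > 1


# Probe the R-Data Store, building one hybrid struct per matching R tuple.
hybrids = [
    (rid, QID, residual(ra, rb))
    for rid, (ra, rb) in r_store.items()
    if r_only(ra, rb)
]

# Probe the S-Data Store with each hybrid struct.
join_results = [
    (rid, sid, qid)
    for rid, qid, pred in hybrids
    for sid, (sa, sb) in s_store.items()
    if pred(sa, sb)
]
# join_results == [(14, 21, 23), (31, 21, 23), (31, 25, 23)], as on slide 29
```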

30 Joins over R and S: Arrival of New Data. (a) Initial state. Query Store, as ID: predicate: 20: R.a=5 and R.b<S.b; 21: R.a>4 and R.b<S.b and S.a<10; 22: R.b=4 and R.a+5>S.a and S.b>2; 23: R.a<4 and R.b<S.b. R-Data Store, as ID (R.a, R.b): 47 (4, 3); 50 (5, 3); 51 (3, 8). S-Data Store, as ID (S.a, S.b): 48 (4, 4); 49 (5, 3); 52 (3, 2).

31 Joins over R and S: Arrival of New Data. (b) Arrival of new data: R tuple 53 (R.a=5, R.b=4). Query Store, R-Data Store and S-Data Store are unchanged.

32 Joins over R and S: Arrival of New Data. (c) Building R-Data Store: tuple 53 (5, 4) is inserted (BUILD) into the R-Data Store.

33 Joins over R and S: Arrival of New Data. (d) Probing Query Store: tuple 53 (5, 4) is used to PROBE the Query Store; the matching queries are marked.

34 Joins over R and S: Arrival of New Data. (e) Constructing hybrid structs from the matches; the entries (R.ID, Q.ID, Q.Predicate) are still being filled in.

35 Joins over R and S: Arrival of New Data. (e) Constructing hybrid structs. Hybrid structs, as R.ID, Q.ID, Q.Predicate: 53, 20, 4<S.b; 53, 21, 4<S.b and S.a<10; 53, 22, 10>S.a and S.b>2.

36 Joins over R and S: Arrival of New Data. (f) Probing S-Data Store: the hybrid structs are used to PROBE the S-Data Store. (R, S, Q) results: (53, 48, 22) and (53, 49, 22).
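The same steps for a newly arrived R tuple can be sketched as follows, using the predicates and S tuples from the slides. Splitting each query by hand into an R-only factor and a residual over S is an assumption of this sketch, as is the name `insert_r`.

```python
# S-Data Store as read from slide 30: ID -> (S.a, S.b).
s_store = {48: (4, 4), 49: (5, 3), 52: (3, 2)}

# Query Store: query ID -> (R-only factor, function binding R values into
# a residual predicate over S). Predicates are taken from the slides.
queries = {
    20: (lambda a, b: a == 5, lambda a, b: (lambda sa, sb: b < sb)),
    21: (lambda a, b: a > 4,  lambda a, b: (lambda sa, sb: b < sb and sa < 10)),
    22: (lambda a, b: b == 4, lambda a, b: (lambda sa, sb: a + 5 > sa and sb > 2)),
    23: (lambda a, b: a < 4,  lambda a, b: (lambda sa, sb: b < sb)),
}


def insert_r(rid, a, b):
    """Probe the Query Store with a new R tuple, then probe S with the hybrids."""
    hybrids = [
        (rid, qid, bind(a, b))
        for qid, (r_only, bind) in queries.items()
        if r_only(a, b)
    ]
    joined = [
        (r, sid, q)
        for r, q, pred in hybrids
        for sid, (sa, sb) in s_store.items()
        if pred(sa, sb)
    ]
    return hybrids, joined


# The new tuple from slide 31: ID 53 with R.a = 5, R.b = 4
hybrids, join_results = insert_r(53, 5, 4)
# hybrid structs exist for queries 20, 21, 22 (slide 35)
# join_results == [(53, 48, 22), (53, 49, 22)], as on slide 36
```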

37 Other Queries. N-way joins: similar to 2-way joins (probe, generate hybrid structs, repeat); can be executed without intermediate tables. Aggregations: performed at query invocation; use an n-ary ranked tree, clustered on time.

38 Telegraph Background: CACQ [MSHR02]. Shared execution of multiple queries with one Eddy; tuple lineage; query indices. Queries and data are treated very differently; only landmark continuous queries; no support for disconnected operation.

39 PSoup in Telegraph. Leverage SteMs to store and index queries. Changes to Eddies: encode queries as tuples; break the WHERE clause into individual boolean factors (BF); encode each BF as R.a relop [R.b|S.b] [+|-] constant. Stream prefix consistency: a new query or data tuple is completely processed before any other tuple, so there are no holes in the Results Structure. Results Structure: buffers the results.
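Encoding each boolean factor in the fixed form "attribute relop [attribute] [+|-] constant" lets a query be stored as plain tuples. A minimal Python sketch of such an encoding follows; the `BooleanFactor` type and its field names are hypothetical and are not Telegraph's actual representation.

```python
import operator
from typing import Dict, List, NamedTuple, Optional

RELOPS = {
    "<": operator.lt,
    "<=": operator.le,
    "=": operator.eq,
    ">=": operator.ge,
    ">": operator.gt,
}


class BooleanFactor(NamedTuple):
    """One factor of the form: left relop [right] [+|-] constant."""
    left: str             # attribute name, e.g. "R.a"
    relop: str            # key into RELOPS
    right: Optional[str]  # second attribute, or None for a constant-only factor
    constant: int = 0     # signed constant


def evaluate(bf: BooleanFactor, bindings: Dict[str, int]) -> bool:
    rhs = bf.constant + (bindings[bf.right] if bf.right is not None else 0)
    return RELOPS[bf.relop](bindings[bf.left], rhs)


def evaluate_query(factors: List[BooleanFactor], bindings: Dict[str, int]) -> bool:
    """A WHERE clause is a conjunction of its boolean factors."""
    return all(evaluate(bf, bindings) for bf in factors)


# Query 22 from the join slides: R.b=4 and R.a+5>S.a and S.b>2,
# with the middle factor rewritten as S.a < R.a + 5 to fit the fixed form.
q22 = [
    BooleanFactor("R.b", "=", None, 4),
    BooleanFactor("S.a", "<", "R.a", 5),
    BooleanFactor("S.b", ">", None, 2),
]
ok = evaluate_query(q22, {"R.a": 5, "R.b": 4, "S.a": 4, "S.b": 4})
```

Because every factor has the same shape, factors from many queries can be stored and indexed together, which is what storing queries in a SteM relies on.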

40 Experiments and Results. Alternatives: NoMat: no background processing; PSoup-Partial: background processing, with the current window applied on invocation; PSoup-Complete: current windows are also continuously applied in the background. Experimental parameters: unloaded server with two 666 MHz Intel Pentium III processors and 768 MB RAM; data arrives as fast as possible, with values in the domain [0,255]; selection queries of the form R.a relop C, where C is in [0,255]; join queries of the form R.a relop S.b +/- C.

41 Experiments: Response Time vs. Window Size (interval predicates, selection queries)

42 Experiments: Response Time vs. Window Size (equality predicates, selection queries)

43 Experiments: Max data arrival rate vs. #SQCs (window size = 1000 tuples)

44 PSoup in a traditional query processor. PSoup = SQL query over data and client query streams? Joins = expression evaluators. Notes: conventional QPs do not have tuple lineage; conventional QPs always use intermediate tables.

45 Conclusions. Treating queries and data the same: combines approaches for previously studied queries (queries over the past and continuous queries) and allows new functionality (hybrid queries). Separating result generation and delivery: makes disconnected operation feasible and gives efficient support for repeated query invocations.

