Published by Roland Dennis. Modified over 9 years ago.
1
StreaQuel Overview
Mike Franklin, UC Berkeley
Language Panel, 1st Octennial SWiM Meeting
January 9, 2003
2
Semantics of data streams (our view)

Time   Tuple set
1      {t1, t2, t3}
2      {t2, t3, t4}
3      {t3, t4, t5}
4      {t4, t5, t6}
5      {t5, t6, t7}

Streams are a mapping from time to sets of tuples.
Since data streams are unbounded, windows are vital for restricting the data accessed by a query.
A stream can be transformed by:
– Moving a window across it
– A window can be moved by:
    shifting its extremities
    changing its size
3
An example

Time   Tuple entry   Base data stream        Sliding window transformation
1      t1            {t1}                    {t1}
2      t2            {t1, t2}                {t1, t2}
3      t3            {t1, t2, t3}            {t2, t3}
4      t4            {t1, t2, t3, t4}        {t3, t4}
5      t5            {t1, t2, t3, t4, t5}    {t4, t5}
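The example above can be reproduced with a few lines of Python. This is only a sketch of the semantics on the slides, not TelegraphCQ code: the names `arrivals`, `window` and `transform` are made up for illustration, and windows are assumed to be closed intervals over entry time.

```python
from typing import Callable, Dict, Iterable, Set, Tuple

# Base stream from the example: tuple t_k enters at time k (k = 1..5).
arrivals: Dict[int, str] = {k: f"t{k}" for k in range(1, 6)}


def window(arrivals: Dict[int, str], left: int, right: int) -> Set[str]:
    """Tuples whose entry time lies in the closed interval [left, right]."""
    return {name for time, name in arrivals.items() if left <= time <= right}


def transform(
    arrivals: Dict[int, str],
    times: Iterable[int],
    bounds: Callable[[int], Tuple[int, int]],
) -> Dict[int, Set[str]]:
    """Map each time to the tuple set selected by the window bounds(time)."""
    return {t: window(arrivals, *bounds(t)) for t in times}


times = range(1, 6)
# Left extremity fixed at time 1, right extremity follows the clock.
growing = transform(arrivals, times, lambda t: (1, t))
# Both extremities shift with the clock; the window spans two time units.
sliding = transform(arrivals, times, lambda t: (t - 1, t))
```

Changing only the `bounds` function is enough to move from the growing window to the sliding one, which is the sense in which a window is moved by shifting its extremities or changing its size.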
4
Classification of windowed queries
5
The StreaQuel Language
An extension of SQL
Operates exclusively on streams
Is closed under streams
Supports different ways to “create” streams
– Infinite time-stamped tuple sequence
– Traditional stable relations
Flexible windows: sliding, landmark and more
Supports logical and physical time
When used with a cursor mechanism, allows clients to do their own window-based processing.
Eventually the target language for TelegraphCQ
6
General Form of a StreaQuel Query
SELECT projection_list
FROM from_list
WHERE selection_and_join_predicates
ORDERED BY
TRANSFORM … TO
WINDOW … BY
Windows can be applied to individual streams.
Window movement is expressed using a “for loop”-type construct in the “transform” clause.
We’re not completely psyched about our syntax at this point.
7
StreaQuel Keywords
NOW = “current time”
– e.g., wall-clock time or latest sequence number
ST = “start time” of query
On_demand = interrupt from user
BoS = 0 = beginning of stream
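How these keywords might drive a “for loop”-type window movement can be sketched in Python. The slides do not show the actual StreaQuel loop syntax, so nothing here should be read as that syntax: `landmark_windows`, `sliding_windows` and the integer clock are assumptions made for illustration. Only `BOS = 0` comes from the slide.

```python
from typing import Iterator, Tuple

BOS = 0  # beginning of stream, as defined on the slide


def landmark_windows(st: int, end: int) -> Iterator[Tuple[int, int]]:
    """Left extremity pinned at ST; right extremity follows NOW."""
    for now in range(st, end + 1):
        yield (st, now)


def sliding_windows(st: int, end: int, width: int) -> Iterator[Tuple[int, int]]:
    """Both extremities follow NOW; the left one never goes before BoS."""
    for now in range(st, end + 1):
        yield (max(BOS, now - width), now)


landmark = list(landmark_windows(st=3, end=5))
sliding = list(sliding_windows(st=0, end=3, width=2))
```

Each yielded pair is the (left, right) bound of the window at one loop step, with `now` standing in for NOW and `st` for ST.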
8
Example – Landmark query
9
Challenge Queries: TPC-Squirrel
Our Opinion…
(thanks to Sam Madden for finding this wonderful picture)
10
Challenge Queries – Query 1
Stream: Packets(pID, length, time)
Query: Generate the stream of packets whose length is greater than twice the average packet length over the last hour.

AVGPACKETS:
    Select AVG(length) as avlen
    From Packets
    Window Packets By (NOW - 1hr, NOW)

GREATERTHAN2AVG:
    Select *
    From Packets p, AVGPACKETS a
    Where p.length > 2 * a.avlen
    Window p By (NOW, NOW);

STREAM: Open a delta-output cursor on GREATERTHAN2AVG.
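For readers who prefer an operational view, Query 1 can be approximated in Python as follows. This is a sketch, not the TelegraphCQ implementation: it assumes packets arrive in time order, that `time` is in seconds, that the one-hour window is closed at both ends, and that the current packet counts toward its own average. `Packet` and `greater_than_2avg` are names invented here.

```python
from collections import deque
from typing import Deque, Iterable, Iterator, NamedTuple


class Packet(NamedTuple):
    pID: int
    length: int
    time: float  # seconds; the unit is an assumption


def greater_than_2avg(
    packets: Iterable[Packet], horizon: float = 3600.0
) -> Iterator[Packet]:
    """Yield packets longer than twice the average length over the last hour."""
    window: Deque[Packet] = deque()  # packets with time in [NOW - horizon, NOW]
    total = 0                        # running sum of lengths in the window
    for p in packets:
        window.append(p)
        total += p.length
        # Evict packets that fell out of the window; p itself always stays.
        while window[0].time < p.time - horizon:
            total -= window.popleft().length
        if p.length > 2 * total / len(window):
            yield p


packets = [
    Packet(1, 100, 0.0),
    Packet(2, 100, 10.0),
    Packet(3, 1000, 20.0),
    Packet(4, 100, 5000.0),
]
flagged = list(greater_than_2avg(packets))
```

The generator plays the role of the delta-output cursor: each qualifying packet is emitted once, at the moment it arrives.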
11
Challenge Queries – Query 2
SquirrelSensors(sID, region, time)
SquirrelType(sID, type)
Query: Create an alert when more than 20 type 'A' squirrels are in Jennifer's backyard.

Option 1: (More than twenty distinct type 'A' squirrels have been to J's backyard since the beginning of time)
    Select ALERT()
    From SquirrelSensors ss, SquirrelType st
    Where ss.region = JWBackyard
      AND ss.sID = st.sID
      AND st.type = 'A'
    Having COUNT(DISTINCT(ss.sID)) > 20
12
Challenge Queries – Query 2…contd.
Option 2: (More than twenty distinct type 'A' squirrels are being sensed in J's backyard this instant)
    Select ALERT()
    From SquirrelSensors ss, SquirrelType st
    Where ss.region = JWBackyard
      AND ss.sID = st.sID
      AND st.type = 'A'
    Having COUNT(DISTINCT(ss.sID)) > 20
    Window ss By (NOW, NOW)
13
Challenge Queries – Query 2…contd.
Option 3: (Individual squirrel tracking: more than twenty distinct type 'A' squirrels were last sensed in J's backyard)

LASTREADING:
    Select sID, MAX(time) as time
    From SquirrelSensors ss
    Group By sID;

    Select ALERT()
    From SquirrelSensors ss, SquirrelType st, LASTREADING lr
    Where ss.region = JWBackyard
      AND ss.sID = st.sID
      AND st.type = 'A'
      AND ss.sID = lr.sID
      AND ss.time = lr.time
    Having COUNT(DISTINCT(ss.sID)) > 20
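The three readings of Query 2 differ only in which sensor readings count toward the total. A small Python sketch makes the difference concrete. It assumes integer timestamps and an in-memory list of readings, and the names `Reading`, `ever_seen`, `seen_now`, `last_seen` and `alert` are invented for illustration; only the threshold of 20 comes from the query.

```python
from typing import Dict, Iterable, NamedTuple, Set


class Reading(NamedTuple):
    sID: int
    region: str
    time: int


def ever_seen(readings: Iterable[Reading], types: Dict[int, str],
              region: str, now: int) -> Set[int]:
    """Option 1: type 'A' squirrels sensed in the region at any time up to now."""
    return {r.sID for r in readings
            if r.time <= now and r.region == region and types.get(r.sID) == "A"}


def seen_now(readings: Iterable[Reading], types: Dict[int, str],
             region: str, now: int) -> Set[int]:
    """Option 2: type 'A' squirrels sensed in the region at this instant."""
    return {r.sID for r in readings
            if r.time == now and r.region == region and types.get(r.sID) == "A"}


def last_seen(readings: Iterable[Reading], types: Dict[int, str],
              region: str, now: int) -> Set[int]:
    """Option 3: type 'A' squirrels whose latest reading is in the region."""
    latest: Dict[int, Reading] = {}
    for r in readings:
        if r.time <= now and (r.sID not in latest or r.time > latest[r.sID].time):
            latest[r.sID] = r
    return {sid for sid, r in latest.items()
            if r.region == region and types.get(sid) == "A"}


def alert(squirrels: Set[int], threshold: int = 20) -> bool:
    """Fire when strictly more than `threshold` distinct squirrels qualify."""
    return len(squirrels) > threshold


types = {1: "A", 2: "A", 3: "B"}
readings = [
    Reading(1, "JWBackyard", 1),
    Reading(2, "JWBackyard", 2),
    Reading(1, "Street", 3),
    Reading(3, "JWBackyard", 3),
]
```

On this toy data at time 3, Option 1 counts squirrels 1 and 2, Option 2 counts none, and Option 3 counts only squirrel 2, since squirrel 1 has since moved to the street.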
14
Challenge Queries – Query 3
SquirrelChirps(sID, loc, time)
Query: Stream an event each time 3 different squirrels within a pairwise distance of 5 meters from each other chirp within 10 seconds of each other.

    Select sc1.time
    From SquirrelChirps sc1, SquirrelChirps sc2, SquirrelChirps sc3
    Where sc1.sID <> sc2.sID
      AND sc1.sID <> sc3.sID
      AND sc2.sID <> sc3.sID
      AND distance(sc1.loc, sc2.loc) < 5 meters
      AND distance(sc1.loc, sc3.loc) < 5 meters
      AND distance(sc2.loc, sc3.loc) < 5 meters
    Window sc1 By (NOW, NOW)
    Window sc2 By (NOW - 10, NOW)
    Window sc3 By (NOW - 10, NOW)
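The three-way self-join in Query 3 can be sketched in Python as a scan over recent chirps. This is an illustration under several assumptions: chirps arrive in time order, `loc` is a planar (x, y) position in meters, `time` is in seconds, and each new chirp is matched only against chirps that arrived before it. `Chirp`, `close` and `chirp_events` are hypothetical names.

```python
import math
from collections import deque
from itertools import combinations
from typing import Deque, FrozenSet, Iterable, Iterator, NamedTuple, Tuple


class Chirp(NamedTuple):
    sID: int
    loc: Tuple[float, float]  # planar coordinates in meters (assumed)
    time: float               # seconds (assumed)


def close(a: Chirp, b: Chirp, radius: float) -> bool:
    """True when the two chirps are strictly less than `radius` apart."""
    return math.dist(a.loc, b.loc) < radius


def chirp_events(
    chirps: Iterable[Chirp], span: float = 10.0, radius: float = 5.0
) -> Iterator[Tuple[float, FrozenSet[int]]]:
    """Yield (time, squirrel IDs) whenever a chirp completes a close trio."""
    recent: Deque[Chirp] = deque()  # chirps with time in [NOW - span, NOW]
    for c in chirps:
        while recent and recent[0].time < c.time - span:
            recent.popleft()
        # Candidates for sc2 and sc3: other squirrels near the new chirp.
        near = [r for r in recent if r.sID != c.sID and close(c, r, radius)]
        for a, b in combinations(near, 2):
            if a.sID != b.sID and close(a, b, radius):
                yield c.time, frozenset({c.sID, a.sID, b.sID})
        recent.append(c)


chirps = [
    Chirp(1, (0.0, 0.0), 0.0),
    Chirp(2, (1.0, 0.0), 3.0),
    Chirp(3, (0.0, 1.0), 8.0),
    Chirp(4, (100.0, 100.0), 9.0),
    Chirp(1, (0.0, 0.0), 30.0),
]
events = list(chirp_events(chirps))
```

Using unordered pairs from `combinations` avoids reporting the same trio twice with sc2 and sc3 swapped, which the symmetric join in the query text would otherwise do.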
15
StreaQuel Summary
Streams are our primitives
– Capture both tuple sequences and relations
– All operations are closed under streams
Flexible window support
Other aspects:
– support for multiple notions of time
– support for events (tuples) that are not totally ordered (distributed systems)
– great name
A work in progress
16
Not just windows… Other Issues
TinyDB: “Run this CQ (with min acceptable sample rates) and make sure I don’t have to change any batteries for 3 months”
Dealing with dirty/missing/late data
Correlating across different time domains
Adjusting sample rates to topology of network.