1
AQuery: A Database System for Order
Dennis Shasha, joint work with Alberto Lerner
{lerner,shasha}@cs.nyu.edu
2
Idea
Whatever can be done on a table can be done on an ordered table (arrable). Not vice versa.
AQuery: a query language on arrables.
Expresses many queries easily.
Easy to optimize.
3
Query 1
Find the packets whose length is greater than twice the average packet length over the last hour.

    SELECT *
    FROM Packets ASSUMING ORDER time
    WHERE len > 2*avgs(range(3600,time),len)

Packets:
    time    3    4    7   10   14   22   23
    len    10    5   12   32    5    7   10
    pID     a    b    c    d    e    f    g

Each column is a vector; time is the ordering vector for this query. Semantics are column-oriented as opposed to row-oriented.
4
Vector Expressions

    WHERE len > 2*avgs(range(3600,time),len)

    time      3      4      7     10     14     22     23
    len      10      5     12     32      5      7     10
    v1        0      1      2      3      3      1      2
    v2       10    7.5      9   14.7   13.5      6    7.3

v1 = range(3600,time) gives the windows' ranges (*). A value of 3, for instance, means 3 positions and the current one.
v2 = avgs(v1,len) gives the last-hour average len.
len > 2*v2 gives a boolean vector (F ... T ... F); WHERE filters out the false positions.

* Using range 10 here for the sake of the example.
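The window computation above can be reproduced in a few lines of Python. This is only a sketch of the semantics as the slide describes them, not AQuery's implementation: `window_counts` is a hypothetical stand-in for AQuery's `range`, renamed to avoid shadowing Python's built-in, and the window is assumed to include positions exactly `width` time units back.

```python
from bisect import bisect_left

def window_counts(width, times):
    """For each position, count the earlier positions whose time lies
    within `width` of the current time. `times` must be sorted ascending."""
    return [i - bisect_left(times, t - width, 0, i) for i, t in enumerate(times)]

def avgs(counts, values):
    """Average each value together with the counts[i] values preceding it."""
    out = []
    for i, k in enumerate(counts):
        window = values[i - k : i + 1]
        out.append(sum(window) / len(window))
    return out

# Data from the Packets slide; width 10 as in the slide's footnote.
time = [3, 4, 7, 10, 14, 22, 23]
length = [10, 5, 12, 32, 5, 7, 10]
pid = list("abcdefg")

v1 = window_counts(10, time)
v2 = avgs(v1, length)
# WHERE len > 2*v2: keep only the positions where the predicate is true.
selected = [p for p, l, a in zip(pid, length, v2) if l > 2 * a]
```

With the slide's data, only packet d (len 32 against a windowed average of 14.75) passes the filter.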
5
Query 2
Find when more than 20 type 'A' squirrels were in Jennifer's backyard. Suppose a flag of +1 signals a squirrel's entry and -1 its exit.

    SELECT time[index(sums(flag)>20)]
    FROM SquirrelSensor SS, SquirrelType ST ASSUMING ORDER time
    WHERE SS.sID = ST.sID
      AND ST.type = 'A'
      AND SS.region = 'JWyard'

Tables:
    SquirrelSensor(time, region, sID, flag)
    SquirrelType(type, sID)
6
Vector Indexing

    SELECT time[index(sums(flag)>20)] ...

The flag column for Jennifer's backyard holds +1 for a squirrel in and -1 for a squirrel out.

    flag    +1 +1 ... +1
    sums    1 2 1 2 3 ... 20 21 20 19
    >20     F ... F T F

Where the i-th position is true, the query returns time[i].
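The same pipeline (running sum, comparison, positions of the true entries, indexing) can be sketched in Python. The `sums` and `index` helpers below are illustrative re-implementations of what the slide describes, and the sensor data is made up, with a threshold of 2 instead of 20 to keep it short.

```python
from itertools import accumulate

def sums(values):
    """Running (prefix) sum of a vector."""
    return list(accumulate(values))

def index(flags):
    """Positions at which a boolean vector is true."""
    return [i for i, f in enumerate(flags) if f]

# Hypothetical sensor readings, already ordered by time.
time = [10, 12, 15, 18, 21, 25, 30]
flag = [+1, +1, -1, +1, +1, -1, -1]

running = sums(flag)                      # squirrels present after each event
over = [s > 2 for s in running]           # boolean vector, like sums(flag) > 20
result = [time[i] for i in index(over)]   # time[index(...)]
```

Here the count exceeds 2 only after the fifth event, so `result` holds the single timestamp 21.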
7
Query 3
Find when 3 different squirrels within a pairwise distance of 5 meters from each other chirp within 10 seconds of each other.

    SELECT S1.sID, S1.loc, S2.sID, S2.loc, S3.sID, S3.loc
    FROM SquirrelChirps S1, SquirrelChirps S2, SquirrelChirps S3
    WHERE S1.sID <> S2.sID AND S1.sID <> S3.sID AND S2.sID <> S3.sID
      AND S1.time-S2.time < 10 AND S1.time-S3.time < 10 AND S2.time-S3.time < 10
      AND distance(S1.loc,S2.loc) < 5 AND distance(S1.loc,S3.loc) < 5 AND distance(S2.loc,S3.loc) < 5

Table:
    SquirrelChirps(time, loc, sID)
8
Vector Fields

r:
    b    1    2    3    4    5    6    7
    a    z    y    y    x    z    x    y

gby a (r):
    a    z      y        x
    b    1 5    2 3 7    4 6

Flatten(gby a (r)):
    b    1    5    2    3    7    4    6
    a    z    z    y    y    y    x    x

Non-grouped columns become vector fields respecting order. Flatten brings the arrable back to 1NF.
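To make the grouping and flattening steps concrete, here is a small Python sketch over the slide's table r. The function names `group_by` and `flatten` are illustrative, and the sketch assumes that groups are emitted in order of first appearance, which matches the z, y, x order shown on the slide.

```python
def group_by(rows, key):
    """Group rows on `key`; every other column becomes a vector (list)
    that keeps the original row order within its group."""
    groups = {}
    for row in rows:
        cols = groups.setdefault(row[key], {k: [] for k in row if k != key})
        for k, v in row.items():
            if k != key:
                cols[k].append(v)
    return [{key: g, **cols} for g, cols in groups.items()]

def flatten(grouped, key):
    """Expand vector fields back into one row per element (1NF)."""
    flat = []
    for g in grouped:
        vectors = {k: v for k, v in g.items() if k != key}
        n = len(next(iter(vectors.values())))
        for i in range(n):
            flat.append({key: g[key], **{k: v[i] for k, v in vectors.items()}})
    return flat

# Table r from the slide: b = 1..7, a = z y y x z x y.
r = [{"b": b, "a": a} for b, a in zip(range(1, 8), "zyyxzxy")]

grouped = group_by(r, "a")
flat = flatten(grouped, "a")
```

After the round trip, the b column reads 1 5 2 3 7 4 6 and the a column z z y y y x x, as in the flattened table above.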
9
Query 4: use of vector fields
Create a log of flow information. A flow from src to dest ends after a 2-minute silence.

    SELECT src, dest, count(*), sum(len)
    FROM Packets ASSUMING ORDER time
    GROUP BY src, dest, sums(deltas(time) > 120)

Table:
    Packets(pID, src, dest, len, time)
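The grouping expression works because a running count of "gap longer than 120 seconds" events acts as a flow identifier: it stays constant within a flow and increases by one at each silence. The Python sketch below illustrates that idea under two assumptions that the slide does not state: gaps are measured between consecutive packets of the same (src, dest) pair, and the first packet of a pair has a delta of 0. The packet data is made up.

```python
from itertools import accumulate

def deltas(values):
    """Differences between consecutive values.
    Assumption: the first position has no predecessor and gets delta 0."""
    return [0] + [b - a for a, b in zip(values, values[1:])]

def flow_log(packets, silence=120):
    """packets: (time, src, dest, len) tuples sorted by time.
    Returns (src, dest, packet_count, total_len), one tuple per flow."""
    by_pair = {}
    for t, src, dest, length in packets:
        by_pair.setdefault((src, dest), []).append((t, length))

    log = []
    for (src, dest), pkts in by_pair.items():
        times = [t for t, _ in pkts]
        # Running count of long gaps = flow id within this pair.
        flow_ids = list(accumulate(int(d > silence) for d in deltas(times)))
        flows = {}
        for fid, (_, length) in zip(flow_ids, pkts):
            count, total = flows.get(fid, (0, 0))
            flows[fid] = (count + 1, total + length)
        log.extend((src, dest, c, s) for c, s in flows.values())
    return log

# Hypothetical packets, ordered by time.
packets = [
    (0, "h1", "h2", 10),
    (50, "h1", "h2", 20),
    (60, "h3", "h4", 5),
    (300, "h1", "h2", 7),
    (310, "h1", "h2", 3),
]
log = flow_log(packets)
```

The 250-second gap between the second and third h1-to-h2 packets splits that pair into two flows of two packets each.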
10
And Streams?
AQuery has no special facilities for streaming data, but it is expressive enough. The idea for streaming data is to split each table into an indexed table holding the old data and a buffer table holding the recent data. The optimizer works over both transparently.
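One way to picture the split is the toy structure below. It is not AQuery's design: the class name `SplitTable`, the merge policy, and the `buffer_limit` parameter are all assumptions made for illustration. A sorted key list stands in for the index over old data, new rows land in a small unsorted buffer, and a range query consults both so the caller never sees the split.

```python
from bisect import bisect_left, bisect_right

class SplitTable:
    """Toy table split into an indexed (sorted) part and a recent buffer."""

    def __init__(self, buffer_limit=4):
        self.old_keys = []   # sorted keys, standing in for the index
        self.old_rows = []   # rows aligned with old_keys
        self.buffer = []     # recent (key, row) pairs in arrival order
        self.buffer_limit = buffer_limit

    def append(self, key, row):
        self.buffer.append((key, row))
        if len(self.buffer) >= self.buffer_limit:
            self._merge()

    def _merge(self):
        # Fold the buffer into the indexed part and re-sort by key.
        merged = sorted(
            list(zip(self.old_keys, self.old_rows)) + self.buffer,
            key=lambda kr: kr[0],
        )
        self.old_keys = [k for k, _ in merged]
        self.old_rows = [r for _, r in merged]
        self.buffer = []

    def range_query(self, lo, hi):
        # Binary search over the indexed part, linear scan over the buffer.
        i = bisect_left(self.old_keys, lo)
        j = bisect_right(self.old_keys, hi)
        hits = list(zip(self.old_keys[i:j], self.old_rows[i:j]))
        hits += [(k, r) for k, r in self.buffer if lo <= k <= hi]
        return sorted(hits, key=lambda kr: kr[0])

table = SplitTable(buffer_limit=3)
for key, row in [(5, "a"), (1, "b"), (9, "c"), (7, "d")]:
    table.append(key, row)
hits = table.range_query(4, 8)
```

After the four appends, three rows sit in the indexed part and one in the buffer, and the range query returns rows from both in key order.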