AQuery: A Database System for Order. Dennis Shasha, joint work with Alberto Lerner


1 AQuery: A Database System for Order. Dennis Shasha, joint work with Alberto Lerner. {lerner,shasha}@cs.nyu.edu

2 Idea
- Whatever can be done on a table can be done on an ordered table (an "arrable"), but not vice versa.
- AQuery: a query language on arrables.
- Expresses many queries easily.
- Easy to optimize.

3 Query 1
- Find the packets whose length is greater than twice the average packet length over the last hour.

    SELECT *
    FROM Packets
    ASSUMING ORDER time
    WHERE len > 2*avgs(range(3600,time),len)

Packets (each column is a vector; time is the ordering vector for this query):

    time   3   4   7  10  14  22  23
    len   10   5  12  32   5   7  10
    pID    a   b   c   d   e   f   g

Semantics are column-oriented as opposed to row-oriented.

4 Vector Expressions

    WHERE len > 2*avgs(range(3600,time),len)

    time          3     4     7    10    14    22    23
    len          10     5    12    32     5     7    10
    v1            0     1     2     3     3     1     2
    v2           10   7.5     9  14.7  13.5     6   7.3
    len > 2*v2    F     F     F     T     F     F     F

- v1 = range(3600,time): the windows' ranges. A value of 3 means 3 earlier positions and the current one. (The example uses a range of 10 instead of 3600.)
- v2 = avgs(v1,len): the average len over the last hour.
- WHERE len > 2*v2: filter out the false positions.
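
The evaluation on this slide can be reproduced with a small Python sketch. This is not AQuery's implementation; the helper names `window_counts` (standing in for `range`, which would shadow a Python builtin) and `avgs` are assumptions, as is the inclusive window boundary, which is inferred from the slide's v2 value of 13.5 at time 14.

```python
from typing import List


def window_counts(width: float, order: List[float]) -> List[int]:
    """For each position i, count the earlier positions whose ordering
    value lies within `width` of order[i] (boundary assumed inclusive)."""
    counts = []
    for i, t in enumerate(order):
        n = 0
        j = i - 1
        while j >= 0 and t - order[j] <= width:
            n += 1
            j -= 1
        counts.append(n)
    return counts


def avgs(counts: List[int], values: List[float]) -> List[float]:
    """Average of each window: counts[i] earlier values plus the current one."""
    return [sum(values[i - c:i + 1]) / (c + 1) for i, c in enumerate(counts)]


# Data from the slide; the window width is 10, as in the slide's example.
time = [3, 4, 7, 10, 14, 22, 23]
length = [10, 5, 12, 32, 5, 7, 10]
pid = list("abcdefg")

v1 = window_counts(10, time)
v2 = avgs(v1, length)
keep = [l > 2 * a for l, a in zip(length, v2)]
result = [p for p, k in zip(pid, keep) if k]
print(v1, result)
```

Only packet d (len 32 against a window average of 14.75) passes the filter, matching the single T on the slide.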

5 Query 2
- Find when more than 20 type 'A' squirrels were in Jennifer's backyard. Suppose a flag of +1 signals a squirrel's entry and -1 its exit.

    SELECT time[index(sums(flag)>20)]
    FROM SquirrelSensor SS, SquirrelType ST
    ASSUMING ORDER time
    WHERE SS.sID = ST.sID
      AND ST.type = 'A'
      AND SS.region = 'JWyard'

Tables: SquirrelSensor(time, region, sID, flag), SquirrelType(type, sID)

6 Vector Indexing

    SELECT time[index(sums(flag)>20)] ...

    flag   +1 +1 ... +1               (squirrel in / squirrel out of Jennifer's backyard)
    sums   1 2 1 2 3 ... 20 21 20 19
    >20    F ... F T F

Where the i-th position is true, the query returns time[i].
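
The indexing step can be sketched in Python as follows. The helper names `sums` and `index` mirror the slide's functions but are stand-ins written for illustration; the sensor data is made up and the threshold is lowered from 20 to 2 to keep the example short.

```python
from itertools import accumulate
from typing import List


def sums(values: List[int]) -> List[int]:
    """Running (prefix) sums of a vector."""
    return list(accumulate(values))


def index(flags: List[bool]) -> List[int]:
    """Positions at which a boolean vector is true."""
    return [i for i, f in enumerate(flags) if f]


# Hypothetical readings, already filtered to one region and type, ordered by time.
time = [10, 12, 15, 19, 20, 26, 31]
flag = [+1, +1, +1, -1, +1, +1, -1]
threshold = 2  # the slide uses 20

running = sums(flag)                               # squirrels present after each event
positions = index([s > threshold for s in running])
when = [time[i] for i in positions]                # time[index(sums(flag) > threshold)]
print(running, when)
```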

7 Query 3
- Find when 3 different squirrels, within a pairwise distance of 5 meters of each other, chirp within 10 seconds of each other.

    SELECT S1.sID, S1.loc, S2.sID, S2.loc, S3.sID, S3.loc
    FROM SquirrelChirps S1, SquirrelChirps S2, SquirrelChirps S3
    WHERE S1.sID <> S2.sID AND S1.sID <> S3.sID AND S2.sID <> S3.sID
      AND S1.time-S2.time < 10 AND S1.time-S3.time < 10 AND S2.time-S3.time < 10
      AND distance(S1.loc,S2.loc) < 5 AND distance(S1.loc,S3.loc) < 5 AND distance(S2.loc,S3.loc) < 5

Table: SquirrelChirps(time, loc, sID)
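
A brute-force Python sketch of the same three-way match is shown below. It rests on several assumptions that the slide does not state: `loc` is a 2-D point, `distance` is Euclidean, and "within 10 seconds" is read as an absolute time difference. It also reports each unordered trio once, whereas the self-join above would return every ordering of it.

```python
from itertools import combinations
from math import dist
from typing import List, Tuple

Chirp = Tuple[str, Tuple[float, float], float]  # (sID, loc, time)


def chirp_triples(chirps: List[Chirp],
                  max_dist: float = 5.0,
                  max_gap: float = 10.0) -> List[Tuple[str, str, str]]:
    """Return sID trios whose chirps are pairwise close in space and time."""
    found = []
    for trio in combinations(chirps, 3):
        if len({c[0] for c in trio}) < 3:
            continue  # the three squirrels must be different
        if all(abs(p[2] - q[2]) < max_gap and dist(p[1], q[1]) < max_dist
               for p, q in combinations(trio, 2)):
            found.append(tuple(c[0] for c in trio))
    return found


# Hypothetical chirp data.
chirps = [
    ("s1", (0.0, 0.0), 100.0),
    ("s2", (3.0, 0.0), 104.0),
    ("s3", (0.0, 3.0), 107.0),
    ("s4", (50.0, 50.0), 105.0),  # too far away
    ("s1", (0.0, 0.0), 300.0),    # too late
]
triples = chirp_triples(chirps)
print(triples)
```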

8 Vector Fields

r:

    a   z   y   y   x   z   x   y
    b   1   2   3   4   5   6   7

gby a (r):

    a   z       y         x
    b   [1 5]   [2 3 7]   [4 6]

Flatten(gby a (r)):

    a   z   z   y   y   y   x   x
    b   1   5   2   3   7   4   6

- Non-grouped columns become vector fields that respect the order.
- Flatten brings the arrable back to 1NF.
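
The grouping and flattening on this slide can be mimicked with plain Python dictionaries. This is only a sketch of the idea: `gby_a` and `flatten` are illustrative names, and the dictionary of lists is a stand-in for an arrable with a vector field.

```python
from typing import Dict, List, Tuple


def gby_a(r: List[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group rows by column a; column b becomes a vector per group,
    keeping the original row order inside each vector."""
    groups: Dict[str, List[int]] = {}
    for a, b in r:
        groups.setdefault(a, []).append(b)
    return groups


def flatten(groups: Dict[str, List[int]]) -> List[Tuple[str, int]]:
    """Expand each vector field back into one row per element (1NF)."""
    return [(a, b) for a, bs in groups.items() for b in bs]


# r from the slide: a = z y y x z x y, b = 1..7
r = list(zip("zyyxzxy", range(1, 8)))
grouped = gby_a(r)
flat = flatten(grouped)
print(grouped)
print(flat)
```

Python dictionaries keep insertion order, so the groups appear as z, y, x, in the order each key is first seen, as on the slide.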

9 Query 4 – use of vector fields
- Create a log of flow information. A flow from src to dest ends after a 2-minute silence.

    SELECT src, dest, count(*), sum(len)
    FROM Packets
    ASSUMING ORDER time
    GROUP BY src, dest, sums(deltas(time) > 120)

Table: Packets(pID, src, dest, len, time)
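
One way to read this query is sketched below in Python. It is an interpretation, not AQuery's evaluation strategy: silences are measured between consecutive packets of the same (src, dest) pair, the first delta of each pair is assumed to be 0, and the packet data is hypothetical.

```python
from collections import defaultdict
from itertools import accumulate
from typing import List, Tuple

Packet = Tuple[str, str, int, int]  # (src, dest, len, time)


def deltas(values: List[int]) -> List[int]:
    """Differences between consecutive values; the first delta is assumed 0."""
    if not values:
        return []
    return [0] + [b - a for a, b in zip(values, values[1:])]


def flow_log(packets: List[Packet], silence: int = 120) -> List[Tuple[str, str, int, int]]:
    """Return (src, dest, packet count, total length) for each flow."""
    by_pair = defaultdict(list)
    for src, dest, length, time in sorted(packets, key=lambda p: p[3]):
        by_pair[(src, dest)].append((time, length))

    log = []
    for (src, dest), rows in by_pair.items():
        times = [t for t, _ in rows]
        # sums(deltas(time) > silence): the counter steps up at each long silence
        epochs = list(accumulate(int(d > silence) for d in deltas(times)))
        flows = defaultdict(lambda: [0, 0])
        for epoch, (_, length) in zip(epochs, rows):
            flows[epoch][0] += 1
            flows[epoch][1] += length
        log.extend((src, dest, n, total) for n, total in flows.values())
    return log


packets = [
    ("h1", "h2", 100, 0),
    ("h1", "h2", 200, 60),
    ("h1", "h3", 50, 70),
    ("h1", "h2", 300, 400),  # 340 s after the previous h1->h2 packet: a new flow
]
log = flow_log(packets)
print(log)
```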

10 And Streams?
- AQuery has no special facilities for streaming data, but it is expressive enough to handle it.
- The idea for streaming data is to split each table into indexed tables that hold old data and a buffer table that holds recent data.
- The optimizer works over both transparently.

