Published by Estella Bryan. Modified over 9 years ago.
1
Blocking, Monotonicity, and Turing Completeness in a Database Language for Sequences and Streams
Yan-Nei Law, Haixun Wang, Carlo Zaniolo
12/06/2002
2
Outline
Database Application in ATLaS
Turing Completeness
Data Streams
Blocking and Monotonicity
3
User Defined Aggregates (UDAs)
Important for decision support, stream queries, and other advanced database applications.
A UDA consists of three parts: INITIALIZE, ITERATE, and TERMINATE.
4
Standard aggregate average
AGGREGATE myavg(Next Int) : Real
{   TABLE state(tsum Int, cnt Int);
    INITIALIZE : {
        INSERT INTO state VALUES (Next, 1);
    }
    ITERATE : {
        UPDATE state SET tsum=tsum+Next, cnt=cnt+1;
    }
    TERMINATE : {
        INSERT INTO RETURN SELECT tsum/cnt FROM state;
    }
}
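The three-part structure of myavg can be mirrored outside ATLaS. The following is a minimal Python sketch, not ATLaS itself: the class MyAvg and the driver run_uda are hypothetical names introduced only for illustration, and the driver's dispatch rule (first tuple to INITIALIZE, the rest to ITERATE, end of input to TERMINATE) is an assumption about how a UDA is evaluated.

```python
class MyAvg:
    """Blocking average: the state is (tsum, cnt); the answer appears only at terminate."""

    def initialize(self, nxt):
        # INITIALIZE: INSERT INTO state VALUES (Next, 1)
        self.tsum, self.cnt = nxt, 1

    def iterate(self, nxt):
        # ITERATE: UPDATE state SET tsum=tsum+Next, cnt=cnt+1
        self.tsum += nxt
        self.cnt += 1

    def terminate(self):
        # TERMINATE: INSERT INTO RETURN SELECT tsum/cnt FROM state
        return [self.tsum / self.cnt]


def run_uda(uda, values):
    """Hypothetical driver: feed the first tuple to initialize, the rest to iterate."""
    it = iter(values)
    try:
        first = next(it)
    except StopIteration:
        return []  # no input tuples, so nothing is returned
    uda.initialize(first)
    for v in it:
        uda.iterate(v)
    return uda.terminate()


print(run_uda(MyAvg(), [10, 20, 30]))
```

Because the only result is produced in terminate, nothing can be reported until the whole input has been read.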
5
Online aggregation
The standard average aggregate returns its result only in the TERMINATE state.
An online aggregate returns results before all the input tuples have been read, e.g. the average after every 200 input tuples.
RETURN statements appear in ITERATE instead of TERMINATE.
6
Online averages
AGGREGATE online_avg(Next Int) : Real
{   TABLE state(tsum Int, cnt Int);
    INITIALIZE : {
        INSERT INTO state VALUES (Next, 1);
    }
    ITERATE : {
        UPDATE state SET tsum=tsum+Next, cnt=cnt+1;
        INSERT INTO RETURN
            SELECT tsum/cnt FROM state WHERE cnt % 200 = 0;
    }
    TERMINATE : { }
}
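The same idea can be sketched as a Python generator, which emits a value as soon as the count reaches a multiple of the reporting interval. This is only an illustration under assumptions: the function name mirrors the UDA, and the parameter every is a hypothetical generalization of the constant 200.

```python
def online_avg(values, every=200):
    """Yield the cumulative average each time `every` more tuples have been seen."""
    it = iter(values)
    try:
        tsum, cnt = next(it), 1      # INITIALIZE: state = (Next, 1), nothing returned
    except StopIteration:
        return
    for v in it:                     # ITERATE
        tsum += v
        cnt += 1
        if cnt % every == 0:         # RETURN only when cnt % 200 = 0
            yield tsum / cnt
    # TERMINATE is empty: no result depends on seeing the end of the input


print(list(online_avg([1, 2, 3, 4, 5], every=2)))  # [1.5, 2.5]
```

The fifth value never produces an output, since the count is not a multiple of the interval when the input ends.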
7
Table from external source
Access tables stored in the external source 'C:\mydb\employees'.
TABLE employee(Eno Int, Name Char(18), Sal Real, Dept Char(6))
    SOURCE 'C:\mydb\employees';
SELECT Sex, online_avg(Sal)
FROM employee
WHERE Dept=1024
GROUP BY Sex;
8
Calling other UDAs
UDAs can call other UDAs, including themselves recursively.
Example: computation of the transitive closure of a graph that contains no directed cycle.
Find all the nodes that are reachable from '000'.
9
Transitive Closure
TABLE dgraph(start Char(10), end Char(10)) SOURCE mydb;
AGGREGATE reachable(Rnode Char(10)) : Char(10)
{   INITIALIZE: ITERATE: {
        INSERT INTO RETURN VALUES (Rnode);
        INSERT INTO RETURN
            SELECT reachable(end) FROM dgraph
            WHERE start=Rnode;
    }
}
SELECT reachable(dgraph.end)
FROM dgraph
WHERE dgraph.start='000';
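The recursion in reachable can be sketched in Python as follows. This is an illustration only: the edge list stands in for the dgraph table, reachable_from is a hypothetical wrapper for the final SELECT, and, like the UDA, the sketch assumes an acyclic graph and does not remove duplicates.

```python
def reachable(dgraph, rnode):
    """Yield rnode, then everything reachable from it (terminates only on acyclic graphs)."""
    yield rnode                                  # INSERT INTO RETURN VALUES (Rnode)
    for start, end in dgraph:                    # SELECT reachable(end) FROM dgraph
        if start == rnode:                       # WHERE start = Rnode
            yield from reachable(dgraph, end)


def reachable_from(dgraph, source):
    """Mirror of: SELECT reachable(dgraph.end) FROM dgraph WHERE dgraph.start = source."""
    for start, end in dgraph:
        if start == source:
            yield from reachable(dgraph, end)


edges = [("000", "001"), ("001", "002"), ("000", "003"), ("003", "002")]
print(list(reachable_from(edges, "000")))  # ['001', '002', '003', '002']
```

Node '002' appears twice because it is reached along two different paths.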
10
Turing Completeness
Turing completeness is not a trivial property for a language.
SQL is not Turing complete.
A Turing Machine is a tuple $M = (Q, \Sigma, \Gamma, \delta, q_0, !, F)$.
A TM is determined by 4 elements:
the transition map $\delta$;
the accepting states $F$;
the input tape;
the initial state $q_0$.
11
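Turing Machine for $L_{3eq} = \{a^n b^n c^n \mid n \ge 1\}$
Input tape: aaabbbccc
Accepting state: {z}
Initial state: p
Transition map (each entry is next state, symbol written, head movement; - means undefined):
State \ Symbol   a        b        c        x        !
p [mark a]       q,x,1    -        -        -        -
q [find b]       q,a,1    r,x,1    -        q,x,1    -
r [find c]       -        r,b,1    s,x,1    r,x,1    -
s [find !]       -        -        s,c,1    -        t,!,-1
t [find a]       u,a,-1   t,b,-1   t,c,-1   t,x,-1   v,x,1
u [find x]       u,a,-1   -        -        p,x,1    -
v [check]        -        -        -        v,x,1    z,x,0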
12
Turing Machine for $L_{3eq}$
INSERT INTO transition VALUES
    ('p','a',1,'q','x'), ('q','a',1,'q','a'), ('t','a',-1,'u','a'),
    ('u','a',-1,'u','a'), ('q','b',1,'r','x'), ('r','b',1,'r','b'),
    ('t','b',-1,'t','b'), ('r','c',1,'s','x'), ('s','c',1,'s','c'),
    ('t','c',-1,'t','c'), ('q','x',1,'q','x'), ('r','x',1,'r','x'),
    ('t','x',-1,'t','x'), ('u','x',1,'p','x'), ('v','x',1,'v','x'),
    ('s','!',-1,'t','!'), ('t','!',1,'v','x'), ('v','!',0,'z','x');
INSERT INTO accept VALUES ('z');
INSERT INTO tape VALUES
    ('a',0), ('a',1), ('a',2), ('b',3), ('b',4), ('b',5),
    ('c',6), ('c',7), ('c',8);
13
Turing Machine in ATLaS(1)
For each iteration, pass the current state, the current symbol, and the position on the tape to a UDA called turing.
If $\delta$ is defined for the current state and symbol, obtain the next state, the next symbol, and the movement of the head:
1. the current state becomes the next state;
2. the current symbol is replaced by the next symbol;
3. next position = current position + movement.
If $\delta$ is not defined, the TM halts; check whether the current state is an accepting state.
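The iteration just described can be sketched as a small Python simulator. The transition entries are copied from the INSERT INTO transition statement for the $L_{3eq}$ machine; everything else (the dictionary layout, the function name run_tm, the max_steps guard) is a hypothetical choice made for illustration and is not part of ATLaS.

```python
# (current state, current symbol) -> (move, next state, next symbol),
# in the column order of the transition table in the slides.
TRANSITION = {
    ("p", "a"): (1, "q", "x"), ("q", "a"): (1, "q", "a"),
    ("t", "a"): (-1, "u", "a"), ("u", "a"): (-1, "u", "a"),
    ("q", "b"): (1, "r", "x"), ("r", "b"): (1, "r", "b"),
    ("t", "b"): (-1, "t", "b"), ("r", "c"): (1, "s", "x"),
    ("s", "c"): (1, "s", "c"), ("t", "c"): (-1, "t", "c"),
    ("q", "x"): (1, "q", "x"), ("r", "x"): (1, "r", "x"),
    ("t", "x"): (-1, "t", "x"), ("u", "x"): (1, "p", "x"),
    ("v", "x"): (1, "v", "x"), ("s", "!"): (-1, "t", "!"),
    ("t", "!"): (1, "v", "x"), ("v", "!"): (0, "z", "x"),
}
ACCEPT = {"z"}
BLANK = "!"


def run_tm(word, start="p", max_steps=100_000):
    """Return True if the machine halts in an accepting state on `word`."""
    tape = dict(enumerate(word))          # position -> symbol, like the tape table
    state, pos = start, 0
    for _ in range(max_steps):
        symbol = tape.get(pos, BLANK)     # unseen cells read as the blank symbol
        step = TRANSITION.get((state, symbol))
        if step is None:                  # delta undefined: the machine halts
            return state in ACCEPT
        move, next_state, next_symbol = step
        tape[pos] = next_symbol           # write the tape
        state = next_state
        pos += move                       # move the head
    raise RuntimeError("step limit reached without halting")


print(run_tm("aaabbbccc"), run_tm("aab"))  # True False
```

The step limit is only a safety net for the sketch; this particular machine halts on every input tried here.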
14
Turing Machine in ATLaS(2)
TABLE current(stat Char(1), symbol Char(1), pos Int);
TABLE tape(symbol Char(1), pos Int);
TABLE transition(curstate Char(1), cursymbol Char(1), move Int,
                 nextstate Char(1), nextsymbol Char(1));
TABLE accept(accept Char(1));
15
Turing Machine in ATLaS(3)
AGGREGATE turing(stat Char(1), symbol Char(1), curpos Int) : Int
{   INITIALIZE: ITERATE: {
        /* If TM halts, return 1 or 0 (accept or reject) */
        INSERT INTO RETURN
            SELECT R.C
            FROM (SELECT count(accept) C FROM accept A
                  WHERE A.accept = stat) R
            WHERE NOT EXISTS (
                SELECT * FROM transition T
                WHERE stat = T.curstate AND symbol = T.cursymbol);
16
Turing Machine in ATLaS(4)
        /* write tape */
        DELETE FROM tape WHERE pos = curpos;
        INSERT INTO tape
            SELECT T.nextsymbol, curpos
            FROM transition T
            WHERE T.curstate = stat AND T.cursymbol = symbol;
17
Turing Machine in ATLaS(5)
        /* add blank symbol if necessary */
        INSERT INTO tape
            SELECT '!', curpos + T.move
            FROM transition T
            WHERE T.curstate = stat AND T.cursymbol = symbol
              AND NOT EXISTS (
                SELECT * FROM tape WHERE pos = curpos + T.move);
18
Turing Machine in ATLaS(6)
        /* move head to the next position */
        INSERT INTO current
            SELECT T.nextstate, A.symbol, A.pos
            FROM tape A, transition T
            WHERE T.curstate = stat AND T.cursymbol = symbol
              AND A.pos = curpos + T.move;
    }
}
19
Turing Machine in ATLaS(7)
INSERT INTO current
    SELECT 'p', A.symbol, 0 FROM tape A WHERE A.pos = 0;
SELECT turing(stat, symbol, pos) FROM current;
20
Data Streams(1) For stream processing and continuous queries, results must be returned promptly, without waiting for future input. Non-blocking operators are suitable. No terminate state is needed.
21
Data Streams(2)
ATLaS UDAs are a powerful tool for approximate aggregates, synopses, and adaptive data mining.
The source of data is a stream instead of a table.
Data from a stream are time-stamped.
Example: compute the average length of the long distance calls placed by each customer.
22
Data from a stream
STREAM calls(customer_id Int, type Char(6), minutes Int, Tstamp: Timestamp)
    SOURCE mystream;
SELECT periodic_avg(S.minutes)
FROM calls S
WHERE S.type = 'Long Distance'
GROUP BY S.customer_id;
23
Average of the last 200 inputs
AGGREGATE periodic_avg(Next Int, Time: Timestamp) : (Real, Timestamp)
{   TABLE state(tsum Int, cnt Int);
    INITIALIZE : {
        INSERT INTO state VALUES (Next, 1);
    }
    ITERATE : {
        UPDATE state SET tsum=tsum+Next, cnt=cnt+1;
        INSERT INTO RETURN
            SELECT tsum/cnt, Time FROM state WHERE cnt % 200 = 0;
        UPDATE state SET tsum=0, cnt=0 WHERE cnt % 200 = 0;
    }
}
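The reset step is what distinguishes periodic_avg from a cumulative online average. A minimal Python sketch of the same logic follows; the parameter every is a hypothetical stand-in for the constant 200, and the stream is assumed to be an iterable of (value, timestamp) pairs.

```python
def periodic_avg(stream, every=200):
    """Yield (average, time) for each consecutive batch of `every` tuples."""
    it = iter(stream)
    try:
        tsum, _ = next(it)               # INITIALIZE: state = (Next, 1)
    except StopIteration:
        return
    cnt = 1
    for value, time in it:               # ITERATE
        tsum += value
        cnt += 1
        if cnt % every == 0:
            yield (tsum / cnt, time)     # report the batch average with its timestamp
            tsum, cnt = 0, 0             # reset so the next batch starts fresh


calls = [(1, 0), (3, 1), (5, 2), (7, 3), (9, 4)]
print(list(periodic_avg(calls, every=2)))  # [(2.0, 1), (6.0, 3)]
```

Each reported average covers only the tuples received since the previous report, so the state never grows with the length of the stream.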
24
Average on a five-minute window(1)
AGGREGATE window_avg(Next Int, Time Timestamp) : (Real, Timestamp)
{   TABLE state(tsum Int);
    TABLE memo(mNext Int, mTime Timestamp);
    INITIALIZE : {
        INSERT INTO state VALUES (Next);
        INSERT INTO memo VALUES (Next, Time);
    }
25
Average on a five-minute window(2)
    ITERATE : {
        INSERT INTO memo VALUES (Next, Time);
        UPDATE state SET tsum=tsum+Next - (
            SELECT sum(mNext) FROM memo
            WHERE mTime + 5 MINUTES < Time);
        DELETE FROM memo
            WHERE mTime + 5 MINUTES < Time;
        INSERT INTO RETURN SELECT tsum, Time FROM state;
    }
}
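The memo table acts as a buffer of the tuples still inside the window. The Python sketch below mirrors that bookkeeping with a deque. It is an illustration under assumptions: the function name window_sum is hypothetical, timestamps are assumed to arrive in non-decreasing order, and, like the RETURN statement in the UDA, it emits the windowed sum tsum; dividing by the number of buffered tuples would give the average.

```python
from collections import deque
from datetime import datetime, timedelta


def window_sum(stream, width=timedelta(minutes=5)):
    """Yield (sum over the window ending at `time`, time) for each tuple after the first."""
    memo = deque()                        # buffered (value, time) pairs, oldest first
    tsum = 0
    for i, (value, time) in enumerate(stream):
        memo.append((value, time))        # INSERT INTO memo VALUES (Next, Time)
        tsum += value
        # Expire tuples with mTime + 5 MINUTES < Time and subtract them from the sum.
        while memo and memo[0][1] + width < time:
            old_value, _ = memo.popleft()
            tsum -= old_value
        if i > 0:                         # INITIALIZE stores state but returns nothing
            yield (tsum, time)


t0 = datetime(2002, 12, 6, 12, 0)
minutes = lambda m: t0 + timedelta(minutes=m)
stream = [(10, minutes(0)), (20, minutes(2)), (30, minutes(6)), (40, minutes(8))]
print([s for s, _ in window_sum(stream)])  # [30, 50, 70]
```

Only tuples from the last five minutes are kept, so memory use is bounded by the arrival rate rather than by the length of the stream.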
26
Blocking and Monotonicity
A blocking query operator is one that cannot produce the first tuple of its output until it has seen the entire input, e.g. myavg in Example 1.
A non-blocking query operator is one that produces all the tuples of its output before it has detected the end of the input, e.g. online_avg in Example 2.
27
Definition
Sequence of length $n$: $S = [t_1, \ldots, t_n]$. $[\,]$ has length 0.
Presequence: $S_k = [t_1, \ldots, t_k]$, $0 < k \le n$.
$S \sqsubseteq L$: $L_k = S$ for some $k$. ($\sqsubseteq$ is a partial order.)
For an operator $G$, $G_j(S)$ is the cumulative output produced up to step $j$ by $G$.
blocking: $G_j(S) = [\,]$ for $j < n$.
non-blocking: $G_j(S) = G_j(S_j) = G(S_j)$.
partial blocking: $[\,] < G_j(S) < G(S_j)$.
28
An ATLaS UDA is
blocking if it returns answers only in the TERMINATE state;
non-blocking if it returns answers only in the INITIALIZE or ITERATE states;
partially blocking if it returns answers in TERMINATE and in other states.
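This classification can be observed experimentally by recording whether each output is produced before or after the end of the input has been detected. The Python sketch below does this for operators written as generators. All names are hypothetical, and the result describes the behaviour on one sample input only; it is not a proof about the operator.

```python
def classify(op, seq):
    """Label `op` by when it emits output while consuming `seq` lazily."""
    ended = False

    def feed():
        nonlocal ended
        for x in seq:
            yield x
        ended = True                      # the operator has now detected end of input

    # For every output, remember whether end of input had been detected by then.
    after_end = [ended for _ in op(feed())]
    if not after_end:
        return "no output"
    if all(after_end):
        return "blocking"                 # answers only at TERMINATE
    if not any(after_end):
        return "non-blocking"             # answers only at INITIALIZE or ITERATE
    return "partially blocking"


def total(values):                        # answer only after the whole input
    s = 0
    for v in values:
        s += v
    yield s


def running_sum(values):                  # one answer per input tuple
    s = 0
    for v in values:
        s += v
        yield s


def running_then_total(values):           # answers during and after the input
    s = 0
    for v in values:
        s += v
        yield s
    yield s


for op in (total, running_sum, running_then_total):
    print(op.__name__, classify(op, [1, 2, 3]))
```

On this input the three operators are reported as blocking, non-blocking, and partially blocking respectively.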
29
A function F can be computed using a non-blocking operator if F is monotonic wrt $\sqsubseteq$.
ATLaS is complete wrt non-blocking computations.
Every monotonic function can be computed by an ATLaS program with no TERMINATE state.
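Monotonicity with respect to the presequence order means that extending the input can only extend the output, never retract part of it. The Python sketch below checks this on the presequences of one sample sequence. The helper names are hypothetical, and passing the check on a sample is evidence rather than a proof of monotonicity.

```python
from itertools import accumulate


def is_presequence(s, l):
    """True when s is a presequence of l, i.e. s equals the first len(s) items of l."""
    return len(s) <= len(l) and list(l[:len(s)]) == list(s)


def is_monotonic_on(f, seq):
    """Check that f(S_k) is a presequence of f(S_{k+1}) for every prefix of `seq`.

    Adjacent prefixes suffice because the presequence relation is transitive.
    """
    outputs = [f(seq[:k]) for k in range(len(seq) + 1)]
    return all(is_presequence(outputs[k], outputs[k + 1])
               for k in range(len(seq)))


def running_sums(seq):
    """Cumulative sums: longer input only appends to the output."""
    return list(accumulate(seq))


def total(seq):
    """Single final sum: longer input replaces the earlier answer."""
    return [sum(seq)] if seq else []


print(is_monotonic_on(running_sums, [1, 2, 3]))  # True
print(is_monotonic_on(total, [1, 2, 3]))         # False
```

The running sums pass because each output is a presequence of the next, whereas the total fails because [1] is not a presequence of [3].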