[Figure residue: diagram with labels dop, d1, d2, reconst, sop, P1, P2]
[Figure residue: three auction document trees. The first holds two items (a Trek road bike and a 1971 Martin guitar) and a bid from Joe; the second is a new bid from Sue on the first item; the third is the merged document containing both bids. Digits and item descriptions were partly lost in extraction.]
Differential Nest

Input (Subject, Title) tuples and the partial results produced, written as (Old Value, New Value) pairs over (Subject, Title):
  Input: (Google, Title1), (Microsoft, Title2), (Microsoft, Title3)
  Partial result: (null, null, Google, {Title1}), (null, null, Microsoft, {Title2, Title3})
  Input: (Google, Title4)
  Partial result: (Google, {Title1}, Google, {Title1, Title4})
  Input: (Google, Title5)
  Partial result: (Google, {Title1, Title4}, Google, {Title1, Title4, Title5})

But what you'd really like to send is (Google, {Title5}) and "merge" it with (Google, {Title1, Title4}).

Merge result: Subject: Google; Title: Title1; Title: Title4; Title: Title5
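The idea of sending only the new titles and merging them into the nested result can be sketched as follows. This is a minimal illustration, not the system's implementation: the function name merge_delta and the dictionary-of-sets representation are assumptions made for the example.

```python
from collections import defaultdict

def merge_delta(state, delta):
    """Merge a delta (subject -> new titles) into the nested state in place.

    Hypothetical helper: only the new titles travel, never the full old and
    new values of the nested group.
    """
    for subject, titles in delta.items():
        state[subject] |= set(titles)
    return state

state = defaultdict(set)
merge_delta(state, {"Google": {"Title1"}, "Microsoft": {"Title2", "Title3"}})
merge_delta(state, {"Google": {"Title4"}})
merge_delta(state, {"Google": {"Title5"}})  # send only (Google, {Title5})
print(sorted(state["Google"]))
```

Each delta carries one group's additions, so the message size depends on the size of the change rather than on the size of the accumulated group.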
Merge Example

Auction Document:
  auction
    item
      iid: 501
      desc: Trek Madone 5.9 Bike
      bid
        bidder: Dave
        amt: $1500
    item
      iid: 433
      desc: 1971 Martin Guitar

New Bid:
  auction
    item
      iid: 501
      bid
        bidder: Sue
        amt: $1550

Merged Document:
  auction
    item
      iid: 501
      desc: Trek Madone 5.9 Bike
      bid
        bidder: Dave
        amt: $1500
      bid
        bidder: Sue
        amt: $1550
    item
      iid: 433
      desc: 1971 Martin Guitar

Legend (element roles in the merged document): Combined, Inserted, Used in Match
Merge Template (MT)

A Merge Template is an XML document consisting of a tree of Element Merge Templates (EMTs).
An EMT is a triplet containing: (name, local key, content combine function).

Template for the auction example:
  (auction, [], NoContentNoAttrs)
    (item, [iid], NoContentNoAttrs)
      (iid, [], ExactMatch)
      (desc, [], ShallowContent - Replace)
      (bid, [bidder, amt], NoContentNoAttrs)
        (bidder, [], ExactMatch)
        (amt, [], ExactMatch)

Document shown against the template:
  auction
    item
      iid: 501
      bid
        bidder: Sue
        amt: $1550
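The EMT triplet and the template tree can be written down as a small data structure. The sketch below is illustrative only: the class name EMT, its field names, and the helper key_names are assumptions, and the combine functions are kept as plain strings rather than implemented.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class EMT:
    """One Element Merge Template: (name, local key, content combine function)."""
    name: str
    local_key: List[str]
    combine: str  # name of the combine function, kept as a label here
    children: List["EMT"] = field(default_factory=list)

# The auction template from the slide, built as a tree of EMTs.
auction_mt = EMT("auction", [], "NoContentNoAttrs", [
    EMT("item", ["iid"], "NoContentNoAttrs", [
        EMT("iid", [], "ExactMatch"),
        EMT("desc", [], "ShallowContent - Replace"),
        EMT("bid", ["bidder", "amt"], "NoContentNoAttrs", [
            EMT("bidder", [], "ExactMatch"),
            EMT("amt", [], "ExactMatch"),
        ]),
    ]),
])

def key_names(emt, path=()):
    """Yield (dotted path, local key) for every EMT in the tree."""
    here = path + (emt.name,)
    yield ".".join(here), emt.local_key
    for child in emt.children:
        yield from key_names(child, here)

for dotted, key in key_names(auction_mt):
    print(dotted, key)
```

Keeping the local key on each EMT makes it possible to decide, element by element, whether two elements from different documents denote the same thing.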
View Merge as Least Upper Bound

D3 is the "smallest" document that "contains" D1 and D2.

Auction Document (D1): auction with item iid:501 (desc: Trek Madone 5.9 Bike; bid with bidder: Dave, amt: $1500) and item iid:433 (desc: 1971 Martin Guitar)
New Bid (D2): auction with item iid:501 (bid with bidder: Sue, amt: $1550)
Merged Document (D3): auction with item iid:501 (desc: Trek Madone 5.9 Bike; bids from Dave at $1500 and Sue at $1550) and item iid:433 (desc: 1971 Martin Guitar)
What can go wrong?

No unique result (no Least Upper Bound (LUB)).
Keys in the Merge Template eliminate the ambiguity.
We know D4 is the correct result if we know iid is a key for item; using iid as a key eliminates D3.

D1: auction with item iid:501
D2: auction with item iid:433
D3: auction with a single item holding both iid:501 and iid:433
D4: auction with two items, one with iid:501 and one with iid:433
Non-Key-Respecting Documents

Merge Template T:
  (auction, [], NoContentNoAttrs)
    (item, [iid], NoContentNoAttrs)
      (iid, [], ExactMatch)

Containment with respect to T: D is contained in D′ if there is a structure-preserving mapping from D into D′.
D3 is not key-respecting with respect to T and should not be in L_T.

D1: auction with item iid:501
D2: auction with item iid:433
D3: auction with a single item holding both iid:501 and iid:433
D4: auction with two items, one with iid:501 and one with iid:433
Merge-Lattice Theorem Overview

Associate each document D with a unique path set ρ(D).
ρ(D1) ∪ ρ(D2) is a Least Upper Bound (LUB) for ρ(D1) and ρ(D2).
ρ(D1) ∪ ρ(D2) is the "smallest" set that contains both ρ(D1) and ρ(D2).
Intuition: the merge of D1 and D2 should be the document associated with ρ(D1) ∪ ρ(D2).

[Diagram labels: D1, D2, D3 in L_T; mappings ρ1, ρ2 to ρ(D1), ρ(D2) and ρ(D1) ∪ ρ(D2)]
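Under the subset order on path sets, the least upper bound is plain set union. The sketch below shows this on hand-written, abbreviated path sets for the auction document and the new bid (leaf paths for bidder and amt are omitted). The function name lub and the string encoding of paths are assumptions for illustration.

```python
def lub(path_set_a, path_set_b):
    """Least upper bound of two path sets under the subset order: their union."""
    return path_set_a | path_set_b

# Abbreviated path sets, written by hand in the slides' path notation.
rho_d1 = {
    "auction[]:",
    "auction[].item[iid:501]:",
    "auction[].item[iid:501].iid[]:501",
    "auction[].item[iid:501].desc[]:Trek Madone 5.9 Bike",
    "auction[].item[iid:501].bid[bidder:Dave,amt:$1500]:",
}
rho_d2 = {
    "auction[]:",
    "auction[].item[iid:501]:",
    "auction[].item[iid:501].iid[]:501",
    "auction[].item[iid:501].bid[bidder:Sue,amt:$1550]:",
}

rho_d3 = lub(rho_d1, rho_d2)
print(len(rho_d3))
```

The three paths shared by both sets appear once in the union, which is why item iid:501 shows up once in the merged document while both bids are kept.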
Document and Path Set

Use Merge Template + document to create the path set.
One element in the path set for each element in the document.
Each path comprises a rooted key value and the element content.
Path set order (subset) is identical to document containment order.

Document:
  auction
    item
      iid: 501
      desc: Trek Madone 5.9 Bike
      bid
        bidder: Dave
        amt: $1500

Path set:
  auction[]:
  auction[].item[iid:501]:
  auction[].item[iid:501].iid[]:501
  auction[].item[iid:501].desc[]:Trek Madone 5.9 Bike
  auction[].item[iid:501].bid[bidder:Dave,amt:$1500]:
  auction[].item[iid:501].bid[bidder:Dave,amt:$1500].bidder[]:Dave
  auction[].item[iid:501].bid[bidder:Dave,amt:$1500].amt[]:$1500

Annotated path: in auction[].item[iid:501].desc[]:Trek Madone 5.9 Bike, the part before the final colon is the rooted key value and the part after it is the element content.
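A path set can be derived mechanically from a document and the local keys in the template. The sketch below is a simplified illustration under stated assumptions: Node, LOCAL_KEYS, step, and path_set are hypothetical names, the template is reduced to a name-to-key table, and every key child is assumed to be present in the document.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    name: str
    text: str = ""
    children: List["Node"] = field(default_factory=list)

# Simplified stand-in for a Merge Template: element name -> local key (child names).
LOCAL_KEYS = {"auction": [], "item": ["iid"], "iid": [], "desc": [],
              "bid": ["bidder", "amt"], "bidder": [], "amt": []}

def step(node):
    """Format one path step as name[key:value,...] using the node's local key."""
    vals = []
    for key in LOCAL_KEYS[node.name]:
        child = next(c for c in node.children if c.name == key)
        vals.append(f"{key}:{child.text}")
    return f"{node.name}[{','.join(vals)}]"

def path_set(node, prefix=""):
    """Return one 'rooted key value:content' string per element under node."""
    rooted = prefix + step(node)
    paths = {f"{rooted}:{node.text}"}
    for child in node.children:
        paths |= path_set(child, rooted + ".")
    return paths

d1 = Node("auction", children=[
    Node("item", children=[
        Node("iid", "501"),
        Node("desc", "Trek Madone 5.9 Bike"),
        Node("bid", children=[Node("bidder", "Dave"), Node("amt", "$1500")]),
    ]),
])

for p in sorted(path_set(d1)):
    print(p)
```

Because each step embeds the key values of its element, two elements with the same keys produce the same path prefix, whichever document they come from.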
Proof that D3 is in L_T

Construct D3 from ρ(D1) ∪ ρ(D2), then show that D3 is compatible and key-respecting with respect to T.

[Diagram labels: steps 1, 2, 3; documents D1, D2, D3; mappings ρ1, ρ1⁻¹, ρ2, ρ2⁻¹, σ, σ⁻¹ (= ρ3)]
Figure 2: Using Panes to Evaluate Query 1

Input tuples (item-id, bid-price, timestamp):
  t1: (10, $9.12, 12:15:52 PM)
  t2: (11, $8.93, 12:16:42 PM)
  t3: (11, $9.20, 12:16:49 PM)
Pane results (pane-max, pane-timestamp):
  p1: ($9.12, 12:15:52 PM)
  p2: ($9.20, 12:16:49 PM)
Window results (win-max, timestamp):
  w1: ($9.12, 12:15:52 PM)
  w2: ($9.20, 12:16:49 PM)
Operators, bottom to top: streamscan; window-max (bid-price) with RANGE = 1 min, SLIDE = 1 min, WATTR = timestamp; window-max with RANGE = 4 min, WATTR = pane-timestamp
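The two-level evaluation in Figure 2 can be sketched as a per-pane maximum followed by a maximum over the panes in each window. This is a minimal illustration, not the paper's operator implementation: the function names, the seconds-since-midnight timestamps, and the one-pane slide at the window level are assumptions made for the example.

```python
def hms(h, m, s):
    """Seconds since midnight (hypothetical helper for readable timestamps)."""
    return h * 3600 + m * 60 + s

def pane_max(tuples, pane_seconds=60):
    """Keep the max-price (price, timestamp) pair for each fixed-size pane."""
    panes = {}
    for _item_id, price, ts in tuples:
        pane_id = ts // pane_seconds
        best = panes.get(pane_id)
        if best is None or price > best[0]:
            panes[pane_id] = (price, ts)
    return panes

def window_max(panes, range_panes=4):
    """For each pane id k, take the max over panes k-range_panes+1 .. k."""
    out = {}
    for k in sorted(panes):
        members = [panes[j] for j in range(k - range_panes + 1, k + 1)
                   if j in panes]
        out[k] = max(members, key=lambda p: p[0])
    return out

bids = [
    (10, 9.12, hms(12, 15, 52)),  # t1
    (11, 8.93, hms(12, 16, 42)),  # t2
    (11, 9.20, hms(12, 16, 49)),  # t3
]
panes = pane_max(bids)
windows = window_max(panes)
for k in sorted(windows):
    print(k, windows[k])
```

Each input tuple is touched once at the pane level, and each window is then computed from at most four pane results instead of from all tuples in its range.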
Figure 5: Cost Ratio of Pane vs. Original Window-Id Approach
Figure 6: Band Disorder
Figure 7: Block-sorted Disorder
Figure 8: Latency vs. Accuracy Band-Disorder (average error percentage)
Figure 9: Latency vs. Accuracy Block-Sorted-Disorder (average error percentage)
Figure 10: Latency vs. Accuracy Block-Sorted-Disorder (percentage of wrong answers)
Figure 1: Four detection stations in a detection task (from Yonnel Gardes, The Transpo Group, Kirkland, WA, with permission)
[Figure labels: SUSPECT, PRESUMPTIVE, CONFIRMED, INTERCEPTED]
Figure 4: Example of insertion, initialization, and update of bins as new tuples arrive.
Figure 8 (b): Execution Time: WID versus Buffering – Zoom-in
Figure 10: Latency vs. Accuracy Block-Sorted-Disorder (percentage of incorrect answers)
[Figure label: external punctuation]
[Figure residue: query plans for a windowed count(*), with sample tuples]
Operators: streamscan; bucket; count(*) (group on pane-id, auction-site); count(*) (group on window-id, ...)
Window parameters: RANGE = 1 min, SLIDE = 1 min, WINATTR = timestamp; RANGE = 5, SLIDE = 1, WINATTR = pane-id
Schema (sensor-id, room-id, timestamp, temperature): T0 (2, C, 00:05:58PM, 80°); T1 (2, C, 00:06:05PM, 82°)
Schema (item-id, bid-price, auction-site, timestamp, pane-id): T0 (3, 5, 00:05:58PM, 80°, 5-5); P1 (*, C, *, *, 5); T1 (2, C, 00:06:05 PM, 83°, 6-6)
Schema (auction-site, pane-id, count, timestamp): M0 (3, 5, 8, 00:05:42PM)
Schema (auction-site, pane-id, count, timestamp, window-id): M0 (3, 5, 8, 00:05:42PM, 6-10)